Message ID | 1440339964-16075-1-git-send-email-dsa@cumulusnetworks.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On 08/23/15 at 08:26am, David Ahern wrote:
> inetpeer caches based on address only, so duplicate IP addresses within
> a namespace return the same cached entry. Similar to IP fragments handle
> duplicate addresses across VRFs by adding the VRF master device index to
> the lookup.

We have a lot of other places which use the address only. Are you
going to add the VRF id to all these places as well?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 8/23/15 6:15 PM, Thomas Graf wrote:
> On 08/23/15 at 08:26am, David Ahern wrote:
>> inetpeer caches based on address only, so duplicate IP addresses within
>> a namespace return the same cached entry. Similar to IP fragments handle
>> duplicate addresses across VRFs by adding the VRF master device index to
>> the lookup.
>
> We have a lot of other places which use the address only. Are you
> going to add the VRF id to all these places as well?
>

If appropriate, yes. I have fixed IP fragments and this patch fixes
inetpeer cache. In both cases (L3 artifacts) the vrf device index
provides the means to uniquely identify duplicate IP addresses within a
namespace. If you know of other code that might be impacted I will
investigate and fix as needed.

Thanks,
David
On 08/23/15 at 08:01pm, David Ahern wrote:
> On 8/23/15 6:15 PM, Thomas Graf wrote:
> >On 08/23/15 at 08:26am, David Ahern wrote:
> >>inetpeer caches based on address only, so duplicate IP addresses within
> >>a namespace return the same cached entry. Similar to IP fragments handle
> >>duplicate addresses across VRFs by adding the VRF master device index to
> >>the lookup.
> >
> >We have a lot of other places which use the address only. Are you
> >going to add the VRF id to all these places as well?
> >
>
> If appropriate, yes. I have fixed IP fragments and this patch fixes inetpeer
> cache. In both cases (L3 artifacts) the vrf device index provides the means
> to uniquely identify duplicate IP addresses within a namespace. If you know
> of other code that might be impacted I will investigate and fix as needed.

OK, then the question is what do you consider appropriate? ;-)

An obvious example is netfilter conntrack but eventually any decision
based on an address would require the VRF id if you want to go all the
way. I see the advantages over netns based VRF right now due to the
lightweight nature but if this turns out to require a new field in
practically every address datastructure then that seems not what we
want.
From: David Ahern <dsa@cumulusnetworks.com>
Date: Sun, 23 Aug 2015 20:01:34 -0600

> On 8/23/15 6:15 PM, Thomas Graf wrote:
>> On 08/23/15 at 08:26am, David Ahern wrote:
>>> inetpeer caches based on address only, so duplicate IP addresses
>>> within a namespace return the same cached entry. Similar to IP
>>> fragments handle duplicate addresses across VRFs by adding the VRF
>>> master device index to the lookup.
>>
>> We have a lot of other places which use the address only. Are you
>> going to add the VRF id to all these places as well?
>>
>
> If appropriate, yes. I have fixed IP fragments and this patch fixes
> inetpeer cache. In both cases (L3 artifacts) the vrf device index
> provides the means to uniquely identify duplicate IP addresses within
> a namespace. If you know of other code that might be impacted I will
> investigate and fix as needed.

Anyways, what this inetpeer patch is doing is the wrong abstraction.

The key is really "daddr + netdev" so make a helper that works using
those arguments.

Then it is clear as we propagate this around that addresses need to
be coupled with the device in question in order to be keyed properly.
On 8/25/15 1:47 PM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Sun, 23 Aug 2015 20:01:34 -0600
>
>> On 8/23/15 6:15 PM, Thomas Graf wrote:
>>> On 08/23/15 at 08:26am, David Ahern wrote:
>>>> inetpeer caches based on address only, so duplicate IP addresses
>>>> within a namespace return the same cached entry. Similar to IP
>>>> fragments handle duplicate addresses across VRFs by adding the VRF
>>>> master device index to the lookup.
>>>
>>> We have a lot of other places which use the address only. Are you
>>> going to add the VRF id to all these places as well?
>>>
>>
>> If appropriate, yes. I have fixed IP fragments and this patch fixes
>> inetpeer cache. In both cases (L3 artifacts) the vrf device index
>> provides the means to uniquely identify duplicate IP addresses within
>> a namespace. If you know of other code that might be impacted I will
>> investigate and fix as needed.
>
> Anyways, what this inetpeer patch is doing is the wrong abstraction.
>
> The key is really "daddr + netdev" so make a helper that works using
> those arguments.

That's what I have here:

struct inetpeer_addr {
	struct inetpeer_addr_base addr;
	__u16 family;
#if IS_ENABLED(CONFIG_NET_VRF)
	int vif;
#endif
};

the addr_compare then checks the vif (VRF device index) after the
N-word address compare.

>
> Then it is clear as we propagate this around that addresses need to
> be coupled with the device in question in order to be keyed properly.
>

Meaning rename struct inetpeer_addr to struct inetpeer_key and
addr_compare to entry_compare or key_compare? Everything else still
treats the address + VRF device as the key.

David
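A minimal user-space sketch of the comparison described in the message above: keys are ordered by address first, and the VRF device index only breaks ties between identical addresses. The names `peer_key` and `peer_key_compare` are hypothetical and only IPv4 is modeled; per the patch, the kernel's addr_compare walks N address words and compiles the vif check only under CONFIG_NET_VRF.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct inetpeer_addr, IPv4 only. */
struct peer_key {
	uint32_t a4;	/* IPv4 address (a __be32 in the kernel) */
	int vif;	/* VRF master device index */
};

/*
 * Three-way compare suitable for a sorted tree lookup: address first,
 * then the VRF device index, so that duplicate addresses in different
 * VRFs map to different entries.
 */
static inline int peer_key_compare(const struct peer_key *a,
				   const struct peer_key *b)
{
	if (a->a4 != b->a4)
		return a->a4 < b->a4 ? -1 : 1;

	if (a->vif != b->vif)
		return a->vif < b->vif ? -1 : 1;

	return 0;
}
```

With this ordering, two keys holding the same address but different vif values compare as unequal, which is the property the patch relies on to keep per-VRF cache entries apart.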
From: David Ahern <dsa@cumulusnetworks.com>
Date: Tue, 25 Aug 2015 15:41:36 -0700

> Meaning rename struct inetpeer_addr to struct inetpeer_key and
> addr_compare to entry_compare or key_compare?

I'm not talking about inetpeer specifically, but generally speaking
everywhere you're going to have to handle this including inetpeer.

So something like "inet4_daddr_key" which is a __be32 and the ifindex.
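One possible shape for the generic key suggested above, as a user-space sketch only. The thread gives just the name "inet4_daddr_key" and its two members (a __be32 and the ifindex); the field names and the helpers `inet4_daddr_key_make` and `inet4_daddr_key_equal` are assumptions made for illustration, and uint32_t stands in for __be32.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Address coupled with the device it belongs to; layout is assumed. */
struct inet4_daddr_key {
	uint32_t daddr;	/* destination address (stand-in for __be32) */
	int ifindex;	/* device index the address is scoped to */
};

/* Hypothetical helper: build a key from the two arguments. */
static inline struct inet4_daddr_key inet4_daddr_key_make(uint32_t daddr,
							  int ifindex)
{
	struct inet4_daddr_key key;

	memset(&key, 0, sizeof(key));	/* keep any padding deterministic */
	key.daddr = daddr;
	key.ifindex = ifindex;
	return key;
}

/* Hypothetical helper: keys match only if address and device both match. */
static inline bool inet4_daddr_key_equal(const struct inet4_daddr_key *a,
					 const struct inet4_daddr_key *b)
{
	return a->daddr == b->daddr && a->ifindex == b->ifindex;
}
```

Passing such a key around instead of a bare address would make the coupling between address and device visible at every call site, which appears to be the point of the suggestion.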
diff --git a/include/net/inetpeer.h b/include/net/inetpeer.h
index 002f0bd27001..a75b648b8545 100644
--- a/include/net/inetpeer.h
+++ b/include/net/inetpeer.h
@@ -26,6 +26,9 @@ struct inetpeer_addr_base {
 struct inetpeer_addr {
 	struct inetpeer_addr_base addr;
 	__u16 family;
+#if IS_ENABLED(CONFIG_NET_VRF)
+	int vif;
+#endif
 };
 
 struct inet_peer {
@@ -78,12 +81,15 @@ struct inet_peer *inet_getpeer(struct inet_peer_base *base,
 
 static inline struct inet_peer *inet_getpeer_v4(struct inet_peer_base *base,
 						__be32 v4daddr,
-						int create)
+						int vif, int create)
 {
 	struct inetpeer_addr daddr;
 
 	daddr.addr.a4 = v4daddr;
 	daddr.family = AF_INET;
+#if IS_ENABLED(CONFIG_NET_VRF)
+	daddr.vif = vif;
+#endif
 	return inet_getpeer(base, &daddr, create);
 }
 
@@ -95,6 +101,9 @@ static inline struct inet_peer *inet_getpeer_v6(struct inet_peer_base *base,
 
 	daddr.addr.in6 = *v6daddr;
 	daddr.family = AF_INET6;
+#if IS_ENABLED(CONFIG_NET_VRF)
+	daddr.vif = 0; /* placeholder until VRF support is added to IPv6 */
+#endif
 	return inet_getpeer(base, &daddr, create);
 }
 
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index f16488efa1c8..79fe05befcae 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -309,9 +309,10 @@ static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
 
 	rc = false;
 	if (icmp_global_allow()) {
+		int vif = vrf_master_ifindex(dst->dev);
 		struct inet_peer *peer;
 
-		peer = inet_getpeer_v4(net->ipv4.peers, fl4->daddr, 1);
+		peer = inet_getpeer_v4(net->ipv4.peers, fl4->daddr, vif, 1);
 		rc = inet_peer_xrlim_allow(peer,
 					   net->ipv4.sysctl_icmp_ratelimit);
 		if (peer)
diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index 241afd743d2c..b5f268a3ea6b 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -170,6 +170,11 @@ static int addr_compare(const struct inetpeer_addr *a,
 			return 1;
 	}
 
+#if IS_ENABLED(CONFIG_NET_VRF)
+	if (a->vif != b->vif)
+		return a->vif < b->vif ? -1 : 1;
+#endif
+
 	return 0;
 }
 
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 15762e758861..fa7f15305f9a 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -151,7 +151,8 @@ static void ip4_frag_init(struct inet_frag_queue *q, const void *a)
 	qp->vif = arg->vif;
 	qp->user = arg->user;
 	qp->peer = sysctl_ipfrag_max_dist ?
-		inet_getpeer_v4(net->ipv4.peers, arg->iph->saddr, 1) : NULL;
+		inet_getpeer_v4(net->ipv4.peers, arg->iph->saddr, arg->vif, 1) :
+		NULL;
 }
 
 static void ip4_frag_free(struct inet_frag_queue *q)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 2403e85107f0..6805d57152b9 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -838,6 +838,7 @@ void ip_rt_send_redirect(struct sk_buff *skb)
 	struct inet_peer *peer;
 	struct net *net;
 	int log_martians;
+	int vif;
 
 	rcu_read_lock();
 	in_dev = __in_dev_get_rcu(rt->dst.dev);
@@ -846,10 +847,11 @@ void ip_rt_send_redirect(struct sk_buff *skb)
 		return;
 	}
 	log_martians = IN_DEV_LOG_MARTIANS(in_dev);
+	vif = vrf_master_ifindex_rcu(rt->dst.dev);
 	rcu_read_unlock();
 
 	net = dev_net(rt->dst.dev);
-	peer = inet_getpeer_v4(net->ipv4.peers, ip_hdr(skb)->saddr, 1);
+	peer = inet_getpeer_v4(net->ipv4.peers, ip_hdr(skb)->saddr, vif, 1);
 	if (!peer) {
 		icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST,
 			  rt_nexthop(rt, ip_hdr(skb)->daddr));
@@ -938,7 +940,8 @@ static int ip_error(struct sk_buff *skb)
 		break;
 	}
 
-	peer = inet_getpeer_v4(net->ipv4.peers, ip_hdr(skb)->saddr, 1);
+	peer = inet_getpeer_v4(net->ipv4.peers, ip_hdr(skb)->saddr,
+			       vrf_master_ifindex(skb->dev), 1);
 
 	send = true;
 	if (peer) {
inetpeer caches based on address only, so duplicate IP addresses within
a namespace return the same cached entry. Similar to IP fragments, handle
duplicate addresses across VRFs by adding the VRF master device index to
the lookup.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 include/net/inetpeer.h | 11 ++++++++++-
 net/ipv4/icmp.c        |  3 ++-
 net/ipv4/inetpeer.c    |  5 +++++
 net/ipv4/ip_fragment.c |  3 ++-
 net/ipv4/route.c       |  7 +++++--
 5 files changed, 24 insertions(+), 5 deletions(-)