Message ID | 1372170295-4717-3-git-send-email-nicolas.dichtel@6wind.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Tue, 25 Jun 2013 16:24:55 +0200

> @@ -453,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
>  	tstats->rx_bytes += skb->len;
>  	u64_stats_update_end(&tstats->syncp);
>
> +	skb_scrub_packet(skb);
> +
>  	if (tunnel->dev->type == ARPHRD_ETHER) {
>  		skb->protocol = eth_type_trans(skb, tunnel->dev);
>  		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);

I can't see how this can be ok.

If something in netfilter depends upon the state you are clearing out
here, someone's packet filtering setup is going to break.

I'm not applying these patches, sorry.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

David Miller <davem@davemloft.net> writes:

> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Date: Tue, 25 Jun 2013 16:24:55 +0200
>
>> @@ -453,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
>>  	tstats->rx_bytes += skb->len;
>>  	u64_stats_update_end(&tstats->syncp);
>>
>> +	skb_scrub_packet(skb);
>> +
>>  	if (tunnel->dev->type == ARPHRD_ETHER) {
>>  		skb->protocol = eth_type_trans(skb, tunnel->dev);
>>  		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
>
> I can't see how this can be ok.
>
> If something in netfilter depends upon the state you are clearing out
> here, someone's packet filtering setup is going to break.
>
> I'm not applying these patches, sorry.

How can netfilter depend on the state of a packet inside of a tunnel?

How can it even make sense?

Or is your concern that we unintentionally allowed this in the past so
to avoid breaking binary compatibility we should continue in case
someone somewhere cares?

I really can't see how this could possibly be an intentional feature.

Eric

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 25 Jun 2013 18:35:30 -0700

> David Miller <davem@davemloft.net> writes:
>
>> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> Date: Tue, 25 Jun 2013 16:24:55 +0200
>>
>>> @@ -453,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
>>>  	tstats->rx_bytes += skb->len;
>>>  	u64_stats_update_end(&tstats->syncp);
>>>
>>> +	skb_scrub_packet(skb);
>>> +
>>>  	if (tunnel->dev->type == ARPHRD_ETHER) {
>>>  		skb->protocol = eth_type_trans(skb, tunnel->dev);
>>>  		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
>>
>> I can't see how this can be ok.
>>
>> If something in netfilter depends upon the state you are clearing out
>> here, someone's packet filtering setup is going to break.
>>
>> I'm not applying these patches, sorry.
>
> How can netfilter depend on the state of a packet inside of a tunnel?
>
> How can it even make sense?
>
> Or is your concern that we unintentionally allowed this in the past so
> to avoid breaking binary compatibility we should continue in case
> someone somewhere cares?
>
> I really can't see how this could possibly be an intentional feature.

You can make all of these issues go away by only clearing the SKB
meta state when namespaces are actually changing as we go through
the tunnel.

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 25 Jun 2013 18:35:30 -0700
>
>> David Miller <davem@davemloft.net> writes:
>>
>>> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>> Date: Tue, 25 Jun 2013 16:24:55 +0200
>>>
>>>> @@ -453,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
>>>>  	tstats->rx_bytes += skb->len;
>>>>  	u64_stats_update_end(&tstats->syncp);
>>>>
>>>> +	skb_scrub_packet(skb);
>>>> +
>>>>  	if (tunnel->dev->type == ARPHRD_ETHER) {
>>>>  		skb->protocol = eth_type_trans(skb, tunnel->dev);
>>>>  		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
>>>
>>> I can't see how this can be ok.
>>>
>>> If something in netfilter depends upon the state you are clearing out
>>> here, someone's packet filtering setup is going to break.
>>>
>>> I'm not applying these patches, sorry.
>>
>> How can netfilter depend on the state of a packet inside of a tunnel?
>>
>> How can it even make sense?
>>
>> Or is your concern that we unintentionally allowed this in the past so
>> to avoid breaking binary compatibility we should continue in case
>> someone somewhere cares?
>>
>> I really can't see how this could possibly be an intentional feature.
>
> You can make all of these issues go away by only clearing the SKB
> meta state when namespaces are actually changing as we go through
> the tunnel.

I have spent some time thinking about the cases where I have had an
opportunity to use the marks on packets, and it turns out that if I had
been using a tunnel with any of those configurations, leaving the marks
on would have either broken my configuration or at the very least
required me to make certain I changed those marks.

So I really think this is a bug fix, for a long-standing bug in a rare
corner case of kernel behavior that people just haven't noticed. Which
is why I suggested to Nicolas Dichtel that he remove the test to see if
we were changing network namespaces before scrubbing the packet.

That said, I won't object if Nicolas Dichtel resends his patches with
that test put back in. I just think it is silly, and when someone
finally gets bitten by the bug and complains we will have to go through
and remove the test.

Eric

On Wed, 2013-06-26 at 03:03 -0700, Eric W. Biederman wrote:

> That said I won't object if Nicolas Dichtel resends his patches with
> that test put back in.  I just think it is silly and when someone
> finally gets bit by the bug and complains we will have to go through and
> remove the test.

Well, what is the reason skb_orphan() must be called in a tunnel xmit
path?

This patch changes more things than what is advertised in the changelog :(

On 26/06/2013 12:22, Eric Dumazet wrote:
> On Wed, 2013-06-26 at 03:03 -0700, Eric W. Biederman wrote:
>
>> That said I won't object if Nicolas Dichtel resends his patches with
>> that test put back in.  I just think it is silly and when someone
>> finally gets bit by the bug and complains we will have to go through and
>> remove the test.
>
> Well, what is the reason skb_orphan() must be called in a tunnel xmit
> path ?
>
> This patch changes more things than what advertised in changelog :(

In fact, this is true. If we eventually find that skb_scrub_packet() is
needed in all cases (not only when changing namespace), that will be
another patch.

I will resend the series with the test put back.

On 26/06/2013 01:56, David Miller wrote:
> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Date: Tue, 25 Jun 2013 16:24:55 +0200
>
>> @@ -453,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
>>  	tstats->rx_bytes += skb->len;
>>  	u64_stats_update_end(&tstats->syncp);
>>
>> +	skb_scrub_packet(skb);
>> +
>>  	if (tunnel->dev->type == ARPHRD_ETHER) {
>>  		skb->protocol = eth_type_trans(skb, tunnel->dev);
>>  		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
>
> I can't see how this can be ok.
>
> If something in netfilter depends upon the state you are clearing out
> here, someone's packet filtering setup is going to break.

Just for the record, note that nf_reset() is already called in
iptunnel_pull_header() and iptunnel_xmit(). Hence 4in4 (ipip and sit)
and gre tunnels are already resetting netfilter state. 6in4 (sit) does
it only in the xmit path and Xin6 (ip6_tunnel) never does.

We can also note that nf_reset() was added in the rx path by commit
3d7b46cd20e3 "ip_tunnel: push generic protocol handling to ip_tunnel
module." (net-next only). The nf_reset() in the xmit path of 4in4 (ipip)
has been there for years (at least since 2.6.12). For gre, it was added
by c54419321455 "GRE: Refactor GRE tunneling code." (v3.10-rc1).

It seems that the code differs depending on the type of the tunnel.
Setting aside skb_orphan() (and maybe another call?), which should be
done only when changing namespace, it would be good to have a common
function so that every tunnel behaves the same way. Maybe something
like:

void skb_scrub_packet(struct sk_buff *skb, bool netnschange)
{
	if (netnschange)
		skb_orphan(skb);
	skb->tstamp.tv64 = 0;
	skb->pkt_type = PACKET_HOST;
	skb->skb_iif = 0;
	skb_dst_drop(skb);
	skb->mark = 0;
	secpath_reset(skb);
	nf_reset(skb);
	nf_reset_trace(skb);
}

What's your opinion?
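
For readers following the proposal above, here is a minimal userspace sketch of a scrub helper that orphans the packet only on a namespace change and clears the remaining metadata in every case. `struct mock_skb`, its fields, the `MOCK_PACKET_*` constants and `mock_skb_scrub_packet()` are hypothetical stand-ins invented for illustration, not the kernel's `struct sk_buff` API; the sketch only models which fields would be cleared in which case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only the state discussed
 * in this thread is modelled.
 */
struct mock_skb {
	const void *sk;   /* owning socket, dropped when the skb is orphaned */
	int64_t tstamp;   /* receive timestamp */
	int pkt_type;     /* MOCK_PACKET_HOST after scrubbing */
	int skb_iif;      /* input interface index */
	const void *dst;  /* cached route */
	uint32_t mark;    /* packet mark used by filtering and routing rules */
	const void *nfct; /* netfilter connection tracking state */
};

enum { MOCK_PACKET_HOST = 0, MOCK_PACKET_OTHERHOST = 3 };

/* Clear per-packet metadata. The socket reference is dropped only when
 * the packet crosses a namespace boundary (netnschange == true).
 */
static void mock_skb_scrub_packet(struct mock_skb *skb, bool netnschange)
{
	if (netnschange)
		skb->sk = NULL;
	skb->tstamp = 0;
	skb->pkt_type = MOCK_PACKET_HOST;
	skb->skb_iif = 0;
	skb->dst = NULL;
	skb->mark = 0;
	skb->nfct = NULL;
}
```

With `netnschange` false the socket owner survives while the mark and netfilter state are still reset, which is exactly the part of the proposal that was debated earlier in the thread.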

This patch is a follow-up to the thread "switching network namespace midway":
http://marc.info/?t=135101459500004&r=1&w=2

The goal of this series is to add x-netns support for the module sit, i.e. the
encapsulation addresses and the network device are not owned by the same
namespace.

Example to configure a tunnel:

  modprobe sit
  ip netns add netns1
  ip link add sit1 type sit remote 10.16.0.121 local 10.16.0.249
  ip l s sit1 netns netns1
  ip netns exec netns1 ip l s lo up
  ip netns exec netns1 ip l s sit1 up
  ip netns exec netns1 ip a a dev sit1 192.168.2.123 remote 192.168.2.121
  ip netns exec netns1 ip -6 a a dev sit1 2001:1234::123 remote 2001:1234::121

Once this series is approved, I will add the same feature for the modules ipip
and ip6_tunnel.

v3: put back the test about netns before calling skb_scrub_packet()
    add a missing skb_scrub_packet() call in ip_tunnel_xmit()

v2: rename dev_cleanup_skb to skb_scrub_packet
    move skb_scrub_packet to skbuff.c
    fix netns cleanup
    remove string comparison in netns cleanup
    add a comment about FB device
    call skb_scrub_packet() unconditionally
    remove 'RFC'

 include/linux/skbuff.h   |  1 +
 include/net/ip_tunnels.h |  1 +
 net/core/dev.c           | 11 +----------
 net/core/skbuff.c        | 23 +++++++++++++++++++++++
 net/ipv4/ip_tunnel.c     | 10 +++++++++-
 net/ipv6/sit.c           | 42 ++++++++++++++++++++++++++++++++----------
 6 files changed, 67 insertions(+), 21 deletions(-)

Comments are welcome.

Regards,
Nicolas

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Wed, 26 Jun 2013 16:11:26 +0200

> This patch is a follow up of the thread "switching network namespace midway":
> http://marc.info/?t=135101459500004&r=1&w=2
>
> The goal of this serie is to add x-netns support for the module sit, ie the
> encapsulation addresses and the network device are not owned by the same
> namespace.

Ok, applied.

And yes, I agree we should look into making tunnels behave consistently
wrt. SKB orphaning, cleaning netfilter state, etc.

This patch is a follow-up to the previous series, which adds this functionality
for sit tunnels.

The goal is to add x-netns support for the modules ipip and ip6_tunnel, i.e. the
encapsulation addresses and the network device are not owned by the same
namespace.

Note that the first patch is a fix for the previous series.

Example to configure an ipip tunnel:

  modprobe ipip
  ip netns add netns1
  ip link add ipip1 type ipip remote 10.16.0.121 local 10.16.0.249
  ip l s ipip1 netns netns1
  ip netns exec netns1 ip l s lo up
  ip netns exec netns1 ip l s ipip1 up
  ip netns exec netns1 ip a a dev ipip1 192.168.2.123 remote 192.168.2.121

or an ip6_tunnel:

  modprobe ip6_tunnel
  ip netns add netns1
  ip link add ip6tnl1 type ip6tnl remote 2001:660:3008:c1c3::121 local 2001:660:3008:c1c3::123
  ip l s ip6tnl1 netns netns1
  ip netns exec netns1 ip l s lo up
  ip netns exec netns1 ip l s ip6tnl1 up
  ip netns exec netns1 ip a a dev ip6tnl1 192.168.1.123 remote 192.168.1.121
  ip netns exec netns1 ip -6 a a dev ip6tnl1 2001:1235::123 remote 2001:1235::121

 include/net/ip6_tunnel.h |  1 +
 include/net/ip_tunnels.h |  2 +-
 net/ipv4/ip_gre.c        |  4 ++--
 net/ipv4/ip_tunnel.c     | 42 +++++++++++++++++++++++++++---------------
 net/ipv4/ipip.c          |  3 +--
 net/ipv6/ip6_gre.c       |  5 +++++
 net/ipv6/ip6_tunnel.c    | 41 +++++++++++++++++++++++++++++++----------
 net/ipv6/sit.c           |  4 ++--
 8 files changed, 70 insertions(+), 32 deletions(-)

Comments are welcome.

Regards,
Nicolas

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Wed, 3 Jul 2013 17:00:33 +0200

> This patch is a follow up of the previous serie witch add this functionality
> for sit tunnels.
>
> The goal is to add x-netns support for the module ipip and ip6_tunnel, ie the
> encapsulation addresses and the network device are not owned by the same
> namespace.
>
> Note that the first patch is a fix of the previous serie.

The first patch, as it is a bug fix, is fine and is applied.

The rest will have to wait until the net-next tree opens again,
sorry.

This series is a follow-up to the previous series, which adds this functionality
for sit tunnels.

The goal is to add x-netns support for the modules ipip and ip6_tunnel, i.e. the
encapsulation addresses and the network device are not owned by the same
namespace.

Note that the first two patches are cleanups.

Example to configure an ipip tunnel:

  modprobe ipip
  ip netns add netns1
  ip link add ipip1 type ipip remote 10.16.0.121 local 10.16.0.249
  ip l s ipip1 netns netns1
  ip netns exec netns1 ip l s lo up
  ip netns exec netns1 ip l s ipip1 up
  ip netns exec netns1 ip a a dev ipip1 192.168.2.123 remote 192.168.2.121

or an ip6_tunnel:

  modprobe ip6_tunnel
  ip netns add netns1
  ip link add ip6tnl1 type ip6tnl remote 2001:660:3008:c1c3::121 local 2001:660:3008:c1c3::123
  ip l s ip6tnl1 netns netns1
  ip netns exec netns1 ip l s lo up
  ip netns exec netns1 ip l s ip6tnl1 up
  ip netns exec netns1 ip a a dev ip6tnl1 192.168.1.123 remote 192.168.1.121
  ip netns exec netns1 ip -6 a a dev ip6tnl1 2001:1235::123 remote 2001:1235::121

v2: remove the patch 1/3 of the v1 series (already included)
    use net_eq()
    add patch 1/4 and 2/4

 include/net/ip6_tunnel.h |  1 +
 include/net/ip_tunnels.h |  2 +-
 net/core/dev.c           |  6 +++---
 net/ipv4/ip_gre.c        |  4 ++--
 net/ipv4/ip_tunnel.c     | 52 ++++++++++++++++++++++++++++++------------------
 net/ipv4/ip_vti.c        |  2 +-
 net/ipv4/ipip.c          |  3 +--
 net/ipv6/ip6_gre.c       |  5 +++++
 net/ipv6/ip6_tunnel.c    | 41 ++++++++++++++++++++++++++++----------
 net/ipv6/sit.c           |  6 +++---
 10 files changed, 81 insertions(+), 41 deletions(-)

Comments are welcome.

Regards,
Nicolas

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Tue, 13 Aug 2013 17:51:08 +0200

> This serie is a follow up of the previous serie witch adds this functionality
> for sit tunnels.
>
> The goal is to add x-netns support for the module ipip and ip6_tunnel, ie the
> encapsulation addresses and the network device are not owned by the same
> namespace.
>
> Note that the two first patches are cleanup.

Looks good, series applied, thanks.

diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index b0d9824..781b3cf 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -42,6 +42,7 @@ struct ip_tunnel {
 	struct ip_tunnel __rcu	*next;
 	struct hlist_node hash_node;
 	struct net_device	*dev;
+	struct net		*net;	/* netns for packet i/o */
 
 	int		err_count;	/* Number of arrived ICMP errors */
 	unsigned long	err_time;	/* Time when the last ICMP error
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index bd227e5..d375e4d 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -304,6 +304,7 @@ static struct net_device *__ip_tunnel_create(struct net *net,
 
 	tunnel = netdev_priv(dev);
 	tunnel->parms = *parms;
+	tunnel->net = net;
 
 	err = register_netdevice(dev);
 	if (err)
@@ -453,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
 	tstats->rx_bytes += skb->len;
 	u64_stats_update_end(&tstats->syncp);
 
+	skb_scrub_packet(skb);
+
 	if (tunnel->dev->type == ARPHRD_ETHER) {
 		skb->protocol = eth_type_trans(skb, tunnel->dev);
 		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
@@ -541,7 +544,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
 			tos = ipv6_get_dsfield((const struct ipv6hdr *)inner_iph);
 	}
 
-	rt = ip_route_output_tunnel(dev_net(dev), &fl4,
+	rt = ip_route_output_tunnel(tunnel->net, &fl4,
 				    tunnel->parms.iph.protocol,
 				    dst, tnl_params->saddr,
 				    tunnel->parms.o_key,
@@ -888,6 +891,7 @@ int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
 	if (ip_tunnel_find(itn, p, dev->type))
 		return -EEXIST;
 
+	nt->net = net;
 	nt->parms = *p;
 	err = register_netdevice(dev);
 	if (err)
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index f639866..8765f4e 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -466,14 +466,14 @@ isatap_chksrc(struct sk_buff *skb, const struct iphdr *iph, struct ip_tunnel *t)
 
 static void ipip6_tunnel_uninit(struct net_device *dev)
 {
-	struct net *net = dev_net(dev);
-	struct sit_net *sitn = net_generic(net, sit_net_id);
+	struct ip_tunnel *tunnel = netdev_priv(dev);
+	struct sit_net *sitn = net_generic(tunnel->net, sit_net_id);
 
 	if (dev == sitn->fb_tunnel_dev) {
 		RCU_INIT_POINTER(sitn->tunnels_wc[0], NULL);
 	} else {
-		ipip6_tunnel_unlink(sitn, netdev_priv(dev));
-		ipip6_tunnel_del_prl(netdev_priv(dev), NULL);
+		ipip6_tunnel_unlink(sitn, tunnel);
+		ipip6_tunnel_del_prl(tunnel, NULL);
 	}
 	dev_put(dev);
 }
@@ -621,6 +621,7 @@ static int ipip6_rcv(struct sk_buff *skb)
 		tstats->rx_packets++;
 		tstats->rx_bytes += skb->len;
 
+		skb_scrub_packet(skb);
 		netif_rx(skb);
 
 		return 0;
@@ -803,7 +804,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 		goto tx_error;
 	}
 
-	rt = ip_route_output_ports(dev_net(dev), &fl4, NULL,
+	rt = ip_route_output_ports(tunnel->net, &fl4, NULL,
 				   dst, tiph->saddr,
 				   0, 0,
 				   IPPROTO_IPV6, RT_TOS(tos),
@@ -858,6 +859,8 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 		tunnel->err_count = 0;
 	}
 
+	skb_scrub_packet(skb);
+
 	/*
 	 * Okay, now see if we can stuff it in the buffer as-is.
 	 */
@@ -944,7 +947,8 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
 	iph = &tunnel->parms.iph;
 
 	if (iph->daddr) {
-		struct rtable *rt = ip_route_output_ports(dev_net(dev), &fl4, NULL,
+		struct rtable *rt = ip_route_output_ports(tunnel->net, &fl4,
+							  NULL,
 							  iph->daddr, iph->saddr,
 							  0, 0,
 							  IPPROTO_IPV6,
@@ -959,7 +963,7 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
 	}
 
 	if (!tdev && tunnel->parms.link)
-		tdev = __dev_get_by_index(dev_net(dev), tunnel->parms.link);
+		tdev = __dev_get_by_index(tunnel->net, tunnel->parms.link);
 
 	if (tdev) {
 		dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
@@ -972,7 +976,7 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
 
 static void ipip6_tunnel_update(struct ip_tunnel *t, struct ip_tunnel_parm *p)
 {
-	struct net *net = dev_net(t->dev);
+	struct net *net = t->net;
 	struct sit_net *sitn = net_generic(net, sit_net_id);
 
 	ipip6_tunnel_unlink(sitn, t);
@@ -1248,7 +1252,6 @@ static void ipip6_tunnel_setup(struct net_device *dev)
 	dev->priv_flags		&= ~IFF_XMIT_DST_RELEASE;
 	dev->iflink		= 0;
 	dev->addr_len		= 4;
-	dev->features		|= NETIF_F_NETNS_LOCAL;
 	dev->features		|= NETIF_F_LLTX;
 }
 
@@ -1257,6 +1260,7 @@ static int ipip6_tunnel_init(struct net_device *dev)
 	struct ip_tunnel *tunnel = netdev_priv(dev);
 
 	tunnel->dev = dev;
+	tunnel->net = dev_net(dev);
 
 	memcpy(dev->dev_addr, &tunnel->parms.iph.saddr, 4);
 	memcpy(dev->broadcast, &tunnel->parms.iph.daddr, 4);
@@ -1277,6 +1281,7 @@ static int __net_init ipip6_fb_tunnel_init(struct net_device *dev)
 	struct sit_net *sitn = net_generic(net, sit_net_id);
 
 	tunnel->dev = dev;
+	tunnel->net = dev_net(dev);
 	strcpy(tunnel->parms.name, dev->name);
 
 	iph->version		= 4;
@@ -1564,8 +1569,14 @@ static struct xfrm_tunnel ipip_handler __read_mostly = {
 
 static void __net_exit sit_destroy_tunnels(struct sit_net *sitn, struct list_head *head)
 {
+	struct net *net = dev_net(sitn->fb_tunnel_dev);
+	struct net_device *dev, *aux;
 	int prio;
 
+	for_each_netdev_safe(net, dev, aux)
+		if (dev->rtnl_link_ops == &sit_link_ops)
+			unregister_netdevice_queue(dev, head);
+
 	for (prio = 1; prio < 4; prio++) {
 		int h;
 		for (h = 0; h < HASH_SIZE; h++) {
@@ -1573,7 +1584,12 @@ static void __net_exit sit_destroy_tunnels(struct sit_net *sitn, struct list_hea
 
 			t = rtnl_dereference(sitn->tunnels[prio][h]);
 			while (t != NULL) {
-				unregister_netdevice_queue(t->dev, head);
+				/* If dev is in the same netns, it has already
+				 * been added to the list by the previous loop.
+				 */
+				if (dev_net(t->dev) != net)
+					unregister_netdevice_queue(t->dev,
+								   head);
 				t = rtnl_dereference(t->next);
 			}
 		}
@@ -1598,6 +1614,10 @@ static int __net_init sit_init_net(struct net *net)
 		goto err_alloc_dev;
 	}
 	dev_net_set(sitn->fb_tunnel_dev, net);
+	/* FB netdevice is special: we have one, and only one per netns.
+	 * Allowing to move it to another netns is clearly unsafe.
+	 */
+	sitn->fb_tunnel_dev->features |= NETIF_F_NETNS_LOCAL;
 
 	err = ipip6_fb_tunnel_init(sitn->fb_tunnel_dev);
 	if (err)
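
The two-pass cleanup that the sit_destroy_tunnels() hunks introduce can be sketched in userspace as follows. This is only a model under stated assumptions: the `mock_*` types, the flat arrays, the `is_sit` flag (standing in for the `rtnl_link_ops == &sit_link_ops` comparison) and the `queued` counter (standing in for `unregister_netdevice_queue()`) are all hypothetical. The first pass queues every sit device currently living in the dying netns; the second pass walks the tunnels registered in that netns and queues only those whose device was moved elsewhere, so no device is queued twice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net, struct net_device, struct ip_tunnel. */
struct mock_net { int id; };

struct mock_dev {
	struct mock_net *net; /* netns the device currently lives in */
	bool is_sit;          /* models dev->rtnl_link_ops == &sit_link_ops */
	int queued;           /* how many times the device was queued */
};

struct mock_tunnel {
	struct mock_dev *dev;
};

/* devs: every device in the system; tunnels: tunnels hashed in 'net'.
 * Returns the number of devices queued for unregistration.
 */
static size_t mock_destroy_tunnels(struct mock_net *net,
				   struct mock_dev **devs, size_t ndevs,
				   struct mock_tunnel *tunnels, size_t ntunnels)
{
	size_t queued = 0;
	size_t i;

	/* Pass 1: sit devices that live in the dying netns. */
	for (i = 0; i < ndevs; i++) {
		if (devs[i]->net == net && devs[i]->is_sit) {
			devs[i]->queued++;
			queued++;
		}
	}

	/* Pass 2: tunnels of this netns whose device lives in another
	 * netns; devices in the same netns were handled by pass 1.
	 */
	for (i = 0; i < ntunnels; i++) {
		if (tunnels[i].dev->net != net) {
			tunnels[i].dev->queued++;
			queued++;
		}
	}
	return queued;
}
```

The model also shows why both passes are needed: a device that was moved into the dying netns is only reachable through the device list, while a device that was moved out of it is only reachable through the tunnel table.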

This patch allows switching the netns when a packet is encapsulated or
decapsulated. In other words, the encapsulated packet is received in one
netns, where the lookup is done to find the tunnel. Once the tunnel is found,
the packet is decapsulated and injected into the corresponding interface,
which resides in another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 include/net/ip_tunnels.h |  1 +
 net/ipv4/ip_tunnel.c     |  6 +++++-
 net/ipv6/sit.c           | 40 ++++++++++++++++++++++++++++++----------
 3 files changed, 36 insertions(+), 11 deletions(-)
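
The split between the two namespaces described in this commit message can be illustrated with a small userspace model. All `mock_*` names are hypothetical; only the idea is taken from the patch: the tunnel remembers the netns it was created in (`tunnel->net` in the diff) and keeps using it for lookups on the encapsulation addresses, while the device itself may later be moved to another netns.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net, struct net_device, struct ip_tunnel. */
struct mock_net { const char *name; };

struct mock_net_device { struct mock_net *nd_net; };

struct mock_ip_tunnel {
	struct mock_net_device *dev;
	struct mock_net *net; /* netns for packet i/o, fixed at init time */
};

/* Models dev_net(): the netns the device currently lives in. */
static struct mock_net *mock_dev_net(const struct mock_net_device *dev)
{
	return dev->nd_net;
}

/* Models the init path: record the creation netns in the tunnel. */
static void mock_tunnel_init(struct mock_ip_tunnel *t,
			     struct mock_net_device *dev)
{
	t->dev = dev;
	t->net = mock_dev_net(dev);
}

/* Models moving the device, as "ip l s sit1 netns netns1" would. */
static void mock_dev_change_net(struct mock_net_device *dev,
				struct mock_net *net)
{
	dev->nd_net = net;
}

/* Netns used for route lookups on the outer (encapsulation) addresses. */
static struct mock_net *mock_tunnel_route_net(const struct mock_ip_tunnel *t)
{
	return t->net;
}

/* True when encapsulation and device live in different namespaces,
 * i.e. when a packet crossing the tunnel changes netns.
 */
static bool mock_tunnel_is_xnetns(const struct mock_ip_tunnel *t)
{
	return t->net != mock_dev_net(t->dev);
}
```

A check like `mock_tunnel_is_xnetns()` corresponds to the netns test that the v3 changelog puts back in front of `skb_scrub_packet()`; the kernel would use its own comparison helper rather than a raw pointer test.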