Message ID | 20140320.160229.857536522237793124.davem@davemloft.net |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
>From: David Miller

>Any objections to this? I think it maintains the pre-ipv6-support
>behavior. I know there may be some concerns about supporting
>multiple families on the same socket, but I'm not so sure the code is
>able to support that right now anyways.

I'm ok with the idea of determining the AF from the socket -- mixed
support, if useful, can be added later. But the patch needs to then
check and drop packets encapsulated with the wrong address family.

And it still shouldn't assume !v4 means v6.

[apologies for spacing; doing this from a web browser...]
So, I think we need something like:

if (vs->family == AF_INET && skb->protocol == ntohs(ETH_P_IP)) {
        ....
} else if (vs->family == AF_INET6 && skb->protocol == ntohs(ETH_P_IPV6)) {
        ...
} else
        goto drop

        +-DLS

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 20, 2014 at 10:02 PM, David Miller <davem@davemloft.net> wrote:
> From: Mike Rapoport <mike.rapoport@ravellosystems.com>
> Date: Mon, 17 Mar 2014 13:17:30 +0200
>
>> If the vxlan interface is created without group definition, there is a
>> panic on the first packet reception:
> ...
>> The crash occurs because vxlan_rcv decides on protocol version of outer
>> packet using vxlan->default_dst.remote_ip.sa.sa_family field which is
>> not initialized if no multicast group was specified at interface
>> creation time. This causes vxlan driver to always assume that outer
>> packet is IPv6.
>>
>> Using IP protocol version from skb instead of default destination
>> address family fixes the problem.
>>
>> Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
>
> Thinking some more, I'd like to propose an alternate version of this fix.
>
> Any objections to this? I think it maintains the pre-ipv6-support
> behavior. I know there may be some concerns about supporting multiple
> families on the same socket, but I'm not so sure the code is able to
> support that right now anyways.

No objections from my side either.

> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index a7eb3f2..3a23623 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -1206,7 +1206,7 @@ static void vxlan_rcv(struct vxlan_sock *vs,
>                 goto drop;
>
>         /* Re-examine inner Ethernet packet */
> -       if (remote_ip->sa.sa_family == AF_INET) {
> +       if (vs->family == AF_INET) {
>                 oip = ip_hdr(skb);
>                 saddr.sin.sin_addr.s_addr = oip->saddr;
>                 saddr.sa.sa_family = AF_INET;
> @@ -2409,10 +2409,13 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
>
>         INIT_WORK(&vs->del_work, vxlan_del_work);
>
> -       if (ipv6)
> +       if (ipv6) {
> +               vs->family = AF_INET6;
>                 sock = create_v6_sock(net, port);
> -       else
> +       } else {
> +               vs->family = AF_INET;
>                 sock = create_v4_sock(net, port);
> +       }
>         if (IS_ERR(sock)) {
>                 kfree(vs);
>                 return ERR_CAST(sock);
> diff --git a/include/net/vxlan.h b/include/net/vxlan.h
> index 5deef1a..6f00731 100644
> --- a/include/net/vxlan.h
> +++ b/include/net/vxlan.h
> @@ -16,6 +16,7 @@ struct vxlan_sock {
>         struct hlist_node hlist;
>         vxlan_rcv_t *rcv;
>         void *data;
> +       __u16 family;
>         struct work_struct del_work;
>         struct socket *sock;
>         struct rcu_head rcu;

On Thu, Mar 20, 2014 at 10:47 PM, David Stevens <dlstevens@us.ibm.com> wrote:
>
>>From: David Miller
>
>>Any objections to this? I think it maintains the pre-ipv6-support
>>behavior. I know there may be some concerns about supporting
>>multiple families on the same socket, but I'm not so sure the code is
>>able to support that right now anyways.
>
> I'm ok with the idea of determining the AF from the socket -- mixed
> support, if useful, can be added later. But the patch needs to then
> check and drop packets encapsulated with the wrong address family.
>
> And it still shouldn't assume !v4 means v6.
>
> [apologies for spacing; doing this from a web browser...]
> So, I think we need something like:
>
> if (vs->family == AF_INET && skb->protocol == ntohs(ETH_P_IP)) {
>         ....
> } else if (vs->family == AF_INET6 && skb->protocol == ntohs(ETH_P_IPV6)) {
>         ...
> } else
>         goto drop
>

Checking skb->protocol will drop ARP requests. What about using
ip_hdr(skb)->version?

>         +-DLS
>

-----Mike Rapoport <mike.rapoport@ravellosystems.com> wrote: -----

>Checking skb->protocol will drop ARP requests. What about using
>ip_hdr(skb)->version?

Mike,
ip_hdr() here is the outer packet, so it's got to be a UDP packet--
we just don't know if it's UDP/IP or UDP/IPv6 when it is bound to INADDR_ANY,
since both can be delivered. It could use version in this case, because
both possible protocols have version in the same place, but I think it's
more correct to use the MAC layer protocol rather than relying on the
fact that IPv4 and IPv6 have "version" in the same spot. "It could be ARP"
would be the argument for NOT using the version in places where it really
could be ARP, even though that isn't the case here.

vxlan_rcv() is only called for VXLAN encapsulated packets sent to the bound
UDP port.

So, if "vs->family" holds the one we want to support, we can't just blindly
assume the received packet is IPv4, for example, and start accessing
IPv4 fields, because it could be an IPv6 packet. We have to check the
packet type too. And if it's not the one we bound to, drop it.

That's what the code snippet I outlined is trying to do.

        +-DLS

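The check described above (accept the outer packet only when its type matches the family the socket was created with, and drop everything else) can be sketched in userspace C. This is an illustration under stated assumptions, not the driver code: `classify_outer`, `enum outer_verdict` and the `SKETCH_ETH_P_*` constants are hypothetical names, the two arguments stand in for `vs->family` and `skb->protocol`, and the sketch assumes the protocol value passed in still describes the outer packet. It compares against `htons()`, the usual direction for a field kept in network byte order.

```c
#include <arpa/inet.h>   /* htons */
#include <stdint.h>
#include <sys/socket.h>  /* AF_INET, AF_INET6, AF_UNSPEC */

/* EtherType values for IPv4 and IPv6. The kernel names them ETH_P_IP and
 * ETH_P_IPV6; they are redefined here under hypothetical names so the
 * sketch builds without kernel headers. */
#define SKETCH_ETH_P_IP   0x0800
#define SKETCH_ETH_P_IPV6 0x86DD

enum outer_verdict { OUTER_V4, OUTER_V6, OUTER_DROP };

/* family:   address family the socket was created with
 * protocol: EtherType of the outer packet, in network byte order
 *
 * Neither branch assumes "not IPv4 means IPv6": anything that does not
 * match the socket's own family falls through to OUTER_DROP. */
static enum outer_verdict classify_outer(unsigned short family,
                                         uint16_t protocol)
{
    if (family == AF_INET && protocol == htons(SKETCH_ETH_P_IP))
        return OUTER_V4;
    if (family == AF_INET6 && protocol == htons(SKETCH_ETH_P_IPV6))
        return OUTER_V6;
    return OUTER_DROP;
}
```

An unset family (AF_UNSPEC) matches neither branch, so it is dropped rather than silently treated as IPv6.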
On Fri, Mar 21, 2014 at 1:22 PM, David Stevens <dlstevens@us.ibm.com> wrote:
>
> -----Mike Rapoport <mike.rapoport@ravellosystems.com> wrote: -----
>
> Mike, ip_hdr() here is the outer packet, so it's got to be a UDP packet--
> we just don't know if it's UDP/IP or UDP/IPv6 when it is bound to INADDR_ANY,
> since both can be delivered. It could use version in this case, because
> both possible protocols have version in the same place, but I think it's
> more correct to use the MAC layer protocol rather than relying on the
> fact that IPv4 and IPv6 have "version" in the same spot. "It could be ARP"
> would be the argument for NOT using the version in places where it really
> could be ARP, even though that isn't the case here.
>
> vxlan_rcv() is only called for VXLAN encapsulated packets sent to the bound
> UDP port.
>
> So, if "vs->family" holds the one we want to support, we can't just blindly
> assume the received packet is IPv4, for example, and start accessing
> IPv4 fields, because it could be an IPv6 packet. We have to check the
> packet type too. And if it's not the one we bound to, drop it.
>
> That's what the code snippet I outlined is trying to do.
>

David, I've tried your snippet with IPv4 and I've got all ARP replies
dropped. And if I enable IPv6 I still get crashes in ipv6_rcv.
It seems to me that at the time vxlan_rcv gets outer IP header, the
SKB contains mixed information of outer and inner packets.
I'll continue to look into it.

>         +-DLS
>

On Fri, Mar 21, 2014 at 05:22:06AM -0600, David Stevens wrote:
> -----Mike Rapoport <mike.rapoport@ravellosystems.com> wrote: -----
>
> >Checking skb->protocol will drop ARP requests. What about using
> >ip_hdr(skb)->version?
>
> Mike, ip_hdr() here is the outer packet, so it's got to be a UDP packet--
> we just don't know if it's UDP/IP or UDP/IPv6 when it is bound to INADDR_ANY,
> since both can be delivered. It could use version in this case, because
> both possible protocols have version in the same place, but I think it's
> more correct to use the MAC layer protocol rather than relying on the
> fact that IPv4 and IPv6 have "version" in the same spot. "It could be ARP"
> would be the argument for NOT using the version in places where it really
> could be ARP, even though that isn't the case here.
>
> vxlan_rcv() is only called for VXLAN encapsulated packets sent to the bound
> UDP port.
>
> So, if "vs->family" holds the one we want to support, we can't just blindly
> assume the received packet is IPv4, for example, and start accessing
> IPv4 fields, because it could be an IPv6 packet. We have to check the
> packet type too. And if it's not the one we bound to, drop it.
>
> That's what the code snippet I outlined is trying to do.
>
>         +-DLS
>

I believe I've grokked what's going on in vxlan_udp_encap_recv and
vxlan_rcv. There are actually two unrelated problems:

1) When the vxlan is configured with IPv4 group it crashes when it
starts to receive IPv6 IGMP packets encapsulated into IPv4 vxlan
packets. This happens because when ipv6_rcv handles the inner packet,
the skb->dst still references outer IPv4 info. The very old vxlan code
had skb_dst_drop call in vxlan_udp_encap_recv, which was removed when
vxlan was refactored to use iptunnel_pull_header (commit
7ce04758279514ca1d8ebfe322508a4a430fe2c8: "vxlan: Restructure vxlan
receive"). The iptunnel_pull_header called skb_dst_drop until recent
commit 10ddceb22bab11dab10ba645c7df2e4a8e7a5db5 ("ip_tunnel:multicast
process cause panic due to skb->_skb_refdst NULL pointer").
The simplest fix, I think, would be to restore the call to skb_dst_drop
in vxlan_udp_encap_recv.

2) When the vxlan is using custom configuration and the vxlan
interface is created without group definition, the vxlan_rcv always
takes IPv6 path because the decision is based on
default_dst.sa.sa_family which is AF_UNSPEC in this case.
The code snippet proposed by David S. is not working because by the
time vxlan_rcv checks the outer protocol the skb->protocol is already
set to that of the inner packet in iptunnel_pull_header.
So, to have proper check for packet type in vxlan_rcv we can either
check outer IP header version or pass outer skb->protocol to
vxlan_rcv.
I personally favor checking ip_hdr(skb)->version despite David S.'s
objection above. The version field is not coincidentally at the same
spot for both v4 and v6, and check for version keeps code simpler and
cleaner, IMHO.

Waiting for your comments,
Mike.

--
[1] http://www.spinics.net/lists/netdev/msg276490.html

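The version-based check favored above relies on IPv4 and IPv6 both carrying the version number in the top four bits of the first header byte. A minimal userspace sketch of that test follows. The name `outer_ip_version` and the raw byte-buffer interface are hypothetical and stand in for reading `ip_hdr(skb)->version` in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 4 or 6 when the first byte of the outer network header carries
 * that IP version in its upper nibble, and 0 for anything else, including
 * a missing or empty buffer. The caller is expected to drop on 0 rather
 * than treat "not 4" as 6. */
static int outer_ip_version(const uint8_t *hdr, size_t len)
{
    int version;

    if (hdr == NULL || len < 1)
        return 0;

    version = hdr[0] >> 4;
    return (version == 4 || version == 6) ? version : 0;
}
```

A typical IPv4 header starts with the byte 0x45 (version 4, header length 5 words) and an IPv6 header with a byte of the form 0x6n, so a single shift separates the two cases without consulting skb->protocol.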
On Sun, Mar 23, 2014 at 2:27 AM, Mike Rapoport <mike.rapoport@ravellosystems.com> wrote:
> On Fri, Mar 21, 2014 at 05:22:06AM -0600, David Stevens wrote:
>> -----Mike Rapoport <mike.rapoport@ravellosystems.com> wrote: -----
>>
>> >Checking skb->protocol will drop ARP requests. What about using
>> >ip_hdr(skb)->version?
>>
>> Mike, ip_hdr() here is the outer packet, so it's got to be a UDP packet--
>> we just don't know if it's UDP/IP or UDP/IPv6 when it is bound to INADDR_ANY,
>> since both can be delivered. It could use version in this case, because
>> both possible protocols have version in the same place, but I think it's
>> more correct to use the MAC layer protocol rather than relying on the
>> fact that IPv4 and IPv6 have "version" in the same spot. "It could be ARP"
>> would be the argument for NOT using the version in places where it really
>> could be ARP, even though that isn't the case here.
>>
>> vxlan_rcv() is only called for VXLAN encapsulated packets sent to the bound
>> UDP port.
>>
>> So, if "vs->family" holds the one we want to support, we can't just blindly
>> assume the received packet is IPv4, for example, and start accessing
>> IPv4 fields, because it could be an IPv6 packet. We have to check the
>> packet type too. And if it's not the one we bound to, drop it.
>>
>> That's what the code snippet I outlined is trying to do.
>>
>>         +-DLS
>>
>
> I believe I've grokked what's going on in vxlan_udp_encap_recv and
> vxlan_rcv. There are actually two unrelated problems:
>
> 1) When the vxlan is configured with IPv4 group it crashes when it
> starts to receive IPv6 IGMP packets encapsulated into IPv4 vxlan
> packets. This happens because when ipv6_rcv handles the inner packet,
> the skb->dst still references outer IPv4 info. The very old vxlan code
> had skb_dst_drop call in vxlan_udp_encap_recv, which was removed when
> vxlan was refactored to use iptunnel_pull_header (commit
> 7ce04758279514ca1d8ebfe322508a4a430fe2c8: "vxlan: Restructure vxlan
> receive"). The iptunnel_pull_header called skb_dst_drop until recent
> commit 10ddceb22bab11dab10ba645c7df2e4a8e7a5db5 ("ip_tunnel:multicast
> process cause panic due to skb->_skb_refdst NULL pointer").
> The simplest fix, I think, would be to restore call to skb_dst_drop in
> vxlan_udp_encap_recv.
>

iptunnel_pull() is used in multiple tunnel implementations, therefore
we need to drop the dst in that function. I sent out a fix:
http://marc.info/?l=linux-netdev&m=139563761515707

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index a7eb3f2..3a23623 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1206,7 +1206,7 @@ static void vxlan_rcv(struct vxlan_sock *vs,
 		goto drop;
 
 	/* Re-examine inner Ethernet packet */
-	if (remote_ip->sa.sa_family == AF_INET) {
+	if (vs->family == AF_INET) {
 		oip = ip_hdr(skb);
 		saddr.sin.sin_addr.s_addr = oip->saddr;
 		saddr.sa.sa_family = AF_INET;
@@ -2409,10 +2409,13 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
 
 	INIT_WORK(&vs->del_work, vxlan_del_work);
 
-	if (ipv6)
+	if (ipv6) {
+		vs->family = AF_INET6;
 		sock = create_v6_sock(net, port);
-	else
+	} else {
+		vs->family = AF_INET;
 		sock = create_v4_sock(net, port);
+	}
 	if (IS_ERR(sock)) {
 		kfree(vs);
 		return ERR_CAST(sock);
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 5deef1a..6f00731 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -16,6 +16,7 @@ struct vxlan_sock {
 	struct hlist_node hlist;
 	vxlan_rcv_t *rcv;
 	void *data;
+	__u16 family;
 	struct work_struct del_work;
 	struct socket *sock;
 	struct rcu_head rcu;