Message ID | 8d2a6ece65b17ae61343dce27f3709d3cd249832.1455118911.git.pabeni@redhat.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 10 Feb 2016 16:47:21 +0100, Paolo Abeni wrote: > --- a/drivers/net/geneve.c > +++ b/drivers/net/geneve.c > @@ -1441,7 +1441,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name, > return dev; > > err = geneve_configure(net, dev, &geneve_remote_unspec, > - 0, 0, 0, htons(dst_port), true, 0); > + 0, 0, 0, htons(dst_port), true, > + GENEVE_F_UDP_ZERO_CSUM6_RX); > if (err) { > free_netdev(dev); > return ERR_PTR(err); > diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c > index 1605691..d933cb8 100644 > --- a/net/openvswitch/vport-vxlan.c > +++ b/net/openvswitch/vport-vxlan.c > @@ -90,7 +90,7 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms) > int err; > struct vxlan_config conf = { > .no_share = true, > - .flags = VXLAN_F_COLLECT_METADATA, > + .flags = VXLAN_F_COLLECT_METADATA | VXLAN_F_UDP_ZERO_CSUM6_RX, I'm afraid this looks wrong, we should not default to zero UDP checksum over IPv6. See RFC 2460, section 8.1: o Unlike IPv4, when UDP packets are originated by an IPv6 node, the UDP checksum is not optional. That is, whenever originating a UDP packet, an IPv6 node must compute a UDP checksum over the packet and the pseudo-header, and, if that computation yields a result of zero, it must be changed to hex FFFF for placement in the UDP header. IPv6 receivers must discard UDP packets containing a zero checksum, and should log the error. One may argue that with tunneling, the situation is different but that's the reason why we have the IPv6 checksum flag and why it has opposite meaning to the IPv4 one. We shouldn't default to non-RFC behavior by default. What is the problem with the current code? Is the UDP checksum not calculated correctly? Jiri
From: Jiri Benc <jbenc@redhat.com> Date: Thu, 11 Feb 2016 11:41:24 +0100 > On Wed, 10 Feb 2016 16:47:21 +0100, Paolo Abeni wrote: >> --- a/drivers/net/geneve.c >> +++ b/drivers/net/geneve.c >> @@ -1441,7 +1441,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name, >> return dev; >> >> err = geneve_configure(net, dev, &geneve_remote_unspec, >> - 0, 0, 0, htons(dst_port), true, 0); >> + 0, 0, 0, htons(dst_port), true, >> + GENEVE_F_UDP_ZERO_CSUM6_RX); >> if (err) { >> free_netdev(dev); >> return ERR_PTR(err); >> diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c >> index 1605691..d933cb8 100644 >> --- a/net/openvswitch/vport-vxlan.c >> +++ b/net/openvswitch/vport-vxlan.c >> @@ -90,7 +90,7 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms) >> int err; >> struct vxlan_config conf = { >> .no_share = true, >> - .flags = VXLAN_F_COLLECT_METADATA, >> + .flags = VXLAN_F_COLLECT_METADATA | VXLAN_F_UDP_ZERO_CSUM6_RX, > > I'm afraid this looks wrong, we should not default to zero UDP checksum > over IPv6. See RFC 2460, section 8.1: > > o Unlike IPv4, when UDP packets are originated by an IPv6 node, > the UDP checksum is not optional. That is, whenever > originating a UDP packet, an IPv6 node must compute a UDP > checksum over the packet and the pseudo-header, and, if that > computation yields a result of zero, it must be changed to hex > FFFF for placement in the UDP header. IPv6 receivers must > discard UDP packets containing a zero checksum, and should log > the error. > > One may argue that with tunneling, the situation is different but > that's the reason why we have the IPv6 checksum flag and why it has > opposite meaning to the IPv4 one. We shouldn't default to non-RFC > behavior by default. > > What is the problem with the current code? Is the UDP checksum not > calculated correctly? Furthermore, we should be setting the checksum by default for ipv4 too. It's faster.
On Thu, 2016-02-11 at 11:41 +0100, Jiri Benc wrote: > On Wed, 10 Feb 2016 16:47:21 +0100, Paolo Abeni wrote: > > --- a/drivers/net/geneve.c > > +++ b/drivers/net/geneve.c > > @@ -1441,7 +1441,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name, > > return dev; > > > > err = geneve_configure(net, dev, &geneve_remote_unspec, > > - 0, 0, 0, htons(dst_port), true, 0); > > + 0, 0, 0, htons(dst_port), true, > > + GENEVE_F_UDP_ZERO_CSUM6_RX); > > if (err) { > > free_netdev(dev); > > return ERR_PTR(err); > > diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c > > index 1605691..d933cb8 100644 > > --- a/net/openvswitch/vport-vxlan.c > > +++ b/net/openvswitch/vport-vxlan.c > > @@ -90,7 +90,7 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms) > > int err; > > struct vxlan_config conf = { > > .no_share = true, > > - .flags = VXLAN_F_COLLECT_METADATA, > > + .flags = VXLAN_F_COLLECT_METADATA | VXLAN_F_UDP_ZERO_CSUM6_RX, > > I'm afraid this looks wrong, we should not default to zero UDP checksum > over IPv6. See RFC 2460, section 8.1: > > o Unlike IPv4, when UDP packets are originated by an IPv6 node, > the UDP checksum is not optional. That is, whenever > originating a UDP packet, an IPv6 node must compute a UDP > checksum over the packet and the pseudo-header, and, if that > computation yields a result of zero, it must be changed to hex > FFFF for placement in the UDP header. IPv6 receivers must > discard UDP packets containing a zero checksum, and should log > the error. > > One may argue that with tunneling, the situation is different but > that's the reason why we have the IPv6 checksum flag and why it has > opposite meaning to the IPv4 one. We shouldn't default to non-RFC > behavior by default. Agreed. As far as I can understand RFC 6936, Section 5 do allows specific tunneling protocol to accept zero UDP checksum on specific ports. We are already sending by default zero UDP checksum when tunneling over vxlan/geneve light weight tunnel since the commit 35e2d1152b22 ("tunnels: Allow IPv6 UDP checksums to be correctly controlled."). Currently, geneve/vxlan lwt pairs tunneling over ipv6 are not able to talk each-other with default setting since the sender will set zero UDP checksum in the external header and the receiver will discard such packets. This commit is intended to address the above issue. Such issue could be addressed also by defaulting all lwt geneve/vxlan flows (comprising those tunneling over ipv4) to set the external UDP checksum, but a single lwt tunnel device can be used to terminate flows with different checksum settings. Paolo
On Thu, 11 Feb 2016 12:38:51 +0100, Paolo Abeni wrote: > We are already sending by default zero UDP checksum when tunneling over > vxlan/geneve light weight tunnel since the commit 35e2d1152b22 > ("tunnels: Allow IPv6 UDP checksums to be correctly controlled."). > > Currently, geneve/vxlan lwt pairs tunneling over ipv6 are not able to > talk each-other with default setting since the sender will set zero UDP > checksum in the external header and the receiver will discard such > packets. > > This commit is intended to address the above issue. > > Such issue could be addressed also by defaulting all lwt geneve/vxlan > flows (comprising those tunneling over ipv4) to set the external UDP > checksum, but a single lwt tunnel device can be used to terminate flows > with different checksum settings. Hmm, right. The checksumming in tx path is solely controlled by TUNNEL_CSUM value in tun_flags and openvswitch does not set OVS_TUNNEL_KEY_ATTR_CSUM by default. This is another property that should have been per-interface and not per-packet in metadata based mode :-/ But we cannot change it now. We could probably provide a way for the ovs user space to control the rx checksumming flag but as the tx flag can be set per-flow, it would only lead to ovs allowing acceptance of zero rx checksums unconditionally anyway. I don't see a better way out of this now. Fortunately, this does not affect route based tunneling. Acked-by: Jiri Benc <jbenc@redhat.com> We should follow up in ovs (user space) to set OVS_TUNNEL_KEY_ATTR_CSUM by default, at least for IPv6. Jiri
On Thu, 11 Feb 2016 13:16:52 +0100, Jiri Benc wrote: > I don't see a better way out of this now. Fortunately, this does not > affect route based tunneling. > > Acked-by: Jiri Benc <jbenc@redhat.com> Oh, and this should go to net, not net-next. Jiri
On Thu, Feb 11, 2016 at 2:41 AM, Jiri Benc <jbenc@redhat.com> wrote: > On Wed, 10 Feb 2016 16:47:21 +0100, Paolo Abeni wrote: >> --- a/drivers/net/geneve.c >> +++ b/drivers/net/geneve.c >> @@ -1441,7 +1441,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name, >> return dev; >> >> err = geneve_configure(net, dev, &geneve_remote_unspec, >> - 0, 0, 0, htons(dst_port), true, 0); >> + 0, 0, 0, htons(dst_port), true, >> + GENEVE_F_UDP_ZERO_CSUM6_RX); >> if (err) { >> free_netdev(dev); >> return ERR_PTR(err); >> diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c >> index 1605691..d933cb8 100644 >> --- a/net/openvswitch/vport-vxlan.c >> +++ b/net/openvswitch/vport-vxlan.c >> @@ -90,7 +90,7 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms) >> int err; >> struct vxlan_config conf = { >> .no_share = true, >> - .flags = VXLAN_F_COLLECT_METADATA, >> + .flags = VXLAN_F_COLLECT_METADATA | VXLAN_F_UDP_ZERO_CSUM6_RX, > > I'm afraid this looks wrong, we should not default to zero UDP checksum > over IPv6. See RFC 2460, section 8.1: > > o Unlike IPv4, when UDP packets are originated by an IPv6 node, > the UDP checksum is not optional. That is, whenever > originating a UDP packet, an IPv6 node must compute a UDP > checksum over the packet and the pseudo-header, and, if that > computation yields a result of zero, it must be changed to hex > FFFF for placement in the UDP header. IPv6 receivers must > discard UDP packets containing a zero checksum, and should log > the error. > > One may argue that with tunneling, the situation is different but > that's the reason why we have the IPv6 checksum flag and why it has > opposite meaning to the IPv4 one. We shouldn't default to non-RFC > behavior by default. There's a bigger problem here, not really related to lightweight tunnels or OVS. The VXLAN RFC says (referring to the UDP checksum and not specific to IPv4/v6): "It SHOULD be transmitted as zero. When a packet is received with a UDP checksum of zero, it MUST be accepted for decapsulation." We can debate whether this is correct or whether it conflicts with RFC 2460 but this is what essentially everyone is going to implement. With the default settings of the flags in IPv6, we are violating both statements. With the second one in particular, the result is that Linux will not be able to communicate with any non-Linux VXLAN endpoint over IPv6 with default settings.
From: Jesse Gross <jesse@kernel.org> Date: Tue, 16 Feb 2016 10:22:38 -0800 > On Thu, Feb 11, 2016 at 2:41 AM, Jiri Benc <jbenc@redhat.com> wrote: > There's a bigger problem here, not really related to lightweight tunnels or OVS. > > The VXLAN RFC says (referring to the UDP checksum and not specific to IPv4/v6): > "It SHOULD be transmitted as zero. When a packet is received with a > UDP checksum of zero, it MUST be accepted for decapsulation." > > We can debate whether this is correct or whether it conflicts with RFC > 2460 but this is what essentially everyone is going to implement. With > the default settings of the flags in IPv6, we are violating both > statements. With the second one in particular, the result is that > Linux will not be able to communicate with any non-Linux VXLAN > endpoint over IPv6 with default settings. I do not see any such conflict here. It's a SHOULD, therefore a recommendation. Likely they thought this would improve performance, and ironically it has the opposite effect. The text of the VXLAN RFC does not say that the checksum MUST be sent as zero, and it also does not say that receiving a non-zero checksum is violating the RFC. I therefore do not see the interoperability issue. Maybe some deployed systems will run more slowly or hit a slot path (which is not our problem), but they absolutely should not drop such frames.
On Tue, Feb 16, 2016 at 11:47 AM, David Miller <davem@davemloft.net> wrote: > From: Jesse Gross <jesse@kernel.org> > Date: Tue, 16 Feb 2016 10:22:38 -0800 > >> On Thu, Feb 11, 2016 at 2:41 AM, Jiri Benc <jbenc@redhat.com> wrote: >> There's a bigger problem here, not really related to lightweight tunnels or OVS. >> >> The VXLAN RFC says (referring to the UDP checksum and not specific to IPv4/v6): >> "It SHOULD be transmitted as zero. When a packet is received with a >> UDP checksum of zero, it MUST be accepted for decapsulation." >> >> We can debate whether this is correct or whether it conflicts with RFC >> 2460 but this is what essentially everyone is going to implement. With >> the default settings of the flags in IPv6, we are violating both >> statements. With the second one in particular, the result is that >> Linux will not be able to communicate with any non-Linux VXLAN >> endpoint over IPv6 with default settings. > > I do not see any such conflict here. > > It's a SHOULD, therefore a recommendation. Likely they thought this > would improve performance, and ironically it has the opposite effect. > > The text of the VXLAN RFC does not say that the checksum MUST be sent > as zero, and it also does not say that receiving a non-zero checksum > is violating the RFC. > > I therefore do not see the interoperability issue. Maybe some > deployed systems will run more slowly or hit a slot path (which is not > our problem), but they absolutely should not drop such frames. "When a packet is received with a UDP checksum of zero, it MUST be accepted for decapsulation." This is a requirement and directly in conflict with having VXLAN_F_UDP_ZERO_CSUM6_RX set to false as the default. As far as the use of checksums, saying that it improves performance is only valid for a class of devices. It is not true as a blanket statement, such as for switching ASICs. In these cases, the performance drop in hitting a slow path is so big that it is equivalent to dropping frames and creates a de-facto interoperability issue.
From: Jesse Gross <jesse@kernel.org> Date: Tue, 16 Feb 2016 12:11:57 -0800 > On Tue, Feb 16, 2016 at 11:47 AM, David Miller <davem@davemloft.net> wrote: >> From: Jesse Gross <jesse@kernel.org> >> Date: Tue, 16 Feb 2016 10:22:38 -0800 >> >>> On Thu, Feb 11, 2016 at 2:41 AM, Jiri Benc <jbenc@redhat.com> wrote: >>> There's a bigger problem here, not really related to lightweight tunnels or OVS. >>> >>> The VXLAN RFC says (referring to the UDP checksum and not specific to IPv4/v6): >>> "It SHOULD be transmitted as zero. When a packet is received with a >>> UDP checksum of zero, it MUST be accepted for decapsulation." >>> >>> We can debate whether this is correct or whether it conflicts with RFC >>> 2460 but this is what essentially everyone is going to implement. With >>> the default settings of the flags in IPv6, we are violating both >>> statements. With the second one in particular, the result is that >>> Linux will not be able to communicate with any non-Linux VXLAN >>> endpoint over IPv6 with default settings. >> >> I do not see any such conflict here. >> >> It's a SHOULD, therefore a recommendation. Likely they thought this >> would improve performance, and ironically it has the opposite effect. >> >> The text of the VXLAN RFC does not say that the checksum MUST be sent >> as zero, and it also does not say that receiving a non-zero checksum >> is violating the RFC. >> >> I therefore do not see the interoperability issue. Maybe some >> deployed systems will run more slowly or hit a slot path (which is not >> our problem), but they absolutely should not drop such frames. > > "When a packet is received with a UDP checksum of zero, it MUST be > accepted for decapsulation." > > This is a requirement and directly in conflict with having > VXLAN_F_UDP_ZERO_CSUM6_RX set to false as the default. Oh yes, I'm mixing different parts of the conversation. We must accept on RX zero checksum fields even for ipv6 because of the way the VXLAN RFC is worded, correct.
From: David Miller <davem@davemloft.net> Date: Tue, 16 Feb 2016 15:40:04 -0500 (EST) > Oh yes, I'm mixing different parts of the conversation. We must > accept on RX zero checksum fields even for ipv6 because of the way the > VXLAN RFC is worded, correct. Paolo, can you please respin your patch against 'net'? I get rejects.
From: Tom Herbert <tom@herbertland.com> Date: Tue, 16 Feb 2016 12:50:23 -0800 > On Feb 16, 2016 12:40 PM, "David Miller" <davem@davemloft.net> wrote: >> >> From: Jesse Gross <jesse@kernel.org> >> Date: Tue, 16 Feb 2016 12:11:57 -0800 >> >> > On Tue, Feb 16, 2016 at 11:47 AM, David Miller <davem@davemloft.net> > wrote: >> >> From: Jesse Gross <jesse@kernel.org> >> >> Date: Tue, 16 Feb 2016 10:22:38 -0800 >> >> >> >>> On Thu, Feb 11, 2016 at 2:41 AM, Jiri Benc <jbenc@redhat.com> wrote: >> >>> There's a bigger problem here, not really related to lightweight > tunnels or OVS. >> >>> >> >>> The VXLAN RFC says (referring to the UDP checksum and not specific to > IPv4/v6): >> >>> "It SHOULD be transmitted as zero. When a packet is received with a >> >>> UDP checksum of zero, it MUST be accepted for decapsulation." >> >>> >> >>> We can debate whether this is correct or whether it conflicts with RFC >> >>> 2460 but this is what essentially everyone is going to implement. With >> >>> the default settings of the flags in IPv6, we are violating both >> >>> statements. With the second one in particular, the result is that >> >>> Linux will not be able to communicate with any non-Linux VXLAN >> >>> endpoint over IPv6 with default settings. >> >> >> >> I do not see any such conflict here. >> >> >> >> It's a SHOULD, therefore a recommendation. Likely they thought this >> >> would improve performance, and ironically it has the opposite effect. >> >> >> >> The text of the VXLAN RFC does not say that the checksum MUST be sent >> >> as zero, and it also does not say that receiving a non-zero checksum >> >> is violating the RFC. >> >> >> >> I therefore do not see the interoperability issue. Maybe some >> >> deployed systems will run more slowly or hit a slot path (which is not >> >> our problem), but they absolutely should not drop such frames. >> > >> > "When a packet is received with a UDP checksum of zero, it MUST be >> > accepted for decapsulation." >> > >> > This is a requirement and directly in conflict with having >> > VXLAN_F_UDP_ZERO_CSUM6_RX set to false as the default. >> >> Oh yes, I'm mixing different parts of the conversation. We must >> accept on RX zero checksum fields even for ipv6 because of the way the >> VXLAN RFC is worded, correct. > > That MUST conflicts directly with RFC2460 (zero UDP csums must be dropped). > We allow configuring to accept zero checksums per Rfc6935 and rfc6936. So > there is no interoperability issue and by default we maintain IPv6 protocol > compliance. And practically speaking we disappear from the internet for VXLAN tunnel endpoints implementing the VXLAN spec properly. That's not going to help anyone at all.
From: Tom Herbert <tom@herbertland.com> Date: Tue, 16 Feb 2016 13:31:34 -0800 > On Feb 16, 2016 12:53 PM, "David Miller" <davem@davemloft.net> wrote: >> And practically speaking we disappear from the internet for VXLAN tunnel >> endpoints implementing the VXLAN spec properly. >> >> That's not going to help anyone at all. > > A lot of effort has gone into defining the constraints for using zero udp6 > csum (look at Rfc6935 and rfc6936). Just because Cisco unilaterally decided > the csum is a nuisance, doesn't mean it can be ignored. No udp6 csum > increases chance of misdelivery, which in multitenant nv is really bad. People who review RFCs had ample opportunity to review the VXLAN RFC and find this issue and fix it. It slipped through and now we are stuck with it, just like we are stuck with bad user facing kernel APIs. Objecting now and voicing our objection by breaking interoperability serves no useful purpose whatsoever.
On Tue, 2016-02-16 at 15:40 -0500, David Miller wrote: > > Oh yes, I'm mixing different parts of the conversation. We must > > accept on RX zero checksum fields even for ipv6 because of the way the > > VXLAN RFC is worded, correct. > > Paolo, can you please respin your patch against 'net'? I get rejects. Sure. I'll resubmit soon, Paolo
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 0b14ac3..d8d8f33 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1441,7 +1441,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name, return dev; err = geneve_configure(net, dev, &geneve_remote_unspec, - 0, 0, 0, htons(dst_port), true, 0); + 0, 0, 0, htons(dst_port), true, + GENEVE_F_UDP_ZERO_CSUM6_RX); if (err) { free_netdev(dev); return ERR_PTR(err); diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c index 1605691..d933cb8 100644 --- a/net/openvswitch/vport-vxlan.c +++ b/net/openvswitch/vport-vxlan.c @@ -90,7 +90,7 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms) int err; struct vxlan_config conf = { .no_share = true, - .flags = VXLAN_F_COLLECT_METADATA, + .flags = VXLAN_F_COLLECT_METADATA | VXLAN_F_UDP_ZERO_CSUM6_RX, }; if (!options) {
the commit 35e2d1152b22 ("tunnels: Allow IPv6 UDP checksums to be correctly controlled.") changed the default xmit checksum setting for lwt vxlan/geneve ipv6 tunnels, so that now the checksum is not set into external UDP header. This commit changes the rx checksum setting for both lwt vxlan/geneve devices created by openvswitch accordingly, so that lwt over ipv6 tunnel pairs are again able to communicate with default values. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- drivers/net/geneve.c | 3 ++- net/openvswitch/vport-vxlan.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-)