Message ID | lsq.1461711744.699003961@decadent.org.uk |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
On 04/26/2016 04:02 PM, Ben Hutchings wrote: > 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html Thanks, Ben > > ------------------ > > From: Vijay Pandurangan <vijayp@vijayp.ca> > > [ Upstream commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f ] > > Packets that arrive from real hardware devices have ip_summed == > CHECKSUM_UNNECESSARY if the hardware verified the checksums, or > CHECKSUM_NONE if the packet is bad or it was unable to verify it. The > current version of veth will replace CHECKSUM_NONE with > CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to > a veth device to be delivered to the application. This caused applications > at Twitter to receive corrupt data when network hardware was corrupting > packets. > > We believe this was added as an optimization to skip computing and > verifying checksums for communication between containers. However, locally > generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as > written does nothing for them. As far as we can tell, after removing this > code, these packets are transmitted from one stack to another unmodified > (tcpdump shows invalid checksums on both sides, as expected), and they are > delivered correctly to applications. We didn’t test every possible network > configuration, but we tried a few common ones such as bridging containers, > using NAT between the host and a container, and routing from hardware > devices to containers. We have effectively deployed this in production at > Twitter (by disabling RX checksum offloading on veth devices). > > This code dates back to the first version of the driver, commit > <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I > suspect this bug occurred mostly because the driver API has evolved > significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix > packet checksumming") (in December 2010) fixed this for packets that get > created locally and sent to hardware devices, by not changing > CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming > in from hardware devices. > > Co-authored-by: Evan Jones <ej@evanjones.ca> > Signed-off-by: Evan Jones <ej@evanjones.ca> > Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> > Cc: Phil Sutter <phil@nwl.cc> > Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> > Cc: netdev@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca> > Acked-by: Cong Wang <cwang@twopensource.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > [bwh: Backported to 3.2: adjust context] > Signed-off-by: Ben Hutchings <ben@decadent.org.uk> > --- > drivers/net/veth.c | 6 ------ > 1 file changed, 6 deletions(-) > > --- a/drivers/net/veth.c > +++ b/drivers/net/veth.c > @@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b > stats = this_cpu_ptr(priv->stats); > rcv_stats = this_cpu_ptr(rcv_priv->stats); > > - /* don't change ip_summed == CHECKSUM_PARTIAL, as that > - will cause bad checksum on forwarded packets */ > - if (skb->ip_summed == CHECKSUM_NONE && > - rcv->features & NETIF_F_RXCSUM) > - skb->ip_summed = CHECKSUM_UNNECESSARY; > > length = skb->len; > if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS) >
On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote: > On 04/26/2016 04:02 PM, Ben Hutchings wrote: > > > > 3.2.80-rc1 review patch. If anyone has any objections, please let me know. > I would be careful about this. It causes regressions when sending > PACKET_SOCKET buffers from user-space to veth devices. > > There was a proposed upstream fix for the regression, but it has not gone > into the tree as far as I know. > > http://www.spinics.net/lists/netdev/msg370436.html [...] OK, I'll drop this for now. Ben.
Hi Ben, On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote: > On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote: > > On 04/26/2016 04:02 PM, Ben Hutchings wrote: > > > > > > 3.2.80-rc1 review patch. If anyone has any objections, please let me know. > > I would be careful about this. It causes regressions when sending > > PACKET_SOCKET buffers from user-space to veth devices. > > > > There was a proposed upstream fix for the regression, but it has not gone > > into the tree as far as I know. > > > > http://www.spinics.net/lists/netdev/msg370436.html > [...] > > OK, I'll drop this for now. The fall out from not having this patch is in my opinion a bigger fallout than not having this patch. This patch fixes silent data corruption vs. the problem Ben Greear is talking about, which might not be that a common usage. What do others think? Bye, Hannes
--- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b stats = this_cpu_ptr(priv->stats); rcv_stats = this_cpu_ptr(rcv_priv->stats); - /* don't change ip_summed == CHECKSUM_PARTIAL, as that - will cause bad checksum on forwarded packets */ - if (skb->ip_summed == CHECKSUM_NONE && - rcv->features & NETIF_F_RXCSUM) - skb->ip_summed = CHECKSUM_UNNECESSARY; length = skb->len; if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS)