diff mbox

[Precise,SRO] =?UTF-8?q?veth:=20don=E2=80=99t=20modify=20ip=5Fsummed; ?= doing so treats packets with bad checksums as good.

Message ID 1455832046-14284-1-git-send-email-tim.gardner@canonical.com
State New
Headers show

Commit Message

Tim Gardner Feb. 18, 2016, 9:47 p.m. UTC
From: Vijay Pandurangan <vijayp@vijayp.ca>

BugLink: http://bugs.launchpad.net/bugs/1547207

Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
<e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones <ej@evanjones.ca>
Signed-off-by: Evan Jones <ej@evanjones.ca>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 drivers/net/veth.c | 6 ------
 1 file changed, 6 deletions(-)

Comments

Kamal Mostafa Feb. 18, 2016, 9:52 p.m. UTC | #1
Ack for Precise.

Note: this has already been applied to all Ubuntu kernels >= trusty,
and their associated stables.
Stefan Bader Feb. 19, 2016, 1:08 p.m. UTC | #2
The risk should be limited and also already applied to kernels past P.

Hm, is SRO a new thing of just fat fingers...?

-Stefan
Tim Gardner Feb. 19, 2016, 1:13 p.m. UTC | #3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 02/19/2016 06:08 AM, Stefan Bader wrote:
> The risk should be limited and also already applied to kernels past
> P.
> 
> Hm, is SRO a new thing of just fat fingers...?
> 
> -Stefan
> 
> 
> 

SRO -> SRU

- -- 
Tim Gardner tim.gardner@canonical.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCgAGBQJWxxT2AAoJED12yEX6FEfKpncP/iePwEqk/uccGHJ3EZysak92
JhZ6Z2TbNuVi35gJi3Gb3sD/5m3gDBjx/suo/NLtrCrLsK+XwTPJ2ZlbalaGF/vR
yv4W0dEDZ0X5ljqL7Gjs4QICyB7WvpJnXiK468ecDCxTAN6qIx+Wo5wjAiKQryy4
KtRSFkRPLM8XsZjt8M9bUMVcqjTQ9zGfb1bsB+TUd0aEiQiZFWpwX5q/gRmL2s7W
hKUurquDp15nw2yT1rYTNzWByOIRp1VjdEANAooYB5JLykRpxI2s/RySmTU562H5
BoMhUxKtZc68nrHKuN7GUGs5hY0gaknptFDX3aMwty4XXk5WJPiQueUcyNHTCuNX
zef1snrH/onPIrd5Ja5x86h1Fv14O89H6aFeGDl8aYiAkFuRjBnOWdJJlU9VQndx
CPGd38ytfTt569DnaAVG2lNruikC9x/NxZGXJ9M8+31PeYOAxTGpujdWjlKDbMSZ
1BSwjmIltWpDp0IbvIA3kHPD+Tn2JB2LL0eHv02XHrMxQJlCm1RRnO8T2vVJGaXY
IlMbtw9x9E2jFCAlJkGQN6dlmrzf8pCgdeDWojZslP9PMSORbeZQ4f/QwrSkHrWz
Z9GgYCUB9GErIQt6v63bbwUkmgaKeVXTlpiu2iS33tHG3wUMrJfph+z4deRRN2ca
Kz6dQ8QSowiCnMLHhRS9
=VLEp
-----END PGP SIGNATURE-----
Brad Figg Feb. 19, 2016, 10:14 p.m. UTC | #4
Applied to the master-next branch of Precise
diff mbox

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 3634032..9de9dc3 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -117,12 +117,6 @@  static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 		kfree_skb(skb);
 		goto drop;
 	}
-	/* don't change ip_summed == CHECKSUM_PARTIAL, as that
-	 * will cause bad checksum on forwarded packets
-	 */
-	if (skb->ip_summed == CHECKSUM_NONE &&
-	    rcv->features & NETIF_F_RXCSUM)
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
 
 	if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
 		struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);