[net] veth: Orphan skb before GRO

Message ID 1536899624-2438-1-git-send-email-makita.toshiaki@lab.ntt.co.jp
State Accepted
Delegated to: David Miller

Commit Message

Toshiaki Makita Sept. 14, 2018, 4:33 a.m.
GRO expects skbs not to be owned by sockets, but with XDP enabled veth
passed socket-owned skbs to GRO, which corrupted sk_wmem_alloc.

Paolo Abeni reported the following splat:

[  362.098904] refcount_t overflow at skb_set_owner_w+0x5e/0xa0 in iperf3[1644], uid/euid: 0/0
[  362.108239] WARNING: CPU: 0 PID: 1644 at kernel/panic.c:648 refcount_error_report+0xa0/0xa4
[  362.117547] Modules linked in: tcp_diag inet_diag veth intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf ipmi_ssif iTCO_wdt sg ipmi_si iTCO_vendor_support ipmi_devintf mxm_wmi ipmi_msghandler pcspkr dcdbas mei_me wmi mei lpc_ich acpi_power_meter pcc_cpufreq xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe igb ttm ahci mdio libahci ptp crc32c_intel drm pps_core libata i2c_algo_bit dca dm_mirror dm_region_hash dm_log dm_mod
[  362.176622] CPU: 0 PID: 1644 Comm: iperf3 Not tainted 4.19.0-rc2.vanilla+ #2025
[  362.184777] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
[  362.193124] RIP: 0010:refcount_error_report+0xa0/0xa4
[  362.198758] Code: 08 00 00 48 8b 95 80 00 00 00 49 8d 8c 24 80 0a 00 00 41 89 c1 44 89 2c 24 48 89 de 48 c7 c7 18 4d e7 9d 31 c0 e8 30 fa ff ff <0f> 0b eb 88 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 49 89 fc
[  362.219711] RSP: 0018:ffff9ee6ff603c20 EFLAGS: 00010282
[  362.225538] RAX: 0000000000000000 RBX: ffffffff9de83e10 RCX: 0000000000000000
[  362.233497] RDX: 0000000000000001 RSI: ffff9ee6ff6167d8 RDI: ffff9ee6ff6167d8
[  362.241457] RBP: ffff9ee6ff603d78 R08: 0000000000000490 R09: 0000000000000004
[  362.249416] R10: 0000000000000000 R11: ffff9ee6ff603990 R12: ffff9ee664b94500
[  362.257377] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff9de615f9
[  362.265337] FS:  00007f1d22d28740(0000) GS:ffff9ee6ff600000(0000) knlGS:0000000000000000
[  362.274363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  362.280773] CR2: 00007f1d222f35d0 CR3: 0000001fddfec003 CR4: 00000000001606f0
[  362.288733] Call Trace:
[  362.291459]  <IRQ>
[  362.293702]  ex_handler_refcount+0x4e/0x80
[  362.298269]  fixup_exception+0x35/0x40
[  362.302451]  do_trap+0x109/0x150
[  362.306048]  do_error_trap+0xd5/0x130
[  362.315766]  invalid_op+0x14/0x20
[  362.319460] RIP: 0010:skb_set_owner_w+0x5e/0xa0
[  362.324512] Code: ef ff ff 74 49 48 c7 43 60 20 7b 4a 9d 8b 85 f4 01 00 00 85 c0 75 16 8b 83 e0 00 00 00 f0 01 85 44 01 00 00 0f 88 d8 23 16 00 <5b> 5d c3 80 8b 91 00 00 00 01 8b 85 f4 01 00 00 89 83 a4 00 00 00
[  362.345465] RSP: 0018:ffff9ee6ff603e20 EFLAGS: 00010a86
[  362.351291] RAX: 0000000000001100 RBX: ffff9ee65deec700 RCX: ffff9ee65e829244
[  362.359250] RDX: 0000000000000100 RSI: ffff9ee65e829100 RDI: ffff9ee65deec700
[  362.367210] RBP: ffff9ee65e829100 R08: 000000000002a380 R09: 0000000000000000
[  362.375169] R10: 0000000000000002 R11: fffff1a4bf77bb00 R12: ffffc0754661d000
[  362.383130] R13: ffff9ee65deec200 R14: ffff9ee65f597000 R15: 00000000000000aa
[  362.391092]  veth_xdp_rcv+0x4e4/0x890 [veth]
[  362.399357]  veth_poll+0x4d/0x17a [veth]
[  362.403731]  net_rx_action+0x2af/0x3f0
[  362.407912]  __do_softirq+0xdd/0x29e
[  362.411897]  do_softirq_own_stack+0x2a/0x40
[  362.416561]  </IRQ>
[  362.418899]  do_softirq+0x4b/0x70
[  362.422594]  __local_bh_enable_ip+0x50/0x60
[  362.427258]  ip_finish_output2+0x16a/0x390
[  362.431824]  ip_output+0x71/0xe0
[  362.440670]  __tcp_transmit_skb+0x583/0xab0
[  362.445333]  tcp_write_xmit+0x247/0xfb0
[  362.449609]  __tcp_push_pending_frames+0x2d/0xd0
[  362.454760]  tcp_sendmsg_locked+0x857/0xd30
[  362.459424]  tcp_sendmsg+0x27/0x40
[  362.463216]  sock_sendmsg+0x36/0x50
[  362.467104]  sock_write_iter+0x87/0x100
[  362.471382]  __vfs_write+0x112/0x1a0
[  362.475369]  vfs_write+0xad/0x1a0
[  362.479062]  ksys_write+0x52/0xc0
[  362.482759]  do_syscall_64+0x5b/0x180
[  362.486841]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  362.492473] RIP: 0033:0x7f1d22293238
[  362.496458] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 c5 54 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  362.517409] RSP: 002b:00007ffebaef8008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  362.525855] RAX: ffffffffffffffda RBX: 0000000000002800 RCX: 00007f1d22293238
[  362.533816] RDX: 0000000000002800 RSI: 00007f1d22d36000 RDI: 0000000000000005
[  362.541775] RBP: 00007f1d22d36000 R08: 00000002db777a30 R09: 0000562b70712b20
[  362.549734] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
[  362.557693] R13: 0000000000002800 R14: 00007ffebaef8060 R15: 0000562b70712260

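To avoid this, orphan the skb before it enters GRO. With the skb orphaned
at entry, the skb_set_owner_w() call on the reallocated skb would never
run, so remove it.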

Fixes: 948d4f214fde ("veth: Add driver XDP")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
 drivers/net/veth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Paolo Abeni Sept. 14, 2018, 2:16 p.m. | #1
On Fri, 2018-09-14 at 13:33 +0900, Toshiaki Makita wrote:
> GRO expects skbs not to be owned by sockets, but when XDP is enabled veth
> passed skbs owned by sockets. It caused corrupted sk_wmem_alloc.
> 
> Paolo Abeni reported the following splat:
> 
> ...
> 
> In order to avoid this, orphan the skb before entering GRO.
> 
> Fixes: 948d4f214fde ("veth: Add driver XDP")
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

I just gave it a run in my test environment, and it fixes the reported
issue.

Tested-by: Paolo Abeni <pabeni@redhat.com>

David Miller Sept. 16, 2018, 10:34 p.m. | #2
From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Date: Fri, 14 Sep 2018 13:33:44 +0900

> GRO expects skbs not to be owned by sockets, but when XDP is enabled veth
> passed skbs owned by sockets. It caused corrupted sk_wmem_alloc.
> 
> Paolo Abeni reported the following splat:
 ...
> In order to avoid this, orphan the skb before entering GRO.
> 
> Fixes: 948d4f214fde ("veth: Add driver XDP")
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

Applied, thanks.

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 8d679c8..41a00cd 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -463,6 +463,8 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
 	int mac_len, delta, off;
 	struct xdp_buff xdp;
 
+	skb_orphan(skb);
+
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (unlikely(!xdp_prog)) {
@@ -508,8 +510,6 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
 		skb_copy_header(nskb, skb);
 		head_off = skb_headroom(nskb) - skb_headroom(skb);
 		skb_headers_offset_update(nskb, head_off);
-		if (skb->sk)
-			skb_set_owner_w(nskb, skb->sk);
 		consume_skb(skb);
 		skb = nskb;
 	}