Patchwork tun: fix BUG with large packets without TUN_VNET_HDR

login
register
mail settings
Submitter Michael S. Tsirkin
Date April 16, 2009, 8:45 p.m.
Message ID <20090416204556.GA5846@dhcp-1-124.tlv.redhat.com>
Download mbox | patch
Permalink /patch/26091/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Michael S. Tsirkin - April 16, 2009, 8:45 p.m.
On a tap device, the linear part of skb must be at least ETH_HLEN,
otherwise eth_type_trans triggers BUG_ON in skb_pull(skb, ETH_HLEN).
This patch makes sure the linear part is always large enough.

Without the patch, tun sets the linear part size to 0 if TUN_VNET_HDR
is not set and the packet is too large to put in a linear skb,
which causes BUGs for me.

BTW, it seems that this behaviour has been there for a long while, since at
least v2.6.27 (commit f42157cb568c1eb02eca7df4da67553a9edae24a: tun: fallback
if skb_alloc() fails on big packets), but started triggering for me only in
v2.6.30-rc1, because of commit 33dccbb050bbe35b88ca8cf1228dcf3e4d4b3554 (tun:
Limit amount of queued packets per device), which made all large packets
non-linear.  Before that, most packets would still typically be linear until
memory gets fragmented, which hides the issue.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

This patch is against 2.6.30-rc2.
Here's the crash I get without this patch on 2.6.30-rc2.

[   26.882288] ------------[ cut here ]------------                                                 
[   26.883011] kernel BUG at include/linux/skbuff.h:1127!                                           
[   26.883011] invalid opcode: 0000 [#1] SMP                                                        
[   26.883011] last sysfs file: /sys/kernel/uevent_seqnum                                           
[   26.883011] CPU 0                                                                                
[   26.883011] Modules linked in: tun virtio_balloon virtio_pci virtio_ring virtio 8139cp piix ide_core aacraid [last unloaded: scsi_wait_scan]                                                         
[   26.883011] Pid: 2717, comm: a.out Not tainted 2.6.30-rc2 #6                                     
[   26.883011] RIP: 0010:[<ffffffff813b00b7>]  [<ffffffff813b00b7>] skb_pull+0x19/0x2f              
[   26.883011] RSP: 0018:ffff88003c807d48  EFLAGS: 00010283                                         
[   26.883011] RAX: 000000000000270e RBX: ffff88003e550f00 RCX: 0000000000000000                    
[   26.883011] RDX: 000000000000271c RSI: 000000000000000e RDI: ffff88003e550f00                    
[   26.883011] RBP: ffff88003c807d48 R08: ffff88003e845000 R09: ffff88003f801040                    
[   26.883011] R10: 000000000003a15b R11: 0000000000000001 R12: ffff88003dc94000                    
[   26.883011] R13: 0000000000000000 R14: 000000000000271c R15: ffff88003dc94600                    
[   26.883011] FS:  0000000000000000(0000) GS:ffff880002a00000(0063) knlGS:00000000f7d7aae0         
[   26.883011] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b                                    
[   26.883011] CR2: 00000000f7e4c560 CR3: 000000003c800000 CR4: 00000000000006a0                    
[   26.883011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000                    
[   26.883011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400                    
[   26.883011] Process a.out (pid: 2717, threadinfo ffff88003c806000, task ffff88003f272ca0)        
[   26.883011] Stack:                                                                               
[   26.883011]  ffff88003c807d68 ffffffff813c9943 0000000000000002 ffff88003e550f00
[   26.883011]  ffff88003c807dd8 ffffffffa004f43b ffff88003c807ed8 000000000000271c
[   26.883011]  0000000000000000 ffff88003e2a0000 0000000000080000 0000000000000000
[   26.883011] Call Trace:
[   26.883011]  [<ffffffff813c9943>] eth_type_trans+0x2e/0xaf
[   26.883011]  [<ffffffffa004f43b>] tun_chr_aio_write+0x298/0x3fd [tun]
[   26.883011]  [<ffffffff810be16e>] do_sync_write+0xe7/0x12d
[   26.883011]  [<ffffffff8104e4fc>] ? autoremove_wake_function+0x0/0x38
[   26.883011]  [<ffffffff810f19fc>] ? dev_ifsioc+0x1bd/0x23b
[   26.883011]  [<ffffffff8117f57a>] ? selinux_file_permission+0x53/0x58
[   26.883011]  [<ffffffff8117907c>] ? security_file_permission+0x11/0x13
[   26.883011]  [<ffffffff810beb94>] vfs_write+0xab/0x105
[   26.883011]  [<ffffffff810becb2>] sys_write+0x47/0x6f
[   26.883011]  [<ffffffff8102c0c8>] sysenter_dispatch+0x7/0x27
[   26.883011] Code: 04 44 01 47 68 44 01 47 6c 44 01 87 d8 00 00 00 c9 c3 8b 57 68 55 31 c0 48 89 e5 39 d6 77 20 89 d0 29 f0 3b 47 6c 89 47 68 73 04 <0f> 0b eb fe 89 f0 48 03 87 d0 00 00 00 48 89 87d0 00 00 00 c9
[   26.883011] RIP  [<ffffffff813b00b7>] skb_pull+0x19/0x2f
[   26.883011]  RSP <ffff88003c807d48>
[   27.018727] ---[ end trace 8df47a38a245b011 ]---

 drivers/net/tun.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)
Herbert Xu - April 16, 2009, 11:50 p.m.
On Thu, Apr 16, 2009 at 11:45:57PM +0300, Michael S. Tsirkin wrote:
> On a tap device, the linear part of skb must be at least ETH_HLEN,
> otherwise eth_type_trans triggers BUG_ON in skb_pull(skb, ETH_HLEN).
> This patch makes sure the linear part is always large enough.
> 
> Without the patch, tun sets the linear part size to 0 if TUN_VNET_HDR
> is not set and the packet is too large to put in a linear skb,
> which causes BUGs for me.
> 
> BTW, it seems that this behaviour has been there for a long while, since at
> least v2.6.27 (commit f42157cb568c1eb02eca7df4da67553a9edae24a: tun: fallback
> if skb_alloc() fails on big packets), but started triggering for me only in
> v2.6.30-rc1, because of commit 33dccbb050bbe35b88ca8cf1228dcf3e4d4b3554 (tun:
> Limit amount of queued packets per device), which made all large packets
> non-linear.  Before that, most packets would still typically be linear until
> memory gets fragmented, which hides the issue.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

I'd already fixed that a couple of days ago :)

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7cfe3d1..965f6f2 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -547,6 +549,7 @@  static __inline__ ssize_t tun_get_user(struct tun_struct *tun,
 	struct sk_buff *skb;
 	size_t len = count, align = 0;
 	struct virtio_net_hdr gso = { 0 };
+	size_t linear;
 
 	if (!(tun->flags & TUN_NO_PI)) {
 		if ((len -= sizeof(pi)) > count)
@@ -567,13 +570,16 @@  static __inline__ ssize_t tun_get_user(struct tun_struct *tun,
 			return -EINVAL;
 	}
 
+	linear = gso.hdr_len;
 	if ((tun->flags & TUN_TYPE_MASK) == TUN_TAP_DEV) {
 		align = NET_IP_ALIGN;
 		if (unlikely(len < ETH_HLEN))
 			return -EINVAL;
+		if (linear < ETH_HLEN)
+			linear = ETH_HLEN;
 	}
 
-	skb = tun_alloc_skb(tun, align, len, gso.hdr_len, noblock);
+	skb = tun_alloc_skb(tun, align, len, linear, noblock);
 	if (IS_ERR(skb)) {
 		if (PTR_ERR(skb) != -EAGAIN)
 			tun->dev->stats.rx_dropped++;