[net,0/2] Fix vlan untag and insertion for bridge and vlan with reorder_hdr off

Message ID 1520920288-2483-1-git-send-email-makita.toshiaki@lab.ntt.co.jp

Message

Toshiaki Makita March 13, 2018, 5:51 a.m. UTC
As Brandon Carpenter reported[1], sending non-vlan-offloaded packets from
bridge devices ends up with corrupted packets. He narrowed down this problem
and found that the root cause is in skb_reorder_vlan_header().

While I was working on fixing this problem, I found that the function does
not work properly for double tagged packets with reorder_hdr off as well.

Patch 1 fixes these 2 problems in skb_reorder_vlan_header().

While testing the fix, I found that fixing skb_reorder_vlan_header() alone
is not sufficient to receive double tagged packets with reorder_hdr off.
Vlan tags got out of order when vlan devices with reorder_hdr disabled
were stacked. Patch 2 fixes this problem.

[1] https://www.spinics.net/lists/linux-ethernet-bridging/msg07039.html

Toshiaki Makita (2):
  net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
  vlan: Fix out of order vlan headers with reorder header off

 include/linux/if_vlan.h       | 66 +++++++++++++++++++++++++++++++++++--------
 include/uapi/linux/if_ether.h |  1 +
 net/8021q/vlan_core.c         |  4 +--
 net/core/skbuff.c             |  7 +++--
 4 files changed, 63 insertions(+), 15 deletions(-)

Comments

David Miller March 16, 2018, 2:05 p.m. UTC | #1
From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Date: Tue, 13 Mar 2018 14:51:26 +0900

Series applied and queued up for -stable, thanks.

I was thinking of pushing back on the addition of the ETH_TLEN UAPI visible
macro, because I don't see any other system providing that define.  But in
the end I decided that it's harmless and really that header file is the
correct location for such a definition.

Thank you.
Eric Dumazet March 28, 2018, 6:11 a.m. UTC | #2
syzbot reported some crashes caused by a memmove(..., ..., count=-2)

So something needs to be refined I guess.
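
To illustrate why a count of -2 is fatal, here is a minimal user-space sketch.
It assumes the length passed to memmove() is computed as
mac_len - VLAN_HLEN - ETH_TLEN and that mac_len was 4 for the offending
packet; both are assumptions made for illustration, and the helper name
memmove_count_sketch() is hypothetical. The constants are defined locally
rather than taken from kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_HLEN 4 /* bytes in one 802.1Q tag */
#define ETH_TLEN  2 /* bytes in the ethernet type field */

/* Hypothetical helper: the byte count memmove() would receive,
 * assuming it is derived from mac_len as described above. */
static size_t memmove_count_sketch(int mac_len)
{
	assert(mac_len >= 0);
	/* The subtraction is done in int and may go negative; the cast
	 * mirrors the implicit conversion to memmove()'s size_t argument. */
	return (size_t)(mac_len - VLAN_HLEN - ETH_TLEN);
}
```

With a full ethernet header plus one tag (mac_len of 18) the result is 12,
the two MAC addresses. With mac_len of 4 the int result is -2, which wraps to
SIZE_MAX - 1 after conversion; on a 64-bit machine that is the
fffffffffffffffe visible in RDX in the trace below.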

BUG: unable to handle kernel paging request at ffff8801cccb8000
IP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
PGD 9cee067 P4D 9cee067 PUD 1d9401063 PMD 1cccb7063 PTE 2810100028101
Oops: 000b [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 17663 Comm: syz-executor2 Not tainted 4.16.0-rc7+ #368
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
RSP: 0018:ffff8801cc046e28 EFLAGS: 00010287
RAX: ffff8801ccc244c4 RBX: fffffffffffffffe RCX: fffffffffff6c4c2
RDX: fffffffffffffffe RSI: ffff8801cccb7ffc RDI: ffff8801cccb8000
RBP: ffff8801cc046e48 R08: ffff8801ccc244be R09: ffffed0039984899
R10: 0000000000000001 R11: ffffed0039984898 R12: ffff8801ccc244c4
R13: ffff8801ccc244c0 R14: ffff8801d96b7c06 R15: ffff8801d96b7b40
FS:  00007febd562d700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8801cccb8000 CR3: 00000001ccb2f006 CR4: 00000000001606e0
DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
 memmove include/linux/string.h:360 [inline]
 skb_reorder_vlan_header net/core/skbuff.c:5031 [inline]
 skb_vlan_untag+0x470/0xc40 net/core/skbuff.c:5061
 __netif_receive_skb_core+0x119c/0x3460 net/core/dev.c:4460
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4627
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4701
 netif_receive_skb+0xae/0x390 net/core/dev.c:4725
 tun_rx_batched.isra.50+0x5ee/0x870 drivers/net/tun.c:1555
 tun_get_user+0x299e/0x3c20 drivers/net/tun.c:1962
 tun_chr_write_iter+0xb9/0x160 drivers/net/tun.c:1990
 call_write_iter include/linux/fs.h:1782 [inline]
 new_sync_write fs/read_write.c:469 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:482
 vfs_write+0x189/0x510 fs/read_write.c:544
 SYSC_write fs/read_write.c:589 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:581
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x454879
RSP: 002b:00007febd562cc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007febd562d6d4 RCX: 0000000000454879
RDX: 0000000000000157 RSI: 0000000020000180 RDI: 0000000000000014
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000006b0 R14: 00000000006fc120 R15: 0000000000000000
Code: 90 90 90 90 90 90 90 48 89 f8 48 83 fa 20 0f 82 03 01 00 00 48 39 fe 7d 0f 49 89 f0 49 01 d0 49 39 f8 0f 8f 9f 00 00 00 48 89 d1 <f3> a4 c3 48 81 fa a8 02 00 00 72 05 40 38 fe 74 3b 48 83 ea 20 
RIP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP: ffff8801cc046e28
CR2: ffff8801cccb8000
---[ end trace b21c0866ee797d6d ]---
BUG: unable to handle kernel
Toshiaki Makita March 28, 2018, 8:03 a.m. UTC | #3
On 2018/03/28 15:11, Eric Dumazet wrote:
> syzbot reported some crashes caused by a memmove(..., ..., count=-2)

Thank you for your report.
Interesting: a tun device can produce vlan packets that do not have an
ethernet header.

> 
> So something needs to be refined I guess.

Probably we can skip memmove() for such packets, since there is no
header to copy.
I'll make a fix. Thanks.
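
The idea of skipping memmove() for such packets could look roughly like the
user-space sketch below. This is not the actual fix; the function name
reorder_vlan_header_sketch(), its parameters, and the exact threshold are
assumptions chosen for illustration, and the constants are defined locally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VLAN_HLEN 4 /* bytes in one 802.1Q tag */
#define ETH_TLEN  2 /* bytes in the ethernet type field */

/* Hypothetical helper. mac points at the start of the MAC header and
 * mac_len is the number of header bytes in front of the payload,
 * including the vlan tag being removed. Returns the bytes moved. */
static size_t reorder_vlan_header_sketch(unsigned char *mac, int mac_len)
{
	size_t n;

	assert(mac != NULL);

	/* Nothing to copy when the tag and type field are all there is:
	 * skip the move instead of passing a negative length. */
	if (mac_len <= VLAN_HLEN + ETH_TLEN)
		return 0;

	/* Slide the MAC addresses forward over the removed tag. */
	n = (size_t)(mac_len - VLAN_HLEN - ETH_TLEN);
	memmove(mac + VLAN_HLEN, mac, n);
	return n;
}
```

For a normal tagged frame (mac_len of 18) the 12 address bytes are moved as
before; for a packet with no ethernet header (mac_len of 4) the buffer is left
untouched.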
