Message ID | 2c692688-ea77-1e80-6607-6577b157873a@advantech-bb.cz |
---|---|
State | RFC |
Delegated to: | David Miller |
Series | Bug in br_handle_frame |
On 15/02/2019 16:03, Tomas Paukrt wrote:
> Hi,
>
> I have recently discovered that kernel 3.12.10 is occasionally crashing
> due to NULL pointer dereference in function br_handle_frame when we
> reconfigure the bridge, because function br_port_get_rcu returns NULL.
>
> It is very hard for us to replicate this issue, because it happens about
> once per month in our testing environment, but I have created the
> attached patch. Can you please check it? The latest kernel seems to be
> affected too.
>
> Best regards
>
> Tomas
>

Hi,
That should not be possible, br_port_get_rcu() is a wrapper for
dev->rx_handler_data which in turn should always be present in case
rx_handler is called as can be seen in netdev_rx_handler_unregister().
Could you please share details about the crash and possibly a trace ?
Do you have any custom patches applied ?

Thanks,
 Nik

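For readers following the argument, here is a minimal user-space sketch of the invariant Nik describes. It is not kernel code: `model_dev`, `model_register`, `model_unregister`, `model_receive` and `model_port_get` are hypothetical names standing in for the `rx_handler` / `rx_handler_data` pair and the helpers mentioned above. The model is single-threaded and ignores RCU entirely, so it only shows why a handler that runs would normally be expected to find its data.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only the two fields
 * discussed in the thread are modelled. */
struct model_dev {
	int (*rx_handler)(struct model_dev *dev);
	void *rx_handler_data;
};

/* Plays the role of br_port_get_rcu(): simply returns the stored data. */
static void *model_port_get(const struct model_dev *dev)
{
	return dev->rx_handler_data;
}

/* Attach a handler and its data; fails if a handler is already present. */
static int model_register(struct model_dev *dev,
			  int (*handler)(struct model_dev *dev), void *data)
{
	if (dev->rx_handler)
		return -1;
	dev->rx_handler_data = data;	/* data becomes visible first */
	dev->rx_handler = handler;	/* then the handler */
	return 0;
}

/* Detach in the opposite order: hide the handler, then drop the data. */
static void model_unregister(struct model_dev *dev)
{
	dev->rx_handler = NULL;
	dev->rx_handler_data = NULL;
}

/* Example handler: returns 1 if it found its data, 0 otherwise. */
static int model_handler(struct model_dev *dev)
{
	return model_port_get(dev) != NULL;
}

/* Receive path: the handler is only invoked while one is registered.
 * Returns -1 when no handler is attached. */
static int model_receive(struct model_dev *dev)
{
	if (!dev->rx_handler)
		return -1;
	return dev->rx_handler(dev);
}
```

In this model, `model_receive()` returns -1 before registration and after unregistration, and the handler sees non-NULL data in between. Any concurrency effects, which are the interesting part of the real question, are deliberately outside what the sketch can show.
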
On 15/02/2019 16:23, Tomas Paukrt wrote:
> Hi Nik,
>
> this is the trace:
>
> [ 522.578735] Unable to handle kernel NULL pointer dereference at virtual address 00000011
> [ 522.578804] pgd = c3b70000
> [ 522.578842] [00000011] *pgd=03b63831, *pte=00000000, *ppte=00000000
> [ 522.578943] Internal error: Oops: 17 [#1] ARM
> [ 522.578980] Modules linked in:
> [ 522.579039] CPU: 0 Not tainted (3.5.0-lsp-3.3.1 #1)
> [ 522.579103] PC is at br_handle_frame+0xf4/0x26c
> [ 522.579146] LR is at 0xffff
> [ 522.579194] pc : [] lr : [<0000ffff>] psr: 00000013
> [ 522.579194] sp : c3bd5c10 ip : 0000ffff fp : c3bd5c44
> [ 522.579242] r10: c3affd80 r9 : 0000ff3d r8 : c3a10600
> [ 522.579279] r7 : c3bd5c5c r6 : 00000000 r5 : c3b5daa2 r4 : c3affd80
> [ 522.579322] r3 : c398c800 r2 : 0000ffff r1 : 0000ffff r0 : 0000ffff
> [ 522.579364] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
> [ 522.579407] Control: 0005317f Table: 03b70000 DAC: 00000015
> [ 522.579444] Process brctl (pid: 4573, stack limit = 0xc3bd4270)
> [ 522.579482] Stack: (0xc3bd5c10 to 0xc3bd6000)
> [ 522.579535] 5c00: c0446f94 c0470c44 c3bd5c5c c3bd5c28
> [ 522.579599] 5c20: c02c1780 c043e548 c398c800 00000001 c0470720 c043e548 c3bd5c94 c3bd5c48
> [ 522.579658] 5c40: c02164d0 c02c1790 00000000 00000000 c043e540 00000120 00000120 c3affd80
> [ 522.579722] 5c60: c3bd5c94 c043e548 c020a5c4 c3affd80 00000180 c398cbd0 000000be c3affd80
> [ 522.579786] 5c80: 00000080 0000003e c3bd5cb4 c3bd5c98 c0216704 c02161c8 ffdd8000 00000180
> [ 522.579844] 5ca0: 00000180 00000180 c3bd5cf4 c3bd5cb8 c01ab67c c02166f4 00000040 00000040
> [ 522.579908] 5cc0: 00000000 00000000 20000013 c01ab438 c398cbd0 0000012c 00000040 c0470720
> [ 522.579972] 5ce0: 000056f3 c0470720 c3bd5d2c c3bd5cf8 c0217568 c01ab448 c0470728 c0445270
> [ 522.580031] 5d00: 00000000 00000001 c047a3ac 0000000c c3bd4000 c0445ff0 00000100 00000003
> [ 522.580095] 5d20: c3bd5d6c c3bd5d30 c001b7e0 c0217474 c3bd5d6c c3bd5d40 c3bd4030 0000000a
> [ 522.580154] 5d40: c0012418 c0451b9c 00000001 00000000 c0471144 c047116c 00000000 00000000
> [ 522.580218] 5d60: c3bd5d7c c3bd5d70 c001bc24 c001b748 c3bd5d9c c3bd5d80 c0009b80 c001bbe4
> [ 522.580282] 5d80: 00000002 c3bd5dc8 00000000 00000000 c3bd5dc4 c3bd5da0 c00086c8 c0009b54
> [ 522.580346] 5da0: c0213bb0 c02c127c 60000013 ffffffff c3bd5dfc c3a1068c c3bd5e34 c3bd5dc8
> [ 522.580404] 5dc0: c0008f80 c000867c 00000000 c02c1780 c3a10600 00000000 c3a10600 00000000
> [ 522.580468] 5de0: c3bb33a0 c398c800 c3a1068c 00000000 00000000 c3bd5e34 c3bd5df0 c3bd5e10
> [ 522.580532] 5e00: c0213bb0 c02c127c 60000013 ffffffff 00000004 c3bb33a0 00000001 c0359864
> [ 522.580591] 5e20: c3bd4018 00000000 c3bd5e54 c3bd5e38 c02c1a38 c02c1050 000089a2 000089a2
> [ 522.580655] 5e40: c3bb3000 c3bd5ea0 c3bd5e64 c3bd5e58 c02c2334 c02c19fc c3bd5e94 c3bd5e68
> [ 522.580719] 5e60: c02190dc c02c22e4 c0469e00 be8efb10 c3bd5e94 c3bd5e80 000089a2 c0469e00
> [ 522.580783] 5e80: be8efb10 c3bd5ea0 c3bd5eec c3bd5e98 c0219480 c0218e28 c3ab6d20 00000000
> [ 522.580842] 5ea0: 00307262 00000000 00000000 00000000 00000004 b6f24e88 b6f4acf8 b6d768e4
> [ 522.580906] 5ec0: c0262184 000089a2 fffffdfd be8efb10 c30d6500 c0009484 c3bd4000 00000000
> [ 522.580975] 5ee0: c3bd5f0c c3bd5ef0 c0205680 c0219128 c0205544 c3494660 be8efb10 c30d6500
> [ 522.581039] 5f00: c3bd5f7c c3bd5f10 c008ca98 c0205554 c001b83c c001b310 c3bd5f54 c3bd5f28
> [ 522.581108] 5f20: c3bd4018 00000009 c0012418 c0451b9c 00000001 00000000 c047a380 c0451b9c
> [ 522.581172] 5f40: 00000001 00000000 c3bd5f64 c3bd5f58 c001bc28 c00501ac be8efb10 000089a2
> [ 522.581236] 5f60: 00000003 c30d6500 c0009484 c3bd4000 c3bd5fa4 c3bd5f80 c008cc4c c008c704
> [ 522.581306] 5f80: c00086c8 00000000 be8eff14 00000002 be8efe10 00000036 00000000 c3bd5fa8
> [ 522.581370] 5fa0: c0009300 c008cc20 be8eff14 00000002 00000003 000089a2 be8efb10 00056f0d
> [ 522.581434] 5fc0: be8eff14 00000002 be8efe10 00000036 be8eff10 00000000 b6f4b000 00000000
> [ 522.581498] 5fe0: b6e36ef0 be8efac4 000150d0 b6e36efc 60000010 00000003 00000000 00000000
> [ 522.581535] Backtrace:
> [ 522.581647] [] (br_handle_frame+0x0/0x26c) from [] (__netif_receive_skb+0x318/0x52c)
> [ 522.581695] r9:c043e548 r8:c0470720 r7:00000001 r6:c398c800 r5:c043e548
> r4:c02c1780
> [ 522.581892] [] (__netif_receive_skb+0x0/0x52c) from [] (netif_receive_skb+0x20/0x68)
> [ 522.581972] [] (netif_receive_skb+0x0/0x68) from [] (macb_poll+0x244/0x3cc)
> [ 522.582015] r4:00000180
> [ 522.582100] [] (macb_poll+0x0/0x3cc) from [] (net_rx_action+0x104/0x1b8)
> [ 522.582196] [] (net_rx_action+0x0/0x1b8) from [] (__do_softirq+0xa8/0x14c)
> [ 522.582282] [] (__do_softirq+0x0/0x14c) from [] (irq_exit+0x50/0x5c)
> [ 522.582362] [] (irq_exit+0x0/0x5c) from [] (handle_IRQ+0x3c/0x8c)
> [ 522.582431] [] (handle_IRQ+0x0/0x8c) from [] (vic_handle_irq+0x5c/0x9c)
> [ 522.582474] r6:00000000 r5:00000000 r4:c3bd5dc8 r3:00000002
> [ 522.582607] [] (vic_handle_irq+0x0/0x9c) from [] (__irq_svc+0x40/0x60)
> [ 522.582655] Exception stack(0xc3bd5dc8 to 0xc3bd5e10)
> [ 522.582719] 5dc0: 00000000 c02c1780 c3a10600 00000000 c3a10600 00000000
> [ 522.582783] 5de0: c3bb33a0 c398c800 c3a1068c 00000000 00000000 c3bd5e34 c3bd5df0 c3bd5e10
> [ 522.582842] 5e00: c0213bb0 c02c127c 60000013 ffffffff
> [ 522.582874] r8:c3a1068c r7:c3bd5dfc r6:ffffffff r5:60000013 r4:c02c127c
> r3:c0213bb0
> [ 522.583071] [] (br_add_if+0x0/0x3f4) from [] (add_del_if+0x4c/0x6c)
> [ 522.583114] r9:00000000 r8:c3bd4018 r7:c0359864 r6:00000001 r5:c3bb33a0
> r4:00000004
> [ 522.583306] [] (add_del_if+0x0/0x6c) from [] (br_dev_ioctl+0x60/0x6c)
> [ 522.583343] r6:c3bd5ea0 r5:c3bb3000 r4:000089a2 r3:000089a2
> [ 522.583492] [] (br_dev_ioctl+0x0/0x6c) from [] (dev_ifsioc+0x2c4/0x300)
> [ 522.583556] [] (dev_ifsioc+0x0/0x300) from [] (dev_ioctl+0x368/0x82c)
> [ 522.583599] r7:c3bd5ea0 r6:be8efb10 r5:c0469e00 r4:000089a2
> [ 522.583748] [] (dev_ioctl+0x0/0x82c) from [] (sock_ioctl+0x13c/0x270)
> [ 522.583834] [] (sock_ioctl+0x0/0x270) from [] (do_vfs_ioctl+0x3a4/0x51c)
> [ 522.583876] r6:c30d6500 r5:be8efb10 r4:c3494660 r3:c0205544
> [ 522.584026] [] (do_vfs_ioctl+0x0/0x51c) from [] (sys_ioctl+0x3c/0x68)
> [ 522.584063] r9:c3bd4000 r8:c0009484 r7:c30d6500 r6:00000003 r5:000089a2
> r4:be8efb10
> [ 522.584255] [] (sys_ioctl+0x0/0x68) from [] (ret_fast_syscall+0x0/0x2c)
> [ 522.584298] r7:00000036 r6:be8efe10 r5:00000002 r4:be8eff14
> [ 522.584431] Code: e3c22c0f 11a06008 e1912002 0a000036 (e5d65011)
> [ 522.584554] ---[ end trace 715c438c778f2442 ]---
> [ 522.584607] Kernel panic - not syncing: Fatal exception in interrupt
> [ 522.584650] Rebooting in 1 seconds..
>
> We have several patches that fix various (mostly security) issues. I
> have attached them.
>
> I cannot provide any additional details, because we are not able to
> reproduce this issue. It happens when we reconfigure Ethernet interfaces.
>
> Best regards
>
> Tomas
>
> On 15.2.2019 at 15:12, Nikolay Aleksandrov wrote:
>> On 15/02/2019 16:03, Tomas Paukrt wrote:
>>> Hi,
>>>
>>> I have recently discovered that kernel 3.12.10 is occasionally crashing
>>> due to NULL pointer dereference in function br_handle_frame when we
>>> reconfigure the bridge, because function br_port_get_rcu returns NULL.
>>>
>>> It is very hard for us to replicate this issue, because it happens about
>>> once per month in our testing environment, but I have created the
>>> attached patch. Can you please check it? The latest kernel seems to be
>>> affected too.
>>>
>>> Best regards
>>>
>>> Tomas
>>>
>> Hi,
>> That should not be possible, br_port_get_rcu() is a wrapper for
>> dev->rx_handler_data which in turn should always be present in case
>> rx_handler is called as can be seen in netdev_rx_handler_unregister().
>> Could you please share details about the crash and possibly a trace ?
>> Do you have any custom patches applied ?
>>
>> Thanks,
>> Nik
>>
>> .
>>

Hi again,
Please don't top post on netdev@.
As usual I'll have to ask you to reproduce this on a vanilla latest kernel
(if possible on the current -net or -net-next trees) without any
modifications and please provide a trace from such kernel. From the
explanation it sounds like it would take some time but no one will look
into it seriously unless that happens.

Thank you,
 Nik

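As a cross-check of the oops quoted in this thread, the parenthesised word on the `Code:` line (the faulting instruction) can be decoded by hand and compared with the register dump. The sketch below decodes only the ARM (A32) single-register load/store form with a 12-bit immediate offset. `ldst_imm`, `decode_ldst_imm` and `access_address` are hypothetical helper names, and the decoder makes no attempt to cover any other instruction class.

```c
#include <stdint.h>

/* Fields of an A32 LDR/STR/LDRB/STRB with a 12-bit immediate offset. */
struct ldst_imm {
	int is_load;	/* L bit: 1 = load, 0 = store */
	int is_byte;	/* B bit: 1 = byte access, 0 = word access */
	int add_offset;	/* U bit: 1 = base + offset, 0 = base - offset */
	int pre_index;	/* P bit: 1 = offset applied before the access */
	unsigned rn;	/* base register number */
	unsigned rd;	/* transferred register number */
	uint32_t imm12;	/* immediate offset */
};

/* Returns 0 on success, -1 if the word is not in the immediate-offset
 * load/store class (condition 0xf is a different encoding space, and
 * bits 27:25 must be 010). */
static int decode_ldst_imm(uint32_t insn, struct ldst_imm *out)
{
	if ((insn >> 28) == 0xfu)
		return -1;
	if (((insn >> 25) & 0x7u) != 0x2u)
		return -1;
	out->pre_index  = (int)((insn >> 24) & 1u);
	out->add_offset = (int)((insn >> 23) & 1u);
	out->is_byte    = (int)((insn >> 22) & 1u);
	out->is_load    = (int)((insn >> 20) & 1u);
	out->rn    = (insn >> 16) & 0xfu;
	out->rd    = (insn >> 12) & 0xfu;
	out->imm12 = insn & 0xfffu;
	return 0;
}

/* Address touched by the access, given the value held in the base register.
 * With post-indexing (P = 0) the access uses the unmodified base. */
static uint32_t access_address(const struct ldst_imm *d, uint32_t base)
{
	if (!d->pre_index)
		return base;
	return d->add_offset ? base + d->imm12 : base - d->imm12;
}
```

Fed the word `0xe5d65011`, the decoder reports a byte load into r5 from r6 plus 0x11. With r6 = 00000000 as in the register dump, `access_address()` yields 0x11, which is the reported fault address. That is consistent with a field being read through a NULL pointer held in r6; which structure field sits at that offset depends on the build and is not established in the thread.
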
diff --exclude CVS --exclude .git -uNr linux-3.12.10/net/bridge/br_input.c linux-3.12.10.modified/net/bridge/br_input.c
--- linux-3.12.10/net/bridge/br_input.c	2014-03-31 03:41:44.000000000 +0200
+++ linux-3.12.10.modified/net/bridge/br_input.c	2019-02-15 10:51:23.376424560 +0100
@@ -174,6 +174,8 @@
 		return RX_HANDLER_CONSUMED;
 
 	p = br_port_get_rcu(skb->dev);
+	if (!p)
+		return RX_HANDLER_PASS;
 
 	if (unlikely(is_link_local_ether_addr(dest))) {
 		/*
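The control flow the patch proposes can be illustrated outside the kernel. This is a sketch under stated assumptions, not the bridge code: `sketch_port`, `sketch_dev`, `sketch_port_get`, `sketch_handle_frame` and the `SKETCH_RX_*` values are hypothetical stand-ins, and the two result values only mirror the roles of `RX_HANDLER_PASS` and `RX_HANDLER_CONSUMED` in the diff.

```c
#include <stddef.h>

/* Hypothetical result codes; the kernel's enum has more values. */
enum sketch_rx_result { SKETCH_RX_CONSUMED, SKETCH_RX_PASS };

/* Hypothetical port with a single field that the handler reads. */
struct sketch_port {
	unsigned char state;
};

/* Hypothetical device; 'port' plays the role of rx_handler_data. */
struct sketch_dev {
	struct sketch_port *port;
};

/* Stand-in for br_port_get_rcu(): may return NULL in this model. */
static struct sketch_port *sketch_port_get(const struct sketch_dev *dev)
{
	return dev->port;
}

/* Handler with the guard from the patch: bail out before touching the port. */
static enum sketch_rx_result sketch_handle_frame(const struct sketch_dev *dev)
{
	struct sketch_port *p = sketch_port_get(dev);
	unsigned char state;

	if (!p)
		return SKETCH_RX_PASS;	/* guard added by the patch */

	state = p->state;		/* safe: p is known to be non-NULL here */
	(void)state;
	return SKETCH_RX_CONSUMED;
}
```

With a port attached the sketch consumes the frame; with a NULL port it returns the pass-through result instead of dereferencing the pointer. The guard avoids the crash in this model, but whether a NULL port can legitimately be seen at that point is the question the replies above leave open.
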