Message ID | 1524160886-20401-1-git-send-email-greearb@candelatech.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Series | net: Work around crash in ipv6 fib-walk-continue |
On 4/19/18 12:01 PM, greearb@candelatech.com wrote:
> From: Ben Greear <greearb@candelatech.com>
>
> This keeps us from crashing in certain test cases where we
> bring up many (1000, for instance) mac-vlans with IPv6
> enabled in the kernel. This bug has been around for a
> very long time.
>
> Until a real fix is found (and for stable), maybe it
> is better to return an incomplete fib walk instead
> of crashing.
>
> BUG: unable to handle kernel NULL pointer dereference at 8
> IP: fib6_walk_continue+0x5b/0x140 [ipv6]
> PGD 80000007dfc0c067 P4D 80000007dfc0c067 PUD 7e66ff067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 libcrc32c vrf]
> CPU: 3 PID: 15117 Comm: ip Tainted: G O 4.16.0+ #5
> Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
> RIP: 0010:fib6_walk_continue+0x5b/0x140 [ipv6]
> RSP: 0018:ffffc90008c3bc10 EFLAGS: 00010287
> RAX: ffff88085ac45050 RBX: ffff8807e03008a0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffc90008c3bc48 RDI: ffffffff8232b240
> RBP: ffff880819167600 R08: 0000000000000008 R09: ffff8807dff10071
> R10: ffffc90008c3bbd0 R11: 0000000000000000 R12: ffff8807e03008a0
> R13: 0000000000000002 R14: ffff8807e05744c8 R15: ffff8807e08ef000
> FS: 00007f2f04342700(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000008 CR3: 00000007e0556002 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> inet6_dump_fib+0x14b/0x2c0 [ipv6]
> netlink_dump+0x216/0x2a0
> netlink_recvmsg+0x254/0x400
> ? copy_msghdr_from_user+0xb5/0x110
> ___sys_recvmsg+0xe9/0x230
> ? find_held_lock+0x3b/0xb0
> ? __handle_mm_fault+0x617/0x1180
> ? __audit_syscall_entry+0xb3/0x110
> ? __sys_recvmsg+0x39/0x70
> __sys_recvmsg+0x39/0x70
> do_syscall_64+0x63/0x120
> entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x7f2f03a72030
> RSP: 002b:00007fffab3de508 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
> RAX: ffffffffffffffda RBX: 00007fffab3e641c RCX: 00007f2f03a72030
> RDX: 0000000000000000 RSI: 00007fffab3de570 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 0000000000007e6c R09: 00007fffab3e63a8
> R10: 00007fffab3de5b0 R11: 0000000000000246 R12: 00007fffab3e6608
> R13: 000000000066b460 R14: 0000000000007e6c R15: 0000000000000000
> Code: 85 d2 74 17 f6 40 2a 04 74 11 8b 53 2c 85 d2 0f 84 d7 00 00 00 83 ea 01 89 53 2c c7 4
> RIP: fib6_walk_continue+0x5b/0x140 [ipv6] RSP: ffffc90008c3bc10
> CR2: 0000000000000008
> ---[ end trace bd03458864eb266c ]---
>
> Signed-off-by: Ben Greear <greearb@candelatech.com>
> ---
>

Does your use case that triggers this involve replacing routes? I just
noticed the route delete code in fib6_add_rt2node does not have the
'Adjust walkers' code that is in fib6_del_route.

Further, the adjust walkers code in fib6_del_route looks suspicious in
its timing with route deletes. If you have a reliable reproducer we can
try a few things with fib6_del_route and the walker code.
On 05/04/2018 10:47 AM, David Ahern wrote:
> On 4/19/18 12:01 PM, greearb@candelatech.com wrote:
>> From: Ben Greear <greearb@candelatech.com>
>>
>> This keeps us from crashing in certain test cases where we
>> bring up many (1000, for instance) mac-vlans with IPv6
>> enabled in the kernel. This bug has been around for a
>> very long time.
>>
>> Until a real fix is found (and for stable), maybe it
>> is better to return an incomplete fib walk instead
>> of crashing.
>>
>> BUG: unable to handle kernel NULL pointer dereference at 8
>> IP: fib6_walk_continue+0x5b/0x140 [ipv6]
>> PGD 80000007dfc0c067 P4D 80000007dfc0c067 PUD 7e66ff067 PMD 0
>> Oops: 0000 [#1] PREEMPT SMP PTI
>> Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 libcrc32c vrf]
>> CPU: 3 PID: 15117 Comm: ip Tainted: G O 4.16.0+ #5
>> Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
>> RIP: 0010:fib6_walk_continue+0x5b/0x140 [ipv6]
>> RSP: 0018:ffffc90008c3bc10 EFLAGS: 00010287
>> RAX: ffff88085ac45050 RBX: ffff8807e03008a0 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: ffffc90008c3bc48 RDI: ffffffff8232b240
>> RBP: ffff880819167600 R08: 0000000000000008 R09: ffff8807dff10071
>> R10: ffffc90008c3bbd0 R11: 0000000000000000 R12: ffff8807e03008a0
>> R13: 0000000000000002 R14: ffff8807e05744c8 R15: ffff8807e08ef000
>> FS: 00007f2f04342700(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000008 CR3: 00000007e0556002 CR4: 00000000003606e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>> inet6_dump_fib+0x14b/0x2c0 [ipv6]
>> netlink_dump+0x216/0x2a0
>> netlink_recvmsg+0x254/0x400
>> ? copy_msghdr_from_user+0xb5/0x110
>> ___sys_recvmsg+0xe9/0x230
>> ? find_held_lock+0x3b/0xb0
>> ? __handle_mm_fault+0x617/0x1180
>> ? __audit_syscall_entry+0xb3/0x110
>> ? __sys_recvmsg+0x39/0x70
>> __sys_recvmsg+0x39/0x70
>> do_syscall_64+0x63/0x120
>> entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>> RIP: 0033:0x7f2f03a72030
>> RSP: 002b:00007fffab3de508 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
>> RAX: ffffffffffffffda RBX: 00007fffab3e641c RCX: 00007f2f03a72030
>> RDX: 0000000000000000 RSI: 00007fffab3de570 RDI: 0000000000000004
>> RBP: 0000000000000000 R08: 0000000000007e6c R09: 00007fffab3e63a8
>> R10: 00007fffab3de5b0 R11: 0000000000000246 R12: 00007fffab3e6608
>> R13: 000000000066b460 R14: 0000000000007e6c R15: 0000000000000000
>> Code: 85 d2 74 17 f6 40 2a 04 74 11 8b 53 2c 85 d2 0f 84 d7 00 00 00 83 ea 01 89 53 2c c7 4
>> RIP: fib6_walk_continue+0x5b/0x140 [ipv6] RSP: ffffc90008c3bc10
>> CR2: 0000000000000008
>> ---[ end trace bd03458864eb266c ]---
>>
>> Signed-off-by: Ben Greear <greearb@candelatech.com>
>> ---
>>
>
> Does your use case that triggers this involve replacing routes? I just
> noticed the route delete code in fib6_add_rt2node does not have the
> 'Adjust walkers' code that is in fib6_del_route.
>
> Further, the adjust walkers code in fib6_del_route looks suspicious in
> its timing with route deletes. If you have a reliable reproducer we can
> try a few things with fib6_del_route and the walker code.

Yes, we replace routes, and yes we can reliably reproduce it and will be
happy to test patches.

Thanks,
Ben
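To make the 'Adjust walkers' idea concrete, here is a minimal userspace sketch of the general technique: before a route is unlinked, any in-progress walker parked on it is moved to the next route, so the walker never resumes from a stale pointer. This is only a model under stated assumptions. The names `struct route`, `struct walker`, `adjust_walkers` and `delete_route` are hypothetical stand-ins, there is no locking or RCU, and it is not the code in fib6_del_route or fib6_add_rt2node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct route {
	struct route *next;
	int id;
};

enum walk_state { WALK_LEAF, WALK_UP };

struct walker {
	struct walker *next;	/* list of active walkers */
	struct route *leaf;	/* route the walker will visit next */
	enum walk_state state;
};

/* Move every walker that is parked on 'victim' to the following route. */
static void adjust_walkers(struct walker *walkers, struct route *victim)
{
	for (struct walker *w = walkers; w; w = w->next) {
		if (w->state == WALK_LEAF && w->leaf == victim) {
			w->leaf = victim->next;
			if (!w->leaf)
				w->state = WALK_UP; /* nothing left at this node */
		}
	}
}

/* Unlink 'victim' from the list at *head, fixing up walkers first. */
static int delete_route(struct route **head, struct route *victim,
			struct walker *walkers)
{
	for (struct route **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			adjust_walkers(walkers, victim);
			*pp = victim->next;
			victim->next = NULL;
			return 0;
		}
	}
	return -1; /* not found */
}
```

In this model, a second removal path that unlinked a route without calling `adjust_walkers` would leave a walker pointing at a detached route, which is the kind of gap the question about route replace is probing.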
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 92b8d8c..afef362 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1855,6 +1855,12 @@ static int fib6_walk_continue(struct fib6_walker *w)
 			if (fn == w->root)
 				return 0;
 			pn = rcu_dereference_protected(fn->parent, 1);
+			if (WARN_ON_ONCE(!pn)) {
+				pr_err("FWS-U, w: %p fn: %p pn: %p\n",
+				       w, fn, pn);
+				/* Attempt to work around crash that has been here forever. --Ben */
+				return 0;
+			}
 			left = rcu_dereference_protected(pn->left, 1);
 			right = rcu_dereference_protected(pn->right, 1);
 			w->node = pn;
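
The guard in the patch can be exercised outside the kernel with a small sketch of the same pattern: while climbing from a node toward the root, stop if a non-root node has no parent instead of dereferencing NULL. The `struct node` type and `climb_to_root` function below are hypothetical illustrations, not kernel code. Unlike the patch, which returns 0 so the dump ends quietly with an incomplete walk, this sketch returns -1 so that a caller can tell the walk was cut short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a trie node; not the kernel's struct fib6_node. */
struct node {
	struct node *parent;
	struct node *left;
	struct node *right;
};

/*
 * Climb from 'fn' toward 'root'. Returns the number of steps taken,
 * or -1 if a node other than the root has no parent (a detached subtree).
 */
static int climb_to_root(const struct node *root, const struct node *fn)
{
	int steps = 0;

	while (fn != root) {
		const struct node *pn = fn->parent;

		if (!pn) {
			/* Same idea as the WARN_ON_ONCE(!pn) check: bail out. */
			fprintf(stderr, "walk aborted: fn %p has no parent\n",
				(const void *)fn);
			return -1;
		}
		fn = pn;
		steps++;
	}
	return steps;
}
```

Calling `climb_to_root` on a node that was detached from the tree returns -1 rather than faulting, which mirrors the intent of the workaround: the walk result is incomplete, but the process survives.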