From patchwork Sat Jan 6 01:38:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wei Wang X-Patchwork-Id: 856318 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="fc0E5dnQ"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3zD41v4Sfwz9sNw for ; Sat, 6 Jan 2018 12:38:43 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753533AbeAFBil (ORCPT ); Fri, 5 Jan 2018 20:38:41 -0500 Received: from mail-pg0-f66.google.com ([74.125.83.66]:36838 "EHLO mail-pg0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752226AbeAFBij (ORCPT ); Fri, 5 Jan 2018 20:38:39 -0500 Received: by mail-pg0-f66.google.com with SMTP id j2so1435180pgv.3 for ; Fri, 05 Jan 2018 17:38:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=heylvJ+0O3UKNKrP7s+rqJuC+iblAyk23QsxD0fyDD8=; b=fc0E5dnQdPYKbdwwgcd2q1tO/3sHVZEdClNhGSXCKJnH7HZvxcqj6pC9U/rgIkHlXI eQK2+k3Pb0X3ES0FxgsV+hE6PVxsW6yJluQBcWfnFKSTgDjWQXj1Wio4O25rz/xmwm5p lrcjbaebbXZFfhjA8/xVXKWfiPUFFIfgl1A9Ey3jzsWUEIb+LYJJd/UwKiDjgytHTyph zFFvhkewRRhQdwPknk3jnvckwuB96DRX7t5q4oyPrCu5USyL37i2MFqpKWIB5aHXbkKx flsLmon+yU8QRQHFR6LsVuQKs4tzzeM4z0GLNkWR1MnPWwYt3e6HQKAffw3Q/1JoEtw+ XAKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=heylvJ+0O3UKNKrP7s+rqJuC+iblAyk23QsxD0fyDD8=; b=AfVDu19ysltiGvKwejl21+KSJTzvXKuMQPfJsyfLxugtJeZBKf9z7n5aMSM5bhGJBB bM2hzXnlTJL1mFclTi+0zcZjNJ0zZXMuF/OddEJkKt+scvxQAzNmjXTco73pxonGlQLy nmdBTdpTICYziAecK3BMrUR/NIjNWZSyY3B17C+5ZyVM21Mpynlf0dYALlEnUV5xhfR2 wFksmDxxB0KN3FgExUtf1sg8maY/C2CiK4hY3fT0/xUlDKxSrFecATPgF5SnQnnj3ic1 /9TcHA23gZpYS0iLicOSAI+I1UPku/E3OYI4aGEL4T2G2cDdrNX7Ki3ie8B0Uwx1mCts 78SA== X-Gm-Message-State: AKGB3mJ9VH6Xm3Alzr+uf5Y081R0ht5mom3yYr8Ckn2KVmS7gFHF8iac 3GZtq/+BHwTnNsyT7A9wfxAzmQ== X-Google-Smtp-Source: ACJfBotjsqLfK8/fj1JEkzndho37oVxIIKzOzuJFadhGJxWJqruI+Uufe+uzm+XaMnlpvOjSvn41Dg== X-Received: by 10.98.212.94 with SMTP id u30mr4591576pfl.54.1515202718674; Fri, 05 Jan 2018 17:38:38 -0800 (PST) Received: from localhost ([2620:15c:2c4:201:573:305:3985:5c48]) by smtp.gmail.com with ESMTPSA id s14sm17067021pfe.36.2018.01.05.17.38.37 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Fri, 05 Jan 2018 17:38:38 -0800 (PST) From: Wei Wang X-Google-Original-From: Wei Wang To: David Miller , netdev@vger.kernel.org Cc: Martin KaFai Lau , Eric Dumazet , Wei Wang Subject: [PATCH net] ipv6: remove null_entry before adding default route Date: Fri, 5 Jan 2018 17:38:35 -0800 Message-Id: <20180106013835.127556-1-tracywwnj@gmail.com> X-Mailer: git-send-email 2.16.0.rc0.223.g4a4ac83678-goog Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Wei Wang In the current code, when creating a new fib6 table, tb6_root.leaf gets initialized to net->ipv6.ip6_null_entry. If a default route is being added with rt->rt6i_metric = 0xffffffff, fib6_add() will add this route after net->ipv6.ip6_null_entry. As null_entry is shared, it could cause problem. In order to fix it, set fn->leaf to NULL before calling fib6_add_rt2node() when trying to add the first default route. And reset fn->leaf to null_entry when adding fails or when deleting the last default route. syzkaller reported the following issue which is fixed by this commit: ============================= WARNING: suspicious RCU usage 4.15.0-rc5+ #171 Not tainted ----------------------------- net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 4 locks held by swapper/0/0: #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline] #0: ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310 #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline] #1: (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007 #2: (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560 #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline] #3: (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585 fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701 fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892 fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815 fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863 fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933 __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949 fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline] fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016 fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033 call_timer_fn+0x228/0x820 kernel/time/timer.c:1320 expire_timers kernel/time/timer.c:1357 [inline] __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660 run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:540 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904 Reported-by: syzbot Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table") Signed-off-by: Wei Wang --- net/ipv6/ip6_fib.c | 45 +++++++++++++++++++++++++++++++++++---------- 1 file changed, 35 insertions(+), 10 deletions(-) diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index d11a5578e4f8..37cb4ad1ea29 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net, if (!(fn->fn_flags & RTN_RTINFO)) { RCU_INIT_POINTER(fn->leaf, NULL); rt6_release(leaf); + /* remove null_entry in the root node */ + } else if (fn->fn_flags & RTN_TL_ROOT && + rcu_access_pointer(fn->leaf) == + net->ipv6.ip6_null_entry) { + RCU_INIT_POINTER(fn->leaf, NULL); } return fn; @@ -1270,14 +1275,27 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, return err; failure: - /* fn->leaf could be NULL if fn is an intermediate node and we - * failed to add the new route to it in both subtree creation - * failure and fib6_add_rt2node() failure case. - * In both cases, fib6_repair_tree() should be called to fix + /* fn->leaf could be NULL if: + * 1. fn is the root node in the table and we fail to add the default + * route to it. + * In this case, we put fn->leaf back to net->ipv6.ip6_null_entry as + * the way the table was created. + * 2. fn is an intermediate node and we failed to add the new + * route to it in both subtree creation failure and fib6_add_rt2node() + * failure case. + * In this case, fib6_repair_tree() should be called to fix * fn->leaf. */ - if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) - fib6_repair_tree(info->nl_net, table, fn); + if (fn) { + if (fn->fn_flags & RTN_TL_ROOT) { + if (!rcu_access_pointer(fn->leaf)) { + rcu_assign_pointer(fn->leaf, + info->nl_net->ipv6.ip6_null_entry); + } + } else if (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) { + fib6_repair_tree(info->nl_net, table, fn); + } + } /* Always release dst as dst->__refcnt is guaranteed * to be taken before entering this function */ @@ -1685,11 +1703,18 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn, } read_unlock(&net->ipv6.fib6_walker_lock); - /* If it was last route, expunge its radix tree node */ + /* If it was last route: + * 1. For root node, put back null_entry as how the table was created. + * 2. For other nodes, expunge its radix tree node. + */ if (!rcu_access_pointer(fn->leaf)) { - fn->fn_flags &= ~RTN_RTINFO; - net->ipv6.rt6_stats->fib_route_nodes--; - fn = fib6_repair_tree(net, table, fn); + if (fn->fn_flags & RTN_TL_ROOT) { + rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry); + } else { + fn->fn_flags &= ~RTN_RTINFO; + net->ipv6.rt6_stats->fib_route_nodes--; + fn = fib6_repair_tree(net, table, fn); + } } fib6_purge_rt(rt, fn, net);