RCU problems in fib_table_insert

Message ID 1269206752.3004.9.camel@edumazet-laptop
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet March 21, 2010, 9:25 p.m. UTC
On Sunday, March 21, 2010 at 21:25 +0100, Andi Kleen wrote:
> Hi,
> 
> I got the following warning at boot with a 2.6.34-rc2ish git kernel
> with RCU debugging and preemption enabled.
> 
> It seems the problem is that not all callers of fib_find_node
> call it with rcu_read_lock() to stabilize access to the fib. 
> 
> I tried to fix it, but especially for fib_table_insert() that's rather 
> tricky: it does a lot of memory allocations and also route flushing and 
> other blocking operations while assuming the original fa is RCU stable.
> 
> I first tried to move some allocations to the beginning and keep
> preemption disabled in the rest, but it's difficult with all of them.
> No patch because of that.
> 
> Does the fa need an additional reference count for this problem?
> Or perhaps some optimistic locking?
> 
> -Andi

No real changes needed, only a lockdep warning...

Probably a rcu_dereference() should be changed to
rcu_dereference_check() like we did for __in6_dev_get()

We hold RTNL or rcu_read_lock

[PATCH] net: fib_find_node() rcu check

We hold the RCU read lock or RTNL when fib_find_node() is called.
Silence the lockdep complaint.

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paul E. McKenney March 21, 2010, 9:38 p.m. UTC | #1
On Sun, Mar 21, 2010 at 10:25:52PM +0100, Eric Dumazet wrote:
> On Sunday, March 21, 2010 at 21:25 +0100, Andi Kleen wrote:
> > Hi,
> > 
> > I got the following warning at boot with a 2.6.34-rc2ish git kernel
> > with RCU debugging and preemption enabled.
> > 
> > It seems the problem is that not all callers of fib_find_node
> > call it with rcu_read_lock() to stabilize access to the fib. 
> > 
> > I tried to fix it, but especially for fib_table_insert() that's rather 
> > tricky: it does a lot of memory allocations and also route flushing and 
> > other blocking operations while assuming the original fa is RCU stable.
> > 
> > I first tried to move some allocations to the beginning and keep
> > preemption disabled in the rest, but it's difficult with all of them.
> > No patch because of that.
> > 
> > Does the fa need an additional reference count for this problem?
> > Or perhaps some optimistic locking?
> > 
> > -Andi
> 
> No real changes needed, only a lockdep warning...
> 
> Probably a rcu_dereference() should be changed to
> rcu_dereference_check() like we did for __in6_dev_get()
> 
> We hold RTNL or rcu_read_lock
> 
> [PATCH] net: fib_find_node() rcu check
> 
> We hold the RCU read lock or RTNL when fib_find_node() is called.
> Silence the lockdep complaint.

You beat me to it, Eric.  ;-)

So:

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Reported-by: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index af5d897..471fe07 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -961,7 +961,8 @@ fib_find_node(struct trie *t, u32 key)
>  	struct node *n;
> 
>  	pos = 0;
> -	n = rcu_dereference(t->trie);
> +	n = rcu_dereference_check(t->trie,
> +				  rcu_read_lock_held() || lockdep_rtnl_is_held());
> 
>  	while (n != NULL &&  NODE_TYPE(n) == T_TNODE) {
>  		tn = (struct tnode *) n;
> 
> 
Eric Dumazet March 21, 2010, 9:49 p.m. UTC | #2
On Sunday, March 21, 2010 at 14:38 -0700, Paul E. McKenney wrote:

> You beat me to it, Eric.  ;-)
> 
> So:
> 
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Well, to be fair, I typed less text than you did ;)

Thanks


Paul E. McKenney March 21, 2010, 10:57 p.m. UTC | #3
On Sun, Mar 21, 2010 at 10:49:46PM +0100, Eric Dumazet wrote:
> On Sunday, March 21, 2010 at 14:38 -0700, Paul E. McKenney wrote:
> 
> > You beat me to it, Eric.  ;-)
> > 
> > So:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Well, to be fair, I typed less text than you did ;)

;-) ;-) ;-)

							Thanx, Paul

Patch

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index af5d897..471fe07 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -961,7 +961,8 @@  fib_find_node(struct trie *t, u32 key)
 	struct node *n;
 
 	pos = 0;
-	n = rcu_dereference(t->trie);
+	n = rcu_dereference_check(t->trie,
+				  rcu_read_lock_held() || lockdep_rtnl_is_held());
 
 	while (n != NULL &&  NODE_TYPE(n) == T_TNODE) {
 		tn = (struct tnode *) n;
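
For readers unfamiliar with the pattern, here is a minimal userspace sketch of
what the patch expresses: the dereference is legal if the caller is either an
RCU reader or an updater holding RTNL, and the checker only complains when
neither is true. Everything prefixed with mock_ below is a hypothetical
stand-in written for this illustration, not kernel code; the real
rcu_dereference_check() and lockdep machinery are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static int  mock_rcu_depth;  /* nesting depth of the simulated rcu_read_lock() */
static bool mock_rtnl_held;  /* true while the simulated RTNL mutex is held */
static int  mock_warnings;   /* count of simulated lockdep complaints */

static bool mock_rcu_read_lock_held(void) { return mock_rcu_depth > 0; }
static bool mock_rtnl_is_held(void)       { return mock_rtnl_held; }

/*
 * Models the shape of rcu_dereference_check(p, c): yield p, and complain
 * when the caller-supplied protection condition c is false.
 */
#define mock_rcu_dereference_check(p, c)                              \
	((void)((c) ? 0 : (mock_warnings++,                           \
		fprintf(stderr, "suspicious dereference at %s:%d\n",  \
			__FILE__, __LINE__))),                        \
	 (p))

struct mock_node { int key; };
struct mock_trie { struct mock_node *root; };

/* Same condition as the patch: RCU read side OR RTNL held by an updater. */
static struct mock_node *mock_find_node(struct mock_trie *t)
{
	return mock_rcu_dereference_check(t->root,
			mock_rcu_read_lock_held() || mock_rtnl_is_held());
}

/* Each helper returns how many new warnings the lookup produced. */
int lookup_as_reader(struct mock_trie *t)
{
	int before = mock_warnings;

	mock_rcu_depth++;            /* simulated rcu_read_lock() */
	(void)mock_find_node(t);
	mock_rcu_depth--;            /* simulated rcu_read_unlock() */
	return mock_warnings - before;
}

int lookup_as_updater(struct mock_trie *t)
{
	int before = mock_warnings;

	mock_rtnl_held = true;       /* simulated rtnl_lock() */
	(void)mock_find_node(t);     /* may block here; RTNL serializes updaters */
	mock_rtnl_held = false;      /* simulated rtnl_unlock() */
	return mock_warnings - before;
}

int lookup_unprotected(struct mock_trie *t)
{
	int before = mock_warnings;

	(void)mock_find_node(t);     /* neither protection held: one complaint */
	return mock_warnings - before;
}
```

In this model the updater path corresponds to the fib_table_insert() case from
the report: it holds RTNL rather than the RCU read lock, so it is free to
allocate memory and block, and the check stays quiet.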