diff mbox

Freeing alive fib_info caused by ebc0ffae5

Message ID 1288869699.2659.77.camel@edumazet-laptop
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Nov. 4, 2010, 11:21 a.m. UTC
Le jeudi 04 novembre 2010 à 11:30 +0100, Eric Dumazet a écrit :
> Le jeudi 04 novembre 2010 à 21:23 +1100, Michael Ellerman a écrit :
> > Hi all,
> > 
> > I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
> > "Freeing alive fib_info" messages, from free_fib_info().
> > 
> > Actually I only get one per boot, when network interfaces come up.
> > Seemingly related I am getting refcount problems when I shutdown, ie.
> > unregister_netdevice() sees a usage count of 1, which never decrements.
> > 
> > Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
> > 
> >     fib: RCU conversion of fib_lookup()
> >     
> >     fib_lookup() converted to be called in RCU protected context, no
> >     reference taken and released on a contended cache line (fib_clntref)
> >     
> > 
> > Is this a bug in that commit, or a driver bug exposed?
> 
> Hi Michael, thanks for the report (and painful bisection I guess)
> 
> Thats hard to say... Is it reproductable on my machine ?
> 

Hmm, a review of the code spotted a bug in fib_result_assign()

Please try following patch :

Thanks again !

[PATCH] fib: fib_result_assign() should not change fib refcounts

After commit ebc0ffae5 (RCU conversion of fib_lookup()),
fib_result_assign()  should not change fib refcounts anymore.

Thanks to Michael who did the bisection and bug report.

Reported-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/ipv4/fib_lookup.h |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael Ellerman Nov. 4, 2010, 11:23 a.m. UTC | #1
On Thu, 2010-11-04 at 12:21 +0100, Eric Dumazet wrote:
> Le jeudi 04 novembre 2010 à 11:30 +0100, Eric Dumazet a écrit :
> > Le jeudi 04 novembre 2010 à 21:23 +1100, Michael Ellerman a écrit :
> > > Hi all,
> > > 
> > > I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
> > > "Freeing alive fib_info" messages, from free_fib_info().
> > > 
> > > Actually I only get one per boot, when network interfaces come up.
> > > Seemingly related I am getting refcount problems when I shutdown, ie.
> > > unregister_netdevice() sees a usage count of 1, which never decrements.
> > > 
> > > Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
> > > 
> > >     fib: RCU conversion of fib_lookup()
> > >     
> > >     fib_lookup() converted to be called in RCU protected context, no
> > >     reference taken and released on a contended cache line (fib_clntref)
> > >     
> > > 
> > > Is this a bug in that commit, or a driver bug exposed?
> > 
> > Hi Michael, thanks for the report (and painful bisection I guess)
> > 
> > Thats hard to say... Is it reproductable on my machine ?
> > 
> 
> Hmm, a review of the code spotted a bug in fib_result_assign()

Aha, I was just adding some debug in there. Let me test the patch.

cheers
Michael Ellerman Nov. 4, 2010, 11:35 a.m. UTC | #2
On Thu, 2010-11-04 at 12:21 +0100, Eric Dumazet wrote:
> 
> Hmm, a review of the code spotted a bug in fib_result_assign()
> 
> Please try following patch :
> 
> Thanks again !
> 
> [PATCH] fib: fib_result_assign() should not change fib refcounts
> 
> After commit ebc0ffae5 (RCU conversion of fib_lookup()),
> fib_result_assign()  should not change fib refcounts anymore.
> 
> Thanks to Michael who did the bisection and bug report.
> 
> Reported-by: Michael Ellerman <michael@ellerman.id.au>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  net/ipv4/fib_lookup.h |    5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h
> index a29edf2..c079cc0 100644
> --- a/net/ipv4/fib_lookup.h
> +++ b/net/ipv4/fib_lookup.h
> @@ -47,11 +47,8 @@ extern int fib_detect_death(struct fib_info *fi, int order,
>  static inline void fib_result_assign(struct fib_result *res,
>  				     struct fib_info *fi)
>  {
> -	if (res->fi != NULL)
> -		fib_info_put(res->fi);
> +	/* we used to play games with refcounts, but we now use RCU */
>  	res->fi = fi;
> -	if (fi != NULL)
> -		atomic_inc(&fi->fib_clntref);
>  }
>  
>  #endif /* _FIB_LOOKUP_H */

Perfect, that fixes it, thanks!

cheers
David Miller Nov. 4, 2010, 7:06 p.m. UTC | #3
From: Michael Ellerman <michael@ellerman.id.au>
Date: Thu, 04 Nov 2010 22:35:26 +1100

> On Thu, 2010-11-04 at 12:21 +0100, Eric Dumazet wrote:
>> [PATCH] fib: fib_result_assign() should not change fib refcounts
>> 
>> After commit ebc0ffae5 (RCU conversion of fib_lookup()),
>> fib_result_assign()  should not change fib refcounts anymore.
>> 
>> Thanks to Michael who did the bisection and bug report.
 ...
> Perfect, that fixes it, thanks!

Applied, thanks everyone!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h
index a29edf2..c079cc0 100644
--- a/net/ipv4/fib_lookup.h
+++ b/net/ipv4/fib_lookup.h
@@ -47,11 +47,8 @@  extern int fib_detect_death(struct fib_info *fi, int order,
 static inline void fib_result_assign(struct fib_result *res,
 				     struct fib_info *fi)
 {
-	if (res->fi != NULL)
-		fib_info_put(res->fi);
+	/* we used to play games with refcounts, but we now use RCU */
 	res->fi = fi;
-	if (fi != NULL)
-		atomic_inc(&fi->fib_clntref);
 }
 
 #endif /* _FIB_LOOKUP_H */