diff mbox

Bug in net/ipv6/ip6_fib.c:fib6_dump_table()

Message ID 1340388785.4604.11442.camel@edumazet-glaptop
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet June 22, 2012, 6:13 p.m. UTC
On Fri, 2012-06-22 at 08:44 -0500, Josh Hunt wrote:

> Ahh. That makes sense and is what Alexey said before I just didn't put
> it all together. So we are OK reverting this patch? I cannot find a path
> where the walker's pointers are updated without the tb6_lock write_lock.
> 

There was a bug somewhere, not sure we want to NULL dereference again.

Following fix should at least avoid a never ending dump



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Debabrata Banerjee June 22, 2012, 9:12 p.m. UTC | #1
On Fri, Jun 22, 2012 at 2:13 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-06-22 at 08:44 -0500, Josh Hunt wrote:
>
>> Ahh. That makes sense and is what Alexey said before I just didn't put
>> it all together. So we are OK reverting this patch? I cannot find a path
>> where the walker's pointers are updated without the tb6_lock write_lock.
>>
>
> There was a bug somewhere, not sure we want to NULL dereference again.
>

As you identified, the tree seems to be protected by tb6_lock. I
couldn't find a race by inspection either. If this is not the root of
the problem, how would this patch fix it? So I think it does nothing.
We are attempting to reproduce that crash to prove it, but like Gao
feng I don't think we will see it.

My current favorite theory is that inet6_dump_fib was called with a
NULL func in callback. This looks like the approximate area of the
crash, but it's impossible to say without more information from
Patrick McHardy.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 23, 2012, 12:02 a.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 22 Jun 2012 20:13:05 +0200

> On Fri, 2012-06-22 at 08:44 -0500, Josh Hunt wrote:
> 
>> Ahh. That makes sense and is what Alexey said before I just didn't put
>> it all together. So we are OK reverting this patch? I cannot find a path
>> where the walker's pointers are updated without the tb6_lock write_lock.
>> 
> 
> There was a bug somewhere, not sure we want to NULL dereference again.

Well:

1) Patrick McHardy has been inactive for a while, so do not expect
   any insight from him.

2) Ben Greear isn't even on the CC: list of this discussion yet he
   appears to be the person who reproduced the crash way back then
   and is listed in the Tested-by tag of the commit.

   As a result we aren't likely to get any insight from the one person
   who actually could hit the crash.

I'm inclined to just revert simply because we have people active who
can reproduce regressions introduced by this change and nobody can
understand why the change is even necessary.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 74c21b9..6083276 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1349,8 +1349,8 @@  static int fib6_walk_continue(struct fib6_walker_t *w)
 			if (w->leaf && fn->fn_flags & RTN_RTINFO) {
 				int err;
 
-				if (w->count < w->skip) {
-					w->count++;
+				if (w->skip) {
+					w->skip--;
 					continue;
 				}