Patchwork Fix RCU warning in rt_cache_seq_show

login
register
mail settings
Submitter Mark Rutland
Date Aug. 9, 2011, 5:02 p.m.
Message ID <1312909360-2675-1-git-send-email-mark.rutland@arm.com>
Download mbox | patch
Permalink /patch/109261/
State Rejected
Delegated to: David Miller
Headers show

Comments

Mark Rutland - Aug. 9, 2011, 5:02 p.m.
Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
added rcu protection to dst neighbour, and updated callsites for
dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.

This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
ARM Vexpress A9x4):
Eric Dumazet - Aug. 9, 2011, 5:18 p.m.
Le mardi 09 août 2011 à 18:02 +0100, Mark Rutland a écrit :
> Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
> added rcu protection to dst neighbour, and updated callsites for
> dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.
> 
> This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
> ARM Vexpress A9x4):
> 
> ===================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> include/net/dst.h:91 invoked rcu_dereference_check() without protection!
> 
> other info that might help us debug this:
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 2 locks held by proc01/32159:
> 
> stack backtrace:
> [<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4)
> [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>] (seq_read+0x324/0x4a4)
> [<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>] (proc_reg_read+0x70/0x94)
> [<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>] (vfs_read+0xb0/0x144)
> [<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>] (sys_read+0x40/0x70)
> [<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>] (ret_fast_syscall+0x0/0x3c)
> 
> This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
> protecting the dereferenced variable, and clearing the warning.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
> ---
>  net/ipv4/route.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index e3dec1c..6699ef7 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -419,6 +419,7 @@ static int rt_cache_seq_show(struct seq_file *seq, void *v)
>  		struct neighbour *n;
>  		int len;
>  
> +		rcu_read_lock();
>  		n = dst_get_neighbour(&r->dst);
>  		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
>  			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
> @@ -435,6 +436,7 @@ static int rt_cache_seq_show(struct seq_file *seq, void *v)
>  			-1,
>  			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
>  			r->rt_spec_dst, &len);
> +		rcu_read_unlock();
>  
>  		seq_printf(seq, "%*s\n", 127 - len, "");
>  	}


Hmm, I though rcu_read_lock_bh() (done by caller of this function) was
protecting us here.


Paul, is it really needed, or is it a lockdep artifact ?





--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney - Aug. 10, 2011, 1:11 a.m.
On Tue, Aug 09, 2011 at 07:18:56PM +0200, Eric Dumazet wrote:
> Le mardi 09 août 2011 à 18:02 +0100, Mark Rutland a écrit :
> > Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
> > added rcu protection to dst neighbour, and updated callsites for
> > dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.
> > 
> > This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
> > ARM Vexpress A9x4):
> > 
> > ===================================================
> > [ INFO: suspicious rcu_dereference_check() usage. ]
> > ---------------------------------------------------
> > include/net/dst.h:91 invoked rcu_dereference_check() without protection!
> > 
> > other info that might help us debug this:
> > 
> > rcu_scheduler_active = 1, debug_locks = 0
> > 2 locks held by proc01/32159:

This is very strange.  It says that there are two locks held by the
task, but refuses to list them.  Maybe something stomped on the list
of held locks?

							Thanx, Paul

> > stack backtrace:
> > [<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4)
> > [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>] (seq_read+0x324/0x4a4)
> > [<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>] (proc_reg_read+0x70/0x94)
> > [<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>] (vfs_read+0xb0/0x144)
> > [<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>] (sys_read+0x40/0x70)
> > [<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>] (ret_fast_syscall+0x0/0x3c)
> > 
> > This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
> > protecting the dereferenced variable, and clearing the warning.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: David S. Miller <davem@davemloft.net>
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
> > ---
> >  net/ipv4/route.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index e3dec1c..6699ef7 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -419,6 +419,7 @@ static int rt_cache_seq_show(struct seq_file *seq, void *v)
> >  		struct neighbour *n;
> >  		int len;
> >  
> > +		rcu_read_lock();
> >  		n = dst_get_neighbour(&r->dst);
> >  		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
> >  			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
> > @@ -435,6 +436,7 @@ static int rt_cache_seq_show(struct seq_file *seq, void *v)
> >  			-1,
> >  			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
> >  			r->rt_spec_dst, &len);
> > +		rcu_read_unlock();
> >  
> >  		seq_printf(seq, "%*s\n", 127 - len, "");
> >  	}
> 
> 
> Hmm, I though rcu_read_lock_bh() (done by caller of this function) was
> protecting us here.
> 
> 
> Paul, is it really needed, or is it a lockdep artifact ?
> 
> 
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Aug. 11, 2011, 4:58 p.m.
Le mercredi 10 août 2011 à 10:28 +0100, Mark Rutland a écrit :
> > -----Original Message-----
> > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > Sent: 09 August 2011 18:19
> > To: Mark Rutland; Paul E. McKenney
> > Cc: netdev@vger.kernel.org; David S. Miller; Gergely Kalman
> > Subject: Re: [PATCH] Fix RCU warning in rt_cache_seq_show
> > 
> > Le mardi 09 août 2011 à 18:02 +0100, Mark Rutland a écrit :
> > > Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
> > > added rcu protection to dst neighbour, and updated callsites for
> > > dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.
> > >
> > > This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
> > > ARM Vexpress A9x4):
> > >
> > > ===================================================
> > > [ INFO: suspicious rcu_dereference_check() usage. ]
> > > ---------------------------------------------------
> > > include/net/dst.h:91 invoked rcu_dereference_check() without
> > protection!
> > >
> > > other info that might help us debug this:
> > >
> > > rcu_scheduler_active = 1, debug_locks = 0
> > > 2 locks held by proc01/32159:
> > >
> > > stack backtrace:
> > > [<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>]
> > (rt_cache_seq_show+0x18c/0x1c4)
> > > [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>]
> > (seq_read+0x324/0x4a4)
> > > [<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>]
> > (proc_reg_read+0x70/0x94)
> > > [<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>]
> > (vfs_read+0xb0/0x144)
> > > [<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>]
> > (sys_read+0x40/0x70)
> > > [<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>]
> > (ret_fast_syscall+0x0/0x3c)
> > >
> > > This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
> > > protecting the dereferenced variable, and clearing the warning.
> > >
> > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > Cc: David S. Miller <davem@davemloft.net>
> > > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > > Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
> > > ---
> > >  net/ipv4/route.c |    2 ++
> > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > > index e3dec1c..6699ef7 100644
> > > --- a/net/ipv4/route.c
> > > +++ b/net/ipv4/route.c
> > > @@ -419,6 +419,7 @@ static int rt_cache_seq_show(struct seq_file
> > *seq, void *v)
> > >  		struct neighbour *n;
> > >  		int len;
> > >
> > > +		rcu_read_lock();
> > >  		n = dst_get_neighbour(&r->dst);
> > >  		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
> > >  			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
> > > @@ -435,6 +436,7 @@ static int rt_cache_seq_show(struct seq_file
> > *seq, void *v)
> > >  			-1,
> > >  			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
> > >  			r->rt_spec_dst, &len);
> > > +		rcu_read_unlock();
> > >
> > >  		seq_printf(seq, "%*s\n", 127 - len, "");
> > >  	}
> > 
> > 
> > Hmm, I though rcu_read_lock_bh() (done by caller of this function) was
> > protecting us here.
> 
> Aha. Being a bit trigger-happy, I'd had a quick look at the functions
> mentioned in the backtrace, and not looked at any possible inlining.
> 
> This being my first real exposure to RCU, I wasn't aware of the *_bh
> variants. Looking at the documentation (Documentation/RCU/checklist.txt),
> I think the real problem is that we should be using rcu_dereference_bh in
> this case:
> 
>   > read-side critical sections are delimited by rcu_read_lock()
>   > and rcu_read_unlock(), or by similar primitives such as
>   > rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case
>   > the matching rcu_dereference() primitive must be used in order
>   > to keep lockdep happy, in this case, rcu_dereference_bh().

Hmm.

I do think dst_get_neighbour() should use rcu_dereference(), because
dst->_neighbour are freed by call_rcu().

The question is : Is following construct [A] safe or not ?

{
rcu_read_lock_bh();
	/* BH are now disabled, and we are not allowed to sleep */
	...

	ptr = rcu_dereference();
	...
rcu_read_unlock_bh();
}


I dont really understand why lockdep wants [B] instead :

{
rcu_read_lock_bh();
	...

	{
	rcu_read_lock();
	ptr = rcu_dereference();
	rcu_read_unlock();
	}
	...
rcu_read_unlock_bh();
}



However, I can understand the other way [C], this is really needed :

{
rcu_read_lock();
	...

	{
	rcu_read_lock_bh();
	ptr = rcu_dereference_bh();
	rcu_read_unlock_bh();
	}
	...
rcu_read_unlock();
}

I believe [A] should be allowed by lockdep.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney - Aug. 12, 2011, 2:32 a.m.
On Thu, Aug 11, 2011 at 06:58:21PM +0200, Eric Dumazet wrote:
> Le mercredi 10 août 2011 à 10:28 +0100, Mark Rutland a écrit :
> > > -----Original Message-----
> > > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > > Sent: 09 August 2011 18:19
> > > To: Mark Rutland; Paul E. McKenney
> > > Cc: netdev@vger.kernel.org; David S. Miller; Gergely Kalman
> > > Subject: Re: [PATCH] Fix RCU warning in rt_cache_seq_show
> > > 
> > > Le mardi 09 août 2011 à 18:02 +0100, Mark Rutland a écrit :
> > > > Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
> > > > added rcu protection to dst neighbour, and updated callsites for
> > > > dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.
> > > >
> > > > This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
> > > > ARM Vexpress A9x4):
> > > >
> > > > ===================================================
> > > > [ INFO: suspicious rcu_dereference_check() usage. ]
> > > > ---------------------------------------------------
> > > > include/net/dst.h:91 invoked rcu_dereference_check() without
> > > protection!
> > > >
> > > > other info that might help us debug this:
> > > >
> > > > rcu_scheduler_active = 1, debug_locks = 0
> > > > 2 locks held by proc01/32159:
> > > >
> > > > stack backtrace:
> > > > [<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>]
> > > (rt_cache_seq_show+0x18c/0x1c4)
> > > > [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>]
> > > (seq_read+0x324/0x4a4)
> > > > [<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>]
> > > (proc_reg_read+0x70/0x94)
> > > > [<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>]
> > > (vfs_read+0xb0/0x144)
> > > > [<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>]
> > > (sys_read+0x40/0x70)
> > > > [<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>]
> > > (ret_fast_syscall+0x0/0x3c)
> > > >
> > > > This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
> > > > protecting the dereferenced variable, and clearing the warning.
> > > >
> > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: David S. Miller <davem@davemloft.net>
> > > > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > > > Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
> > > > ---
> > > >  net/ipv4/route.c |    2 ++
> > > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > > > index e3dec1c..6699ef7 100644
> > > > --- a/net/ipv4/route.c
> > > > +++ b/net/ipv4/route.c
> > > > @@ -419,6 +419,7 @@ static int rt_cache_seq_show(struct seq_file
> > > *seq, void *v)
> > > >  		struct neighbour *n;
> > > >  		int len;
> > > >
> > > > +		rcu_read_lock();
> > > >  		n = dst_get_neighbour(&r->dst);
> > > >  		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
> > > >  			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
> > > > @@ -435,6 +436,7 @@ static int rt_cache_seq_show(struct seq_file
> > > *seq, void *v)
> > > >  			-1,
> > > >  			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
> > > >  			r->rt_spec_dst, &len);
> > > > +		rcu_read_unlock();
> > > >
> > > >  		seq_printf(seq, "%*s\n", 127 - len, "");
> > > >  	}
> > > 
> > > 
> > > Hmm, I though rcu_read_lock_bh() (done by caller of this function) was
> > > protecting us here.
> > 
> > Aha. Being a bit trigger-happy, I'd had a quick look at the functions
> > mentioned in the backtrace, and not looked at any possible inlining.
> > 
> > This being my first real exposure to RCU, I wasn't aware of the *_bh
> > variants. Looking at the documentation (Documentation/RCU/checklist.txt),
> > I think the real problem is that we should be using rcu_dereference_bh in
> > this case:
> > 
> >   > read-side critical sections are delimited by rcu_read_lock()
> >   > and rcu_read_unlock(), or by similar primitives such as
> >   > rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case
> >   > the matching rcu_dereference() primitive must be used in order
> >   > to keep lockdep happy, in this case, rcu_dereference_bh().
> 
> Hmm.
> 
> I do think dst_get_neighbour() should use rcu_dereference(), because
> dst->_neighbour are freed by call_rcu().
> 
> The question is : Is following construct [A] safe or not ?
> 
> {
> rcu_read_lock_bh();
> 	/* BH are now disabled, and we are not allowed to sleep */
> 	...
> 
> 	ptr = rcu_dereference();

This should be:

	ptr = rcu_dereference_bh();

As you say below.  Never mind!  ;-)

> 	...
> rcu_read_unlock_bh();
> }
> 
> 
> I dont really understand why lockdep wants [B] instead :
> 
> {
> rcu_read_lock_bh();
> 	...
> 
> 	{
> 	rcu_read_lock();
> 	ptr = rcu_dereference();

Here you are protected by both RCU and RCU-bh, so you should be able
to use either rcu_dereference() or rcu_dereference_bh().  A bit
strange to use rcu_dereference_bh(), though.  Except perhaps if a
pointer to a function was passed in from the outer RCU-bh read-side
critical section or something.

> 	rcu_read_unlock();
> 	}
> 	...
> rcu_read_unlock_bh();
> }
> 
> 
> 
> However, I can understand the other way [C], this is really needed :
> 
> {
> rcu_read_lock();
> 	...
> 
> 	{
> 	rcu_read_lock_bh();
> 	ptr = rcu_dereference_bh();
> 	rcu_read_unlock_bh();
> 	}
> 	...
> rcu_read_unlock();
> }
> 
> I believe [A] should be allowed by lockdep.

OK, I'll bite.  Why?

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Aug. 12, 2011, 7:23 a.m.
Le jeudi 11 août 2011 à 19:32 -0700, Paul E. McKenney a écrit :
> On Thu, Aug 11, 2011 at 06:58:21PM +0200, Eric Dumazet wrote:
> > Le mercredi 10 août 2011 à 10:28 +0100, Mark Rutland a écrit :
> > > > -----Original Message-----
> > > > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > > > Sent: 09 August 2011 18:19
> > > > To: Mark Rutland; Paul E. McKenney
> > > > Cc: netdev@vger.kernel.org; David S. Miller; Gergely Kalman
> > > > Subject: Re: [PATCH] Fix RCU warning in rt_cache_seq_show
> > > > 
> > > > Le mardi 09 août 2011 à 18:02 +0100, Mark Rutland a écrit :
> > > > > Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
> > > > > added rcu protection to dst neighbour, and updated callsites for
> > > > > dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.
> > > > >
> > > > > This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
> > > > > ARM Vexpress A9x4):
> > > > >
> > > > > ===================================================
> > > > > [ INFO: suspicious rcu_dereference_check() usage. ]
> > > > > ---------------------------------------------------
> > > > > include/net/dst.h:91 invoked rcu_dereference_check() without
> > > > protection!
> > > > >
> > > > > other info that might help us debug this:
> > > > >
> > > > > rcu_scheduler_active = 1, debug_locks = 0
> > > > > 2 locks held by proc01/32159:
> > > > >
> > > > > stack backtrace:
> > > > > [<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>]
> > > > (rt_cache_seq_show+0x18c/0x1c4)
> > > > > [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>]
> > > > (seq_read+0x324/0x4a4)
> > > > > [<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>]
> > > > (proc_reg_read+0x70/0x94)
> > > > > [<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>]
> > > > (vfs_read+0xb0/0x144)
> > > > > [<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>]
> > > > (sys_read+0x40/0x70)
> > > > > [<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>]
> > > > (ret_fast_syscall+0x0/0x3c)
> > > > >
> > > > > This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
> > > > > protecting the dereferenced variable, and clearing the warning.
> > > > >
> > > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > > > > Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
> > > > > ---
> > > > >  net/ipv4/route.c |    2 ++
> > > > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > > > >
> > > > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > > > > index e3dec1c..6699ef7 100644
> > > > > --- a/net/ipv4/route.c
> > > > > +++ b/net/ipv4/route.c
> > > > > @@ -419,6 +419,7 @@ static int rt_cache_seq_show(struct seq_file
> > > > *seq, void *v)
> > > > >  		struct neighbour *n;
> > > > >  		int len;
> > > > >
> > > > > +		rcu_read_lock();
> > > > >  		n = dst_get_neighbour(&r->dst);
> > > > >  		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
> > > > >  			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
> > > > > @@ -435,6 +436,7 @@ static int rt_cache_seq_show(struct seq_file
> > > > *seq, void *v)
> > > > >  			-1,
> > > > >  			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
> > > > >  			r->rt_spec_dst, &len);
> > > > > +		rcu_read_unlock();
> > > > >
> > > > >  		seq_printf(seq, "%*s\n", 127 - len, "");
> > > > >  	}
> > > > 
> > > > 
> > > > Hmm, I though rcu_read_lock_bh() (done by caller of this function) was
> > > > protecting us here.
> > > 
> > > Aha. Being a bit trigger-happy, I'd had a quick look at the functions
> > > mentioned in the backtrace, and not looked at any possible inlining.
> > > 
> > > This being my first real exposure to RCU, I wasn't aware of the *_bh
> > > variants. Looking at the documentation (Documentation/RCU/checklist.txt),
> > > I think the real problem is that we should be using rcu_dereference_bh in
> > > this case:
> > > 
> > >   > read-side critical sections are delimited by rcu_read_lock()
> > >   > and rcu_read_unlock(), or by similar primitives such as
> > >   > rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case
> > >   > the matching rcu_dereference() primitive must be used in order
> > >   > to keep lockdep happy, in this case, rcu_dereference_bh().
> > 
> > Hmm.
> > 
> > I do think dst_get_neighbour() should use rcu_dereference(), because
> > dst->_neighbour are freed by call_rcu().
> > 
> > The question is : Is following construct [A] safe or not ?
> > 
> > {
> > rcu_read_lock_bh();
> > 	/* BH are now disabled, and we are not allowed to sleep */
> > 	...
> > 
> > 	ptr = rcu_dereference();
> 
> This should be:
> 
> 	ptr = rcu_dereference_bh();
> 
> As you say below.  Never mind!  ;-)
> 
> > 	...
> > rcu_read_unlock_bh();
> > }
> > 
> > 
> > I dont really understand why lockdep wants [B] instead :
> > 
> > {
> > rcu_read_lock_bh();
> > 	...
> > 
> > 	{
> > 	rcu_read_lock();
> > 	ptr = rcu_dereference();
> 
> Here you are protected by both RCU and RCU-bh, so you should be able
> to use either rcu_dereference() or rcu_dereference_bh().  A bit
> strange to use rcu_dereference_bh(), though.  Except perhaps if a
> pointer to a function was passed in from the outer RCU-bh read-side
> critical section or something.
> 
> > 	rcu_read_unlock();
> > 	}
> > 	...
> > rcu_read_unlock_bh();
> > }
> > 
> > 
> > 
> > However, I can understand the other way [C], this is really needed :
> > 
> > {
> > rcu_read_lock();
> > 	...
> > 
> > 	{
> > 	rcu_read_lock_bh();
> > 	ptr = rcu_dereference_bh();
> > 	rcu_read_unlock_bh();
> > 	}
> > 	...
> > rcu_read_unlock();
> > }
> > 
> > I believe [A] should be allowed by lockdep.
> 
> OK, I'll bite.  Why?
> 

Oh well, I assumed local_bh_disable() disables preemption.

It does since day-0
add_preempt_count(SOFTIRQ_DISABLE_OFFSET);

So following should be safe :

local_bh_disable();
{
ptr = rcu_dereference(...);
use(ptr);
}
local_bh_enable();

Maybe they are longterm plans to break this assumption, I dont know.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney - Aug. 12, 2011, 12:49 p.m.
On Fri, Aug 12, 2011 at 09:23:50AM +0200, Eric Dumazet wrote:
> Le jeudi 11 août 2011 à 19:32 -0700, Paul E. McKenney a écrit :
> > On Thu, Aug 11, 2011 at 06:58:21PM +0200, Eric Dumazet wrote:
> > > Le mercredi 10 août 2011 à 10:28 +0100, Mark Rutland a écrit :
> > > > > -----Original Message-----
> > > > > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > > > > Sent: 09 August 2011 18:19
> > > > > To: Mark Rutland; Paul E. McKenney
> > > > > Cc: netdev@vger.kernel.org; David S. Miller; Gergely Kalman
> > > > > Subject: Re: [PATCH] Fix RCU warning in rt_cache_seq_show
> > > > > 
> > > > > Le mardi 09 août 2011 à 18:02 +0100, Mark Rutland a écrit :
> > > > > > Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
> > > > > > added rcu protection to dst neighbour, and updated callsites for
> > > > > > dst_{get,set}_neighbour. Unfortunately, it missed rt_cache_seq_show.
> > > > > >
> > > > > > This produces a warning on v3.1-rc1 (on a preemptible kernel, on an
> > > > > > ARM Vexpress A9x4):
> > > > > >
> > > > > > ===================================================
> > > > > > [ INFO: suspicious rcu_dereference_check() usage. ]
> > > > > > ---------------------------------------------------
> > > > > > include/net/dst.h:91 invoked rcu_dereference_check() without
> > > > > protection!
> > > > > >
> > > > > > other info that might help us debug this:
> > > > > >
> > > > > > rcu_scheduler_active = 1, debug_locks = 0
> > > > > > 2 locks held by proc01/32159:
> > > > > >
> > > > > > stack backtrace:
> > > > > > [<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>]
> > > > > (rt_cache_seq_show+0x18c/0x1c4)
> > > > > > [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>]
> > > > > (seq_read+0x324/0x4a4)
> > > > > > [<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>]
> > > > > (proc_reg_read+0x70/0x94)
> > > > > > [<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>]
> > > > > (vfs_read+0xb0/0x144)
> > > > > > [<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>]
> > > > > (sys_read+0x40/0x70)
> > > > > > [<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>]
> > > > > (ret_fast_syscall+0x0/0x3c)
> > > > > >
> > > > > > This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
> > > > > > protecting the dereferenced variable, and clearing the warning.
> > > > > >
> > > > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > > > > > Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
> > > > > > ---
> > > > > >  net/ipv4/route.c |    2 ++
> > > > > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > > > > >
> > > > > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > > > > > index e3dec1c..6699ef7 100644
> > > > > > --- a/net/ipv4/route.c
> > > > > > +++ b/net/ipv4/route.c
> > > > > > @@ -419,6 +419,7 @@ static int rt_cache_seq_show(struct seq_file
> > > > > *seq, void *v)
> > > > > >  		struct neighbour *n;
> > > > > >  		int len;
> > > > > >
> > > > > > +		rcu_read_lock();
> > > > > >  		n = dst_get_neighbour(&r->dst);
> > > > > >  		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
> > > > > >  			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
> > > > > > @@ -435,6 +436,7 @@ static int rt_cache_seq_show(struct seq_file
> > > > > *seq, void *v)
> > > > > >  			-1,
> > > > > >  			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
> > > > > >  			r->rt_spec_dst, &len);
> > > > > > +		rcu_read_unlock();
> > > > > >
> > > > > >  		seq_printf(seq, "%*s\n", 127 - len, "");
> > > > > >  	}
> > > > > 
> > > > > 
> > > > > Hmm, I though rcu_read_lock_bh() (done by caller of this function) was
> > > > > protecting us here.
> > > > 
> > > > Aha. Being a bit trigger-happy, I'd had a quick look at the functions
> > > > mentioned in the backtrace, and not looked at any possible inlining.
> > > > 
> > > > This being my first real exposure to RCU, I wasn't aware of the *_bh
> > > > variants. Looking at the documentation (Documentation/RCU/checklist.txt),
> > > > I think the real problem is that we should be using rcu_dereference_bh in
> > > > this case:
> > > > 
> > > >   > read-side critical sections are delimited by rcu_read_lock()
> > > >   > and rcu_read_unlock(), or by similar primitives such as
> > > >   > rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case
> > > >   > the matching rcu_dereference() primitive must be used in order
> > > >   > to keep lockdep happy, in this case, rcu_dereference_bh().
> > > 
> > > Hmm.
> > > 
> > > I do think dst_get_neighbour() should use rcu_dereference(), because
> > > dst->_neighbour are freed by call_rcu().
> > > 
> > > The question is : Is following construct [A] safe or not ?
> > > 
> > > {
> > > rcu_read_lock_bh();
> > > 	/* BH are now disabled, and we are not allowed to sleep */
> > > 	...
> > > 
> > > 	ptr = rcu_dereference();
> > 
> > This should be:
> > 
> > 	ptr = rcu_dereference_bh();
> > 
> > As you say below.  Never mind!  ;-)
> > 
> > > 	...
> > > rcu_read_unlock_bh();
> > > }
> > > 
> > > 
> > > I dont really understand why lockdep wants [B] instead :
> > > 
> > > {
> > > rcu_read_lock_bh();
> > > 	...
> > > 
> > > 	{
> > > 	rcu_read_lock();
> > > 	ptr = rcu_dereference();
> > 
> > Here you are protected by both RCU and RCU-bh, so you should be able
> > to use either rcu_dereference() or rcu_dereference_bh().  A bit
> > strange to use rcu_dereference_bh(), though.  Except perhaps if a
> > pointer to a function was passed in from the outer RCU-bh read-side
> > critical section or something.
> > 
> > > 	rcu_read_unlock();
> > > 	}
> > > 	...
> > > rcu_read_unlock_bh();
> > > }
> > > 
> > > 
> > > 
> > > However, I can understand the other way [C], this is really needed :
> > > 
> > > {
> > > rcu_read_lock();
> > > 	...
> > > 
> > > 	{
> > > 	rcu_read_lock_bh();
> > > 	ptr = rcu_dereference_bh();
> > > 	rcu_read_unlock_bh();
> > > 	}
> > > 	...
> > > rcu_read_unlock();
> > > }
> > > 
> > > I believe [A] should be allowed by lockdep.
> > 
> > OK, I'll bite.  Why?
> > 
> 
> Oh well, I assumed local_bh_disable() disables preemption.
> 
> It does since day-0
> add_preempt_count(SOFTIRQ_DISABLE_OFFSET);
> 
> So following should be safe :
> 
> local_bh_disable();
> {
> ptr = rcu_dereference(...);
> use(ptr);
> }
> local_bh_enable();
> 
> Maybe they are longterm plans to break this assumption, I dont know.

It would be safe for TINY_RCU and TREE_RCU, but not for either
TINY_PREEMPT_RCU or TREE_PREEMPT_RCU.  These last two do not
recognize a preempt-disable region as an RCU read-side critical
section.

						Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
include/net/dst.h:91 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by proc01/32159:

stack backtrace:
[<80014880>] (unwind_backtrace+0x0/0xf8) from [<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4)
[<802e5c78>] (rt_cache_seq_show+0x18c/0x1c4) from [<800e0c5c>] (seq_read+0x324/0x4a4)
[<800e0c5c>] (seq_read+0x324/0x4a4) from [<8010786c>] (proc_reg_read+0x70/0x94)
[<8010786c>] (proc_reg_read+0x70/0x94) from [<800c0ba8>] (vfs_read+0xb0/0x144)
[<800c0ba8>] (vfs_read+0xb0/0x144) from [<800c0ea8>] (sys_read+0x40/0x70)
[<800c0ea8>] (sys_read+0x40/0x70) from [<8000e0c0>] (ret_fast_syscall+0x0/0x3c)

This patch adds calls to rcu_read_{lock,unlock} in rt_cache_seq_show,
protecting the dereferenced variable, and clearing the warning.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Gergely Kalman <synapse@hippy.csoma.elte.hu>
---
 net/ipv4/route.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e3dec1c..6699ef7 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -419,6 +419,7 @@  static int rt_cache_seq_show(struct seq_file *seq, void *v)
 		struct neighbour *n;
 		int len;
 
+		rcu_read_lock();
 		n = dst_get_neighbour(&r->dst);
 		seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
 			      "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
@@ -435,6 +436,7 @@  static int rt_cache_seq_show(struct seq_file *seq, void *v)
 			-1,
 			(n && (n->nud_state & NUD_CONNECTED)) ? 1 : 0,
 			r->rt_spec_dst, &len);
+		rcu_read_unlock();
 
 		seq_printf(seq, "%*s\n", 127 - len, "");
 	}