mbox series

[0/4] RCU: introduce noref debug

Message ID cover.1507294365.git.pabeni@redhat.com
Headers show
Series RCU: introduce noref debug | expand

Message

Paolo Abeni Oct. 6, 2017, 12:57 p.m. UTC
The networking subsystem is currently using some kind of long-lived
RCU-protected, references to avoid the overhead of full book-keeping.

Such references - skb_dst() noref - are stored inside the skbs and can be
moved across relevant slices of the network stack, with the users
being in charge of properly clearing the relevant skb - or properly refcount
the related dst references - before the skb escapes the RCU section.

We currently don't have any deterministic debug infrastructure to check
the dst noref usages - and the introduction of others noref artifact is
currently under discussion.

This series tries to tackle the above introducing an RCU debug infrastructure
aimed at spotting incorrect noref pointer usage, in patch one. The
infrastructure is small and must be explicitly enabled via a newly introduced
build option.

Patch two uses such infrastructure to track dst noref usage in the networking
stack.

Patch 3 and 4 are bugfixes for small buglet found running this infrastructure
on basic scenarios.

Paolo Abeni (4):
  rcu: introduce noref debug
  net: use RCU noref infrastructure to track dst noref
  ipv4: drop unneeded and misleading RCU lock in ip_route_input_noref()
  tcp: avoid noref dst leak on input path

 include/linux/rcupdate.h | 11 ++++++
 include/linux/skbuff.h   |  1 +
 include/net/dst.h        |  5 +++
 kernel/rcu/Kconfig.debug | 15 ++++++++
 kernel/rcu/Makefile      |  1 +
 kernel/rcu/noref_debug.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/route.c         |  7 +---
 net/ipv4/tcp_input.c     |  5 ++-
 8 files changed, 127 insertions(+), 7 deletions(-)
 create mode 100644 kernel/rcu/noref_debug.c

Comments

Paul E. McKenney Oct. 6, 2017, 1:34 p.m. UTC | #1
On Fri, Oct 06, 2017 at 02:57:45PM +0200, Paolo Abeni wrote:
> The networking subsystem is currently using some kind of long-lived
> RCU-protected, references to avoid the overhead of full book-keeping.
> 
> Such references - skb_dst() noref - are stored inside the skbs and can be
> moved across relevant slices of the network stack, with the users
> being in charge of properly clearing the relevant skb - or properly refcount
> the related dst references - before the skb escapes the RCU section.
> 
> We currently don't have any deterministic debug infrastructure to check
> the dst noref usages - and the introduction of others noref artifact is
> currently under discussion.
> 
> This series tries to tackle the above introducing an RCU debug infrastructure
> aimed at spotting incorrect noref pointer usage, in patch one. The
> infrastructure is small and must be explicitly enabled via a newly introduced
> build option.
> 
> Patch two uses such infrastructure to track dst noref usage in the networking
> stack.
> 
> Patch 3 and 4 are bugfixes for small buglet found running this infrastructure
> on basic scenarios.

This patchset does not look like it handles rcu_read_lock() nesting.
For example, given code like this:

	void foo(void)
	{
		rcu_read_lock();
		rcu_track_noref(&key2, &noref2, true);
		do_something();
		rcu_track_noref(&key2, &noref2, false);
		rcu_read_unlock();
	}

	void bar(void)
	{
		rcu_read_lock();
		rcu_track_noref(&key1, &noref1, true);
		do_something_more();
		foo();
		do_something_else();
		rcu_track_noref(&key1, &noref1, false);
		rcu_read_unlock();
	}

	void grill(void)
	{
		foo();
	}

It looks like foo()'s rcu_read_unlock() will complain about key1.
You could remove foo()'s rcu_read_lock() and rcu_read_unlock(), but
that will break the call from grill().

Or am I missing something subtle here?  Given patch 3/4, I suspect not...

							Thanx, Paul

> Paolo Abeni (4):
>   rcu: introduce noref debug
>   net: use RCU noref infrastructure to track dst noref
>   ipv4: drop unneeded and misleading RCU lock in ip_route_input_noref()
>   tcp: avoid noref dst leak on input path
> 
>  include/linux/rcupdate.h | 11 ++++++
>  include/linux/skbuff.h   |  1 +
>  include/net/dst.h        |  5 +++
>  kernel/rcu/Kconfig.debug | 15 ++++++++
>  kernel/rcu/Makefile      |  1 +
>  kernel/rcu/noref_debug.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++
>  net/ipv4/route.c         |  7 +---
>  net/ipv4/tcp_input.c     |  5 ++-
>  8 files changed, 127 insertions(+), 7 deletions(-)
>  create mode 100644 kernel/rcu/noref_debug.c
> 
> -- 
> 2.13.6
>
Paolo Abeni Oct. 6, 2017, 3:10 p.m. UTC | #2
Hi,

On Fri, 2017-10-06 at 06:34 -0700, Paul E. McKenney wrote:
> On Fri, Oct 06, 2017 at 02:57:45PM +0200, Paolo Abeni wrote:
> > The networking subsystem is currently using some kind of long-lived
> > RCU-protected, references to avoid the overhead of full book-keeping.
> > 
> > Such references - skb_dst() noref - are stored inside the skbs and can be
> > moved across relevant slices of the network stack, with the users
> > being in charge of properly clearing the relevant skb - or properly refcount
> > the related dst references - before the skb escapes the RCU section.
> > 
> > We currently don't have any deterministic debug infrastructure to check
> > the dst noref usages - and the introduction of others noref artifact is
> > currently under discussion.
> > 
> > This series tries to tackle the above introducing an RCU debug infrastructure
> > aimed at spotting incorrect noref pointer usage, in patch one. The
> > infrastructure is small and must be explicitly enabled via a newly introduced
> > build option.
> > 
> > Patch two uses such infrastructure to track dst noref usage in the networking
> > stack.
> > 
> > Patch 3 and 4 are bugfixes for small buglet found running this infrastructure
> > on basic scenarios.

Thank you for the prompt reply!
> 
> This patchset does not look like it handles rcu_read_lock() nesting.
> For example, given code like this:
> 
> 	void foo(void)
> 	{
> 		rcu_read_lock();
> 		rcu_track_noref(&key2, &noref2, true);
> 		do_something();
> 		rcu_track_noref(&key2, &noref2, false);
> 		rcu_read_unlock();
> 	}
> 
> 	void bar(void)
> 	{
> 		rcu_read_lock();
> 		rcu_track_noref(&key1, &noref1, true);
> 		do_something_more();
> 		foo();
> 		do_something_else();
> 		rcu_track_noref(&key1, &noref1, false);
> 		rcu_read_unlock();
> 	}
> 
> 	void grill(void)
> 	{
> 		foo();
> 	}
> 
> It looks like foo()'s rcu_read_unlock() will complain about key1.
> You could remove foo()'s rcu_read_lock() and rcu_read_unlock(), but
> that will break the call from grill().

Actually the code should cope correctly with your example; when foo()'s
rcu_read_unlock() is called, 'cache' contains:

{ { &key1, &noref1, 1},  // ...

and when the related __rcu_check_noref() is invoked preempt_count() is
2 - because the check is called before decreasing the preempt counter.

In the main loop inside __rcu_check_noref() we will hit always the
'continue' statement because 'cache->store[i].nesting != nesting', so
no warn will be triggered.

> Or am I missing something subtle here?  Given patch 3/4, I suspect not...

The problem with the code in patch 3/4 is different; currently
ip_route_input_noref() is basically doing:

rcu_read_lock();

rcu_track_noref(&key1, &noref1, true);

rcu_read_unlock();

So the rcu lock there silence any RCU based check inside
ip_route_input_noref() but does not really protect the noref dst.

Please let me know if the above clarify the scenario.

Thanks,

Paolo
Paul E. McKenney Oct. 6, 2017, 4:34 p.m. UTC | #3
On Fri, Oct 06, 2017 at 05:10:09PM +0200, Paolo Abeni wrote:
> Hi,
> 
> On Fri, 2017-10-06 at 06:34 -0700, Paul E. McKenney wrote:
> > On Fri, Oct 06, 2017 at 02:57:45PM +0200, Paolo Abeni wrote:
> > > The networking subsystem is currently using some kind of long-lived
> > > RCU-protected, references to avoid the overhead of full book-keeping.
> > > 
> > > Such references - skb_dst() noref - are stored inside the skbs and can be
> > > moved across relevant slices of the network stack, with the users
> > > being in charge of properly clearing the relevant skb - or properly refcount
> > > the related dst references - before the skb escapes the RCU section.
> > > 
> > > We currently don't have any deterministic debug infrastructure to check
> > > the dst noref usages - and the introduction of others noref artifact is
> > > currently under discussion.
> > > 
> > > This series tries to tackle the above introducing an RCU debug infrastructure
> > > aimed at spotting incorrect noref pointer usage, in patch one. The
> > > infrastructure is small and must be explicitly enabled via a newly introduced
> > > build option.
> > > 
> > > Patch two uses such infrastructure to track dst noref usage in the networking
> > > stack.
> > > 
> > > Patch 3 and 4 are bugfixes for small buglet found running this infrastructure
> > > on basic scenarios.
> 
> Thank you for the prompt reply!
> > 
> > This patchset does not look like it handles rcu_read_lock() nesting.
> > For example, given code like this:
> > 
> > 	void foo(void)
> > 	{
> > 		rcu_read_lock();
> > 		rcu_track_noref(&key2, &noref2, true);
> > 		do_something();
> > 		rcu_track_noref(&key2, &noref2, false);
> > 		rcu_read_unlock();
> > 	}
> > 
> > 	void bar(void)
> > 	{
> > 		rcu_read_lock();
> > 		rcu_track_noref(&key1, &noref1, true);
> > 		do_something_more();
> > 		foo();
> > 		do_something_else();
> > 		rcu_track_noref(&key1, &noref1, false);
> > 		rcu_read_unlock();
> > 	}
> > 
> > 	void grill(void)
> > 	{
> > 		foo();
> > 	}
> > 
> > It looks like foo()'s rcu_read_unlock() will complain about key1.
> > You could remove foo()'s rcu_read_lock() and rcu_read_unlock(), but
> > that will break the call from grill().
> 
> Actually the code should cope correctly with your example; when foo()'s
> rcu_read_unlock() is called, 'cache' contains:
> 
> { { &key1, &noref1, 1},  // ...
> 
> and when the related __rcu_check_noref() is invoked preempt_count() is
> 2 - because the check is called before decreasing the preempt counter.
> 
> In the main loop inside __rcu_check_noref() we will hit always the
> 'continue' statement because 'cache->store[i].nesting != nesting', so
> no warn will be triggered.

You are right, it was too early, and my example wasn't correct.  How
about this one?

	void foo(void (*f)(struct s *sp), struct s **spp)
	{
		rcu_read_lock();
		rcu_track_noref(&key2, &noref2, true);
		f(spp);
		rcu_track_noref(&key2, &noref2, false);
		rcu_read_unlock();
	}

	void barcb(struct s **spp)
	{
		*spp = &noref3;
		rcu_track_noref(&key3, *spp, true);
	}

	void bar(void)
	{
		struct s *sp;

		rcu_read_lock();
		rcu_track_noref(&key1, &noref1, true);
		do_something_more();
		foo(barcb, &sp);
		do_something_else(sp);
		rcu_track_noref(&key3, sp, false);
		rcu_track_noref(&key1, &noref1, false);
		rcu_read_unlock();
	}

	void grillcb(struct s **spp)
	{
		*spp
	}

	void grill(void)
	{
		foo();
	}

How does the user select the key argument?  It looks like someone
can choose to just pass in NULL -- is that the intent, or am I confused
about this as well?

> > Or am I missing something subtle here?  Given patch 3/4, I suspect not...
> 
> The problem with the code in patch 3/4 is different; currently
> ip_route_input_noref() is basically doing:
> 
> rcu_read_lock();
> 
> rcu_track_noref(&key1, &noref1, true);
> 
> rcu_read_unlock();
> 
> So the rcu lock there silence any RCU based check inside
> ip_route_input_noref() but does not really protect the noref dst.
> 
> Please let me know if the above clarify the scenario.

OK.

I am also concerned about false negatives when the user invokes
rcu_track_noref(..., false) but then leaks the pointer anyway.  Or is
there something you are doing that catches this that I am missing?

							Thanx, Paul
Paolo Abeni Oct. 9, 2017, 4:53 p.m. UTC | #4
On Fri, 2017-10-06 at 09:34 -0700, Paul E. McKenney wrote:
> On Fri, Oct 06, 2017 at 05:10:09PM +0200, Paolo Abeni wrote:
> > Hi,
> > 
> > On Fri, 2017-10-06 at 06:34 -0700, Paul E. McKenney wrote:
> > > On Fri, Oct 06, 2017 at 02:57:45PM +0200, Paolo Abeni wrote:
> > > > The networking subsystem is currently using some kind of long-lived
> > > > RCU-protected, references to avoid the overhead of full book-keeping.
> > > > 
> > > > Such references - skb_dst() noref - are stored inside the skbs and can be
> > > > moved across relevant slices of the network stack, with the users
> > > > being in charge of properly clearing the relevant skb - or properly refcount
> > > > the related dst references - before the skb escapes the RCU section.
> > > > 
> > > > We currently don't have any deterministic debug infrastructure to check
> > > > the dst noref usages - and the introduction of others noref artifact is
> > > > currently under discussion.
> > > > 
> > > > This series tries to tackle the above introducing an RCU debug infrastructure
> > > > aimed at spotting incorrect noref pointer usage, in patch one. The
> > > > infrastructure is small and must be explicitly enabled via a newly introduced
> > > > build option.
> > > > 
> > > > Patch two uses such infrastructure to track dst noref usage in the networking
> > > > stack.
> > > > 
> > > > Patch 3 and 4 are bugfixes for small buglet found running this infrastructure
> > > > on basic scenarios.
> > 
> > Thank you for the prompt reply!
> > > 
> > > This patchset does not look like it handles rcu_read_lock() nesting.
> > > For example, given code like this:
> > > 
> > > 	void foo(void)
> > > 	{
> > > 		rcu_read_lock();
> > > 		rcu_track_noref(&key2, &noref2, true);
> > > 		do_something();
> > > 		rcu_track_noref(&key2, &noref2, false);
> > > 		rcu_read_unlock();
> > > 	}
> > > 
> > > 	void bar(void)
> > > 	{
> > > 		rcu_read_lock();
> > > 		rcu_track_noref(&key1, &noref1, true);
> > > 		do_something_more();
> > > 		foo();
> > > 		do_something_else();
> > > 		rcu_track_noref(&key1, &noref1, false);
> > > 		rcu_read_unlock();
> > > 	}
> > > 
> > > 	void grill(void)
> > > 	{
> > > 		foo();
> > > 	}
> > > 
> > > It looks like foo()'s rcu_read_unlock() will complain about key1.
> > > You could remove foo()'s rcu_read_lock() and rcu_read_unlock(), but
> > > that will break the call from grill().
> > 
> > Actually the code should cope correctly with your example; when foo()'s
> > rcu_read_unlock() is called, 'cache' contains:
> > 
> > { { &key1, &noref1, 1},  // ...
> > 
> > and when the related __rcu_check_noref() is invoked preempt_count() is
> > 2 - because the check is called before decreasing the preempt counter.
> > 
> > In the main loop inside __rcu_check_noref() we will hit always the
> > 'continue' statement because 'cache->store[i].nesting != nesting', so
> > no warn will be triggered.
> 
> You are right, it was too early, and my example wasn't correct.  How
> about this one?
> 
> 	void foo(void (*f)(struct s *sp), struct s **spp)
> 	{
> 		rcu_read_lock();
> 		rcu_track_noref(&key2, &noref2, true);
> 		f(spp);
> 		rcu_track_noref(&key2, &noref2, false);
> 		rcu_read_unlock();
> 	}
> 
> 	void barcb(struct s **spp)
> 	{
> 		*spp = &noref3;
> 		rcu_track_noref(&key3, *spp, true);
> 	}
> 
> 	void bar(void)
> 	{
> 		struct s *sp;
> 
> 		rcu_read_lock();
> 		rcu_track_noref(&key1, &noref1, true);
> 		do_something_more();
> 		foo(barcb, &sp);
> 		do_something_else(sp);
> 		rcu_track_noref(&key3, sp, false);
> 		rcu_track_noref(&key1, &noref1, false);
> 		rcu_read_unlock();
> 	}
> 
> 	void grillcb(struct s **spp)
> 	{
> 		*spp
> 	}
> 
> 	void grill(void)
> 	{
> 		foo();
> 	}

You are right: this will generate a splat, even if the code it safe.
The false positive can be avoided looking for leaked references only in
the outermost rcu unlook. I did a previous implementation performing
such check, but it emitted very generic splat so I tried to be more
strict. The latter choice allowed to find/do 3/4.

What about using save_stack_trace() in rcu_track_noref(, true) and
reporting such stack trace when the check in the outer most rcu fails?

the current strict/false-postive-prone check could be enabled under an
additional build flag.

> How does the user select the key argument?  It looks like someone
> can choose to just pass in NULL -- is that the intent, or am I confused
> about this as well?

The 'key' argument is intented to discriminate the scope of the same
noref dst attached to different skbs, which happens e.g. as a result of
as skb_dst_copy().

In a generic use case, it can be NULL, too.

> I am also concerned about false negatives when the user invokes
> rcu_track_noref(..., false) but then leaks the pointer anyway.  Or is
> there something you are doing that catches this that I am missing?

If the rcu_track_noref(..., false) is misplaced or missing, yes we can
have false negative. 

In the noref rcu use-case the rcu_track_noref(, false) call is/must be
placed side-by-side with the code clearing the the noref pointer, so
is/should be quite easy avoiding such mistakes.

Cheers,

Paolo
Paul E. McKenney Oct. 11, 2017, 4:02 a.m. UTC | #5
On Mon, Oct 09, 2017 at 06:53:12PM +0200, Paolo Abeni wrote:
> On Fri, 2017-10-06 at 09:34 -0700, Paul E. McKenney wrote:
> > On Fri, Oct 06, 2017 at 05:10:09PM +0200, Paolo Abeni wrote:
> > > Hi,
> > > 
> > > On Fri, 2017-10-06 at 06:34 -0700, Paul E. McKenney wrote:
> > > > On Fri, Oct 06, 2017 at 02:57:45PM +0200, Paolo Abeni wrote:
> > > > > The networking subsystem is currently using some kind of long-lived
> > > > > RCU-protected, references to avoid the overhead of full book-keeping.
> > > > > 
> > > > > Such references - skb_dst() noref - are stored inside the skbs and can be
> > > > > moved across relevant slices of the network stack, with the users
> > > > > being in charge of properly clearing the relevant skb - or properly refcount
> > > > > the related dst references - before the skb escapes the RCU section.
> > > > > 
> > > > > We currently don't have any deterministic debug infrastructure to check
> > > > > the dst noref usages - and the introduction of others noref artifact is
> > > > > currently under discussion.
> > > > > 
> > > > > This series tries to tackle the above introducing an RCU debug infrastructure
> > > > > aimed at spotting incorrect noref pointer usage, in patch one. The
> > > > > infrastructure is small and must be explicitly enabled via a newly introduced
> > > > > build option.
> > > > > 
> > > > > Patch two uses such infrastructure to track dst noref usage in the networking
> > > > > stack.
> > > > > 
> > > > > Patch 3 and 4 are bugfixes for small buglet found running this infrastructure
> > > > > on basic scenarios.
> > > 
> > > Thank you for the prompt reply!
> > > > 
> > > > This patchset does not look like it handles rcu_read_lock() nesting.
> > > > For example, given code like this:
> > > > 
> > > > 	void foo(void)
> > > > 	{
> > > > 		rcu_read_lock();
> > > > 		rcu_track_noref(&key2, &noref2, true);
> > > > 		do_something();
> > > > 		rcu_track_noref(&key2, &noref2, false);
> > > > 		rcu_read_unlock();
> > > > 	}
> > > > 
> > > > 	void bar(void)
> > > > 	{
> > > > 		rcu_read_lock();
> > > > 		rcu_track_noref(&key1, &noref1, true);
> > > > 		do_something_more();
> > > > 		foo();
> > > > 		do_something_else();
> > > > 		rcu_track_noref(&key1, &noref1, false);
> > > > 		rcu_read_unlock();
> > > > 	}
> > > > 
> > > > 	void grill(void)
> > > > 	{
> > > > 		foo();
> > > > 	}
> > > > 
> > > > It looks like foo()'s rcu_read_unlock() will complain about key1.
> > > > You could remove foo()'s rcu_read_lock() and rcu_read_unlock(), but
> > > > that will break the call from grill().
> > > 
> > > Actually the code should cope correctly with your example; when foo()'s
> > > rcu_read_unlock() is called, 'cache' contains:
> > > 
> > > { { &key1, &noref1, 1},  // ...
> > > 
> > > and when the related __rcu_check_noref() is invoked preempt_count() is
> > > 2 - because the check is called before decreasing the preempt counter.
> > > 
> > > In the main loop inside __rcu_check_noref() we will hit always the
> > > 'continue' statement because 'cache->store[i].nesting != nesting', so
> > > no warn will be triggered.
> > 
> > You are right, it was too early, and my example wasn't correct.  How
> > about this one?
> > 
> > 	void foo(void (*f)(struct s *sp), struct s **spp)
> > 	{
> > 		rcu_read_lock();
> > 		rcu_track_noref(&key2, &noref2, true);
> > 		f(spp);
> > 		rcu_track_noref(&key2, &noref2, false);
> > 		rcu_read_unlock();
> > 	}
> > 
> > 	void barcb(struct s **spp)
> > 	{
> > 		*spp = &noref3;
> > 		rcu_track_noref(&key3, *spp, true);
> > 	}
> > 
> > 	void bar(void)
> > 	{
> > 		struct s *sp;
> > 
> > 		rcu_read_lock();
> > 		rcu_track_noref(&key1, &noref1, true);
> > 		do_something_more();
> > 		foo(barcb, &sp);
> > 		do_something_else(sp);
> > 		rcu_track_noref(&key3, sp, false);
> > 		rcu_track_noref(&key1, &noref1, false);
> > 		rcu_read_unlock();
> > 	}
> > 
> > 	void grillcb(struct s **spp)
> > 	{
> > 		*spp
> > 	}
> > 
> > 	void grill(void)
> > 	{
> > 		foo();
> > 	}
> 
> You are right: this will generate a splat, even if the code it safe.
> The false positive can be avoided looking for leaked references only in
> the outermost rcu unlook. I did a previous implementation performing
> such check, but it emitted very generic splat so I tried to be more
> strict. The latter choice allowed to find/do 3/4.
> 
> What about using save_stack_trace() in rcu_track_noref(, true) and
> reporting such stack trace when the check in the outer most rcu fails?
> 
> the current strict/false-postive-prone check could be enabled under an
> additional build flag.

Linus and Ingo will ask me how users decide how they should set that
additional build flag.  Especially given that if there is code that
requires non-strict checking, isn't everyone required to set up non-strict
checking to avoid false positives?  Unless you can also configure out
all the code that requires non-strict checking, I suppose, but how
would you keep track of what to configure out?

> > How does the user select the key argument?  It looks like someone
> > can choose to just pass in NULL -- is that the intent, or am I confused
> > about this as well?
> 
> The 'key' argument is intented to discriminate the scope of the same
> noref dst attached to different skbs, which happens e.g. as a result of
> as skb_dst_copy().
> 
> In a generic use case, it can be NULL, too.

OK.  There will probably be some discussion about the API in that case.

> > I am also concerned about false negatives when the user invokes
> > rcu_track_noref(..., false) but then leaks the pointer anyway.  Or is
> > there something you are doing that catches this that I am missing?
> 
> If the rcu_track_noref(..., false) is misplaced or missing, yes we can
> have false negative. 
> 
> In the noref rcu use-case the rcu_track_noref(, false) call is/must be
> placed side-by-side with the code clearing the the noref pointer, so
> is/should be quite easy avoiding such mistakes.

True enough.  Except that if people were good about always clearing the
pointer, then the pointer couldn't leak, right?  Or am I missing something
in your use cases?

Or to put it another way -- have you been able to catch any real
pointer-leak bugs with this?  The other RCU debug options have had
pretty long found-bug lists before we accepted them.

							Thanx, Paul
Paolo Abeni Oct. 11, 2017, 2:50 p.m. UTC | #6
On Tue, 2017-10-10 at 21:02 -0700, Paul E. McKenney wrote:
> Linus and Ingo will ask me how users decide how they should set that
> additional build flag.  Especially given that if there is code that
> requires non-strict checking, isn't everyone required to set up non-strict
> checking to avoid false positives?  Unless you can also configure out
> all the code that requires non-strict checking, I suppose, but how
> would you keep track of what to configure out?

I'm working to a new version using a single compile flag - without
additional strict option.

I don't know of any other subsytem that stores rcu pointer in
datastructures for a longer amount of time. That having said, I wonder
if the tests should completely move to the networking subsystem for the
time being. The Kconfig option would thus be called NET_DEBUG or
something along the lines. For abstraction it would be possible to add
an atomic_notifier_chain to the rcu_read/unlock methods, where multiple
users or checkers could register for. That way we keep the users
seperate from the implementation for the cost of a bit more layering
and more indirect calls. But given that this will anyway slow down
execution a lot, it will anyway only be suitable in
testing/verification/debugging environments.

> OK.  There will probably be some discussion about the API in that case.

I'll drop noref parameter, the key will became mandatory - the exact
position of where the reference of RCU managed object is stored. In the
case of noref dst it is &skb->_skb_refdst. With this kind of API it
should suite more subsystems.

> True enough.  Except that if people were good about always clearing the
> pointer, then the pointer couldn't leak, right?  Or am I missing something
> in your use cases?

This is correct. The dst_entry checking in skb, which this patch series
implements there are strict brackets in the sense of skb_dst_set,
skb_dst_set_noref, skb_dst_force, etc., which form brackets around the
safe uses of those dst_entries. This patch series validates that the
correct skb_dst_* functions are being called before the sk_buff leaves
the rcu protected section. Thus we don't need to modify and review a
lot of code but we can just patch into those helpers already.

> Or to put it another way -- have you been able to catch any real
> pointer-leak bugs with thister-leak bugs with this?  The other RCU
> debug options have had pretty long found-bug lists before we accepted
> them.

There have been two problems found so far, one is a rather minor one
while the other one seems like a normal bug. The patches for those are
part of this series (3/4 and 4/4).

Regards,

Paolo
Paul E. McKenney Oct. 11, 2017, 3:45 p.m. UTC | #7
On Wed, Oct 11, 2017 at 04:50:36PM +0200, Paolo Abeni wrote:
> On Tue, 2017-10-10 at 21:02 -0700, Paul E. McKenney wrote:
> > Linus and Ingo will ask me how users decide how they should set that
> > additional build flag.  Especially given that if there is code that
> > requires non-strict checking, isn't everyone required to set up non-strict
> > checking to avoid false positives?  Unless you can also configure out
> > all the code that requires non-strict checking, I suppose, but how
> > would you keep track of what to configure out?
> 
> I'm working to a new version using a single compile flag - without
> additional strict option.
> 
> I don't know of any other subsytem that stores rcu pointer in
> datastructures for a longer amount of time. That having said, I wonder
> if the tests should completely move to the networking subsystem for the
> time being. The Kconfig option would thus be called NET_DEBUG or
> something along the lines. For abstraction it would be possible to add
> an atomic_notifier_chain to the rcu_read/unlock methods, where multiple
> users or checkers could register for. That way we keep the users
> seperate from the implementation for the cost of a bit more layering
> and more indirect calls. But given that this will anyway slow down
> execution a lot, it will anyway only be suitable in
> testing/verification/debugging environments.

I like this approach.  And if it does a good job of finding networking
bugs over the next year or so, I would be quite happy to consider
something for all RCU users.

> > OK.  There will probably be some discussion about the API in that case.
> 
> I'll drop noref parameter, the key will became mandatory - the exact
> position of where the reference of RCU managed object is stored. In the
> case of noref dst it is &skb->_skb_refdst. With this kind of API it
> should suite more subsystems.

Interesting.  Do you intend to allow rcu_track_noref() to refuse to
record a pointer?  Other than for the array-full case.

> > True enough.  Except that if people were good about always clearing the
> > pointer, then the pointer couldn't leak, right?  Or am I missing something
> > in your use cases?
> 
> This is correct. The dst_entry checking in skb, which this patch series
> implements there are strict brackets in the sense of skb_dst_set,
> skb_dst_set_noref, skb_dst_force, etc., which form brackets around the
> safe uses of those dst_entries. This patch series validates that the
> correct skb_dst_* functions are being called before the sk_buff leaves
> the rcu protected section. Thus we don't need to modify and review a
> lot of code but we can just patch into those helpers already.

Makes sense.  Those willing to strictly bracket uses gain some debug
assist.

> > Or to put it another way -- have you been able to catch any real
> > pointer-leak bugs with thister-leak bugs with this?  The other RCU
> > debug options have had pretty long found-bug lists before we accepted
> > them.
> 
> There have been two problems found so far, one is a rather minor one
> while the other one seems like a normal bug. The patches for those are
> part of this series (3/4 and 4/4).

I agree that you have started gathering evidence and that the early
indications are positive, if quite mildly so.  If this time next year
there are a few tens of such bugs found, preferably including some
serious ones, I would very likely look favorably on pulling this in to
allow others to use it.

							Thanx, Paul