diff mbox

[v2,1/2] sched: Add cond_resched_rcu_lock() helper

Message ID 1367290378-29224-2-git-send-email-horms@verge.net.au
State RFC
Headers show

Commit Message

Simon Horman April 30, 2013, 2:52 a.m. UTC
This is intended for use in loops which read data protected by RCU and may
have a large number of iterations.  Such an example is dumping the list of
connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().

The call to cond_resched() is guarded by #ifndef CONFIG_PREEMPT_RCU as in
the case of CONFIG_PREEMPT_RCU _rcu_read_unlock() will check to see if the
RCU core needs to be informed, so there is no need to invoke cond_resched()
in that case. Thanks to Paul E. McKenney for explaining this.

cond_resched_rcu_lock() suggested by Eric Dumazet.
ifndef guard suggested by Paul E. McKenney and Julian Anastasov.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Julian Anastasov <ja@ssi.bg>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Simon Horman <horms@verge.net.au>

---

v2
* Add guard the call to cond_resched()
---
 include/linux/sched.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Julian Anastasov April 30, 2013, 7:12 a.m. UTC | #1
Hello,

On Tue, 30 Apr 2013, Simon Horman wrote:

> This is intended for use in loops which read data protected by RCU and may
> have a large number of iterations.  Such an example is dumping the list of
> connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().
> 
> The call to cond_resched() is guarded by #ifndef CONFIG_PREEMPT_RCU as in
> the case of CONFIG_PREEMPT_RCU _rcu_read_unlock() will check to see if the
> RCU core needs to be informed, so there is no need to invoke cond_resched()
> in that case. Thanks to Paul E. McKenney for explaining this.
> 
> cond_resched_rcu_lock() suggested by Eric Dumazet.
> ifndef guard suggested by Paul E. McKenney and Julian Anastasov.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Julian Anastasov <ja@ssi.bg>
> Acked-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Simon Horman <horms@verge.net.au>
> 
> ---
> 
> v2
> * Add guard the call to cond_resched()
> ---
>  include/linux/sched.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index e692a02..66da71c 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2787,3 +2787,14 @@ static inline unsigned long rlimit_max(unsigned int limit)
>  }
>  
>  #endif
> +
> +static void inline cond_resched_rcu_lock(void)
> +{
> +	if (need_resched()) {

	Ops, it should be without above need_resched.

> +		rcu_read_unlock();
> +#ifndef CONFIG_PREEMPT_RCU
> +		cond_resched();
> +#endif
> +		rcu_read_lock();
> +	}
> +}
> -- 
> 1.8.2.1

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Simon Horman April 30, 2013, 7:29 a.m. UTC | #2
On Tue, Apr 30, 2013 at 10:12:38AM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Tue, 30 Apr 2013, Simon Horman wrote:
> 
> > This is intended for use in loops which read data protected by RCU and may
> > have a large number of iterations.  Such an example is dumping the list of
> > connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().
> > 
> > The call to cond_resched() is guarded by #ifndef CONFIG_PREEMPT_RCU as in
> > the case of CONFIG_PREEMPT_RCU _rcu_read_unlock() will check to see if the
> > RCU core needs to be informed, so there is no need to invoke cond_resched()
> > in that case. Thanks to Paul E. McKenney for explaining this.
> > 
> > cond_resched_rcu_lock() suggested by Eric Dumazet.
> > ifndef guard suggested by Paul E. McKenney and Julian Anastasov.
> > 
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Julian Anastasov <ja@ssi.bg>
> > Acked-by: Ingo Molnar <mingo@kernel.org>
> > Signed-off-by: Simon Horman <horms@verge.net.au>
> > 
> > ---
> > 
> > v2
> > * Add guard the call to cond_resched()
> > ---
> >  include/linux/sched.h | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index e692a02..66da71c 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2787,3 +2787,14 @@ static inline unsigned long rlimit_max(unsigned int limit)
> >  }
> >  
> >  #endif
> > +
> > +static void inline cond_resched_rcu_lock(void)
> > +{
> > +	if (need_resched()) {
> 
> 	Ops, it should be without above need_resched.

Thanks, to clarify, just this:

static void inline cond_resched_rcu_lock(void)
{
	rcu_read_unlock();
#ifndef CONFIG_PREEMPT_RCU
	cond_resched();
#endif
	rcu_read_lock();
}
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov April 30, 2013, 7:52 a.m. UTC | #3
Hello,

On Tue, 30 Apr 2013, Simon Horman wrote:

> > > +static void inline cond_resched_rcu_lock(void)
> > > +{
> > > +	if (need_resched()) {
> > 
> > 	Ops, it should be without above need_resched.
> 
> Thanks, to clarify, just this:
> 
> static void inline cond_resched_rcu_lock(void)
> {
> 	rcu_read_unlock();
> #ifndef CONFIG_PREEMPT_RCU
> 	cond_resched();
> #endif
> 	rcu_read_lock();
> }

	Yes, thanks!

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 1, 2013, 9:10 a.m. UTC | #4
On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Tue, 30 Apr 2013, Simon Horman wrote:
> 
> > > > +static void inline cond_resched_rcu_lock(void)
> > > > +{
> > > > +	if (need_resched()) {
> > > 
> > > 	Ops, it should be without above need_resched.
> > 
> > Thanks, to clarify, just this:
> > 
> > static void inline cond_resched_rcu_lock(void)
> > {
> > 	rcu_read_unlock();
> > #ifndef CONFIG_PREEMPT_RCU
> > 	cond_resched();
> > #endif
> > 	rcu_read_lock();
> > }
> 
> 	Yes, thanks!

OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
dropping rcu_read_lock() at all?

That is; the thing that makes sense to me is:

static void inline cond_resched_rcu_lock(void)
{
#ifdef CONFIG_PREEMPT_RCU
	if (need_resched()) {
		rcu_read_unlock();
		cond_resched();
		rcu_read_lock();
	}
#endif /* CONFIG_PREEMPT_RCU */
}

That would have an rcu_read_lock() break and voluntary preemption point for
non-preemptible RCU and not bother with the stuff for preemptible RCU.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney May 1, 2013, 12:46 p.m. UTC | #5
On Wed, May 01, 2013 at 11:10:12AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> > 
> > 	Hello,
> > 
> > On Tue, 30 Apr 2013, Simon Horman wrote:
> > 
> > > > > +static void inline cond_resched_rcu_lock(void)
> > > > > +{
> > > > > +	if (need_resched()) {
> > > > 
> > > > 	Ops, it should be without above need_resched.
> > > 
> > > Thanks, to clarify, just this:
> > > 
> > > static void inline cond_resched_rcu_lock(void)
> > > {
> > > 	rcu_read_unlock();
> > > #ifndef CONFIG_PREEMPT_RCU
> > > 	cond_resched();
> > > #endif
> > > 	rcu_read_lock();
> > > }
> > 
> > 	Yes, thanks!
> 
> OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
> dropping rcu_read_lock() at all?

Good point, I was assuming that the goal was to let grace periods end
as well as to allow preemption.  The momentary dropping out of the
RCU read-side critical section allows the grace periods to end.

> That is; the thing that makes sense to me is:
> 
> static void inline cond_resched_rcu_lock(void)
> {
> #ifdef CONFIG_PREEMPT_RCU
> 	if (need_resched()) {
> 		rcu_read_unlock();
> 		cond_resched();
> 		rcu_read_lock();
> 	}
> #endif /* CONFIG_PREEMPT_RCU */
> }
> 
> That would have an rcu_read_lock() break and voluntary preemption point for
> non-preemptible RCU and not bother with the stuff for preemptible RCU.

If the only goal is to allow preemption, and if long grace periods are
not a concern, then this alternate approach would work fine as well.

Of course, both approaches assume that the caller is in a place
where having all RCU-protected data disappear is OK!

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov May 1, 2013, 2:22 p.m. UTC | #6
Hello,

On Wed, 1 May 2013, Peter Zijlstra wrote:

> On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> > 
> > 	Hello,
> > 
> > On Tue, 30 Apr 2013, Simon Horman wrote:
> > 
> > > Thanks, to clarify, just this:
> > > 
> > > static void inline cond_resched_rcu_lock(void)
> > > {
> > > 	rcu_read_unlock();
> > > #ifndef CONFIG_PREEMPT_RCU
> > > 	cond_resched();
> > > #endif
> > > 	rcu_read_lock();
> > > }
> > 
> > 	Yes, thanks!
> 
> OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
> dropping rcu_read_lock() at all?
> 
> That is; the thing that makes sense to me is:
> 
> static void inline cond_resched_rcu_lock(void)
> {
> #ifdef CONFIG_PREEMPT_RCU

	You mean '#ifndef' here, right? But in the non-preempt
case is using the need_resched() needed? rcu_read_unlock
and rcu_read_lock do not generate code.

> 	if (need_resched()) {
> 		rcu_read_unlock();
> 		cond_resched();
> 		rcu_read_lock();
> 	}
> #endif /* CONFIG_PREEMPT_RCU */
> }
> 
> That would have an rcu_read_lock() break and voluntary preemption point for
> non-preemptible RCU and not bother with the stuff for preemptible RCU.

	I see. So, can we choose one of both variants:

1. Your variant but with ifndef:

static void inline cond_resched_rcu_lock(void)
{
#ifndef CONFIG_PREEMPT_RCU
	if (need_resched()) {
		rcu_read_unlock();
		cond_resched();
		rcu_read_lock();
	}
#endif
}

2. Same without need_resched because cond_resched already
performs the same checks:

static void inline cond_resched_rcu_lock(void)
{
#ifndef CONFIG_PREEMPT_RCU
	rcu_read_unlock();
	cond_resched();
	rcu_read_lock();
#endif
}

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney May 1, 2013, 2:32 p.m. UTC | #7
On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> On Wed, May 01, 2013 at 11:10:12AM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> > > 
> > > 	Hello,
> > > 
> > > On Tue, 30 Apr 2013, Simon Horman wrote:
> > > 
> > > > > > +static void inline cond_resched_rcu_lock(void)
> > > > > > +{
> > > > > > +	if (need_resched()) {
> > > > > 
> > > > > 	Ops, it should be without above need_resched.
> > > > 
> > > > Thanks, to clarify, just this:
> > > > 
> > > > static void inline cond_resched_rcu_lock(void)
> > > > {
> > > > 	rcu_read_unlock();
> > > > #ifndef CONFIG_PREEMPT_RCU
> > > > 	cond_resched();
> > > > #endif
> > > > 	rcu_read_lock();
> > > > }
> > > 
> > > 	Yes, thanks!
> > 
> > OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
> > dropping rcu_read_lock() at all?
> 
> Good point, I was assuming that the goal was to let grace periods end
> as well as to allow preemption.  The momentary dropping out of the
> RCU read-side critical section allows the grace periods to end.
> 
> > That is; the thing that makes sense to me is:
> > 
> > static void inline cond_resched_rcu_lock(void)
> > {
> > #ifdef CONFIG_PREEMPT_RCU
> > 	if (need_resched()) {
> > 		rcu_read_unlock();
> > 		cond_resched();
> > 		rcu_read_lock();
> > 	}
> > #endif /* CONFIG_PREEMPT_RCU */
> > }
> > 
> > That would have an rcu_read_lock() break and voluntary preemption point for
> > non-preemptible RCU and not bother with the stuff for preemptible RCU.
> 
> If the only goal is to allow preemption, and if long grace periods are
> not a concern, then this alternate approach would work fine as well.

But now that I think about it, there is one big advantage to the
unconditional exiting and reentering the RCU read-side critical section:
It allows easy placement of unconditional lockdep debug code to catch
the following type of bug:

	rcu_read_lock();
	...
	rcu_read_lock();
	...
	cond_resched_rcu_lock();
	...
	rcu_read_unlock();
	...
	rcu_read_unlock();

Here is how to detect this:

	static void inline cond_resched_rcu_lock(void)
	{
		rcu_read_unlock();
		WARN_ON_ONCE(rcu_read_lock_held());
	#ifndef CONFIG_PREEMPT_RCU
		cond_resched();
	#endif
		rcu_read_lock();
	}

Of course, we could do this in your implementation as well:

	static void inline cond_resched_rcu_lock(void)
	{
	#ifdef CONFIG_PREEMPT_RCU
		if (need_resched()) {
			rcu_read_unlock();
			WARN_ON_ONCE(rcu_read_lock_held());
			cond_resched();
			rcu_read_lock();
		}
	#endif /* CONFIG_PREEMPT_RCU */
	}

But this would fail to detect the bug -- and would silently fail -- on
!CONFIG_PREEMPT_RCU systems.

							Thanx, Paul

> Of course, both approaches assume that the caller is in a place
> where having all RCU-protected data disappear is OK!
> 
> 							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 1, 2013, 3:17 p.m. UTC | #8
On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:

> If the only goal is to allow preemption, and if long grace periods are
> not a concern, then this alternate approach would work fine as well.

Hmm.. if that were the goal I'd like it to have a different name;
cond_resched*() has always been about preemption.

> Of course, both approaches assume that the caller is in a place
> where having all RCU-protected data disappear is OK!

Quite :-)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet May 1, 2013, 3:29 p.m. UTC | #9
On Wed, 2013-05-01 at 17:17 +0200, Peter Zijlstra wrote:
> On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> 
> > If the only goal is to allow preemption, and if long grace periods are
> > not a concern, then this alternate approach would work fine as well.
> 
> Hmm.. if that were the goal I'd like it to have a different name;
> cond_resched*() has always been about preemption.

BTW, I do not remember why cond_resched() is not an empty macro
when CONFIG_PREEMPT=y ?


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 1, 2013, 3:59 p.m. UTC | #10
On Wed, May 01, 2013 at 08:29:55AM -0700, Eric Dumazet wrote:
> On Wed, 2013-05-01 at 17:17 +0200, Peter Zijlstra wrote:
> > On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> > 
> > > If the only goal is to allow preemption, and if long grace periods are
> > > not a concern, then this alternate approach would work fine as well.
> > 
> > Hmm.. if that were the goal I'd like it to have a different name;
> > cond_resched*() has always been about preemption.
> 
> BTW, I do not remember why cond_resched() is not an empty macro
> when CONFIG_PREEMPT=y ?

Good question.. at at least, only the __might_sleep() construct. Ingo, happen
to remember why this is? Most of this infrastructure is from before my time.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney May 1, 2013, 4:02 p.m. UTC | #11
On Wed, May 01, 2013 at 08:29:55AM -0700, Eric Dumazet wrote:
> On Wed, 2013-05-01 at 17:17 +0200, Peter Zijlstra wrote:
> > On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> > 
> > > If the only goal is to allow preemption, and if long grace periods are
> > > not a concern, then this alternate approach would work fine as well.
> > 
> > Hmm.. if that were the goal I'd like it to have a different name;
> > cond_resched*() has always been about preemption.
> 
> BTW, I do not remember why cond_resched() is not an empty macro
> when CONFIG_PREEMPT=y ?

My guess would be for the case where sched_preempt_enable_no_resched()
is followed some time later by cond_resched().

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 1, 2013, 4:57 p.m. UTC | #12
On Wed, May 01, 2013 at 09:02:18AM -0700, Paul E. McKenney wrote:
> My guess would be for the case where sched_preempt_enable_no_resched()
> is followed some time later by cond_resched().

I might hope not.. preempt_enable_no_resched() is nasty and you're likely to be
hit with a frozen fish of sorts by tglx if you try to use it ;-)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney May 1, 2013, 5:30 p.m. UTC | #13
On Wed, May 01, 2013 at 06:57:49PM +0200, Peter Zijlstra wrote:
> On Wed, May 01, 2013 at 09:02:18AM -0700, Paul E. McKenney wrote:
> > My guess would be for the case where sched_preempt_enable_no_resched()
> > is followed some time later by cond_resched().
> 
> I might hope not.. preempt_enable_no_resched() is nasty and you're likely to be
> hit with a frozen fish of sorts by tglx if you try to use it ;-)

I will stick with my guess, though I agree that if I am correct,
this situation almost certainly predates tglx's Linux-related use of
frozen fish as projectile weapons.  ;-)

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 2, 2013, 7:27 a.m. UTC | #14
On Wed, May 01, 2013 at 07:32:58AM -0700, Paul E. McKenney wrote:
> Here is how to detect this:
> 
> 	static void inline cond_resched_rcu_lock(void)
> 	{
> 		rcu_read_unlock();
> 		WARN_ON_ONCE(rcu_read_lock_held());
> 	#ifndef CONFIG_PREEMPT_RCU
> 		cond_resched();
> 	#endif
> 		rcu_read_lock();
> 	}

The __might_sleep(__FILE__, __LINE__, PREEMPT_RCU_OFFSET) I posted in the other
email just now should deal with this.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e692a02..66da71c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2787,3 +2787,14 @@  static inline unsigned long rlimit_max(unsigned int limit)
 }
 
 #endif
+
+static void inline cond_resched_rcu_lock(void)
+{
+	if (need_resched()) {
+		rcu_read_unlock();
+#ifndef CONFIG_PREEMPT_RCU
+		cond_resched();
+#endif
+		rcu_read_lock();
+	}
+}