diff mbox

[6/13] bridge: Add core IGMP snooping support

Message ID 20100310131946.GB6267@linux.vnet.ibm.com
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Paul E. McKenney March 10, 2010, 1:19 p.m. UTC
On Wed, Mar 10, 2010 at 10:41:32AM +0100, Arnd Bergmann wrote:
> On Wednesday 10 March 2010 03:14:10 Paul E. McKenney wrote:
> > On Tue, Mar 09, 2010 at 10:12:59PM +0100, Arnd Bergmann wrote:
> >
> > > I've just tried annotating net/ipv4/route.c like this and did not get
> > > very far, because the same pointers are used for rcu and rcu_bh.
> > > Could you check if this is a false positive or an actual finding?
> > 
> > Hmmm...  I am only seeing a call_rcu_bh() here, so unless I am missing
> > something, this is a real problem in TREE_PREEMPT_RCU kernels.  The
> > call_rcu_bh() only interacts with the rcu_read_lock_bh() readers, not
> > the rcu_read_lock() readers.
> > 
> > One approach is to run freed blocks through both types of grace periods,
> > I suppose.
> 
> Well, if I introduce different __rcu and __rcu_bh address space annotations,
> sparse would still not like that, because then you can only pass the annotated
> pointers into either rcu_dereference or rcu_dereference_bh.
> 
> What the code seems to be doing here is in some places
> 
> 	local_bh_disable();
> 	...
> 	rcu_read_lock();
> 	rcu_dereference(rt_hash_table[h].chain);
> 	rcu_read_unlock();
> 	...
> 	local_bh_enable();
> 
> and in others
> 
> 	rcu_read_lock_bh();
> 	rcu_dereference_bh(rt_hash_table[h].chain);
> 	rcu_read_unlock_bh();

Hmmm...  This is actually legal.

> When rt_hash_table[h].chain gets the __rcu_bh annotation, we'd have to
> turn first rcu_dereference into rcu_dereference_bh in order to have a clean
> build with sparse. Would that change be
> a) correct from RCU perspective,
> b) desirable for code inspection, and
> c) lockdep-clean?

I have a patch queued up that will make rcu_dereference_bh() handle this
correctly -- current -tip and mainline would complain.  Please see below
for a sneak preview.

Thoughts?

							Thanx, Paul

rcu: make rcu_read_lock_bh_held() allow for disabled BH

Disabling BH can stand in for rcu_read_lock_bh(), and this patch updates
rcu_read_lock_bh_held() to allow for this.  In order to avoid
include-file hell, this function is moved out of line to kernel/rcupdate.c.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

include/linux/rcupdate.h |   19 ++++---------------
 kernel/rcupdate.c        |   22 ++++++++++++++++++++++
 2 files changed, 26 insertions(+), 15 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Arnd Bergmann March 10, 2010, 1:30 p.m. UTC | #1
On Wednesday 10 March 2010, Paul E. McKenney wrote:
> > When rt_hash_table[h].chain gets the __rcu_bh annotation, we'd have to
> > turn first rcu_dereference into rcu_dereference_bh in order to have a clean
> > build with sparse. Would that change be
> > a) correct from RCU perspective,
> > b) desirable for code inspection, and
> > c) lockdep-clean?
> 
> I have a patch queued up that will make rcu_dereference_bh() handle this
> correctly -- current -tip and mainline would complain.  Please see below
> for a sneak preview.
> 
> Thoughts?

Ok, so that would mean we can convert it all to rcu_dereference_bh().
I guess an alternative to this would be to also change the rcu_read_lock()
inside local_bh_disable() sections to rcu_read_lock_bh(), which is not
necessary but also not harmful, right?

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney March 10, 2010, 1:57 p.m. UTC | #2
On Wed, Mar 10, 2010 at 02:30:06PM +0100, Arnd Bergmann wrote:
> On Wednesday 10 March 2010, Paul E. McKenney wrote:
> > > When rt_hash_table[h].chain gets the __rcu_bh annotation, we'd have to
> > > turn first rcu_dereference into rcu_dereference_bh in order to have a clean
> > > build with sparse. Would that change be
> > > a) correct from RCU perspective,
> > > b) desirable for code inspection, and
> > > c) lockdep-clean?
> > 
> > I have a patch queued up that will make rcu_dereference_bh() handle this
> > correctly -- current -tip and mainline would complain.  Please see below
> > for a sneak preview.
> > 
> > Thoughts?
> 
> Ok, so that would mean we can convert it all to rcu_dereference_bh().
> I guess an alternative to this would be to also change the rcu_read_lock()
> inside local_bh_disable() sections to rcu_read_lock_bh(), which is not
> necessary but also not harmful, right?

It does impose additional overhead, which the networking guys are eager
to avoid, given that network links speeds are continuing to increase.  ;-)
So moving to rcu_dereference_bh() seems better to me.

(Please note that rcu_dereference_bh() imposes checking overhead only
in the presence of both CONFIG_PROVE_LOCKING and CONFIG_PROVE_RCU.)

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 75921b8..c393acc 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -119,22 +119,11 @@  static inline int rcu_read_lock_held(void)
 	return lock_is_held(&rcu_lock_map);
 }
 
-/**
- * rcu_read_lock_bh_held - might we be in RCU-bh read-side critical section?
- *
- * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in
- * an RCU-bh read-side critical section.  In absence of CONFIG_PROVE_LOCKING,
- * this assumes we are in an RCU-bh read-side critical section unless it can
- * prove otherwise.
- *
- * Check rcu_scheduler_active to prevent false positives during boot.
+/*
+ * rcu_read_lock_bh_held() is defined out of line to avoid #include-file
+ * hell.
  */
-static inline int rcu_read_lock_bh_held(void)
-{
-	if (!debug_lockdep_rcu_enabled())
-		return 1;
-	return lock_is_held(&rcu_bh_lock_map);
-}
+extern int rcu_read_lock_bh_held(void);
 
 /**
  * rcu_read_lock_sched_held - might we be in RCU-sched read-side critical section?
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index f1125c1..913eccb 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -45,6 +45,7 @@ 
 #include <linux/mutex.h>
 #include <linux/module.h>
 #include <linux/kernel_stat.h>
+#include <linux/hardirq.h>
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 static struct lock_class_key rcu_lock_key;
@@ -66,6 +67,27 @@  EXPORT_SYMBOL_GPL(rcu_sched_lock_map);
 int rcu_scheduler_active __read_mostly;
 EXPORT_SYMBOL_GPL(rcu_scheduler_active);
 
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+/**
+ * rcu_read_lock_bh_held - might we be in RCU-bh read-side critical section?
+ *
+ * Check for bottom half being disabled, which covers both the
+ * CONFIG_PROVE_RCU and not cases.  Note that if someone uses
+ * rcu_read_lock_bh(), but then later enables BH, lockdep (if enabled)
+ * will show the situation.
+ *
+ * Check debug_lockdep_rcu_enabled() to prevent false positives during boot.
+ */
+int rcu_read_lock_bh_held(void)
+{
+	if (!debug_lockdep_rcu_enabled())
+		return 1;
+	return in_softirq();
+}
+
+#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
 /*
  * This function is invoked towards the end of the scheduler's initialization
  * process.  Before this is called, the idle task might contain