
[GIT] Networking

Message ID 1281883637.2942.42.camel@edumazet-laptop
State RFC, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Aug. 15, 2010, 2:47 p.m. UTC
On Sunday, August 15, 2010 at 12:55 +0200, Eric Dumazet wrote:

> We have one lock per cpu, and only one cpu can possibly take its
> associated lock under softirq. So the usual lockdep check, which warns
> when a lock is taken with BH enabled while the same lock was taken
> inside a softirq handler, triggers a false positive here.
> 
> I believe no existing lockdep annotation can tell lockdep that this use
> is OK, so I guess we have the following choices:
> 
> 1) Mask BH again, using xt_info_wrlock_lockdep(cpu) instead of
> xt_info_wrlock(cpu).
> 
> xt_info_wrlock_lockdep() being a variant that disables BH when
> CONFIG_PROVE_LOCKING=y
> 
> 2) Temporarily switch off lockdep in get_counters(), using a
> lockdep_off()/lockdep_on() pair, with a comment on why this is necessary.
> 
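For reference, option 1 above could look something like this. This is a hypothetical kernel-style sketch, not code from the posted patch; the helper names are modeled on the existing xt_info_wrlock()/xt_info_wrunlock() API:

```c
/*
 * Hypothetical sketch of option 1: when lockdep (CONFIG_PROVE_LOCKING)
 * is enabled, also disable bottom halves around the write lock, so the
 * checker never sees this lock taken with BH on.  Without lockdep it
 * compiles down to the plain helpers.
 */
#ifdef CONFIG_PROVE_LOCKING
static inline void xt_info_wrlock_lockdep(unsigned int cpu)
{
	local_bh_disable();
	xt_info_wrlock(cpu);
}

static inline void xt_info_wrunlock_lockdep(unsigned int cpu)
{
	xt_info_wrunlock(cpu);
	local_bh_enable();
}
#else
#define xt_info_wrlock_lockdep(cpu)	xt_info_wrlock(cpu)
#define xt_info_wrunlock_lockdep(cpu)	xt_info_wrunlock(cpu)
#endif
```

Note that this makes the locking mechanics depend on CONFIG_PROVE_LOCKING, which is the drawback David Miller raises later in the thread.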

In any case, here is a patch implementing the latter.

CC Patrick, our netfilter maintainer...

Maybe lockdep rules could be improved to take care of this later?

Thanks

[PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive

After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.

In this use case, the lockdep splat is a false positive, because
the BH disabling only matters for one cpu for a given lock
(we use one lock per cpu).

Use lockdep_off()/lockdep_on() around the problematic section to
avoid the splat.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Diagnosed-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
---
 net/ipv4/netfilter/arp_tables.c |    3 +++
 net/ipv4/netfilter/ip_tables.c  |    3 +++
 net/ipv6/netfilter/ip6_tables.c |    3 +++
 3 files changed, 9 insertions(+)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jarek Poplawski Aug. 16, 2010, 9:53 a.m. UTC | #1
Eric Dumazet wrote:
> On Sunday, August 15, 2010 at 12:55 +0200, Eric Dumazet wrote:
...
> [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
> 
> After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> bottom half more than necessary), lockdep can raise a warning
> because we attempt to lock a spinlock with BH enabled, while
> the same lock is usually locked by another cpu in a softirq context.

Btw, could you remind us how get_counters() is serialized (I guess
you can't have it running on 2 cpus at the same time)?

Thanks,
Jarek P.
Eric Dumazet Aug. 16, 2010, 10:42 a.m. UTC | #2
On Monday, August 16, 2010 at 09:53 +0000, Jarek Poplawski wrote:
> Eric Dumazet wrote:
> > On Sunday, August 15, 2010 at 12:55 +0200, Eric Dumazet wrote:
> ...
> > [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
> > 
> > After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> > bottom half more than necessary), lockdep can raise a warning
> > because we attempt to lock a spinlock with BH enabled, while
> > the same lock is usually locked by another cpu in a softirq context.
> 
> > Btw, could you remind us how get_counters() is serialized (I guess
> > you can't have it running on 2 cpus at the same time)?
> 

get_counters() is serialized by the xt_find_table_lock() done from
get_entries(). This uses a mutex to guard against concurrent changes.

You are right that if we ever allow two concurrent "iptables -nvL"
operations in the future (using a read lock on a rwlock instead of a
mutex), then we must disable BH even for summing data from the other
cpus.
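
In rough outline, the serialization path Eric describes looks like this. This is a simplified sketch of the call path, not a verbatim excerpt from the kernel source of that era:

```c
/*
 * Simplified sketch: xt_find_table_lock() takes a per-address-family
 * mutex, so only one get_entries() -- and hence only one
 * get_counters() -- can run at a time for a given family.
 */
struct xt_table *xt_find_table_lock(struct net *net, u_int8_t af,
				    const char *name)
{
	mutex_lock(&xt[af].mutex);	/* held until xt_table_unlock() */
	/* ... look up the table by name and return it ... */
}

static int get_entries(struct net *net, /* ... */)
{
	struct xt_table *t = xt_find_table_lock(net, AF_INET, name);

	/* mutex held: we are the only thread summing these counters */
	get_counters(t->private, counters);
	xt_table_unlock(t);		/* releases xt[af].mutex */
	/* ... */
}
```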

Thanks


Jarek Poplawski Aug. 16, 2010, 10:48 a.m. UTC | #3
On Mon, Aug 16, 2010 at 12:42:41PM +0200, Eric Dumazet wrote:
> On Monday, August 16, 2010 at 09:53 +0000, Jarek Poplawski wrote:
> > Eric Dumazet wrote:
> > > On Sunday, August 15, 2010 at 12:55 +0200, Eric Dumazet wrote:
> > ...
> > > [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
> > > 
> > > After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> > > bottom half more than necessary), lockdep can raise a warning
> > > because we attempt to lock a spinlock with BH enabled, while
> > > the same lock is usually locked by another cpu in a softirq context.
> > 
> > Btw, could you remind us how get_counters() is serialized (I guess
> > you can't have it running on 2 cpus at the same time)?
> > 
> 
> get_counters() is serialized by the xt_find_table_lock() done from
> get_entries(). This uses a mutex to guard against concurrent changes.
> 
> You are right that if we ever allow two concurrent "iptables -nvL"
> operations in the future (using a read lock on a rwlock instead of a
> mutex), then we must disable BH even for summing data from the other
> cpus.

OK, thanks for the explanation,

Jarek P.
David Miller Aug. 16, 2010, 7:36 p.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 15 Aug 2010 16:47:17 +0200

> On Sunday, August 15, 2010 at 12:55 +0200, Eric Dumazet wrote:
> 
>> We have one lock per cpu, and only one cpu can possibly take its
>> associated lock under softirq. So the usual lockdep check, which warns
>> when a lock is taken with BH enabled while the same lock was taken
>> inside a softirq handler, triggers a false positive here.
>> 
>> I believe no existing lockdep annotation can tell lockdep that this use
>> is OK, so I guess we have the following choices:
>> 
>> 1) Mask BH again, using xt_info_wrlock_lockdep(cpu) instead of
>> xt_info_wrlock(cpu).
>> 
>> xt_info_wrlock_lockdep() being a variant that disables BH when
>> CONFIG_PROVE_LOCKING=y
>> 
>> 2) Temporarily switch off lockdep in get_counters(), using a
>> lockdep_off()/lockdep_on() pair, with a comment on why this is necessary.
>> 
> 
> In any case, here is a patch implementing the latter.

I'm hesitant to say that we should put this kind of patch in.

It will shut up lockdep for this specific case, but it also means
that if we do any other kinds of locking in this sequence we will
not validate it.

The value of this is open for debate, I guess.

But locking is hard so I would say that disabling lockdep to kill a
warning it generates should be an absolute last resort.

I also don't think making the locking mechanics conditional upon
LOCKDEP is sane either, exactly because it means lockdep is testing
something other than what actually gets used in practice. :-)

Patch

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 6bccba3..b4f7ebf 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -729,8 +729,10 @@  static void get_counters(const struct xt_table_info *t,
 	local_bh_enable();
 	/* Processing counters from other cpus, we can let bottom half enabled,
 	 * (preemption is disabled)
+	 * We must turn off lockdep to avoid a false positive.
 	 */
 
+	lockdep_off();
 	for_each_possible_cpu(cpu) {
 		if (cpu == curcpu)
 			continue;
@@ -743,6 +745,7 @@  static void get_counters(const struct xt_table_info *t,
 		}
 		xt_info_wrunlock(cpu);
 	}
+	lockdep_on();
 	put_cpu();
 }
 
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index c439721..dc5b2fd 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -903,8 +903,10 @@  get_counters(const struct xt_table_info *t,
 	local_bh_enable();
 	/* Processing counters from other cpus, we can let bottom half enabled,
 	 * (preemption is disabled)
+	 * We must turn off lockdep to avoid a false positive.
 	 */
 
+	lockdep_off();
 	for_each_possible_cpu(cpu) {
 		if (cpu == curcpu)
 			continue;
@@ -917,6 +919,7 @@  get_counters(const struct xt_table_info *t,
 		}
 		xt_info_wrunlock(cpu);
 	}
+	lockdep_on();
 	put_cpu();
 }
 
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 5359ef4..fb55443 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -916,8 +916,10 @@  get_counters(const struct xt_table_info *t,
 	local_bh_enable();
 	/* Processing counters from other cpus, we can let bottom half enabled,
 	 * (preemption is disabled)
+	 * We must turn off lockdep to avoid a false positive.
 	 */
 
+	lockdep_off();
 	for_each_possible_cpu(cpu) {
 		if (cpu == curcpu)
 			continue;
@@ -930,6 +932,7 @@  get_counters(const struct xt_table_info *t,
 		}
 		xt_info_wrunlock(cpu);
 	}
+	lockdep_on();
 	put_cpu();
 }