
Nested GRE locking bug

Message ID 1287029519.2649.108.camel@edumazet-laptop
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Oct. 14, 2010, 4:11 a.m. UTC
On Thursday, 14 October 2010 at 05:00 +0100, Ben Hutchings wrote:
> Beatrice Barbe reported a reproducible crash after creating large
> numbers of nested GRE tunnels and then pinging with the source address
> forced.  I was able to reproduce this using net-2.6.  I'm attaching the
> kernel config I used and a script to reproduce this based on the script
> she provided.  The magic number of tunnels to create is apparently 37.
> 
> With lockdep enabled, I get the following output:
> 

That's a known problem, actually: stack exhaustion :)

net-next-2.6 contains a fix for this, adding the per-cpu xmit_recursion
limit. We might push it to net-2.6.

Thanks

commit 745e20f1b626b1be4b100af5d4bf7b3439392f8f
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Wed Sep 29 13:23:09 2010 -0700

    net: add a recursion limit in xmit path
    
    As tunnel devices are going to be lockless, we need to make sure a
    misconfigured machine won't enter an infinite loop.
    
    Add a percpu variable, and limit to three the number of stacked xmits.
    
    Reported-by: Jesse Gross <jesse@nicira.com>
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Oct. 19, 2010, 8:53 a.m. UTC | #1
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 14 Oct 2010 06:11:59 +0200

> net-next-2.6 contains a fix for this, adding the per-cpu
> xmit_recursion limit. We might push it to net-2.6

We need to think a bit more about this.

We are essentially now saying that one can only configure
tunnels 3 levels deep, and no more.

I can guarantee you someone out there uses at least 4,
perhaps more.

And those people will be broken by the new limit.

So putting this into net-2.6 with such a low limit will
be quite dangerous.

Eric Dumazet Oct. 19, 2010, 9:02 a.m. UTC | #2
On Tuesday, 19 October 2010 at 01:53 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 14 Oct 2010 06:11:59 +0200
> 
> > net-next-2.6 contains a fix for this, adding the per-cpu
> > xmit_recursion limit. We might push it to net-2.6
> 
> We need to think a bit more about this.
> 
> We are essentially now saying that one can only configure
> tunnels 3 levels deep, and no more.
> 
> I can guarantee you someone out there uses at least 4,
> perhaps more.
> 
> And those people will be broken by the new limit.
> 
> So putting this into net-2.6 with such a low limit will
> be quite dangerous.

Well, the limit is actually 4, but I get your point ;)
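
To see why a RECURSION_LIMIT of 3 permits 4 nested transmits, here is a minimal user-space sketch of the guard in the patch. It is not kernel code: model_dev_queue_xmit() and max_stack_depth() are hypothetical names, a plain global stands in for the per-cpu counter, and every device in the stack is assumed to go through the guarded path. The check uses ">" and runs before the increment, so the counter values 0, 1, 2 and 3 all pass and only the fifth nested call is refused.

```c
#define RECURSION_LIMIT 3

/* Stand-in for the per-cpu xmit_recursion counter (single-threaded model). */
int xmit_recursion;

/*
 * Hypothetical model of the guarded path in dev_queue_xmit().
 * devices_left is the number of stacked devices the packet still has
 * to traverse. Returns 1 if the packet gets through all of them,
 * 0 if the recursion check refuses it (the recursion_alert case).
 */
int model_dev_queue_xmit(int devices_left)
{
	int ok;

	if (devices_left == 0)
		return 1;	/* nothing left to traverse */

	/* Same comparison as the patch: refuse only when counter > limit. */
	if (xmit_recursion > RECURSION_LIMIT)
		return 0;

	xmit_recursion++;
	ok = model_dev_queue_xmit(devices_left - 1);	/* nested xmit */
	xmit_recursion--;

	return ok;
}

/* Largest stack depth in 1..max_devices that still gets through. */
int max_stack_depth(int max_devices)
{
	int depth, best = 0;

	for (depth = 1; depth <= max_devices; depth++) {
		if (model_dev_queue_xmit(depth))
			best = depth;
	}
	return best;
}
```

In this model, max_stack_depth(16) evaluates to 4: a stack of four devices passes and a stack of five is refused.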




Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 48ad47f..50dacca 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2177,6 +2177,9 @@  static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 	return rc;
 }
 
+static DEFINE_PER_CPU(int, xmit_recursion);
+#define RECURSION_LIMIT 3
+
 /**
  *	dev_queue_xmit - transmit a buffer
  *	@skb: buffer to transmit
@@ -2242,10 +2245,15 @@  int dev_queue_xmit(struct sk_buff *skb)
 
 		if (txq->xmit_lock_owner != cpu) {
 
+			if (__this_cpu_read(xmit_recursion) > RECURSION_LIMIT)
+				goto recursion_alert;
+
 			HARD_TX_LOCK(dev, txq, cpu);
 
 			if (!netif_tx_queue_stopped(txq)) {
+				__this_cpu_inc(xmit_recursion);
 				rc = dev_hard_start_xmit(skb, dev, txq);
+				__this_cpu_dec(xmit_recursion);
 				if (dev_xmit_complete(rc)) {
 					HARD_TX_UNLOCK(dev, txq);
 					goto out;
@@ -2257,7 +2265,9 @@  int dev_queue_xmit(struct sk_buff *skb)
 				       "queue packet!\n", dev->name);
 		} else {
 			/* Recursion is detected! It is possible,
-			 * unfortunately */
+			 * unfortunately
+			 */
+recursion_alert:
 			if (net_ratelimit())
 				printk(KERN_CRIT "Dead loop on virtual device "
 				       "%s, fix it urgently!\n", dev->name);