Patchwork Bridged networking panics

Submitter Eric Dumazet
Date July 4, 2012, 1:24 p.m.
Message ID <1341408298.2583.1963.camel@edumazet-glaptop>
Permalink /patch/168983/
State RFC
Delegated to: David Miller

Comments

Eric Dumazet - July 4, 2012, 1:24 p.m.
On Wed, 2012-07-04 at 12:42 +0200, Massimo Cetra wrote:

> Ok, while waiting for an answer (sincerely, I don't understand why no one
> has cared for months), I'm attaching 3 more panics that happened this
> morning on a vanilla 3.2.21.
> 
> The problem is related to heavy network traffic.
> 
> I'm not a kernel hacker but I see that net_rx_action() is in the backtrace.
> So (probably reaching wrong conclusions) if those panics are triggered 
> by incoming network traffic, can an attacker use this to crash a machine ?
> 
> Anyone willing to reply or at least tell me what's going on ?

Posting a bug report is not enough to get people working for free on the
problem.

Apparently your configuration is kind of special if nobody but you hits
the problem so often.

So it would help if you can reproduce the bug using current kernel and
provide all necessary steps to reproduce the bug. Ideally a script.sh
file doing all the configuration you use to trigger the bug, assuming
a basic machine freshly booted with no special config already done.

The panics don't happen in the bridge code itself, but in the
BRIDGE_NETFILTER code. Do you need it, and why?

Are you using vlans ?

Please try the following patch.

 net/bridge/br_netfilter.c |   20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Massimo Cetra - July 4, 2012, 1:50 p.m.
On 04/07/2012 15:24, Eric Dumazet wrote:

> Posting a bug report is not enough to get people working for free on the
> problem.

Thanks for the reply.

I'd like to point out that without a reply telling me what is needed or
what I'm doing wrong, I cannot provide anything useful.

> Apparently your configuration is kind of special if nobody but you hits
> the problem so often.

> So it would help if you can reproduce the bug using current kernel and
> provide all necessary steps to reproduce the bug. Ideally a script.sh
> file doing all the configuration you use to trigger the bug, assuming
> a basic machine freshly booted with no special config already done.

I can try to set up a fresh KVM image and see if the bug is reproducible
there. Would that be OK?

> The panics don't happen in the bridge code itself, but in the
> BRIDGE_NETFILTER code. Do you need it, and why?
>
> Are you using vlans ?

No, no VLANS.

I have 2 real network cards (Broadcom Corporation NetXtreme II BCM5716) 
configured as bridges.

Each bridge (br0 and br1) has a fixed IP address (it never changes).

The server(s) run KVM machines which are attached to tun interfaces
(created with "vde_tunctl -u $user -t $IFACE").

Each virtual KVM server has an IP address that is forwarded through the 
bridge and has as gateway the router of the main server.

Up to this point there is nothing strange in the configuration and if 
the system is used this way, there are no panics.


The (maybe) peculiar configs are:

1) heartbeat is installed and creates alias interfaces for the bridge 
and assigns them an IP address. So the server has br0:1 and br1:1 that 
are associated with a couple of IP addresses.

2) the server runs ipvs (to redirect HTTP requests to two KVM servers
that are NATed behind the br0:1 and br1:1 addresses).


If I remove the br0:1 and br1:1 interfaces (which are configured with the
IP addresses used by IPVS), I don't have a single problem and the crash
(at least with 3.2.21) doesn't happen.

So, if I turn off heartbeat (and the alias IP addresses used by IPVS are
switched to the other host) there are no panics.

The more the traffic, the quicker the panic happens.

Note that up to 2.6.36 this configuration was working without problems.

Ah, the last setting that i modified is disabling tcp_sack in sysctl.conf.

> Please try the following patch.

I will try on the latest 3.2.y for now, trying to replicate the problem.

Thanks again,

MC

Patch

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index e41456b..a73a8cb 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -173,9 +173,13 @@  static inline struct rtable *bridge_parent_rtable(const struct net_device *dev)
 static inline struct net_device *bridge_parent(const struct net_device *dev)
 {
 	struct net_bridge_port *port;
+	struct net_device *parent;
 
 	port = br_port_get_rcu(dev);
-	return port ? port->br->dev : NULL;
+	parent = port ? port->br->dev : NULL;
+	if (parent && !(parent->flags & IFF_UP))
+		parent = NULL;
+	return parent;
 }
 
 static inline struct nf_bridge_info *nf_bridge_alloc(struct sk_buff *skb)
@@ -510,6 +514,8 @@  static struct net_device *brnf_get_logical_dev(struct sk_buff *skb, const struct
 	struct net_device *vlan, *br;
 
 	br = bridge_parent(dev);
+	if (!br)
+		return NULL;
 	if (brnf_pass_vlan_indev == 0 || !vlan_tx_tag_present(skb))
 		return br;
 
@@ -748,7 +754,7 @@  static unsigned int br_nf_forward_ip(unsigned int hook, struct sk_buff *skb,
 				     int (*okfn)(struct sk_buff *))
 {
 	struct nf_bridge_info *nf_bridge;
-	struct net_device *parent;
+	struct net_device *parent, *ldev;
 	u_int8_t pf;
 
 	if (!skb->nf_bridge)
@@ -789,8 +795,11 @@  static unsigned int br_nf_forward_ip(unsigned int hook, struct sk_buff *skb,
 	else
 		skb->protocol = htons(ETH_P_IPV6);
 
-	NF_HOOK(pf, NF_INET_FORWARD, skb, brnf_get_logical_dev(skb, in), parent,
-		br_nf_forward_finish);
+	ldev = brnf_get_logical_dev(skb, in);
+	if (!ldev)
+		return NF_DROP;
+
+	NF_HOOK(pf, NF_INET_FORWARD, skb, ldev, parent, br_nf_forward_finish);
 
 	return NF_STOLEN;
 }
@@ -861,12 +870,13 @@  static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff *skb,
 				       int (*okfn)(struct sk_buff *))
 {
 	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
-	struct net_device *realoutdev = bridge_parent(skb->dev);
+	struct net_device *realoutdev;
 	u_int8_t pf;
 
 	if (!nf_bridge || !(nf_bridge->mask & BRNF_BRIDGED))
 		return NF_ACCEPT;
 
+	realoutdev = bridge_parent(skb->dev);
 	if (!realoutdev)
 		return NF_DROP;
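
The idea behind the patch can be illustrated outside the kernel. The sketch below is a simplified userspace model, not the kernel code: the struct layouts, the `port` field, `SKETCH_IFF_UP`, `enum verdict`, and the `*_sketch` function names are all assumptions made up for illustration. It only mirrors the two checks the patch introduces: the parent lookup returns NULL when the bridge device is not up, and the caller drops the packet instead of passing a NULL device on to the hook.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's IFF_UP flag. */
#define SKETCH_IFF_UP 0x1u

struct net_bridge_port;

/* Heavily simplified stand-ins for the kernel structures (assumed layout). */
struct net_device {
	unsigned int flags;
	struct net_bridge_port *port;	/* NULL if the device is not a bridge port */
};

struct net_bridge {
	struct net_device *dev;		/* the bridge device itself, e.g. br0 */
};

struct net_bridge_port {
	struct net_bridge *br;
};

/* Hypothetical stand-ins for the NF_DROP / NF_STOLEN verdicts. */
enum verdict { VERDICT_DROP, VERDICT_STOLEN };

/*
 * Models the patched bridge_parent(): return the bridge device only if the
 * port exists AND the bridge device is up; otherwise return NULL.
 */
struct net_device *bridge_parent_sketch(const struct net_device *dev)
{
	const struct net_bridge_port *port;
	struct net_device *parent;

	assert(dev != NULL);

	port = dev->port;
	parent = port ? port->br->dev : NULL;
	if (parent && !(parent->flags & SKETCH_IFF_UP))
		parent = NULL;
	return parent;
}

/*
 * Models the patched forward path: look up the logical device first and
 * drop the packet when there is none, rather than handing NULL to the hook.
 */
enum verdict forward_sketch(const struct net_device *in)
{
	struct net_device *ldev = bridge_parent_sketch(in);

	if (!ldev)
		return VERDICT_DROP;

	/* The real code would call NF_HOOK() with ldev at this point. */
	return VERDICT_STOLEN;
}
```

In this model, a port whose bridge device has the up flag set is forwarded, while the same port is dropped once the flag is cleared on the bridge device. Whether a bridge device that is down is what actually triggers the reported panics is not established in this thread; the patch is marked RFC.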