Patchwork [net-next,v2] bridge: do not expire mdb entry as long as bridge still uses it

login
register
mail settings
Submitter Amerigo Wang
Date April 25, 2013, 8:21 a.m.
Message ID <1366878083-22797-1-git-send-email-amwang@redhat.com>
Download mbox | patch
Permalink /patch/239427/
State Changes Requested
Delegated to: David Miller
Headers show

Comments

Amerigo Wang - April 25, 2013, 8:21 a.m.
This bug can be observed in virt environment, when a KVM guest
communicates with the host via multicast. After some time (should
be 260 sec, I didn't measure), the multicast traffic suddenly
terminates.

This is due to the mdb entry for bridge itself expires automatically,
it should not expire as long as the bridge still generates multicast
traffic. It should expire when the bridge leaves the multicast group,
OR when there is no multicast traffic on this bridge.

I fix this by adding another bool which is set when there is
multicast traffic goes to the bridge, cleared in the expire timer and
when IGMP leave is received. I ran omping for 15 minutes, everything
looks good now.

This might not be the best fix, but might be the simplest fix.

(Adam, this problem is probably different with your problem,
at least this problem can _not_ be workarounded by setting
querier to 1.)

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Adam Baker <linux@baker-net.org.uk>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
v2: rename ->busy to ->for_br

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu - April 25, 2013, 8:37 a.m.
On Thu, Apr 25, 2013 at 04:21:23PM +0800, Cong Wang wrote:
> This bug can be observed in virt environment, when a KVM guest
> communicates with the host via multicast. After some time (should
> be 260 sec, I didn't measure), the multicast traffic suddenly
> terminates.
> 
> This is due to the mdb entry for bridge itself expires automatically,
> it should not expire as long as the bridge still generates multicast
> traffic. It should expire when the bridge leaves the multicast group,
> OR when there is no multicast traffic on this bridge.
> 
> I fix this by adding another bool which is set when there is
> multicast traffic goes to the bridge, cleared in the expire timer and
> when IGMP leave is received. I ran omping for 15 minutes, everything
> looks good now.
> 
> This might not be the best fix, but might be the simplest fix.
> 
> (Adam, this problem is probably different with your problem,
> at least this problem can _not_ be workarounded by setting
> querier to 1.)

Sorry, your patch still makes no sense to me.  All you're doing
is extending the bridge interface's subscription status every
time a multicast packet for that group is sent to to the bridge.

We should only be extending the status if we get a group report.

As you're saying the bridge interface's subscription is expiring
incorrectly, the question is why aren't we receiving those group
reports from ourselves, which should be sent out periodically if
we were subscribed to the group?

Cheers,
Amerigo Wang - April 25, 2013, 10:07 a.m.
On Thu, 2013-04-25 at 16:37 +0800, Herbert Xu wrote:
> 
> As you're saying the bridge interface's subscription is expiring
> incorrectly, the question is why aren't we receiving those group
> reports from ourselves, which should be sent out periodically if
> we were subscribed to the group?

I thought IGMP report is optional as long as the multicast traffic is
still running. If it is mandatory, I don't see any IGMP report sending
out from guest, so the bug is in IPv4 multicast code rather than bridge
code.

But this can't explain why guests can communicate with each other. :)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sergei Shtylyov - April 25, 2013, 12:30 p.m.
Hello.

On 25-04-2013 12:21, Cong Wang wrote:

> This bug can be observed in virt environment, when a KVM guest
> communicates with the host via multicast. After some time (should
> be 260 sec, I didn't measure), the multicast traffic suddenly
> terminates.

> This is due to the mdb entry for bridge itself expires automatically,
> it should not expire as long as the bridge still generates multicast
> traffic. It should expire when the bridge leaves the multicast group,
> OR when there is no multicast traffic on this bridge.

> I fix this by adding another bool which is set when there is
> multicast traffic goes to the bridge, cleared in the expire timer and
> when IGMP leave is received. I ran omping for 15 minutes, everything
> looks good now.

> This might not be the best fix, but might be the simplest fix.

> (Adam, this problem is probably different with your problem,
> at least this problem can _not_ be workarounded by setting
> querier to 1.)

> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Adam Baker <linux@baker-net.org.uk>
> Signed-off-by: Cong Wang <amwang@redhat.com>
> ---
> v2: rename ->busy to ->for_br

> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index 828e2bc..4f0e50f 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -99,9 +99,12 @@ int br_handle_frame_finish(struct sk_buff *skb)
>   	else if (is_multicast_ether_addr(dest)) {
>   		mdst = br_mdb_get(br, skb, vid);
>   		if (mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) {
> -			if ((mdst && mdst->mglist) ||
> -			    br_multicast_is_router(br))
> +			bool to_br = mdst && mdst->mglist;

    Emoty line after declaration wouldn't hurt.

> +			if (to_br || br_multicast_is_router(br))
>   				skb2 = skb;

WBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu - April 26, 2013, 12:50 p.m.
On Thu, Apr 25, 2013 at 06:07:01PM +0800, Cong Wang wrote:
> On Thu, 2013-04-25 at 16:37 +0800, Herbert Xu wrote:
> > 
> > As you're saying the bridge interface's subscription is expiring
> > incorrectly, the question is why aren't we receiving those group
> > reports from ourselves, which should be sent out periodically if
> > we were subscribed to the group?
> 
> I thought IGMP report is optional as long as the multicast traffic is
> still running. If it is mandatory, I don't see any IGMP report sending
> out from guest, so the bug is in IPv4 multicast code rather than bridge
> code.
> 
> But this can't explain why guests can communicate with each other. :)

OK, if there is no querier in the network then our current default
behaviour is indeed broken because we won't get the reports needed
to sustain the subscriptions.

So we need to do two things:

1) Fix the group timeout mechanism to only arm the timer when a
corresponding query is received.
2) Resume sending queries when leave is received if the user makes
a querier.

Let me know if you'd like to work on this, otherwise I can take
a look at it next week.

Cheers,
Amerigo Wang - April 27, 2013, 3:40 p.m.
On Fri, 2013-04-26 at 20:50 +0800, Herbert Xu wrote:
> So we need to do two things:
> 
> 1) Fix the group timeout mechanism to only arm the timer when a
> corresponding query is received.
> 2) Resume sending queries when leave is received if the user makes
> a querier.

Agreed.

> 
> Let me know if you'd like to work on this, otherwise I can take
> a look at it next week.
> 

Yeah, I will work on it.

Thanks for the suggestion!

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 828e2bc..4f0e50f 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -99,9 +99,12 @@  int br_handle_frame_finish(struct sk_buff *skb)
 	else if (is_multicast_ether_addr(dest)) {
 		mdst = br_mdb_get(br, skb, vid);
 		if (mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) {
-			if ((mdst && mdst->mglist) ||
-			    br_multicast_is_router(br))
+			bool to_br = mdst && mdst->mglist;
+			if (to_br || br_multicast_is_router(br))
 				skb2 = skb;
+			/* Hold this mdb entry for bridge itself */
+			if (to_br)
+				mdst->for_br = true;
 			br_multicast_forward(mdst, skb, skb2);
 			skb = NULL;
 			if (!skb2)
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 81f2389..a86aa42 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -228,11 +228,16 @@  static void br_multicast_group_expired(unsigned long data)
 	if (!netif_running(br->dev) || timer_pending(&mp->timer))
 		goto out;
 
-	mp->mglist = false;
-
 	if (mp->ports)
 		goto out;
 
+	if (mp->mglist && mp->for_br) {
+		mp->for_br = false;
+		goto out;
+	}
+
+	mp->mglist = false;
+
 	mdb = mlock_dereference(br->mdb, br);
 
 	hlist_del_rcu(&mp->hlist[mdb->ver]);
@@ -668,7 +673,7 @@  static int br_multicast_add_group(struct net_bridge *br,
 		goto err;
 
 	if (!port) {
-		mp->mglist = true;
+		mp->mglist = mp->for_br = true;
 		mod_timer(&mp->timer, now + br->multicast_membership_interval);
 		goto out;
 	}
@@ -1280,6 +1285,7 @@  static void br_multicast_leave_group(struct net_bridge *br,
 			mod_timer(&mp->timer, time);
 		}
 
+		mp->for_br = false;
 		goto out;
 	}
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index d2c043a..c96bf19 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -112,6 +112,7 @@  struct net_bridge_mdb_entry
 	struct timer_list		timer;
 	struct br_ip			addr;
 	bool				mglist;
+	bool				for_br; /* update this only when mglist == true */
 };
 
 struct net_bridge_mdb_htable