bonding: fix link down handling in 802.3ad mode

Submitter: Stephen Hemminger
Date: May 15, 2009, 6:44 p.m.
Message ID: <20090515114432.4025a2d6@nehalam>
Permalink: /patch/27278/
State: Accepted
Delegated to: David Miller

Comments

Stephen Hemminger - May 15, 2009, 6:44 p.m.
One of the purposes of bonding is to allow for redundant links, and failover 
correctly if the cable is pulled. If all the members of a bonded device have
no carrier present, the bonded device itself needs to report no carrier present
to user space so management tools (like routing daemons) can respond.

Bonding in 802.3ad mode does not work correctly for this because it incorrectly 
chooses a link that is down as a possible aggregator.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---	       
This patch is a bug fix and should go into 2.6.30

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jay Vosburgh - May 15, 2009, 7:35 p.m.
Stephen Hemminger <shemminger@vyatta.com> wrote:

>One of the purposes of bonding is to allow for redundant links, and failover 
>correctly if the cable is pulled. If all the members of a bonded device have
>no carrier present, the bonded device itself needs to report no carrier present
>to user space so management tools (like routing daemons) can respond.
>
>Bonding in 802.3ad mode does not work correctly for this because it incorrectly 
>chooses a link that is down as a possible aggregator.
>
>Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

	Looks good.  I tested this after putting some printks in the agg
selection logic, and I don't see any undesirable side effects.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

	-J

>---	       
>This patch is a bug fix and should go into 2.6.30
>
>--- a/drivers/net/bonding/bond_3ad.c	2009-05-14 15:38:15.090363836 -0700
>+++ b/drivers/net/bonding/bond_3ad.c	2009-05-15 11:41:44.313559146 -0700
>@@ -1465,6 +1465,12 @@ static struct aggregator *ad_agg_selecti
> 	return best;
> }
>
>+static int agg_device_up(const struct aggregator *agg)
>+{
>+	return (netif_running(agg->slave->dev) &&
>+		netif_carrier_ok(agg->slave->dev));
>+}
>+
> /**
>  * ad_agg_selection_logic - select an aggregation group for a team
>  * @aggregator: the aggregator we're looking at
>@@ -1496,14 +1502,13 @@ static void ad_agg_selection_logic(struc
> 	struct port *port;
>
> 	origin = agg;
>-
> 	active = __get_active_agg(agg);
>-	best = active;
>+	best = (active && agg_device_up(active)) ? active : NULL;
>
> 	do {
> 		agg->is_active = 0;
>
>-		if (agg->num_of_ports)
>+		if (agg->num_of_ports && agg_device_up(agg))
> 			best = ad_agg_selection_test(best, agg);
>
> 	} while ((agg = __get_next_agg(agg)));
David Miller - May 18, 2009, 4:16 a.m.
From: Jay Vosburgh <fubar@us.ibm.com>
Date: Fri, 15 May 2009 12:35:55 -0700

> Stephen Hemminger <shemminger@vyatta.com> wrote:
> 
>>One of the purposes of bonding is to allow for redundant links, and failover 
>>correctly if the cable is pulled. If all the members of a bonded device have
>>no carrier present, the bonded device itself needs to report no carrier present
>>to user space so management tools (like routing daemons) can respond.
>>
>>Bonding in 802.3ad mode does not work correctly for this because it incorrectly 
>>chooses a link that is down as a possible aggregator.
>>
>>Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> 	Looks good.  I tested this after putting some printks in the agg
> selection logic, and I don't see any undesirable side effects.
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

Applied, thanks everyone.

Patch

--- a/drivers/net/bonding/bond_3ad.c	2009-05-14 15:38:15.090363836 -0700
+++ b/drivers/net/bonding/bond_3ad.c	2009-05-15 11:41:44.313559146 -0700
@@ -1465,6 +1465,12 @@  static struct aggregator *ad_agg_selecti
 	return best;
 }
 
+static int agg_device_up(const struct aggregator *agg)
+{
+	return (netif_running(agg->slave->dev) &&
+		netif_carrier_ok(agg->slave->dev));
+}
+
 /**
  * ad_agg_selection_logic - select an aggregation group for a team
  * @aggregator: the aggregator we're looking at
@@ -1496,14 +1502,13 @@  static void ad_agg_selection_logic(struc
 	struct port *port;
 
 	origin = agg;
-
 	active = __get_active_agg(agg);
-	best = active;
+	best = (active && agg_device_up(active)) ? active : NULL;
 
 	do {
 		agg->is_active = 0;
 
-		if (agg->num_of_ports)
+		if (agg->num_of_ports && agg_device_up(agg))
 			best = ad_agg_selection_test(best, agg);
 
 	} while ((agg = __get_next_agg(agg)));
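
To illustrate the effect of the change outside the kernel, here is a minimal user-space sketch of the selection loop. Everything in it is a hypothetical stand-in: `struct model_dev`, `struct model_agg`, `model_selection_test()` and `model_select()` are not kernel types or functions, the aggregators live in a plain array rather than being walked with `__get_next_agg()`, and the comparison in `model_selection_test()` (prefer more ports) is an assumed simplification of `ad_agg_selection_test()`. The `filter_link_state` flag is only there so the behaviour with and without the link-state check can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a net_device; not the kernel structure. */
struct model_dev {
	bool running;	/* models the result of netif_running() */
	bool carrier;	/* models the result of netif_carrier_ok() */
};

/* Hypothetical stand-in for struct aggregator. */
struct model_agg {
	const char *name;
	int num_of_ports;
	bool is_active;
	struct model_dev dev;
};

/* Same test as agg_device_up() in the patch, applied to the model. */
static bool model_agg_device_up(const struct model_agg *agg)
{
	return agg->dev.running && agg->dev.carrier;
}

/*
 * Assumed simplification of ad_agg_selection_test(): keep the current
 * best unless the candidate has strictly more ports.
 */
static struct model_agg *model_selection_test(struct model_agg *best,
					       struct model_agg *curr)
{
	if (!best)
		return curr;
	return curr->num_of_ports > best->num_of_ports ? curr : best;
}

/*
 * Pick the best aggregator from an array.
 * filter_link_state == false models the code before the patch,
 * filter_link_state == true models the code after it.
 * Returns NULL when no aggregator qualifies.
 */
static struct model_agg *model_select(struct model_agg *aggs, size_t n,
				      bool filter_link_state)
{
	struct model_agg *active = NULL;
	struct model_agg *best;
	size_t i;

	assert(aggs != NULL || n == 0);

	/* Find the currently active aggregator, if any. */
	for (i = 0; i < n; i++)
		if (aggs[i].is_active)
			active = &aggs[i];

	/* After the patch, a down active aggregator is not a starting point. */
	if (filter_link_state)
		best = (active && model_agg_device_up(active)) ? active : NULL;
	else
		best = active;

	for (i = 0; i < n; i++) {
		aggs[i].is_active = false;

		/* After the patch, aggregators whose device is down are skipped. */
		if (aggs[i].num_of_ports &&
		    (!filter_link_state || model_agg_device_up(&aggs[i])))
			best = model_selection_test(best, &aggs[i]);
	}

	return best;
}
```

In this model, when every aggregator's device has lost carrier, `model_select()` with `filter_link_state` set returns NULL, while with the flag cleared it still returns the previously active aggregator even though its link is down. That is the case described in the commit message, where the bonded device needs to report no carrier.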