[v2] bonding: fix speed unknown, LACP bonding failed

Message ID 1373331122-10052-1-git-send-email-wangyufen@huawei.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Wang Yufen July 9, 2013, 12:52 a.m. UTC
From: "Wang Yufen" <wangyufen@huawei.com>

We repeatedly bonded NICs in LACP mode, and occasionally LACP bonding failed
because the port speed of a slave NIC was unknown. When we checked the slave
NIC with ethtool, however, its status was normal and the speed was 10000Mb/s.

Bonding gets the NIC speed from the NIC driver only when the NIC is enslaved
and when a NETDEV_CHANGE event is received. With this patch, we call
bond_update_speed_duplex() to update speed and duplex when the miimon
inspection finds that the slave link is OK but the slave speed is unknown.
	
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
---
 drivers/net/bonding/bond_main.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Zefan Li July 9, 2013, 1:42 a.m. UTC | #1
On 2013/7/9 8:52, Wangyufen wrote:
> From: "Wang Yufen" <wangyufen@huawei.com>
> 
> We repeatedly bonded NICs in LACP mode, and occasionally LACP bonding failed
> because the port speed of a slave NIC was unknown. When we checked the slave
> NIC with ethtool, however, its status was normal and the speed was 10000Mb/s.
>
> Bonding gets the NIC speed from the NIC driver only when the NIC is enslaved
> and when a NETDEV_CHANGE event is received. With this patch, we call
> bond_update_speed_duplex() to update speed and duplex when the miimon
> inspection finds that the slave link is OK but the slave speed is unknown.
> 	

Normally one should explain the changes from v1 to v2.

> 	
> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
> ---
>  drivers/net/bonding/bond_main.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index f975696..4ccc173 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2301,8 +2301,22 @@ static int bond_miimon_inspect(struct bonding *bond)
>  
>  		switch (slave->link) {
>  		case BOND_LINK_UP:
> -			if (link_state)
> +			if (link_state) {
> +				if (slave->speed == SPEED_UNKNOWN) {
> +					rtnl_lock();

You should have CCed those who commented on your previous version, which here
means Ben.

"bond_update_speed_duplex() must not be called in atomic context."

You didn't address his comment.

> +					bond_update_speed_duplex(slave);
> +					if (slave->speed != SPEED_UNKNOWN

Firstly you checked slave->speed == SPEED_UNKNOWN, and now slave->speed != SPEED_UNKNOWN ??

> +					&& bond->params.mode
> +					== BOND_MODE_8023AD) {

The coding style is awful...

if (slave->speed != SPEED_UNKNOWN &&
    bond->params.mode == BOND_MODE_8023AD) {
	...

Even if it exceeds the 80-character limit a bit.

> +						bond_3ad_adapter_speed_changed(
> +						slave);

ditto,

> +						bond_3ad_adapter_duplex_changed(
> +						slave);

ditto

> +					}
> +					rtnl_unlock();
> +				}
>  				continue;
> +			}
>  
>  			slave->link = BOND_LINK_FAIL;
>  			slave->delay = bond->params.downdelay;
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Veaceslav Falico July 9, 2013, 1:43 p.m. UTC | #2
On Tue, Jul 09, 2013 at 08:52:02AM +0800, Wangyufen wrote:
>From: "Wang Yufen" <wangyufen@huawei.com>
>
>We repeatedly bonded NICs in LACP mode, and occasionally LACP bonding failed
>because the port speed of a slave NIC was unknown. When we checked the slave
>NIC with ethtool, however, its status was normal and the speed was 10000Mb/s.
>
>Bonding gets the NIC speed from the NIC driver only when the NIC is enslaved
>and when a NETDEV_CHANGE event is received. With this patch, we call
>bond_update_speed_duplex() to update speed and duplex when the miimon
>inspection finds that the slave link is OK but the slave speed is unknown.

This is the wrong way to fix it. The real problem here is that the NIC
doesn't send NETDEV_CHANGE when it changes its speed/duplex. Try finding
out why it doesn't and fix it.

And, as Ben mentioned earlier, bond_update_speed_duplex() can sleep, and
thus should not be called from atomic context. Take a look at the caller,
bond_mii_monitor(): bond_miimon_inspect() runs under bond->lock, for a
good reason.

For reference, see the patch 876254ae2758d50dcb08c7bd00caf6a806571178
("bonding: don't call update_speed_duplex() under spinlocks") - it
specifically removes the _update_speed_duplex() from _miimon_commit().
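
The locking constraint described above can be modelled outside the kernel.
What follows is a minimal userspace sketch in C. It is not the bonding
driver's code and not the fix suggested in this thread (which is to make the
NIC driver send NETDEV_CHANGE). All names in it (struct bond_model, struct
slave_model, model_lock(), model_unlock(), query_speed_may_sleep(),
monitor_once()) are hypothetical, and a boolean flag stands in for bond->lock.
It only illustrates the general pattern of flagging work while the lock is
held and running the possibly-sleeping query after the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SPEED_UNKNOWN (-1)

/* Hypothetical model of a slave: not the kernel's struct slave. */
struct slave_model {
	int speed;          /* speed cached by the bond */
	int hw_speed;       /* what the simulated driver would report */
	bool needs_refresh; /* set under the lock, consumed outside it */
};

/* Hypothetical model of a bond; lock_held stands in for bond->lock. */
struct bond_model {
	bool lock_held;
	struct slave_model *slaves;
	size_t nslaves;
};

static void model_lock(struct bond_model *b)
{
	assert(!b->lock_held);
	b->lock_held = true;
}

static void model_unlock(struct bond_model *b)
{
	assert(b->lock_held);
	b->lock_held = false;
}

/* Stand-in for a driver query that may sleep: asserts no lock is held. */
static int query_speed_may_sleep(const struct bond_model *b,
				 const struct slave_model *s)
{
	assert(!b->lock_held);
	return s->hw_speed;
}

/*
 * One monitor pass (a single monitor thread is assumed).
 * Returns the number of slaves whose speed became known.
 */
static size_t monitor_once(struct bond_model *b)
{
	size_t pending = 0, updated = 0;

	/* Phase 1: under the lock, only flag slaves with unknown speed. */
	model_lock(b);
	for (size_t i = 0; i < b->nslaves; i++) {
		b->slaves[i].needs_refresh =
			(b->slaves[i].speed == SPEED_UNKNOWN);
		if (b->slaves[i].needs_refresh)
			pending++;
	}
	model_unlock(b);

	if (pending == 0)
		return 0;

	/* Phase 2: query without the lock, retake it only to store. */
	for (size_t i = 0; i < b->nslaves; i++) {
		struct slave_model *s = &b->slaves[i];
		int speed;

		if (!s->needs_refresh)
			continue;
		speed = query_speed_may_sleep(b, s);
		model_lock(b);
		s->speed = speed;
		s->needs_refresh = false;
		model_unlock(b);
		if (speed != SPEED_UNKNOWN)
			updated++;
	}
	return updated;
}
```

Calling query_speed_may_sleep() between model_lock() and model_unlock() would
trip the assertion in this model, which is the userspace analogue of the
objection raised against the patch.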

>	
>	
>Signed-off-by: Wang Yufen <wangyufen@huawei.com>
>
>---
>drivers/net/bonding/bond_main.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index f975696..4ccc173 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2301,8 +2301,22 @@ static int bond_miimon_inspect(struct bonding *bond)
>
> 		switch (slave->link) {
> 		case BOND_LINK_UP:
>-			if (link_state)
>+			if (link_state) {
>+				if (slave->speed == SPEED_UNKNOWN) {
>+					rtnl_lock();
>+					bond_update_speed_duplex(slave);
>+					if (slave->speed != SPEED_UNKNOWN
>+					&& bond->params.mode
>+					== BOND_MODE_8023AD) {
>+						bond_3ad_adapter_speed_changed(
>+						slave);
>+						bond_3ad_adapter_duplex_changed(
>+						slave);
>+					}
>+					rtnl_unlock();
>+				}
> 				continue;
>+			}
>
> 			slave->link = BOND_LINK_FAIL;
> 			slave->delay = bond->params.downdelay;
Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f975696..4ccc173 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2301,8 +2301,22 @@  static int bond_miimon_inspect(struct bonding *bond)
 
 		switch (slave->link) {
 		case BOND_LINK_UP:
-			if (link_state)
+			if (link_state) {
+				if (slave->speed == SPEED_UNKNOWN) {
+					rtnl_lock();
+					bond_update_speed_duplex(slave);
+					if (slave->speed != SPEED_UNKNOWN
+					&& bond->params.mode
+					== BOND_MODE_8023AD) {
+						bond_3ad_adapter_speed_changed(
+						slave);
+						bond_3ad_adapter_duplex_changed(
+						slave);
+					}
+					rtnl_unlock();
+				}
 				continue;
+			}
 
 			slave->link = BOND_LINK_FAIL;
 			slave->delay = bond->params.downdelay;