diff mbox series

bonding: force enable lacp port after link state recovery for 802.3ad

Message ID 20190820133822.2508-1-zhangsha.zhang@huawei.com
State Changes Requested
Delegated to: David Miller
Headers show
Series bonding: force enable lacp port after link state recovery for 802.3ad | expand

Commit Message

zhangsha (A) Aug. 20, 2019, 1:38 p.m. UTC
From: Sha Zhang <zhangsha.zhang@huawei.com>

After the commit 334031219a84 ("bonding/802.3ad: fix slave link
initialization transition states") merged,
the slave's link status will be changed to BOND_LINK_FAIL
from BOND_LINK_DOWN in the following scenario:
- Driver reports loss of carrier and
  bonding driver receives NETDEV_CHANGE notifier
- slave's duplex and speed is zerod and
  its port->is_enabled is cleard to 'false';
- Driver reports link recovery and
  bonding driver receives NETDEV_UP notifier;
- If speed/duplex getting failed here, the link status
  will be changed to BOND_LINK_FAIL;
- The MII monotor later recover the slave's speed/duplex
  and set link status to BOND_LINK_UP, but remains
  the 'port->is_enabled' to 'false'.

In this scenario, the lacp port will not be enabled even its speed
and duplex are valid. The bond will not send LACPDU's, and its
state is 'AD_STATE_DEFAULTED' forever. The simplest fix I think
is to force enable lacp after port slave speed check in
bond_miimon_commit. As enabled, the lacp port can run its state machine
normally after link recovery.

Signed-off-by: Sha Zhang <zhangsha.zhang@huawei.com>
---
 drivers/net/bonding/bond_main.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Jay Vosburgh Aug. 20, 2019, 11:14 p.m. UTC | #1
<zhangsha.zhang@huawei.com> wrote:

>From: Sha Zhang <zhangsha.zhang@huawei.com>
>
>After the commit 334031219a84 ("bonding/802.3ad: fix slave link
>initialization transition states") merged,
>the slave's link status will be changed to BOND_LINK_FAIL
>from BOND_LINK_DOWN in the following scenario:
>- Driver reports loss of carrier and
>  bonding driver receives NETDEV_CHANGE notifier
>- slave's duplex and speed is zerod and
>  its port->is_enabled is cleard to 'false';
>- Driver reports link recovery and
>  bonding driver receives NETDEV_UP notifier;
>- If speed/duplex getting failed here, the link status
>  will be changed to BOND_LINK_FAIL;
>- The MII monotor later recover the slave's speed/duplex
>  and set link status to BOND_LINK_UP, but remains
>  the 'port->is_enabled' to 'false'.
>
>In this scenario, the lacp port will not be enabled even its speed
>and duplex are valid. The bond will not send LACPDU's, and its
>state is 'AD_STATE_DEFAULTED' forever. The simplest fix I think
>is to force enable lacp after port slave speed check in
>bond_miimon_commit. As enabled, the lacp port can run its state machine
>normally after link recovery.
>
>Signed-off-by: Sha Zhang <zhangsha.zhang@huawei.com>
>---
> drivers/net/bonding/bond_main.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 931d9d9..379253a 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2194,6 +2194,7 @@ static void bond_miimon_commit(struct bonding *bond)
> {
> 	struct list_head *iter;
> 	struct slave *slave, *primary;
>+	struct port *port;
> 
> 	bond_for_each_slave(bond, slave, iter) {
> 		switch (slave->new_link) {
>@@ -2205,8 +2206,13 @@ static void bond_miimon_commit(struct bonding *bond)
> 			 * link status
> 			 */
> 			if (BOND_MODE(bond) == BOND_MODE_8023AD &&
>-			    slave->link == BOND_LINK_UP)
>+			    slave->link == BOND_LINK_UP) {
> 				bond_3ad_adapter_speed_duplex_changed(slave);
>+				if (slave->duplex == DUPLEX_FULL) {
>+					port = &(SLAVE_AD_INFO(slave)->port);
>+					port->is_enabled = true;
>+				}
>+			}

	I don't believe that testing duplex here is correct; is_enabled
is not controlled by duplex, but by carrier state.  Duplex does affect
whether or not a port is permitted to aggregate, but that's entirely
separate logic (the AD_PORT_LACP_ENABLED flag).

	Would it be better to call bond_3ad_handle_link_change() here,
instead of manually testing duplex and setting is_enabled?

	-J

> 			continue;
> 
> 		case BOND_LINK_UP:
>-- 
>1.8.3.1
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com
zhangsha (A) Aug. 22, 2019, 2:33 p.m. UTC | #2
> -----Original Message-----
> From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
> Sent: 2019年8月21日 7:15
> To: zhangsha (A) <zhangsha.zhang@huawei.com>
> Cc: vfalico@gmail.com; andy@greyhouse.net; davem@davemloft.net;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; yuehaibing
> <yuehaibing@huawei.com>; hunongda <hunongda@huawei.com>;
> Chenzhendong (alex) <alex.chen@huawei.com>
> Subject: Re: [PATCH] bonding: force enable lacp port after link state recovery
> for 802.3ad
> 
> <zhangsha.zhang@huawei.com> wrote:
> 
> >From: Sha Zhang <zhangsha.zhang@huawei.com>
> >
> >After the commit 334031219a84 ("bonding/802.3ad: fix slave link
> >initialization transition states") merged, the slave's link status will
> >be changed to BOND_LINK_FAIL from BOND_LINK_DOWN in the following
> >scenario:
> >- Driver reports loss of carrier and
> >  bonding driver receives NETDEV_CHANGE notifier
> >- slave's duplex and speed is zerod and
> >  its port->is_enabled is cleard to 'false';
> >- Driver reports link recovery and
> >  bonding driver receives NETDEV_UP notifier;
> >- If speed/duplex getting failed here, the link status
> >  will be changed to BOND_LINK_FAIL;
> >- The MII monotor later recover the slave's speed/duplex
> >  and set link status to BOND_LINK_UP, but remains
> >  the 'port->is_enabled' to 'false'.
> >
> >In this scenario, the lacp port will not be enabled even its speed and
> >duplex are valid. The bond will not send LACPDU's, and its state is
> >'AD_STATE_DEFAULTED' forever. The simplest fix I think is to force
> >enable lacp after port slave speed check in bond_miimon_commit. As
> >enabled, the lacp port can run its state machine normally after link
> >recovery.
> >
> >Signed-off-by: Sha Zhang <zhangsha.zhang@huawei.com>
> >---
> > drivers/net/bonding/bond_main.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/net/bonding/bond_main.c
> >b/drivers/net/bonding/bond_main.c index 931d9d9..379253a 100644
> >--- a/drivers/net/bonding/bond_main.c
> >+++ b/drivers/net/bonding/bond_main.c
> >@@ -2194,6 +2194,7 @@ static void bond_miimon_commit(struct bonding
> >*bond)  {
> > 	struct list_head *iter;
> > 	struct slave *slave, *primary;
> >+	struct port *port;
> >
> > 	bond_for_each_slave(bond, slave, iter) {
> > 		switch (slave->new_link) {
> >@@ -2205,8 +2206,13 @@ static void bond_miimon_commit(struct bonding
> *bond)
> > 			 * link status
> > 			 */
> > 			if (BOND_MODE(bond) == BOND_MODE_8023AD &&
> >-			    slave->link == BOND_LINK_UP)
> >+			    slave->link == BOND_LINK_UP) {
> >
> 	bond_3ad_adapter_speed_duplex_changed(slave);
> >+				if (slave->duplex == DUPLEX_FULL) {
> >+					port = &(SLAVE_AD_INFO(slave)-
> >port);
> >+					port->is_enabled = true;
> >+				}
> >+			}
> 
> 	I don't believe that testing duplex here is correct; is_enabled is not
> controlled by duplex, but by carrier state.  Duplex does affect whether or not
> a port is permitted to aggregate, but that's entirely separate logic (the
> AD_PORT_LACP_ENABLED flag).
> 
> 	Would it be better to call bond_3ad_handle_link_change() here,
> instead of manually testing duplex and setting is_enabled?
> 
> 	-J

Hi, Jay,
Thanks for  the reply and I think bond_3ad_handle_link_change is indeed a better way here.
I will send a new patch later after having it  tested.

> 
> > 			continue;
> >
> > 		case BOND_LINK_UP:
> >--
> >1.8.3.1
> >
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
diff mbox series

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 931d9d9..379253a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2194,6 +2194,7 @@  static void bond_miimon_commit(struct bonding *bond)
 {
 	struct list_head *iter;
 	struct slave *slave, *primary;
+	struct port *port;
 
 	bond_for_each_slave(bond, slave, iter) {
 		switch (slave->new_link) {
@@ -2205,8 +2206,13 @@  static void bond_miimon_commit(struct bonding *bond)
 			 * link status
 			 */
 			if (BOND_MODE(bond) == BOND_MODE_8023AD &&
-			    slave->link == BOND_LINK_UP)
+			    slave->link == BOND_LINK_UP) {
 				bond_3ad_adapter_speed_duplex_changed(slave);
+				if (slave->duplex == DUPLEX_FULL) {
+					port = &(SLAVE_AD_INFO(slave)->port);
+					port->is_enabled = true;
+				}
+			}
 			continue;
 
 		case BOND_LINK_UP: