diff mbox

[1/1] bonding: restrict up state in 802.3ad mode

Message ID 568E1801.9020804@gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Zhu Yanjun Jan. 7, 2016, 7:47 a.m. UTC
On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>> Sent: Tuesday, January 05, 2016 7:05 PM
>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>> Shteinbock, Boris (Wind River)
>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>
>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>> -----Original Message-----
>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> On
>>>> Behalf Of zhuyj
>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>> To: Michal Kubecek; Jay Vosburgh
>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>> netdev@vger.kernel.org;
>>>> Shteinbock, Boris (Wind River)
>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>
>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>> there is a time span between NIC up state and getting speed and
>> duplex.
>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>> work well.
>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>> 	What device is this?  It seems a bit odd that an Ethernet device
>>>>>> can be carrier up but not have the duplex and speed available.
>>>>> ...
>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>> propagate to the 802.3ad logic.
>>>>>>
>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>> duplex available, then maybe it should call netdev_state_change() when
>>>>>> the duplex and speed are available, or delay calling
>> netif_carrier_on().
>>>>> I have encountered this problem (NIC having carrier on before being
>> able
>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>> handling.") so this example rather supports what you said.
>>>>>
>>>>>                                                              Michal
>> Kubecek
>>>> Thanks a lot.
>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>> handling."). The symptoms are the same with mine.
>>>>
>>>> The root cause is different. In my problem, the root cause is that LINKS
>>>> register[]  can not provide link_up and link_speed at the same time.
>>>> There is a time span between link_up and link_speed.
>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>> updated
>>> simultaneously. Do you have any proof to show the delay you are referring
>> to
>>> as I am sure our HW engineers would like to know about it.
>> Sorry. I can not reproduce this problem locally. What I have is the
>> feedback from the customer.
> So you are assuming that there is a delay due to the issue you are seeing?
>
>> Settings for eth0:
>>     Supported ports: [ TP ]
>>     Supported link modes:   100baseT/Full
>>                             1000baseT/Full
>>                             10000baseT/Full
>>     Supported pause frame use: No
>>     Supports auto-negotiation: Yes
>>     Advertised link modes:  100baseT/Full
>>                             1000baseT/Full
>>                             10000baseT/Full
>>     Advertised pause frame use: No
>>     Advertised auto-negotiation: Yes
>>     Speed: Unknown!
>>     Duplex: Unknown! (255)
>>     Port: Twisted Pair
>>     PHYAD: 0
>>     Transceiver: external
>>     Auto-negotiation: on
>>     MDI-X: Unknown
>>     Supports Wake-on: d
>>     Wake-on: d
>>     Current message level: 0x00000007 (7)
>>                    drv probe link
>>     Link detected: yes
> The speed and the link state here are reported from
> different sources:
>
>>     Link detected: yes
> Comes from a netif_carrier_ok() check. This is done via ethtool_op_get_link().
>
> Only the speed is reported through the LINKS register - that is why it is reported
> as "Unknown" - in other words link_up is false.
>
> This is a trace from the case where the bonding driver reports 0 Mbps:
>
>     kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>     kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>       kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>    NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex
>    NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>     kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task: eth1: link_speed = 80, link_up = true

Hi, Emil

Thanks for your feedback.
 From your log, I think the following can explain why bonding driver can 
not get speed.

bonding                           ixgbe
.                                   .
.      <-----------------------   NETDEV_UP
.                                   .
bond_slave_netdev_event           NETDEV_DOWN
.                                   .
.                                   .
.                                   .
NETDEV_UP                           .
.              ----------------> get_settings
                                     .
speed unknown  <---------------  link_up false
.
.
link_up = true
link_speed = unknown

In the above, ixgbe is up and bonding gets this message, then bonding 
calls bond_slave_netdev_event while ixgbe is down.
In bond_slave_netdev_event, bonding call get_settings in ixgbe to get 
link_speed. Since now ixgbe is down, so link_speed is
unknown. In the end, bonding get the final state of ixgbe as link_up 
without link_speed.

If you agree with me, would you like to help me to make tests with the 
following patch?

         return 0;

Thanks a lot.
Zhu Yanjun

>
> As you can see the link is initially established, but then lost and if just so happens that the
> bonding driver is checking it at that time it will report 0 Mbps.
>
> I will give your patch a try and see if it helps in this situation.
>
> Thanks,
> Emil
>
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Tantilov, Emil S Jan. 7, 2016, 6:28 p.m. UTC | #1
>-----Original Message-----
>From: zhuyj [mailto:zyjzyj2000@gmail.com]
>Sent: Wednesday, January 06, 2016 11:47 PM
>To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>Shteinbock, Boris (Wind River)
>Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>
>On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>>> -----Original Message-----
>>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>>> Sent: Tuesday, January 05, 2016 7:05 PM
>>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>netdev@vger.kernel.org;
>>> Shteinbock, Boris (Wind River)
>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>
>>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>>> -----Original Message-----
>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>owner@vger.kernel.org]
>>> On
>>>>> Behalf Of zhuyj
>>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>>> To: Michal Kubecek; Jay Vosburgh
>>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>>> netdev@vger.kernel.org;
>>>>> Shteinbock, Boris (Wind River)
>>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>>
>>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>>> there is a time span between NIC up state and getting speed and
>>> duplex.
>>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>>> work well.
>>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>>> 	What device is this?  It seems a bit odd that an Ethernet
>device
>>>>>>> can be carrier up but not have the duplex and speed available.
>>>>>> ...
>>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>>> propagate to the 802.3ad logic.
>>>>>>>
>>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>>> duplex available, then maybe it should call netdev_state_change()
>when
>>>>>>> the duplex and speed are available, or delay calling
>>> netif_carrier_on().
>>>>>> I have encountered this problem (NIC having carrier on before being
>>> able
>>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>>> handling.") so this example rather supports what you said.
>>>>>>
>>>>>>                                                              Michal
>>> Kubecek
>>>>> Thanks a lot.
>>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>>> handling."). The symptoms are the same with mine.
>>>>>
>>>>> The root cause is different. In my problem, the root cause is that
>LINKS
>>>>> register[]  can not provide link_up and link_speed at the same time.
>>>>> There is a time span between link_up and link_speed.
>>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>>> updated
>>>> simultaneously. Do you have any proof to show the delay you are
>referring
>>> to
>>>> as I am sure our HW engineers would like to know about it.
>>> Sorry. I can not reproduce this problem locally. What I have is the
>>> feedback from the customer.
>> So you are assuming that there is a delay due to the issue you are
>seeing?
>>
>>> Settings for eth0:
>>>     Supported ports: [ TP ]
>>>     Supported link modes:   100baseT/Full
>>>                             1000baseT/Full
>>>                             10000baseT/Full
>>>     Supported pause frame use: No
>>>     Supports auto-negotiation: Yes
>>>     Advertised link modes:  100baseT/Full
>>>                             1000baseT/Full
>>>                             10000baseT/Full
>>>     Advertised pause frame use: No
>>>     Advertised auto-negotiation: Yes
>>>     Speed: Unknown!
>>>     Duplex: Unknown! (255)
>>>     Port: Twisted Pair
>>>     PHYAD: 0
>>>     Transceiver: external
>>>     Auto-negotiation: on
>>>     MDI-X: Unknown
>>>     Supports Wake-on: d
>>>     Wake-on: d
>>>     Current message level: 0x00000007 (7)
>>>                    drv probe link
>>>     Link detected: yes
>> The speed and the link state here are reported from
>> different sources:
>>
>>>     Link detected: yes
>> Comes from a netif_carrier_ok() check. This is done via
>ethtool_op_get_link().
>>
>> Only the speed is reported through the LINKS register - that is why it is
>reported
>> as "Unknown" - in other words link_up is false.
>>
>> This is a trace from the case where the bonding driver reports 0 Mbps:
>>
>>     kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task:
>eth1: link_speed = 80, link_up = true
>>     kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task:
>eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>>       kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings:
>eth1: link_speed = 80, link_up = false
>>    NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0:
>link status definitely up for interface eth1, 0 Mbps full duplex
>>    NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task:
>eth1: link_speed = 80, link_up = true
>>     kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task:
>eth1: link_speed = 80, link_up = true
>
>Hi, Emil
>
>Thanks for your feedback.
> From your log, I think the following can explain why bonding driver can
>not get speed.
>
>bonding                           ixgbe
>.                                   .
>.      <-----------------------   NETDEV_UP
>.                                   .
>bond_slave_netdev_event           NETDEV_DOWN
>.                                   .
>.                                   .
>.                                   .
>NETDEV_UP                           .
>.              ----------------> get_settings
>                                     .
>speed unknown  <---------------  link_up false
>.
>.
>link_up = true
>link_speed = unknown
>
>In the above, ixgbe is up and bonding gets this message, then bonding
>calls bond_slave_netdev_event while ixgbe is down.
>In bond_slave_netdev_event, bonding call get_settings in ixgbe to get
>link_speed. Since now ixgbe is down, so link_speed is
>unknown. In the end, bonding get the final state of ixgbe as link_up
>without link_speed.
>
>If you agree with me, would you like to help me to make tests with the
>following patch?
>
>diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>index d681273..3efc4d8 100644
>--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>@@ -285,27 +285,24 @@ static int ixgbe_get_settings(struct net_device
>*netdev,
>         }
>
>         hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
>-       if (link_up) {
>-               switch (link_speed) {
>-               case IXGBE_LINK_SPEED_10GB_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_10000);
>-                       break;
>-               case IXGBE_LINK_SPEED_2_5GB_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_2500);
>-                       break;
>-               case IXGBE_LINK_SPEED_1GB_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_1000);
>-                       break;
>-               case IXGBE_LINK_SPEED_100_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_100);
>-                       break;
>-               default:
>-                       break;
>-               }
>-               ecmd->duplex = DUPLEX_FULL;
>-       } else {
>-               ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
>+
>+       ecmd->duplex = DUPLEX_FULL;
>+       switch (link_speed) {
>+       case IXGBE_LINK_SPEED_10GB_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_10000);
>+               break;
>+       case IXGBE_LINK_SPEED_2_5GB_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_2500);
>+               break;
>+       case IXGBE_LINK_SPEED_1GB_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_1000);
>+               break;
>+       case IXGBE_LINK_SPEED_100_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_100);
>+               break;
>+       default:
>                 ecmd->duplex = DUPLEX_UNKNOWN;
>+               break;
>         }
>
>         return 0;

This will break speed reporting. You cannot ignore link_up.
The speed is only valid when the link_up bit is set.

Thanks,
Emil
Zhu Yanjun Jan. 8, 2016, 6:09 a.m. UTC | #2
On 01/08/2016 02:28 AM, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>> Sent: Wednesday, January 06, 2016 11:47 PM
>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>> Shteinbock, Boris (Wind River)
>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>
>> On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>>>> -----Original Message-----
>>>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>>>> Sent: Tuesday, January 05, 2016 7:05 PM
>>>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>> netdev@vger.kernel.org;
>>>> Shteinbock, Boris (Wind River)
>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>
>>>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>>>> -----Original Message-----
>>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org]
>>>> On
>>>>>> Behalf Of zhuyj
>>>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>>>> To: Michal Kubecek; Jay Vosburgh
>>>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>>>> netdev@vger.kernel.org;
>>>>>> Shteinbock, Boris (Wind River)
>>>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>>>
>>>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>>>> there is a time span between NIC up state and getting speed and
>>>> duplex.
>>>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>>>> work well.
>>>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>>>> 	What device is this?  It seems a bit odd that an Ethernet
>> device
>>>>>>>> can be carrier up but not have the duplex and speed available.
>>>>>>> ...
>>>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>>>> propagate to the 802.3ad logic.
>>>>>>>>
>>>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>>>> duplex available, then maybe it should call netdev_state_change()
>> when
>>>>>>>> the duplex and speed are available, or delay calling
>>>> netif_carrier_on().
>>>>>>> I have encountered this problem (NIC having carrier on before being
>>>> able
>>>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>>>> handling.") so this example rather supports what you said.
>>>>>>>
>>>>>>>                                                               Michal
>>>> Kubecek
>>>>>> Thanks a lot.
>>>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>>>> handling."). The symptoms are the same with mine.
>>>>>>
>>>>>> The root cause is different. In my problem, the root cause is that
>> LINKS
>>>>>> register[]  can not provide link_up and link_speed at the same time.
>>>>>> There is a time span between link_up and link_speed.
>>>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>>>> updated
>>>>> simultaneously. Do you have any proof to show the delay you are
>> referring
>>>> to
>>>>> as I am sure our HW engineers would like to know about it.
>>>> Sorry. I can not reproduce this problem locally. What I have is the
>>>> feedback from the customer.
>>> So you are assuming that there is a delay due to the issue you are
>> seeing?
>>>> Settings for eth0:
>>>>      Supported ports: [ TP ]
>>>>      Supported link modes:   100baseT/Full
>>>>                              1000baseT/Full
>>>>                              10000baseT/Full
>>>>      Supported pause frame use: No
>>>>      Supports auto-negotiation: Yes
>>>>      Advertised link modes:  100baseT/Full
>>>>                              1000baseT/Full
>>>>                              10000baseT/Full
>>>>      Advertised pause frame use: No
>>>>      Advertised auto-negotiation: Yes
>>>>      Speed: Unknown!
>>>>      Duplex: Unknown! (255)
>>>>      Port: Twisted Pair
>>>>      PHYAD: 0
>>>>      Transceiver: external
>>>>      Auto-negotiation: on
>>>>      MDI-X: Unknown
>>>>      Supports Wake-on: d
>>>>      Wake-on: d
>>>>      Current message level: 0x00000007 (7)
>>>>                     drv probe link
>>>>      Link detected: yes
>>> The speed and the link state here are reported from
>>> different sources:
>>>
>>>>      Link detected: yes
>>> Comes from a netif_carrier_ok() check. This is done via
>> ethtool_op_get_link().
>>> Only the speed is reported through the LINKS register - that is why it is
>> reported
>>> as "Unknown" - in other words link_up is false.
>>>
>>> This is a trace from the case where the bonding driver reports 0 Mbps:
>>>
>>>      kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = true
>>>      kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task:
>> eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>>>        kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings:
>> eth1: link_speed = 80, link_up = false
>>>     NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0:
>> link status definitely up for interface eth1, 0 Mbps full duplex
>>>     NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = true
>>>      kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = true
>>
>> Hi, Emil
>>
>> Thanks for your feedback.
>>  From your log, I think the following can explain why bonding driver can
>> not get speed.
>>
>> bonding                           ixgbe
>> .                                   .
>> .      <-----------------------   NETDEV_UP
>> .                                   .
>> bond_slave_netdev_event           NETDEV_DOWN
>> .                                   .
>> .                                   .
>> .                                   .
>> NETDEV_UP                           .
>> .              ----------------> get_settings
>>                                      .
>> speed unknown  <---------------  link_up false
>> .
>> .
>> link_up = true
>> link_speed = unknown
>>
>> In the above, ixgbe is up and bonding gets this message, then bonding
>> calls bond_slave_netdev_event while ixgbe is down.
>> In bond_slave_netdev_event, bonding call get_settings in ixgbe to get
>> link_speed. Since now ixgbe is down, so link_speed is
>> unknown. In the end, bonding get the final state of ixgbe as link_up
>> without link_speed.
>>
>> If you agree with me, would you like to help me to make tests with the
>> following patch?
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> index d681273..3efc4d8 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> @@ -285,27 +285,24 @@ static int ixgbe_get_settings(struct net_device
>> *netdev,
>>          }
>>
>>          hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
>> -       if (link_up) {
>> -               switch (link_speed) {
>> -               case IXGBE_LINK_SPEED_10GB_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_10000);
>> -                       break;
>> -               case IXGBE_LINK_SPEED_2_5GB_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_2500);
>> -                       break;
>> -               case IXGBE_LINK_SPEED_1GB_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_1000);
>> -                       break;
>> -               case IXGBE_LINK_SPEED_100_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_100);
>> -                       break;
>> -               default:
>> -                       break;
>> -               }
>> -               ecmd->duplex = DUPLEX_FULL;
>> -       } else {
>> -               ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
>> +
>> +       ecmd->duplex = DUPLEX_FULL;
>> +       switch (link_speed) {
>> +       case IXGBE_LINK_SPEED_10GB_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_10000);
>> +               break;
>> +       case IXGBE_LINK_SPEED_2_5GB_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_2500);
>> +               break;
>> +       case IXGBE_LINK_SPEED_1GB_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_1000);
>> +               break;
>> +       case IXGBE_LINK_SPEED_100_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_100);
>> +               break;
>> +       default:
>>                  ecmd->duplex = DUPLEX_UNKNOWN;
>> +               break;
>>          }
>>
>>          return 0;
> This will break speed reporting. You cannot ignore link_up.
> The speed is only valid when the link_up bit is set.
Hi, Emil

Thanks for your reply.
But in this function ixgbe_check_mac_link_generic. The speed is reported 
whether the link_up is true or false.
I followed this function.

Thanks a lot.
Zhu Yanjun
>
> Thanks,
> Emil
>
>
diff mbox

Patch

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index d681273..3efc4d8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -285,27 +285,24 @@  static int ixgbe_get_settings(struct net_device 
*netdev,
         }

         hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
-       if (link_up) {
-               switch (link_speed) {
-               case IXGBE_LINK_SPEED_10GB_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_10000);
-                       break;
-               case IXGBE_LINK_SPEED_2_5GB_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_2500);
-                       break;
-               case IXGBE_LINK_SPEED_1GB_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_1000);
-                       break;
-               case IXGBE_LINK_SPEED_100_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_100);
-                       break;
-               default:
-                       break;
-               }
-               ecmd->duplex = DUPLEX_FULL;
-       } else {
-               ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
+
+       ecmd->duplex = DUPLEX_FULL;
+       switch (link_speed) {
+       case IXGBE_LINK_SPEED_10GB_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_10000);
+               break;
+       case IXGBE_LINK_SPEED_2_5GB_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_2500);
+               break;
+       case IXGBE_LINK_SPEED_1GB_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_1000);
+               break;
+       case IXGBE_LINK_SPEED_100_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_100);
+               break;
+       default:
                 ecmd->duplex = DUPLEX_UNKNOWN;
+               break;
         }