diff mbox

Fix link test for e1000 and e1000e when iface is down

Message ID 200902101435.58331.adetsch@br.ibm.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Andre Detsch Feb. 10, 2009, 4:35 p.m. UTC
When running ethtool -t on an interface that was just
brought down, the link test was failing. That's because
we need to perform a reset on the interface to be able
to check the link status correctly. There is no problem
on doing such reset right before the test, as the link
test routine is prepared to wait for autoneg to complete
if that is the case.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
---

The bug can be produced with the following commands
(assuming eth0 is an e1000 or e1000e iface):
 ifconfig eth0 up;
 ifconfig eth0 down;
 ethtool -t eth0;

 drivers/net/e1000/e1000_ethtool.c |   10 ++++++----
 drivers/net/e1000e/ethtool.c      |   10 +++++-----
 2 files changed, 11 insertions(+), 9 deletions(-)

Comments

David Miller Feb. 13, 2009, 12:29 a.m. UTC | #1
From: Andre Detsch <adetsch@br.ibm.com>
Date: Tue, 10 Feb 2009 14:35:58 -0200

> 
> When running ethtool -t on an interface that was just
> brought down, the link test was failing. That's because
> we need to perform a reset on the interface to be able
> to check the link status correctly. There is no problem
> on doing such reset right before the test, as the link
> test routine is prepared to wait for autoneg to complete
> if that is the case.
> 
> Signed-off-by: Andre Detsch <adetsch@br.ibm.com>

Can I get some Intel ACKs on this?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kirsher, Jeffrey T Feb. 13, 2009, 1:10 a.m. UTC | #2
On Tue, Feb 10, 2009 at 8:35 AM, Andre Detsch <adetsch@br.ibm.com> wrote:
>
> When running ethtool -t on an interface that was just
> brought down, the link test was failing. That's because
> we need to perform a reset on the interface to be able
> to check the link status correctly. There is no problem
> on doing such reset right before the test, as the link
> test routine is prepared to wait for autoneg to complete
> if that is the case.
>
> Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
> ---

I guess this comes down to what the definition of a Link test should
be doing.  My $0.02 is that it should be testing if the interface has
a link, in which case if you ifdown the interface before running the
Link test, I would expect it to fail.

With this patch, if you bring down the device and run the ethtool diag
tests, the Link test would come back as passing which is something I
would not expect.

NAK.
David Miller Feb. 13, 2009, 1:17 a.m. UTC | #3
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 12 Feb 2009 17:10:22 -0800

> On Tue, Feb 10, 2009 at 8:35 AM, Andre Detsch <adetsch@br.ibm.com> wrote:
> >
> > When running ethtool -t on an interface that was just
> > brought down, the link test was failing. That's because
> > we need to perform a reset on the interface to be able
> > to check the link status correctly. There is no problem
> > on doing such reset right before the test, as the link
> > test routine is prepared to wait for autoneg to complete
> > if that is the case.
> >
> > Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
> > ---
> 
> I guess this comes down to what the definition of a Link test should
> be doing.  My $0.02 is that it should be testing if the interface has
> a link, in which case if you ifdown the interface before running the
> Link test, I would expect it to fail.
> 
> With this patch, if you bring down the device and run the ethtool diag
> tests, the Link test would come back as passing which is something I
> would not expect.
> 
> NAK.

I have to agree with Jeff on this one.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andre Detsch Feb. 13, 2009, 12:08 p.m. UTC | #4
Jeff Kirsher wrote:
> I guess this comes down to what the definition of a Link test should
> be doing.  My $0.02 is that it should be testing if the interface has
> a link, in which case if you ifdown the interface before running the
> Link test, I would expect it to fail.
> 
> With this patch, if you bring down the device and run the ethtool diag
> tests, the Link test would come back as passing which is something I
> would not expect.

With the current code, e1000_reset is called during the test anyway, so 
ethtool -t fails only the first time it is run (after ifconfig down). Is that 
the expected behavior for the test?


# ifconfig eth0 up; ifconfig eth0 down; ethtool -t eth0

The test result is FAIL
The test extra info:
Register test  (offline)         0
Eeprom test    (offline)         0
Interrupt test (offline)         0
Loopback test  (offline)         0
Link test   (on/offline)         1


# ethtool -t eth0

The test result is PASS
The test extra info:
Register test  (offline)         0
Eeprom test    (offline)         0
Interrupt test (offline)         0
Loopback test  (offline)         0
Link test   (on/offline)         0


Thanks for the feedback.
Kirsher, Jeffrey T Feb. 13, 2009, 12:52 p.m. UTC | #5
On Fri, Feb 13, 2009 at 4:08 AM, André Detsch <adetsch@br.ibm.com> wrote:
> Jeff Kirsher wrote:
>>
>> I guess this comes down to what the definition of a Link test should
>> be doing.  My $0.02 is that it should be testing if the interface has
>> a link, in which case if you ifdown the interface before running the
>> Link test, I would expect it to fail.
>>
>> With this patch, if you bring down the device and run the ethtool diag
>> tests, the Link test would come back as passing which is something I
>> would not expect.
>
> With the current code, e1000_reset is called during the test anyway, so
> ethtool -t fails only the first time it is run (after ifconfig down). Is
> that the expected behavior for the test?
>
>
> # ifconfig eth0 up; ifconfig eth0 down; ethtool -t eth0
>
> The test result is FAIL
> The test extra info:
> Register test  (offline)         0
> Eeprom test    (offline)         0
> Interrupt test (offline)         0
> Loopback test  (offline)         0
> Link test   (on/offline)         1
>
>
> # ethtool -t eth0
>
> The test result is PASS
> The test extra info:
> Register test  (offline)         0
> Eeprom test    (offline)         0
> Interrupt test (offline)         0
> Loopback test  (offline)         0
> Link test   (on/offline)         0
>
>
> Thanks for the feedback.
>
> --
> André Detsch
> Kernel Software Engineer
> Linux Technology Center Brazil
> --

Yes, this is expected.

We want the Link test to correctly report whether or not the interface
currently has link, whether or not the user manually downed the
interface.  The reason for resetting is because during the testing,
the hardware settings may be changed to complete such tests as the
Loopback and Interrupt tests so we need to reset the hardware settings
to normal operation.  It has the added benefit, that if the user had
not manually downed the interface, and the interface had gone down for
some unknown reason, the reset attempts to bring the interface back
up.

For an example, if the link went down for some reason and the user had
not downed the interface manually and were to run the ethtool tests to
see what was wrong.  Your patch would report the link test as a pass
(if after the reset, the link was re-established).  This would have
falsely reported the Link status when the ethtool tests were
initiated.
diff mbox

Patch

diff --git a/drivers/net/e1000/e1000_ethtool.c b/drivers/net/e1000/e1000_ethtool.c
index c854c96..bed7c88 100644
--- a/drivers/net/e1000/e1000_ethtool.c
+++ b/drivers/net/e1000/e1000_ethtool.c
@@ -1646,16 +1646,18 @@  static void e1000_diag_test(struct net_device *netdev,
 
 		DPRINTK(HW, INFO, "offline testing starting\n");
 
-		/* Link test performed before hardware reset so autoneg doesn't
-		 * interfere with test result */
+		if (!if_running)
+			e1000_reset(adapter);
+
+		/* Due to the reset above, autoneg may be undergoing.
+		 * But link test is prepared to handle such situation.
+		 */
 		if (e1000_link_test(adapter, &data[4]))
 			eth_test->flags |= ETH_TEST_FL_FAILED;
 
 		if (if_running)
 			/* indicate we're in test mode */
 			dev_close(netdev);
-		else
-			e1000_reset(adapter);
 
 		if (e1000_reg_test(adapter, &data[0]))
 			eth_test->flags |= ETH_TEST_FL_FAILED;
diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index e48956d..d14a5be 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -1647,9 +1647,11 @@  static void e1000_diag_test(struct net_device *netdev,
 
 		e_info("offline testing starting\n");
 
-		/*
-		 * Link test performed before hardware reset so autoneg doesn't
-		 * interfere with test result
+		if (!if_running)
+			e1000e_reset(adapter);
+
+		/* Due to the reset above, autoneg may be undergoing.
+		 * But link test is prepared to handle such situation.
 		 */
 		if (e1000_link_test(adapter, &data[4]))
 			eth_test->flags |= ETH_TEST_FL_FAILED;
@@ -1657,8 +1659,6 @@  static void e1000_diag_test(struct net_device *netdev,
 		if (if_running)
 			/* indicate we're in test mode */
 			dev_close(netdev);
-		else
-			e1000e_reset(adapter);
 
 		if (e1000_reg_test(adapter, &data[0]))
 			eth_test->flags |= ETH_TEST_FL_FAILED;