Message ID | 520205889.412690.1454977910808.JavaMail.zimbra@xes-inc.com |
---|---|
State | Changes Requested |
Delegated to: | Jeff Kirsher |
On Mon, 2016-02-08 at 18:31 -0600, Aaron Sierra wrote:
> From: Joe Schultz <jschultz@xes-inc.com>
>
> Add an offline diagnostic test for the I210 internal PHY which checks
> for cable faults and reports the distance along the cable where the
> fault was detected. Fault types detected include open, short, and
> cross-pair short.
>
> Signed-off-by: Joe Schultz <jschultz@xes-inc.com>
> Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
> ---
> v2 - account for changes made by this patch in dev-queue:
>        drivers/net: get rid of unnecessary initializations in
>        .get_drvinfo()
> v3 - fix uninitialized variable compile warning
>    - remove unneeded igb_cable_fault_test_prep() function
>    - don't add unused define to e1000_defines.h
>    - only run cable diagnostic if link test fails
> v4 - only set no-fault, link-present cable distance for I210
>
>  drivers/net/ethernet/intel/igb/e1000_defines.h |  12 +-
>  drivers/net/ethernet/intel/igb/igb_ethtool.c   | 186 ++++++++++++++++++++++++-
>  2 files changed, 192 insertions(+), 6 deletions(-)

I try not to be a checkpatch.pl stickler, but your patch clearly has
some style issues that checkpatch.pl complains about and need to be
fixed. I have gone ahead and applied your patch to my next-queue tree
(dev-queue branch) so that Aaron can test the functional changes of
your patch.

So I will require a v5 to fix the coding style issues that
checkpatch.pl whines about, but before re-spinning your patch, let's
wait to see if Aaron finds any other issues with your patch.
> From: Kirsher, Jeffrey T
> Sent: Tuesday, February 09, 2016 2:47 AM
> To: Aaron Sierra; intel-wired-lan@lists.osuosl.org
> Cc: Wyborny, Carolyn; Brown, Aaron F; Joe Schultz
> Subject: Re: [PATCH v4] igb: Add I210 cable fault detection to self test
>
> On Mon, 2016-02-08 at 18:31 -0600, Aaron Sierra wrote:
> > ...
>
> I try not to be a checkpatch.pl stickler, but your patch clearly has
> some style issues that checkpatch.pl complains about and need to be
> fixed. I have gone ahead and applied your patch to my next-queue tree
> (dev-queue branch) so that Aaron can test the functional changes of
> your patch.
>
> So I will require a v5 to fix the coding style issues that
> checkpatch.pl whines about, but before re-spinning your patch, let's
> wait to see if Aaron finds any other issues with your patch.

I no longer see the kernel panics on other parts with this patch, and
functionally it seems good on all the parts I have scrounged up. Thanks.

But I still see "-1" for the Pair fault distances when a good cable is
connected and diags are run offline:

...
Pair D cable fault (offline) 0
Pair A fault distance -1
Pair B fault distance -1
Pair C fault distance -1
Pair D fault distance -1
Pair A fault open 0
...

I'm not sure of the intent here, but having something other than 0 looks /
feels like an error to me. It's also confusing, as those values come up as 0
if the diags are run with the online keyword with a valid cable connection.
The diag session itself (ethtool -t ethX offline) returns 0, so the test as
a whole is registering as passing.
----- Original Message -----
> From: "Aaron F Brown" <aaron.f.brown@intel.com>
> Sent: Saturday, February 13, 2016 12:27:19 AM
>
> > From: Kirsher, Jeffrey T
> > Sent: Tuesday, February 09, 2016 2:47 AM
> > To: Aaron Sierra; intel-wired-lan@lists.osuosl.org
> > Cc: Wyborny, Carolyn; Brown, Aaron F; Joe Schultz
> > Subject: Re: [PATCH v4] igb: Add I210 cable fault detection to self test
> >
> > ...
>
> I no longer see the kernel panics on other parts with this patch, and
> functionally it seems good on all the parts I have scrounged up. Thanks.
>
> But I still see "-1" for the Pair fault distances when a good cable is
> connected and diags are run offline:
> ...
> Pair D cable fault (offline) 0
> Pair A fault distance -1
> Pair B fault distance -1
> Pair C fault distance -1
> Pair D fault distance -1
> Pair A fault open 0
> ...
>
> I'm not sure of the intent here, but having something other than 0 looks /
> feels like an error to me.

I wanted to avoid using values for the fault distance that could reasonably
be misinterpreted as presence of a fault. For instance, if no cable is
connected, open faults should be detected on all pairs. The distance to those
faults will be 0 because the trace lengths to the connector are almost
certainly shorter than 1 meter. If link is present, fault distance should
really be undefined, but is displayed as a signed integer.

> It's also confusing, as those values come up as 0 if the diags are run with
> the online keyword with a valid cable connection.
> The diag session itself (ethtool -t ethX offline) returns 0, so the test as
> a whole is registering as passing.

I can see this either way and defaulting to zero would be easier.

This difference in result output is unintentional, but this leads me to
wonder what reasonable output for the online case should look like, since
none of the fault tests will be run in that case. I think these strings
should include an (offline) tag, like the original tests.

In order to do that I'm inclined to merge inter and intra pair shorts into a
single short output, like this:

Register test (offline)
Eeprom test (offline)
Interrupt test (offline)
Loopback test (offline)
Link test (on/offline)
Pair A cable fault (offline)
Pair B cable fault (offline)
Pair C cable fault (offline)
Pair D cable fault (offline)
Pair A fault distance (offline)
Pair B fault distance (offline)
Pair C fault distance (offline)
Pair D fault distance (offline)
Pair A fault open (offline)
Pair B fault open (offline)
Pair C fault open (offline)
Pair D fault open (offline)
Pair A fault short (offline)
Pair B fault short (offline)
Pair C fault short (offline)
Pair D fault short (offline)

Please let me know what you think.

-Aaron S.
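The merged layout proposed above can be sketched outside the driver. This is only an illustration, not the v5 patch: the `PCDR_*` values mirror the `I347AT4_PCDR_*` defines from the diff, while `struct pair_result` and `decode_pair()` are hypothetical names introduced here to show both short types folding into one flag.

```c
#include <assert.h>

/* Per-pair result codes, same values as I347AT4_PCDR_* in the patch. */
enum {
	PCDR_CABLE_OK          = 0x1,
	PCDR_CABLE_OPEN        = 0x2,
	PCDR_CABLE_SHORT       = 0x3,
	PCDR_CABLE_CROSS_SHORT = 0x4,
};

/* Hypothetical per-pair summary with a single merged "short" flag. */
struct pair_result {
	int fault;   /* 1 = fault found, 0 = none, -1 = unrecognized code */
	int open;
	int shorted; /* set for intra-pair and inter-pair shorts alike */
};

/* Map one 4-bit result code onto the merged output layout. */
static struct pair_result decode_pair(unsigned int code)
{
	struct pair_result r = { 0, 0, 0 };

	assert(code <= 0xf); /* codes are one nibble wide */

	switch (code) {
	case PCDR_CABLE_OK:
		break;
	case PCDR_CABLE_OPEN:
		r.fault = 1;
		r.open = 1;
		break;
	case PCDR_CABLE_SHORT:
	case PCDR_CABLE_CROSS_SHORT:
		r.fault = 1;
		r.shorted = 1;
		break;
	default:
		/* unknown code: flag every field, as the patch does */
		r.fault = r.open = r.shorted = -1;
		break;
	}
	return r;
}
```

The trade-off is the one raised in the thread: the merged flag makes room for an (offline) tag in each string, at the cost of no longer telling the two kinds of short apart.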
> -----Original Message-----
> From: Aaron Sierra [mailto:asierra@xes-inc.com]
> Sent: Friday, February 19, 2016 10:05 AM
> To: Brown, Aaron F <aaron.f.brown@intel.com>
> Cc: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; intel-wired-lan@lists.osuosl.org; Wyborny, Carolyn <carolyn.wyborny@intel.com>; Joe Schultz <jschultz@xes-inc.com>
> Subject: Re: [PATCH v4] igb: Add I210 cable fault detection to self test
>
> ----- Original Message -----
> > From: "Aaron F Brown" <aaron.f.brown@intel.com>
> > Sent: Saturday, February 13, 2016 12:27:19 AM
> >
> > > From: Kirsher, Jeffrey T
> > > Sent: Tuesday, February 09, 2016 2:47 AM
> > > To: Aaron Sierra; intel-wired-lan@lists.osuosl.org
> > > Cc: Wyborny, Carolyn; Brown, Aaron F; Joe Schultz
> > > Subject: Re: [PATCH v4] igb: Add I210 cable fault detection to self
> > > test
> > >
> > > On Mon, 2016-02-08 at 18:31 -0600, Aaron Sierra wrote:
> > > > ...
> >
> > But I still see "-1" for the Pair fault distances when a good cable
> > is connected and diags are run offline:
> > ...
> > Pair D cable fault (offline) 0
> > Pair A fault distance -1
> > Pair B fault distance -1
> > Pair C fault distance -1
> > Pair D fault distance -1
> > Pair A fault open 0
> > ...
> >
> > I'm not sure of the intent here, but having something other than 0 looks
> > / feels like an error to me.
>
> I wanted to avoid using values for the fault distance that could reasonably be
> misinterpreted as presence of a fault. For instance, if no cable is connected,
> open faults should be detected on all pairs. The distance to those faults will
> be 0 because the trace lengths to the connector are almost certainly shorter
> than 1 meter. If link is present, fault distance should really be undefined, but
> is displayed as a signed integer.

That makes sense, and I tend to agree. It's just that the Unix convention of 0
meaning passed / returned properly is so ingrained in me that my first instinct
is that something failed when I get something other than 0 back.

> > It's also confusing, as those values come up as 0 if the diags are run
> > with the online keyword with a valid cable connection.
> > The diag session itself (ethtool -t ethX offline) returns 0, so the
> > test as a whole is registering as passing.
>
> I can see this either way and defaulting to zero would be easier.
>
> This difference in result output is unintentional, but this leads me to wonder
> what reasonable output for the online case should look like, since none of
> the fault tests will be run in that case.

I'm not really sure, but I feel it should be consistent with the results of an
offline test that is run with a good cable attached. Anyone out there have an
opinion on this and want to weigh in?

> I think these strings should include an
> (offline) tag, like the original tests.

I agree, the offline tag makes sense. Would it be possible to suppress the
fault distance messages if no fault is found? I'm not all that familiar with
ethtool's inner workings.

> In order to do that I'm inclined to merge inter and intra pair shorts into a
> single short output, like this:
>
> Register test (offline)
> Eeprom test (offline)
> Interrupt test (offline)
> Loopback test (offline)
> Link test (on/offline)
> Pair A cable fault (offline)
> Pair B cable fault (offline)
> Pair C cable fault (offline)
> Pair D cable fault (offline)
> Pair A fault distance (offline)
> Pair B fault distance (offline)
> Pair C fault distance (offline)
> Pair D fault distance (offline)
> Pair A fault open (offline)
> Pair B fault open (offline)
> Pair C fault open (offline)
> Pair D fault open (offline)
> Pair A fault short (offline)
> Pair B fault short (offline)
> Pair C fault short (offline)
> Pair D fault short (offline)
>
> Please let me know what you think.

I think that's just a matter of how much info you want to show and how you
envision people using the info. If differentiating between inter and intra is
likely to help someone track a problem down, by all means keep both. If you
expect people to simply toss the cable and try a new one, including both forms
is probably just extra noise.

Thanks
Aaron B.
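The `-1` versus `0` exchange turns on how a sentinel stored in an unsigned 64-bit result slot is rendered. The sketch below is a standalone illustration under stated assumptions: `DISTANCE_UNDEFINED`, `format_signed()`, `format_unsigned()` and `distance_is_defined()` are hypothetical names, and nothing here is taken from ethtool's own printing code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sentinel: what assigning -1 to a u64 slot stores. */
#define DISTANCE_UNDEFINED ((uint64_t)-1)

/*
 * Render a slot as a signed value. The unsigned-to-signed conversion
 * yields -1 for the sentinel on two's-complement platforms.
 */
static int format_signed(uint64_t slot, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRId64, (int64_t)slot);
}

/* Render the same slot as unsigned, for comparison. */
static int format_unsigned(uint64_t slot, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, slot);
}

/*
 * A consumer that knows the sentinel could skip such slots rather
 * than print them (assumption: this is not existing tool behavior).
 */
static int distance_is_defined(uint64_t slot)
{
	return slot != DISTANCE_UNDEFINED;
}
```

A distance of 0 stays a legitimate measurement under this scheme (an open fault right at the connector), which is the ambiguity Aaron S. wanted to avoid by not defaulting to zero.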
diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index 240902e..707e1ba 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -932,6 +932,7 @@
 #define I347AT4_PCDL1 0x11 /* Pair 1 PHY Cable Diagnostics Length */
 #define I347AT4_PCDL2 0x12 /* Pair 2 PHY Cable Diagnostics Length */
 #define I347AT4_PCDL3 0x13 /* Pair 3 PHY Cable Diagnostics Length */
+#define I347AT4_PCDR 0x14 /* PHY Cable Diagnostics Results */
 #define I347AT4_PCDC 0x15 /* PHY Cable Diagnostics Control */
 #define I347AT4_PAGE_SELECT 0x16
 
@@ -952,7 +953,16 @@
 #define I347AT4_PSCR_DOWNSHIFT_8X 0x7000
 
 /* i347-AT4 PHY Cable Diagnostics Control */
-#define I347AT4_PCDC_CABLE_LENGTH_UNIT 0x0400 /* 0=cm 1=meters */
+#define I347AT4_PCDC_CABLE_LENGTH_UNIT 0x0400 /* 0=cm 1=meters */
+#define I347AT4_PCDC_CABLE_DIAG_STATUS 0x0800
+#define I347AT4_PCDC_DISABLE_CROSS_PAIR 0x2000
+#define I347AT4_PCDC_RUN_TEST 0x8000
+
+/* i347-AT4 PHY Cable Diagnostics Results */
+#define I347AT4_PCDR_CABLE_OK 0x0001 /* No faults detected on pair */
+#define I347AT4_PCDR_CABLE_OPEN 0x0002 /* Open pair detected */
+#define I347AT4_PCDR_CABLE_SHORT 0x0003 /* Shorted pair detected */
+#define I347AT4_PCDR_CABLE_CROSS_SHORT 0x0004 /* Cross-pair short detected */
 
 /* Marvell 1112 only registers */
 #define M88E1112_VCT_DSP_DISTANCE 0x001A
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 1d329f1..a135bf3 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -132,7 +132,28 @@ enum igb_diagnostics_results {
 	TEST_EEP,
 	TEST_IRQ,
 	TEST_LOOP,
-	TEST_LINK
+	TEST_LINK,
+	/* I210 superset */
+	TEST_FAULT_A,
+	TEST_FAULT_B,
+	TEST_FAULT_C,
+	TEST_FAULT_D,
+	TEST_LENGTH_A,
+	TEST_LENGTH_B,
+	TEST_LENGTH_C,
+	TEST_LENGTH_D,
+	TEST_OPEN_A,
+	TEST_OPEN_B,
+	TEST_OPEN_C,
+	TEST_OPEN_D,
+	TEST_SHORT_A,
+	TEST_SHORT_B,
+	TEST_SHORT_C,
+	TEST_SHORT_D,
+	TEST_CROSS_A,
+	TEST_CROSS_B,
+	TEST_CROSS_C,
+	TEST_CROSS_D
 };
 
 static const char igb_gstrings_test[][ETH_GSTRING_LEN] = {
@@ -142,7 +163,50 @@ static const char igb_gstrings_test[][ETH_GSTRING_LEN] = {
 	[TEST_LOOP] = "Loopback test (offline)",
 	[TEST_LINK] = "Link test (on/offline)"
 };
+
+static const char igb_i210_gstrings_test[][ETH_GSTRING_LEN] = {
+	[TEST_REG] = "Register test (offline)",
+	[TEST_EEP] = "Eeprom test (offline)",
+	[TEST_IRQ] = "Interrupt test (offline)",
+	[TEST_LOOP] = "Loopback test (offline)",
+	[TEST_LINK] = "Link test (on/offline)",
+	[TEST_FAULT_A] = "Pair A cable fault (offline)",
+	[TEST_FAULT_B] = "Pair B cable fault (offline)",
+	[TEST_FAULT_C] = "Pair C cable fault (offline)",
+	[TEST_FAULT_D] = "Pair D cable fault (offline)",
+	[TEST_LENGTH_A] = "Pair A fault distance ",
+	[TEST_LENGTH_B] = "Pair B fault distance ",
+	[TEST_LENGTH_C] = "Pair C fault distance ",
+	[TEST_LENGTH_D] = "Pair D fault distance ",
+	[TEST_OPEN_A] = "Pair A fault open ",
+	[TEST_OPEN_B] = "Pair B fault open ",
+	[TEST_OPEN_C] = "Pair C fault open ",
+	[TEST_OPEN_D] = "Pair D fault open ",
+	[TEST_SHORT_A] = "Pair A fault intra-pair short ",
+	[TEST_SHORT_B] = "Pair B fault intra-pair short ",
+	[TEST_SHORT_C] = "Pair C fault intra-pair short ",
+	[TEST_SHORT_D] = "Pair D fault intra-pair short ",
+	[TEST_CROSS_A] = "Pair A fault inter-pair short ",
+	[TEST_CROSS_B] = "Pair B fault inter-pair short ",
+	[TEST_CROSS_C] = "Pair C fault inter-pair short ",
+	[TEST_CROSS_D] = "Pair D fault inter-pair short "
+};
+
 #define IGB_TEST_LEN (sizeof(igb_gstrings_test) / ETH_GSTRING_LEN)
+#define IGB_I210_TEST_LEN (sizeof(igb_i210_gstrings_test) / ETH_GSTRING_LEN)
+
+static inline bool igb_has_i210_cable_fault_test(struct igb_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+
+	if (hw->phy.media_type != e1000_media_type_copper)
+		return false;
+
+	if (hw->phy.id == I210_I_PHY_ID)
+		return true;
+
+	return false;
+}
 
 static int igb_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
 {
@@ -1983,6 +2047,97 @@ static int igb_link_test(struct igb_adapter *adapter, u64 *data)
 	return *data;
 }
 
+static int igb_cable_fault_test(struct igb_adapter *adapter,
+				struct ethtool_test *eth_test, u64 *data) {
+	struct e1000_hw *hw = &adapter->hw;
+	u16 old_pcdc, pcdc;
+	u16 pcdr;
+	u16 error_code = 0;
+	u32 timeout = 0;
+	s32 ret_val;
+	int i;
+
+	ret_val = igb_write_phy_reg(hw, I347AT4_PAGE_SELECT, 0x7);
+	if (ret_val)
+		goto no_set_page;
+
+	/* Save PCDC register and initiate immediate diagnostic */
+	ret_val = igb_read_phy_reg(hw, I347AT4_PCDC, &pcdc);
+	if (ret_val)
+		goto no_pcdc;
+
+	old_pcdc = pcdc;
+	pcdc &= ~I347AT4_PCDC_DISABLE_CROSS_PAIR;
+	pcdc |= I347AT4_PCDC_CABLE_LENGTH_UNIT | I347AT4_PCDC_RUN_TEST;
+
+	ret_val = igb_write_phy_reg(hw, I347AT4_PCDC, pcdc);
+	if (ret_val)
+		goto done;
+
+	/* Wait up to 1.5s for the results to be ready */
+	do {
+		ret_val = igb_read_phy_reg(hw, I347AT4_PCDC, &pcdc);
+		if (ret_val || timeout == 1500)
+			break;
+		udelay(1000);
+		timeout++;
+	} while (pcdc & I347AT4_PCDC_CABLE_DIAG_STATUS);
+
+	if (timeout >= 1500)
+		dev_warn(&adapter->pdev->dev,
+			 "Cable fault test timed out. Results may be invalid");
+
+	ret_val = igb_read_phy_reg(hw, I347AT4_PCDR, &pcdr);
+	if (ret_val)
+		goto done;
+
+	hw->phy.ops.get_cable_length(hw);
+
+	/* Iterate over each cable pair */
+	for (i = 0; i < 4; i++) {
+		data[TEST_LENGTH_A + i] = hw->phy.pair_length[i];
+
+		error_code = (pcdr >> (i * 4)) & 0xf;
+		switch (error_code) {
+		case I347AT4_PCDR_CABLE_OK:
+			data[TEST_FAULT_A + i] = 0;
+			data[TEST_LENGTH_A + i] = -1;
+			/* don't assign ret_val */
+			break;
+		case I347AT4_PCDR_CABLE_OPEN:
+			data[TEST_FAULT_A + i] = 1;
+			data[TEST_OPEN_A + i] = 1;
+			ret_val = -1;
+			break;
+		case I347AT4_PCDR_CABLE_SHORT:
+			data[TEST_FAULT_A + i] = 1;
+			data[TEST_SHORT_A + i] = 1;
+			ret_val = -1;
+			break;
+		case I347AT4_PCDR_CABLE_CROSS_SHORT:
+			data[TEST_FAULT_A + i] = 1;
+			data[TEST_CROSS_A + i] = 1;
+			ret_val = -1;
+			break;
+		default:
+			data[TEST_FAULT_A + i] = -1;
+			data[TEST_LENGTH_A + i] = -1;
+			data[TEST_OPEN_A + i] = -1;
+			data[TEST_SHORT_A + i] = -1;
+			data[TEST_CROSS_A + i] = -1;
+			ret_val = -1;
+		}
+	}
+
+done:
+	/* Restore PCDC */
+	igb_write_phy_reg(hw, I347AT4_PCDC, old_pcdc);
+no_pcdc:
+	igb_write_phy_reg(hw, I347AT4_PAGE_SELECT, 0);
+no_set_page:
+	return ret_val;
+}
+
 static void igb_diag_test(struct net_device *netdev,
 			  struct ethtool_test *eth_test, u64 *data)
 {
@@ -1993,6 +2148,11 @@ static void igb_diag_test(struct net_device *netdev,
 
 	set_bit(__IGB_TESTING, &adapter->state);
 
+	if (igb_has_i210_cable_fault_test(adapter)) {
+		memset(&data[TEST_FAULT_A], 0x0,
+		       sizeof(u64) * (IGB_I210_TEST_LEN - IGB_TEST_LEN));
+	}
+
 	/* can't do offline tests on media switching devices */
 	if (adapter->hw.dev_spec._82575.mas_capable)
 		eth_test->flags &= ~ETH_TEST_FL_OFFLINE;
@@ -2012,8 +2172,19 @@ static void igb_diag_test(struct net_device *netdev,
 		/* Link test performed before hardware reset so autoneg doesn't
 		 * interfere with test result
 		 */
-		if (igb_link_test(adapter, &data[TEST_LINK]))
+		if (!igb_has_i210_cable_fault_test(adapter)) {
+			if (igb_link_test(adapter, &data[TEST_LINK]))
+				eth_test->flags |= ETH_TEST_FL_FAILED;
+		} else if (igb_link_test(adapter, &data[TEST_LINK])) {
 			eth_test->flags |= ETH_TEST_FL_FAILED;
+			igb_cable_fault_test(adapter, eth_test, data);
+		} else {
+			/* I210: link present, no fault */
+			data[TEST_LENGTH_A] = -1;
+			data[TEST_LENGTH_B] = -1;
+			data[TEST_LENGTH_C] = -1;
+			data[TEST_LENGTH_D] = -1;
+		}
 
 		if (if_running)
 			/* indicate we're in test mode */
@@ -2270,7 +2441,8 @@ static int igb_get_sset_count(struct net_device *netdev, int sset)
 	case ETH_SS_STATS:
 		return IGB_STATS_LEN;
 	case ETH_SS_TEST:
-		return IGB_TEST_LEN;
+		return igb_has_i210_cable_fault_test(netdev_priv(netdev)) ?
+			IGB_I210_TEST_LEN : IGB_TEST_LEN;
 	default:
 		return -ENOTSUPP;
 	}
@@ -2340,8 +2512,12 @@ static void igb_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
 
 	switch (stringset) {
 	case ETH_SS_TEST:
-		memcpy(data, *igb_gstrings_test,
-			IGB_TEST_LEN*ETH_GSTRING_LEN);
+		if (igb_has_i210_cable_fault_test(adapter))
+			memcpy(data, *igb_i210_gstrings_test,
+			       IGB_I210_TEST_LEN*ETH_GSTRING_LEN);
+		else
+			memcpy(data, *igb_gstrings_test,
+			       IGB_TEST_LEN*ETH_GSTRING_LEN);
 		break;
 	case ETH_SS_STATS:
 		for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
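
Two steps in `igb_cable_fault_test()` can be exercised outside the kernel: waiting for the diagnostic status bit to clear, and slicing the 16-bit results word into one 4-bit code per pair. The sketch below is a userspace illustration only. The register reader is a hypothetical callback, `struct fake_phy` simulates the hardware, and the bounded loop is one possible shape for the wait, not the patch's exact loop.

```c
#include <assert.h>
#include <stdint.h>

#define DIAG_STATUS_BUSY 0x0800 /* same bit as I347AT4_PCDC_CABLE_DIAG_STATUS */
#define NUM_PAIRS 4

/* Extract the 4-bit result code for one pair from the results word. */
static unsigned int pair_code(uint16_t pcdr, unsigned int pair)
{
	assert(pair < NUM_PAIRS);
	return (pcdr >> (pair * 4)) & 0xf;
}

/* Hypothetical reader: returns 0 on success or a negative error. */
typedef int (*read_reg_fn)(void *ctx, uint16_t *val);

/*
 * Poll until the busy bit clears or max_polls reads have been made.
 * Returns 0 once idle, 1 on timeout, or the reader's negative error.
 */
static int poll_until_idle(read_reg_fn read_reg, void *ctx,
			   unsigned int max_polls)
{
	uint16_t val;
	unsigned int i;
	int err;

	for (i = 0; i < max_polls; i++) {
		err = read_reg(ctx, &val);
		if (err)
			return err;
		if (!(val & DIAG_STATUS_BUSY))
			return 0;
		/* a real driver would delay or sleep between reads */
	}
	return 1;
}

/* Simulated PHY that reports busy for a fixed number of reads. */
struct fake_phy {
	unsigned int busy_reads;
};

static int fake_read(void *ctx, uint16_t *val)
{
	struct fake_phy *phy = ctx;

	if (phy->busy_reads > 0) {
		phy->busy_reads--;
		*val = DIAG_STATUS_BUSY;
	} else {
		*val = 0;
	}
	return 0;
}
```

Returning a distinct value for the timeout lets the caller decide whether to read the results register at all, whereas the patch warns and reads it regardless.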