diff mbox

[1/2] gianfar: Deal with link state changes during GFAR_RESETTING dev state

Message ID 1486988529-24924-1-git-send-email-thomas.graziadei@omicronenergy.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

thomas.graziadei@omicronenergy.com Feb. 13, 2017, 12:22 p.m. UTC
From: Thomas Graziadei <thomas.graziadei@omicronenergy.com>

The link state is not correctly set in the case that the network driver
is reconfigured while the link state changes. The phy informs the gianfar
driver, but gfar_update_link_state just exits and therefore looses the
change event. The network driver remains in the old state until a new link
event is sent.

A trace log from a possible scenario at bootup, when the link state in the
network driver stays down even though the phy reports an up link. The test
sends a SIOCSHWTSTAMP ioctl at the right moment (which calls reset_gfar):
         ip-1196 [000]  5.389270: phy_start: state: READY -> UP
kworker/0:2-1195 [000]  5.389784: phy_start_aneg: state: UP -> AN
kworker/0:2-1195 [000]  5.389788: phy_state_machine: state: UP -> AN
kworker/0:2-1195 [000]  6.828064: adjust_link: eth0, link 0 -> 0
kworker/0:2-1195 [000]  6.828599: phy_state_machine: state: AN -> NOLINK
       test-1470 [000]  7.460422: reset_gfar: before locking GFAR_RESETTING
       test-1470 [000]  7.470806: phy_stop: state: NOLINK -> HALTED
       test-1470 [000]  7.478806: phy_start: state: HALTED -> RESUMING
kworker/0:2-1195 [000]  7.479478: adjust_link: eth0, link 0 -> 1
kworker/0:2-1195 [000]  7.479482: phy_state_machine: state: RESUMING -> RUNNING
       test-1470 [000]  7.479826: reset_gfar: after locking GFAR_RESETTING

To resolve the issue adjust_link is called after every GFAR_RESETTING lock
section. Adjust_link itself checks if anything has changed and updates the
link accordingly.

Signed-off-by: Thomas Graziadei <thomas.graziadei@omicronenergy.com>
---
 drivers/net/ethernet/freescale/gianfar.c         |  7 +++++++
 drivers/net/ethernet/freescale/gianfar_ethtool.c | 15 +++++++++++++++
 2 files changed, 22 insertions(+)

Comments

Claudiu Manoil Feb. 13, 2017, 5:10 p.m. UTC | #1
>-----Original Message-----
>From: Thomas Graziadei [mailto:thomas.graziadei@omicronenergy.com]
>Sent: Monday, February 13, 2017 2:22 PM
>To: claudiu.manoil@freescale.com; netdev@vger.kernel.org; linux-
>kernel@vger.kernel.org
>Cc: Thomas Graziadei <thomas.graziadei@omicronenergy.com>
>Subject: [PATCH 1/2] gianfar: Deal with link state changes during GFAR_RESETTING
>dev state
>
>From: Thomas Graziadei <thomas.graziadei@omicronenergy.com>
>
>The link state is not correctly set in the case that the network driver
>is reconfigured while the link state changes. The phy informs the gianfar
>driver, but gfar_update_link_state just exits and therefore looses the
>change event. The network driver remains in the old state until a new link
>event is sent.
>
>A trace log from a possible scenario at bootup, when the link state in the
>network driver stays down even though the phy reports an up link. The test
>sends a SIOCSHWTSTAMP ioctl at the right moment (which calls reset_gfar):
>         ip-1196 [000]  5.389270: phy_start: state: READY -> UP
>kworker/0:2-1195 [000]  5.389784: phy_start_aneg: state: UP -> AN
>kworker/0:2-1195 [000]  5.389788: phy_state_machine: state: UP -> AN
>kworker/0:2-1195 [000]  6.828064: adjust_link: eth0, link 0 -> 0
>kworker/0:2-1195 [000]  6.828599: phy_state_machine: state: AN -> NOLINK
>       test-1470 [000]  7.460422: reset_gfar: before locking GFAR_RESETTING
>       test-1470 [000]  7.470806: phy_stop: state: NOLINK -> HALTED
>       test-1470 [000]  7.478806: phy_start: state: HALTED -> RESUMING
>kworker/0:2-1195 [000]  7.479478: adjust_link: eth0, link 0 -> 1
>kworker/0:2-1195 [000]  7.479482: phy_state_machine: state: RESUMING ->
>RUNNING
>       test-1470 [000]  7.479826: reset_gfar: after locking GFAR_RESETTING
>
>To resolve the issue adjust_link is called after every GFAR_RESETTING lock
>section. Adjust_link itself checks if anything has changed and updates the
>link accordingly.
>

Hi,
Interesting findings. I need more time to check the patches. Btw, we don't 
use "//" for comments on netdev.

Thanks,
Claudiu
Claudiu Manoil Feb. 16, 2017, 5:41 p.m. UTC | #2
>-----Original Message-----
>From: Thomas Graziadei [mailto:thomas.graziadei@omicronenergy.com]
>Sent: Monday, February 13, 2017 2:22 PM
>To: claudiu.manoil@freescale.com; netdev@vger.kernel.org; linux-
>kernel@vger.kernel.org
>Cc: Thomas Graziadei <thomas.graziadei@omicronenergy.com>
>Subject: [PATCH 1/2] gianfar: Deal with link state changes during GFAR_RESETTING
>dev state
>
>From: Thomas Graziadei <thomas.graziadei@omicronenergy.com>
>

[...]

>
>To resolve the issue adjust_link is called after every GFAR_RESETTING lock
>section. Adjust_link itself checks if anything has changed and updates the
>link accordingly.
>

Calling adjust_link() from the ethernet driver, even if only following device reset / 
reconfig sections, is racy - the phylib state machine may run adjust_link() at the 
same time or the phydev state may be inconsistent - and may end up in more 
subtle link state issues.  It also defeats the purpose of the adjust link hook, which 
is to be called by the phylib state machine.

I had a look at this, and I don't think there is an easy fix in the driver for your case. 
However, the workaround should be straight forward: wait for the link to stabilize 
to the requested parameters before applying other device configuration changes 
that require device reset (like rx timestamping).  Given that the PHYLIB state machine 
changed its behavior since adjust_link() in gianfar was last modified (i.e. it used to be called
periodically before, from RUNNING<->CHANGELINK states) and given that there is a good part 
of legacy code in the current adjust_link() implementation, I think that driver part needs 
considerable rework to correctly cover this usecase as well.

Thanks.
Claudiu
diff mbox

Patch

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 957bfc2..b3b5c43 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2617,6 +2617,10 @@  static int gfar_change_mtu(struct net_device *dev, int new_mtu)
 
 	clear_bit_unlock(GFAR_RESETTING, &priv->state);
 
+	// catch maybe missed link state changes
+	if (dev->flags & IFF_UP)
+		adjust_link(dev);
+
 	return 0;
 }
 
@@ -2631,6 +2635,9 @@  void reset_gfar(struct net_device *ndev)
 	startup_gfar(ndev);
 
 	clear_bit_unlock(GFAR_RESETTING, &priv->state);
+
+	// catch maybe missed link state changes
+	adjust_link(ndev);
 }
 
 /* gfar_reset_task gets scheduled when a packet has not been
diff --git a/drivers/net/ethernet/freescale/gianfar_ethtool.c b/drivers/net/ethernet/freescale/gianfar_ethtool.c
index a93e019..065f05e 100644
--- a/drivers/net/ethernet/freescale/gianfar_ethtool.c
+++ b/drivers/net/ethernet/freescale/gianfar_ethtool.c
@@ -330,6 +330,7 @@  static int gfar_scoalesce(struct net_device *dev,
 			  struct ethtool_coalesce *cvals)
 {
 	struct gfar_private *priv = netdev_priv(dev);
+	struct phy_device *phydev = dev->phydev;
 	int i, err = 0;
 
 	if (!(priv->device_flags & FSL_GIANFAR_DEV_HAS_COALESCE))
@@ -408,6 +409,10 @@  static int gfar_scoalesce(struct net_device *dev,
 
 	clear_bit_unlock(GFAR_RESETTING, &priv->state);
 
+	// catch maybe missed link state changes
+	if (dev->flags & IFF_UP)
+		phydev->adjust_link(dev);
+
 	return err;
 }
 
@@ -445,6 +450,7 @@  static int gfar_sringparam(struct net_device *dev,
 			   struct ethtool_ringparam *rvals)
 {
 	struct gfar_private *priv = netdev_priv(dev);
+	struct phy_device *phydev = dev->phydev;
 	int err = 0, i;
 
 	if (rvals->rx_pending > GFAR_RX_MAX_RING_SIZE)
@@ -482,6 +488,10 @@  static int gfar_sringparam(struct net_device *dev,
 
 	clear_bit_unlock(GFAR_RESETTING, &priv->state);
 
+	// catch maybe missed link state changes
+	if (dev->flags & IFF_UP)
+		phydev->adjust_link(dev);
+
 	return err;
 }
 
@@ -569,6 +579,7 @@  int gfar_set_features(struct net_device *dev, netdev_features_t features)
 {
 	netdev_features_t changed = dev->features ^ features;
 	struct gfar_private *priv = netdev_priv(dev);
+	struct phy_device *phydev = dev->phydev;
 	int err = 0;
 
 	if (!(changed & (NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
@@ -590,6 +601,10 @@  int gfar_set_features(struct net_device *dev, netdev_features_t features)
 
 	clear_bit_unlock(GFAR_RESETTING, &priv->state);
 
+	// catch maybe missed link state changes
+	if (dev->flags & IFF_UP)
+		phydev->adjust_link(dev);
+
 	return err;
 }