Patchwork [v2,3/3] ucc_geth: Fix hangs after switching from full to half duplex

Submitter Anton Vorontsov
Date Sept. 10, 2009, 5:58 p.m.
Message ID <20090910175852.GA18948@oksana.dev.rtsoft.ru>
Permalink /patch/33378/
State Superseded

Comments

Anton Vorontsov - Sept. 10, 2009, 5:58 p.m.
MPC8360 QE UCC ethernet controllers hang when changing link duplex
under a load (a bit of NFS activity is enough).

  PHY: mdio@e0102120:00 - Link is Up - 1000/Full
  sh-3.00# ethtool -s eth0 speed 100 duplex half autoneg off
  PHY: mdio@e0102120:00 - Link is Down
  PHY: mdio@e0102120:00 - Link is Up - 100/Half
  NETDEV WATCHDOG: eth0 (ucc_geth): transmit queue 0 timed out
  ------------[ cut here ]------------
  Badness at c01fcbd0 [verbose debug info unavailable]
  NIP: c01fcbd0 LR: c01fcbd0 CTR: c0194e44
  ...

The cure is to disable the controller before changing speed/duplex
and enable it afterwards.

Since ugeth_graceful_stop_{tx,rx} now may be called from an atomic
context, switch the two functions from msleep() to mdelay().

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---

v2: Switch ugeth_graceful_stop_{tx,rx} to mdelay. Noticed with
    CONFIG_DEBUG_SPINLOCK_SLEEP.

 drivers/net/ucc_geth.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)
Scott Wood - Sept. 10, 2009, 6:04 p.m.
Anton Vorontsov wrote:
> MPC8360 QE UCC ethernet controllers hang when changing link duplex
> under a load (a bit of NFS activity is enough).
> 
>   PHY: mdio@e0102120:00 - Link is Up - 1000/Full
>   sh-3.00# ethtool -s eth0 speed 100 duplex half autoneg off
>   PHY: mdio@e0102120:00 - Link is Down
>   PHY: mdio@e0102120:00 - Link is Up - 100/Half
>   NETDEV WATCHDOG: eth0 (ucc_geth): transmit queue 0 timed out
>   ------------[ cut here ]------------
>   Badness at c01fcbd0 [verbose debug info unavailable]
>   NIP: c01fcbd0 LR: c01fcbd0 CTR: c0194e44
>   ...
> 
> The cure is to disable the controller before changing speed/duplex
> and enable it afterwards.
> 
> Since ugeth_graceful_stop_{tx,rx} now may be called from an atomic
> context, switch the two functions from msleep() to mdelay().

Ouch.  Can we put this in a workqueue or something?

-Scott
Anton Vorontsov - Sept. 10, 2009, 7:40 p.m.
On Thu, Sep 10, 2009 at 01:04:32PM -0500, Scott Wood wrote:
> Anton Vorontsov wrote:
> >MPC8360 QE UCC ethernet controllers hang when changing link duplex
> >under a load (a bit of NFS activity is enough).
> >
> >  PHY: mdio@e0102120:00 - Link is Up - 1000/Full
> >  sh-3.00# ethtool -s eth0 speed 100 duplex half autoneg off
> >  PHY: mdio@e0102120:00 - Link is Down
> >  PHY: mdio@e0102120:00 - Link is Up - 100/Half
> >  NETDEV WATCHDOG: eth0 (ucc_geth): transmit queue 0 timed out
> >  ------------[ cut here ]------------
> >  Badness at c01fcbd0 [verbose debug info unavailable]
> >  NIP: c01fcbd0 LR: c01fcbd0 CTR: c0194e44
> >  ...
> >
> >The cure is to disable the controller before changing speed/duplex
> >and enable it afterwards.
> >
> >Since ugeth_graceful_stop_{tx,rx} now may be called from an atomic
> >context, switch the two functions from msleep() to mdelay().
> 
> Ouch.

Yeah, right... delaying for 10ms with irqs off isn't good.

> Can we put this in a workqueue or something?

adjust_link() itself isn't called from an atomic context.

The problem is that we are grabbing ugeth->lock, i.e. a spinlock.
I don't see why the lock is needed in adjust_link() in its current
form, but if we're going to disable the controller for some time,
we'll have to make sure that no start_xmit() or NAPI is running,
scheduled, or will be scheduled until we say so.

I think that a lock-less, and thus completely sleepable, variant
of adjust_link() is doable.

Thanks,
Anton Vorontsov - Sept. 10, 2009, 9:09 p.m.
On Thu, Sep 10, 2009 at 11:40:53PM +0400, Anton Vorontsov wrote:
> On Thu, Sep 10, 2009 at 01:04:32PM -0500, Scott Wood wrote:
> > Anton Vorontsov wrote:
> > >MPC8360 QE UCC ethernet controllers hang when changing link duplex
> > >under a load (a bit of NFS activity is enough).
> > >
> > >  PHY: mdio@e0102120:00 - Link is Up - 1000/Full
> > >  sh-3.00# ethtool -s eth0 speed 100 duplex half autoneg off
> > >  PHY: mdio@e0102120:00 - Link is Down
> > >  PHY: mdio@e0102120:00 - Link is Up - 100/Half
> > >  NETDEV WATCHDOG: eth0 (ucc_geth): transmit queue 0 timed out
> > >  ------------[ cut here ]------------
> > >  Badness at c01fcbd0 [verbose debug info unavailable]
> > >  NIP: c01fcbd0 LR: c01fcbd0 CTR: c0194e44
> > >  ...
> > >
> > >The cure is to disable the controller before changing speed/duplex
> > >and enable it afterwards.
> > >
> > >Since ugeth_graceful_stop_{tx,rx} now may be called from an atomic
> > >context, switch the two functions from msleep() to mdelay().
> > 
> > Ouch.
> 
> Yeah, right... delaying for 10ms with irqs off isn't good.
> 
> > Can we put this in a workqueue or something?
> 
> adjust_link() itself isn't called from an atomic context.

Oops. I thought that phylib calls us from a workqueue, not a timer. Hm...

That will be a little bit more work...
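
Editorial sketch (not from the thread): the concern about mdelay() can be
made concrete with a userspace model of the bounded polling loop used in
ugeth_graceful_stop_{tx,rx}. All names here (poll_until(), counting_delay(),
struct fake_event) are hypothetical, and the delay is an injected callback
rather than the kernel's msleep() or mdelay(). The loop shape mirrors the
do/while in the patch: delay first, then check, up to max_tries times.

```c
#include <assert.h>
#include <stdbool.h>

typedef void (*delay_fn)(unsigned ms);
typedef bool (*done_fn)(void *ctx);

/*
 * Delay, then poll, at most max_tries times.
 * Returns 0 once done() reports completion, -1 if the tries run out.
 */
static int poll_until(done_fn done, void *ctx, delay_fn delay,
		      unsigned step_ms, int max_tries)
{
	int i = max_tries;

	assert(max_tries > 0);
	do {
		delay(step_ms);
		if (done(ctx))
			return 0;
	} while (--i);

	return -1;
}

/* Hypothetical delay that only adds up how long we would have waited. */
static unsigned total_delay_ms;

static void counting_delay(unsigned ms)
{
	total_delay_ms += ms;
}

/* Hypothetical event that becomes true after a fixed number of polls. */
struct fake_event {
	int polls_until_set;
};

static bool fake_event_done(void *ctx)
{
	struct fake_event *e = ctx;

	return --e->polls_until_set <= 0;
}
```

The worst case is step_ms * max_tries of waiting. When the delay callback
sleeps, that time is given back to the scheduler; when it busy-waits under
a spinlock, the CPU is held for the whole period, which is why a sleepable
adjust_link() path looks preferable.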

Patch

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 2a2c973..5a0803f 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -1432,7 +1432,7 @@  static int ugeth_graceful_stop_tx(struct ucc_geth_private *ugeth)
 
 	/* Wait for command to complete */
 	do {
-		msleep(10);
+		mdelay(10);
 		temp = in_be32(uccf->p_ucce);
 	} while (!(temp & UCC_GETH_UCCE_GRA) && --i);
 
@@ -1464,7 +1464,7 @@  static int ugeth_graceful_stop_rx(struct ucc_geth_private *ugeth)
 						ucc_num);
 		qe_issue_cmd(QE_GRACEFUL_STOP_RX, cecr_subblock,
 			     QE_CR_PROTOCOL_ETHERNET, 0);
-		msleep(10);
+		mdelay(10);
 		temp = in_8(&ugeth->p_rx_glbl_pram->rxgstpack);
 	} while (!(temp & GRACEFUL_STOP_ACKNOWLEDGE_RX) && --i);
 
@@ -1631,9 +1631,13 @@  static void adjust_link(struct net_device *dev)
 			ugeth->oldspeed = phydev->speed;
 		}
 
+		ugeth_disable(ugeth, COMM_DIR_RX_AND_TX);
+
 		out_be32(&ug_regs->maccfg2, tempval);
 		out_be32(&uf_regs->upsmr, upsmr);
 
+		ugeth_enable(ugeth, COMM_DIR_RX_AND_TX);
+
 		if (!ugeth->oldlink) {
 			new_state = 1;
 			ugeth->oldlink = 1;