Patchwork i.Mx6Quad - eth0: tx queue full!

login
register
mail settings
Submitter Duan Fugang-B38611
Date Jan. 30, 2013, 8:37 a.m.
Message ID <9848F2DB572E5649BA045B288BE08FBE011B0826@039-SN2MPN1-022.039d.mgd.msft.net>
Download mbox | patch
Permalink /patch/216781/
State RFC
Delegated to: David Miller
Headers show

Comments

Duan Fugang-B38611 - Jan. 30, 2013, 8:37 a.m.
Hi, all

The issue cannot be found kernel 3.0.35 branch: (even run stress test with IPU and VPU)
ssh://sw-git01-tx30/git/sw_git/repos/linux-2.6-imx.git  
branch name: imx_3.0.35


The patch as below:


From 91a0c892263e57ecde9e9ff38be3acdb7f66a17f Mon Sep 17 00:00:00 2001
From: Fugang Duan <B38611@freescale.com>
Date: Thu, 9 Aug 2012 17:59:44 +0800
Subject: [PATCH] ENGR00180288 - FEC : Fix kernel dump about eth0

Kernel dump when do wifi stress test with suspend and resume as below:
        eth0: tx queue full!.
        remove wake up source irq 103 
        PM: resume of devices complete after 348.934 msecs
        Restarting tasks ... done.
        ------------[ cut here ]------------
        WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x284/0x2a8()
        NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out 
        Modules linked in: ar6000
        [<8004482c>] (unwind_backtrace+0x0/0xf8) from
        [<80068cd0>] (warn_slowpath_common+0x4c/0x64)
        [<80068cd0>] (warn_slowpath_common+0x4c/0x64)from
        [<80068d7c>] (warn_slowpath_fmt+0x30/0x40)
        [<80068d7c>] (warn_slowpath_fmt+0x30/0x40) from
        [<803f0c50>] (dev_watchdog+0x284/0x2a8)
        [<803f0c50>] (dev_watchdog+0x284/0x2a8) from
        [<80074430>] (run_timer_softirq+0xec/0x214)
        [<80074430>] (run_timer_softirq+0xec/0x214) from
        [<8006e524>] (__do_softirq+0xac/0x140)
        [<8006e524>] (__do_softirq+0xac/0x140) from
        [<8006ea60>] (irq_exit+0x94/0x9c)
        [<8006ea60>] (irq_exit+0x94/0x9c) from
        [<80039240>] (do_local_timer+0x54/0x70)
        [<80039240>] (do_local_timer+0x54/0x70) from
        [<8003ea0c>] (__irq_svc+0x4c/0xe8)
        Exception stack(0x80a2bf68 to 0x80a2bfb0)
        bf60:                   0000001f 80a3babc 80a2bfb0 00000000 80a2a000 80a7b8e4
        bf80: 804befcc 80a3ee7c 1000406a 412fc09a 00000000 00000000 80a81440 80a2bfb0
        bfa0: 8003fa64 8003fa68 60000013 ffffffff
        [<8003ea0c>] (__irq_svc+0x4c/0xe8) from [<8003fa68>] (default_idle+0x24/0x28)
        [<8003fa68>] (default_idle+0x24/0x28) from [<8003fc60>] (cpu_idle+0xbc/0xfc)
        [<8003fc60>] (cpu_idle+0xbc/0xfc) from [<80008878>] (start_kernel+0x258/0x29c)
        [<80008878>] (start_kernel+0x258/0x29c) from [<10008040>] (0x10008040)
        ---[ end trace 30671ac42e272c2d ]---

But ethernet and system still be alive. In sometime,the issue
will cause system hang like "nfs: server 10.192.242.179 not 
responding, still trying".

The root cause is tx buffer descriptors are not cleaned when
ethernet resume back.

Signed-off-by: Fugang Duan <B38611@freescale.com>

drivers/net/fec.c |   39 ++++++++++++++++++++++++++-------------
 1 files changed, 26 insertions(+), 13 deletions(-)
Vikram Narayanan - Jan. 30, 2013, 3:18 p.m.
Hello Duan,

On 1/30/2013 2:07 PM, Duan Fugang-B38611 wrote:
> Hi, all
>
> The issue cannot be found kernel 3.0.35 branch: (even run stress test with IPU and VPU)
> ssh://sw-git01-tx30/git/sw_git/repos/linux-2.6-imx.git

Where is this git located?

> branch name: imx_3.0.35

Troy's branch (located in github) works fine for me. I've kept it for a 
long run with heavy network load. Should I encounter some errors, I'll 
post back.

> The patch as below:
>
>  From 91a0c892263e57ecde9e9ff38be3acdb7f66a17f Mon Sep 17 00:00:00 2001
> From: Fugang Duan <B38611@freescale.com>
> Date: Thu, 9 Aug 2012 17:59:44 +0800
> Subject: [PATCH] ENGR00180288 - FEC : Fix kernel dump about eth0

Thanks, I'll give a try with this patch as well.

Regards,
Vikram
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/fec.c b/drivers/net/fec.c
index f007bf0..b1fa464 100755
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -1456,6 +1456,28 @@  static const struct net_device_ops fec_netdev_ops = {
 #endif
 };
 
+/* Init TX buffer descriptors
+ */
+static void fec_enet_txbd_init(struct net_device *dev)
+{
+       struct fec_enet_private *fep = netdev_priv(dev);
+       struct bufdesc *bdp;
+       int i;
+
+       /* ...and the same for transmit */
+       bdp = fep->tx_bd_base;
+       for (i = 0; i < TX_RING_SIZE; i++) {
+
+               /* Initialize the BD for every fragment in the page. */
+               bdp->cbd_sc = 0;
+               bdp++;
+       }
+
+       /* Set the last buffer to wrap */
+       bdp--;
+       bdp->cbd_sc |= BD_SC_WRAP;
+}
+
  /*
   * XXX:  We need to clean up on failure exits here.
   *
@@ -1512,19 +1534,8 @@  static int fec_enet_init(struct net_device *ndev)
        bdp--;
        bdp->cbd_sc |= BD_SC_WRAP;
 
-       /* ...and the same for transmit */
-       bdp = fep->tx_bd_base;
-       for (i = 0; i < TX_RING_SIZE; i++) {
-
-               /* Initialize the BD for every fragment in the page. */
-               bdp->cbd_sc = 0;
-               bdp->cbd_bufaddr = 0;
-               bdp++;
-       }
-
-       /* Set the last buffer to wrap */
-       bdp--;
-       bdp->cbd_sc |= BD_SC_WRAP;
+       /* Init transmit descriptors */
+       fec_enet_txbd_init(ndev);
 
        fec_restart(ndev, 0);
 
@@ -1575,6 +1586,8 @@  fec_restart(struct net_device *dev, int duplex)
        writel(fep->bd_dma, fep->hwp + FEC_R_DES_START);
        writel((unsigned long)fep->bd_dma + sizeof(struct bufdesc) * RX_RING_SIZE,
                        fep->hwp + FEC_X_DES_START);
+       /* Reinit transmit descriptors */
+       fec_enet_txbd_init(dev);
 
        fep->dirty_tx = fep->cur_tx = fep->tx_bd_base;
        fep->cur_rx = fep->rx_bd_base;