Patchwork ucc_geth: Fix hung tasks.

Submitter Joakim Tjernlund
Date Nov. 8, 2010, 10:23 a.m.
Message ID <1289211819-21746-1-git-send-email-Joakim.Tjernlund@transmode.se>
Permalink /patch/70407/
State Rejected
Delegated to: David Miller

Comments

Joakim Tjernlund - Nov. 8, 2010, 10:23 a.m.
We noticed a few hangs like this:

INFO: task ifconfig:572 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ifconfig      D 0ff65760     0   572    369 0x00000000
Call Trace:
[c6157be0] [c6008460] 0xc6008460 (unreliable)
[c6157ca0] [c0008608] __switch_to+0x4c/0x6c
[c6157cb0] [c028fecc] schedule+0x184/0x310
[c6157ce0] [c0290e54] __mutex_lock_slowpath+0xa4/0x150
[c6157d20] [c0290c48] mutex_lock+0x44/0x48
[c6157d30] [c01aba74] phy_stop+0x20/0x70
[c6157d40] [c01aef40] ucc_geth_stop+0x30/0x98
[c6157d60] [c01b18fc] ucc_geth_close+0x9c/0xdc
[c6157d80] [c01db0cc] __dev_close+0xa0/0xd0
[c6157d90] [c01deddc] __dev_change_flags+0x8c/0x148
[c6157db0] [c01def54] dev_change_flags+0x1c/0x64
[c6157dd0] [c0237ac8] devinet_ioctl+0x678/0x784
[c6157e50] [c0239a58] inet_ioctl+0xb0/0xbc
[c6157e60] [c01cafa8] sock_ioctl+0x174/0x2a0
[c6157e80] [c009a16c] vfs_ioctl+0xcc/0xe0
[c6157ea0] [c009a998] do_vfs_ioctl+0xc4/0x79c
[c6157f10] [c009b0b0] sys_ioctl+0x40/0x74
[c6157f40] [c00117c4] ret_from_syscall+0x0/0x38

I think this is caused by a missing cancel_work_sync() in
ucc_geth_close(), although we cannot be sure. I found the
difference by comparing ucc_geth with gianfar.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 drivers/net/ucc_geth.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
Joakim Tjernlund - Nov. 10, 2010, 12:05 p.m.
Ping?

Even though this patch didn't solve my hang, the missing
cancel_work_sync() is still a bug.

     Jocke

Joakim Tjernlund <Joakim.Tjernlund@transmode.se> wrote on 2010/11/08 11:23:39:

> From: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> To: linuxppc-dev@lists.ozlabs.org, netdev@vger.kernel.org, Anton Vorontsov <avorontsov@ru.mvista.com>
> Cc: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> Date: 2010/11/08 11:23
> Subject: [PATCH] ucc_geth: Fix hung tasks.
>
> We noticed a few hangs like this:
>
> INFO: task ifconfig:572 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> ifconfig      D 0ff65760     0   572    369 0x00000000
> Call Trace:
> [c6157be0] [c6008460] 0xc6008460 (unreliable)
> [c6157ca0] [c0008608] __switch_to+0x4c/0x6c
> [c6157cb0] [c028fecc] schedule+0x184/0x310
> [c6157ce0] [c0290e54] __mutex_lock_slowpath+0xa4/0x150
> [c6157d20] [c0290c48] mutex_lock+0x44/0x48
> [c6157d30] [c01aba74] phy_stop+0x20/0x70
> [c6157d40] [c01aef40] ucc_geth_stop+0x30/0x98
> [c6157d60] [c01b18fc] ucc_geth_close+0x9c/0xdc
> [c6157d80] [c01db0cc] __dev_close+0xa0/0xd0
> [c6157d90] [c01deddc] __dev_change_flags+0x8c/0x148
> [c6157db0] [c01def54] dev_change_flags+0x1c/0x64
> [c6157dd0] [c0237ac8] devinet_ioctl+0x678/0x784
> [c6157e50] [c0239a58] inet_ioctl+0xb0/0xbc
> [c6157e60] [c01cafa8] sock_ioctl+0x174/0x2a0
> [c6157e80] [c009a16c] vfs_ioctl+0xcc/0xe0
> [c6157ea0] [c009a998] do_vfs_ioctl+0xc4/0x79c
> [c6157f10] [c009b0b0] sys_ioctl+0x40/0x74
> [c6157f40] [c00117c4] ret_from_syscall+0x0/0x38
>
> I THINK this is due to a missing cancel_work_sync in the driver
> although we cannot be sure. I found this by comparing
> ucc_geth with gianfar.
>
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> ---
>  drivers/net/ucc_geth.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
> index 97f9f7d..6647ed7 100644
> --- a/drivers/net/ucc_geth.c
> +++ b/drivers/net/ucc_geth.c
> @@ -3556,6 +3556,7 @@ static int ucc_geth_close(struct net_device *dev)
>
>     napi_disable(&ugeth->napi);
>
> +   cancel_work_sync(&ugeth->timeout_work);
>     ucc_geth_stop(ugeth);
>
>     free_irq(ugeth->ug_info->uf_info.irq, ugeth->ndev);
> --
> 1.7.2.2
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joakim Tjernlund - Nov. 10, 2010, 2:11 p.m.
Actually, something is wrong with TX timeout handling anyway,
so don't use this patch. I need to investigate further, but
it seems cancel_work_sync() hangs whenever a TX timeout
occurs.

      Jocke
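
The thread does not establish why cancel_work_sync() hangs. One general
way a synchronous cancel can hang is when the caller already holds a lock
that the pending handler needs; whether that applies to ucc_geth is an
assumption, not something confirmed here. The userspace sketch below is
only an analogy using pthreads: close_path_lock, timeout_work() and
close_path_would_deadlock() are hypothetical names, and pthread_join()
stands in for cancel_work_sync(). The handler probes the lock with
trylock so the demo terminates instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a lock the close path already holds. */
static pthread_mutex_t close_path_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for a deferred TX-timeout handler. It only probes the lock
 * with trylock so this demo finishes; a handler that called
 * pthread_mutex_lock() here would block until the close path unlocks.
 */
static void *timeout_work(void *arg)
{
	int *would_block = arg;
	int rc = pthread_mutex_trylock(&close_path_lock);

	if (rc == EBUSY) {
		*would_block = 1;	/* lock is held by the close path */
	} else {
		*would_block = 0;
		if (rc == 0)
			pthread_mutex_unlock(&close_path_lock);
	}
	return NULL;
}

/*
 * Stand-in for the close path: take the lock, then wait synchronously
 * for the handler. Returns 1 if the handler would have blocked on the
 * lock, i.e. if a blocking handler would make the wait never return.
 */
int close_path_would_deadlock(void)
{
	pthread_t worker;
	int would_block = 0;
	int rc;

	pthread_mutex_lock(&close_path_lock);

	rc = pthread_create(&worker, NULL, timeout_work, &would_block);
	assert(rc == 0);

	/* Plays the role of cancel_work_sync(): wait for the handler. */
	rc = pthread_join(worker, NULL);
	assert(rc == 0);

	pthread_mutex_unlock(&close_path_lock);
	return would_block;
}
```

Calling close_path_would_deadlock() from a small main() and printing the
result shows whether the handler found the lock held while the caller was
waiting for it. The sketch shows the shape of the hazard only and says
nothing about which lock, if any, is involved in the driver.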



Patch

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 97f9f7d..6647ed7 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -3556,6 +3556,7 @@  static int ucc_geth_close(struct net_device *dev)
 
 	napi_disable(&ugeth->napi);
 
+	cancel_work_sync(&ugeth->timeout_work);
 	ucc_geth_stop(ugeth);
 
 	free_irq(ugeth->ug_info->uf_info.irq, ugeth->ndev);