Message ID | 1340724786-3819-1-git-send-email-cjren@qca.qualcomm.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, Jun 26, 2012 at 12:33:06PM -0300, Ren, Cloud wrote: > From: xiong <xiong@qca.qualcomm.com> > > some people report atl1c could cause system hang with following > kernel trace info: > --------------------------------------- > WARNING: at.../net/sched/sch_generic.c:258 > dev_watchdog+0x1db/0x1d0() > ... > NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out > ... > --------------------------------------- > This is caused by netif_stop_queue calling when cable Link is down > but netif_wake_queue isn't called when cable Link is resume. > > Signed-off-by: xiong <xiong@qca.qualcomm.com> > Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com> If this fixes a system hang then this could be a stable fix -- that is, this should be propagated to older stable kernels, no? If so then please add to the commit log a line like this: Cc: stable@vger.kernel.org [3.4] Note: this is for the commit log! Right above the line below that has "---" Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2012-06-26 at 12:33 -0300, Ren, Cloud wrote: > From: xiong <xiong@qca.qualcomm.com> > > some people report atl1c could cause system hang with following > kernel trace info: > --------------------------------------- > WARNING: at.../net/sched/sch_generic.c:258 > dev_watchdog+0x1db/0x1d0() > ... > NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out > ... > --------------------------------------- > This is caused by netif_stop_queue calling when cable Link is down > but netif_wake_queue isn't called when cable Link is resume. > > Signed-off-by: xiong <xiong@qca.qualcomm.com> > Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com> > --- > drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > index 85717cb..c2736c4 100644 > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > @@ -351,6 +351,8 @@ static void atl1c_common_task(struct work_struct *work) > atl1c_irq_disable(adapter); > atl1c_check_link_status(adapter); > atl1c_irq_enable(adapter); > + if (netif_queue_stopped(netdev) && netif_carrier_ok(netdev)) > + netif_wake_queue(netdev); > } > } > Why explicitly stop/start the queue when the link changes? That's what link_watch is for. Ben.
Yes, another fix to remove netif_stop_queue when cable link is down. -Xiong > -----Original Message----- > From: Ben Hutchings [mailto:bhutchings@solarflare.com] > Sent: Wednesday, June 27, 2012 4:24 > To: Ren, Cloud > Cc: davem@davemloft.net; netdev@vger.kernel.org; linux- > kernel@vger.kernel.org; qca-linux-team; nic-devel; Huang, Xiong > Subject: Re: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out > > On Tue, 2012-06-26 at 12:33 -0300, Ren, Cloud wrote: > > From: xiong <xiong@qca.qualcomm.com> > > > > some people report atl1c could cause system hang with following kernel > > trace info: > > --------------------------------------- > > WARNING: at.../net/sched/sch_generic.c:258 > > dev_watchdog+0x1db/0x1d0() > > ... > > NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out ... > > --------------------------------------- > > This is caused by netif_stop_queue calling when cable Link is down but > > netif_wake_queue isn't called when cable Link is resume. > > > > Signed-off-by: xiong <xiong@qca.qualcomm.com> > > Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com> > > --- > > drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 2 ++ > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > index 85717cb..c2736c4 100644 > > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > @@ -351,6 +351,8 @@ static void atl1c_common_task(struct work_struct > *work) > > atl1c_irq_disable(adapter); > > atl1c_check_link_status(adapter); > > atl1c_irq_enable(adapter); > > + if (netif_queue_stopped(netdev) && netif_carrier_ok(netdev)) > > + netif_wake_queue(netdev); > > } > > } > > > > Why explicitly stop/start the queue when the link changes? That's what > link_watch is for. > > Ben. > > -- > Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's > the marketing department's job. > They asked us to note that Solarflare product names are trademarked.
Sorry, my mean , another fix is to remove netif_stop_queue when cable link is down. Thanks Xiong > -----Original Message----- > From: Huang, Xiong > Sent: Wednesday, June 27, 2012 4:25 > To: Ben Hutchings; Ren, Cloud > Cc: davem@davemloft.net; netdev@vger.kernel.org; linux- > kernel@vger.kernel.org; qca-linux-team; nic-devel > Subject: RE: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out > > Yes, another fix to remove netif_stop_queue when cable link is down. > > -Xiong > > > -----Original Message----- > > From: Ben Hutchings [mailto:bhutchings@solarflare.com] > > Sent: Wednesday, June 27, 2012 4:24 > > To: Ren, Cloud > > Cc: davem@davemloft.net; netdev@vger.kernel.org; linux- > > kernel@vger.kernel.org; qca-linux-team; nic-devel; Huang, Xiong > > Subject: Re: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed > > out > > > > On Tue, 2012-06-26 at 12:33 -0300, Ren, Cloud wrote: > > > From: xiong <xiong@qca.qualcomm.com> > > > > > > some people report atl1c could cause system hang with following > > > kernel trace info: > > > --------------------------------------- > > > WARNING: at.../net/sched/sch_generic.c:258 > > > dev_watchdog+0x1db/0x1d0() > > > ... > > > NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out ... > > > --------------------------------------- > > > This is caused by netif_stop_queue calling when cable Link is down > > > but netif_wake_queue isn't called when cable Link is resume. > > > > > > Signed-off-by: xiong <xiong@qca.qualcomm.com> > > > Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com> > > > --- > > > drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 2 ++ > > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > > > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > > b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > > index 85717cb..c2736c4 100644 > > > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c > > > @@ -351,6 +351,8 @@ static void atl1c_common_task(struct work_struct > > *work) > > > atl1c_irq_disable(adapter); > > > atl1c_check_link_status(adapter); > > > atl1c_irq_enable(adapter); > > > + if (netif_queue_stopped(netdev) && netif_carrier_ok(netdev)) > > > + netif_wake_queue(netdev); > > > } > > > } > > > > > > > Why explicitly stop/start the queue when the link changes? That's > > what link_watch is for. > > > > Ben. > > > > -- > > Ben Hutchings, Staff Engineer, Solarflare Not speaking for my > > employer; that's the marketing department's job. > > They asked us to note that Solarflare product names are trademarked.
Luis It should be a stable fix, but as Ben Hutchings mentioned in another mail, Maybe, removing netif_stop_queue when cable link down is a better choice. Do you mean we need add 'cc:stable@vger.kernel.org' just before 'some people report ...' ? Thanks Xiong > -----Original Message----- > From: Rodriguez, Luis > Sent: Wednesday, June 27, 2012 2:03 > To: Ren, Cloud > Cc: davem@davemloft.net; netdev@vger.kernel.org; linux- > kernel@vger.kernel.org; qca-linux-team; nic-devel; Huang, Xiong > Subject: Re: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out > > On Tue, Jun 26, 2012 at 12:33:06PM -0300, Ren, Cloud wrote: > > From: xiong <xiong@qca.qualcomm.com> > > > > some people report atl1c could cause system hang with following kernel > > trace info: > > --------------------------------------- > > WARNING: at.../net/sched/sch_generic.c:258 > > dev_watchdog+0x1db/0x1d0() > > ... > > NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out ... > > --------------------------------------- > > This is caused by netif_stop_queue calling when cable Link is down but > > netif_wake_queue isn't called when cable Link is resume. > > > > Signed-off-by: xiong <xiong@qca.qualcomm.com> > > Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com> > > If this fixes a system hang then this could be a stable fix -- that is, this should > be propagated to older stable kernels, no? > > If so then please add to the commit log a line like this: > > Cc: stable@vger.kernel.org [3.4] > > Note: this is for the commit log! Right above the line below that has "---" > > Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jun 26, 2012 at 01:41:11PM -0700, Huang, Xiong wrote: > Luis > It should be a stable fix, but as Ben Hutchings mentioned in another mail, > Maybe, removing netif_stop_queue when cable link down is a better choice. > > Do you mean we need add 'cc:stable@vger.kernel.org' just before 'some people report ...' ? Nope, see commit 4f7a67e2dd49fbfba002c453bc24bf00e701cc71 as an example of how to do this. This is a random commit that has been marked as stable. commit 4f7a67e2dd49fbfba002c453bc24bf00e701cc71 Author: Ricardo Martins <rasm@fe.up.pt> Date: Tue May 22 18:02:03 2012 +0100 USB: fix PS3 EHCI systems After commit aaa0ef289afe9186f81e2340114ea413eef0492a "PS3 EHCI QH read work-around", Terratec Grabby (em28xx) stopped working with AMD Geode LX 800 (USB controller AMD CS5536). Since this is a PS3 only fix, the following patch adds a conditional block around it. Signed-off-by: Ricardo Martins <rasm@fe.up.pt> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Sometimes it helps if you specify the oldest stable kernel to apply patches to, so for example: commit 80b08a8d8829a58b5db14b1417151094cc28face Author: Felix Fietkau <nbd@openwrt.org> Date: Fri Jun 15 03:04:53 2012 +0200 ath9k: fix invalid pointer access in the tx path After setup_frame_info has been called, only info->control.rates is still valid, other control fields have been overwritten by the ath_frame_info data. Move the access to info->control.vif for checking short preamble to setup_frame_info before it gets overwritten. This regression was introduced in commit d47a61aa "ath9k: Fix multi-VIF BSS handling" Signed-off-by: Felix Fietkau <nbd@openwrt.org> Reported-by: Thomas Hühn <thomas@net.t-labs.tu-berlin.de> Acked-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Cc: stable@vger.kernel.org [3.4] Signed-off-by: John W. Linville <linville@tuxdriver.com> To be clear, this is not a Cc: in the e-mail but instead a Cc line in the commit log entry. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Understand, thank you ! > -----Original Message----- > From: Rodriguez, Luis > Sent: Wednesday, June 27, 2012 4:55 > To: Huang, Xiong > Cc: Ren, Cloud; davem@davemloft.net; netdev@vger.kernel.org; linux- > kernel@vger.kernel.org; qca-linux-team; nic-devel > Subject: Re: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out > > On Tue, Jun 26, 2012 at 01:41:11PM -0700, Huang, Xiong wrote: > > Luis > > It should be a stable fix, but as Ben Hutchings mentioned in > > another mail, Maybe, removing netif_stop_queue when cable link down is a > better choice. > > > > Do you mean we need add 'cc:stable@vger.kernel.org' just before 'some > people report ...' ? > > Nope, see commit 4f7a67e2dd49fbfba002c453bc24bf00e701cc71 > as an example of how to do this. This is a random commit that has been > marked as stable. > > commit 4f7a67e2dd49fbfba002c453bc24bf00e701cc71 > Author: Ricardo Martins <rasm@fe.up.pt> > Date: Tue May 22 18:02:03 2012 +0100 > > USB: fix PS3 EHCI systems > > After commit aaa0ef289afe9186f81e2340114ea413eef0492a "PS3 EHCI > QH > read work-around", Terratec Grabby (em28xx) stopped working with AMD > Geode LX 800 (USB controller AMD CS5536). Since this is a PS3 only > fix, the following patch adds a conditional block around it. > > Signed-off-by: Ricardo Martins <rasm@fe.up.pt> > Acked-by: Alan Stern <stern@rowland.harvard.edu> > Cc: stable <stable@vger.kernel.org> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > > Sometimes it helps if you specify the oldest stable kernel to apply patches to, > so for example: > > commit 80b08a8d8829a58b5db14b1417151094cc28face > Author: Felix Fietkau <nbd@openwrt.org> > Date: Fri Jun 15 03:04:53 2012 +0200 > > ath9k: fix invalid pointer access in the tx path > > After setup_frame_info has been called, only info->control.rates is still > valid, other control fields have been overwritten by the ath_frame_info > data. Move the access to info->control.vif for checking short preamble > to setup_frame_info before it gets overwritten. > > This regression was introduced in commit d47a61aa > "ath9k: Fix multi-VIF BSS handling" > > Signed-off-by: Felix Fietkau <nbd@openwrt.org> > Reported-by: Thomas Hühn <thomas@net.t-labs.tu-berlin.de> > Acked-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> > Cc: stable@vger.kernel.org [3.4] > Signed-off-by: John W. Linville <linville@tuxdriver.com> > > To be clear, this is not a Cc: in the e-mail but instead a Cc line in the commit > log entry. > > Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c index 85717cb..c2736c4 100644 --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c @@ -351,6 +351,8 @@ static void atl1c_common_task(struct work_struct *work) atl1c_irq_disable(adapter); atl1c_check_link_status(adapter); atl1c_irq_enable(adapter); + if (netif_queue_stopped(netdev) && netif_carrier_ok(netdev)) + netif_wake_queue(netdev); } }