Message ID | 201010120022.13171.james@albanarts.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, 12 Oct 2010 00:22:12 +0100 James Hogan <james@albanarts.com> wrote: > This driver was hanging on resume because it was requesting a shared irq > that it wasn't ready to immediately handle, which was tested in > request_irq because of the CONFIG_DEBUG_SHIRQ config option. The > interrupt handler tried to read the interrupt status register but for > some reason it hung the system. > > The request_irq is now moved a bit later after resetting the hardware > which seems to fix it. > > Signed-off-by: James Hogan <james@albanarts.com> > --- > drivers/net/b44.c | 12 ++++++------ > 1 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/b44.c b/drivers/net/b44.c > index 1e620e2..dbba981 100644 > --- a/drivers/net/b44.c > +++ b/drivers/net/b44.c > @@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev) > if (!netif_running(dev)) > return 0; > > - rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); > - if (rc) { > - netdev_err(dev, "request_irq failed\n"); > - return rc; > - } > - > spin_lock_irq(&bp->lock); > > b44_init_rings(bp); > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev) > netif_device_attach(bp->dev); > spin_unlock_irq(&bp->lock); > > + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); > + if (rc) { > + netdev_err(dev, "request_irq failed\n"); > + return rc; > + } > + > b44_enable_ints(bp); > netif_wake_queue(dev); OK, running the interrupt handler before b44_init_hw() is presumably the problem here. The hardware surely won't be generating interrupts until we've run b44_init_hw() and b44_enable_ints(), so this patch really is only to keep CONFIG_DEBUG_SHIRQ happy. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 12 October 2010 00:34:40 Andrew Morton wrote: > On Tue, 12 Oct 2010 00:22:12 +0100 > > James Hogan <james@albanarts.com> wrote: > > This driver was hanging on resume because it was requesting a shared irq > > that it wasn't ready to immediately handle, which was tested in > > request_irq because of the CONFIG_DEBUG_SHIRQ config option. The > > interrupt handler tried to read the interrupt status register but for > > some reason it hung the system. > > > > The request_irq is now moved a bit later after resetting the hardware > > which seems to fix it. > > > > Signed-off-by: James Hogan <james@albanarts.com> > > --- > > > > drivers/net/b44.c | 12 ++++++------ > > 1 files changed, 6 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/net/b44.c b/drivers/net/b44.c > > index 1e620e2..dbba981 100644 > > --- a/drivers/net/b44.c > > +++ b/drivers/net/b44.c > > @@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev) > > > > if (!netif_running(dev)) > > > > return 0; > > > > - rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); > > - if (rc) { > > - netdev_err(dev, "request_irq failed\n"); > > - return rc; > > - } > > - > > > > spin_lock_irq(&bp->lock); > > > > b44_init_rings(bp); > > > > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev) > > > > netif_device_attach(bp->dev); > > spin_unlock_irq(&bp->lock); > > > > + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); > > + if (rc) { > > + netdev_err(dev, "request_irq failed\n"); > > + return rc; > > + } > > + > > > > b44_enable_ints(bp); > > netif_wake_queue(dev); > > OK, running the interrupt handler before b44_init_hw() is presumably > the problem here. > > The hardware surely won't be generating interrupts until we've run > b44_init_hw() and b44_enable_ints(), so this patch really is only to > keep CONFIG_DEBUG_SHIRQ happy. For me it's mainly to keep CONFIG_DEBUG_SHIRQ happy (Fedora has this switched on), but since it's a shared IRQ, there is still a chance it could be called before enabling it's own interrupts by a different device on the same IRQ. It makes sense to me why it's disabling the IRQ now, in case another device triggers it when it cannot handle it safely. I also tried calling the interrupt directly before the free_irq in the suspend function to check that it wasn't being done too late, and it didn't fail, so possibly it is the core suspension that makes it start failing until it is brought back up properly. Cheers James -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 12 Oct 2010 08:08:05 +0100 James Hogan <james@albanarts.com> wrote: > > > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev) > > > > > > netif_device_attach(bp->dev); > > > spin_unlock_irq(&bp->lock); > > > > > > + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, > dev); > > > + if (rc) { > > > + netdev_err(dev, "request_irq failed\n"); > > > + return rc; > > > + } > > > + > > > > > > b44_enable_ints(bp); > > > netif_wake_queue(dev); > > > > OK, running the interrupt handler before b44_init_hw() is presumably > > the problem here. > > > > The hardware surely won't be generating interrupts until we've run > > b44_init_hw() and b44_enable_ints(), so this patch really is only to > > keep CONFIG_DEBUG_SHIRQ happy. > > For me it's mainly to keep CONFIG_DEBUG_SHIRQ happy (Fedora has this switched > on), but since it's a shared IRQ, there is still a chance it could be > called before enabling it's own interrupts by a different device on the same > IRQ. ooh, yes, you're right, I forgot about that. It's indeed a bug. > It makes sense to me why it's disabling the IRQ now, in case another device > triggers it when it cannot handle it safely. What code are you referring to here? There's no disable_irq() in that area? > I also tried calling the > interrupt directly before the free_irq in the suspend function to check that > it wasn't being done too late, and it didn't fail, so possibly it is the core > suspension that makes it start failing until it is brought back up properly. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12 October 2010 08:27, Andrew Morton <akpm@linux-foundation.org> wrote: > On Tue, 12 Oct 2010 08:08:05 +0100 James Hogan <james@albanarts.com> wrote: > >> > > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev) >> > > >> > > netif_device_attach(bp->dev); >> > > spin_unlock_irq(&bp->lock); >> > > >> > > + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, >> dev); >> > > + if (rc) { >> > > + netdev_err(dev, "request_irq failed\n"); >> > > + return rc; >> > > + } >> > > + >> > > >> > > b44_enable_ints(bp); >> > > netif_wake_queue(dev); >> > >> > OK, running the interrupt handler before b44_init_hw() is presumably >> > the problem here. >> > >> > The hardware surely won't be generating interrupts until we've run >> > b44_init_hw() and b44_enable_ints(), so this patch really is only to >> > keep CONFIG_DEBUG_SHIRQ happy. >> >> For me it's mainly to keep CONFIG_DEBUG_SHIRQ happy (Fedora has this switched >> on), but since it's a shared IRQ, there is still a chance it could be >> called before enabling it's own interrupts by a different device on the same >> IRQ. > > ooh, yes, you're right, I forgot about that. It's indeed a bug. > >> It makes sense to me why it's disabling the IRQ now, in case another device >> triggers it when it cannot handle it safely. > > What code are you referring to here? There's no disable_irq() in that area? Sorry, I meant freeing the irq (free_irq() in b44_suspend). Thinking about it this should also go in stable too. Cheers James > >> I also tried calling the >> interrupt directly before the free_irq in the suspend function to check that >> it wasn't being done too late, and it didn't fail, so possibly it is the core >> suspension that makes it start failing until it is brought back up properly. > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: James Hogan <james@albanarts.com> Date: Tue, 12 Oct 2010 00:22:12 +0100 > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev) > netif_device_attach(bp->dev); > spin_unlock_irq(&bp->lock); > > + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); > + if (rc) { > + netdev_err(dev, "request_irq failed\n"); > + return rc; > + } > + > b44_enable_ints(bp); > netif_wake_queue(dev); Since you've moved the request_irq() down, you'll need to adjust the error handling so that it undoes side effects made by this function up until this point. F.e. netif_device_attach() has to be undone for one thing. Next, b44_init_rings() allocates memory that you must now free. Etc. etc. etc. This change is not so simple. :-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/b44.c b/drivers/net/b44.c index 1e620e2..dbba981 100644 --- a/drivers/net/b44.c +++ b/drivers/net/b44.c @@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev) if (!netif_running(dev)) return 0; - rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); - if (rc) { - netdev_err(dev, "request_irq failed\n"); - return rc; - } - spin_lock_irq(&bp->lock); b44_init_rings(bp); @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev) netif_device_attach(bp->dev); spin_unlock_irq(&bp->lock); + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev); + if (rc) { + netdev_err(dev, "request_irq failed\n"); + return rc; + } + b44_enable_ints(bp); netif_wake_queue(dev);
This driver was hanging on resume because it was requesting a shared irq that it wasn't ready to immediately handle, which was tested in request_irq because of the CONFIG_DEBUG_SHIRQ config option. The interrupt handler tried to read the interrupt status register but for some reason it hung the system. The request_irq is now moved a bit later after resetting the hardware which seems to fix it. Signed-off-by: James Hogan <james@albanarts.com> --- drivers/net/b44.c | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-)