diff mbox

b44: fix resume, request_irq after hw reset

Message ID 201010120022.13171.james@albanarts.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

James Hogan Oct. 11, 2010, 11:22 p.m. UTC
This driver was hanging on resume because it was requesting a shared irq
that it wasn't ready to immediately handle, which was tested in
request_irq because of the CONFIG_DEBUG_SHIRQ config option. The
interrupt handler tried to read the interrupt status register but for
some reason it hung the system.

The request_irq is now moved a bit later after resetting the hardware
which seems to fix it.

Signed-off-by: James Hogan <james@albanarts.com>
---
 drivers/net/b44.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

Comments

Andrew Morton Oct. 11, 2010, 11:34 p.m. UTC | #1
On Tue, 12 Oct 2010 00:22:12 +0100
James Hogan <james@albanarts.com> wrote:

> This driver was hanging on resume because it was requesting a shared irq
> that it wasn't ready to immediately handle, which was tested in
> request_irq because of the CONFIG_DEBUG_SHIRQ config option. The
> interrupt handler tried to read the interrupt status register but for
> some reason it hung the system.
> 
> The request_irq is now moved a bit later after resetting the hardware
> which seems to fix it.
> 
> Signed-off-by: James Hogan <james@albanarts.com>
> ---
>  drivers/net/b44.c |   12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/b44.c b/drivers/net/b44.c
> index 1e620e2..dbba981 100644
> --- a/drivers/net/b44.c
> +++ b/drivers/net/b44.c
> @@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev)
>  	if (!netif_running(dev))
>  		return 0;
>  
> -	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
> -	if (rc) {
> -		netdev_err(dev, "request_irq failed\n");
> -		return rc;
> -	}
> -
>  	spin_lock_irq(&bp->lock);
>  
>  	b44_init_rings(bp);
> @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
>  	netif_device_attach(bp->dev);
>  	spin_unlock_irq(&bp->lock);
>  
> +	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
> +	if (rc) {
> +		netdev_err(dev, "request_irq failed\n");
> +		return rc;
> +	}
> +
>  	b44_enable_ints(bp);
>  	netif_wake_queue(dev);

OK, running the interrupt handler before b44_init_hw() is presumably
the problem here.

The hardware surely won't be generating interrupts until we've run
b44_init_hw() and b44_enable_ints(), so this patch really is only to
keep CONFIG_DEBUG_SHIRQ happy.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Hogan Oct. 12, 2010, 7:08 a.m. UTC | #2
On Tuesday 12 October 2010 00:34:40 Andrew Morton wrote:
> On Tue, 12 Oct 2010 00:22:12 +0100
> 
> James Hogan <james@albanarts.com> wrote:
> > This driver was hanging on resume because it was requesting a shared irq
> > that it wasn't ready to immediately handle, which was tested in
> > request_irq because of the CONFIG_DEBUG_SHIRQ config option. The
> > interrupt handler tried to read the interrupt status register but for
> > some reason it hung the system.
> > 
> > The request_irq is now moved a bit later after resetting the hardware
> > which seems to fix it.
> > 
> > Signed-off-by: James Hogan <james@albanarts.com>
> > ---
> > 
> >  drivers/net/b44.c |   12 ++++++------
> >  1 files changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/net/b44.c b/drivers/net/b44.c
> > index 1e620e2..dbba981 100644
> > --- a/drivers/net/b44.c
> > +++ b/drivers/net/b44.c
> > @@ -2296,12 +2296,6 @@ static int b44_resume(struct ssb_device *sdev)
> > 
> >  	if (!netif_running(dev))
> >  	
> >  		return 0;
> > 
> > -	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, 
dev);
> > -	if (rc) {
> > -		netdev_err(dev, "request_irq failed\n");
> > -		return rc;
> > -	}
> > -
> > 
> >  	spin_lock_irq(&bp->lock);
> >  	
> >  	b44_init_rings(bp);
> > 
> > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
> > 
> >  	netif_device_attach(bp->dev);
> >  	spin_unlock_irq(&bp->lock);
> > 
> > +	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, 
dev);
> > +	if (rc) {
> > +		netdev_err(dev, "request_irq failed\n");
> > +		return rc;
> > +	}
> > +
> > 
> >  	b44_enable_ints(bp);
> >  	netif_wake_queue(dev);
> 
> OK, running the interrupt handler before b44_init_hw() is presumably
> the problem here.
> 
> The hardware surely won't be generating interrupts until we've run
> b44_init_hw() and b44_enable_ints(), so this patch really is only to
> keep CONFIG_DEBUG_SHIRQ happy.

For me it's mainly to keep CONFIG_DEBUG_SHIRQ happy (Fedora has this switched 
on), but since it's a shared IRQ, there is still a chance it could be 
called before enabling it's own interrupts by a different device on the same 
IRQ.

It makes sense to me why it's disabling the IRQ now, in case another device 
triggers it when it cannot handle it safely. I also tried calling the 
interrupt directly before the free_irq in the suspend function to check that 
it wasn't being done too late, and it didn't fail, so possibly it is the core 
suspension that makes it start failing until it is brought back up properly.

Cheers
James
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Morton Oct. 12, 2010, 7:27 a.m. UTC | #3
On Tue, 12 Oct 2010 08:08:05 +0100 James Hogan <james@albanarts.com> wrote:

> > > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
> > > 
> > >  	netif_device_attach(bp->dev);
> > >  	spin_unlock_irq(&bp->lock);
> > > 
> > > +	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, 
> dev);
> > > +	if (rc) {
> > > +		netdev_err(dev, "request_irq failed\n");
> > > +		return rc;
> > > +	}
> > > +
> > > 
> > >  	b44_enable_ints(bp);
> > >  	netif_wake_queue(dev);
> > 
> > OK, running the interrupt handler before b44_init_hw() is presumably
> > the problem here.
> > 
> > The hardware surely won't be generating interrupts until we've run
> > b44_init_hw() and b44_enable_ints(), so this patch really is only to
> > keep CONFIG_DEBUG_SHIRQ happy.
> 
> For me it's mainly to keep CONFIG_DEBUG_SHIRQ happy (Fedora has this switched 
> on), but since it's a shared IRQ, there is still a chance it could be 
> called before enabling it's own interrupts by a different device on the same 
> IRQ.

ooh, yes, you're right, I forgot about that.  It's indeed a bug.

> It makes sense to me why it's disabling the IRQ now, in case another device 
> triggers it when it cannot handle it safely.

What code are you referring to here?  There's no disable_irq() in that area?

> I also tried calling the 
> interrupt directly before the free_irq in the suspend function to check that 
> it wasn't being done too late, and it didn't fail, so possibly it is the core 
> suspension that makes it start failing until it is brought back up properly.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Hogan Oct. 12, 2010, 8:40 a.m. UTC | #4
On 12 October 2010 08:27, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 12 Oct 2010 08:08:05 +0100 James Hogan <james@albanarts.com> wrote:
>
>> > > @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
>> > >
>> > >   netif_device_attach(bp->dev);
>> > >   spin_unlock_irq(&bp->lock);
>> > >
>> > > + rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name,
>> dev);
>> > > + if (rc) {
>> > > +         netdev_err(dev, "request_irq failed\n");
>> > > +         return rc;
>> > > + }
>> > > +
>> > >
>> > >   b44_enable_ints(bp);
>> > >   netif_wake_queue(dev);
>> >
>> > OK, running the interrupt handler before b44_init_hw() is presumably
>> > the problem here.
>> >
>> > The hardware surely won't be generating interrupts until we've run
>> > b44_init_hw() and b44_enable_ints(), so this patch really is only to
>> > keep CONFIG_DEBUG_SHIRQ happy.
>>
>> For me it's mainly to keep CONFIG_DEBUG_SHIRQ happy (Fedora has this switched
>> on), but since it's a shared IRQ, there is still a chance it could be
>> called before enabling it's own interrupts by a different device on the same
>> IRQ.
>
> ooh, yes, you're right, I forgot about that.  It's indeed a bug.
>
>> It makes sense to me why it's disabling the IRQ now, in case another device
>> triggers it when it cannot handle it safely.
>
> What code are you referring to here?  There's no disable_irq() in that area?

Sorry, I meant freeing the irq (free_irq() in b44_suspend).

Thinking about it this should also go in stable too.

Cheers
James

>
>> I also tried calling the
>> interrupt directly before the free_irq in the suspend function to check that
>> it wasn't being done too late, and it didn't fail, so possibly it is the core
>> suspension that makes it start failing until it is brought back up properly.
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 13, 2010, 4:46 p.m. UTC | #5
From: James Hogan <james@albanarts.com>
Date: Tue, 12 Oct 2010 00:22:12 +0100

> @@ -2309,6 +2303,12 @@ static int b44_resume(struct ssb_device *sdev)
>  	netif_device_attach(bp->dev);
>  	spin_unlock_irq(&bp->lock);
>  
> +	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
> +	if (rc) {
> +		netdev_err(dev, "request_irq failed\n");
> +		return rc;
> +	}
> +
>  	b44_enable_ints(bp);
>  	netif_wake_queue(dev);

Since you've moved the request_irq() down, you'll need to adjust
the error handling so that it undoes side effects made by this
function up until this point.

F.e. netif_device_attach() has to be undone for one thing.

Next, b44_init_rings() allocates memory that you must now free.

Etc. etc. etc.

This change is not so simple. :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index 1e620e2..dbba981 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -2296,12 +2296,6 @@  static int b44_resume(struct ssb_device *sdev)
 	if (!netif_running(dev))
 		return 0;
 
-	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
-	if (rc) {
-		netdev_err(dev, "request_irq failed\n");
-		return rc;
-	}
-
 	spin_lock_irq(&bp->lock);
 
 	b44_init_rings(bp);
@@ -2309,6 +2303,12 @@  static int b44_resume(struct ssb_device *sdev)
 	netif_device_attach(bp->dev);
 	spin_unlock_irq(&bp->lock);
 
+	rc = request_irq(dev->irq, b44_interrupt, IRQF_SHARED, dev->name, dev);
+	if (rc) {
+		netdev_err(dev, "request_irq failed\n");
+		return rc;
+	}
+
 	b44_enable_ints(bp);
 	netif_wake_queue(dev);