Patchwork 3c59x: Get rid of "Trying to free already-free IRQ"

login
register
mail settings
Submitter Anton Vorontsov
Date Sept. 24, 2009, 6:31 p.m.
Message ID <20090924183152.GA30254@oksana.dev.rtsoft.ru>
Download mbox | patch
Permalink /patch/34228/
State Accepted
Delegated to: David Miller
Headers show

Comments

Anton Vorontsov - Sept. 24, 2009, 6:31 p.m.
Following trace pops up if we try to suspend with 3c59x ethernet NIC
brought down:

  root@b1:~# ifconfig eth16 down
  root@b1:~# echo mem > /sys/power/state
  ...
  3c59x 0000:00:10.0: suspend
  3c59x 0000:00:10.0: PME# disabled
  Trying to free already-free IRQ 48
  ------------[ cut here ]------------
  Badness at c00554e4 [verbose debug info unavailable]
  NIP: c00554e4 LR: c00554e4 CTR: c019a098
  REGS: c7975c60 TRAP: 0700   Not tainted  (2.6.31-rc4)
  MSR: 00021032 <ME,CE,IR,DR>  CR: 28242422  XER: 20000000
  TASK = c79cb0c0[1746] 'bash' THREAD: c7974000
  ...
  NIP [c00554e4] __free_irq+0x108/0x1b0
  LR [c00554e4] __free_irq+0x108/0x1b0
  Call Trace:
  [c7975d10] [c00554e4] __free_irq+0x108/0x1b0 (unreliable)
  [c7975d30] [c005559c] free_irq+0x10/0x24
  [c7975d40] [c01e21ec] vortex_suspend+0x70/0xc4
  [c7975d60] [c017e584] pci_legacy_suspend+0x58/0x100

This is because the driver manages interrupts without checking for
netif_running().

Though, there are few other issues with suspend/resume in this driver.
The intention of calling free_irq() in suspend() was to avoid any
possible spurious interrupts (see commit 5b039e681b8c5f30aac9cc04385
"3c59x PM fixes"). But,

- On resume, the driver was requesting IRQ just after pci_set_master(),
  but before vortex_up() (which actually resets 3c59x chips).

- Issuing free_irq() on a shared IRQ doesn't guarantee that a buggy
  HW won't trigger spurious interrupts in another driver that
  requested the same interrupt. So, if we want to protect from
  unexpected interrupts, then on suspend we should issue disable_irq(),
  not free_irq().

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---
 drivers/net/3c59x.c |   12 +++---------
 1 files changed, 3 insertions(+), 9 deletions(-)
Rafael J. Wysocki - Sept. 24, 2009, 8:30 p.m.
On Thursday 24 September 2009, Anton Vorontsov wrote:
> Following trace pops up if we try to suspend with 3c59x ethernet NIC
> brought down:

Patch looks good, but IMO it'd be a little effort to convert the driver to
dev_pm_ops while you're at it (please see r8169 for a working example).

Thanks,
Rafael


>   root@b1:~# ifconfig eth16 down
>   root@b1:~# echo mem > /sys/power/state
>   ...
>   3c59x 0000:00:10.0: suspend
>   3c59x 0000:00:10.0: PME# disabled
>   Trying to free already-free IRQ 48
>   ------------[ cut here ]------------
>   Badness at c00554e4 [verbose debug info unavailable]
>   NIP: c00554e4 LR: c00554e4 CTR: c019a098
>   REGS: c7975c60 TRAP: 0700   Not tainted  (2.6.31-rc4)
>   MSR: 00021032 <ME,CE,IR,DR>  CR: 28242422  XER: 20000000
>   TASK = c79cb0c0[1746] 'bash' THREAD: c7974000
>   ...
>   NIP [c00554e4] __free_irq+0x108/0x1b0
>   LR [c00554e4] __free_irq+0x108/0x1b0
>   Call Trace:
>   [c7975d10] [c00554e4] __free_irq+0x108/0x1b0 (unreliable)
>   [c7975d30] [c005559c] free_irq+0x10/0x24
>   [c7975d40] [c01e21ec] vortex_suspend+0x70/0xc4
>   [c7975d60] [c017e584] pci_legacy_suspend+0x58/0x100
> 
> This is because the driver manages interrupts without checking for
> netif_running().
> 
> Though, there are few other issues with suspend/resume in this driver.
> The intention of calling free_irq() in suspend() was to avoid any
> possible spurious interrupts (see commit 5b039e681b8c5f30aac9cc04385
> "3c59x PM fixes"). But,
> 
> - On resume, the driver was requesting IRQ just after pci_set_master(),
>   but before vortex_up() (which actually resets 3c59x chips).
> 
> - Issuing free_irq() on a shared IRQ doesn't guarantee that a buggy
>   HW won't trigger spurious interrupts in another driver that
>   requested the same interrupt. So, if we want to protect from
>   unexpected interrupts, then on suspend we should issue disable_irq(),
>   not free_irq().
> 
> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
> ---
>  drivers/net/3c59x.c |   12 +++---------
>  1 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
> index c34aee9..7cdd4b0 100644
> --- a/drivers/net/3c59x.c
> +++ b/drivers/net/3c59x.c
> @@ -807,10 +807,10 @@ static int vortex_suspend(struct pci_dev *pdev, pm_message_t state)
>  		if (netif_running(dev)) {
>  			netif_device_detach(dev);
>  			vortex_down(dev, 1);
> +			disable_irq(dev->irq);
>  		}
>  		pci_save_state(pdev);
>  		pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
> -		free_irq(dev->irq, dev);
>  		pci_disable_device(pdev);
>  		pci_set_power_state(pdev, pci_choose_state(pdev, state));
>  	}
> @@ -833,18 +833,12 @@ static int vortex_resume(struct pci_dev *pdev)
>  			return err;
>  		}
>  		pci_set_master(pdev);
> -		if (request_irq(dev->irq, vp->full_bus_master_rx ?
> -				&boomerang_interrupt : &vortex_interrupt, IRQF_SHARED, dev->name, dev)) {
> -			pr_warning("%s: Could not reserve IRQ %d\n", dev->name, dev->irq);
> -			pci_disable_device(pdev);
> -			return -EBUSY;
> -		}
>  		if (netif_running(dev)) {
>  			err = vortex_up(dev);
>  			if (err)
>  				return err;
> -			else
> -				netif_device_attach(dev);
> +			enable_irq(dev->irq);
> +			netif_device_attach(dev);
>  		}
>  	}
>  	return 0;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anton Vorontsov - Sept. 24, 2009, 9:30 p.m.
On Thu, Sep 24, 2009 at 10:30:33PM +0200, Rafael J. Wysocki wrote:
> On Thursday 24 September 2009, Anton Vorontsov wrote:
> > Following trace pops up if we try to suspend with 3c59x ethernet NIC
> > brought down:
> 
> Patch looks good, but IMO it'd be a little effort to convert the driver to
> dev_pm_ops while you're at it (please see r8169 for a working example).

I'd like to avoid putting irrelevant stuff into bugfixes.

Apart from delights as bisecting and revert-only-offending-piece,
keeping bugfixes small and self-sufficient helps to back-port the
fixes to stable/distro kernels. Think of not so old kernels that
don't have dev_pm_ops.

Converting this driver (and others that I'm interested in) to
dev_pm_ops is on my todo list though.

Thanks,
David Miller - Sept. 24, 2009, 10:26 p.m.
From: Anton Vorontsov <avorontsov@ru.mvista.com>
Date: Fri, 25 Sep 2009 01:30:39 +0400

> On Thu, Sep 24, 2009 at 10:30:33PM +0200, Rafael J. Wysocki wrote:
>> On Thursday 24 September 2009, Anton Vorontsov wrote:
>> > Following trace pops up if we try to suspend with 3c59x ethernet NIC
>> > brought down:
>> 
>> Patch looks good, but IMO it'd be a little effort to convert the driver to
>> dev_pm_ops while you're at it (please see r8169 for a working example).
> 
> I'd like to avoid putting irrelevant stuff into bugfixes.

Agreed.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Sept. 24, 2009, 10:47 p.m.
From: Anton Vorontsov <avorontsov@ru.mvista.com>
Date: Thu, 24 Sep 2009 22:31:52 +0400

> Following trace pops up if we try to suspend with 3c59x ethernet NIC
> brought down:
 ...
> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>

Applied.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern - Sept. 25, 2009, 4:43 a.m.
On Thu, 24 Sep 2009, Anton Vorontsov wrote:

> Though, there are few other issues with suspend/resume in this driver.
> The intention of calling free_irq() in suspend() was to avoid any
> possible spurious interrupts (see commit 5b039e681b8c5f30aac9cc04385
> "3c59x PM fixes"). But,
> 
> - On resume, the driver was requesting IRQ just after pci_set_master(),
>   but before vortex_up() (which actually resets 3c59x chips).

Shouldn't it be possible to reset the chip (or at least prevent it from 
generating spurious IRQs) during the early-resume phase?

> - Issuing free_irq() on a shared IRQ doesn't guarantee that a buggy
>   HW won't trigger spurious interrupts in another driver that
>   requested the same interrupt. So, if we want to protect from
>   unexpected interrupts, then on suspend we should issue disable_irq(),
>   not free_irq().

What if some other device shares the IRQ and still relies on receiving
interrupts when this code runs?  Won't disable_irq() mess up the other
device?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki - Sept. 25, 2009, 12:32 p.m.
On Friday 25 September 2009, Alan Stern wrote:
> On Thu, 24 Sep 2009, Anton Vorontsov wrote:
> 
> > Though, there are few other issues with suspend/resume in this driver.
> > The intention of calling free_irq() in suspend() was to avoid any
> > possible spurious interrupts (see commit 5b039e681b8c5f30aac9cc04385
> > "3c59x PM fixes"). But,
> > 
> > - On resume, the driver was requesting IRQ just after pci_set_master(),
> >   but before vortex_up() (which actually resets 3c59x chips).
> 
> Shouldn't it be possible to reset the chip (or at least prevent it from 
> generating spurious IRQs) during the early-resume phase?
> 
> > - Issuing free_irq() on a shared IRQ doesn't guarantee that a buggy
> >   HW won't trigger spurious interrupts in another driver that
> >   requested the same interrupt. So, if we want to protect from
> >   unexpected interrupts, then on suspend we should issue disable_irq(),
> >   not free_irq().
> 
> What if some other device shares the IRQ and still relies on receiving
> interrupts when this code runs?  Won't disable_irq() mess up the other
> device?

Ah, I overlooked the disable_irq()/enable_irq() part, which is not really
necessary anyway.

Anton, have you tried without that?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki - Sept. 25, 2009, 12:35 p.m.
On Friday 25 September 2009, David Miller wrote:
> From: Anton Vorontsov <avorontsov@ru.mvista.com>
> Date: Fri, 25 Sep 2009 01:30:39 +0400
> 
> > On Thu, Sep 24, 2009 at 10:30:33PM +0200, Rafael J. Wysocki wrote:
> >> On Thursday 24 September 2009, Anton Vorontsov wrote:
> >> > Following trace pops up if we try to suspend with 3c59x ethernet NIC
> >> > brought down:
> >> 
> >> Patch looks good, but IMO it'd be a little effort to convert the driver to
> >> dev_pm_ops while you're at it (please see r8169 for a working example).
> > 
> > I'd like to avoid putting irrelevant stuff into bugfixes.
> 
> Agreed.

Well, the point is that all of the PCI core stuff the driver does is not
necessary and should better be dropped along with the IRQ thing.

Best,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anton Vorontsov - Sept. 25, 2009, 12:56 p.m.
On Fri, Sep 25, 2009 at 02:32:30PM +0200, Rafael J. Wysocki wrote:
> On Friday 25 September 2009, Alan Stern wrote:
> > On Thu, 24 Sep 2009, Anton Vorontsov wrote:
> > 
> > > Though, there are few other issues with suspend/resume in this driver.
> > > The intention of calling free_irq() in suspend() was to avoid any
> > > possible spurious interrupts (see commit 5b039e681b8c5f30aac9cc04385
> > > "3c59x PM fixes"). But,
> > > 
> > > - On resume, the driver was requesting IRQ just after pci_set_master(),
> > >   but before vortex_up() (which actually resets 3c59x chips).
> > 
> > Shouldn't it be possible to reset the chip (or at least prevent it from 
> > generating spurious IRQs) during the early-resume phase?
> > 
> > > - Issuing free_irq() on a shared IRQ doesn't guarantee that a buggy
> > >   HW won't trigger spurious interrupts in another driver that
> > >   requested the same interrupt. So, if we want to protect from
> > >   unexpected interrupts, then on suspend we should issue disable_irq(),
> > >   not free_irq().
> > 
> > What if some other device shares the IRQ and still relies on receiving
> > interrupts when this code runs?  Won't disable_irq() mess up the other
> > device?
> 
> Ah, I overlooked the disable_irq()/enable_irq() part, which is not really
> necessary anyway.

Well, it is necessary if 3c59x really throws spurious interrupts
upon suspend (i.e. after pci_disable_device(pdev)). My first though
was to just remove free/request_irq stuff, but then I could introduce
a regression if 3c59x really throws unexpected IRQs and 3c59x was
the only user of a PCI IRQ (in that case free_irq() would actually
help).

> Anton, have you tried without that?

Yes, and there wasn't any issues for 3x59x I have. Alan raised a very
good point, and converting to dev_pm_opsas as you've suggested would
solve it in a nice way, since if we use the dev_pm_ops, PCI core
will disable the device in _noirq suspend, after we quiesced the
chip itself.

I'll send another patch that reworks PM stuff in the driver soon.

Thanks,
Rafael J. Wysocki - Sept. 25, 2009, 1:02 p.m.
On Friday 25 September 2009, Anton Vorontsov wrote:
> On Fri, Sep 25, 2009 at 02:32:30PM +0200, Rafael J. Wysocki wrote:
> > On Friday 25 September 2009, Alan Stern wrote:
> > > On Thu, 24 Sep 2009, Anton Vorontsov wrote:
> > > 
> > > > Though, there are few other issues with suspend/resume in this driver.
> > > > The intention of calling free_irq() in suspend() was to avoid any
> > > > possible spurious interrupts (see commit 5b039e681b8c5f30aac9cc04385
> > > > "3c59x PM fixes"). But,
> > > > 
> > > > - On resume, the driver was requesting IRQ just after pci_set_master(),
> > > >   but before vortex_up() (which actually resets 3c59x chips).
> > > 
> > > Shouldn't it be possible to reset the chip (or at least prevent it from 
> > > generating spurious IRQs) during the early-resume phase?
> > > 
> > > > - Issuing free_irq() on a shared IRQ doesn't guarantee that a buggy
> > > >   HW won't trigger spurious interrupts in another driver that
> > > >   requested the same interrupt. So, if we want to protect from
> > > >   unexpected interrupts, then on suspend we should issue disable_irq(),
> > > >   not free_irq().
> > > 
> > > What if some other device shares the IRQ and still relies on receiving
> > > interrupts when this code runs?  Won't disable_irq() mess up the other
> > > device?
> > 
> > Ah, I overlooked the disable_irq()/enable_irq() part, which is not really
> > necessary anyway.
> 
> Well, it is necessary if 3c59x really throws spurious interrupts
> upon suspend (i.e. after pci_disable_device(pdev)). My first though
> was to just remove free/request_irq stuff, but then I could introduce
> a regression if 3c59x really throws unexpected IRQs and 3c59x was
> the only user of a PCI IRQ (in that case free_irq() would actually
> help).
> 
> > Anton, have you tried without that?
> 
> Yes, and there wasn't any issues for 3x59x I have. Alan raised a very
> good point, and converting to dev_pm_opsas as you've suggested would
> solve it in a nice way, since if we use the dev_pm_ops, PCI core
> will disable the device in _noirq suspend, after we quiesced the
> chip itself.

That's exactly why I suggested to do that. :-)

> I'll send another patch that reworks PM stuff in the driver soon.

Thanks a lot for taking care of this!

Best,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
index c34aee9..7cdd4b0 100644
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -807,10 +807,10 @@  static int vortex_suspend(struct pci_dev *pdev, pm_message_t state)
 		if (netif_running(dev)) {
 			netif_device_detach(dev);
 			vortex_down(dev, 1);
+			disable_irq(dev->irq);
 		}
 		pci_save_state(pdev);
 		pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
-		free_irq(dev->irq, dev);
 		pci_disable_device(pdev);
 		pci_set_power_state(pdev, pci_choose_state(pdev, state));
 	}
@@ -833,18 +833,12 @@  static int vortex_resume(struct pci_dev *pdev)
 			return err;
 		}
 		pci_set_master(pdev);
-		if (request_irq(dev->irq, vp->full_bus_master_rx ?
-				&boomerang_interrupt : &vortex_interrupt, IRQF_SHARED, dev->name, dev)) {
-			pr_warning("%s: Could not reserve IRQ %d\n", dev->name, dev->irq);
-			pci_disable_device(pdev);
-			return -EBUSY;
-		}
 		if (netif_running(dev)) {
 			err = vortex_up(dev);
 			if (err)
 				return err;
-			else
-				netif_device_attach(dev);
+			enable_irq(dev->irq);
+			netif_device_attach(dev);
 		}
 	}
 	return 0;