e1000e: Undo e1000e_pm_freeze if __e1000_shutdown fails

Submitted by Chris Wilson on Feb. 17, 2017, 12:41 p.m.

Details

Message ID 20170217124148.17130-1-chris@chris-wilson.co.uk
State Changes Requested
Headers show

Commit Message

Chris Wilson Feb. 17, 2017, 12:41 p.m.
An error during suspend (e100e_pm_suspend),

[  429.994338] ACPI : EC: event blocked
[  429.994633] e1000e: EEE TX LPI TIMER: 00000011
[  430.955451] pci_pm_suspend(): e1000e_pm_suspend+0x0/0x30 [e1000e] returns -2
[  430.955454] dpm_run_callback(): pci_pm_suspend+0x0/0x140 returns -2
[  430.955458] PM: Device 0000:00:19.0 failed to suspend async: error -2
[  430.955581] PM: Some devices failed to suspend, or early wake event detected
[  430.957709] ACPI : EC: event unblocked

lead to complete failure:

[  432.585002] ------------[ cut here ]------------
[  432.585013] WARNING: CPU: 3 PID: 8372 at kernel/irq/manage.c:1478 __free_irq+0x9f/0x280
[  432.585015] Trying to free already-free IRQ 20
[  432.585016] Modules linked in: cdc_ncm usbnet x86_pkg_temp_thermal intel_powerclamp coretemp mii crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep lpc_ich snd_hda_core snd_pcm mei_me mei sdhci_pci sdhci i915 mmc_core e1000e ptp pps_core prime_numbers
[  432.585042] CPU: 3 PID: 8372 Comm: kworker/u16:40 Tainted: G     U          4.10.0-rc8-CI-Patchwork_3870+ #1
[  432.585044] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012
[  432.585050] Workqueue: events_unbound async_run_entry_fn
[  432.585051] Call Trace:
[  432.585058]  dump_stack+0x67/0x92
[  432.585062]  __warn+0xc6/0xe0
[  432.585065]  warn_slowpath_fmt+0x4a/0x50
[  432.585070]  ? _raw_spin_lock_irqsave+0x49/0x60
[  432.585072]  __free_irq+0x9f/0x280
[  432.585075]  free_irq+0x34/0x80
[  432.585089]  e1000_free_irq+0x65/0x70 [e1000e]
[  432.585098]  e1000e_pm_freeze+0x7a/0xb0 [e1000e]
[  432.585106]  e1000e_pm_suspend+0x21/0x30 [e1000e]
[  432.585113]  pci_pm_suspend+0x71/0x140
[  432.585118]  dpm_run_callback+0x6f/0x330
[  432.585122]  ? pci_pm_freeze+0xe0/0xe0
[  432.585125]  __device_suspend+0xea/0x330
[  432.585128]  async_suspend+0x1a/0x90
[  432.585132]  async_run_entry_fn+0x34/0x160
[  432.585137]  process_one_work+0x1f4/0x6d0
[  432.585140]  ? process_one_work+0x16e/0x6d0
[  432.585143]  worker_thread+0x49/0x4a0
[  432.585145]  kthread+0x107/0x140
[  432.585148]  ? process_one_work+0x6d0/0x6d0
[  432.585150]  ? kthread_create_on_node+0x40/0x40
[  432.585154]  ret_from_fork+0x2e/0x40
[  432.585156] ---[ end trace 6712df7f8c4b9124 ]---

The unwind failures stems from commit 2800209994f8 ("e1000e: Refactor PM
flows"), but it may be a later patch that introduced the non-recoverable
behaviour.

Fixes: 2800209994f8 ("e1000e: Refactor PM flows")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Dave Ertman <davidx.m.ertman@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Ruinskiy, Dima Feb. 20, 2017, 12:43 p.m.
>-----Original Message-----

>Fixes: 2800209994f8 ("e1000e: Refactor PM flows")
>Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847
>Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>Cc: Dave Ertman <davidx.m.ertman@intel.com>
>Cc: Bruce Allan <bruce.w.allan@intel.com>
>Cc: intel-wired-lan@lists.osuosl.org
>Cc: netdev@vger.kernel.org
>---
> drivers/net/ethernet/intel/e1000e/netdev.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>b/drivers/net/ethernet/intel/e1000e/netdev.c
>index eccf1da9356b..429a5210230d 100644
>--- a/drivers/net/ethernet/intel/e1000e/netdev.c
>+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>@@ -6615,12 +6615,19 @@ static int e1000e_pm_thaw(struct device *dev)
>static int e1000e_pm_suspend(struct device *dev)  {
> 	struct pci_dev *pdev = to_pci_dev(dev);
>+	int rc;
>
> 	e1000e_flush_lpic(pdev);
>
> 	e1000e_pm_freeze(dev);
>
>-	return __e1000_shutdown(pdev, false);
>+	rc = __e1000_shutdown(pdev, false);
>+	if (rc) {
>+		e1000e_pm_thaw(dev);
>+		return rc;
>+	}
>+
>+	return 0;
> }
>
> static int e1000e_pm_resume(struct device *dev)
>--

Looks reasonable. However, can't you get the same result with fewer code lines?
-	return __e1000_shutdown(pdev, false);
+	rc = __e1000_shutdown(pdev, false);
+	if (rc)
+		e1000e_pm_thaw(dev);
+
+	return rc;
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
Chris Wilson Feb. 20, 2017, 1:02 p.m.
On Mon, Feb 20, 2017 at 12:43:11PM +0000, Ruinskiy, Dima wrote:
> >-----Original Message-----
> 
> >Fixes: 2800209994f8 ("e1000e: Refactor PM flows")
> >Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> >Cc: Dave Ertman <davidx.m.ertman@intel.com>
> >Cc: Bruce Allan <bruce.w.allan@intel.com>
> >Cc: intel-wired-lan@lists.osuosl.org
> >Cc: netdev@vger.kernel.org
> >---
> > drivers/net/ethernet/intel/e1000e/netdev.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
> >b/drivers/net/ethernet/intel/e1000e/netdev.c
> >index eccf1da9356b..429a5210230d 100644
> >--- a/drivers/net/ethernet/intel/e1000e/netdev.c
> >+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> >@@ -6615,12 +6615,19 @@ static int e1000e_pm_thaw(struct device *dev)
> >static int e1000e_pm_suspend(struct device *dev)  {
> > 	struct pci_dev *pdev = to_pci_dev(dev);
> >+	int rc;
> >
> > 	e1000e_flush_lpic(pdev);
> >
> > 	e1000e_pm_freeze(dev);
> >
> >-	return __e1000_shutdown(pdev, false);
> >+	rc = __e1000_shutdown(pdev, false);
> >+	if (rc) {
> >+		e1000e_pm_thaw(dev);
> >+		return rc;
> >+	}
> >+
> >+	return 0;
> > }
> >
> > static int e1000e_pm_resume(struct device *dev)
> >--
> 
> Looks reasonable. However, can't you get the same result with fewer code lines?
> -	return __e1000_shutdown(pdev, false);
> +	rc = __e1000_shutdown(pdev, false);
> +	if (rc)
> +		e1000e_pm_thaw(dev);
> +
> +	return rc;

You are welcome to use whatever style is consistent with the rest of the
driver. :)
-Chris
Sasha Neftin Feb. 21, 2017, 3:11 p.m.
On 2/20/2017 15:02, Chris Wilson wrote:
> Chris Wilson, Intel Open Source Technology Centre

Chris,

Please, check patch with fewer code lines. If short patch good and works 
well for you, please, submit it and we will check then on our side.

Patch hide | download patch | download mbox

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index eccf1da9356b..429a5210230d 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6615,12 +6615,19 @@  static int e1000e_pm_thaw(struct device *dev)
 static int e1000e_pm_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
+	int rc;
 
 	e1000e_flush_lpic(pdev);
 
 	e1000e_pm_freeze(dev);
 
-	return __e1000_shutdown(pdev, false);
+	rc = __e1000_shutdown(pdev, false);
+	if (rc) {
+		e1000e_pm_thaw(dev);
+		return rc;
+	}
+
+	return 0;
 }
 
 static int e1000e_pm_resume(struct device *dev)