PCI / ACPI / PM: Resume all bridges on suspend-to-RAM

Message ID 6401388.2t0qD3iTOr@aspire.rjw.lan
State Not Applicable

Commit Message

Rafael J. Wysocki Aug. 16, 2018, 10:56 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Commit 26112ddc254c (PCI / ACPI / PM: Resume bridges w/o drivers on
suspend-to-RAM) attempted to fix a functional regression resulting
from commit c62ec4610c40 (PM / core: Fix direct_complete handling
for devices with no callbacks) by resuming PCI bridges without
drivers (that is, "parallel PCI" ones) during system-wide suspend if
the target system state is not ACPI S0 (working state).

That turns out insufficient, however, as it is reported that, at
least in one case, the platform firmware gets confused if a PCIe
root port is suspended before entering the ACPI S3 sleep state.

For this reason, drop the driver check from acpi_pci_need_resume()
and resume all bridges (including PCIe ports with drivers) during
system-wide suspend if the target system state is not ACPI S0.

[If the target system state is ACPI S0, it means suspend-to-idle
 and the platform firmware is not going to be invoked to actually
 suspend the system, so there is no need to resume the bridges in
 that case.]

Fixes: c62ec4610c40 (PM / core: Fix direct_complete handling for devices with no callbacks)
Reported-by: teika kazura <teika@gmx.com>
Tested-by: teika kazura <teika@gmx.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+: 26112ddc254c (PCI / ACPI / PM: Resume bridges ...)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pci-acpi.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Mika Westerberg Aug. 16, 2018, 7:15 p.m. UTC | #1
On Thu, Aug 16, 2018 at 12:56:46PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Commit 26112ddc254c (PCI / ACPI / PM: Resume bridges w/o drivers on
> suspend-to-RAM) attempted to fix a functional regression resulting
> from commit c62ec4610c40 (PM / core: Fix direct_complete handling
> for devices with no callbacks) by resuming PCI bridges without
> drivers (that is, "parallel PCI" ones) during system-wide suspend if
> the target system state is not ACPI S0 (working state).
> 
> That turns out insufficient, however, as it is reported that, at
> least in one case, the platform firmware gets confused if a PCIe
> root port is suspended before entering the ACPI S3 sleep state.
> 
> For this reason, drop the driver check from acpi_pci_need_resume()
> and resume all bridges (including PCIe ports with drivers) during
> system-wide suspend if the target system state is not ACPI S0.
> 
> [If the target system state is ACPI S0, it means suspend-to-idle
>  and the platform firmware is not going to be invoked to actually
>  suspend the system, so there is no need to resume the bridges in
>  that case.]
> 
> Fixes: c62ec4610c40 (PM / core: Fix direct_complete handling for devices with no callbacks)
> Reported-by: teika kazura <teika@gmx.com>
> Tested-by: teika kazura <teika@gmx.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+: 26112ddc254c (PCI / ACPI / PM: Resume bridges ...)
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Bjorn Helgaas Aug. 16, 2018, 8:21 p.m. UTC | #2
On Thu, Aug 16, 2018 at 12:56:46PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Commit 26112ddc254c (PCI / ACPI / PM: Resume bridges w/o drivers on
> suspend-to-RAM) attempted to fix a functional regression resulting
> from commit c62ec4610c40 (PM / core: Fix direct_complete handling
> for devices with no callbacks) by resuming PCI bridges without
> drivers (that is, "parallel PCI" ones) during system-wide suspend if
> the target system state is not ACPI S0 (working state).
> 
> That turns out insufficient, however, as it is reported that, at
> least in one case, the platform firmware gets confused if a PCIe
> root port is suspended before entering the ACPI S3 sleep state.
> 
> For this reason, drop the driver check from acpi_pci_need_resume()
> and resume all bridges (including PCIe ports with drivers) during
> system-wide suspend if the target system state is not ACPI S0.
> 
> [If the target system state is ACPI S0, it means suspend-to-idle
>  and the platform firmware is not going to be invoked to actually
>  suspend the system, so there is no need to resume the bridges in
>  that case.]
> 
> Fixes: c62ec4610c40 (PM / core: Fix direct_complete handling for devices with no callbacks)
> Reported-by: teika kazura <teika@gmx.com>
> Tested-by: teika kazura <teika@gmx.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+: 26112ddc254c (PCI / ACPI / PM: Resume bridges ...)
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thanks for doing this.  I don't like dependencies on the PCIe
PM/AER/hotplug/etc features being implemented as a "driver" because
they could be implemented in the PCI core directly.

> ---
>  drivers/pci/pci-acpi.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -632,13 +632,11 @@ static bool acpi_pci_need_resume(struct
>  	/*
>  	 * In some cases (eg. Samsung 305V4A) leaving a bridge in suspend over
>  	 * system-wide suspend/resume confuses the platform firmware, so avoid
> -	 * doing that, unless the bridge has a driver that should take care of
> -	 * the PM handling.  According to Section 16.1.6 of ACPI 6.2, endpoint
> +	 * doing that.  According to Section 16.1.6 of ACPI 6.2, endpoint
>  	 * devices are expected to be in D3 before invoking the S3 entry path
>  	 * from the firmware, so they should not be affected by this issue.
>  	 */
> -	if (pci_is_bridge(dev) && !dev->driver &&
> -	    acpi_target_system_state() != ACPI_STATE_S0)
> +	if (pci_is_bridge(dev) && acpi_target_system_state() != ACPI_STATE_S0)
>  		return true;
>  
>  	if (!adev || !acpi_device_power_manageable(adev))
>
Teika Kazura Aug. 17, 2018, 5:43 a.m. UTC | #3
For the record, a note on the accuracy of the patch description.

The patch mentions the regression caused by commit c62ec4610c40, but that commit is not the cause of the bug I reported (https://bugzilla.kernel.org/show_bug.cgi?id=200675); I reverted c62ec4610c40 on linux-4.17.13, and the bug remained.

# Some details: my bug was introduced by commit (i) 877b3729ca0 on Jan 3. Commit (ii) c62ec4610c40 landed on May 22. Commit (iii) 26112ddc254c, on Jun 30, fixes one problem caused by c62ec4610c40. The present patch modifies the code from commit (iii), so it can be seen as completing commit (iii). At the same time, it fixes my bug, too.

This suggests that the present patch may also fix other, unknown PM problems; earlier kernels had some loose end(s). This patch puts the kernel in a better position.

I'm a lay Linux user, and don't know if this post helps. If it does, it may be worth mentioning it in the above bugzilla entry.

Dziękuję (thanks), kernel developers. Best regards,
Teika (Teika kazura)

Rafael J. Wysocki Aug. 17, 2018, 7:50 a.m. UTC | #4
On Fri, Aug 17, 2018 at 7:45 AM Teika Kazura <teika@gmx.com> wrote:
>
> For the record, a note on the accuracy of the patch description.
>
> The patch mentions the regression caused by commit c62ec4610c40, but that commit is not the cause of the bug I reported
> (https://bugzilla.kernel.org/show_bug.cgi?id=200675); I reverted c62ec4610c40 on linux-4.17.13, and the bug remained.
>
> # Some details: my bug was introduced by commit (i) 877b3729ca0 on Jan 3. Commit (ii) c62ec4610c40 landed on May 22. Commit (iii) 26112ddc254c,
> on Jun 30, fixes one problem caused by c62ec4610c40. The present patch modifies the code from commit (iii), so it can be seen as completing
> commit (iii). At the same time, it fixes my bug, too.

You are right, commit 877b3729ca0 introduced the issue for you, but it
did that by exposing the same functional problem in the firmware that
was previously addressed by commit 26112ddc254c in a different case.

> This suggests that the present patch may also fix other, unknown PM problems; earlier kernels had some loose end(s). This patch puts the kernel in a better position.
>
> I'm a lay Linux user, and don't know if this post helps. If it does, it may be worth mentioning it in the above bugzilla entry.

Yes, it does, thanks!

I have updated the tags and the commit log of this patch according to
the information above.

Cheers,
Rafael

Patch

Index: linux-pm/drivers/pci/pci-acpi.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-acpi.c
+++ linux-pm/drivers/pci/pci-acpi.c
@@ -632,13 +632,11 @@  static bool acpi_pci_need_resume(struct
 	/*
 	 * In some cases (eg. Samsung 305V4A) leaving a bridge in suspend over
 	 * system-wide suspend/resume confuses the platform firmware, so avoid
-	 * doing that, unless the bridge has a driver that should take care of
-	 * the PM handling.  According to Section 16.1.6 of ACPI 6.2, endpoint
+	 * doing that.  According to Section 16.1.6 of ACPI 6.2, endpoint
 	 * devices are expected to be in D3 before invoking the S3 entry path
 	 * from the firmware, so they should not be affected by this issue.
 	 */
-	if (pci_is_bridge(dev) && !dev->driver &&
-	    acpi_target_system_state() != ACPI_STATE_S0)
+	if (pci_is_bridge(dev) && acpi_target_system_state() != ACPI_STATE_S0)
 		return true;
 
 	if (!adev || !acpi_device_power_manageable(adev))
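
To make the effect of the one-line condition change concrete, here is a minimal standalone model of the bridge check before and after the patch. It is only a sketch and not kernel code: struct model_dev, the MODEL_S0/MODEL_S3 constants and all function names are hypothetical stand-ins for pci_is_bridge(dev), dev->driver and acpi_target_system_state(). It models only the early "return true" condition touched by the patch, not the rest of acpi_pci_need_resume().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for ACPI_STATE_S0 / ACPI_STATE_S3. */
enum model_state { MODEL_S0 = 0, MODEL_S3 = 3 };

/* Hypothetical model of the two device properties the check looks at. */
struct model_dev {
	bool is_bridge;		/* models pci_is_bridge(dev) */
	bool has_driver;	/* models dev->driver != NULL */
};

/* Bridge check before the patch: bridges with a driver are skipped. */
static bool bridge_check_old(const struct model_dev *dev, enum model_state target)
{
	return dev->is_bridge && !dev->has_driver && target != MODEL_S0;
}

/* Bridge check after the patch: the driver no longer matters. */
static bool bridge_check_new(const struct model_dev *dev, enum model_state target)
{
	return dev->is_bridge && target != MODEL_S0;
}

/* Walk through the cases discussed in the commit message. */
void model_check_examples(void)
{
	const struct model_dev root_port = { .is_bridge = true, .has_driver = true };
	const struct model_dev bare_bridge = { .is_bridge = true, .has_driver = false };
	const struct model_dev endpoint = { .is_bridge = false, .has_driver = true };

	/* PCIe root port with a driver, suspend-to-RAM: only the new check fires. */
	assert(!bridge_check_old(&root_port, MODEL_S3));
	assert(bridge_check_new(&root_port, MODEL_S3));

	/* Bridge without a driver, suspend-to-RAM: both checks fire. */
	assert(bridge_check_old(&bare_bridge, MODEL_S3));
	assert(bridge_check_new(&bare_bridge, MODEL_S3));

	/* Suspend-to-idle (target state S0): neither check fires. */
	assert(!bridge_check_old(&bare_bridge, MODEL_S0));
	assert(!bridge_check_new(&root_port, MODEL_S0));

	/* Endpoints are not covered by this check at all. */
	assert(!bridge_check_new(&endpoint, MODEL_S3));
}
```

The only case in which the two versions differ is a bridge that has a driver bound while the target state is not S0, which matches the PCIe root port case described in the commit message.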