diff mbox series

PCI/ASPM: Add back L1 PM Substate save and restore

Message ID 20230627062442.54008-1-mika.westerberg@linux.intel.com
State New
Headers show
Series PCI/ASPM: Add back L1 PM Substate save and restore | expand

Commit Message

Mika Westerberg June 27, 2023, 6:24 a.m. UTC
Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
due to a regression that caused resume from suspend to fail on certain
systems. However, we never added this capability back and this is now
causing systems fail to enter low power CPU states, drawing more power
from the battery.

The original revert mentioned that we restore L1 PM substate configuration
even though ASPM L1 may already be enabled. This is due the fact that
the pci_restore_aspm_l1ss_state() was called before
pci_restore_pcie_state().

Try to enable this functionality again by:

 1) Moving the restore happen after PCIe capability is restored
    following PCIe r6.0, sec 5.5.4.
 2) Following the PCIe spec more closely to restore L1 PM substate
    configuration (program enable bits last).
 3) Adding denylist that skips the restoring on the ASUS system where
    the original regression happened, just in case.

Reported-by: Koba Ko <koba.ko@canonical.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321
Cc: Tasev Nikola <tasev.stefanoska@skynet.be>
Cc: Mark Enriquez <enriquezmark36@gmail.com>
Cc: Thomas Witt <kernel@witt.link>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
Hi,

There are several systems out there that are not entering lower package
C-states after the first suspend/resume cycle because we do not restore
L1SS configuration as can be seen in the linked bugzilla entry. Although
they are functioning fine they still empty the battery faster than the
user may expect.

Apologies sending this out during merge window but I will be starting my
vacation next week so wanted to get this out before, and possibly
incorporate any changes during the week still.

 drivers/pci/pci.c       |  7 ++++
 drivers/pci/pci.h       |  4 +++
 drivers/pci/pcie/aspm.c | 76 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+)

Comments

Thomas Witt June 27, 2023, 9:53 a.m. UTC | #1
On 27/06/2023 08:24, Mika Westerberg wrote:
> Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
> due to a regression that caused resume from suspend to fail on certain
> systems. However, we never added this capability back and this is now
> causing systems fail to enter low power CPU states, drawing more power
> from the battery.

Hello Mika,

I am sorry, but your patch (applied on top of master) triggers the exact 
same behaviour I described in 
<https://bugzilla.kernel.org/show_bug.cgi?id=216877> (nvme and wifi 
become unavailable during suspend/resume)

Best Regards,
Thomas
Mika Westerberg June 27, 2023, 10:04 a.m. UTC | #2
Hi Thomas,

On Tue, Jun 27, 2023 at 11:53:33AM +0200, Thomas Witt wrote:
> On 27/06/2023 08:24, Mika Westerberg wrote:
> > Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> > for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
> > due to a regression that caused resume from suspend to fail on certain
> > systems. However, we never added this capability back and this is now
> > causing systems fail to enter low power CPU states, drawing more power
> > from the battery.
> 
> Hello Mika,
> 
> I am sorry, but your patch (applied on top of master) triggers the exact
> same behaviour I described in
> <https://bugzilla.kernel.org/show_bug.cgi?id=216877> (nvme and wifi become
> unavailable during suspend/resume)

Thanks for testing! Can you provide the output of dmidecode from that
system? We can add it to the denylist as well to avoid the issue on your
system.
Bjorn Helgaas June 27, 2023, 8:41 p.m. UTC | #3
On Tue, Jun 27, 2023 at 01:04:47PM +0300, Mika Westerberg wrote:
> On Tue, Jun 27, 2023 at 11:53:33AM +0200, Thomas Witt wrote:
> > On 27/06/2023 08:24, Mika Westerberg wrote:
> > > Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> > > for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
> > > due to a regression that caused resume from suspend to fail on certain
> > > systems. However, we never added this capability back and this is now
> > > causing systems fail to enter low power CPU states, drawing more power
> > > from the battery.
> > 
> > Hello Mika,
> > 
> > I am sorry, but your patch (applied on top of master) triggers the exact
> > same behaviour I described in
> > <https://bugzilla.kernel.org/show_bug.cgi?id=216877> (nvme and wifi become
> > unavailable during suspend/resume)
> 
> Thanks for testing! Can you provide the output of dmidecode from that
> system? We can add it to the denylist as well to avoid the issue on your
> system.

To me this says we don't completely understand the mechanism of the
failure.  If BIOS made L1SS work initially, Linux should be able to
make it work again after suspend/resume.

If we can identify an actual hardware or firmware defect, it makes
good sense to add a quirk or denylist.  But I'll push back a little if
it's just "there's some problem we don't understand on this system, so
avoid it."

Bjorn
Mika Westerberg June 28, 2023, 6:46 a.m. UTC | #4
Hi Bjorn,

On Tue, Jun 27, 2023 at 03:41:24PM -0500, Bjorn Helgaas wrote:
> On Tue, Jun 27, 2023 at 01:04:47PM +0300, Mika Westerberg wrote:
> > On Tue, Jun 27, 2023 at 11:53:33AM +0200, Thomas Witt wrote:
> > > On 27/06/2023 08:24, Mika Westerberg wrote:
> > > > Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> > > > for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
> > > > due to a regression that caused resume from suspend to fail on certain
> > > > systems. However, we never added this capability back and this is now
> > > > causing systems fail to enter low power CPU states, drawing more power
> > > > from the battery.
> > > 
> > > Hello Mika,
> > > 
> > > I am sorry, but your patch (applied on top of master) triggers the exact
> > > same behaviour I described in
> > > <https://bugzilla.kernel.org/show_bug.cgi?id=216877> (nvme and wifi become
> > > unavailable during suspend/resume)
> > 
> > Thanks for testing! Can you provide the output of dmidecode from that
> > system? We can add it to the denylist as well to avoid the issue on your
> > system.
> 
> To me this says we don't completely understand the mechanism of the
> failure.  If BIOS made L1SS work initially, Linux should be able to
> make it work again after suspend/resume.
> 
> If we can identify an actual hardware or firmware defect, it makes
> good sense to add a quirk or denylist.  But I'll push back a little if
> it's just "there's some problem we don't understand on this system, so
> avoid it."

Fair enough.

I've looked at the Thomas' system dmesg that he attached to the bugzilla
and he has mem_sleep_default=deep in the kernel command line. There is
no such option in the mainline kernel but I suppose this makes systemd
(or some initscript) to write "deep" to /sys/power/mem_sleep, thus
forcing S3 instead of the default the ACPI tables suggest, which
probably is S0ix (Intel low power state which does not involve BIOS).

So when forcing S3 this means the system suspend and resume is done
through the BIOS who is supposed to enable wakes and program the devices
accordingly during and after S3 before the OS is given back the control,
it might very well be that it already did something for the L1
configuration here before Linux (with this patch) reconfigured them and
this is causing the problem.

@Thomas, is there any particular reason you have this option in the
command line? There is possibility that S3 is not even fully validated
if the system advertises S0 low power sleep instead.
Thomas Witt June 28, 2023, 10:24 a.m. UTC | #5
On 28/06/2023 08:46, Mika Westerberg wrote:
> @Thomas, is there any particular reason you have this option in the
> command line? There is possibility that S3 is not even fully validated
> if the system advertises S0 low power sleep instead.

In fact, there is: Entering suspend-to-ram without setting 
/sys/power/mem_sleep to "deep", my laptop consumes about the same power 
as it would idling online. The manufacturer suggests setting that 
commandline parameter:

<https://www.tuxedocomputers.com/en/Infos/Help-Support/Instructions/Fine-tuning-of-power-management-with-suspend-standby.tuxedo#>

I just retested your patch with setting mem_sleep to "s2idle", and it no 
longer triggers the loss of PCI devices. I guess that could be the 
indicator that Björn asked for.

I attached the output of dmidecode to the bugzilla entry mentioned 
above: <https://bugzilla.kernel.org/attachment.cgi?id=304494>

Best regards,
Thomas
Mika Westerberg June 28, 2023, 10:59 a.m. UTC | #6
Hi Thomas,

On Wed, Jun 28, 2023 at 12:24:06PM +0200, Thomas Witt wrote:
> On 28/06/2023 08:46, Mika Westerberg wrote:
> > @Thomas, is there any particular reason you have this option in the
> > command line? There is possibility that S3 is not even fully validated
> > if the system advertises S0 low power sleep instead.
> 
> In fact, there is: Entering suspend-to-ram without setting
> /sys/power/mem_sleep to "deep", my laptop consumes about the same power as
> it would idling online. The manufacturer suggests setting that commandline
> parameter:
> 
> <https://www.tuxedocomputers.com/en/Infos/Help-Support/Instructions/Fine-tuning-of-power-management-with-suspend-standby.tuxedo#>

Thanks for the clarification.

> I just retested your patch with setting mem_sleep to "s2idle", and it no
> longer triggers the loss of PCI devices. I guess that could be the indicator
> that Björn asked for.

I wonder if the patch actually helps here now because the reason we want
to add it back is that it allows the CPU to enter lower power states and
thus reducing the power consumption in S2idle too. Do you observe that
when you have the patch applied?

> I attached the output of dmidecode to the bugzilla entry mentioned above:
> <https://bugzilla.kernel.org/attachment.cgi?id=304494>

Thanks!
Mario Limonciello June 28, 2023, 12:16 p.m. UTC | #7
On 6/28/23 01:46, Mika Westerberg wrote:
> Hi Bjorn,
> 
> On Tue, Jun 27, 2023 at 03:41:24PM -0500, Bjorn Helgaas wrote:
>> On Tue, Jun 27, 2023 at 01:04:47PM +0300, Mika Westerberg wrote:
>>> On Tue, Jun 27, 2023 at 11:53:33AM +0200, Thomas Witt wrote:
>>>> On 27/06/2023 08:24, Mika Westerberg wrote:
>>>>> Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
>>>>> for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
>>>>> due to a regression that caused resume from suspend to fail on certain
>>>>> systems. However, we never added this capability back and this is now
>>>>> causing systems fail to enter low power CPU states, drawing more power
>>>>> from the battery.
>>>>
>>>> Hello Mika,
>>>>
>>>> I am sorry, but your patch (applied on top of master) triggers the exact
>>>> same behaviour I described in
>>>> <https://bugzilla.kernel.org/show_bug.cgi?id=216877> (nvme and wifi become
>>>> unavailable during suspend/resume)
>>>
>>> Thanks for testing! Can you provide the output of dmidecode from that
>>> system? We can add it to the denylist as well to avoid the issue on your
>>> system.
>>
>> To me this says we don't completely understand the mechanism of the
>> failure.  If BIOS made L1SS work initially, Linux should be able to
>> make it work again after suspend/resume.
>>
>> If we can identify an actual hardware or firmware defect, it makes
>> good sense to add a quirk or denylist.  But I'll push back a little if
>> it's just "there's some problem we don't understand on this system, so
>> avoid it."
> 
> Fair enough.
> 
> I've looked at the Thomas' system dmesg that he attached to the bugzilla
> and he has mem_sleep_default=deep in the kernel command line. There is
> no such option in the mainline kernel but I suppose this makes systemd
> (or some initscript) to write "deep" to /sys/power/mem_sleep, thus
> forcing S3 instead of the default the ACPI tables suggest, which
> probably is S0ix (Intel low power state which does not involve BIOS).

JFYI actually it is a mainline kernel parameter.

Reference the documentation here:
https://www.kernel.org/doc/html/v6.4/admin-guide/kernel-parameters.html?highlight=mem_sleep_default

> 
> So when forcing S3 this means the system suspend and resume is done
> through the BIOS who is supposed to enable wakes and program the devices
> accordingly during and after S3 before the OS is given back the control,
> it might very well be that it already did something for the L1
> configuration here before Linux (with this patch) reconfigured them and
> this is causing the problem.
> 
> @Thomas, is there any particular reason you have this option in the
> command line? There is possibility that S3 is not even fully validated
> if the system advertises S0 low power sleep instead.
>
Mika Westerberg June 28, 2023, 12:30 p.m. UTC | #8
On Wed, Jun 28, 2023 at 01:59:40PM +0300, Mika Westerberg wrote:
> Hi Thomas,
> 
> On Wed, Jun 28, 2023 at 12:24:06PM +0200, Thomas Witt wrote:
> > On 28/06/2023 08:46, Mika Westerberg wrote:
> > > @Thomas, is there any particular reason you have this option in the
> > > command line? There is possibility that S3 is not even fully validated
> > > if the system advertises S0 low power sleep instead.
> > 
> > In fact, there is: Entering suspend-to-ram without setting
> > /sys/power/mem_sleep to "deep", my laptop consumes about the same power as
> > it would idling online. The manufacturer suggests setting that commandline
> > parameter:
> > 
> > <https://www.tuxedocomputers.com/en/Infos/Help-Support/Instructions/Fine-tuning-of-power-management-with-suspend-standby.tuxedo#>
> 
> Thanks for the clarification.
> 
> > I just retested your patch with setting mem_sleep to "s2idle", and it no
> > longer triggers the loss of PCI devices. I guess that could be the indicator
> > that Björn asked for.
> 
> I wonder if the patch actually helps here now because the reason we want
> to add it back is that it allows the CPU to enter lower power states and
> thus reducing the power consumption in S2idle too. Do you observe that
> when you have the patch applied?

One possibility is to check the package C-state residency like:

  # cat /sys/kernel/debug/pmc_core/package_cstate_show

after system sleep and see if there is a change with the patch applied,
as done here:

  https://bugzilla.kernel.org/show_bug.cgi?id=217321
Mika Westerberg June 28, 2023, 12:38 p.m. UTC | #9
On Wed, Jun 28, 2023 at 07:16:49AM -0500, Mario Limonciello wrote:
> On 6/28/23 01:46, Mika Westerberg wrote:
> > Hi Bjorn,
> > 
> > On Tue, Jun 27, 2023 at 03:41:24PM -0500, Bjorn Helgaas wrote:
> > > On Tue, Jun 27, 2023 at 01:04:47PM +0300, Mika Westerberg wrote:
> > > > On Tue, Jun 27, 2023 at 11:53:33AM +0200, Thomas Witt wrote:
> > > > > On 27/06/2023 08:24, Mika Westerberg wrote:
> > > > > > Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> > > > > > for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
> > > > > > due to a regression that caused resume from suspend to fail on certain
> > > > > > systems. However, we never added this capability back and this is now
> > > > > > causing systems fail to enter low power CPU states, drawing more power
> > > > > > from the battery.
> > > > > 
> > > > > Hello Mika,
> > > > > 
> > > > > I am sorry, but your patch (applied on top of master) triggers the exact
> > > > > same behaviour I described in
> > > > > <https://bugzilla.kernel.org/show_bug.cgi?id=216877> (nvme and wifi become
> > > > > unavailable during suspend/resume)
> > > > 
> > > > Thanks for testing! Can you provide the output of dmidecode from that
> > > > system? We can add it to the denylist as well to avoid the issue on your
> > > > system.
> > > 
> > > To me this says we don't completely understand the mechanism of the
> > > failure.  If BIOS made L1SS work initially, Linux should be able to
> > > make it work again after suspend/resume.
> > > 
> > > If we can identify an actual hardware or firmware defect, it makes
> > > good sense to add a quirk or denylist.  But I'll push back a little if
> > > it's just "there's some problem we don't understand on this system, so
> > > avoid it."
> > 
> > Fair enough.
> > 
> > I've looked at the Thomas' system dmesg that he attached to the bugzilla
> > and he has mem_sleep_default=deep in the kernel command line. There is
> > no such option in the mainline kernel but I suppose this makes systemd
> > (or some initscript) to write "deep" to /sys/power/mem_sleep, thus
> > forcing S3 instead of the default the ACPI tables suggest, which
> > probably is S0ix (Intel low power state which does not involve BIOS).
> 
> JFYI actually it is a mainline kernel parameter.
> 
> Reference the documentation here:
> https://www.kernel.org/doc/html/v6.4/admin-guide/kernel-parameters.html?highlight=mem_sleep_default

Indeed it is. Thanks for correcting me. I grepped this from my source
tree but somehow screwed it up and it did not show up anything. Now that
I checked again it is there.
Thomas Witt June 29, 2023, 9:47 a.m. UTC | #10
On 28/06/2023 12:59, Mika Westerberg wrote:
> I wonder if the patch actually helps here now because the reason we want
> to add it back is that it allows the CPU to enter lower power states and
> thus reducing the power consumption in S2idle too. Do you observe that
> when you have the patch applied?

No joy. I did not check what its actually doing, but the computer takes 
about 150mA over the charger at idle with screen off and 140mA in 
suspend. With mem_sleep set to "deep", it takes about 20mA in suspend. 
All with the battery at 100%, at 19.7V.

So I guess I want to keep the "deep" setting in my cmdline.

BR
Thomas
Mika Westerberg June 29, 2023, 10:23 a.m. UTC | #11
On Thu, Jun 29, 2023 at 11:47:01AM +0200, Thomas Witt wrote:
> On 28/06/2023 12:59, Mika Westerberg wrote:
> > I wonder if the patch actually helps here now because the reason we want
> > to add it back is that it allows the CPU to enter lower power states and
> > thus reducing the power consumption in S2idle too. Do you observe that
> > when you have the patch applied?
> 
> No joy. I did not check what its actually doing, but the computer takes
> about 150mA over the charger at idle with screen off and 140mA in suspend.
> With mem_sleep set to "deep", it takes about 20mA in suspend. All with the
> battery at 100%, at 19.7V.
> 
> So I guess I want to keep the "deep" setting in my cmdline.

Okay thanks a lot for checking! Back to drawing board then...
David E. Box June 29, 2023, 2:24 p.m. UTC | #12
On Thu, 2023-06-29 at 11:47 +0200, Thomas Witt wrote:
> On 28/06/2023 12:59, Mika Westerberg wrote:
> > I wonder if the patch actually helps here now because the reason we want
> > to add it back is that it allows the CPU to enter lower power states and
> > thus reducing the power consumption in S2idle too. Do you observe that
> > when you have the patch applied?
> 
> No joy. I did not check what its actually doing, but the computer takes 
> about 150mA over the charger at idle with screen off and 140mA in 
> suspend. With mem_sleep set to "deep", it takes about 20mA in suspend. 
> All with the battery at 100%, at 19.7V.
> 
> So I guess I want to keep the "deep" setting in my cmdline.

It's likely something is not entering the right state. You can try the following
script to check what the cause may be.

https://github.com/intel/S0ixSelftestTool

David

> 
> BR
> Thomas
>
Mika Westerberg June 30, 2023, 10:41 a.m. UTC | #13
On Thu, Jun 29, 2023 at 07:24:55AM -0700, David E. Box wrote:
> On Thu, 2023-06-29 at 11:47 +0200, Thomas Witt wrote:
> > On 28/06/2023 12:59, Mika Westerberg wrote:
> > > I wonder if the patch actually helps here now because the reason we want
> > > to add it back is that it allows the CPU to enter lower power states and
> > > thus reducing the power consumption in S2idle too. Do you observe that
> > > when you have the patch applied?
> > 
> > No joy. I did not check what its actually doing, but the computer takes 
> > about 150mA over the charger at idle with screen off and 140mA in 
> > suspend. With mem_sleep set to "deep", it takes about 20mA in suspend. 
> > All with the battery at 100%, at 19.7V.
> > 
> > So I guess I want to keep the "deep" setting in my cmdline.
> 
> It's likely something is not entering the right state. You can try the following
> script to check what the cause may be.
> 
> https://github.com/intel/S0ixSelftestTool

That would be useful insights, indeed.

@Thomas, below is an updated patch. I wonder if you could try it out?
This one restores L1 substates first and then L0s/L1 as the spec
suggests. If this does not work, them I'm not sure what to do because
now we should be doing exactly what the spec is saying (unless I
misinterpret something):

  - Write L1 enables on the upstream component first then downstream
    (this is taken care by the parent child order of the Linux PM).
  - Program L1 SS before L1 enables
  - Program L1 SS enables after rest of the fields in the capability

Applies on top of v6.4.

------8<-----8<-----8<-----8<-----

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5ede93222bc1..2e947fea5afc 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1545,7 +1545,7 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
 {
 	int i = 0;
 	struct pci_cap_saved_state *save_state;
-	u16 *cap;
+	u16 *cap, val;
 
 	save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
 	if (!save_state)
@@ -1560,7 +1560,14 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
 
 	cap = (u16 *)&save_state->cap.data[0];
 	pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]);
-	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]);
+	/*
+	 * Restoring ASPM L1 substates has special requirements
+	 * according to the PCIe spec 6.0. So we restore here only the
+	 * LNKCTL register with the ASPM control field clear. ASPM will
+	 * be restored in pci_restore_aspm_state().
+	 */
+	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
+	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
 	pcie_capability_write_word(dev, PCI_EXP_SLTCTL, cap[i++]);
 	pcie_capability_write_word(dev, PCI_EXP_RTCTL, cap[i++]);
 	pcie_capability_write_word(dev, PCI_EXP_DEVCTL2, cap[i++]);
@@ -1671,6 +1678,7 @@ int pci_save_state(struct pci_dev *dev)
 	pci_save_ltr_state(dev);
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
+	pci_save_aspm_state(dev);
 	pci_save_ptm_state(dev);
 	return pci_save_vc_state(dev);
 }
@@ -1784,6 +1792,7 @@ void pci_restore_state(struct pci_dev *dev)
 	pci_restore_rebar_state(dev);
 	pci_restore_dpc_state(dev);
 	pci_restore_ptm_state(dev);
+	pci_restore_aspm_state(dev);
 
 	pci_aer_clear_status(dev);
 	pci_restore_aer_state(dev);
@@ -3467,6 +3476,11 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
 	if (error)
 		pci_err(dev, "unable to allocate suspend buffer for LTR\n");
 
+	error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS,
+					    2 * sizeof(u32));
+	if (error)
+		pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n");
+
 	pci_allocate_vc_save_buffers(dev);
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2475098f6518..47dbbc006884 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -567,10 +567,14 @@ bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 void pcie_aspm_init_link_state(struct pci_dev *pdev);
 void pcie_aspm_exit_link_state(struct pci_dev *pdev);
 void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
+void pci_save_aspm_state(struct pci_dev *pdev);
+void pci_restore_aspm_state(struct pci_dev *pdev);
 #else
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
+static inline void pci_save_aspm_state(struct pci_dev *pdev) { }
+static inline void pci_restore_aspm_state(struct pci_dev *pdev) { }
 #endif
 
 #ifdef CONFIG_PCIE_ECRC
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 66d7514ca111..879896fffb1e 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -7,6 +7,7 @@
  * Copyright (C) Shaohua Li (shaohua.li@intel.com)
  */
 
+#include <linux/dmi.h>
 #include <linux/kernel.h>
 #include <linux/math.h>
 #include <linux/module.h>
@@ -751,6 +752,114 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
 				PCI_L1SS_CTL1_L1SS_MASK, val);
 }
 
+void pci_save_aspm_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u16 l1ss = pdev->l1ss;
+	u32 *cap;
+
+	/*
+	 * Save L1 substate configuration. The ASPM L0s/L1 configuration
+	 * is already saved in pci_save_pcie_state().
+	 */
+	if (!l1ss)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
+	if (!save_state)
+		return;
+
+	cap = (u32 *)&save_state->cap.data[0];
+	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
+	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
+}
+
+/*
+ * Do not restore L1 substates for the below systems even if BIOS has
+ * enabled it initially. This breaks resume from suspend otherwise on
+ * these.
+ */
+static const struct dmi_system_id aspm_l1ss_denylist[] = {
+	{
+		/* https://bugzilla.kernel.org/show_bug.cgi?id=216782 */
+		.ident = "ASUS UX305FA",
+		.matches = {
+			DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_BOARD_NAME, "UX305FA"),
+		},
+	},
+	{ }
+};
+
+static void pcie_restore_aspm_l1ss(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u32 *cap, ctl1, ctl2, l1_2_enable;
+	u16 l1ss = pdev->l1ss;
+
+	if (!l1ss)
+		return;
+
+	if (dmi_check_system(aspm_l1ss_denylist)) {
+		pci_dbg(pdev, "skipping restoring L1 substates on this system\n");
+		return;
+	}
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
+	if (!save_state)
+		return;
+
+	cap = (u32 *)&save_state->cap.data[0];
+	ctl2 = *cap++;
+	ctl1 = *cap;
+
+	/*
+	 * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD
+	 * in PCI_L1SS_CTL1 must be programmed *before* setting the L1.2
+	 * enable bits, even though they're all in PCI_L1SS_CTL1.
+	 */
+	l1_2_enable = ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
+	ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
+
+	/* Write back without enables first (above we cleared them in ctl1) */
+	pci_write_config_dword(pdev, l1ss + PCI_L1SS_CTL1, ctl1);
+	pci_write_config_dword(pdev, l1ss + PCI_L1SS_CTL2, ctl2);
+
+	/* Then write back the enables */
+	if (l1_2_enable)
+		pci_write_config_dword(pdev, l1ss + PCI_L1SS_CTL1,
+				       ctl1 | l1_2_enable);
+}
+
+void pci_restore_aspm_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u16 *cap, val, tmp;
+
+	save_state = pci_find_saved_cap(pdev, PCI_CAP_ID_EXP);
+	if (!save_state)
+		return;
+
+	cap = (u16 *)&save_state->cap.data[0];
+	/*
+	 * Must match the ordering in pci_save/restore_pcie_state().
+	 * This is PCI_EXP_LNKCTL.
+	 */
+	val = cap[1] & PCI_EXP_LNKCTL_ASPMC;
+	if (!val)
+		return;
+
+	/*
+	 * We restore L1 substate configuration first before enabling L1
+	 * as the PCIe spec 6.0 sec 5.5.4 suggests.
+	 * */
+	pcie_restore_aspm_l1ss(pdev);
+
+	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &tmp);
+	/* Re-enable L0s/L1 */
+	pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, tmp | val);
+}
+
 static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
 {
 	pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL,
Thomas Witt June 30, 2023, 4:58 p.m. UTC | #14
On 30/06/2023 12:41, Mika Westerberg wrote:
> @Thomas, below is an updated patch. I wonder if you could try it out?
> This one restores L1 substates first and then L0s/L1 as the spec
> suggests. If this does not work, them I'm not sure what to do because
> now we should be doing exactly what the spec is saying (unless I
> misinterpret something):
> 
>    - Write L1 enables on the upstream component first then downstream
>      (this is taken care by the parent child order of the Linux PM).
>    - Program L1 SS before L1 enables
>    - Program L1 SS enables after rest of the fields in the capability

Sadly, same as before. With s2idle, power consumption stays up, but 
suspend/resume works, with deep it does not correctly suspend the PCI 
devices.
David E. Box July 5, 2023, 8:53 p.m. UTC | #15
Hi Thomas,

On Fri, 2023-06-30 at 18:58 +0200, Thomas Witt wrote:
> > On 30/06/2023 12:41, Mika Westerberg wrote:
> > > > @Thomas, below is an updated patch. I wonder if you could try it out?
> > > > This one restores L1 substates first and then L0s/L1 as the spec
> > > > suggests. If this does not work, them I'm not sure what to do because
> > > > now we should be doing exactly what the spec is saying (unless I
> > > > misinterpret something):
> > > > 
> > > >    - Write L1 enables on the upstream component first then downstream
> > > >      (this is taken care by the parent child order of the Linux PM).
> > > >    - Program L1 SS before L1 enables
> > > >    - Program L1 SS enables after rest of the fields in the capability
> > 
> > Sadly, same as before. With s2idle, power consumption stays up, but 
> > suspend/resume works, with deep it does not correctly suspend the PCI 
> > devices.

Mika is now out on extended vacation. We still need a solution that will enable
the L1 substate save/restore without breaking your system. I'd like to try to
get the power consumption lowered on your system while suspended with s2idle.
The s0ix self test script will really help to tell us where to start. You can
provide the results in the bugzilla.

The other thing we can do is find out why it only breaks under S3. It could be
timing related, so I've attached another patch to the bugzilla to test this.

https://bugzilla.kernel.org/attachment.cgi?id=304553

Please let me know if it works. Thanks.

David
Thomas Witt July 6, 2023, 7:14 p.m. UTC | #16
On 05/07/2023 22:53, David E. Box wrote:
> Mika is now out on extended vacation. We still need a solution that will enable
> the L1 substate save/restore without breaking your system. I'd like to try to
> get the power consumption lowered on your system while suspended with s2idle.
> The s0ix self test script will really help to tell us where to start. You can
> provide the results in the bugzilla.
> 
> The other thing we can do is find out why it only breaks under S3. It could be
> timing related, so I've attached another patch to the bugzilla to test this.
> 
> https://bugzilla.kernel.org/attachment.cgi?id=304553
> 
> Please let me know if it works. Thanks.
> 
> David

Hi David,

I tried your patch, and I see no difference from Mika's. Still not 
coming back from suspend.

Thomas
Mika Westerberg July 31, 2023, 3:01 p.m. UTC | #17
Hi Thomas,

On Thu, Jul 06, 2023 at 09:14:27PM +0200, Thomas Witt wrote:
> On 05/07/2023 22:53, David E. Box wrote:
> > Mika is now out on extended vacation. We still need a solution that will enable
> > the L1 substate save/restore without breaking your system. I'd like to try to
> > get the power consumption lowered on your system while suspended with s2idle.
> > The s0ix self test script will really help to tell us where to start. You can
> > provide the results in the bugzilla.
> > 
> > The other thing we can do is find out why it only breaks under S3. It could be
> > timing related, so I've attached another patch to the bugzilla to test this.
> > 
> > https://bugzilla.kernel.org/attachment.cgi?id=304553
> > 
> > Please let me know if it works. Thanks.
> > 
> > David
> 
> Hi David,
> 
> I tried your patch, and I see no difference from Mika's. Still not coming
> back from suspend.

Thanks for trying that. Did you manage to try out the S0ix script David
suggested? That should show us hopefully what is draining the battery in
s2idle.
Thomas Witt Aug. 5, 2023, 7:57 a.m. UTC | #18
On 31/07/2023 17:01, Mika Westerberg wrote:
> Hi Thomas,
> 
> Thanks for trying that. Did you manage to try out the S0ix script David
> suggested? That should show us hopefully what is draining the battery in
> s2idle.

Hi Mika,

I did, with -s it gives

Your system does not support low power S0 idle capability.
Isolation suggestion:
Please check BIOS low power S0 idle capability setting.

with -r on

Your system did not achieve the runtime PC10 state during screen ON

additionally, it encounters a syntax error:
./s0ix-selftest-tool.sh: line 1182: wc:: syntax error in expression 
(error token is ":")

with -r off, it tries xset which fails due to a lack of xserver.

Thomas
Mika Westerberg Aug. 7, 2023, 7:58 a.m. UTC | #19
Hi Thomas,

On Sat, Aug 05, 2023 at 09:57:47AM +0200, Thomas Witt wrote:
> On 31/07/2023 17:01, Mika Westerberg wrote:
> > Hi Thomas,
> > 
> > Thanks for trying that. Did you manage to try out the S0ix script David
> > suggested? That should show us hopefully what is draining the battery in
> > s2idle.
> 
> Hi Mika,
> 
> I did, with -s it gives
> 
> Your system does not support low power S0 idle capability.
> Isolation suggestion:
> Please check BIOS low power S0 idle capability setting.
> 
> with -r on
> 
> Your system did not achieve the runtime PC10 state during screen ON

Thanks for trying. Did you change the "mem_sleep" back to "s2idle"
before you run the script?

> additionally, it encounters a syntax error:
> ./s0ix-selftest-tool.sh: line 1182: wc:: syntax error in expression (error
> token is ":")

@David, do you know what might be the issue?

> with -r off, it tries xset which fails due to a lack of xserver.

You do have graphics running right? I mean i915 driver is enabled and
all the firmwares are in place (should come with the distro). I'm asking
because s2idle typically requires that graphics and pretty much all the
devices on the SoC have a driver and the accompanying firmwares, and
that they enter D3 properly.
David E. Box Aug. 10, 2023, 11:44 p.m. UTC | #20
On Mon, 2023-08-07 at 10:58 +0300, Mika Westerberg wrote:
> Hi Thomas,
> 
> On Sat, Aug 05, 2023 at 09:57:47AM +0200, Thomas Witt wrote:
> > On 31/07/2023 17:01, Mika Westerberg wrote:
> > > Hi Thomas,
> > > 
> > > Thanks for trying that. Did you manage to try out the S0ix script David
> > > suggested? That should show us hopefully what is draining the battery in
> > > s2idle.
> > 
> > Hi Mika,
> > 
> > I did, with -s it gives
> > 
> > Your system does not support low power S0 idle capability.
> > Isolation suggestion:
> > Please check BIOS low power S0 idle capability setting.
> > 
> > with -r on
> > 
> > Your system did not achieve the runtime PC10 state during screen ON
> 
> Thanks for trying. Did you change the "mem_sleep" back to "s2idle"
> before you run the script?

The script checks the FADT to determine if the system supports S0ix and it found
that it didn't which is weird since Thomas is setting "mem_sleep" to "deep" from
the default "s2idle" which is based on this bit.

Here are the commands to check it.

  sudo acpidump -n FADT -b
  iasl -d facp.dat
  grep "Low Power S0 Idle" facp.dsl

Thomas, can you confirm what the value of mem_sleep is when you boot and run the
above to confirm what your hardware supports?

> 
> > additionally, it encounters a syntax error:
> > ./s0ix-selftest-tool.sh: line 1182: wc:: syntax error in expression (error
> > token is ":")
> 
> @David, do you know what might be the issue?

Yes. The latest kernel changes the output of the ltr_show command by adding a
PMC number prefix (since Meteor Lake has more than one PMC now). The script is
erring on the unexpected colon. We'll get this fixed.

David

> 
> > with -r off, it tries xset which fails due to a lack of xserver.
> 
> You do have graphics running right? I mean i915 driver is enabled and
> all the firmwares are in place (should come with the distro). I'm asking
> because s2idle typically requires that graphics and pretty much all the
> devices on the SoC have a driver and the accompanying firmwares, and
> that they enter D3 properly.
diff mbox series

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 578bf0d3ec3c..bf30561aa32c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1679,6 +1679,7 @@  int pci_save_state(struct pci_dev *dev)
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
 	pci_save_ptm_state(dev);
+	pci_save_aspm_l1ss_state(dev);
 	return pci_save_vc_state(dev);
 }
 EXPORT_SYMBOL(pci_save_state);
@@ -1790,6 +1791,7 @@  void pci_restore_state(struct pci_dev *dev)
 	pci_restore_vc_state(dev);
 	pci_restore_rebar_state(dev);
 	pci_restore_dpc_state(dev);
+	pci_restore_aspm_l1ss_state(dev);
 	pci_restore_ptm_state(dev);
 
 	pci_aer_clear_status(dev);
@@ -3474,6 +3476,11 @@  void pci_allocate_cap_save_buffers(struct pci_dev *dev)
 	if (error)
 		pci_err(dev, "unable to allocate suspend buffer for LTR\n");
 
+	error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS,
+					    2 * sizeof(u32));
+	if (error)
+		pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n");
+
 	pci_allocate_vc_save_buffers(dev);
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index d09e8f39e429..f34b55701d63 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -560,10 +560,14 @@  bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 void pcie_aspm_init_link_state(struct pci_dev *pdev);
 void pcie_aspm_exit_link_state(struct pci_dev *pdev);
 void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
+void pci_save_aspm_l1ss_state(struct pci_dev *pdev);
+void pci_restore_aspm_l1ss_state(struct pci_dev *pdev);
 #else
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
+static inline void pci_save_aspm_l1ss_state(struct pci_dev *pdev) { }
+static inline void pci_restore_aspm_l1ss_state(struct pci_dev *pdev) { }
 #endif
 
 #ifdef CONFIG_PCIE_ECRC
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 66d7514ca111..52e1a69f818a 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -7,6 +7,7 @@ 
  * Copyright (C) Shaohua Li (shaohua.li@intel.com)
  */
 
+#include <linux/dmi.h>
 #include <linux/kernel.h>
 #include <linux/math.h>
 #include <linux/module.h>
@@ -751,6 +752,81 @@  static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
 				PCI_L1SS_CTL1_L1SS_MASK, val);
 }
 
+void pci_save_aspm_l1ss_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u16 l1ss = pdev->l1ss;
+	u32 *cap;
+
+	if (!l1ss)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
+	if (!save_state)
+		return;
+
+	cap = (u32 *)&save_state->cap.data[0];
+	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
+	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
+}
+
+/*
+ * Do not restore L1 substates for the below systems even if BIOS has
+ * enabled it initially. This breaks resume from suspend otherwise on
+ * these.
+ */
+static const struct dmi_system_id aspm_l1ss_denylist[] = {
+	{
+		/* https://bugzilla.kernel.org/show_bug.cgi?id=216782 */
+		.ident = "ASUS UX305FA",
+		.matches = {
+			DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_BOARD_NAME, "UX305FA"),
+		},
+	},
+	{ }
+};
+
+void pci_restore_aspm_l1ss_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u32 *cap, ctl1, ctl2, l1_2_enable;
+	u16 l1ss = pdev->l1ss;
+
+	if (!l1ss)
+		return;
+
+	if (dmi_check_system(aspm_l1ss_denylist)) {
+		pci_dbg(pdev, "skipping restoring L1 substates on this system\n");
+		return;
+	}
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
+	if (!save_state)
+		return;
+
+	cap = (u32 *)&save_state->cap.data[0];
+	ctl2 = *cap++;
+	ctl1 = *cap;
+
+	/*
+	 * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD
+	 * in PCI_L1SS_CTL1 must be programmed *before* setting the L1.2
+	 * enable bits, even though they're all in PCI_L1SS_CTL1.
+	 */
+	l1_2_enable = ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
+	ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
+
+	/* Write back without enables first (above we cleared them in ctl1) */
+	pci_write_config_dword(pdev, l1ss + PCI_L1SS_CTL1, ctl1);
+	pci_write_config_dword(pdev, l1ss + PCI_L1SS_CTL2, ctl2);
+
+	/* Then write back the enables */
+	if (l1_2_enable)
+		pci_write_config_dword(pdev, l1ss + PCI_L1SS_CTL1,
+				       ctl1 | l1_2_enable);
+}
+
 static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
 {
 	pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL,