diff mbox series

[v14.a,1/1] PCI: Only put Intel PCIe ports >= 2015 into D3

Message ID 20230818193932.27187-1-mario.limonciello@amd.com
State New
Headers show
Series [v14.a,1/1] PCI: Only put Intel PCIe ports >= 2015 into D3 | expand

Commit Message

Mario Limonciello Aug. 18, 2023, 7:39 p.m. UTC
commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
changed pci_bridge_d3_possible() so that any vendor's PCIe ports
from modern machines (>=2015) are allowed to be put into D3.

Iain reports that USB devices can't be used to wake a Lenovo Z13
from suspend. This is because the PCIe root port has been put
into D3 and AMD's platform can't handle USB devices waking in this
case.

This behavior is only reported on Linux. Comparing the behavior
on Windows and Linux, Windows doesn't put the root ports into D3.

To fix the issue without regressing existing Intel systems,
limit the >=2015 check to only apply to Intel PCIe ports.

Cc: stable@vger.kernel.org
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
Reported-by: Iain Lane <iain@orangesquash.org.uk>
Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
Reviewed-by:Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
In v14 this series has been split into 3 parts.
 part A: Immediate fix for AMD issue.
 part B: LPS0 export improvements
 part C: Long term solution for all vendors
v13->v14:
 * Reword the comment
 * add tag
v12->v13:
 * New patch
---
 drivers/pci/pci.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Rafael J. Wysocki Aug. 21, 2023, 6:17 p.m. UTC | #1
On Fri, Aug 18, 2023 at 9:40 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> changed pci_bridge_d3_possible() so that any vendor's PCIe ports
> from modern machines (>=2015) are allowed to be put into D3.
>
> Iain reports that USB devices can't be used to wake a Lenovo Z13
> from suspend. This is because the PCIe root port has been put
> into D3 and AMD's platform can't handle USB devices waking in this
> case.
>
> This behavior is only reported on Linux. Comparing the behavior
> on Windows and Linux, Windows doesn't put the root ports into D3.
>
> To fix the issue without regressing existing Intel systems,
> limit the >=2015 check to only apply to Intel PCIe ports.
>
> Cc: stable@vger.kernel.org
> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> Reported-by: Iain Lane <iain@orangesquash.org.uk>
> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
> Reviewed-by:Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Acked-by: Rafael J. Wysocki <rafael@kernel.org>

> ---
> In v14 this series has been split into 3 parts.
>  part A: Immediate fix for AMD issue.
>  part B: LPS0 export improvements
>  part C: Long term solution for all vendors
> v13->v14:
>  * Reword the comment
>  * add tag
> v12->v13:
>  * New patch
> ---
>  drivers/pci/pci.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 60230da957e0c..bfdad2eb36d13 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
>                         return false;
>
>                 /*
> -                * It should be safe to put PCIe ports from 2015 or newer
> -                * to D3.
> +                * Allow Intel PCIe ports from 2015 onward to go into D3 to
> +                * achieve additional energy conservation on some platforms.
> +                *
> +                * This is only set for Intel PCIe ports as it causes problems
> +                * on both AMD Rembrandt and Phoenix platforms where USB keyboards
> +                * can not be used to wake the system from suspend.
>                  */
> -               if (dmi_get_bios_year() >= 2015)
> +               if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
> +                   dmi_get_bios_year() >= 2015)
>                         return true;
>                 break;
>         }
> --
> 2.34.1
>
Bjorn Helgaas Aug. 21, 2023, 10:42 p.m. UTC | #2
On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
> commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> changed pci_bridge_d3_possible() so that any vendor's PCIe ports
> from modern machines (>=2015) are allowed to be put into D3.
> 
> Iain reports that USB devices can't be used to wake a Lenovo Z13
> from suspend. This is because the PCIe root port has been put
> into D3 and AMD's platform can't handle USB devices waking in this
> case.
> 
> This behavior is only reported on Linux. Comparing the behavior
> on Windows and Linux, Windows doesn't put the root ports into D3.
> 
> To fix the issue without regressing existing Intel systems,
> limit the >=2015 check to only apply to Intel PCIe ports.
> 
> Cc: stable@vger.kernel.org
> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> Reported-by: Iain Lane <iain@orangesquash.org.uk>
> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
> Reviewed-by:Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> In v14 this series has been split into 3 parts.
>  part A: Immediate fix for AMD issue.
>  part B: LPS0 export improvements
>  part C: Long term solution for all vendors
> v13->v14:
>  * Reword the comment
>  * add tag
> v12->v13:
>  * New patch
> ---
>  drivers/pci/pci.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 60230da957e0c..bfdad2eb36d13 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
>  			return false;
>  
>  		/*
> -		 * It should be safe to put PCIe ports from 2015 or newer
> -		 * to D3.
> +		 * Allow Intel PCIe ports from 2015 onward to go into D3 to
> +		 * achieve additional energy conservation on some platforms.
> +		 *
> +		 * This is only set for Intel PCIe ports as it causes problems
> +		 * on both AMD Rembrandt and Phoenix platforms where USB keyboards
> +		 * can not be used to wake the system from suspend.
>  		 */
> -		if (dmi_get_bios_year() >= 2015)
> +		if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
> +		    dmi_get_bios_year() >= 2015)
>  			return true;

Hmm.  I'm really not a fan of checks like this that aren't connected
to an actual property of the platform.  The Intel Vendor ID tells us
nothing about what the actual problem is, which makes it really hard
to maintain in the future.  It's also very AMD- and Intel-centric,
when this code is ostensibly arch-agnostic, so this potentially
regresses ARM64, RISC-V, powerpc, etc.

It's bad enough that we check for 2015.  A BIOS security update to a
2014 platform will break things, even though the update has nothing to
do with D3.  We're stuck with that one, and it's old enough that maybe
it won't bite us any more, but I hate to add more.

The list of conditions in pci_bridge_d3_possible() is a pretty good
clue that we don't really know what we're doing, and all we can do is
find configurations that happen to work.  

I don't have any better suggestions, other than that this should be
described somehow via ACPI (and not in vendor-specific stuff like
PNP0D80).

Bjorn
Mario Limonciello Aug. 22, 2023, 2:32 a.m. UTC | #3
On 8/21/2023 5:42 PM, Bjorn Helgaas wrote:
> On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
>> commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
>> changed pci_bridge_d3_possible() so that any vendor's PCIe ports
>> from modern machines (>=2015) are allowed to be put into D3.
>>
>> Iain reports that USB devices can't be used to wake a Lenovo Z13
>> from suspend. This is because the PCIe root port has been put
>> into D3 and AMD's platform can't handle USB devices waking in this
>> case.
>>
>> This behavior is only reported on Linux. Comparing the behavior
>> on Windows and Linux, Windows doesn't put the root ports into D3.
>>
>> To fix the issue without regressing existing Intel systems,
>> limit the >=2015 check to only apply to Intel PCIe ports.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
>> Reported-by: Iain Lane <iain@orangesquash.org.uk>
>> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
>> Reviewed-by:Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> In v14 this series has been split into 3 parts.
>>   part A: Immediate fix for AMD issue.
>>   part B: LPS0 export improvements
>>   part C: Long term solution for all vendors
>> v13->v14:
>>   * Reword the comment
>>   * add tag
>> v12->v13:
>>   * New patch
>> ---
>>   drivers/pci/pci.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 60230da957e0c..bfdad2eb36d13 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
>>   			return false;
>>   
>>   		/*
>> -		 * It should be safe to put PCIe ports from 2015 or newer
>> -		 * to D3.
>> +		 * Allow Intel PCIe ports from 2015 onward to go into D3 to
>> +		 * achieve additional energy conservation on some platforms.
>> +		 *
>> +		 * This is only set for Intel PCIe ports as it causes problems
>> +		 * on both AMD Rembrandt and Phoenix platforms where USB keyboards
>> +		 * can not be used to wake the system from suspend.
>>   		 */
>> -		if (dmi_get_bios_year() >= 2015)
>> +		if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
>> +		    dmi_get_bios_year() >= 2015)
>>   			return true;
> 
> Hmm.  I'm really not a fan of checks like this that aren't connected
> to an actual property of the platform.  The Intel Vendor ID tells us
> nothing about what the actual problem is, which makes it really hard
> to maintain in the future.  It's also very AMD- and Intel-centric,
> when this code is ostensibly arch-agnostic, so this potentially
> regresses ARM64, RISC-V, powerpc, etc.
> 
> It's bad enough that we check for 2015.  A BIOS security update to a
> 2014 platform will break things, even though the update has nothing to
> do with D3.  We're stuck with that one, and it's old enough that maybe
> it won't bite us any more, but I hate to add more.

I don't see this change as adding any more checks.
It's correcting what should have been a more narrow check when that one 
was introduced.

There was no spec backing the 2015 date, it was just "hardware works 
with this and it's needed".

> 
> The list of conditions in pci_bridge_d3_possible() is a pretty good
> clue that we don't really know what we're doing, and all we can do is
> find configurations that happen to work.
> 

I see this function as stemming from three high level desires that 
intersect.

1) Make hardware not originally designated for the PC ecosystem work.
Macs fall in this camp.  They don't always adhere to the same "firmware 
norms" as UEFI PCs do.  They don't run Microsoft's certifications, and 
thus they don't always work the same.

2) Make hardware compatible to the the ACPI and PCI specs where possible.

3) Make hardware compatible with Windows behavior.

In a strict world <2> would be the only one that was followed and 
everything else would be relegated to quirks or sub-drivers that made 
decisions, but we are where we are.

I'm not saying it's a bad thing that all 3 goals overlap.
If not for the changes that were introduced into this function for Mac 
compatibility I don't think we would Thunderbolt/USB4 where it is today.

> I don't have any better suggestions, other than that this should be
> described somehow via ACPI (and not in vendor-specific stuff like
> PNP0D80).
> 
> Bjorn

The problem is the ACPI spec doesn't say what OSPM should do when 
something isn't power manageable by ACPI.  Can you really argue it should?

Even if we DID manage to get something added to the spec how does that 
help everything already on the marketplace that's broken?

If we can't fix the check to be more narrow the only options I see left:

0) Do nothing.

Document somewhere that if you're on AMD and care about wake from 
keyboard that you need pcie_port_pm=off.  I really don't think this is a 
good idea.  Here's why:

Completely separate from this wake from USB issue I know of another 
issue where some PHX BIOS versions HANG the system during resume because 
of root port being in D3.

It's fixed in newer PHX BIOS versions, but if you end up with an OEM 
system that happened to launch with one of these you're in a very bad 
place.  I haven't mentioned it because I've not seen an OEM system with 
this yet.

If we "do nothing" the only way to solve those will be to grow the DMI 
avoid list if any of these come up.

1) Hardcode all the Intel root ports that need D3 into a quirk list.

I don't know how big this list is, but I assume it's massive and doesn't 
scale unless the constraints stuff works for Intel to opt in the newer
things.

2) Hardcode quirks for all the affected AMD PCI root ports to avoid D3.

It's 4 of them for the 2022-2023 platforms.
It will probably be another 2 more for 2024 ones, and then another 2 
more for 2025 ones.
Maybe by 2025 the affected root port can handle D3 and let USB wake it 
up?  I don't know.

3) Move the quirk somewhere that AMD can maintain specifically for this 
case outside of drivers/pci.

I did prototype it being put into drivers/platform/x86/amd/pmc.c or 
drivers/platform/x86/amd/pmc_quirks.c instead as part of the suspend 
callbacks or LPS0 callbacks.

This works and can scale as that driver at a minimum gets new IDs added 
for new platforms.  However it makes all this logic much more convoluted 
and harder for anyone to follow.
Rafael J. Wysocki Aug. 22, 2023, 10:11 a.m. UTC | #4
On Tue, Aug 22, 2023 at 12:42 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
> > commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> > changed pci_bridge_d3_possible() so that any vendor's PCIe ports
> > from modern machines (>=2015) are allowed to be put into D3.
> >
> > Iain reports that USB devices can't be used to wake a Lenovo Z13
> > from suspend. This is because the PCIe root port has been put
> > into D3 and AMD's platform can't handle USB devices waking in this
> > case.
> >
> > This behavior is only reported on Linux. Comparing the behavior
> > on Windows and Linux, Windows doesn't put the root ports into D3.
> >
> > To fix the issue without regressing existing Intel systems,
> > limit the >=2015 check to only apply to Intel PCIe ports.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> > Reported-by: Iain Lane <iain@orangesquash.org.uk>
> > Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
> > Reviewed-by:Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > ---
> > In v14 this series has been split into 3 parts.
> >  part A: Immediate fix for AMD issue.
> >  part B: LPS0 export improvements
> >  part C: Long term solution for all vendors
> > v13->v14:
> >  * Reword the comment
> >  * add tag
> > v12->v13:
> >  * New patch
> > ---
> >  drivers/pci/pci.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 60230da957e0c..bfdad2eb36d13 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
> >                       return false;
> >
> >               /*
> > -              * It should be safe to put PCIe ports from 2015 or newer
> > -              * to D3.
> > +              * Allow Intel PCIe ports from 2015 onward to go into D3 to
> > +              * achieve additional energy conservation on some platforms.
> > +              *
> > +              * This is only set for Intel PCIe ports as it causes problems
> > +              * on both AMD Rembrandt and Phoenix platforms where USB keyboards
> > +              * can not be used to wake the system from suspend.
> >                */
> > -             if (dmi_get_bios_year() >= 2015)
> > +             if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
> > +                 dmi_get_bios_year() >= 2015)
> >                       return true;
>
> Hmm.  I'm really not a fan of checks like this that aren't connected
> to an actual property of the platform.  The Intel Vendor ID tells us
> nothing about what the actual problem is, which makes it really hard
> to maintain in the future.  It's also very AMD- and Intel-centric,
> when this code is ostensibly arch-agnostic, so this potentially
> regresses ARM64, RISC-V, powerpc, etc.

That's a fair point.

Would it be better to reverse this and filter out AMD systems as they
are affected by the existing check?

> It's bad enough that we check for 2015.  A BIOS security update to a
> 2014 platform will break things,

Well, not necessarily.  Pre-2015 systems already worked and the check
was added as "surely, everything 2015 or newer should work either".
While it is true that putting PCIe Root Ports into D3hot was necessary
for extra energy conservation on Intel systems, it actually has been
expected to work everywhere.

> even though the update has nothing to
> do with D3.  We're stuck with that one, and it's old enough that maybe
> it won't bite us any more, but I hate to add more.

Well, how would you like to deal with the systems that don't work
today, because they expect a different behavior?

Effectively, the current behavior for all modern systems is to allow
bridge D3 if there are no indications that it shouldn't be allowed.
The platforms in question assume the reverse, so what else can be
done?

> The list of conditions in pci_bridge_d3_possible() is a pretty good
> clue that we don't really know what we're doing, and all we can do is
> find configurations that happen to work.

Yes, because by the spec it all should work just fine.  The PCI PM 1.2
specification defines the expected behavior for bridges and the PCIe
specification claims to be a superset of that.

What we need to deal with here is basically non-compliant systems and
so we have to catch the various forms of non-compliance.

> I don't have any better suggestions, other than that this should be
> described somehow via ACPI (and not in vendor-specific stuff like
> PNP0D80).

Well, it isn't in practice.
Bjorn Helgaas Aug. 23, 2023, 12:02 a.m. UTC | #5
On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
> On Tue, Aug 22, 2023 at 12:42 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
> > > commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> > > changed pci_bridge_d3_possible() so that any vendor's PCIe ports
> > > from modern machines (>=2015) are allowed to be put into D3.
> > >
> > > Iain reports that USB devices can't be used to wake a Lenovo Z13
> > > from suspend. This is because the PCIe root port has been put
> > > into D3 and AMD's platform can't handle USB devices waking in this
> > > case.
> > >
> > > This behavior is only reported on Linux. Comparing the behavior
> > > on Windows and Linux, Windows doesn't put the root ports into D3.
> > >
> > > To fix the issue without regressing existing Intel systems,
> > > limit the >=2015 check to only apply to Intel PCIe ports.

> > > @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
> > >                       return false;
> > >
> > >               /*
> > > -              * It should be safe to put PCIe ports from 2015 or newer
> > > -              * to D3.
> > > +              * Allow Intel PCIe ports from 2015 onward to go into D3 to
> > > +              * achieve additional energy conservation on some platforms.
> > > +              *
> > > +              * This is only set for Intel PCIe ports as it causes problems
> > > +              * on both AMD Rembrandt and Phoenix platforms where USB keyboards
> > > +              * can not be used to wake the system from suspend.
> > >                */
> > > -             if (dmi_get_bios_year() >= 2015)
> > > +             if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
> > > +                 dmi_get_bios_year() >= 2015)
> > >                       return true;
> >
> > Hmm.  I'm really not a fan of checks like this that aren't connected
> > to an actual property of the platform.  The Intel Vendor ID tells us
> > nothing about what the actual problem is, which makes it really hard
> > to maintain in the future.  It's also very AMD- and Intel-centric,
> > when this code is ostensibly arch-agnostic, so this potentially
> > regresses ARM64, RISC-V, powerpc, etc.
> 
> That's a fair point.
> 
> Would it be better to reverse this and filter out AMD systems as they
> are affected by the existing check?

Since we're trying to avoid an issue on AMD systems, I would
definitely prefer to have the code change mention AMD instead of
Intel.

> > It's bad enough that we check for 2015.  A BIOS security update to a
> > 2014 platform will break things,
> 
> Well, not necessarily.  Pre-2015 systems already worked and the check
> was added as "surely, everything 2015 or newer should work either".
> While it is true that putting PCIe Root Ports into D3hot was necessary
> for extra energy conservation on Intel systems, it actually has been
> expected to work everywhere.

This is a tangent; I was just trying to make the point that the date
check means a BIOS update may change Linux behavior even if the update
has nothing to do with PM, and I think that's a bad thing even if the
new behavior is not a failure.  But this is water under the bridge and
is probably not going to cause problems in the future.

> > even though the update has nothing to do with D3.  We're stuck
> > with that one, and it's old enough that maybe it won't bite us any
> > more, but I hate to add more.
> 
> Well, how would you like to deal with the systems that don't work
> today, because they expect a different behavior?
>
> Effectively, the current behavior for all modern systems is to allow
> bridge D3 if there are no indications that it shouldn't be allowed.
> The platforms in question assume the reverse, so what else can be
> done?
> 
> > The list of conditions in pci_bridge_d3_possible() is a pretty good
> > clue that we don't really know what we're doing, and all we can do is
> > find configurations that happen to work.
> 
> Yes, because by the spec it all should work just fine.  The PCI PM 1.2
> specification defines the expected behavior for bridges and the PCIe
> specification claims to be a superset of that.
> 
> What we need to deal with here is basically non-compliant systems and
> so we have to catch the various forms of non-compliance.

Thanks for this, that helps.  If pci_bridge_d3_possible() is a list of
quirks for systems that are known to be broken (or at least not known
to work correctly and avoiding D3 is acceptable), then we should
document and use it that way.

The current documentation ("checks if it is possible to move to D3")
frames it as "does the bridge have the required features?" instead of
"do we know about something broken in this bridge or this platform?"

If something is broken, I would expect tests based on the device or
DMI check.  But several some are not obvious defects.  E.g.,
"bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what
defect are we finding there?  What does the spec require that isn't
happening?

In this particular patch, apparently we assume any non-Intel port or
any BIOS before 2015 is broken.  Obviously way too general.  We know
"USB keyboards don't wake from suspend," but I think we need something
at the PCI level like "PME interrupt doesn't happen when bridge is in
state X" (i.e., the part that is non-compliant), and "one consequence
is that downstream devices can't wake from suspend."

> > I don't have any better suggestions, other than that this should be
> > described somehow via ACPI (and not in vendor-specific stuff like
> > PNP0D80).
> 
> Well, it isn't in practice.

If this is basically quirks and we treat it that way (comments about
the breakage and references to what the spec violations are), maybe
this is the best we can do.

Bjorn
Lukas Wunner Aug. 23, 2023, 5:04 a.m. UTC | #6
On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
> On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
> > What we need to deal with here is basically non-compliant systems and
> > so we have to catch the various forms of non-compliance.
> 
> Thanks for this, that helps.  If pci_bridge_d3_possible() is a list of
> quirks for systems that are known to be broken (or at least not known
> to work correctly and avoiding D3 is acceptable), then we should
> document and use it that way.
> 
> The current documentation ("checks if it is possible to move to D3")
> frames it as "does the bridge have the required features?" instead of
> "do we know about something broken in this bridge or this platform?"
> 
> If something is broken, I would expect tests based on the device or
> DMI check.  But several some are not obvious defects.  E.g.,
> "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what
> defect are we finding there?  What does the spec require that isn't
> happening?

This particular check doesn't pertain to a defect, but indeed
follows from the spec:

If hotplug control wasn't granted to the OS, the OS shall not put
the hotplug port in D3 behind firmware's back because the power state
affects accessibility of devices downstream of the hotplug port.

Put another way, the firmware expects to have control of hotplug
and hotplug may break if the OS fiddles with the power state of the
hotplug port.

Here's a bugzilla where this caused issues:
https://bugzilla.kernel.org/show_bug.cgi?id=53811

On the other hand Thunderbolt hotplug ports are required to runtime
suspend to D3 in order to save power.  On Macs they're always handled
natively by the OS.  Hence the code comment.

A somewhat longer explanation I gave in 2016:
https://lore.kernel.org/all/20160617213209.GA1927@wunner.de/

Perhaps the code comment preceding that check can be rephrased to
convey its meaning more clearly...

Thanks,

Lukas
Bjorn Helgaas Aug. 23, 2023, 11:46 a.m. UTC | #7
On Wed, Aug 23, 2023 at 07:04:53AM +0200, Lukas Wunner wrote:
> On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
> > On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
> > > What we need to deal with here is basically non-compliant systems and
> > > so we have to catch the various forms of non-compliance.
> > 
> > Thanks for this, that helps.  If pci_bridge_d3_possible() is a list of
> > quirks for systems that are known to be broken (or at least not known
> > to work correctly and avoiding D3 is acceptable), then we should
> > document and use it that way.
> > 
> > The current documentation ("checks if it is possible to move to D3")
> > frames it as "does the bridge have the required features?" instead of
> > "do we know about something broken in this bridge or this platform?"
> > 
> > If something is broken, I would expect tests based on the device or
> > DMI check.  But several some are not obvious defects.  E.g.,
> > "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what
> > defect are we finding there?  What does the spec require that isn't
> > happening?
> 
> This particular check doesn't pertain to a defect, but indeed
> follows from the spec:
> 
> If hotplug control wasn't granted to the OS, the OS shall not put
> the hotplug port in D3 behind firmware's back because the power state
> affects accessibility of devices downstream of the hotplug port.
> 
> Put another way, the firmware expects to have control of hotplug
> and hotplug may break if the OS fiddles with the power state of the
> hotplug port.
> 
> Here's a bugzilla where this caused issues:
> https://bugzilla.kernel.org/show_bug.cgi?id=53811
> 
> On the other hand Thunderbolt hotplug ports are required to runtime
> suspend to D3 in order to save power.  

Sounds like there may be a requirement in a Thunderbolt spec about
this, so maybe we could add that citation?  I guess this goes with the
"bridge->is_thunderbolt" check?

> On Macs they're always handled
> natively by the OS.  Hence the code comment.

And I guess this goes with the "System Management Mode" and
"Thunderbolt on non-Macs" comments?  A citation to the source behind
"OS shall not put the hotplug port in D3 behind firmware's back" would
be super helpful here.

> A somewhat longer explanation I gave in 2016:
> https://lore.kernel.org/all/20160617213209.GA1927@wunner.de/
> 
> Perhaps the code comment preceding that check can be rephrased to
> convey its meaning more clearly...

Thanks!  I think it would be worth trying to separate out the "normal"
things that correspond to the spec from the "quirk" things that work
around defects.  That's not material for *this* patch, though.

It's also a little weird that pci_bridge_d3_possible() itself looks
like it's invariant for the life of the system, but we call it several
times (pci_pm_init(), pci_bridge_d3_update(), pcie_portdrv_probe(),
etc).  I guess this is because we save the result in dev->bridge_d3,
but then pci_bridge_d3_update() updates dev->bridge_d3 based on other
things, so the original value is lost.  Maybe another bit or two could
avoid those extra calls.

Bjorn
Lukas Wunner Aug. 28, 2023, 8:10 p.m. UTC | #8
On Wed, Aug 23, 2023 at 06:46:19AM -0500, Bjorn Helgaas wrote:
> On Wed, Aug 23, 2023 at 07:04:53AM +0200, Lukas Wunner wrote:
> > On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
> > > On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
> > > > What we need to deal with here is basically non-compliant systems and
> > > > so we have to catch the various forms of non-compliance.
> > > 
> > > Thanks for this, that helps.  If pci_bridge_d3_possible() is a list of
> > > quirks for systems that are known to be broken (or at least not known
> > > to work correctly and avoiding D3 is acceptable), then we should
> > > document and use it that way.
> > > 
> > > The current documentation ("checks if it is possible to move to D3")
> > > frames it as "does the bridge have the required features?" instead of
> > > "do we know about something broken in this bridge or this platform?"
> > > 
> > > If something is broken, I would expect tests based on the device or
> > > DMI check.  But several some are not obvious defects.  E.g.,
> > > "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what
> > > defect are we finding there?  What does the spec require that isn't
> > > happening?
> > 
> > This particular check doesn't pertain to a defect, but indeed
> > follows from the spec:
> > 
> > If hotplug control wasn't granted to the OS, the OS shall not put
> > the hotplug port in D3 behind firmware's back because the power state
> > affects accessibility of devices downstream of the hotplug port.
> > 
> > Put another way, the firmware expects to have control of hotplug
> > and hotplug may break if the OS fiddles with the power state of the
> > hotplug port.
> > 
> > Here's a bugzilla where this caused issues:
> > https://bugzilla.kernel.org/show_bug.cgi?id=53811
> > 
> > On the other hand Thunderbolt hotplug ports are required to runtime
> > suspend to D3 in order to save power.  
> 
> Sounds like there may be a requirement in a Thunderbolt spec about
> this, so maybe we could add that citation?  I guess this goes with the
> "bridge->is_thunderbolt" check?

Right, that's the check I was referring to.  But I'm afraid there is
no explicit rule in Thunderbolt / USB4 specs that hotplug ports must
be runtime suspended in order to save power, at least to the best
of my knowledge.

In practice, Thunderbolt controllers come in one of two forms,
discrete or integrated into the CPU.  Originally only discrete
controllers existed, but over time it became more and more common
to integrate them.

Apple was pretty much the only vendor which sold larger quantities
of Thunderbolt 1 and 2 chips in the 2010 to 2016 era.  Back in the day,
only discrete controllers existed and they consumed around 1.5 to 2 W.
On laptops, that's a significant amount of energy, so from day 1, Apple
put load switches on their motherboards which allowed the Thunderbolt
controllers to be powered down if nothing is plugged in.

With the *integrated* Thunderbolt controllers, powering down on idle
is usually likewise required to allow the entire CPU package to enter
a low power state.

To the operating system, a Thunderbolt controller is visible as a
PCIe switch with an NHI device below one of the Downstream Ports
and hotplugged devices appearing below the other Downstream Ports.
The NHI is a vendor-agnostic Native Host Interface for tunnel setup,
similar to the OHCI, EHCI, XHCI interface definitions that are in use
with USB and FireWire controllers.

Linux uses a hierarchical power management model, i.e. parent devices
cannot runtime suspend unless their children runtime suspend.  Thus,
the hotplug ports need to runtime suspend and then the Switch Upstream
Port can runtime suspend and that triggers powerdown of the controller.

To cut a long story short, in *practice* Thunderbolt hotplug ports need
to be put into D3hot in order to save power, so that's what we do.
And we know from experience that they're all *safe* to be put into D3hot.
Hence we're whitelisting them in pci_bridge_d3_possible().

By contrast, I recall that we got MCEs on Xeon-SP processors back in
the day when their Root Ports were put into D3hot.  Hence the rather
conservative approach taken in pci_bridge_d3_possible() to whitelist
only known-good, newer hardware.


> > On Macs they're always handled
> > natively by the OS.  Hence the code comment.
> 
> And I guess this goes with the "System Management Mode" and
> "Thunderbolt on non-Macs" comments?  A citation to the source behind
> "OS shall not put the hotplug port in D3 behind firmware's back" would
> be super helpful here.

So I've just looked through the PCI Firmware Spec and can't find that
mentioned anywhere explicitly, but it's pretty obvious if you think
about it:

If the OS puts the hotplug port in D3hot, its downstream bus transitions
to either B2 or B3 (PCI Power Management Spec r1.2 sec 4.7.1).

Which means devices downstream of the hotplug port become inaccessible.
If hotplug control wasn't granted to the OS, firmware expects it may
handle the hotplug port without interference by the OS.

If the OS fiddles with the hotplug port's power state, that expectation
is no longer met.  Let's say the firmware reads the downstream device's
Vendor ID register to probe whether the device is there.  That'll no
longer work as expected once the hotplug port is transitioned to D3hot
by the operating system.

In retrospect, this code comment is probably confusing:

		/*
		 * Hotplug ports handled by firmware in System Management Mode
		 * may not be put into D3 by the OS (Thunderbolt on non-Macs).
		 */
		if (bridge->is_hotplug_bridge && !pciehp_is_native(bridge))
			return false;

This is not in any way specific to Thunderbolt.  It's just that
back in the day, Thunderbolt tunnel management was done natively
on Macs, whereas non-Macs did it in firmware.  That has since
changed and most vendors have adopted native tunnel management.
So the code comment is outdated.  The following would probably
be more accurate today:

		/*
		 * Hotplug ports handled by firmware in System Management Mode
		 * may not be put into D3 by the OS (behind firmware's back).
		 */


> > A somewhat longer explanation I gave in 2016:
> > https://lore.kernel.org/all/20160617213209.GA1927@wunner.de/
> > 
> > Perhaps the code comment preceding that check can be rephrased to
> > convey its meaning more clearly...
> 
> Thanks!  I think it would be worth trying to separate out the "normal"
> things that correspond to the spec from the "quirk" things that work
> around defects.  That's not material for *this* patch, though.
> 
> It's also a little weird that pci_bridge_d3_possible() itself looks
> like it's invariant for the life of the system, but we call it several
> times (pci_pm_init(), pci_bridge_d3_update(), pcie_portdrv_probe(),
> etc).  I guess this is because we save the result in dev->bridge_d3,
> but then pci_bridge_d3_update() updates dev->bridge_d3 based on other
> things, so the original value is lost.  Maybe another bit or two could
> avoid those extra calls.

Right on all accounts.  Those invocations of pci_bridge_d3_possible()
are all in code paths which run only once, e.g. on enumeration and removal
of the PCIe port and on shutdown.  We figured that it's not worth it
to cache the return value of pci_bridge_d3_possible() for these few
invocations, none of which are in hot paths.  ("We" is mostly Mika and
yours truly, who introduced this for Thunderbolt power management.)

Thanks,

Lukas
diff mbox series

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 60230da957e0c..bfdad2eb36d13 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3037,10 +3037,15 @@  bool pci_bridge_d3_possible(struct pci_dev *bridge)
 			return false;
 
 		/*
-		 * It should be safe to put PCIe ports from 2015 or newer
-		 * to D3.
+		 * Allow Intel PCIe ports from 2015 onward to go into D3 to
+		 * achieve additional energy conservation on some platforms.
+		 *
+		 * This is only set for Intel PCIe ports as it causes problems
+		 * on both AMD Rembrandt and Phoenix platforms where USB keyboards
+		 * can not be used to wake the system from suspend.
 		 */
-		if (dmi_get_bios_year() >= 2015)
+		if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
+		    dmi_get_bios_year() >= 2015)
 			return true;
 		break;
 	}