
PCI/ASPM: Enable ASPM on external PCIe devices

Message ID 20230615070421.1704133-1-kai.heng.feng@canonical.com
State New

Commit Message

Kai-Heng Feng June 15, 2023, 7:04 a.m. UTC
When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
enabled for that device. However, when the device is plugged preboot,
ASPM is enabled by default.

The disparity happens because BIOS doesn't have the ability to program
ASPM on hotplugged devices.

So enable ASPM by default for externally connected PCIe devices so ASPM
settings are consistent between preboot and hotplug.

On HP Thunderbolt Dock G4, enabling ASPM can also fix the BadDLLP error:
pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
pcieport 0000:07:04.0:    [ 7] BadDLLP

The root cause is still unclear, but quite likely because the I225 on
the dock supports PTM, where ASPM timing is precalculated for the PTM.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/pci/pcie/aspm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Ilpo Järvinen June 15, 2023, 2:02 p.m. UTC | #1
On Thu, 15 Jun 2023, Kai-Heng Feng wrote:

> When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> enabled for that device. However, when the device is plugged preboot,
> ASPM is enabled by default.
> 
> The disparity happens because BIOS doesn't have the ability to program
> ASPM on hotplugged devices.
> 
> So enable ASPM by default for external connected PCIe devices so ASPM
> settings are consitent between preboot and hotplugged.

consistent

> On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:

enabling ASPM ?
Kuppuswamy Sathyanarayanan June 15, 2023, 5:07 p.m. UTC | #2
On 6/15/23 12:04 AM, Kai-Heng Feng wrote:
> When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> enabled for that device. However, when the device is plugged preboot,
> ASPM is enabled by default.
> 
> The disparity happens because BIOS doesn't have the ability to program
> ASPM on hotplugged devices.
> 
> So enable ASPM by default for external connected PCIe devices so ASPM
> settings are consitent between preboot and hotplugged.

Why does it have to be consistent? Can you add info about what it solves?

> 
> On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
> pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> pcieport 0000:07:04.0:    [ 7] BadDLLP
> 
> The root cause is still unclear, but quite likely because the I225 on
> the dock supports PTM, where ASPM timing is precalculated for the PTM.
> 
> Cc: Mario Limonciello <mario.limonciello@amd.com>
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pcie/aspm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 66d7514ca111..613b0754c9bb 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
>  		/* Enable Everything */
>  		return ASPM_STATE_ALL;
>  	case POLICY_DEFAULT:
> -		return link->aspm_default;
> +		return dev_is_removable(&link->downstream->dev) ?
> +			link->aspm_capable :
> +			link->aspm_default;
>  	}
>  	return 0;
>  }
Bjorn Helgaas June 15, 2023, 5:12 p.m. UTC | #3
On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
> When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> enabled for that device. However, when the device is plugged preboot,
> ASPM is enabled by default.
> 
> The disparity happens because BIOS doesn't have the ability to program
> ASPM on hotplugged devices.
> 
> So enable ASPM by default for external connected PCIe devices so ASPM
> settings are consitent between preboot and hotplugged.
> 
> On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
> pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> pcieport 0000:07:04.0:    [ 7] BadDLLP
> 
> The root cause is still unclear, but quite likely because the I225 on
> the dock supports PTM, where ASPM timing is precalculated for the PTM.
> 
> Cc: Mario Limonciello <mario.limonciello@amd.com>
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pcie/aspm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 66d7514ca111..613b0754c9bb 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
>  		/* Enable Everything */
>  		return ASPM_STATE_ALL;
>  	case POLICY_DEFAULT:
> -		return link->aspm_default;
> +		return dev_is_removable(&link->downstream->dev) ?
> +			link->aspm_capable :
> +			link->aspm_default;

I'm a little hesitant because dev_is_removable() is a convenient test
that covers the current issue, but it doesn't seem tightly connected
from a PCIe architecture perspective.

I think the current model of compile-time ASPM policy selection:

  CONFIG_PCIEASPM_DEFAULT          /* BIOS default setting */
  CONFIG_PCIEASPM_PERFORMANCE      /* disable L0s and L1 */
  CONFIG_PCIEASPM_POWERSAVE        /* enable L0s and L1 */
  CONFIG_PCIEASPM_POWER_SUPERSAVE  /* enable L1 substates */

is flawed.  As far as I know, there's no technical reason we have to
select this at kernel build-time.  I suspect the original reason was
risk avoidance, i.e., we were worried that we might expose hardware
defects if we enabled ASPM states that BIOS hadn't already enabled.

How do we get out of that model?  We do have sysfs knobs that should
cover all the functionality (set overall policy as above via
/sys/module/pcie_aspm/parameters/policy; set device-level exceptions
via /sys/bus/pci/devices/.../link/*_aspm).
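For reference, the knobs Bjorn mentions can be exercised like this (a minimal sketch; the BDF address is illustrative, and the per-device `link/*_aspm` attributes only exist on kernels with `CONFIG_PCIEASPM` and a link that supports the state):

```shell
#!/bin/sh
# Read the global ASPM policy; the active one is shown in brackets,
# e.g. "[default] performance powersave powersupersave".
policy=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy" ]; then
    cat "$policy"
else
    echo "pcie_aspm policy knob not present (CONFIG_PCIEASPM disabled?)"
fi

# Per-device overrides (root required; the device address is illustrative):
#   echo 0 > /sys/bus/pci/devices/0000:07:04.0/link/l1_aspm   # disable L1
#   echo 1 > /sys/bus/pci/devices/0000:07:04.0/link/l1_aspm   # re-enable L1
```

Writing a new value to the global policy file reconfigures every link at runtime, while the per-device files adjust a single link without touching the policy.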

In my opinion, the cleanest solution would be to enable all ASPM
functionality whenever possible and let users disable it if they need
to for performance.  If there are device defects when something is
enabled, deal with it via quirks, as we do for other PCI features.

That feels a little risky, but let's have a conversation about where
we want to go in the long term.  It's good to avoid risk, but too much
avoidance leads to its own complexity and an inability to change
things.

Bjorn
Kai-Heng Feng June 16, 2023, 2:37 a.m. UTC | #4
On Fri, Jun 16, 2023 at 1:07 AM Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>
>
>
> On 6/15/23 12:04 AM, Kai-Heng Feng wrote:
> > When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> > enabled for that device. However, when the device is plugged preboot,
> > ASPM is enabled by default.
> >
> > The disparity happens because BIOS doesn't have the ability to program
> > ASPM on hotplugged devices.
> >
> > So enable ASPM by default for external connected PCIe devices so ASPM
> > settings are consitent between preboot and hotplugged.
>
> Why it has to be consistent? Can you add info about what it solves?

It enables ASPM when the BIOS can't program LNKCTL.

Enabling ASPM can bring significant power savings on modern CPUs.

It's also quite nice that it makes the BadDLLP errors caused by the I225 device go away.

Kai-Heng

>
> >
> > On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
> > pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> > pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > pcieport 0000:07:04.0:    [ 7] BadDLLP
> >
> > The root cause is still unclear, but quite likely because the I225 on
> > the dock supports PTM, where ASPM timing is precalculated for the PTM.
> >
> > Cc: Mario Limonciello <mario.limonciello@amd.com>
> > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >  drivers/pci/pcie/aspm.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 66d7514ca111..613b0754c9bb 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
> >               /* Enable Everything */
> >               return ASPM_STATE_ALL;
> >       case POLICY_DEFAULT:
> > -             return link->aspm_default;
> > +             return dev_is_removable(&link->downstream->dev) ?
> > +                     link->aspm_capable :
> > +                     link->aspm_default;
> >       }
> >       return 0;
> >  }
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
Kai-Heng Feng June 16, 2023, 3:01 a.m. UTC | #5
On Fri, Jun 16, 2023 at 1:12 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
> > When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> > enabled for that device. However, when the device is plugged preboot,
> > ASPM is enabled by default.
> >
> > The disparity happens because BIOS doesn't have the ability to program
> > ASPM on hotplugged devices.
> >
> > So enable ASPM by default for external connected PCIe devices so ASPM
> > settings are consitent between preboot and hotplugged.
> >
> > On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
> > pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> > pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > pcieport 0000:07:04.0:    [ 7] BadDLLP
> >
> > The root cause is still unclear, but quite likely because the I225 on
> > the dock supports PTM, where ASPM timing is precalculated for the PTM.
> >
> > Cc: Mario Limonciello <mario.limonciello@amd.com>
> > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >  drivers/pci/pcie/aspm.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 66d7514ca111..613b0754c9bb 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
> >               /* Enable Everything */
> >               return ASPM_STATE_ALL;
> >       case POLICY_DEFAULT:
> > -             return link->aspm_default;
> > +             return dev_is_removable(&link->downstream->dev) ?
> > +                     link->aspm_capable :
> > +                     link->aspm_default;
>
> I'm a little hesitant because dev_is_removable() is a convenient test
> that covers the current issue, but it doesn't seem tightly connected
> from a PCIe architecture perspective.
>
> I think the current model of compile-time ASPM policy selection:
>
>   CONFIG_PCIEASPM_DEFAULT          /* BIOS default setting */
>   CONFIG_PCIEASPM_PERFORMANCE      /* disable L0s and L1 */
>   CONFIG_PCIEASPM_POWERSAVE        /* enable L0s and L1 */
>   CONFIG_PCIEASPM_POWER_SUPERSAVE  /* enable L1 substates */
>
> is flawed.  As far as I know, there's no technical reason we have to
> select this at kernel build-time.  I suspect the original reason was
> risk avoidance, i.e., we were worried that we might expose hardware
> defects if we enabled ASPM states that BIOS hadn't already enabled.
>
> How do we get out of that model?  We do have sysfs knobs that should
> cover all the functionality (set overall policy as above via
> /sys/module/pcie_aspm/parameters/policy; set device-level exceptions
> via /sys/bus/pci/devices/.../link/*_aspm).

Agree. The build-time policy can be obsoleted by a boot-time argument.

>
> In my opinion, the cleanest solution would be to enable all ASPM
> functionality whenever possible and let users disable it if they need
> to for performance.  If there are device defects when something is
> enabled, deal with it via quirks, as we do for other PCI features.
>
> That feels a little risky, but let's have a conversation about where
> we want to go in the long term.  It's good to avoid risk, but too much
> avoidance leads to its own complexity and an inability to change
> things.

I think we should separate the situation into two cases:
- When BIOS/system firmware has the ability to program ASPM, honor it.
This applies to most "internal" PCI devices.
- When BIOS/system firmware can't program ASPM, enable ASPM for whatever
the link is capable of. The most notable cases are the Intel VMD
controller and, with this patch, devices connected through TBT.

Enabling all ASPM functionality regardless of what's been
pre-programmed by the BIOS is way too risky.
Disabling ASPM to work around issues and defects is still quite common
among hardware manufacturers.

Kai-Heng

>
> Bjorn
Bjorn Helgaas June 16, 2023, 10:01 p.m. UTC | #6
On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
> When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> enabled for that device. However, when the device is plugged preboot,
> ASPM is enabled by default.
> 
> The disparity happens because BIOS doesn't have the ability to program
> ASPM on hotplugged devices.
> 
> So enable ASPM by default for external connected PCIe devices so ASPM
> settings are consitent between preboot and hotplugged.
> 
> On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
> pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> pcieport 0000:07:04.0:    [ 7] BadDLLP
> 
> The root cause is still unclear, but quite likely because the I225 on
> the dock supports PTM, where ASPM timing is precalculated for the PTM.

> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557

I know you said this isn't clear yet, but I don't see a connection
between ASPM being enabled and PTM.  If anything, *disabling* ASPM
should be safer if there's a timing issue.  

I assume the ASPM timing you refer to is the LTR snoop/no snoop
latency, since that's the only timing difference I see in the lspci
output in bugzilla?

I don't see any PTM differences there.

Bjorn
Mario Limonciello June 19, 2023, 4:16 p.m. UTC | #7
On 6/15/2023 10:01 PM, Kai-Heng Feng wrote:
> On Fri, Jun 16, 2023 at 1:12 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
>>> When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
>>> enabled for that device. However, when the device is plugged preboot,
>>> ASPM is enabled by default.
>>>
>>> The disparity happens because BIOS doesn't have the ability to program
>>> ASPM on hotplugged devices.
>>>
>>> So enable ASPM by default for external connected PCIe devices so ASPM
>>> settings are consitent between preboot and hotplugged.
>>>
>>> On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
>>> pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
>>> pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
>>> pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
>>> pcieport 0000:07:04.0:    [ 7] BadDLLP
>>>
>>> The root cause is still unclear, but quite likely because the I225 on
>>> the dock supports PTM, where ASPM timing is precalculated for the PTM.
>>>
>>> Cc: Mario Limonciello <mario.limonciello@amd.com>
>>> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>> ---
>>>   drivers/pci/pcie/aspm.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>> index 66d7514ca111..613b0754c9bb 100644
>>> --- a/drivers/pci/pcie/aspm.c
>>> +++ b/drivers/pci/pcie/aspm.c
>>> @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
>>>                /* Enable Everything */
>>>                return ASPM_STATE_ALL;
>>>        case POLICY_DEFAULT:
>>> -             return link->aspm_default;
>>> +             return dev_is_removable(&link->downstream->dev) ?
>>> +                     link->aspm_capable :
>>> +                     link->aspm_default;
>> I'm a little hesitant because dev_is_removable() is a convenient test
>> that covers the current issue, but it doesn't seem tightly connected
>> from a PCIe architecture perspective.
>>
>> I think the current model of compile-time ASPM policy selection:
>>
>>    CONFIG_PCIEASPM_DEFAULT          /* BIOS default setting */
>>    CONFIG_PCIEASPM_PERFORMANCE      /* disable L0s and L1 */
>>    CONFIG_PCIEASPM_POWERSAVE        /* enable L0s and L1 */
>>    CONFIG_PCIEASPM_POWER_SUPERSAVE  /* enable L1 substates */
>>
>> is flawed.  As far as I know, there's no technical reason we have to
>> select this at kernel build-time.  I suspect the original reason was
>> risk avoidance, i.e., we were worried that we might expose hardware
>> defects if we enabled ASPM states that BIOS hadn't already enabled.
>>
>> How do we get out of that model?  We do have sysfs knobs that should
>> cover all the functionality (set overall policy as above via
>> /sys/module/pcie_aspm/parameters/policy; set device-level exceptions
>> via /sys/bus/pci/devices/.../link/*_aspm).
> Agree. Build-time policy can be obsoleted by boot-time argument.
I agree as well.
>
>> In my opinion, the cleanest solution would be to enable all ASPM
>> functionality whenever possible and let users disable it if they need
>> to for performance.  If there are device defects when something is
>> enabled, deal with it via quirks, as we do for other PCI features.
>>
>> That feels a little risky, but let's have a conversation about where
>> we want to go in the long term.  It's good to avoid risk, but too much
>> avoidance leads to its own complexity and an inability to change
>> things.
> I think we should separate the situation into two cases:
> - When BIOS/system firmware has the ability to program ASPM, honor it.
> This applies to most "internal" PCI devices.
> - When BIOS/system can't program ASPM, enable ASPM for whatever it's
> capable of. Most notable case is Intel VMD controller, and this patch
> for devices connected through TBT.
>
> Enabling all ASPM functionality regardless of what's being
> pre-programmed by BIOS is way too risky.
> Disabling ASPM to workaround issues and defects are still quite common
> among hardware manufacturers.
I think the pragmatic way to approach it is to
(essentially) apply the policy as BIOS defaults and
allow overrides from that.

1) Leave a kernel command line option that will force it
off as a "big hammer" for problems.

Remove other ASPM related kernel command line options.

2) Direct users to use the sysfs knobs to override on a
case-by-case basis for devices, to aid in development of
any quirks or new policies.

3) Add quirks for things like VMD.

For external devices, I don't think we should have a blanket
policy.  What happens when you put a faulty device into a
USB4 PCIe enclosure?  I think this would be a recipe for system
instability.

If the i225 PCIe device generally contained in USB4 docks
is known to behave well with ASPM, how about just quirking it the
way VMD would be quirked?
Bjorn Helgaas June 19, 2023, 9:37 p.m. UTC | #8
On Mon, Jun 19, 2023 at 11:16:35AM -0500, Limonciello, Mario wrote:
> On 6/15/2023 10:01 PM, Kai-Heng Feng wrote:
> > On Fri, Jun 16, 2023 at 1:12 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
> > > > When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> > > > enabled for that device. However, when the device is plugged preboot,
> > > > ASPM is enabled by default.
> > > > 
> > > > The disparity happens because BIOS doesn't have the ability to program
> > > > ASPM on hotplugged devices.
> > > > 
> > > > So enable ASPM by default for external connected PCIe devices so ASPM
> > > > settings are consitent between preboot and hotplugged.
> > > > 
> > > > On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
> > > > pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> > > > pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > > > pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > > pcieport 0000:07:04.0:    [ 7] BadDLLP
> > > > 
> > > > The root cause is still unclear, but quite likely because the I225 on
> > > > the dock supports PTM, where ASPM timing is precalculated for the PTM.
> > > > 
> > > > Cc: Mario Limonciello <mario.limonciello@amd.com>
> > > > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
> > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > ---
> > > >   drivers/pci/pcie/aspm.c | 4 +++-
> > > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > index 66d7514ca111..613b0754c9bb 100644
> > > > --- a/drivers/pci/pcie/aspm.c
> > > > +++ b/drivers/pci/pcie/aspm.c
> > > > @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
> > > >                /* Enable Everything */
> > > >                return ASPM_STATE_ALL;
> > > >        case POLICY_DEFAULT:
> > > > -             return link->aspm_default;
> > > > +             return dev_is_removable(&link->downstream->dev) ?
> > > > +                     link->aspm_capable :
> > > > +                     link->aspm_default;
> > >
> > > I'm a little hesitant because dev_is_removable() is a convenient
> > > test that covers the current issue, but it doesn't seem tightly
> > > connected from a PCIe architecture perspective.
> > > 
> > > I think the current model of compile-time ASPM policy selection:
> > > 
> > >    CONFIG_PCIEASPM_DEFAULT          /* BIOS default setting */
> > >    CONFIG_PCIEASPM_PERFORMANCE      /* disable L0s and L1 */
> > >    CONFIG_PCIEASPM_POWERSAVE        /* enable L0s and L1 */
> > >    CONFIG_PCIEASPM_POWER_SUPERSAVE  /* enable L1 substates */
> > > 
> > > is flawed.  As far as I know, there's no technical reason we
> > > have to select this at kernel build-time.  I suspect the
> > > original reason was risk avoidance, i.e., we were worried that
> > > we might expose hardware defects if we enabled ASPM states that
> > > BIOS hadn't already enabled.
> > > 
> > > How do we get out of that model?  We do have sysfs knobs that
> > > should cover all the functionality (set overall policy as above
> > > via /sys/module/pcie_aspm/parameters/policy; set device-level
> > > exceptions via /sys/bus/pci/devices/.../link/*_aspm).
> >
> > Agree. Build-time policy can be obsoleted by boot-time argument.
>
> I agree as well.

This isn't strictly relevant to the current problem, so let's put this
on the back burner for now.

> > > In my opinion, the cleanest solution would be to enable all ASPM
> > > functionality whenever possible and let users disable it if they
> > > need to for performance.  If there are device defects when
> > > something is enabled, deal with it via quirks, as we do for
> > > other PCI features.
> > > 
> > > That feels a little risky, but let's have a conversation about
> > > where we want to go in the long term.  It's good to avoid risk,
> > > but too much avoidance leads to its own complexity and an
> > > inability to change things.
> >
> > I think we should separate the situation into two cases:
> > - When BIOS/system firmware has the ability to program ASPM, honor
> > it.  This applies to most "internal" PCI devices.
> > - When BIOS/system can't program ASPM, enable ASPM for whatever
> > it's capable of. Most notable case is Intel VMD controller, and
> > this patch for devices connected through TBT.
> > 
> > Enabling all ASPM functionality regardless of what's being
> > pre-programmed by BIOS is way too risky.  Disabling ASPM to
> > workaround issues and defects are still quite common among
> > hardware manufacturers.

It sounds like you have actual experience with this :)  Do you have
any concrete examples that we can use as "known breakage"?

This feels like a real problem to me.  There are existing mechanisms
(ACPI_FADT_NO_ASPM and _OSC PCIe cap ownership) the platform can use
to prevent the OS from using ASPM.

If vendors assume that *in addition*, the OS will pay attention to
whatever ASPM configuration BIOS did, that's a major disconnect.  We
don't do anything like that for other PCI features, and I'm not aware
of any restriction like that being documented.

> I think the pragmatic way to approach it is to (essentially) apply
> the policy as BIOS defaults and allow overrides from that.

Do you mean that when enumerating a device (at boot-time or hot-add
time), we would read the current ASPM config but not change it?  And
users could use the sysfs knobs to enable/disable ASPM as desired?
That wouldn't solve the problem Kai-Heng is trying to solve.

Or that we leave ASPM alone during boot-time enumeration, but enable
ASPM when we enumerate hot-added devices?  It doesn't sound right that
a device would be configured differently if present at boot vs
hot-added.

Bjorn
Mario Limonciello June 19, 2023, 10:09 p.m. UTC | #9
On 6/19/2023 4:37 PM, Bjorn Helgaas wrote:
> On Mon, Jun 19, 2023 at 11:16:35AM -0500, Limonciello, Mario wrote:
>> On 6/15/2023 10:01 PM, Kai-Heng Feng wrote:
>>> On Fri, Jun 16, 2023 at 1:12 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>> On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
>>>>> When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
>>>>> enabled for that device. However, when the device is plugged preboot,
>>>>> ASPM is enabled by default.
>>>>>
>>>>> The disparity happens because BIOS doesn't have the ability to program
>>>>> ASPM on hotplugged devices.
>>>>>
>>>>> So enable ASPM by default for external connected PCIe devices so ASPM
>>>>> settings are consitent between preboot and hotplugged.
>>>>>
>>>>> On HP Thunderbolt Dock G4, enable ASPM can also fix BadDLLP error:
>>>>> pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
>>>>> pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
>>>>> pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
>>>>> pcieport 0000:07:04.0:    [ 7] BadDLLP
>>>>>
>>>>> The root cause is still unclear, but quite likely because the I225 on
>>>>> the dock supports PTM, where ASPM timing is precalculated for the PTM.
>>>>>
>>>>> Cc: Mario Limonciello <mario.limonciello@amd.com>
>>>>> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
>>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
>>>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>>>> ---
>>>>>    drivers/pci/pcie/aspm.c | 4 +++-
>>>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>>> index 66d7514ca111..613b0754c9bb 100644
>>>>> --- a/drivers/pci/pcie/aspm.c
>>>>> +++ b/drivers/pci/pcie/aspm.c
>>>>> @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
>>>>>                 /* Enable Everything */
>>>>>                 return ASPM_STATE_ALL;
>>>>>         case POLICY_DEFAULT:
>>>>> -             return link->aspm_default;
>>>>> +             return dev_is_removable(&link->downstream->dev) ?
>>>>> +                     link->aspm_capable :
>>>>> +                     link->aspm_default;
>>>> I'm a little hesitant because dev_is_removable() is a convenient
>>>> test that covers the current issue, but it doesn't seem tightly
>>>> connected from a PCIe architecture perspective.
>>>>
>>>> I think the current model of compile-time ASPM policy selection:
>>>>
>>>>     CONFIG_PCIEASPM_DEFAULT          /* BIOS default setting */
>>>>     CONFIG_PCIEASPM_PERFORMANCE      /* disable L0s and L1 */
>>>>     CONFIG_PCIEASPM_POWERSAVE        /* enable L0s and L1 */
>>>>     CONFIG_PCIEASPM_POWER_SUPERSAVE  /* enable L1 substates */
>>>>
>>>> is flawed.  As far as I know, there's no technical reason we
>>>> have to select this at kernel build-time.  I suspect the
>>>> original reason was risk avoidance, i.e., we were worried that
>>>> we might expose hardware defects if we enabled ASPM states that
>>>> BIOS hadn't already enabled.
>>>>
>>>> How do we get out of that model?  We do have sysfs knobs that
>>>> should cover all the functionality (set overall policy as above
>>>> via /sys/module/pcie_aspm/parameters/policy; set device-level
>>>> exceptions via /sys/bus/pci/devices/.../link/*_aspm).
>>> Agree. Build-time policy can be obsoleted by boot-time argument.
>> I agree as well.
> This isn't strictly relevant to the current problem, so let's put this
> on the back burner for now.


I think it could fold into it if we end up treating the i225
PCIe device as a quirk as mentioned below.

>
>>>> In my opinion, the cleanest solution would be to enable all ASPM
>>>> functionality whenever possible and let users disable it if they
>>>> need to for performance.  If there are device defects when
>>>> something is enabled, deal with it via quirks, as we do for
>>>> other PCI features.
>>>>
>>>> That feels a little risky, but let's have a conversation about
>>>> where we want to go in the long term.  It's good to avoid risk,
>>>> but too much avoidance leads to its own complexity and an
>>>> inability to change things.
>>> I think we should separate the situation into two cases:
>>> - When BIOS/system firmware has the ability to program ASPM, honor
>>> it.  This applies to most "internal" PCI devices.
>>> - When BIOS/system can't program ASPM, enable ASPM for whatever
>>> it's capable of. Most notable case is Intel VMD controller, and
>>> this patch for devices connected through TBT.
>>>
>>> Enabling all ASPM functionality regardless of what's being
>>> pre-programmed by BIOS is way too risky.  Disabling ASPM to
>>> workaround issues and defects are still quite common among
>>> hardware manufacturers.
> It sounds like you have actual experience with this :)  Do you have
> any concrete examples that we can use as "known breakage"?
A variety of Intel chipsets don't support lane width switching
or speed switching.  When ASPM has been enabled on a dGPU,
these features are utilized and breakage ensues.

There are various methods to try to mitigate the impact both in
firmware and driver code.

>
> This feels like a real problem to me.  There are existing mechanisms
> (ACPI_FADT_NO_ASPM and _OSC PCIe cap ownership) the platform can use
> to prevent the OS from using ASPM.
>
> If vendors assume that *in addition*, the OS will pay attention to
> whatever ASPM configuration BIOS did, that's a major disconnect.  We
> don't do anything like that for other PCI features, and I'm not aware
> of any restriction like that being documented.
With both of those policies in place, how did we get into
the situation of having configuration options and knobs?
>> I think the pragmatic way to approach it is to (essentially) apply
>> the policy as BIOS defaults and allow overrides from that.
> Do you mean that when enumerating a device (at boot-time or hot-add
> time), we would read the current ASPM config but not change it?  And
> users could use the sysfs knobs to enable/disable ASPM as desired?
Yes.
> That wouldn't solve the problem Kai-Heng is trying to solve.
Alone it wouldn't; but if you treated the i225 PCIe device
connected to the system as a "quirk" to apply ASPM policy
from the parent device to this child device it could.
> Or that we leave ASPM alone during boot-time enumeration, but enable
> ASPM when we enumerate hot-added devices?  It doesn't sound right that
> a device would be configured differently if present at boot vs
> hot-added.
Same policy for both boot and hot add, but specifically enable or
disable it if the device is in a quirk list.
Bjorn Helgaas June 20, 2023, 6:28 p.m. UTC | #10
On Mon, Jun 19, 2023 at 05:09:55PM -0500, Limonciello, Mario wrote:
> On 6/19/2023 4:37 PM, Bjorn Helgaas wrote:
> > On Mon, Jun 19, 2023 at 11:16:35AM -0500, Limonciello, Mario wrote:
> > > On 6/15/2023 10:01 PM, Kai-Heng Feng wrote:
> > > > On Fri, Jun 16, 2023 at 1:12 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
> > > > > > When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> > > > > > enabled for that device. However, when the device is plugged preboot,
> > > > > > ASPM is enabled by default.
> > > > > > 
> > > > > > The disparity happens because BIOS doesn't have the ability to program
> > > > > > ASPM on hotplugged devices.
> > > > > > 
> > > > > > So enable ASPM by default for externally connected PCIe devices so ASPM
> > > > > > settings are consistent between preboot and hotplugged.
> > > > > > 
> > > > > > On HP Thunderbolt Dock G4, enabling ASPM can also fix a BadDLLP error:
> > > > > > pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> > > > > > pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > > > > > pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > > > > pcieport 0000:07:04.0:    [ 7] BadDLLP
> > > > > > 
> > > > > > The root cause is still unclear, but quite likely because the I225 on
> > > > > > the dock supports PTM, where ASPM timing is precalculated for the PTM.
> > > > > > 
> > > > > > Cc: Mario Limonciello <mario.limonciello@amd.com>
> > > > > > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
> > > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > > ---
> > > > > >    drivers/pci/pcie/aspm.c | 4 +++-
> > > > > >    1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > > index 66d7514ca111..613b0754c9bb 100644
> > > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > > @@ -119,7 +119,9 @@ static int policy_to_aspm_state(struct pcie_link_state *link)
> > > > > >                 /* Enable Everything */
> > > > > >                 return ASPM_STATE_ALL;
> > > > > >         case POLICY_DEFAULT:
> > > > > > -             return link->aspm_default;
> > > > > > +             return dev_is_removable(&link->downstream->dev) ?
> > > > > > +                     link->aspm_capable :
> > > > > > +                     link->aspm_default;
> > > > > I'm a little hesitant because dev_is_removable() is a convenient
> > > > > test that covers the current issue, but it doesn't seem tightly
> > > > > connected from a PCIe architecture perspective.
> > > > > 
> > > > > I think the current model of compile-time ASPM policy selection:
> > > > > 
> > > > >     CONFIG_PCIEASPM_DEFAULT          /* BIOS default setting */
> > > > >     CONFIG_PCIEASPM_PERFORMANCE      /* disable L0s and L1 */
> > > > >     CONFIG_PCIEASPM_POWERSAVE        /* enable L0s and L1 */
> > > > >     CONFIG_PCIEASPM_POWER_SUPERSAVE  /* enable L1 substates */
> > > > > 
> > > > > is flawed.  As far as I know, there's no technical reason we
> > > > > have to select this at kernel build-time.  I suspect the
> > > > > original reason was risk avoidance, i.e., we were worried that
> > > > > we might expose hardware defects if we enabled ASPM states that
> > > > > BIOS hadn't already enabled.
> > > > > 
> > > > > How do we get out of that model?  We do have sysfs knobs that
> > > > > should cover all the functionality (set overall policy as above
> > > > > via /sys/module/pcie_aspm/parameters/policy; set device-level
> > > > > exceptions via /sys/bus/pci/devices/.../link/*_aspm).
> > > > Agree. Build-time policy can be obsoleted by boot-time argument.
> > > I agree as well.
> > This isn't strictly relevant to the current problem, so let's put this
> > on the back burner for now.
> 
> I think it could fold into it if we end up treating the i225
> PCIe device as a quirk as mentioned below.
> 
> > > > > In my opinion, the cleanest solution would be to enable all ASPM
> > > > > functionality whenever possible and let users disable it if they
> > > > > need to for performance.  If there are device defects when
> > > > > something is enabled, deal with it via quirks, as we do for
> > > > > other PCI features.
> > > > > 
> > > > > That feels a little risky, but let's have a conversation about
> > > > > where we want to go in the long term.  It's good to avoid risk,
> > > > > but too much avoidance leads to its own complexity and an
> > > > > inability to change things.
> > > > I think we should separate the situation into two cases:
> > > > - When BIOS/system firmware has the ability to program ASPM, honor
> > > > it.  This applies to most "internal" PCI devices.
> > > > - When BIOS/system can't program ASPM, enable ASPM for whatever
> > > > it's capable of. Most notable case is Intel VMD controller, and
> > > > this patch for devices connected through TBT.
> > > > 
> > > > Enabling all ASPM functionality regardless of what's being
> > > > pre-programmed by BIOS is way too risky.  Disabling ASPM to
> > > > work around issues and defects is still quite common among
> > > > hardware manufacturers.
> >
> > It sounds like you have actual experience with this :)  Do you have
> > any concrete examples that we can use as "known breakage"?
>
> A variety of Intel chipsets don't support lane width switching
> or speed switching.  When ASPM has been enabled on a dGPU,
> these features are utilized and breakage ensues.

Maybe this helps explain all the completely unmaintainable ASPM
garbage in amdgpu and radeon.

If these devices are broken, we need quirks for them.  We can't avoid
ASPM in general just because random devices break.

> There are various methods to try to mitigate the impact both in
> firmware and driver code.
> 
> > This feels like a real problem to me.  There are existing mechanisms
> > (ACPI_FADT_NO_ASPM and _OSC PCIe cap ownership) the platform can use
> > to prevent the OS from using ASPM.
> > 
> > If vendors assume that *in addition*, the OS will pay attention to
> > whatever ASPM configuration BIOS did, that's a major disconnect.  We
> > don't do anything like that for other PCI features, and I'm not aware
> > of any restriction like that being documented.
>
> With both of those policies in place, how did we get into
> the situation of having configuration options and knobs?

The kernel parameters and config options have been there pretty much from
the beginning.  We didn't have the per-device sysfs knobs until very
recently.

> > > I think the pragmatic way to approach it is to (essentially) apply
> > > the policy as BIOS defaults and allow overrides from that.
> >
> > Do you mean that when enumerating a device (at boot-time or hot-add
> > time), we would read the current ASPM config but not change it?  And
> > users could use the sysfs knobs to enable/disable ASPM as desired?
>
> Yes.

Hot-added devices power up with ASPM disabled.  This policy would mean
the user has to explicitly enable it, which doesn't seem practical to
me.

> > That wouldn't solve the problem Kai-Heng is trying to solve.
>
> Alone it wouldn't; but if you treated the i225 PCIe device
> connected to the system as a "quirk" to apply ASPM policy
> from the parent device to this child device it could.

I want quirks for BROKEN devices.  Quirks for working hardware is a
maintenance nightmare.

Bjorn
Mario Limonciello June 20, 2023, 6:36 p.m. UTC | #11
<snip>
>> A variety of Intel chipsets don't support lane width switching
>> or speed switching.  When ASPM has been enabled on a dGPU,
>> these features are utilized and breakage ensues.
> Maybe this helps explain all the completely unmaintainable ASPM
> garbage in amdgpu and radeon.
>
> If these devices are broken, we need quirks for them.

The problem is which device do you consider "broken"?
The dGPU that uses these features when the platform advertises ASPM
or the chipset which doesn't support the features that the device
uses when ASPM is active?

To be clear, with this problem the dGPU I'm talking about works fine on
hosts that support these features.

KH has a lot more experience with ASPM issues and hopefully has some
other examples to share.

> We can't avoid
> ASPM in general just because random devices break.

I'm not advocating avoiding it in general; I'm saying we shouldn't
be turning it on across the board for all devices if the platform had
it off initially via a kernel command line option or Kconfig.

>> There are various methods to try to mitigate the impact both in
>> firmware and driver code.
>>
>>> This feels like a real problem to me.  There are existing mechanisms
>>> (ACPI_FADT_NO_ASPM and _OSC PCIe cap ownership) the platform can use
>>> to prevent the OS from using ASPM.
>>>
>>> If vendors assume that *in addition*, the OS will pay attention to
>>> whatever ASPM configuration BIOS did, that's a major disconnect.  We
>>> don't do anything like that for other PCI features, and I'm not aware
>>> of any restriction like that being documented.
>> With both of those policies in place, how did we get into
>> the situation of having configuration options and knobs?
> The kernel parameters and config options have been there pretty much from
> the beginning.  We didn't have the per-device sysfs knobs until very
> recently.
Ah, I see.
>
>>>> I think the pragmatic way to approach it is to (essentially) apply
>>>> the policy as BIOS defaults and allow overrides from that.
>>> Do you mean that when enumerating a device (at boot-time or hot-add
>>> time), we would read the current ASPM config but not change it?  And
>>> users could use the sysfs knobs to enable/disable ASPM as desired?
>> Yes.
> Hot-added devices power up with ASPM disabled.  This policy would mean
> the user has to explicitly enable it, which doesn't seem practical to
> me.
Could we maybe have the hot added devices follow the policy of
the bridge they're connected to by default?
>
>>> That wouldn't solve the problem Kai-Heng is trying to solve.
>> Alone it wouldn't; but if you treated the i225 PCIe device
>> connected to the system as a "quirk" to apply ASPM policy
>> from the parent device to this child device it could.
> I want quirks for BROKEN devices.  Quirks for working hardware is a
> maintenance nightmare.
>
> Bjorn
If you follow my idea of having hot-added devices' policy follow
the parent, would it work for the i225 PCIe device case?
Kai-Heng Feng June 21, 2023, 3:08 a.m. UTC | #12
On Sat, Jun 17, 2023 at 6:01 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Jun 15, 2023 at 03:04:20PM +0800, Kai-Heng Feng wrote:
> > When a PCIe device is hotplugged to a Thunderbolt port, ASPM is not
> > enabled for that device. However, when the device is plugged preboot,
> > ASPM is enabled by default.
> >
> > The disparity happens because BIOS doesn't have the ability to program
> > ASPM on hotplugged devices.
> >
> > So enable ASPM by default for externally connected PCIe devices so ASPM
> > settings are consistent between preboot and hotplugged.
> >
> > On HP Thunderbolt Dock G4, enabling ASPM can also fix a BadDLLP error:
> > pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:04.0
> > pcieport 0000:07:04.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > pcieport 0000:07:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > pcieport 0000:07:04.0:    [ 7] BadDLLP
> >
> > The root cause is still unclear, but quite likely because the I225 on
> > the dock supports PTM, where ASPM timing is precalculated for the PTM.
>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217557
>
> I know you said this isn't clear yet, but I don't see a connection
> between ASPM being enabled and PTM.  If anything, *disabling* ASPM
> should be safer if there's a timing issue.

If PTM timing is tested when ASPM is enabled, there can be a strong
connection between the two.

I'll raise the issue to IGC devs.

>
> I assume the ASPM timing you refer to is the LTR snoop/no snoop
> latency, since that's the only timing difference I see in the lspci
> output in bugzilla?

Not only LTR. ASPM L0s and L1 are not enabled when devices are hotplugged.

Kai-Heng

>
> I don't see any PTM differences there.
>
> Bjorn
Bjorn Helgaas June 22, 2023, 11:06 p.m. UTC | #13
On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> <snip>
> > > A variety of Intel chipsets don't support lane width switching
> > > or speed switching.  When ASPM has been enabled on a dGPU,
> > > these features are utilized and breakage ensues.
> > Maybe this helps explain all the completely unmaintainable ASPM
> > garbage in amdgpu and radeon.
> > 
> > If these devices are broken, we need quirks for them.
> 
> The problem is which device do you consider "broken"?
> The dGPU that uses these features when the platform advertises ASPM
> or the chipset which doesn't support the features that the device
> uses when ASPM is active?
> 
> With this problem I'm talking about the dGPU works fine on hosts
> that support these features.

Without more details about what's broken and when, I can't say.  What
I *think* is that a device that doesn't work per spec needs a quirk.
Typically it's a device that advertises a capability that doesn't work
correctly.

> > > > > I think the pragmatic way to approach it is to (essentially)
> > > > > apply the policy as BIOS defaults and allow overrides from
> > > > > that.
> > > >
> > > > Do you mean that when enumerating a device (at boot-time or
> > > > hot-add time), we would read the current ASPM config but not
> > > > change it?  And users could use the sysfs knobs to
> > > > enable/disable ASPM as desired?
> > >
> > > Yes.
> > >
> > Hot-added devices power up with ASPM disabled.  This policy would
> > mean the user has to explicitly enable it, which doesn't seem
> > practical to me.
>
> Could we maybe have the hot added devices follow the policy of
> the bridge they're connected to by default?
>
> > > > That wouldn't solve the problem Kai-Heng is trying to solve.
> > >
> > > Alone it wouldn't; but if you treated the i225 PCIe device
> > > connected to the system as a "quirk" to apply ASPM policy
> > > from the parent device to this child device it could.
> >
> > I want quirks for BROKEN devices.  Quirks for working hardware is a
> > maintenance nightmare.
>
> If you follow my idea of hot added devices the policy follows
> the parent would it work for the i225 PCIe device case?

That doesn't *sound* really robust to me because even if the default
config after hot-add works, the user can change things via sysfs, and
any configuration we set it to should work as well.  If there are
land-mines there, we need a quirk that prevents sysfs from running
into it.

Bjorn
Kai-Heng Feng June 27, 2023, 8:35 a.m. UTC | #14
On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> > <snip>
> > > > A variety of Intel chipsets don't support lane width switching
> > > > or speed switching.  When ASPM has been enabled on a dGPU,
> > > > these features are utilized and breakage ensues.
> > > Maybe this helps explain all the completely unmaintainable ASPM
> > > garbage in amdgpu and radeon.
> > >
> > > If these devices are broken, we need quirks for them.
> >
> > The problem is which device do you consider "broken"?
> > The dGPU that uses these features when the platform advertises ASPM
> > or the chipset which doesn't support the features that the device
> > uses when ASPM is active?
> >
> > With this problem I'm talking about the dGPU works fine on hosts
> > that support these features.
>
> Without more details about what's broken and when, I can't say.  What
> I *think* is that a device that doesn't work per spec needs a quirk.
> Typically it's a device that advertises a capability that doesn't work
> correctly.

Many silicon vendors use the same "PCI IP" and integrate it into their
own chips. That's one of the reasons why the capability doesn't really
reflect what's actually supported.
The vendors then send their private datasheet to the system integrator so
BIOS can enable what's really supported.

So the logic is to ignore the capability and trust the default set by BIOS.

>
> > > > > > I think the pragmatic way to approach it is to (essentially)
> > > > > > apply the policy as BIOS defaults and allow overrides from
> > > > > > that.
> > > > >
> > > > > Do you mean that when enumerating a device (at boot-time or
> > > > > hot-add time), we would read the current ASPM config but not
> > > > > change it?  And users could use the sysfs knobs to
> > > > > enable/disable ASPM as desired?
> > > >
> > > > Yes.
> > > >
> > > Hot-added devices power up with ASPM disabled.  This policy would
> > > mean the user has to explicitly enable it, which doesn't seem
> > > practical to me.
> >
> > Could we maybe have the hot added devices follow the policy of
> > the bridge they're connected to by default?
> >
> > > > > That wouldn't solve the problem Kai-Heng is trying to solve.
> > > >
> > > > Alone it wouldn't; but if you treated the i225 PCIe device
> > > > connected to the system as a "quirk" to apply ASPM policy
> > > > from the parent device to this child device it could.
> > >
> > > I want quirks for BROKEN devices.  Quirks for working hardware is a
> > > maintenance nightmare.
> >
> > If you follow my idea of hot added devices the policy follows
> > the parent would it work for the i225 PCIe device case?
>
> That doesn't *sound* really robust to me because even if the default
> config after hot-add works, the user can change things via sysfs, and
> any configuration we set it to should work as well.  If there are
> land-mines there, we need a quirk that prevents sysfs from running
> into it.

For this case it means the driver needs to provide an ASPM callback to
flip PTM based on the ASPM sysfs setting.

Kai-Heng

> Bjorn
Bjorn Helgaas June 27, 2023, 8:54 p.m. UTC | #15
On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
> On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> > > <snip>
> > > > > A variety of Intel chipsets don't support lane width switching
> > > > > or speed switching.  When ASPM has been enabled on a dGPU,
> > > > > these features are utilized and breakage ensues.
> > > >
> > > > Maybe this helps explain all the completely unmaintainable ASPM
> > > > garbage in amdgpu and radeon.
> > > >
> > > > If these devices are broken, we need quirks for them.
> > >
> > > The problem is which device do you consider "broken"?
> > > The dGPU that uses these features when the platform advertises ASPM
> > > or the chipset which doesn't support the features that the device
> > > uses when ASPM is active?
> > >
> > > With this problem I'm talking about the dGPU works fine on hosts
> > > that support these features.
> >
> > Without more details about what's broken and when, I can't say.  What
> > I *think* is that a device that doesn't work per spec needs a quirk.
> > Typically it's a device that advertises a capability that doesn't work
> > correctly.
> 
> Many silicon vendors use the same "PCI IP" and integrate it into their
> own chips. That's one of the reasons why the capability doesn't really
> reflect what's actually supported.
> The vendors then send their private datasheet to system integrator so
> BIOS can enable what's really supported.

It's perfectly fine for the IP to support PCI features that are not
and can not be enabled in a system design.  But I expect that
strapping or firmware would disable those features so they are not
advertised in config space.

If BIOS leaves features disabled because they cannot work, but at the
same time leaves them advertised in config space, I'd say that's a
BIOS defect.  In that case, we should have a DMI quirk or something to
work around the defect.

> So the logic is to ignore the capability and trust the default set
> by BIOS.

I think limiting ASPM support to whatever BIOS configured at boot-time
is problematic.  I don't think we can assume that all platforms have
firmware that configures ASPM as aggressively as possible, and
obviously firmware won't configure hot-added devices at all (in
general; I know ACPI _HPX can do some of that).

It's possible we need some kind of policy that limits ASPM to the BIOS
config until date X, but I don't want to limit that forever.

> > > If you follow my idea of hot added devices the policy follows
> > > the parent would it work for the i225 PCIe device case?
> >
> > That doesn't *sound* really robust to me because even if the default
> > config after hot-add works, the user can change things via sysfs, and
> > any configuration we set it to should work as well.  If there are
> > land-mines there, we need a quirk that prevents sysfs from running
> > into it.
> 
> For this case it means driver needs to provide a ASPM callback to flip
> PTM based on ASPM sysfs.

I'm not sure I follow this, but it sounds like you're saying PTM
doesn't work correctly with some ASPM configurations.  Is this allowed
by the PCIe spec or is it a hardware defect?  If the latter, maybe we
just need a quirk to disable PTM on the device.

I'm not sure PTM is valuable enough to add a generic callback
mechanism to work around a defect in an individual device.

Bjorn
Kai-Heng Feng June 28, 2023, 5:09 a.m. UTC | #16
[+Cc Sasha and Intel Wired Lan]

On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
> > On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> > > > <snip>
> > > > > > A variety of Intel chipsets don't support lane width switching
> > > > > > or speed switching.  When ASPM has been enabled on a dGPU,
> > > > > > these features are utilized and breakage ensues.
> > > > >
> > > > > Maybe this helps explain all the completely unmaintainable ASPM
> > > > > garbage in amdgpu and radeon.
> > > > >
> > > > > If these devices are broken, we need quirks for them.
> > > >
> > > > The problem is which device do you consider "broken"?
> > > > The dGPU that uses these features when the platform advertises ASPM
> > > > or the chipset which doesn't support the features that the device
> > > > uses when ASPM is active?
> > > >
> > > > With this problem I'm talking about the dGPU works fine on hosts
> > > > that support these features.
> > >
> > > Without more details about what's broken and when, I can't say.  What
> > > I *think* is that a device that doesn't work per spec needs a quirk.
> > > Typically it's a device that advertises a capability that doesn't work
> > > correctly.
> >
> > Many silicon vendors use the same "PCI IP" and integrate it into their
> > own chips. That's one of the reasons why the capability doesn't really
> > reflect what's actually supported.
> > The vendors then send their private datasheet to system integrator so
> > BIOS can enable what's really supported.
>
> It's perfectly fine for the IP to support PCI features that are not
> and can not be enabled in a system design.  But I expect that
> strapping or firmware would disable those features so they are not
> advertised in config space.
>
> If BIOS leaves features disabled because they cannot work, but at the
> same time leaves them advertised in config space, I'd say that's a
> BIOS defect.  In that case, we should have a DMI quirk or something to
> work around the defect.

That means most if not all BIOSes are defective.
BIOS vendors and ODMs never bothered (and probably never will) to change
the capabilities advertised by config space because "it already works
under Windows".

>
> > So the logic is to ignore the capability and trust the default set
> > by BIOS.
>
> I think limiting ASPM support to whatever BIOS configured at boot-time
> is problematic.  I don't think we can assume that all platforms have
> firmware that configures ASPM as aggressively as possible, and
> obviously firmware won't configure hot-added devices at all (in
> general; I know ACPI _HPX can do some of that).

Totally agree. I was not suggesting limiting the setting at all.
A boot-time parameter to flip the ASPM setting is very useful. If none has
been set, default to the BIOS setting.

>
> It's possible we need some kind of policy that limits ASPM to the BIOS
> config until date X, but I don't want to limit that forever.

Desktops can have very different PCIe cards in their slots, so the BIOS
date isn't a good indicator of ASPM support.
Moreover, the BIOS date changes on each BIOS upgrade, which means
potential regression risk.

>
> > > > If you follow my idea of hot added devices the policy follows
> > > > the parent would it work for the i225 PCIe device case?
> > >
> > > That doesn't *sound* really robust to me because even if the default
> > > config after hot-add works, the user can change things via sysfs, and
> > > any configuration we set it to should work as well.  If there are
> > > land-mines there, we need a quirk that prevents sysfs from running
> > > into it.
> >
> > For this case it means driver needs to provide a ASPM callback to flip
> > PTM based on ASPM sysfs.
>
> I'm not sure I follow this, but it sounds like you're saying PTM
> doesn't work correctly with some ASPM configurations.  Is this allowed
> by the PCIe spec or is it a hardware defect?  If the latter, maybe we
> just need a quirk to disable PTM on the device.

I'll leave this one to Intel folks.

>
> I'm not sure PTM is valuable enough to add a generic callback
> mechanism to work around a defect in an individual device.

Agree. Right now the issue is only observed when the I225 is in a TBT dock.

Kai-Heng

>
> Bjorn
Bjorn Helgaas July 5, 2023, 8:06 p.m. UTC | #17
On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
> On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
> > > On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:

> > It's perfectly fine for the IP to support PCI features that are not
> > and can not be enabled in a system design.  But I expect that
> > strapping or firmware would disable those features so they are not
> > advertised in config space.
> >
> > If BIOS leaves features disabled because they cannot work, but at the
> > same time leaves them advertised in config space, I'd say that's a
> > BIOS defect.  In that case, we should have a DMI quirk or something to
> > work around the defect.
> 
> That means most if not all BIOSes are defective.
> BIOS vendors and ODM never bothered (and probably will not) to change
> the capabilities advertised by config space because "it already works
> under Windows".

This is what seems strange to me.  Are you saying that Windows never
enables these power-saving features?  Or that Windows includes quirks
for all these broken BIOSes?  Neither idea seems very convincing.

> > > So the logic is to ignore the capability and trust the default set
> > > by BIOS.
> >
> > I think limiting ASPM support to whatever BIOS configured at boot-time
> > is problematic.  I don't think we can assume that all platforms have
> > firmware that configures ASPM as aggressively as possible, and
> > obviously firmware won't configure hot-added devices at all (in
> > general; I know ACPI _HPX can do some of that).
> 
> Totally agree. I was not suggesting to limiting the setting at all.
> A boot-time parameter to flip ASPM setting is very useful. If none has
> been set, default to BIOS setting.

A boot-time parameter for debugging and workarounds is fine.  IMO,
needing a boot-time parameter in the course of normal operation is
not OK.

Bjorn
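For completeness, the boot-time override being discussed is the pcie_aspm module's policy parameter on the kernel command line; the exact sketch below assumes a GRUB setup, and whether a given policy takes effect still depends on _OSC/FADT ownership:

```
# e.g. appended to GRUB_CMDLINE_LINUX in /etc/default/grub
pcie_aspm.policy=powersave    # or: default | performance | powersupersave
# pcie_aspm=off               # disable the OS ASPM driver entirely
```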
Mario Limonciello July 6, 2023, 4:07 a.m. UTC | #18
On 7/5/23 15:06, Bjorn Helgaas wrote:
> On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
>> On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>> On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
>>>> On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>> On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> 
>>> It's perfectly fine for the IP to support PCI features that are not
>>> and can not be enabled in a system design.  But I expect that
>>> strapping or firmware would disable those features so they are not
>>> advertised in config space.
>>>
>>> If BIOS leaves features disabled because they cannot work, but at the
>>> same time leaves them advertised in config space, I'd say that's a
>>> BIOS defect.  In that case, we should have a DMI quirk or something to
>>> work around the defect.
>>
>> That means most if not all BIOSes are defective.
>> BIOS vendors and ODM never bothered (and probably will not) to change
>> the capabilities advertised by config space because "it already works
>> under Windows".
> 
> This is what seems strange to me.  Are you saying that Windows never
> enables these power-saving features?  Or that Windows includes quirks
> for all these broken BIOSes?  Neither idea seems very convincing.
> 

I see your point.  I was looking through Microsoft documentation for 
hints and came across this:

https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/pci-express-settings-link-state-power-management

They have a policy knob to globally set L0 or L1 for PCIe links.

They don't explicitly say it, but surely it's based on what the devices 
advertise in the capabilities registers.

>>>> So the logic is to ignore the capability and trust the default set
>>>> by BIOS.
>>>
>>> I think limiting ASPM support to whatever BIOS configured at boot-time
>>> is problematic.  I don't think we can assume that all platforms have
>>> firmware that configures ASPM as aggressively as possible, and
>>> obviously firmware won't configure hot-added devices at all (in
>>> general; I know ACPI _HPX can do some of that).
>>
>> Totally agree. I was not suggesting to limiting the setting at all.
>> A boot-time parameter to flip ASPM setting is very useful. If none has
>> been set, default to BIOS setting.
> 
> A boot-time parameter for debugging and workarounds is fine.  IMO,
> needing a boot-time parameter in the course of normal operation is
> not OK.
> 
> Bjorn
Kai-Heng Feng July 14, 2023, 8:17 a.m. UTC | #19
On Thu, Jul 6, 2023 at 12:07 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> On 7/5/23 15:06, Bjorn Helgaas wrote:
> > On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
> >> On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>> On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
> >>>> On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>> On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> >
> >>> It's perfectly fine for the IP to support PCI features that are not
> >>> and can not be enabled in a system design.  But I expect that
> >>> strapping or firmware would disable those features so they are not
> >>> advertised in config space.
> >>>
> >>> If BIOS leaves features disabled because they cannot work, but at the
> >>> same time leaves them advertised in config space, I'd say that's a
> >>> BIOS defect.  In that case, we should have a DMI quirk or something to
> >>> work around the defect.
> >>
> >> That means most if not all BIOS are defected.
> >> BIOS vendors and ODM never bothered (and probably will not) to change
> >> the capabilities advertised by config space because "it already works
> >> under Windows".
> >
> > This is what seems strange to me.  Are you saying that Windows never
> > enables these power-saving features?  Or that Windows includes quirks
> > for all these broken BIOSes?  Neither idea seems very convincing.
> >
>
> I see your point.  I was looking through Microsoft documentation for
> hints and came across this:
>
> https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/pci-express-settings-link-state-power-management
>
> They have a policy knob to globally set L0 or L1 for PCIe links.
>
> They don't explicitly say it, but surely it's based on what the devices
> advertise in the capabilities registers.

So essentially it can be achieved via a boot-time kernel parameter
and/or a sysfs knob.

The main point is that the OS should stick to the BIOS default, which
is the only ASPM setting tested before the hardware is put on the
market.

Kai-Heng

>
> >>>> So the logic is to ignore the capability and trust the default set
> >>>> by BIOS.
> >>>
> >>> I think limiting ASPM support to whatever BIOS configured at boot-time
> >>> is problematic.  I don't think we can assume that all platforms have
> >>> firmware that configures ASPM as aggressively as possible, and
> >>> obviously firmware won't configure hot-added devices at all (in
> >>> general; I know ACPI _HPX can do some of that).
> >>
> >> Totally agree. I was not suggesting to limiting the setting at all.
> >> A boot-time parameter to flip ASPM setting is very useful. If none has
> >> been set, default to BIOS setting.
> >
> > A boot-time parameter for debugging and workarounds is fine.  IMO,
> > needing a boot-time parameter in the course of normal operation is
> > not OK.
> >
> > Bjorn
>
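[For reference, the knobs discussed above can be sketched as shell commands. The paths assume a recent mainline kernel with CONFIG_PCIEASPM; the BDF 0000:07:04.0 is taken from the AER log in the commit message and is only an example, and the fallback messages are illustrative.]

```shell
# Global ASPM policy: settable at boot with pcie_aspm.policy=<value>
# (default | performance | powersave | powersupersave), and exposed
# at runtime through sysfs:
POLICY=/sys/module/pcie_aspm/parameters/policy
[ -r "$POLICY" ] && cat "$POLICY" || echo "(pcie_aspm policy knob not present)"

# Per-link ASPM state knobs (one file per state, available since v5.5),
# e.g. for the downstream port from the AER log above:
LINK=/sys/bus/pci/devices/0000:07:04.0/link
[ -d "$LINK" ] && ls "$LINK" || echo "(no per-link ASPM knobs for this device)"
```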
Mario Limonciello July 14, 2023, 4:37 p.m. UTC | #20
On 7/14/23 03:17, Kai-Heng Feng wrote:
> On Thu, Jul 6, 2023 at 12:07 PM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
>>
>> On 7/5/23 15:06, Bjorn Helgaas wrote:
>>> On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
>>>> On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>> On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
>>>>>> On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>> On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
>>>
>>>>> It's perfectly fine for the IP to support PCI features that are not
>>>>> and can not be enabled in a system design.  But I expect that
>>>>> strapping or firmware would disable those features so they are not
>>>>> advertised in config space.
>>>>>
>>>>> If BIOS leaves features disabled because they cannot work, but at the
>>>>> same time leaves them advertised in config space, I'd say that's a
>>>>> BIOS defect.  In that case, we should have a DMI quirk or something to
>>>>> work around the defect.
>>>>
>>>> That means most if not all BIOS are defected.
>>>> BIOS vendors and ODM never bothered (and probably will not) to change
>>>> the capabilities advertised by config space because "it already works
>>>> under Windows".
>>>
>>> This is what seems strange to me.  Are you saying that Windows never
>>> enables these power-saving features?  Or that Windows includes quirks
>>> for all these broken BIOSes?  Neither idea seems very convincing.
>>>
>>
>> I see your point.  I was looking through Microsoft documentation for
>> hints and came across this:
>>
>> https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/pci-express-settings-link-state-power-management
>>
>> They have a policy knob to globally set L0 or L1 for PCIe links.
>>
>> They don't explicitly say it, but surely it's based on what the devices
>> advertise in the capabilities registers.
> 
> So essentially it can be achieved via boot time kernel parameter
> and/or sysfs knob.
> 
> The main point is OS should stick to the BIOS default, which is the
> only ASPM setting tested before putting hardware to the market.

Unfortunately, I don't think you can jump to this conclusion.

A big difference between the Windows world and the Linux world is that
OEMs ship with a factory Windows image that may set policies like this.
OEM "platform" drivers can set registry keys too.

I think the next time an ASPM issue comes up, what we (collectively)
need to do is compare the ASPM policy and PCI registers for:
1) A "clean" Windows install from Microsoft media, before any OEM
drivers are installed.
2) A Windows install after the OEM drivers have been installed.
3) An up-to-date mainline Linux kernel.

Actually, since this thread started out about determining policy for
external PCIe devices, it would be good to run that comparison with
those devices.

> 
> Kai-Heng
> 
>>
>>>>>> So the logic is to ignore the capability and trust the default set
>>>>>> by BIOS.
>>>>>
>>>>> I think limiting ASPM support to whatever BIOS configured at boot-time
>>>>> is problematic.  I don't think we can assume that all platforms have
>>>>> firmware that configures ASPM as aggressively as possible, and
>>>>> obviously firmware won't configure hot-added devices at all (in
>>>>> general; I know ACPI _HPX can do some of that).
>>>>
>>>> Totally agree. I was not suggesting to limiting the setting at all.
>>>> A boot-time parameter to flip ASPM setting is very useful. If none has
>>>> been set, default to BIOS setting.
>>>
>>> A boot-time parameter for debugging and workarounds is fine.  IMO,
>>> needing a boot-time parameter in the course of normal operation is
>>> not OK.
>>>
>>> Bjorn
>>
Kai-Heng Feng July 17, 2023, 3:34 a.m. UTC | #21
On Sat, Jul 15, 2023 at 12:37 AM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> On 7/14/23 03:17, Kai-Heng Feng wrote:
> > On Thu, Jul 6, 2023 at 12:07 PM Mario Limonciello
> > <mario.limonciello@amd.com> wrote:
> >>
> >> On 7/5/23 15:06, Bjorn Helgaas wrote:
> >>> On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
> >>>> On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>> On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
> >>>>>> On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>>>> On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
> >>>
> >>>>> It's perfectly fine for the IP to support PCI features that are not
> >>>>> and can not be enabled in a system design.  But I expect that
> >>>>> strapping or firmware would disable those features so they are not
> >>>>> advertised in config space.
> >>>>>
> >>>>> If BIOS leaves features disabled because they cannot work, but at the
> >>>>> same time leaves them advertised in config space, I'd say that's a
> >>>>> BIOS defect.  In that case, we should have a DMI quirk or something to
> >>>>> work around the defect.
> >>>>
> >>>> That means most if not all BIOS are defected.
> >>>> BIOS vendors and ODM never bothered (and probably will not) to change
> >>>> the capabilities advertised by config space because "it already works
> >>>> under Windows".
> >>>
> >>> This is what seems strange to me.  Are you saying that Windows never
> >>> enables these power-saving features?  Or that Windows includes quirks
> >>> for all these broken BIOSes?  Neither idea seems very convincing.
> >>>
> >>
> >> I see your point.  I was looking through Microsoft documentation for
> >> hints and came across this:
> >>
> >> https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/pci-express-settings-link-state-power-management
> >>
> >> They have a policy knob to globally set L0 or L1 for PCIe links.
> >>
> >> They don't explicitly say it, but surely it's based on what the devices
> >> advertise in the capabilities registers.
> >
> > So essentially it can be achieved via boot time kernel parameter
> > and/or sysfs knob.
> >
> > The main point is OS should stick to the BIOS default, which is the
> > only ASPM setting tested before putting hardware to the market.
>
> Unfortunately; I don't think you can jump to this conclusion.
>
> A big difference in the Windows world to Linux world is that OEMs ship
> with a factory Windows image that may set policies like this.  OEM
> "platform" drivers can set registry keys too.

Thanks. This is new to me.

>
> I think the next ASPM issue that comes up what we (collectively) need to
> do is compare ASPM policy and PCI registers for:
> 1) A "clean" Windows install from Microsoft media before all the OEM
> drivers are installed.
> 2) A Windows install that the drivers have been installed.
> 3) A up to date mainline Linux kernel.
>
> Actually as this thread started for determining policy for external PCIe
> devices, maybe that would be good to check with those.

I did that before submitting the patch.
On the (very limited set of) devices I tested, ASPM is enabled for
externally connected PCIe devices behind TBT ports.

Is there any particular modification that would improve this patch?

Kai-Heng

>
> >
> > Kai-Heng
> >
> >>
> >>>>>> So the logic is to ignore the capability and trust the default set
> >>>>>> by BIOS.
> >>>>>
> >>>>> I think limiting ASPM support to whatever BIOS configured at boot-time
> >>>>> is problematic.  I don't think we can assume that all platforms have
> >>>>> firmware that configures ASPM as aggressively as possible, and
> >>>>> obviously firmware won't configure hot-added devices at all (in
> >>>>> general; I know ACPI _HPX can do some of that).
> >>>>
> >>>> Totally agree. I was not suggesting to limiting the setting at all.
> >>>> A boot-time parameter to flip ASPM setting is very useful. If none has
> >>>> been set, default to BIOS setting.
> >>>
> >>> A boot-time parameter for debugging and workarounds is fine.  IMO,
> >>> needing a boot-time parameter in the course of normal operation is
> >>> not OK.
> >>>
> >>> Bjorn
> >>
>
Mario Limonciello July 17, 2023, 4:51 p.m. UTC | #22
On 7/16/2023 10:34 PM, Kai-Heng Feng wrote:
> On Sat, Jul 15, 2023 at 12:37 AM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
>>
>> On 7/14/23 03:17, Kai-Heng Feng wrote:
>>> On Thu, Jul 6, 2023 at 12:07 PM Mario Limonciello
>>> <mario.limonciello@amd.com> wrote:
>>>>
>>>> On 7/5/23 15:06, Bjorn Helgaas wrote:
>>>>> On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
>>>>>> On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>> On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
>>>>>>>> On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>>>> On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:
>>>>>
>>>>>>> It's perfectly fine for the IP to support PCI features that are not
>>>>>>> and can not be enabled in a system design.  But I expect that
>>>>>>> strapping or firmware would disable those features so they are not
>>>>>>> advertised in config space.
>>>>>>>
>>>>>>> If BIOS leaves features disabled because they cannot work, but at the
>>>>>>> same time leaves them advertised in config space, I'd say that's a
>>>>>>> BIOS defect.  In that case, we should have a DMI quirk or something to
>>>>>>> work around the defect.
>>>>>>
>>>>>> That means most if not all BIOS are defected.
>>>>>> BIOS vendors and ODM never bothered (and probably will not) to change
>>>>>> the capabilities advertised by config space because "it already works
>>>>>> under Windows".
>>>>>
>>>>> This is what seems strange to me.  Are you saying that Windows never
>>>>> enables these power-saving features?  Or that Windows includes quirks
>>>>> for all these broken BIOSes?  Neither idea seems very convincing.
>>>>>
>>>>
>>>> I see your point.  I was looking through Microsoft documentation for
>>>> hints and came across this:
>>>>
>>>> https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/pci-express-settings-link-state-power-management
>>>>
>>>> They have a policy knob to globally set L0 or L1 for PCIe links.
>>>>
>>>> They don't explicitly say it, but surely it's based on what the devices
>>>> advertise in the capabilities registers.
>>>
>>> So essentially it can be achieved via boot time kernel parameter
>>> and/or sysfs knob.
>>>
>>> The main point is OS should stick to the BIOS default, which is the
>>> only ASPM setting tested before putting hardware to the market.
>>
>> Unfortunately; I don't think you can jump to this conclusion.
>>
>> A big difference in the Windows world to Linux world is that OEMs ship
>> with a factory Windows image that may set policies like this.  OEM
>> "platform" drivers can set registry keys too.
> 
> Thanks. This is new to me.
> 
>>
>> I think the next ASPM issue that comes up what we (collectively) need to
>> do is compare ASPM policy and PCI registers for:
>> 1) A "clean" Windows install from Microsoft media before all the OEM
>> drivers are installed.
>> 2) A Windows install that the drivers have been installed.
>> 3) A up to date mainline Linux kernel.
>>
>> Actually as this thread started for determining policy for external PCIe
>> devices, maybe that would be good to check with those.
> 
> Did that before submitting the patch.
>  From very limited devices I tested, ASPM is enabled for external
> connected PCIe device via TBT ports.
> 
> I wonder if there's any particular modification should be improved for
> this patch?
> 

Knowing this information, I personally think the original patch that
started this thread makes a lot of sense.

Bjorn, what are your thoughts?

> Kai-Heng
> 
>>
>>>
>>> Kai-Heng
>>>
>>>>
>>>>>>>> So the logic is to ignore the capability and trust the default set
>>>>>>>> by BIOS.
>>>>>>>
>>>>>>> I think limiting ASPM support to whatever BIOS configured at boot-time
>>>>>>> is problematic.  I don't think we can assume that all platforms have
>>>>>>> firmware that configures ASPM as aggressively as possible, and
>>>>>>> obviously firmware won't configure hot-added devices at all (in
>>>>>>> general; I know ACPI _HPX can do some of that).
>>>>>>
>>>>>> Totally agree. I was not suggesting to limiting the setting at all.
>>>>>> A boot-time parameter to flip ASPM setting is very useful. If none has
>>>>>> been set, default to BIOS setting.
>>>>>
>>>>> A boot-time parameter for debugging and workarounds is fine.  IMO,
>>>>> needing a boot-time parameter in the course of normal operation is
>>>>> not OK.
>>>>>
>>>>> Bjorn
>>>>
>>
Bjorn Helgaas July 18, 2023, 7:24 p.m. UTC | #23
On Mon, Jul 17, 2023 at 11:51:32AM -0500, Limonciello, Mario wrote:
> On 7/16/2023 10:34 PM, Kai-Heng Feng wrote:
> > On Sat, Jul 15, 2023 at 12:37 AM Mario Limonciello <mario.limonciello@amd.com> wrote:
> > > On 7/14/23 03:17, Kai-Heng Feng wrote:

> > > > The main point is OS should stick to the BIOS default, which is the
> > > > only ASPM setting tested before putting hardware to the market.
> > > 
> > > Unfortunately; I don't think you can jump to this conclusion.

I think using the BIOS default as a limit is problematic.  I think it
would be perfectly reasonable for a BIOS to (a) configure only devices
it needs for console and boot, leaving others at power-on defaults,
and (b) configure devices in the safest configuration possible on the
assumption that an OS can decide the runtime policy itself.

Obviously I'm not a BIOS writer (though I sure wish I could talk to
some!), so maybe these are bad assumptions.

> > > A big difference in the Windows world to Linux world is that OEMs ship
> > > with a factory Windows image that may set policies like this.  OEM
> > > "platform" drivers can set registry keys too.

I suppose this means that the OEM image contains drivers that aren't
in the Microsoft media, and those drivers may set constraints on ASPM
usage?

If you boot the Microsoft media that lacks those drivers, maybe it
doesn't bother to configure ASPM for those devices?  Linux currently
configures ASPM for everything at enumeration-time, so we do it even
if there's no driver.

> > I wonder if there's any particular modification should be improved for
> > this patch?
> 
> Knowing this information I personally think the original patch that started
> this thread makes a lot of sense.

I'm still opposed to using dev_is_removable() as a predicate because I
don't think it has any technical connection to ASPM configuration.

Bjorn
Kai-Heng Feng Aug. 11, 2023, 8:34 a.m. UTC | #24
On Wed, Jul 19, 2023 at 3:24 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Mon, Jul 17, 2023 at 11:51:32AM -0500, Limonciello, Mario wrote:
> > On 7/16/2023 10:34 PM, Kai-Heng Feng wrote:
> > > On Sat, Jul 15, 2023 at 12:37 AM Mario Limonciello <mario.limonciello@amd.com> wrote:
> > > > On 7/14/23 03:17, Kai-Heng Feng wrote:
>
> > > > > The main point is OS should stick to the BIOS default, which is the
> > > > > only ASPM setting tested before putting hardware to the market.
> > > >
> > > > Unfortunately; I don't think you can jump to this conclusion.
>
> I think using the BIOS default as a limit is problematic.  I think it
> would be perfectly reasonable for a BIOS to (a) configure only devices
> it needs for console and boot, leaving others at power-on defaults,
> and (b) configure devices in the safest configuration possible on the
> assumption that an OS can decide the runtime policy itself.

This is not using the BIOS default as a "limit": the OS is still
capable of changing the ASPM policy at boot time or at runtime.
The main point is to find a "sane" setting for devices whose ASPM the
BIOS can't program.

>
> Obviously I'm not a BIOS writer (though I sure wish I could talk to
> some!), so maybe these are bad assumptions.
>
> > > > A big difference in the Windows world to Linux world is that OEMs ship
> > > > with a factory Windows image that may set policies like this.  OEM
> > > > "platform" drivers can set registry keys too.
>
> I suppose this means that the OEM image contains drivers that aren't
> in the Microsoft media, and those drivers may set constraints on ASPM
> usage?
>
> If you boot the Microsoft media that lacks those drivers, maybe it
> doesn't bother to configure ASPM for those devices?  Linux currently
> configures ASPM for everything at enumeration-time, so we do it even
> if there's no driver.

That could be another topic to explore, but it sounds like it could break things.

>
> > > I wonder if there's any particular modification should be improved for
> > > this patch?
> >
> > Knowing this information I personally think the original patch that started
> > this thread makes a lot of sense.
>
> I'm still opposed to using dev_is_removable() as a predicate because I
> don't think it has any technical connection to ASPM configuration.

OK. So what should we do instead? Check whether the device is
connected to a TBT switch?

Kai-Heng


>
> Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 66d7514ca111..613b0754c9bb 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -119,7 +119,9 @@  static int policy_to_aspm_state(struct pcie_link_state *link)
 		/* Enable Everything */
 		return ASPM_STATE_ALL;
 	case POLICY_DEFAULT:
-		return link->aspm_default;
+		return dev_is_removable(&link->downstream->dev) ?
+			link->aspm_capable :
+			link->aspm_default;
 	}
 	return 0;
 }