mbox series

[v3,0/2] PCI/ERR: Allow Native AER/DPC using _OSC

Message ID 1588272369-2145-1-git-send-email-jonathan.derrick@intel.com (mailing list archive)
Headers show
Series PCI/ERR: Allow Native AER/DPC using _OSC | expand

Message

Jon Derrick April 30, 2020, 6:46 p.m. UTC
Hi Bjorn & Kuppuswamy,

I see a problem in the DPC ECN [1] to _OSC in that it doesn't give us a way to
determine if firmware supports _OSC DPC negotation, and therefore how to handle
DPC.

Here is the wording of the ECN that implies that Firmware without _OSC DPC
negotiation support should have the OSPM rely on _OSC AER negotiation when
determining DPC control:

  PCIe Base Specification suggests that Downstream Port Containment may be
  controlled either by the Firmware or the Operating System. It also suggests
  that the Firmware retain ownership of Downstream Port Containment if it also
  owns AER. When the Firmware owns Downstream Port Containment, it is expected
  to use the new "Error Disconnect Recover" notification to alert OSPM of a
  Downstream Port Containment event.

In legacy platforms, as bits in _OSC are reserved prior to implementation, ACPI
Root Bus enumeration will mark these Host Bridges as without Native DPC
support, even though the specification implies it's expected that AER _OSC
negotiation determines DPC control for these platforms. There seems to be a
need for a way to determine if the DPC control bit in _OSC is supported and
fallback on AER otherwise.


Currently portdrv assumes DPC control if the port has Native AER services:

static int get_port_device_capability(struct pci_dev *dev)
...
	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
	    pci_aer_available() &&
	    (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
		services |= PCIE_PORT_SERVICE_DPC;

Newer firmware may not grant OSPM DPC control, if for instance, it expects to
use Error Disconnect Recovery. However it looks like ACPI will use DPC services
via the EDR driver, without binding the full DPC port service driver.


If we change portdrv to probe based on host->native_dpc and not AER, then we
break instances with legacy firmware where OSPM will clear host->native_dpc
solely due to _OSC bits being reserved:

struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
...
	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
		host_bridge->native_dpc = 0;



So my assumption instead is that host->native_dpc can be 0 and expect Native
DPC services if AER is used. In other words, if and only if DPC probe is
invoked from portdrv, then it needs to rely on the AER dependency. Otherwise it
should be assumed that ACPI set up DPC via EDR. This covers legacy firmware.

However it seems like that could be trouble with newer firmware that might give
OSPM control of AER but not DPC, and would result in both Native DPC and EDR
being in effect.


Anyways here are two patches that give control of AER and DPC on the results of
_OSC. They don't mess with the HEST parser as I expect those to be removed at
some point. I need these for VMD support which doesn't even rely on _OSC, but I
suspect this won't be the last effort as we detangle Firmware First.

[1] https://members.pcisig.com/wg/PCI-SIG/document/12888


Jon Derrick (2):
  PCI/AER: Use _OSC to determine Firmware First before HEST
  PCI/DPC: Use _OSC to determine DPC support

 drivers/pci/pcie/aer.c          | 3 +++
 drivers/pci/pcie/dpc.c          | 3 ---
 drivers/pci/pcie/portdrv_core.c | 3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

Comments

Bjorn Helgaas May 1, 2020, 5:16 p.m. UTC | #1
On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
> Hi Bjorn & Kuppuswamy,
> 
> I see a problem in the DPC ECN [1] to _OSC in that it doesn't give us a way to
> determine if firmware supports _OSC DPC negotation, and therefore how to handle
> DPC.
> 
> Here is the wording of the ECN that implies that Firmware without _OSC DPC
> negotiation support should have the OSPM rely on _OSC AER negotiation when
> determining DPC control:
> 
>   PCIe Base Specification suggests that Downstream Port Containment may be
>   controlled either by the Firmware or the Operating System. It also suggests
>   that the Firmware retain ownership of Downstream Port Containment if it also
>   owns AER. When the Firmware owns Downstream Port Containment, it is expected
>   to use the new "Error Disconnect Recover" notification to alert OSPM of a
>   Downstream Port Containment event.
> 
> In legacy platforms, as bits in _OSC are reserved prior to implementation, ACPI
> Root Bus enumeration will mark these Host Bridges as without Native DPC
> support, even though the specification implies it's expected that AER _OSC
> negotiation determines DPC control for these platforms. There seems to be a
> need for a way to determine if the DPC control bit in _OSC is supported and
> fallback on AER otherwise.
> 
> 
> Currently portdrv assumes DPC control if the port has Native AER services:
> 
> static int get_port_device_capability(struct pci_dev *dev)
> ...
> 	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
> 	    pci_aer_available() &&
> 	    (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
> 		services |= PCIE_PORT_SERVICE_DPC;
> 
> Newer firmware may not grant OSPM DPC control, if for instance, it expects to
> use Error Disconnect Recovery. However it looks like ACPI will use DPC services
> via the EDR driver, without binding the full DPC port service driver.
> 
> 
> If we change portdrv to probe based on host->native_dpc and not AER, then we
> break instances with legacy firmware where OSPM will clear host->native_dpc
> solely due to _OSC bits being reserved:
> 
> struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> ...
> 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
> 		host_bridge->native_dpc = 0;
> 
> 
> 
> So my assumption instead is that host->native_dpc can be 0 and expect Native
> DPC services if AER is used. In other words, if and only if DPC probe is
> invoked from portdrv, then it needs to rely on the AER dependency. Otherwise it
> should be assumed that ACPI set up DPC via EDR. This covers legacy firmware.
> 
> However it seems like that could be trouble with newer firmware that might give
> OSPM control of AER but not DPC, and would result in both Native DPC and EDR
> being in effect.
> 
> 
> Anyways here are two patches that give control of AER and DPC on the results of
> _OSC. They don't mess with the HEST parser as I expect those to be removed at
> some point. I need these for VMD support which doesn't even rely on _OSC, but I
> suspect this won't be the last effort as we detangle Firmware First.
> 
> [1] https://members.pcisig.com/wg/PCI-SIG/document/12888

Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
from Alex and Sathy first, then see what needs to be done on top of
those, so I'm going to push these off for a few days and they'll
probably need a refresh.

Bjorn
Jon Derrick May 1, 2020, 5:35 p.m. UTC | #2
On Fri, 2020-05-01 at 12:16 -0500, Bjorn Helgaas wrote:
> On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
> > Hi Bjorn & Kuppuswamy,
> > 
> > I see a problem in the DPC ECN [1] to _OSC in that it doesn't give us a way to
> > determine if firmware supports _OSC DPC negotation, and therefore how to handle
> > DPC.
> > 
> > Here is the wording of the ECN that implies that Firmware without _OSC DPC
> > negotiation support should have the OSPM rely on _OSC AER negotiation when
> > determining DPC control:
> > 
> >   PCIe Base Specification suggests that Downstream Port Containment may be
> >   controlled either by the Firmware or the Operating System. It also suggests
> >   that the Firmware retain ownership of Downstream Port Containment if it also
> >   owns AER. When the Firmware owns Downstream Port Containment, it is expected
> >   to use the new "Error Disconnect Recover" notification to alert OSPM of a
> >   Downstream Port Containment event.
> > 
> > In legacy platforms, as bits in _OSC are reserved prior to implementation, ACPI
> > Root Bus enumeration will mark these Host Bridges as without Native DPC
> > support, even though the specification implies it's expected that AER _OSC
> > negotiation determines DPC control for these platforms. There seems to be a
> > need for a way to determine if the DPC control bit in _OSC is supported and
> > fallback on AER otherwise.
> > 
> > 
> > Currently portdrv assumes DPC control if the port has Native AER services:
> > 
> > static int get_port_device_capability(struct pci_dev *dev)
> > ...
> > 	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
> > 	    pci_aer_available() &&
> > 	    (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
> > 		services |= PCIE_PORT_SERVICE_DPC;
> > 
> > Newer firmware may not grant OSPM DPC control, if for instance, it expects to
> > use Error Disconnect Recovery. However it looks like ACPI will use DPC services
> > via the EDR driver, without binding the full DPC port service driver.
> > 
> > 
> > If we change portdrv to probe based on host->native_dpc and not AER, then we
> > break instances with legacy firmware where OSPM will clear host->native_dpc
> > solely due to _OSC bits being reserved:
> > 
> > struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> > ...
> > 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
> > 		host_bridge->native_dpc = 0;
> > 
> > 
> > 
> > So my assumption instead is that host->native_dpc can be 0 and expect Native
> > DPC services if AER is used. In other words, if and only if DPC probe is
> > invoked from portdrv, then it needs to rely on the AER dependency. Otherwise it
> > should be assumed that ACPI set up DPC via EDR. This covers legacy firmware.
> > 
> > However it seems like that could be trouble with newer firmware that might give
> > OSPM control of AER but not DPC, and would result in both Native DPC and EDR
> > being in effect.
> > 
> > 
> > Anyways here are two patches that give control of AER and DPC on the results of
> > _OSC. They don't mess with the HEST parser as I expect those to be removed at
> > some point. I need these for VMD support which doesn't even rely on _OSC, but I
> > suspect this won't be the last effort as we detangle Firmware First.
> > 
> > [1] https://members.pcisig.com/wg/PCI-SIG/document/12888
> 
> Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
> from Alex and Sathy first, then see what needs to be done on top of
> those, so I'm going to push these off for a few days and they'll
> probably need a refresh.
> 
> Bjorn


Agreed, no need to merge now. Just wanted to bring up the DPC
ambiguity, which I think was first addressed by dpc-native:

commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8
Author: Olof Johansson <olof@lixom.net>
Date:   Wed Oct 23 12:22:05 2019 -0700

    PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control
    
    Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
    Linux handled DPC events regardless of whether firmware had granted it
    ownership of AER or DPC, e.g., via _OSC.
    
    PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to
    control of AER, so after eed85ff4c0da7, Linux handles DPC events only if it
    has control of AER.
    
    On platforms that do not grant OS control of AER via _OSC, Linux DPC
    handling worked before eed85ff4c0da7 but not after.
    
    To make Linux DPC handling work on those platforms the same way they did
    before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux
    handle DPC events regardless of whether it has control of AER.
    
    [bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]
    Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.net
    Signed-off-by: Olof Johansson <olof@lixom.net>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


Thanks,
Jon
Jon Derrick May 22, 2020, 5:23 p.m. UTC | #3
On Fri, 2020-05-01 at 11:35 -0600, Jonathan Derrick wrote:
> On Fri, 2020-05-01 at 12:16 -0500, Bjorn Helgaas wrote:
> > On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
> > > Hi Bjorn & Kuppuswamy,
> > > 
> > > I see a problem in the DPC ECN [1] to _OSC in that it doesn't give us a way to
> > > determine if firmware supports _OSC DPC negotation, and therefore how to handle
> > > DPC.
> > > 
> > > Here is the wording of the ECN that implies that Firmware without _OSC DPC
> > > negotiation support should have the OSPM rely on _OSC AER negotiation when
> > > determining DPC control:
> > > 
> > >   PCIe Base Specification suggests that Downstream Port Containment may be
> > >   controlled either by the Firmware or the Operating System. It also suggests
> > >   that the Firmware retain ownership of Downstream Port Containment if it also
> > >   owns AER. When the Firmware owns Downstream Port Containment, it is expected
> > >   to use the new "Error Disconnect Recover" notification to alert OSPM of a
> > >   Downstream Port Containment event.
> > > 
> > > In legacy platforms, as bits in _OSC are reserved prior to implementation, ACPI
> > > Root Bus enumeration will mark these Host Bridges as without Native DPC
> > > support, even though the specification implies it's expected that AER _OSC
> > > negotiation determines DPC control for these platforms. There seems to be a
> > > need for a way to determine if the DPC control bit in _OSC is supported and
> > > fallback on AER otherwise.
> > > 
> > > 
> > > Currently portdrv assumes DPC control if the port has Native AER services:
> > > 
> > > static int get_port_device_capability(struct pci_dev *dev)
> > > ...
> > > 	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
> > > 	    pci_aer_available() &&
> > > 	    (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
> > > 		services |= PCIE_PORT_SERVICE_DPC;
> > > 
> > > Newer firmware may not grant OSPM DPC control, if for instance, it expects to
> > > use Error Disconnect Recovery. However it looks like ACPI will use DPC services
> > > via the EDR driver, without binding the full DPC port service driver.
> > > 
> > > 
> > > If we change portdrv to probe based on host->native_dpc and not AER, then we
> > > break instances with legacy firmware where OSPM will clear host->native_dpc
> > > solely due to _OSC bits being reserved:
> > > 
> > > struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> > > ...
> > > 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
> > > 		host_bridge->native_dpc = 0;
> > > 
> > > 
> > > 
> > > So my assumption instead is that host->native_dpc can be 0 and expect Native
> > > DPC services if AER is used. In other words, if and only if DPC probe is
> > > invoked from portdrv, then it needs to rely on the AER dependency. Otherwise it
> > > should be assumed that ACPI set up DPC via EDR. This covers legacy firmware.
> > > 
> > > However it seems like that could be trouble with newer firmware that might give
> > > OSPM control of AER but not DPC, and would result in both Native DPC and EDR
> > > being in effect.
> > > 
> > > 
> > > Anyways here are two patches that give control of AER and DPC on the results of
> > > _OSC. They don't mess with the HEST parser as I expect those to be removed at
> > > some point. I need these for VMD support which doesn't even rely on _OSC, but I
> > > suspect this won't be the last effort as we detangle Firmware First.
> > > 
> > > [1] https://members.pcisig.com/wg/PCI-SIG/document/12888
> > 
> > Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
> > from Alex and Sathy first, then see what needs to be done on top of
> > those, so I'm going to push these off for a few days and they'll
> > probably need a refresh.
> > 
> > Bjorn
> 
> Agreed, no need to merge now. Just wanted to bring up the DPC
> ambiguity, which I think was first addressed by dpc-native:
> 
> commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8
> Author: Olof Johansson <olof@lixom.net>
> Date:   Wed Oct 23 12:22:05 2019 -0700
> 
>     PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control
>     
>     Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
>     Linux handled DPC events regardless of whether firmware had granted it
>     ownership of AER or DPC, e.g., via _OSC.
>     
>     PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to
>     control of AER, so after eed85ff4c0da7, Linux handles DPC events only if it
>     has control of AER.
>     
>     On platforms that do not grant OS control of AER via _OSC, Linux DPC
>     handling worked before eed85ff4c0da7 but not after.
>     
>     To make Linux DPC handling work on those platforms the same way they did
>     before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux
>     handle DPC events regardless of whether it has control of AER.
>     
>     [bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]
>     Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.net
>     Signed-off-by: Olof Johansson <olof@lixom.net>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> 
> Thanks,
> Jon



Hi Bjorn,

Are you still thinking about removing the HEST parser?

For VMD we still need the ability to bind DPC if native_dpc==1.
I think if we can do that, this set should still pretty much still
apply with a modification to patch 2 to allow matching
pcie_ports_dpc_native in dpc_probe.

Jon
Bjorn Helgaas May 22, 2020, 7:46 p.m. UTC | #4
On Fri, May 22, 2020 at 05:23:31PM +0000, Derrick, Jonathan wrote:
> On Fri, 2020-05-01 at 11:35 -0600, Jonathan Derrick wrote:
> > On Fri, 2020-05-01 at 12:16 -0500, Bjorn Helgaas wrote:
> > > On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
> > > > Hi Bjorn & Kuppuswamy,
> > > > 
> > > > I see a problem in the DPC ECN [1] to _OSC in that it doesn't
> > > > give us a way to determine if firmware supports _OSC DPC
> > > > negotation, and therefore how to handle DPC.
> > > > 
> > > > Here is the wording of the ECN that implies that Firmware
> > > > without _OSC DPC negotiation support should have the OSPM rely
> > > > on _OSC AER negotiation when determining DPC control:
> > > > 
> > > >   PCIe Base Specification suggests that Downstream Port
> > > >   Containment may be controlled either by the Firmware or the
> > > >   Operating System. It also suggests that the Firmware retain
> > > >   ownership of Downstream Port Containment if it also owns
> > > >   AER. When the Firmware owns Downstream Port Containment, it
> > > >   is expected to use the new "Error Disconnect Recover"
> > > >   notification to alert OSPM of a Downstream Port Containment
> > > >   event.
> > > > 
> > > > In legacy platforms, as bits in _OSC are reserved prior to
> > > > implementation, ACPI Root Bus enumeration will mark these Host
> > > > Bridges as without Native DPC support, even though the
> > > > specification implies it's expected that AER _OSC negotiation
> > > > determines DPC control for these platforms. There seems to be
> > > > a need for a way to determine if the DPC control bit in _OSC
> > > > is supported and fallback on AER otherwise.
> > > > 
> > > > 
> > > > Currently portdrv assumes DPC control if the port has Native
> > > > AER services:
> > > > 
> > > > static int get_port_device_capability(struct pci_dev *dev)
> > > > ...
> > > > 	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
> > > > 	    pci_aer_available() &&
> > > > 	    (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
> > > > 		services |= PCIE_PORT_SERVICE_DPC;
> > > > 
> > > > Newer firmware may not grant OSPM DPC control, if for
> > > > instance, it expects to use Error Disconnect Recovery. However
> > > > it looks like ACPI will use DPC services via the EDR driver,
> > > > without binding the full DPC port service driver.
> > > > 
> > > > 
> > > > If we change portdrv to probe based on host->native_dpc and
> > > > not AER, then we break instances with legacy firmware where
> > > > OSPM will clear host->native_dpc solely due to _OSC bits being
> > > > reserved:
> > > > 
> > > > struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> > > > ...
> > > > 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
> > > > 		host_bridge->native_dpc = 0;
> > > > 
> > > > 
> > > > 
> > > > So my assumption instead is that host->native_dpc can be 0 and
> > > > expect Native DPC services if AER is used. In other words, if
> > > > and only if DPC probe is invoked from portdrv, then it needs
> > > > to rely on the AER dependency. Otherwise it should be assumed
> > > > that ACPI set up DPC via EDR. This covers legacy firmware.
> > > > 
> > > > However it seems like that could be trouble with newer
> > > > firmware that might give OSPM control of AER but not DPC, and
> > > > would result in both Native DPC and EDR being in effect.
> > > > 
> > > > 
> > > > Anyways here are two patches that give control of AER and DPC
> > > > on the results of _OSC. They don't mess with the HEST parser
> > > > as I expect those to be removed at some point. I need these
> > > > for VMD support which doesn't even rely on _OSC, but I suspect
> > > > this won't be the last effort as we detangle Firmware First.
> > > > 
> > > > [1] https://members.pcisig.com/wg/PCI-SIG/document/12888
> > > 
> > > Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
> > > from Alex and Sathy first, then see what needs to be done on top of
> > > those, so I'm going to push these off for a few days and they'll
> > > probably need a refresh.
> > > 
> > > Bjorn
> > 
> > Agreed, no need to merge now. Just wanted to bring up the DPC
> > ambiguity, which I think was first addressed by dpc-native:
> > 
> > commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8
> > Author: Olof Johansson <olof@lixom.net>
> > Date:   Wed Oct 23 12:22:05 2019 -0700
> > 
> >     PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control
> >     
> >     Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
> >     Linux handled DPC events regardless of whether firmware had granted it
> >     ownership of AER or DPC, e.g., via _OSC.
> >     
> >     PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to
> >     control of AER, so after eed85ff4c0da7, Linux handles DPC events only if it
> >     has control of AER.
> >     
> >     On platforms that do not grant OS control of AER via _OSC, Linux DPC
> >     handling worked before eed85ff4c0da7 but not after.
> >     
> >     To make Linux DPC handling work on those platforms the same way they did
> >     before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux
> >     handle DPC events regardless of whether it has control of AER.
> >     
> >     [bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]
> >     Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.net
> >     Signed-off-by: Olof Johansson <olof@lixom.net>
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> Are you still thinking about removing the HEST parser?
> 
> For VMD we still need the ability to bind DPC if native_dpc==1.
> I think if we can do that, this set should still pretty much still
> apply with a modification to patch 2 to allow matching
> pcie_ports_dpc_native in dpc_probe.

Yes, I think we should remove the HEST firmware-first parsing, because
IIRC the spec really doesn't specify any action the OS should take
based on it.  I was thinking Sathy might update the patch, and it fell
off my radar.

Bjorn
Kuppuswamy Sathyanarayanan May 22, 2020, 8:48 p.m. UTC | #5
Hi Bjorn, Derrick,

On 5/22/20 12:46 PM, Bjorn Helgaas wrote:
> On Fri, May 22, 2020 at 05:23:31PM +0000, Derrick, Jonathan wrote:
>> On Fri, 2020-05-01 at 11:35 -0600, Jonathan Derrick wrote:
>>> On Fri, 2020-05-01 at 12:16 -0500, Bjorn Helgaas wrote:
>>>> On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
>>>>> Hi Bjorn & Kuppuswamy,
>>>>>
>>>>> I see a problem in the DPC ECN [1] to _OSC in that it doesn't
>>>>> give us a way to determine if firmware supports _OSC DPC
>>>>> negotation, and therefore how to handle DPC.
>>>>>
>>>>> Here is the wording of the ECN that implies that Firmware
>>>>> without _OSC DPC negotiation support should have the OSPM rely
>>>>> on _OSC AER negotiation when determining DPC control:
>>>>>
>>>>>    PCIe Base Specification suggests that Downstream Port
>>>>>    Containment may be controlled either by the Firmware or the
>>>>>    Operating System. It also suggests that the Firmware retain
>>>>>    ownership of Downstream Port Containment if it also owns
>>>>>    AER. When the Firmware owns Downstream Port Containment, it
>>>>>    is expected to use the new "Error Disconnect Recover"
>>>>>    notification to alert OSPM of a Downstream Port Containment
>>>>>    event.
>>>>>
>>>>> In legacy platforms, as bits in _OSC are reserved prior to
>>>>> implementation, ACPI Root Bus enumeration will mark these Host
>>>>> Bridges as without Native DPC support, even though the
>>>>> specification implies it's expected that AER _OSC negotiation
>>>>> determines DPC control for these platforms. There seems to be
>>>>> a need for a way to determine if the DPC control bit in _OSC
>>>>> is supported and fallback on AER otherwise.
>>>>>
>>>>>
>>>>> Currently portdrv assumes DPC control if the port has Native
>>>>> AER services:
>>>>>
>>>>> static int get_port_device_capability(struct pci_dev *dev)
>>>>> ...
>>>>> 	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
>>>>> 	    pci_aer_available() &&
>>>>> 	    (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
>>>>> 		services |= PCIE_PORT_SERVICE_DPC;
>>>>>
>>>>> Newer firmware may not grant OSPM DPC control, if for
>>>>> instance, it expects to use Error Disconnect Recovery. However
>>>>> it looks like ACPI will use DPC services via the EDR driver,
>>>>> without binding the full DPC port service driver.
>>>>>
>>>>>
>>>>> If we change portdrv to probe based on host->native_dpc and
>>>>> not AER, then we break instances with legacy firmware where
>>>>> OSPM will clear host->native_dpc solely due to _OSC bits being
>>>>> reserved:
>>>>>
>>>>> struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>>>>> ...
>>>>> 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>>>>> 		host_bridge->native_dpc = 0;
>>>>>
>>>>>
>>>>>
>>>>> So my assumption instead is that host->native_dpc can be 0 and
>>>>> expect Native DPC services if AER is used. In other words, if
>>>>> and only if DPC probe is invoked from portdrv, then it needs
>>>>> to rely on the AER dependency. Otherwise it should be assumed
>>>>> that ACPI set up DPC via EDR. This covers legacy firmware.
>>>>>
>>>>> However it seems like that could be trouble with newer
>>>>> firmware that might give OSPM control of AER but not DPC, and
>>>>> would result in both Native DPC and EDR being in effect.
>>>>>
>>>>>
>>>>> Anyways here are two patches that give control of AER and DPC
>>>>> on the results of _OSC. They don't mess with the HEST parser
>>>>> as I expect those to be removed at some point. I need these
>>>>> for VMD support which doesn't even rely on _OSC, but I suspect
>>>>> this won't be the last effort as we detangle Firmware First.
>>>>>
>>>>> [1] https://members.pcisig.com/wg/PCI-SIG/document/12888
>>>>
>>>> Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
>>>> from Alex and Sathy first, then see what needs to be done on top of
>>>> those, so I'm going to push these off for a few days and they'll
>>>> probably need a refresh.
>>>>
>>>> Bjorn
>>>
>>> Agreed, no need to merge now. Just wanted to bring up the DPC
>>> ambiguity, which I think was first addressed by dpc-native:
>>>
>>> commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8
>>> Author: Olof Johansson <olof@lixom.net>
>>> Date:   Wed Oct 23 12:22:05 2019 -0700
>>>
>>>      PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control
>>>      
>>>      Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
>>>      Linux handled DPC events regardless of whether firmware had granted it
>>>      ownership of AER or DPC, e.g., via _OSC.
>>>      
>>>      PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to
>>>      control of AER, so after eed85ff4c0da7, Linux handles DPC events only if it
>>>      has control of AER.
>>>      
>>>      On platforms that do not grant OS control of AER via _OSC, Linux DPC
>>>      handling worked before eed85ff4c0da7 but not after.
>>>      
>>>      To make Linux DPC handling work on those platforms the same way they did
>>>      before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux
>>>      handle DPC events regardless of whether it has control of AER.
>>>      
>>>      [bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]
>>>      Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.net
>>>      Signed-off-by: Olof Johansson <olof@lixom.net>
>>>      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>
>> Are you still thinking about removing the HEST parser?
>>
>> For VMD we still need the ability to bind DPC if native_dpc==1.
>> I think if we can do that, this set should still pretty much still
>> apply with a modification to patch 2 to allow matching
>> pcie_ports_dpc_native in dpc_probe.
> 
> Yes, I think we should remove the HEST firmware-first parsing, because
> IIRC the spec really doesn't specify any action the OS should take
> based on it.  I was thinking Sathy might update the patch, and it fell
> off my radar.

Sorry for the delay.

I was just waiting to see whether we get any issues with merging
following commit.

commit c100beb9ccfb98e2474586a4006483cbf770c823
Author: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Date:   Mon Apr 27 18:25:13 2020 -0500

     PCI/AER: Use only _OSC to determine AER ownership

Since I did not see any email reporting any issues about it,
I will work on follow up patch.

> 
> Bjorn
>