Message ID | 20201104232244.434115-1-sean.v.kelley@intel.com |
---|---|
State | New |
Headers | show |
Series | [V3] AER: aer_root_reset() non-native handling | expand |
On Wed, Nov 04, 2020 at 03:22:44PM -0800, Sean V Kelley wrote: > If an OS has not been granted AER control via _OSC, then > the OS should not make changes to PCI_ERR_ROOT_COMMAND and > PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of > the System Firmware Intermediary (SFI) _OSC and DPC Updates > ECN [1], this bit also covers these aspects of the PCI > Express Advanced Error Reporting. Based on the above and > earlier discussion [2], make the following changes: > > Add a check for the native case (i.e., AER control via _OSC) > > Note that the current "clear, reset, enable" order suggests that the > reset might cause errors that we should ignore. Lacking historical > context, these will be retained. > > [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24, > 2020, affecting PCI Firmware Specification, Rev. 3.2 > https://members.pcisig.com/wg/PCI-SIG/document/14076 > [2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/ > > Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com> What do I do with this patch in combination with "[PATCH v10 00/16] Add RCEC handling to PCI/AER"? I tried applying this and the RCEC series on top, but they conflict. I was thinking it would be easiest to include this as the first patch in the series so I wouldn't have to resolve the conflicts, but maybe you had something else in mind? > --- > Changes since V2 : > > Fixed an unfortunate copy/paste error. > > Changes since V1 [1]: > > Noted lack of historical context on isolation of both the > pci_bus_error_reset() and the clearing of Root Error Status. In fact, > the call to aer_enable_rootport() likewise disables system error generation > in response to error messages around the clearing of the error status. So > retained the wrapping of the "clear, reset, enable". > > [1] https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@intel.com/ > > Thanks, > > Sean > --- > drivers/pci/pcie/aer.c | 31 +++++++++++++++++-------------- > 1 file changed, 17 insertions(+), 14 deletions(-) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index 65dff5f3457a..4ab379fa1506 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1358,26 +1358,29 @@ static int aer_probe(struct pcie_device *dev) > static pci_ers_result_t aer_root_reset(struct pci_dev *dev) > { > int aer = dev->aer_cap; > + int rc = 0; Unnecessary init, I think. > u32 reg32; > - int rc; > - > > - /* Disable Root's interrupt in response to error messages */ > - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); > - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; > - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); > + if (pcie_aer_is_native(dev)) { > + /* Disable Root's interrupt in response to error messages */ > + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); > + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; > + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); > + } > > rc = pci_bus_error_reset(dev); > - pci_info(dev, "Root Port link has been reset\n"); > + pci_info(dev, "Root Port link has been reset (%d)\n", rc); > > - /* Clear Root Error Status */ > - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32); > - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32); > + if (pcie_aer_is_native(dev)) { > + /* Clear Root Error Status */ > + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32); > + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32); > > - /* Enable Root Port's interrupt in response to error messages */ > - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); > - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; > - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); > + /* Enable Root Port's interrupt in response to error messages */ > + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); > + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; > + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); > + } > > return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED; > } > -- > 2.29.2 >
On 14 Nov 2020, at 7:37, Bjorn Helgaas wrote: > On Wed, Nov 04, 2020 at 03:22:44PM -0800, Sean V Kelley wrote: >> If an OS has not been granted AER control via _OSC, then >> the OS should not make changes to PCI_ERR_ROOT_COMMAND and >> PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of >> the System Firmware Intermediary (SFI) _OSC and DPC Updates >> ECN [1], this bit also covers these aspects of the PCI >> Express Advanced Error Reporting. Based on the above and >> earlier discussion [2], make the following changes: >> >> Add a check for the native case (i.e., AER control via _OSC) >> >> Note that the current "clear, reset, enable" order suggests that the >> reset might cause errors that we should ignore. Lacking historical >> context, these will be retained. >> >> [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb >> 24, >> 2020, affecting PCI Firmware Specification, Rev. 3.2 >> https://members.pcisig.com/wg/PCI-SIG/document/14076 >> [2] >> https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/ >> >> Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com> > > What do I do with this patch in combination with "[PATCH v10 00/16] > Add RCEC handling to PCI/AER"? I tried applying this and the RCEC > series on top, but they conflict. > > I was thinking it would be easiest to include this as the first patch > in the series so I wouldn't have to resolve the conflicts, but maybe > you had something else in mind? It originally did rebase but the last minute %s/root/dev changes may have broke the rebase. Will fix and submit as the first patch in the RCEC series. > >> --- >> Changes since V2 : >> >> Fixed an unfortunate copy/paste error. >> >> Changes since V1 [1]: >> >> Noted lack of historical context on isolation of both the >> pci_bus_error_reset() and the clearing of Root Error Status. In fact, >> the call to aer_enable_rootport() likewise disables system error >> generation >> in response to error messages around the clearing of the error >> status. So >> retained the wrapping of the "clear, reset, enable". >> >> [1] >> https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@intel.com/ >> >> Thanks, >> >> Sean >> --- >> drivers/pci/pcie/aer.c | 31 +++++++++++++++++-------------- >> 1 file changed, 17 insertions(+), 14 deletions(-) >> >> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >> index 65dff5f3457a..4ab379fa1506 100644 >> --- a/drivers/pci/pcie/aer.c >> +++ b/drivers/pci/pcie/aer.c >> @@ -1358,26 +1358,29 @@ static int aer_probe(struct pcie_device *dev) >> static pci_ers_result_t aer_root_reset(struct pci_dev *dev) >> { >> int aer = dev->aer_cap; >> + int rc = 0; > > Unnecessary init, I think. Will remove init. Thanks, Sean > >> u32 reg32; >> - int rc; >> - >> >> - /* Disable Root's interrupt in response to error messages */ >> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); >> - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; >> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); >> + if (pcie_aer_is_native(dev)) { >> + /* Disable Root's interrupt in response to error messages */ >> + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); >> + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; >> + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); >> + } >> >> rc = pci_bus_error_reset(dev); >> - pci_info(dev, "Root Port link has been reset\n"); >> + pci_info(dev, "Root Port link has been reset (%d)\n", rc); >> >> - /* Clear Root Error Status */ >> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32); >> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32); >> + if (pcie_aer_is_native(dev)) { >> + /* Clear Root Error Status */ >> + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32); >> + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32); >> >> - /* Enable Root Port's interrupt in response to error messages */ >> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); >> - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; >> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); >> + /* Enable Root Port's interrupt in response to error messages */ >> + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); >> + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; >> + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); >> + } >> >> return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED; >> } >> -- >> 2.29.2 >>
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index 65dff5f3457a..4ab379fa1506 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1358,26 +1358,29 @@ static int aer_probe(struct pcie_device *dev) static pci_ers_result_t aer_root_reset(struct pci_dev *dev) { int aer = dev->aer_cap; + int rc = 0; u32 reg32; - int rc; - - /* Disable Root's interrupt in response to error messages */ - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); + if (pcie_aer_is_native(dev)) { + /* Disable Root's interrupt in response to error messages */ + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); + } rc = pci_bus_error_reset(dev); - pci_info(dev, "Root Port link has been reset\n"); + pci_info(dev, "Root Port link has been reset (%d)\n", rc); - /* Clear Root Error Status */ - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32); - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32); + if (pcie_aer_is_native(dev)) { + /* Clear Root Error Status */ + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32); + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32); - /* Enable Root Port's interrupt in response to error messages */ - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); + /* Enable Root Port's interrupt in response to error messages */ + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32); + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32); + } return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED; }
If an OS has not been granted AER control via _OSC, then the OS should not make changes to PCI_ERR_ROOT_COMMAND and PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of the System Firmware Intermediary (SFI) _OSC and DPC Updates ECN [1], this bit also covers these aspects of the PCI Express Advanced Error Reporting. Based on the above and earlier discussion [2], make the following changes: Add a check for the native case (i.e., AER control via _OSC) Note that the current "clear, reset, enable" order suggests that the reset might cause errors that we should ignore. Lacking historical context, these will be retained. [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24, 2020, affecting PCI Firmware Specification, Rev. 3.2 https://members.pcisig.com/wg/PCI-SIG/document/14076 [2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/ Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com> --- Changes since V2 : Fixed an unfortunate copy/paste error. Changes since V1 [1]: Noted lack of historical context on isolation of both the pci_bus_error_reset() and the clearing of Root Error Status. In fact, the call to aer_enable_rootport() likewise disables system error generation in response to error messages around the clearing of the error status. So retained the wrapping of the "clear, reset, enable". [1] https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@intel.com/ Thanks, Sean --- drivers/pci/pcie/aer.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) -- 2.29.2