diff mbox series

[V3] AER: aer_root_reset() non-native handling

Message ID 20201104232244.434115-1-sean.v.kelley@intel.com
State New
Headers show
Series [V3] AER: aer_root_reset() non-native handling | expand

Commit Message

Kelley, Sean V Nov. 4, 2020, 11:22 p.m. UTC
If an OS has not been granted AER control via _OSC, then
the OS should not make changes to PCI_ERR_ROOT_COMMAND and
PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of
the System Firmware Intermediary (SFI) _OSC and DPC Updates
ECN [1], this bit also covers these aspects of the PCI
Express Advanced Error Reporting. Based on the above and
earlier discussion [2], make the following changes:

Add a check for the native case (i.e., AER control via _OSC)

Note that the current "clear, reset, enable" order suggests that the
reset might cause errors that we should ignore. Lacking historical
context, these will be retained.

[1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
    2020, affecting PCI Firmware Specification, Rev. 3.2
    https://members.pcisig.com/wg/PCI-SIG/document/14076
[2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/

Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
---
Changes since V2 :

Fixed an unfortunate copy/paste error.

Changes since V1 [1]:

Noted lack of historical context on isolation of both the
pci_bus_error_reset() and the clearing of Root Error Status. In fact,
the call to aer_enable_rootport() likewise disables system error generation
in response to error messages around the clearing of the error status. So
retained the wrapping of the  "clear, reset, enable".

[1] https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@intel.com/

Thanks,

Sean
---
 drivers/pci/pcie/aer.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

--
2.29.2

Comments

Bjorn Helgaas Nov. 14, 2020, 3:37 p.m. UTC | #1
On Wed, Nov 04, 2020 at 03:22:44PM -0800, Sean V Kelley wrote:
> If an OS has not been granted AER control via _OSC, then
> the OS should not make changes to PCI_ERR_ROOT_COMMAND and
> PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of
> the System Firmware Intermediary (SFI) _OSC and DPC Updates
> ECN [1], this bit also covers these aspects of the PCI
> Express Advanced Error Reporting. Based on the above and
> earlier discussion [2], make the following changes:
> 
> Add a check for the native case (i.e., AER control via _OSC)
> 
> Note that the current "clear, reset, enable" order suggests that the
> reset might cause errors that we should ignore. Lacking historical
> context, these will be retained.
> 
> [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
>     2020, affecting PCI Firmware Specification, Rev. 3.2
>     https://members.pcisig.com/wg/PCI-SIG/document/14076
> [2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/
> 
> Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>

What do I do with this patch in combination with "[PATCH v10 00/16]
Add RCEC handling to PCI/AER"?  I tried applying this and the RCEC
series on top, but they conflict.

I was thinking it would be easiest to include this as the first patch
in the series so I wouldn't have to resolve the conflicts, but maybe
you had something else in mind?

> ---
> Changes since V2 :
> 
> Fixed an unfortunate copy/paste error.
> 
> Changes since V1 [1]:
> 
> Noted lack of historical context on isolation of both the
> pci_bus_error_reset() and the clearing of Root Error Status. In fact,
> the call to aer_enable_rootport() likewise disables system error generation
> in response to error messages around the clearing of the error status. So
> retained the wrapping of the  "clear, reset, enable".
> 
> [1] https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@intel.com/
> 
> Thanks,
> 
> Sean
> ---
>  drivers/pci/pcie/aer.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 65dff5f3457a..4ab379fa1506 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1358,26 +1358,29 @@ static int aer_probe(struct pcie_device *dev)
>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>  {
>  	int aer = dev->aer_cap;
> +	int rc = 0;

Unnecessary init, I think.

>  	u32 reg32;
> -	int rc;
> -
> 
> -	/* Disable Root's interrupt in response to error messages */
> -	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> -	reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> -	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> +	if (pcie_aer_is_native(dev)) {
> +		/* Disable Root's interrupt in response to error messages */
> +		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> +		reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> +		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> +	}
> 
>  	rc = pci_bus_error_reset(dev);
> -	pci_info(dev, "Root Port link has been reset\n");
> +	pci_info(dev, "Root Port link has been reset (%d)\n", rc);
> 
> -	/* Clear Root Error Status */
> -	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
> -	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
> +	if (pcie_aer_is_native(dev)) {
> +		/* Clear Root Error Status */
> +		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
> +		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
> 
> -	/* Enable Root Port's interrupt in response to error messages */
> -	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> -	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> -	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> +		/* Enable Root Port's interrupt in response to error messages */
> +		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> +		reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> +		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> +	}
> 
>  	return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
>  }
> --
> 2.29.2
>
Kelley, Sean V Nov. 16, 2020, 5:10 p.m. UTC | #2
On 14 Nov 2020, at 7:37, Bjorn Helgaas wrote:

> On Wed, Nov 04, 2020 at 03:22:44PM -0800, Sean V Kelley wrote:
>> If an OS has not been granted AER control via _OSC, then
>> the OS should not make changes to PCI_ERR_ROOT_COMMAND and
>> PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of
>> the System Firmware Intermediary (SFI) _OSC and DPC Updates
>> ECN [1], this bit also covers these aspects of the PCI
>> Express Advanced Error Reporting. Based on the above and
>> earlier discussion [2], make the following changes:
>>
>> Add a check for the native case (i.e., AER control via _OSC)
>>
>> Note that the current "clear, reset, enable" order suggests that the
>> reset might cause errors that we should ignore. Lacking historical
>> context, these will be retained.
>>
>> [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 
>> 24,
>>     2020, affecting PCI Firmware Specification, Rev. 3.2
>>     https://members.pcisig.com/wg/PCI-SIG/document/14076
>> [2] 
>> https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/
>>
>> Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
>
> What do I do with this patch in combination with "[PATCH v10 00/16]
> Add RCEC handling to PCI/AER"?  I tried applying this and the RCEC
> series on top, but they conflict.
>
> I was thinking it would be easiest to include this as the first patch
> in the series so I wouldn't have to resolve the conflicts, but maybe
> you had something else in mind?

It originally did rebase but the last minute %s/root/dev changes may 
have broke the rebase.
Will fix and submit as the first patch in the RCEC series.


>
>> ---
>> Changes since V2 :
>>
>> Fixed an unfortunate copy/paste error.
>>
>> Changes since V1 [1]:
>>
>> Noted lack of historical context on isolation of both the
>> pci_bus_error_reset() and the clearing of Root Error Status. In fact,
>> the call to aer_enable_rootport() likewise disables system error 
>> generation
>> in response to error messages around the clearing of the error 
>> status. So
>> retained the wrapping of the  "clear, reset, enable".
>>
>> [1] 
>> https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@intel.com/
>>
>> Thanks,
>>
>> Sean
>> ---
>>  drivers/pci/pcie/aer.c | 31 +++++++++++++++++--------------
>>  1 file changed, 17 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index 65dff5f3457a..4ab379fa1506 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1358,26 +1358,29 @@ static int aer_probe(struct pcie_device *dev)
>>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>>  {
>>  	int aer = dev->aer_cap;
>> +	int rc = 0;
>
> Unnecessary init, I think.

Will remove init.

Thanks,

Sean

>
>>  	u32 reg32;
>> -	int rc;
>> -
>>
>> -	/* Disable Root's interrupt in response to error messages */
>> -	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
>> -	reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>> -	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>> +	if (pcie_aer_is_native(dev)) {
>> +		/* Disable Root's interrupt in response to error messages */
>> +		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
>> +		reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>> +		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>> +	}
>>
>>  	rc = pci_bus_error_reset(dev);
>> -	pci_info(dev, "Root Port link has been reset\n");
>> +	pci_info(dev, "Root Port link has been reset (%d)\n", rc);
>>
>> -	/* Clear Root Error Status */
>> -	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
>> -	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
>> +	if (pcie_aer_is_native(dev)) {
>> +		/* Clear Root Error Status */
>> +		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
>> +		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
>>
>> -	/* Enable Root Port's interrupt in response to error messages */
>> -	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
>> -	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
>> -	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>> +		/* Enable Root Port's interrupt in response to error messages */
>> +		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
>> +		reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
>> +		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>> +	}
>>
>>  	return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
>>  }
>> --
>> 2.29.2
>>
diff mbox series

Patch

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 65dff5f3457a..4ab379fa1506 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1358,26 +1358,29 @@  static int aer_probe(struct pcie_device *dev)
 static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 {
 	int aer = dev->aer_cap;
+	int rc = 0;
 	u32 reg32;
-	int rc;
-

-	/* Disable Root's interrupt in response to error messages */
-	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
-	reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
-	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+	if (pcie_aer_is_native(dev)) {
+		/* Disable Root's interrupt in response to error messages */
+		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
+		reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
+		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+	}

 	rc = pci_bus_error_reset(dev);
-	pci_info(dev, "Root Port link has been reset\n");
+	pci_info(dev, "Root Port link has been reset (%d)\n", rc);

-	/* Clear Root Error Status */
-	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
-	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
+	if (pcie_aer_is_native(dev)) {
+		/* Clear Root Error Status */
+		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
+		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);

-	/* Enable Root Port's interrupt in response to error messages */
-	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
-	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
-	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+		/* Enable Root Port's interrupt in response to error messages */
+		pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
+		reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
+		pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+	}

 	return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
 }