[v6,1/9] PCI/ERR: Update error status after reset_link()
diff mbox series

Message ID 6be594215ae2ea0935d949537bfb84ff9e656a36.1564177080.git.sathyanarayanan.kuppuswamy@linux.intel.com
State New
Delegated to: Bjorn Helgaas
Headers show
Series
  • Add Error Disconnect Recover (EDR) support
Related show

Commit Message

Kuppuswamy Sathyanarayanan July 26, 2019, 9:43 p.m. UTC
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors. But, if the reset is
successful there is no need to continue the rest of the error recovery
checks. Also, during fatal error recovery, if the initial value of error
status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER then
even after successful recovery (using reset_link()) pcie_do_recovery()
will report the recovery result as failure. So update the status of
error after reset_link().

Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pcie/err.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Bjorn Helgaas Aug. 15, 2019, 10:16 p.m. UTC | #1
On Fri, Jul 26, 2019 at 02:43:11PM -0700, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
> reset_link() to recover from fatal errors. But, if the reset is
> successful there is no need to continue the rest of the error recovery
> checks. Also, during fatal error recovery, if the initial value of error
> status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER then
> even after successful recovery (using reset_link()) pcie_do_recovery()
> will report the recovery result as failure. So update the status of
> error after reset_link().
> 
> Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> ---
>  drivers/pci/pcie/err.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 773197a12568..aecec124a829 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -204,9 +204,13 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
>  	else
>  		pci_walk_bus(bus, report_normal_detected, &status);
>  
> -	if (state == pci_channel_io_frozen &&
> -	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
> -		goto failed;
> +	if (state == pci_channel_io_frozen) {
> +		status = reset_link(dev, service);
> +		if (status != PCI_ERS_RESULT_RECOVERED)
> +			goto failed;
> +		else
> +			goto done;

This will be easier to read without the negation, i.e.,

                if (status == PCI_ERS_RESULT_RECOVERED)
                        goto done;
                else
                        goto failed;

> +	}
>  
>  	if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>  		status = PCI_ERS_RESULT_RECOVERED;
> @@ -228,6 +232,7 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
>  	if (status != PCI_ERS_RESULT_RECOVERED)
>  		goto failed;
>  
> +done:
>  	pci_dbg(dev, "broadcast resume message\n");
>  	pci_walk_bus(bus, report_resume, &status);
>  
> -- 
> 2.21.0
>
Kuppuswamy Sathyanarayanan Aug. 15, 2019, 10:52 p.m. UTC | #2
Hi,

On 8/15/19 3:16 PM, Bjorn Helgaas wrote:
> On Fri, Jul 26, 2019 at 02:43:11PM -0700, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>
>> Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
>> reset_link() to recover from fatal errors. But, if the reset is
>> successful there is no need to continue the rest of the error recovery
>> checks. Also, during fatal error recovery, if the initial value of error
>> status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER then
>> even after successful recovery (using reset_link()) pcie_do_recovery()
>> will report the recovery result as failure. So update the status of
>> error after reset_link().
>>
>> Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Keith Busch <keith.busch@intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>> ---
>>   drivers/pci/pcie/err.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>> index 773197a12568..aecec124a829 100644
>> --- a/drivers/pci/pcie/err.c
>> +++ b/drivers/pci/pcie/err.c
>> @@ -204,9 +204,13 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
>>   	else
>>   		pci_walk_bus(bus, report_normal_detected, &status);
>>   
>> -	if (state == pci_channel_io_frozen &&
>> -	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
>> -		goto failed;
>> +	if (state == pci_channel_io_frozen) {
>> +		status = reset_link(dev, service);
>> +		if (status != PCI_ERS_RESULT_RECOVERED)
>> +			goto failed;
>> +		else
>> +			goto done;
> This will be easier to read without the negation, i.e.,
>
>                  if (status == PCI_ERS_RESULT_RECOVERED)
>                          goto done;
>                  else
>                          goto failed;
will change it in next version.
Joe Perches Aug. 16, 2019, 1:54 a.m. UTC | #3
On Thu, 2019-08-15 at 17:16 -0500, Bjorn Helgaas wrote:
> On Fri, Jul 26, 2019 at 02:43:11PM -0700, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > 
> > Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
> > reset_link() to recover from fatal errors. But, if the reset is
> > successful there is no need to continue the rest of the error recovery
> > checks. Also, during fatal error recovery, if the initial value of error
> > status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER then
> > even after successful recovery (using reset_link()) pcie_do_recovery()
> > will report the recovery result as failure. So update the status of
> > error after reset_link().
[]
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
[]
> > @@ -204,9 +204,13 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
> >  	else
> >  		pci_walk_bus(bus, report_normal_detected, &status);
> >  
> > -	if (state == pci_channel_io_frozen &&
> > -	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
> > -		goto failed;
> > +	if (state == pci_channel_io_frozen) {
> > +		status = reset_link(dev, service);
> > +		if (status != PCI_ERS_RESULT_RECOVERED)
> > +			goto failed;
> > +		else
> > +			goto done;
> 
> This will be easier to read without the negation, i.e.,
> 
>                 if (status == PCI_ERS_RESULT_RECOVERED)
>                         goto done;
>                 else
>                         goto failed;

bikeshed: I think it'd be easier to read without the else

		status = reset_link(dev, service);
		if (status != PCI_ERS_RESULT_RECOVERED)
			goto failed;

		goto done;

Patch
diff mbox series

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 773197a12568..aecec124a829 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -204,9 +204,13 @@  void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 	else
 		pci_walk_bus(bus, report_normal_detected, &status);
 
-	if (state == pci_channel_io_frozen &&
-	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
-		goto failed;
+	if (state == pci_channel_io_frozen) {
+		status = reset_link(dev, service);
+		if (status != PCI_ERS_RESULT_RECOVERED)
+			goto failed;
+		else
+			goto done;
+	}
 
 	if (status == PCI_ERS_RESULT_CAN_RECOVER) {
 		status = PCI_ERS_RESULT_RECOVERED;
@@ -228,6 +232,7 @@  void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 	if (status != PCI_ERS_RESULT_RECOVERED)
 		goto failed;
 
+done:
 	pci_dbg(dev, "broadcast resume message\n");
 	pci_walk_bus(bus, report_resume, &status);