diff mbox series

[4/5] PCI: wait device ready after pci_pm_reset()

Message ID 1506212218-29103-4-git-send-email-okaya@codeaurora.org
State Changes Requested
Series [1/5] PCI: protect restore with device lock to be consistent

Commit Message

Sinan Kaya Sept. 24, 2017, 12:16 a.m. UTC
Rev 3.1 Sec 2.3.1 Request Handling Rules says a device can issue CRS
following a D3hot->D0 transition. Add pci_dev_wait() call with 1 second
timeout to see if device is available before returning.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/pci/pci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Bjorn Helgaas Oct. 11, 2017, 10:06 p.m. UTC | #1
On Sat, Sep 23, 2017 at 08:16:57PM -0400, Sinan Kaya wrote:
> Rev 3.1 Sec 2.3.1 Request Handling Rules says a device can issue CRS
> following a D3hot->D0 transition. Add pci_dev_wait() call with 1 second
> timeout to see if device is available before returning.
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/pci/pci.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index fd4a3b6..074adf9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3963,6 +3963,7 @@ static int pci_af_flr(struct pci_dev *dev, int probe)
>   */
>  static int pci_pm_reset(struct pci_dev *dev, int probe)
>  {
> +	unsigned int delay = dev->d3_delay;
>  	u16 csr;
>  
>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
> @@ -3988,7 +3989,10 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
>  	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
>  	pci_dev_d3_sleep(dev);
>  
> -	return 0;
> +	if (delay < pci_pm_d3_delay)
> +		delay = pci_pm_d3_delay;
> +
> +	return pci_dev_wait(dev, "PM D3->D0", delay, 1000);

1) Why do we wait up to 1 second here, when we wait up to 60 seconds
for the other methods?  Can they all be the same?  Maybe a #define for
it?

2) I don't really like the fact that we do the initial sleep one place
and then pass the length of that sleep here.  It's hard to verify
they're the same and keep them in sync.  I think the only thing you
use initial_wait for is to include that time in the dmesg messages.
Maybe we should just omit that time from the message and drop the
parameter?

>  }
>  
>  void pci_reset_secondary_bus(struct pci_dev *dev)
> -- 
> 1.9.1
Sinan Kaya Oct. 12, 2017, 4:48 p.m. UTC | #2
On 10/11/2017 6:06 PM, Bjorn Helgaas wrote:
>> static int pci_pm_reset(struct pci_dev *dev, int probe)
>>  {
>> +	unsigned int delay = dev->d3_delay;
>>  	u16 csr;
>>  
>>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>> @@ -3988,7 +3989,10 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
>>  	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
>>  	pci_dev_d3_sleep(dev);
>>  
>> -	return 0;
>> +	if (delay < pci_pm_d3_delay)
>> +		delay = pci_pm_d3_delay;
>> +
>> +	return pci_dev_wait(dev, "PM D3->D0", delay, 1000);
> 1) Why do we wait up to 1 second here, when we wait up to 60 seconds
> for the other methods?  Can they all be the same?  Maybe a #define for
> it?

I know you want to have similar behavior for systems that do and do not support
CRS. That was the reason why I converted the FLR wait function into the dev_wait function.

However, here is the problem:

For systems that do not support CRS, there is no way of knowing whether we
are reading 0xFFFFFFFF because the endpoint is not reachable due to an error
like "it doesn't support this reset type" or if it is actually emitting a CRS.
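
To illustrate the ambiguity described above, here is a minimal userspace sketch,
not the kernel's code. It assumes the usual CRS Software Visibility convention
that a pending CRS shows up as Vendor ID 0x0001 in the first config dword.
The enum and the function name classify_id_read() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of one config read of the ID dword. */
enum probe_result {
	DEV_READY,       /* a real vendor/device ID came back */
	DEV_CRS_PENDING, /* CRS was reported, so waiting is justified */
	DEV_UNKNOWN      /* all ones: CRS, or a device that is simply gone */
};

/*
 * With CRS Software Visibility enabled, a CRS completion is visible as
 * Vendor ID 0x0001. Without it, software only ever sees 0xFFFFFFFF and
 * cannot tell "still initializing" apart from "not reachable".
 */
static enum probe_result classify_id_read(uint32_t id, bool crs_sv_enabled)
{
	if (crs_sv_enabled && (id & 0xffff) == 0x0001)
		return DEV_CRS_PENDING;
	if (id == 0xffffffffu)
		return DEV_UNKNOWN;
	return DEV_READY;
}
```

Only the DEV_CRS_PENDING case tells us for certain that the device is asking
for more time; DEV_UNKNOWN is the case where any fixed timeout is a guess.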

If one system has a problem with pm_reset, this code would add an unnecessary
1 second delay into the reset path. If I make it 60 it would be something like:

1. try reset method A
2. wait 60 seconds
3. try reset method B
4. wait 60 seconds. 
5. try reset method C
6. wait 60 seconds

This might end up being a regression on some system. 

I'm still leaning towards a wait only if we are observing a CRS. What's your
thought on this?

Then the sequence would be:

1. try reset method A
2. if CRS pending, wait 60 seconds
3. try reset method B
4. if CRS pending, wait 60 seconds. 
5. try reset method C
6. if CRS pending, wait 60 seconds
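
The two sequences above can be compared with a small userspace model. This is
only a sketch under stated assumptions: struct reset_attempt, the two policy
functions and RESET_READY_TIMEOUT_MS are hypothetical names, and the model
just adds up how long each policy would wait per reset attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RESET_READY_TIMEOUT_MS 60000u /* hypothetical shared timeout */

/* Hypothetical outcome of one reset attempt. */
struct reset_attempt {
	bool crs_seen;         /* device reported CRS after the reset */
	unsigned int ready_ms; /* time until the device answers again */
};

typedef unsigned int (*wait_policy)(const struct reset_attempt *);

/* Policy 1: always poll, up to the full timeout. */
static unsigned int wait_always(const struct reset_attempt *a)
{
	return a->ready_ms < RESET_READY_TIMEOUT_MS ?
	       a->ready_ms : RESET_READY_TIMEOUT_MS;
}

/* Policy 2: poll only if a CRS was actually observed. */
static unsigned int wait_if_crs(const struct reset_attempt *a)
{
	return a->crs_seen ? wait_always(a) : 0;
}

/* Sum the time spent waiting across all attempted reset methods. */
static unsigned int total_wait_ms(const struct reset_attempt *tries, size_t n,
				  wait_policy policy)
{
	unsigned int total = 0;
	size_t i;

	assert(policy != NULL);
	for (i = 0; i < n; i++)
		total += policy(&tries[i]);
	return total;
}
```

For three methods that all fail without ever reporting CRS, wait_always adds
up to 180000 ms while wait_if_crs adds nothing, which is the regression
concern in numbers.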

> 
> 2) I don't really like the fact that we do the initial sleep one place
> and then pass the length of that sleep here.  It's hard to verify
> they're the same and keep them in sync.  I think the only thing you
> use initial_wait for is to include that time in the dmesg messages.
> Maybe we should just omit that time from the message and drop the
> parameter?
> 

This was for printing reasons like you spotted, I can certainly get rid of
the initial_wait.
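
As a rough illustration of what dropping the parameter could look like, here
is a userspace sketch of a wait helper with no initial_wait argument and one
shared timeout define. It is not the pci_dev_wait() from this series: the
names PCI_RESET_WAIT_MS, dev_wait(), ready_fn and struct fake_dev are
hypothetical, and time is simulated instead of slept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PCI_RESET_WAIT_MS 60000u /* hypothetical: one timeout for all methods */

typedef bool (*ready_fn)(void *ctx);

/*
 * Poll until the device is ready, doubling the delay each round.
 * Returns 0 when ready, -1 on timeout. *waited_ms gets the simulated
 * time spent polling; the caller's earlier sleep is not passed in.
 */
static int dev_wait(ready_fn ready, void *ctx, unsigned int timeout_ms,
		    unsigned int *waited_ms)
{
	unsigned int delay = 1, waited = 0;
	int ret = 0;

	assert(ready != NULL && waited_ms != NULL);
	while (!ready(ctx)) {
		if (waited >= timeout_ms) {
			ret = -1;
			break;
		}
		waited += delay; /* real code would sleep for 'delay' ms here */
		delay *= 2;
	}
	*waited_ms = waited;
	return ret;
}

/* Fake device that becomes ready after a fixed number of failed polls. */
struct fake_dev {
	unsigned int polls_left;
};

static bool fake_ready(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->polls_left == 0)
		return true;
	d->polls_left--;
	return false;
}
```

A caller would then use dev_wait(fake_ready, &dev, PCI_RESET_WAIT_MS, &waited)
for every reset method, so the timeout lives in one place.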
Sinan Kaya Oct. 16, 2017, 12:51 p.m. UTC | #3
On 10/12/2017 12:48 PM, Sinan Kaya wrote:
> On 10/11/2017 6:06 PM, Bjorn Helgaas wrote:
>>> static int pci_pm_reset(struct pci_dev *dev, int probe)
>>>  {
>>> +	unsigned int delay = dev->d3_delay;
>>>  	u16 csr;
>>>  
>>>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>>> @@ -3988,7 +3989,10 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
>>>  	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
>>>  	pci_dev_d3_sleep(dev);
>>>  
>>> -	return 0;
>>> +	if (delay < pci_pm_d3_delay)
>>> +		delay = pci_pm_d3_delay;
>>> +
>>> +	return pci_dev_wait(dev, "PM D3->D0", delay, 1000);
>> 1) Why do we wait up to 1 second here, when we wait up to 60 seconds
>> for the other methods?  Can they all be the same?  Maybe a #define for
>> it?
> 
> I know you want to have similar behavior for systems that do and do not support
> CRS. That was the reason why I converted the FLR wait function into the dev_wait function.
> 
> However, here is the problem:
> 
> For systems that do not support CRS, there is no way of knowing whether we
> are reading 0xFFFFFFFF because the endpoint is not reachable due to an error
> like "it doesn't support this reset type" or if it is actually emitting a CRS.
> 
> If one system has a problem with pm_reset, this code would add an unnecessary
> 1 second delay into the reset path. If I make it 60 it would be something like:
> 
> 1. try reset method A
> 2. wait 60 seconds
> 3. try reset method B
> 4. wait 60 seconds. 
> 5. try reset method C
> 6. wait 60 seconds
> 
> This might end up being a regression on some system. 
> 
> I'm still leaning towards a wait only if we are observing a CRS. What's your
> thought on this?
> 
> Then the sequence would be:
> 
> 1. try reset method A
> 2. if CRS pending, wait 60 seconds
> 3. try reset method B
> 4. if CRS pending, wait 60 seconds. 
> 5. try reset method C
> 6. if CRS pending, wait 60 seconds
> 

Thinking more about this. Another possibility is to have an adjustable sleep time.
Start with 60 seconds for all reset types. If somebody doesn't like it,
have a kernel command line override.
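
A minimal userspace sketch of the override idea, assuming the option value
arrives as a decimal string in milliseconds. The function name
parse_reset_wait_ms() and the define are hypothetical; no such kernel
parameter exists in this series.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define DEFAULT_RESET_WAIT_MS 60000u /* hypothetical default for all reset types */

/*
 * Parse an optional override value. A missing, empty, non-numeric or
 * out-of-range argument falls back to the 60 second default.
 */
static unsigned int parse_reset_wait_ms(const char *arg)
{
	unsigned long val;
	char *end;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (!arg || !isdigit((unsigned char)*arg))
		return DEFAULT_RESET_WAIT_MS;

	errno = 0;
	val = strtoul(arg, &end, 10);
	if (errno || *end != '\0' || val > UINT_MAX)
		return DEFAULT_RESET_WAIT_MS;

	return (unsigned int)val;
}
```

With this shape, systems that never pass the option keep the 60 second
behavior, and a system that regresses can shorten the wait without a rebuild.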

>>
>> 2) I don't really like the fact that we do the initial sleep one place
>> and then pass the length of that sleep here.  It's hard to verify
>> they're the same and keep them in sync.  I think the only thing you
>> use initial_wait for is to include that time in the dmesg messages.
>> Maybe we should just omit that time from the message and drop the
>> parameter?
>>
> 
> This was for printing reasons like you spotted, I can certainly get rid of
> the initial_wait.
>
Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fd4a3b6..074adf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3963,6 +3963,7 @@  static int pci_af_flr(struct pci_dev *dev, int probe)
  */
 static int pci_pm_reset(struct pci_dev *dev, int probe)
 {
+	unsigned int delay = dev->d3_delay;
 	u16 csr;
 
 	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
@@ -3988,7 +3989,10 @@  static int pci_pm_reset(struct pci_dev *dev, int probe)
 	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
 	pci_dev_d3_sleep(dev);
 
-	return 0;
+	if (delay < pci_pm_d3_delay)
+		delay = pci_pm_d3_delay;
+
+	return pci_dev_wait(dev, "PM D3->D0", delay, 1000);
 }
 
 void pci_reset_secondary_bus(struct pci_dev *dev)