diff mbox

pci: Disable master abort while waiting for an FLR to complete

Message ID 20170512181107.13853.34680.stgit@localhost.localdomain
State Changes Requested
Headers show

Commit Message

Alexander H Duyck May 12, 2017, 6:13 p.m. UTC
From: Alexander Duyck <alexander.h.duyck@intel.com>

This patch is meant to address issues seen when performing on FLR on some
systems as the long wait can result in a master abort when reading a
function that is not yet ready.

To prevent the master abort from being reported up we should disable
reporting for it while waiting on the reset. Once the reset is completed
then we can re-enable the master abort for the bus.

Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset")

Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

I haven't been able to test this code as I don't have a system that can
reproduce the issue. The fix itself is based on the issue as reported by
Brian so I am hoping he can test this on the Samsung Chromebook Plus with
RK399 OP1 that this was seen on.

 drivers/pci/pci.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Brian Norris May 15, 2017, 4:36 p.m. UTC | #1
On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This patch is meant to address issues seen when performing on FLR on some
> systems as the long wait can result in a master abort when reading a
> function that is not yet ready.
> 
> To prevent the master abort from being reported up we should disable
> reporting for it while waiting on the reset. Once the reset is completed
> then we can re-enable the master abort for the bus.
> 
> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset")
> 
> Reported-by: Brian Norris <briannorris@chromium.org>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
> 
> I haven't been able to test this code as I don't have a system that can
> reproduce the issue. The fix itself is based on the issue as reported by
> Brian so I am hoping he can test this on the Samsung Chromebook Plus with
> RK399 OP1 that this was seen on.

FYI, it's "RK3399" not "RK399". No harm done :)

>  drivers/pci/pci.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 7904d02ffdb9..acbdbabeaa0e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
>   */
>  static void pci_flr_wait(struct pci_dev *dev)
>  {
> +	struct pci_dev *bridge = dev->bus->self;
>  	int i = 0;
> +	u16 bctl;
>  	u32 id;
>  
> +	/*
> +	 * Disable MasterAbortMode while we are waiting to avoid reporting
> +	 * bus errors for reading the command register on a device that is
> +	 * not ready (in some architectures)

I assume it's intentional to only do this *after* you've started the
reset (but before you start polling)?

> +	 */
> +	if (bridge) {
> +		pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
> +		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
> +				      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
> +	}
> +
>  	do {
>  		msleep(100);
>  		pci_read_config_dword(dev, PCI_COMMAND, &id);

IIUC, the patch works fine, in that I no longer get an abort from the
RC.

>  	} while (i++ < 10 && id == ~0);

BTW, the RK3399 host controller doesn't actually return all 1's on
aborted transactions. So this loop doesn't really work for it at all.

I take it that this (seemingly widely used) pattern all derives from the
PCI spec, which notes:

  "The host bus bridge, in PC compatible systems, must return all 1's on
  a read transaction and discard data on a write transaction when
  terminated with Master-Abort."

RK3399 is far from a "PC compatible system", but I don't know if that's
a real excuse here...

>  
> +	/* restore bridge state */
> +	if (bridge)
> +		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl);
> +
>  	if (id == ~0)
>  		dev_warn(&dev->dev, "Failed to return from FLR\n");

And there's no actual error code returned here; so since several of the
error cases I can trigger on my hardware seem to never actually recover
(no longer how long I wait and/or poll), it seems like we should
propagate the error upward. Right now, we still unconditionally continue
with the pci_dev_restore(), even though that's doomed to abort too...

So there may be several implementation bugs with RK3399. I don't know
how much can be fixed on Rockchip's side, vs. how much can be
accomodated in the PCI core.

Brian

>  	else if (i > 1)
>
Alexander H Duyck May 15, 2017, 7:48 p.m. UTC | #2
On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote:
> On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote:
>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> This patch is meant to address issues seen when performing on FLR on some
>> systems as the long wait can result in a master abort when reading a
>> function that is not yet ready.
>>
>> To prevent the master abort from being reported up we should disable
>> reporting for it while waiting on the reset. Once the reset is completed
>> then we can re-enable the master abort for the bus.
>>
>> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset")
>>
>> Reported-by: Brian Norris <briannorris@chromium.org>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
>>
>> I haven't been able to test this code as I don't have a system that can
>> reproduce the issue. The fix itself is based on the issue as reported by
>> Brian so I am hoping he can test this on the Samsung Chromebook Plus with
>> RK399 OP1 that this was seen on.
>
> FYI, it's "RK3399" not "RK399". No harm done :)

I will try to update my patch in case there is a future re-submission.

>>  drivers/pci/pci.c |   17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 7904d02ffdb9..acbdbabeaa0e 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
>>   */
>>  static void pci_flr_wait(struct pci_dev *dev)
>>  {
>> +     struct pci_dev *bridge = dev->bus->self;
>>       int i = 0;
>> +     u16 bctl;
>>       u32 id;
>>
>> +     /*
>> +      * Disable MasterAbortMode while we are waiting to avoid reporting
>> +      * bus errors for reading the command register on a device that is
>> +      * not ready (in some architectures)
>
> I assume it's intentional to only do this *after* you've started the
> reset (but before you start polling)?

Yes. The general idea is that the only thing that should be accessing
the device is us. So this way we can catch any other code that might
be broken such as a device driver that is leaving a thread active that
is accessing the device before a reset.

>> +      */
>> +     if (bridge) {
>> +             pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
>> +             pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
>> +                                   bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
>> +     }
>> +
>>       do {
>>               msleep(100);
>>               pci_read_config_dword(dev, PCI_COMMAND, &id);
>
> IIUC, the patch works fine, in that I no longer get an abort from the
> RC.

So did you actually test this patch or are you just speculating it
should work? I'm just not sure what you mean by "IIUC" in this context
here, were you referring to how the patch fixes the issue or how the
testing should work for verifying it fixed the issue?

>>       } while (i++ < 10 && id == ~0);
>
> BTW, the RK3399 host controller doesn't actually return all 1's on
> aborted transactions. So this loop doesn't really work for it at all.
>
> I take it that this (seemingly widely used) pattern all derives from the
> PCI spec, which notes:
>
>   "The host bus bridge, in PC compatible systems, must return all 1's on
>   a read transaction and discard data on a write transaction when
>   terminated with Master-Abort."
>
> RK3399 is far from a "PC compatible system", but I don't know if that's
> a real excuse here...

So if it doesn't return all 1's does it return all 0's? I'm just
curious what is returned?

Also do we know why we are reading the PCI_COMMAND register instead of
just checking for the device ID? Just wondering because the easiest
solution would be to copy out the logic in
pci_bus_read_dev_vendor_id() if all we are doing is waiting on the
configuration space to begin responding to requests again.

>>
>> +     /* restore bridge state */
>> +     if (bridge)
>> +             pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl);
>> +
>>       if (id == ~0)
>>               dev_warn(&dev->dev, "Failed to return from FLR\n");
>
> And there's no actual error code returned here; so since several of the
> error cases I can trigger on my hardware seem to never actually recover
> (no longer how long I wait and/or poll), it seems like we should
> propagate the error upward. Right now, we still unconditionally continue
> with the pci_dev_restore(), even though that's doomed to abort too...

If you take a look we ignore the return value anyway and call the
restore function in pci_reset_function(). I think the general idea is
we should still have a device here after attempting to perform the
reset.

It may be that performing an FLR isn't the right choice for this
hardware. Do you know if it functions correctly and actually completes
the reset correctly? I know in the case of some of the Intel Ethernet
silicon we were advertising support for it but found an issue that was
causing the Ethernet PHY to go offline as a result of performing the
FLR. Do you know what it was that was kicking off the need to perform
the reset in the first place? Also you may consider possibly adding a
quirk if the FLR is kicking your device completely out of the system.
There was recently a PCI flag added to indicate a device doesn't
support FLR. You might try adding your own version of the quirk for
this device to see if maybe pushing it to use a different sort of
reset might be more useful.

> So there may be several implementation bugs with RK3399. I don't know
> how much can be fixed on Rockchip's side, vs. how much can be
> accomodated in the PCI core.
>
> Brian

This is the kind of thing we have PCIe quirks for if nothing else.
Odds are we should be able to work around it, it is just a matter of
knowing what all the quirks are.

- Alex
Alex Williamson May 15, 2017, 8 p.m. UTC | #3
On Mon, 15 May 2017 12:48:30 -0700
Alexander Duyck <alexander.duyck@gmail.com> wrote:

> On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote:
> > On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote:  
> >> From: Alexander Duyck <alexander.h.duyck@intel.com>
> >>
> >> This patch is meant to address issues seen when performing on FLR on some
> >> systems as the long wait can result in a master abort when reading a
> >> function that is not yet ready.
> >>
> >> To prevent the master abort from being reported up we should disable
> >> reporting for it while waiting on the reset. Once the reset is completed
> >> then we can re-enable the master abort for the bus.
> >>
> >> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset")
> >>
> >> Reported-by: Brian Norris <briannorris@chromium.org>
> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> >> ---
> >>
> >> I haven't been able to test this code as I don't have a system that can
> >> reproduce the issue. The fix itself is based on the issue as reported by
> >> Brian so I am hoping he can test this on the Samsung Chromebook Plus with
> >> RK399 OP1 that this was seen on.  
> >
> > FYI, it's "RK3399" not "RK399". No harm done :)  
> 
> I will try to update my patch in case there is a future re-submission.
> 
> >>  drivers/pci/pci.c |   17 +++++++++++++++++
> >>  1 file changed, 17 insertions(+)
> >>
> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> >> index 7904d02ffdb9..acbdbabeaa0e 100644
> >> --- a/drivers/pci/pci.c
> >> +++ b/drivers/pci/pci.c
> >> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
> >>   */
> >>  static void pci_flr_wait(struct pci_dev *dev)
> >>  {
> >> +     struct pci_dev *bridge = dev->bus->self;
> >>       int i = 0;
> >> +     u16 bctl;
> >>       u32 id;
> >>
> >> +     /*
> >> +      * Disable MasterAbortMode while we are waiting to avoid reporting
> >> +      * bus errors for reading the command register on a device that is
> >> +      * not ready (in some architectures)  
> >
> > I assume it's intentional to only do this *after* you've started the
> > reset (but before you start polling)?  
> 
> Yes. The general idea is that the only thing that should be accessing
> the device is us. So this way we can catch any other code that might
> be broken such as a device driver that is leaving a thread active that
> is accessing the device before a reset.
> 
> >> +      */
> >> +     if (bridge) {
> >> +             pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
> >> +             pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
> >> +                                   bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
> >> +     }
> >> +
> >>       do {
> >>               msleep(100);
> >>               pci_read_config_dword(dev, PCI_COMMAND, &id);  
> >
> > IIUC, the patch works fine, in that I no longer get an abort from the
> > RC.  
> 
> So did you actually test this patch or are you just speculating it
> should work? I'm just not sure what you mean by "IIUC" in this context
> here, were you referring to how the patch fixes the issue or how the
> testing should work for verifying it fixed the issue?
> 
> >>       } while (i++ < 10 && id == ~0);  
> >
> > BTW, the RK3399 host controller doesn't actually return all 1's on
> > aborted transactions. So this loop doesn't really work for it at all.
> >
> > I take it that this (seemingly widely used) pattern all derives from the
> > PCI spec, which notes:
> >
> >   "The host bus bridge, in PC compatible systems, must return all 1's on
> >   a read transaction and discard data on a write transaction when
> >   terminated with Master-Abort."
> >
> > RK3399 is far from a "PC compatible system", but I don't know if that's
> > a real excuse here...  
> 
> So if it doesn't return all 1's does it return all 0's? I'm just
> curious what is returned?
> 
> Also do we know why we are reading the PCI_COMMAND register instead of
> just checking for the device ID? Just wondering because the easiest
> solution would be to copy out the logic in
> pci_bus_read_dev_vendor_id() if all we are doing is waiting on the
> configuration space to begin responding to requests again.

Because SR-IOV VFs don't implement the first dword.  Thanks,

Alex
Brian Norris May 15, 2017, 9 p.m. UTC | #4
On Mon, May 15, 2017 at 12:48:30PM -0700, Alexander Duyck wrote:
> On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote:
> > On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote:
> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> >> index 7904d02ffdb9..acbdbabeaa0e 100644
> >> --- a/drivers/pci/pci.c
> >> +++ b/drivers/pci/pci.c
> >> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
> >>   */
> >>  static void pci_flr_wait(struct pci_dev *dev)
> >>  {
> >> +     struct pci_dev *bridge = dev->bus->self;
> >>       int i = 0;
> >> +     u16 bctl;
> >>       u32 id;
> >>
> >> +     /*
> >> +      * Disable MasterAbortMode while we are waiting to avoid reporting
> >> +      * bus errors for reading the command register on a device that is
> >> +      * not ready (in some architectures)
> >
> > I assume it's intentional to only do this *after* you've started the
> > reset (but before you start polling)?
> 
> Yes. The general idea is that the only thing that should be accessing
> the device is us. So this way we can catch any other code that might
> be broken such as a device driver that is leaving a thread active that
> is accessing the device before a reset.

I figured as much (and that makes sense). Thus, this patch is
independent of the report I made in the first place -- that somehow MSI
interrupts can still be delivered from my EP while we're in the midst of
FLR.

> >> +      */
> >> +     if (bridge) {
> >> +             pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
> >> +             pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
> >> +                                   bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
> >> +     }
> >> +
> >>       do {
> >>               msleep(100);
> >>               pci_read_config_dword(dev, PCI_COMMAND, &id);
> >
> > IIUC, the patch works fine, in that I no longer get an abort from the
> > RC.
> 
> So did you actually test this patch or are you just speculating it
> should work? I'm just not sure what you mean by "IIUC" in this context
> here, were you referring to how the patch fixes the issue or how the
> testing should work for verifying it fixed the issue?

I did test it, but due to the below issue (the timeout loop doesn't
actually work for me, and then we still abort / panic when the device
doesn't respond later) it's hard to tell whether this is very effective.

The precise reason why I use "IIUC": even with this patch, my system can
crash with asynchronous aborts (ARM64 SError) just after FLR. I *think*
that means the above pci_read_config_dword() is no longer causing an
abort, but it's hard to tell for sure what exactly caused the abort at
all when looking at asynchronous errors (I can add extra delays, prints,
etc., to get a pretty good idea, but that isn't exactly precise, and
it's time consuming). I *suspect* that I'm now just crashing due to
pci_dev_restore().

> >>       } while (i++ < 10 && id == ~0);
> >
> > BTW, the RK3399 host controller doesn't actually return all 1's on
> > aborted transactions. So this loop doesn't really work for it at all.
> >
> > I take it that this (seemingly widely used) pattern all derives from the
> > PCI spec, which notes:
> >
> >   "The host bus bridge, in PC compatible systems, must return all 1's on
> >   a read transaction and discard data on a write transaction when
> >   terminated with Master-Abort."
> >
> > RK3399 is far from a "PC compatible system", but I don't know if that's
> > a real excuse here...
> 
> So if it doesn't return all 1's does it return all 0's? I'm just
> curious what is returned?

All 0's.

> >>
> >> +     /* restore bridge state */
> >> +     if (bridge)
> >> +             pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl);
> >> +
> >>       if (id == ~0)
> >>               dev_warn(&dev->dev, "Failed to return from FLR\n");
> >
> > And there's no actual error code returned here; so since several of the
> > error cases I can trigger on my hardware seem to never actually recover
> > (no longer how long I wait and/or poll), it seems like we should
> > propagate the error upward. Right now, we still unconditionally continue
> > with the pci_dev_restore(), even though that's doomed to abort too...
> 
> If you take a look we ignore the return value anyway and call the
> restore function in pci_reset_function().

Right, that's what I was referring to.

> I think the general idea is
> we should still have a device here after attempting to perform the
> reset.

That assumption doesn't always seem to hold true here.

> It may be that performing an FLR isn't the right choice for this
> hardware. Do you know if it functions correctly and actually completes
> the reset correctly? I know in the case of some of the Intel Ethernet
> silicon we were advertising support for it but found an issue that was
> causing the Ethernet PHY to go offline as a result of performing the
> FLR.

I don't know exactly the cause for why this Wifi card aborts vs. when it
resets properly. A high percentage of the time, this card seems to reset
properly, but I am able to trigger certain cases (due to a buggy Wifi
driver; what's new?) that seem to get the entire card into a bad state
such that it no longer reset properly.

I can pursue the source of such driver bugs, but if I know anything
about this vendor, there's never going to be an acceptably-close-to-100%
certainty that there won't be such bugs again.

> Do you know what it was that was kicking off the need to perform
> the reset in the first place?

It's now baked into the mwifiex Wifi driver. It always has had a
fallback such that if the Wifi FW stops responding for whatever reason
[1], we can automatically unload the netdev / reset the device / reload
the netdev. This was supported for SDIO previously, but was added to the
PCIe driver in commit 4c5dae59d2e9 ("mwifiex: add PCIe function level
reset support") and extended to work automatically via this
not-yet-merged patch:

https://patchwork.kernel.org/patch/9706855/
[PATCH v3 2/2] mwifiex: pcie: add card_reset() support

> Also you may consider possibly adding a
> quirk if the FLR is kicking your device completely out of the system.
> There was recently a PCI flag added to indicate a device doesn't
> support FLR. You might try adding your own version of the quirk for
> this device to see if maybe pushing it to use a different sort of
> reset might be more useful.

I'm not sure if another sort of reset is available, nor if it would work
better. I'll investigate though.

Also, I'm told this same Wifi card works fine on other PCIe systems
(e.g., Intel-based laptops) without trouble, including over 1000s of FLR
cycles. (But I only trust such testing as far as I can throw the tester
though -- I don't think they're testing as many corner cases as I am.)

> > So there may be several implementation bugs with RK3399. I don't know
> > how much can be fixed on Rockchip's side, vs. how much can be
> > accomodated in the PCI core.
> >
> > Brian
> 
> This is the kind of thing we have PCIe quirks for if nothing else.
> Odds are we should be able to work around it, it is just a matter of
> knowing what all the quirks are.

Maybe your imagination for "quirks" isn't creative enough :)

Brian

[1] Hint: there are a boundless number of reasons for such FW failures.
Brian Norris May 15, 2017, 9:51 p.m. UTC | #5
On Mon, May 15, 2017 at 02:00:33PM -0700, Brian Norris wrote:
> On Mon, May 15, 2017 at 12:48:30PM -0700, Alexander Duyck wrote:
> > On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote:
> > > So there may be several implementation bugs with RK3399. I don't know
> > > how much can be fixed on Rockchip's side, vs. how much can be
> > > accomodated in the PCI core.
> > 
> > This is the kind of thing we have PCIe quirks for if nothing else.
> > Odds are we should be able to work around it, it is just a matter of
> > knowing what all the quirks are.
> 
> Maybe your imagination for "quirks" isn't creative enough :)

BTW, I meant to note that people have gone to some extreme lengths to
handle buggy / quirky RCs. Look at drivers/pci/dwc/pci-imx6.c, which
hooks all the way into the core ARM exception handling just to mask
aborts! (See imx6q_pcie_abort_handler.) And they have some patches
intended to fix crashes in 4.12-rc1 to hook synchronous aborts to return
"all 1's". But that's extremely invasive and not very precise (I don't
think that even checks the difference between PCI-induced aborts and
other system aborts?). Probably not gonna fly on ARM64 either...

Brian

[1] [PATCH] PCI: imx6: fix downstream bus scanning
https://patchwork.ozlabs.org/patch/760748/
Bjorn Helgaas June 14, 2017, 10:34 p.m. UTC | #6
On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This patch is meant to address issues seen when performing on FLR on some
> systems as the long wait can result in a master abort when reading a
> function that is not yet ready.
> 
> To prevent the master abort from being reported up we should disable
> reporting for it while waiting on the reset. Once the reset is completed
> then we can re-enable the master abort for the bus.
> 
> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset")
> 
> Reported-by: Brian Norris <briannorris@chromium.org>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
> 
> I haven't been able to test this code as I don't have a system that can
> reproduce the issue. The fix itself is based on the issue as reported by
> Brian so I am hoping he can test this on the Samsung Chromebook Plus with
> RK399 OP1 that this was seen on.

I'm slightly hesitant about turning off Master Abort reporting in the
upstream bridge because that potentially affects several devices below
the bridge, not just the one that's being reset.  But maybe it's the
best we can do.

Is there a public problem report we can reference?  It'd be nice to
have "lspci -vvxxxx" output (for future reference since lspci doesn't
decode FRS stuff yet).

Alex W, since you wrote 5adecf817dd6, have you seen similar problems
or do you have any thoughts on this?

I have the same question as Alex D -- why aren't we using CRS in this
path as in pci_bus_read_dev_vendor_id()?  PCIe r3.1, sec 6.6.2, says a
function should use CRS if it requires more time.  I see the comment
about VFs not implementing the first dword, but a VF vendor ID
*should* read as 0xffff, which is distinguishable from a CRS response
(0x0001).

Per sec 6.6.2, a function must complete an FLR within 100ms.  I assume
that means the Intel IGD from 5adecf817dd6 and whatever device is
involved here (what is it, by the way?) are out of spec.  Or is there
some weasel wording that allows devices to take this long after a
reset?

If they're out of spec, it'd be nice to identify them with quirks
(maybe using pci_dev_specific_reset()) instead of changing the generic
code to accommodate them.  That way the code matches the spec better
and we can potentially work on better ways to handle the issue within
the spec, e.g., via something like Function Readiness Status
notifications (sec 6.23.2).

Please repost this eventually with the typo correction and hopefully a
link to the problem report and lspci info.

Bjorn

>  drivers/pci/pci.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 7904d02ffdb9..acbdbabeaa0e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
>   */
>  static void pci_flr_wait(struct pci_dev *dev)
>  {
> +	struct pci_dev *bridge = dev->bus->self;
>  	int i = 0;
> +	u16 bctl;
>  	u32 id;
>  
> +	/*
> +	 * Disable MasterAbortMode while we are waiting to avoid reporting
> +	 * bus errors for reading the command register on a device that is
> +	 * not ready (in some architectures)
> +	 */
> +	if (bridge) {
> +		pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
> +		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
> +				      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
> +	}
> +
>  	do {
>  		msleep(100);
>  		pci_read_config_dword(dev, PCI_COMMAND, &id);
>  	} while (i++ < 10 && id == ~0);
>  
> +	/* restore bridge state */
> +	if (bridge)
> +		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl);
> +
>  	if (id == ~0)
>  		dev_warn(&dev->dev, "Failed to return from FLR\n");
>  	else if (i > 1)
>
diff mbox

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7904d02ffdb9..acbdbabeaa0e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3758,14 +3758,31 @@  int pci_wait_for_pending_transaction(struct pci_dev *dev)
  */
 static void pci_flr_wait(struct pci_dev *dev)
 {
+	struct pci_dev *bridge = dev->bus->self;
 	int i = 0;
+	u16 bctl;
 	u32 id;
 
+	/*
+	 * Disable MasterAbortMode while we are waiting to avoid reporting
+	 * bus errors for reading the command register on a device that is
+	 * not ready (in some architectures)
+	 */
+	if (bridge) {
+		pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
+		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
+				      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+	}
+
 	do {
 		msleep(100);
 		pci_read_config_dword(dev, PCI_COMMAND, &id);
 	} while (i++ < 10 && id == ~0);
 
+	/* restore bridge state */
+	if (bridge)
+		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl);
+
 	if (id == ~0)
 		dev_warn(&dev->dev, "Failed to return from FLR\n");
 	else if (i > 1)