diff mbox series

PCI: Make sure the bus bridge powered on when scanning bus

Message ID 1596022223-4765-1-git-send-email-yangyicong@hisilicon.com
State New
Headers show
Series PCI: Make sure the bus bridge powered on when scanning bus | expand

Commit Message

Yicong Yang July 29, 2020, 11:30 a.m. UTC
When the bus bridge is runtime suspended, we'll fail to rescan
the devices through sysfs as we cannot access the configuration
space correctly when the bridge is in D3hot.
It can be reproduced like:

$ echo 1 > /sys/bus/pci/devices/0000:80:00.0/0000:81:00.1/remove
$ echo 1 > /sys/bus/pci/devices/0000:80:00.0/pci_bus/0000:81/rescan

0000:80:00.0 is root port and is runtime suspended and we cannot
get 0000:81:00.1 after rescan.

Make bridge powered on when scanning the child bus, by adding
pm_runtime_get_sync()/pm_runtime_put() in pci_scan_child_bus_extend().

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/pci/probe.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Yicong Yang Aug. 21, 2020, 9:54 a.m. UTC | #1
gentle ping ...

Any comments on this or is it possible to be merged?


On 2020/7/29 19:30, Yicong Yang wrote:
> When the bus bridge is runtime suspended, we'll fail to rescan
> the devices through sysfs as we cannot access the configuration
> space correctly when the bridge is in D3hot.
> It can be reproduced like:
>
> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/0000:81:00.1/remove
> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/pci_bus/0000:81/rescan
>
> 0000:80:00.0 is root port and is runtime suspended and we cannot
> get 0000:81:00.1 after rescan.
>
> Make bridge powered on when scanning the child bus, by adding
> pm_runtime_get_sync()/pm_runtime_put() in pci_scan_child_bus_extend().
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/pci/probe.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f66988..5bb502b 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2795,6 +2795,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>  
>  	dev_dbg(&bus->dev, "scanning bus\n");
>  
> +	/*
> +	 * Make sure the bus bridge is powered on, otherwise we may not be
> +	 * able to scan the devices as we may fail to access the configuration
> +	 * space of subordinates.
> +	 */
> +	if (bus->self)
> +		pm_runtime_get_sync(&bus->self->dev);
> +
>  	/* Go find them, Rover! */
>  	for (devfn = 0; devfn < 256; devfn += 8) {
>  		nr_devs = pci_scan_slot(bus, devfn);
> @@ -2907,6 +2915,9 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>  		}
>  	}
>  
> +	if (bus->self)
> +		pm_runtime_put(&bus->self->dev);
> +
>  	/*
>  	 * We've scanned the bus and so we know all about what's on
>  	 * the other side of any bridges that may be on this bus plus
Bjorn Helgaas Sept. 17, 2020, 9:07 p.m. UTC | #2
[+cc Mika, Rafael, Peter]

On Wed, Jul 29, 2020 at 07:30:23PM +0800, Yicong Yang wrote:
> When the bus bridge is runtime suspended, we'll fail to rescan
> the devices through sysfs as we cannot access the configuration
> space correctly when the bridge is in D3hot.
> It can be reproduced like:
> 
> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/0000:81:00.1/remove
> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/pci_bus/0000:81/rescan
>
> 0000:80:00.0 is root port and is runtime suspended and we cannot
> get 0000:81:00.1 after rescan.
> 
> Make bridge powered on when scanning the child bus, by adding
> pm_runtime_get_sync()/pm_runtime_put() in pci_scan_child_bus_extend().
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/pci/probe.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f66988..5bb502b 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2795,6 +2795,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>  
>  	dev_dbg(&bus->dev, "scanning bus\n");
>  
> +	/*
> +	 * Make sure the bus bridge is powered on, otherwise we may not be
> +	 * able to scan the devices as we may fail to access the configuration
> +	 * space of subordinates.
> +	 */
> +	if (bus->self)
> +		pm_runtime_get_sync(&bus->self->dev);

I think if we do this, we should be able to remove the call from
pci_scan_bridge() added by d963f6512e15 ("PCI: Power on bridges before
scanning new devices"), right?

The reason we need it here is because there are two paths to
pci_scan_child_bus_extend() and only one of them calls
pm_runtime_get_sync():

  pci_scan_bridge_extend
    pm_runtime_get_sync
    pci_scan_child_bus_extend

  pci_scan_child_bus
    pci_scan_child_bus_extend

If we move the pm_runtime_get_sync() from pci_scan_bridge_extend() to
pci_scan_child_bus_extend(), both paths should be safe.

>  	/* Go find them, Rover! */
>  	for (devfn = 0; devfn < 256; devfn += 8) {
>  		nr_devs = pci_scan_slot(bus, devfn);
> @@ -2907,6 +2915,9 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>  		}
>  	}
>  
> +	if (bus->self)
> +		pm_runtime_put(&bus->self->dev);

I would probably do this:

  struct pci_dev *bridge = bus->self;

  if (bridge)
    pm_runtime_get_sync(&bridge->dev);
  ...
  if (bridge)
    pm_runtime_put(&bridge->dev);

>  	/*
>  	 * We've scanned the bus and so we know all about what's on
>  	 * the other side of any bridges that may be on this bus plus
> -- 
> 2.8.1
>
Yicong Yang Sept. 18, 2020, 9:31 a.m. UTC | #3
On 2020/9/18 5:07, Bjorn Helgaas wrote:
> [+cc Mika, Rafael, Peter]
>
> On Wed, Jul 29, 2020 at 07:30:23PM +0800, Yicong Yang wrote:
>> When the bus bridge is runtime suspended, we'll fail to rescan
>> the devices through sysfs as we cannot access the configuration
>> space correctly when the bridge is in D3hot.
>> It can be reproduced like:
>>
>> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/0000:81:00.1/remove
>> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/pci_bus/0000:81/rescan
>>
>> 0000:80:00.0 is root port and is runtime suspended and we cannot
>> get 0000:81:00.1 after rescan.
>>
>> Make bridge powered on when scanning the child bus, by adding
>> pm_runtime_get_sync()/pm_runtime_put() in pci_scan_child_bus_extend().
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  drivers/pci/probe.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index 2f66988..5bb502b 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -2795,6 +2795,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>>  
>>  	dev_dbg(&bus->dev, "scanning bus\n");
>>  
>> +	/*
>> +	 * Make sure the bus bridge is powered on, otherwise we may not be
>> +	 * able to scan the devices as we may fail to access the configuration
>> +	 * space of subordinates.
>> +	 */
>> +	if (bus->self)
>> +		pm_runtime_get_sync(&bus->self->dev);
> I think if we do this, we should be able to remove the call from
> pci_scan_bridge() added by d963f6512e15 ("PCI: Power on bridges before
> scanning new devices"), right?
>
> The reason we need it here is because there are two paths to
> pci_scan_child_bus_extend() and only one of them calls
> pm_runtime_get_sync():
>
>   pci_scan_bridge_extend
>     pm_runtime_get_sync
>     pci_scan_child_bus_extend
>
>   pci_scan_child_bus
>     pci_scan_child_bus_extend
>
> If we move the pm_runtime_get_sync() from pci_scan_bridge_extend() to
> pci_scan_child_bus_extend(), both paths should be safe.

A bit different, I think. The issue I met is a bit different from Mika, as
we go through different sysfs files. Think about rescanning device under a 
root port,

when echo 1 > /sysfs/bus/pci/devices/${RootPort}/rescan:

rescan_restore()
  pci_rescan_bus(pdev->bus) /* we will rescan the root bus */
    pci_rescan_child_bus()
      pci_scan_child_bus_extend()  /* we cannot wake up the bus bridge here as is on the root bus */
        pci_scan_bridge_extend() /* we have to wake up the root port here */

when echo 1 > /sysfs/bus/pci/devices/${RootPort}/pci_bus/${PciBus}/rescan:

rescan_restore()
  pci_rescan_bus(bus) /* we will rescan the bus of the root port */
    pci_rescan_child_bus()
      pci_scan_child_bus_extend() /* we can wake up the bus bridge - root port here */

As different bus is rescanned, so it'll have problem without patch d963f6512e15.


>
>>  	/* Go find them, Rover! */
>>  	for (devfn = 0; devfn < 256; devfn += 8) {
>>  		nr_devs = pci_scan_slot(bus, devfn);
>> @@ -2907,6 +2915,9 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>>  		}
>>  	}
>>  
>> +	if (bus->self)
>> +		pm_runtime_put(&bus->self->dev);
> I would probably do this:
>
>   struct pci_dev *bridge = bus->self;
>
>   if (bridge)
>     pm_runtime_get_sync(&bridge->dev);
>   ...
>   if (bridge)
>     pm_runtime_put(&bridge->dev);

Sure.

Regards,
Yicong


>
>>  	/*
>>  	 * We've scanned the bus and so we know all about what's on
>>  	 * the other side of any bridges that may be on this bus plus
>> -- 
>> 2.8.1
>>
> .
>
Bjorn Helgaas Sept. 18, 2020, 4:17 p.m. UTC | #4
On Fri, Sep 18, 2020 at 05:31:54PM +0800, Yicong Yang wrote:
> On 2020/9/18 5:07, Bjorn Helgaas wrote:
> > On Wed, Jul 29, 2020 at 07:30:23PM +0800, Yicong Yang wrote:
> >> When the bus bridge is runtime suspended, we'll fail to rescan
> >> the devices through sysfs as we cannot access the configuration
> >> space correctly when the bridge is in D3hot.
> >> It can be reproduced like:
> >>
> >> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/0000:81:00.1/remove
> >> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/pci_bus/0000:81/rescan
> >>
> >> 0000:80:00.0 is root port and is runtime suspended and we cannot
> >> get 0000:81:00.1 after rescan.
> >>
> >> Make bridge powered on when scanning the child bus, by adding
> >> pm_runtime_get_sync()/pm_runtime_put() in pci_scan_child_bus_extend().
> >>
> >> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> >> ---
> >>  drivers/pci/probe.c | 11 +++++++++++
> >>  1 file changed, 11 insertions(+)
> >>
> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >> index 2f66988..5bb502b 100644
> >> --- a/drivers/pci/probe.c
> >> +++ b/drivers/pci/probe.c
> >> @@ -2795,6 +2795,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> >>  
> >>  	dev_dbg(&bus->dev, "scanning bus\n");
> >>  
> >> +	/*
> >> +	 * Make sure the bus bridge is powered on, otherwise we may not be
> >> +	 * able to scan the devices as we may fail to access the configuration
> >> +	 * space of subordinates.
> >> +	 */
> >> +	if (bus->self)
> >> +		pm_runtime_get_sync(&bus->self->dev);
> >
> > I think if we do this, we should be able to remove the call from
> > pci_scan_bridge() added by d963f6512e15 ("PCI: Power on bridges before
> > scanning new devices"), right?
> >
> > The reason we need it here is because there are two paths to
> > pci_scan_child_bus_extend() and only one of them calls
> > pm_runtime_get_sync():
> >
> >   pci_scan_bridge_extend
> >     pm_runtime_get_sync
> >     pci_scan_child_bus_extend
> >
> >   pci_scan_child_bus
> >     pci_scan_child_bus_extend
> >
> > If we move the pm_runtime_get_sync() from pci_scan_bridge_extend() to
> > pci_scan_child_bus_extend(), both paths should be safe.
> 
> A bit different, I think. The issue I met is a bit different from
> Mika, as we go through different sysfs files. Think about rescanning
> device under a root port,
> 
> when echo 1 > /sysfs/bus/pci/devices/${RootPort}/rescan:
> 
> rescan_store()
>   pci_rescan_bus(pdev->bus) /* we will rescan the root bus */
>     pci_rescan_child_bus()
>       pci_scan_child_bus_extend()  /* we cannot wake up the bus bridge here as is on the root bus */
>         pci_scan_bridge_extend() /* we have to wake up the root port here */
> 
> when echo 1 > /sysfs/bus/pci/devices/${RootPort}/pci_bus/${PciBus}/rescan:
> 
> rescan_store()
>   pci_rescan_bus(bus) /* we will rescan the bus of the root port */
>     pci_rescan_child_bus()
>       pci_scan_child_bus_extend() /* we can wake up the bus bridge - root port here */
> 
> As different bus is rescanned, so it'll have problem without patch
> d963f6512e15.

Sorry, I didn't quite follow the above.

The problem here is about scanning a bridge's secondary bus when the
bridge may be runtime-suspended.  The bridge may be in D0, D1, D2, or
D3hot.  It is not in D3cold.  pm_runtime_get_sync() brings a device
that may have been runtime-suspended back to D0.

All PCI devices respond to config accesses when they are in D0, D1,
D2, or D3hot [1], so we don't need pm_runtime_get_sync() to access a
bridge's config space.

But when a bridge is not in D0, it does not initiate transactions on
its secondary bus [2], so we do need pm_runtime_get_sync() before we
attempt config accesses for any children.

pci_scan_bridge_extend() does not directly do anything with the
secondary bus, which is why I'm not sure it needs
pm_runtime_get_sync().

The accesses to the secondary bus are in pci_scan_slot(), so the
pm_runtime_get_sync() you added immediately before calling
pci_scan_slot() makes sense to me.  Although possibly it could go in
pci_scan_slot() itself, since there are several other callers.

[1] PCIe r5.0, sec 5.3.1.4.1
[2] PCIe r5.0, sec 5.3.1 implementation note

> >>  	/* Go find them, Rover! */
> >>  	for (devfn = 0; devfn < 256; devfn += 8) {
> >>  		nr_devs = pci_scan_slot(bus, devfn);
> >> @@ -2907,6 +2915,9 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> >>  		}
> >>  	}
> >>  
> >> +	if (bus->self)
> >> +		pm_runtime_put(&bus->self->dev);
> > I would probably do this:
> >
> >   struct pci_dev *bridge = bus->self;
> >
> >   if (bridge)
> >     pm_runtime_get_sync(&bridge->dev);
> >   ...
> >   if (bridge)
> >     pm_runtime_put(&bridge->dev);
> 
> Sure.
> 
> Regards,
> Yicong
> 
> 
> >
> >>  	/*
> >>  	 * We've scanned the bus and so we know all about what's on
> >>  	 * the other side of any bridges that may be on this bus plus
> >> -- 
> >> 2.8.1
> >>
> > .
> >
>
Yicong Yang Sept. 19, 2020, 10:22 a.m. UTC | #5
On 2020/9/19 0:17, Bjorn Helgaas wrote:

> On Fri, Sep 18, 2020 at 05:31:54PM +0800, Yicong Yang wrote:
>> On 2020/9/18 5:07, Bjorn Helgaas wrote:
>>> On Wed, Jul 29, 2020 at 07:30:23PM +0800, Yicong Yang wrote:
>>>> When the bus bridge is runtime suspended, we'll fail to rescan
>>>> the devices through sysfs as we cannot access the configuration
>>>> space correctly when the bridge is in D3hot.
>>>> It can be reproduced like:
>>>>
>>>> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/0000:81:00.1/remove
>>>> $ echo 1 > /sys/bus/pci/devices/0000:80:00.0/pci_bus/0000:81/rescan
>>>>
>>>> 0000:80:00.0 is root port and is runtime suspended and we cannot
>>>> get 0000:81:00.1 after rescan.
>>>>
>>>> Make bridge powered on when scanning the child bus, by adding
>>>> pm_runtime_get_sync()/pm_runtime_put() in pci_scan_child_bus_extend().
>>>>
>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>> ---
>>>>  drivers/pci/probe.c | 11 +++++++++++
>>>>  1 file changed, 11 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>>> index 2f66988..5bb502b 100644
>>>> --- a/drivers/pci/probe.c
>>>> +++ b/drivers/pci/probe.c
>>>> @@ -2795,6 +2795,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>>>>  
>>>>  	dev_dbg(&bus->dev, "scanning bus\n");
>>>>  
>>>> +	/*
>>>> +	 * Make sure the bus bridge is powered on, otherwise we may not be
>>>> +	 * able to scan the devices as we may fail to access the configuration
>>>> +	 * space of subordinates.
>>>> +	 */
>>>> +	if (bus->self)
>>>> +		pm_runtime_get_sync(&bus->self->dev);
>>> I think if we do this, we should be able to remove the call from
>>> pci_scan_bridge() added by d963f6512e15 ("PCI: Power on bridges before
>>> scanning new devices"), right?
>>>
>>> The reason we need it here is because there are two paths to
>>> pci_scan_child_bus_extend() and only one of them calls
>>> pm_runtime_get_sync():
>>>
>>>   pci_scan_bridge_extend
>>>     pm_runtime_get_sync
>>>     pci_scan_child_bus_extend
>>>
>>>   pci_scan_child_bus
>>>     pci_scan_child_bus_extend
>>>
>>> If we move the pm_runtime_get_sync() from pci_scan_bridge_extend() to
>>> pci_scan_child_bus_extend(), both paths should be safe.
>> A bit different, I think. The issue I met is a bit different from
>> Mika, as we go through different sysfs files. Think about rescanning
>> device under a root port,
>>
>> when echo 1 > /sysfs/bus/pci/devices/${RootPort}/rescan:
>>
>> rescan_store()
>>   pci_rescan_bus(pdev->bus) /* we will rescan the root bus */
>>     pci_rescan_child_bus()
>>       pci_scan_child_bus_extend()  /* we cannot wake up the bus bridge here as is on the root bus */
>>         pci_scan_bridge_extend() /* we have to wake up the root port here */
>>
>> when echo 1 > /sysfs/bus/pci/devices/${RootPort}/pci_bus/${PciBus}/rescan:
>>
>> rescan_store()
>>   pci_rescan_bus(bus) /* we will rescan the bus of the root port */
>>     pci_rescan_child_bus()
>>       pci_scan_child_bus_extend() /* we can wake up the bus bridge - root port here */
>>
>> As different bus is rescanned, so it'll have problem without patch
>> d963f6512e15.
> Sorry, I didn't quite follow the above.
>
> The problem here is about scanning a bridge's secondary bus when the
> bridge may be runtime-suspended.  The bridge may be in D0, D1, D2, or
> D3hot.  It is not in D3cold.  pm_runtime_get_sync() brings a device
> that may have been runtime-suspended back to D0.
>
> All PCI devices respond to config accesses when they are in D0, D1,
> D2, or D3hot [1], so we don't need pm_runtime_get_sync() to access a
> bridge's config space.
>
> But when a bridge is not in D0, it does not initiate transactions on
> its secondary bus [2], so we do need pm_runtime_get_sync() before we
> attempt config accesses for any children.
>
> pci_scan_bridge_extend() does not directly do anything with the
> secondary bus, which is why I'm not sure it needs
> pm_runtime_get_sync().
>
> The accesses to the secondary bus are in pci_scan_slot(), so the
> pm_runtime_get_sync() you added immediately before calling
> pci_scan_slot() makes sense to me.  Although possibly it could go in
> pci_scan_slot() itself, since there are several other callers.
>
> [1] PCIe r5.0, sec 5.3.1.4.1
> [2] PCIe r5.0, sec 5.3.1 implementation note

yes it's right. with this patch and patch d963f6512e15 removed, the rescan
works well and I've tested it. I didn't understand it correctly in my last
reply, sorry. I'll remove d963f6512e15 in v2 patch.

If we put the pm_runtime_get/put in the pci_scan_slot(), we'll get/put the
bridge many times. so maybe put it immediately before pci_scan_slot() is
better?

Thanks,
Yicong


>
>>>>  	/* Go find them, Rover! */
>>>>  	for (devfn = 0; devfn < 256; devfn += 8) {
>>>>  		nr_devs = pci_scan_slot(bus, devfn);
>>>> @@ -2907,6 +2915,9 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>>>>  		}
>>>>  	}
>>>>  
>>>> +	if (bus->self)
>>>> +		pm_runtime_put(&bus->self->dev);
>>> I would probably do this:
>>>
>>>   struct pci_dev *bridge = bus->self;
>>>
>>>   if (bridge)
>>>     pm_runtime_get_sync(&bridge->dev);
>>>   ...
>>>   if (bridge)
>>>     pm_runtime_put(&bridge->dev);
>> Sure.
>>
>> Regards,
>> Yicong
>>
>>
>>>>  	/*
>>>>  	 * We've scanned the bus and so we know all about what's on
>>>>  	 * the other side of any bridges that may be on this bus plus
>>>> -- 
>>>> 2.8.1
>>>>
>>> .
>>>
> .
>
diff mbox series

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f66988..5bb502b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2795,6 +2795,14 @@  static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 
 	dev_dbg(&bus->dev, "scanning bus\n");
 
+	/*
+	 * Make sure the bus bridge is powered on, otherwise we may not be
+	 * able to scan the devices as we may fail to access the configuration
+	 * space of subordinates.
+	 */
+	if (bus->self)
+		pm_runtime_get_sync(&bus->self->dev);
+
 	/* Go find them, Rover! */
 	for (devfn = 0; devfn < 256; devfn += 8) {
 		nr_devs = pci_scan_slot(bus, devfn);
@@ -2907,6 +2915,9 @@  static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 		}
 	}
 
+	if (bus->self)
+		pm_runtime_put(&bus->self->dev);
+
 	/*
 	 * We've scanned the bus and so we know all about what's on
 	 * the other side of any bridges that may be on this bus plus