diff mbox series

[v5,18/23] powerpc/pci: Handle BAR movement

Message ID 20190816165101.911-19-s.miroshnichenko@yadro.com
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series PCI: Allow BAR movement during hotplug | expand

Commit Message

Sergei Miroshnichenko Aug. 16, 2019, 4:50 p.m. UTC
Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
the device's driver supports movable BARs, pcibios_rescan_prepare() will be
called after the device is stopped, and pcibios_rescan_done() - before it
resumes. There are no memory requests to this device between the hooks, so
it it safe to rebuild the EEH address cache during that.

CC: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Oliver O'Halloran Sept. 4, 2019, 5:37 a.m. UTC | #1
On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
> called after the device is stopped, and pcibios_rescan_done() - before it
> resumes. There are no memory requests to this device between the hooks, so
> it it safe to rebuild the EEH address cache during that.
> 
> CC: Oliver O'Halloran <oohall@gmail.com>
> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> ---
>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> index 0b0cf8168b47..18cf13bba228 100644
> --- a/arch/powerpc/kernel/pci-hotplug.c
> +++ b/arch/powerpc/kernel/pci-hotplug.c
> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
>  	pcibios_finish_adding_to_bus(bus);
>  }
>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
> +
> +void pcibios_rescan_prepare(struct pci_dev *pdev)
> +{
> +	eeh_addr_cache_rmv_dev(pdev);
> +}
> +
> +void pcibios_rescan_done(struct pci_dev *pdev)
> +{
> +	eeh_addr_cache_insert_dev(pdev);
> +}

Is this actually sufficent? The PE number for a device is largely
determined by the location of the MMIO BARs. If you move a BAR far
enough the PE number stored in the eeh_pe would need to be updated as
well.
Sergei Miroshnichenko Sept. 6, 2019, 4:24 p.m. UTC | #2
Hi Oliver,

On 9/4/19 8:37 AM, Oliver O'Halloran wrote:
> On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
>> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
>> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
>> called after the device is stopped, and pcibios_rescan_done() - before it
>> resumes. There are no memory requests to this device between the hooks, so
>> it it safe to rebuild the EEH address cache during that.
>>
>> CC: Oliver O'Halloran <oohall@gmail.com>
>> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
>> ---
>>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
>> index 0b0cf8168b47..18cf13bba228 100644
>> --- a/arch/powerpc/kernel/pci-hotplug.c
>> +++ b/arch/powerpc/kernel/pci-hotplug.c
>> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
>>  	pcibios_finish_adding_to_bus(bus);
>>  }
>>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
>> +
>> +void pcibios_rescan_prepare(struct pci_dev *pdev)
>> +{
>> +	eeh_addr_cache_rmv_dev(pdev);
>> +}
>> +
>> +void pcibios_rescan_done(struct pci_dev *pdev)
>> +{
>> +	eeh_addr_cache_insert_dev(pdev);
>> +}
> 
> Is this actually sufficent? The PE number for a device is largely
> determined by the location of the MMIO BARs. If you move a BAR far
> enough the PE number stored in the eeh_pe would need to be updated as
> well.
> 

Thanks for the hint! I've checked on our PowerNV: for bridges with MEM
only it allocates PE numbers starting from 0xff down, and when there
are MEM64 - starting from 0 up, one PE number per 4GiB.

PEs are allocated during call to pnv_pci_setup_bridge(), and the I've
added invocation of pci_setup_bridge() after a hotplug event in the
"Recalculate all bridge windows during rescan" patch of this series.

Currently, if a bus already has a PE, pnv_ioda_setup_bus_PE() takes it
and returns. I can see two ways to change it, both are not difficult to
implement:

 a.1) check if MEM64 BARs appeared below the bus - allocate and assign
      a new master PE with required number of slave PEs;

 a.2) if the bus now has more MEM64 than before - check if more slave
      PEs must be reserved;

 b) release all the PEs before a PCI rescan and allocate+assign them
    again after - with this approach the "Hook up the writes to
    PCI_SECONDARY_BUS register" patch may be eliminated.

Do you find any of these suitable?

Serge
Oliver O'Halloran Sept. 9, 2019, 2:02 p.m. UTC | #3
On Sat, Sep 7, 2019 at 2:25 AM Sergey Miroshnichenko
<s.miroshnichenko@yadro.com> wrote:
>
> Hi Oliver,
>
> On 9/4/19 8:37 AM, Oliver O'Halloran wrote:
> > On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
> >> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
> >> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
> >> called after the device is stopped, and pcibios_rescan_done() - before it
> >> resumes. There are no memory requests to this device between the hooks, so
> >> it it safe to rebuild the EEH address cache during that.
> >>
> >> CC: Oliver O'Halloran <oohall@gmail.com>
> >> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> >> ---
> >>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> >> index 0b0cf8168b47..18cf13bba228 100644
> >> --- a/arch/powerpc/kernel/pci-hotplug.c
> >> +++ b/arch/powerpc/kernel/pci-hotplug.c
> >> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
> >>      pcibios_finish_adding_to_bus(bus);
> >>  }
> >>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
> >> +
> >> +void pcibios_rescan_prepare(struct pci_dev *pdev)
> >> +{
> >> +    eeh_addr_cache_rmv_dev(pdev);
> >> +}
> >> +
> >> +void pcibios_rescan_done(struct pci_dev *pdev)
> >> +{
> >> +    eeh_addr_cache_insert_dev(pdev);
> >> +}
> >
> > Is this actually sufficent? The PE number for a device is largely
> > determined by the location of the MMIO BARs. If you move a BAR far
> > enough the PE number stored in the eeh_pe would need to be updated as
> > well.
> >
>
> Thanks for the hint! I've checked on our PowerNV: for bridges with MEM
> only it allocates PE numbers starting from 0xff down, and when there
> are MEM64 - starting from 0 up, one PE number per 4GiB.
>
> PEs are allocated during call to pnv_pci_setup_bridge(), and the I've
> added invocation of pci_setup_bridge() after a hotplug event in the
> "Recalculate all bridge windows during rescan" patch of this series.

Sort of.

On PHB3 both the 32bit and the 64bit MMIO windows are split into 256
segments each of which is mapped to a PE number. For the 32bit space
there's a remapping table in hardware that allows arbitrary mapping of
segments to PE numbers, but in the 64bit space the mapping is fixed
with the first segment being PE0, etc. If there's a 64 bit BAR under a
bridge the PE is really "allocated" during the BAR assignment process,
and the setup_bridge() step sets up the EEH state based on that.

It's worth pointing out that this is why the 64bit window is usually
4GB. Bridge windows need to be aligned to a segment boundary to ensure
the devices under them are placed into a unique PE.

> Currently, if a bus already has a PE, pnv_ioda_setup_bus_PE() takes it
> and returns. I can see two ways to change it, both are not difficult to
> implement:
>
>  a.1) check if MEM64 BARs appeared below the bus - allocate and assign
>       a new master PE with required number of slave PEs;
>
>  a.2) if the bus now has more MEM64 than before - check if more slave
>       PEs must be reserved;
>
>  b) release all the PEs before a PCI rescan and allocate+assign them
>     again after - with this approach the "Hook up the writes to
>     PCI_SECONDARY_BUS register" patch may be eliminated.
>
> Do you find any of these suitable?

I'm not sure a) would work, but even if it does b) is preferable.
There's a lot of strangeness in the powerpc PCI code as-is without
adding extra code paths to deal with. Keeping what happens at hotplug
consistent with what happens at boot will help keep things sane.

FYI in the next few days I'm going to post a series that rips out the
use of pci_dn in powernv and the generic parts of EEH (pseries still
uses it). Assuming Bjorn isn't picking this up for 5.4 you might want
to wait for that before getting too deep into this.

Oliver
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 0b0cf8168b47..18cf13bba228 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -144,3 +144,13 @@  void pci_hp_add_devices(struct pci_bus *bus)
 	pcibios_finish_adding_to_bus(bus);
 }
 EXPORT_SYMBOL_GPL(pci_hp_add_devices);
+
+void pcibios_rescan_prepare(struct pci_dev *pdev)
+{
+	eeh_addr_cache_rmv_dev(pdev);
+}
+
+void pcibios_rescan_done(struct pci_dev *pdev)
+{
+	eeh_addr_cache_insert_dev(pdev);
+}