diff mbox series

[RFC,1/2] arm64: PCI: Allow use arch-specific pci sysdata

Message ID 20210319161956.2838291-2-boqun.feng@gmail.com
State New
Headers show
Series PCI: Introduce pci_ops::use_arch_sysdata | expand

Commit Message

Boqun Feng March 19, 2021, 4:19 p.m. UTC
Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
->sysdata in bus and bridge will be treated as struct pci_config_window,
which is created by generic ECAM using the data from acpi.

However, for a virtualized PCI bus, there might be no enough data in of
or acpi table to create a pci_config_window. This is similar to the case
where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
structure for sysdata, so no apci table lookup is required.

In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
introduce arch-specific pci sysdata (similar to the one for x86) for
ARM64, and allow the core PCI code to detect the type of sysdata at the
runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
field.

Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
---
 arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
 arch/arm64/kernel/pci.c      | 15 ++++++++++++---
 include/linux/pci.h          |  3 +++
 3 files changed, 44 insertions(+), 3 deletions(-)

Comments

Bjorn Helgaas March 19, 2021, 9:12 p.m. UTC | #1
[+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
pci_register_host_bridge() interface"), which I think would make my
idea below possible), Marc (IRQ domains maintainer)]

On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from acpi.

It might be a mistake that we put the struct pci_config_window
pointer, which is really arch-independent, in the ->sysdata element,
which normally contains a pointer to arch- or host bridge-dependent 
data.

> However, for a virtualized PCI bus, there might be no enough data in of
> or acpi table to create a pci_config_window. This is similar to the case
> where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> structure for sysdata, so no apci table lookup is required.
> 
> In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> introduce arch-specific pci sysdata (similar to the one for x86) for
> ARM64, and allow the core PCI code to detect the type of sysdata at the
> runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> field.
> 
> Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>  
>  extern int isa_dma_bridge_buggy;
>  
> +struct pci_sysdata {
> +	int domain;	/* PCI domain */
> +	int node;	/* NUMA Node */
> +#ifdef CONFIG_ACPI
> +	struct acpi_device *companion;	/* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +	void *fwnode;			/* IRQ domain for MSI assignment */
> +#endif
> +};

Our PCI domain code is really a mess (mostly my fault) and I hate to
make it even more complicated by adding more switches, e.g.,
->use_arch_sysdata.

I think the design problem is that PCI host bridge drivers should
supply the PCI domain up front instead of having callbacks to extract
it.

We could put "int domain_nr" in struct pci_host_bridge, and the arch
code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
HV, etc) could fill in pci_host_bridge.domain_nr before calling
pci_scan_root_bus_bridge() or pci_host_probe().

Then maybe we could get rid of pci_bus_find_domain_nr() and some of
the needlessly arch-specific implementations of pci_domain_nr().
I think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
eventually.

>  #ifdef CONFIG_PCI
>  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  {
> @@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> +	if (bus->ops->use_arch_sysdata)
> +		return pci_domain_nr(bus);
>  	return 1;

I don't understand this.  pci_proc_domain() returns a boolean and
determines whether the /proc/bus/pci/ directory contains, e.g.,

  /proc/bus/pci/00            or
  /proc/bus/pci/0000:00

On arm64, pci_proc_domain() currently always returns 1, so the
directory contains "0000:00".  After these patches, pci_proc_domain()
returns 0 if CONFIG_PCI_DOMAINS_GENERIC=y and "bus" is in domain 0,
so buses in domain 0 will be "00" instead of "0000:00".

This doesn't make sense to me, but at the very least, this
user-visible change needs to be explained.

>  }
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> +{
> +	struct pci_sysdata *sd = bus->sysdata;
> +
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->fwnode;
> +
> +	/*
> +	 * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> +	 * be queried from of/acpi.
> +	 */
> +	return NULL;
> +}
> +#define pci_root_bus_fwnode	_pci_root_bus_fwnode

Ugh.  pci_root_bus_fwnode() is another callback to find the
irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
itself is only called from pci_set_bus_msi_domain().  This feels like
another case where we could simplify things by having the host bridge
driver figure out the irq_domain explicitly when it creates the
pci_host_bridge.  It seems like that's where we have the most
information about how to find the irq_domain.

> +#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> +
>  #endif  /* CONFIG_PCI */
>  
>  #endif  /* __ASM_PCI_H */
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>  	struct pci_config_window *cfg = bus->sysdata;
> -	struct acpi_device *adev = to_acpi_device(cfg->parent);
> -	struct acpi_pci_root *root = acpi_driver_data(adev);
> +	struct pci_sysdata *sd = bus->sysdata;
> +	struct acpi_device *adev;
> +	struct acpi_pci_root *root;
> +
> +	/* struct pci_sysdata has domain nr in it */
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->domain;
> +
> +	/* or pci_config_window is used as sysdata */
> +	adev = to_acpi_device(cfg->parent);
> +	root = acpi_driver_data(adev);

My comments above are a lot of hand-waving without a very clear way
forward.  Would it simplify things to just add a "struct
pci_config_window *ecam_info" to pci_host_bridge, so we wouldn't have
to overload sysdata?

>  	return root->segment;
>  }
>  
>  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
>  {
> -	if (!acpi_disabled) {
> +	if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
>  		struct pci_config_window *cfg = bridge->bus->sysdata;
>  		struct acpi_device *adev = to_acpi_device(cfg->parent);
>  		struct device *bus_dev = &bridge->bus->dev;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..4036aac40361 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -740,6 +740,9 @@ struct pci_ops {
>  	void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
>  	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
>  	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> +	int use_arch_sysdata;	/* ->sysdata is arch-specific */
> +#endif
>  };
>  
>  /*
> -- 
> 2.30.2
>
Arnd Bergmann March 20, 2021, 12:52 p.m. UTC | #2
On Fri, Mar 19, 2021 at 5:22 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from acpi.
>
> However, for a virtualized PCI bus, there might be no enough data in of
> or acpi table to create a pci_config_window. This is similar to the case
> where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> structure for sysdata, so no apci table lookup is required.
>
> In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> introduce arch-specific pci sysdata (similar to the one for x86) for
> ARM64, and allow the core PCI code to detect the type of sysdata at the
> runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> field.
>
> Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>

I think this takes it in the opposite direction of where it should be going.

> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>
>  extern int isa_dma_bridge_buggy;
>
> +struct pci_sysdata {
> +       int domain;     /* PCI domain */
> +       int node;       /* NUMA Node */
> +#ifdef CONFIG_ACPI
> +       struct acpi_device *companion;  /* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +       void *fwnode;                   /* IRQ domain for MSI assignment */
> +#endif
> +};

I think none of these members belong into sysdata or architecture specific
code. The fact that a pci_host_bridge belongs to a particular NUMA node
or i associated with a firmware description is neither specific to a
host bridge implementation nor a CPU instruction set!

Moreover, you cannot assume that all PCI host bridges on any given
architecture can share the pci_sysdata pointer, it is purely specific to
the bridge driver.

A good start would be to move the members (one at a time) into struct
pci_host_bridge and out of the sysdata of individual host bridge drivers.

> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>         struct pci_config_window *cfg = bus->sysdata;
> -       struct acpi_device *adev = to_acpi_device(cfg->parent);
> -       struct acpi_pci_root *root = acpi_driver_data(adev);
> +       struct pci_sysdata *sd = bus->sysdata;
> +       struct acpi_device *adev;
> +       struct acpi_pci_root *root;

There should be no reason to add even most code to this file,
it should in fact become empty as it gets generalized more.

        Arnd
Marc Zyngier March 20, 2021, 12:54 p.m. UTC | #3
Thanks Bjorn for looping me in.

On Fri, 19 Mar 2021 21:12:46 +0000,
Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> [+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
> pci_register_host_bridge() interface"), which I think would make my
> idea below possible), Marc (IRQ domains maintainer)]
> 
> On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> > Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> > ->sysdata in bus and bridge will be treated as struct pci_config_window,
> > which is created by generic ECAM using the data from acpi.
> 
> It might be a mistake that we put the struct pci_config_window
> pointer, which is really arch-independent, in the ->sysdata element,
> which normally contains a pointer to arch- or host bridge-dependent 
> data.
> 
> > However, for a virtualized PCI bus, there might be no enough data in of
> > or acpi table to create a pci_config_window. This is similar to the case
> > where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> > structure for sysdata, so no apci table lookup is required.
> > 
> > In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> > entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> > introduce arch-specific pci sysdata (similar to the one for x86) for
> > ARM64, and allow the core PCI code to detect the type of sysdata at the
> > runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> > field.
> > 
> > Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> > Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> > ---
> >  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
> >  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
> >  include/linux/pci.h          |  3 +++
> >  3 files changed, 44 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> > index b33ca260e3c9..dade061a0658 100644
> > --- a/arch/arm64/include/asm/pci.h
> > +++ b/arch/arm64/include/asm/pci.h
> > @@ -22,6 +22,16 @@
> >  
> >  extern int isa_dma_bridge_buggy;
> >  
> > +struct pci_sysdata {
> > +	int domain;	/* PCI domain */
> > +	int node;	/* NUMA Node */
> > +#ifdef CONFIG_ACPI
> > +	struct acpi_device *companion;	/* ACPI companion device */
> > +#endif
> > +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> > +	void *fwnode;			/* IRQ domain for MSI assignment */

Why isn't this more strongly typed? pci_host_bridge_msi_domain()
definitely expects this to be the real thing. And the comment is
wrong.

[...]

> > +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> > +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> > +{
> > +	struct pci_sysdata *sd = bus->sysdata;
> > +
> > +	if (bus->ops->use_arch_sysdata)
> > +		return sd->fwnode;
> > +
> > +	/*
> > +	 * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> > +	 * be queried from of/acpi.
> > +	 */
> > +	return NULL;
> > +}
> > +#define pci_root_bus_fwnode	_pci_root_bus_fwnode
> 
> Ugh.  pci_root_bus_fwnode() is another callback to find the
> irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
> itself is only called from pci_set_bus_msi_domain().  This feels like
> another case where we could simplify things by having the host bridge
> driver figure out the irq_domain explicitly when it creates the
> pci_host_bridge.  It seems like that's where we have the most
> information about how to find the irq_domain.

Urgh. This is a perfect copy paste of the x86 horror, warts and all.
I can't say I'm thrilled (another way to say "Gawd, Noes! Never!").

One thing I am sure of is that I do not want to add more custom
indirection to build the MSI topology. We barely got rid of the
msi_controller structure, and this is the same thing by another
name. Probably worse, actually.

In this case, I don't see the point in going via a fwnode indirection
given that there is no firmware tables the first place.

As for finding the irq domain from the host bridge, that's not doable
in most cases on arm64, as it is pretty likely that the host bridge
knows nothing about MSIs when they are implemented in the GIC (see my
recent msi_controller removal series that has a few patches about
that).

Having an optional callback to host bridges to obtain the MSI domain
may be possible in some cases though (there might be a chicken/egg
problem for some drivers though...).

Thanks,

	M.
Arnd Bergmann March 20, 2021, 12:54 p.m. UTC | #4
On Fri, Mar 19, 2021 at 10:14 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > However, for a virtualized PCI bus, there might be no enough data in of
> > or acpi table to create a pci_config_window. This is similar to the case
> > where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> > structure for sysdata, so no apci table lookup is required.
> >
> > In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> > entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> > introduce arch-specific pci sysdata (similar to the one for x86) for
> > ARM64, and allow the core PCI code to detect the type of sysdata at the
> > runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> > field.
> >
> > Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> > Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> > ---
> >  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
> >  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
> >  include/linux/pci.h          |  3 +++
> >  3 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> > index b33ca260e3c9..dade061a0658 100644
> > --- a/arch/arm64/include/asm/pci.h
> > +++ b/arch/arm64/include/asm/pci.h
> > @@ -22,6 +22,16 @@
> >
> >  extern int isa_dma_bridge_buggy;
> >
> > +struct pci_sysdata {
> > +     int domain;     /* PCI domain */
> > +     int node;       /* NUMA Node */
> > +#ifdef CONFIG_ACPI
> > +     struct acpi_device *companion;  /* ACPI companion device */
> > +#endif
> > +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> > +     void *fwnode;                   /* IRQ domain for MSI assignment */
> > +#endif
> > +};
>
> Our PCI domain code is really a mess (mostly my fault) and I hate to
> make it even more complicated by adding more switches, e.g.,
> ->use_arch_sysdata.
>
> I think the design problem is that PCI host bridge drivers should
> supply the PCI domain up front instead of having callbacks to extract
> it.
>
> We could put "int domain_nr" in struct pci_host_bridge, and the arch
> code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
> HV, etc) could fill in pci_host_bridge.domain_nr before calling
> pci_scan_root_bus_bridge() or pci_host_probe().
>
> Then maybe we could get rid of pci_bus_find_domain_nr() and some of
> the needlessly arch-specific implementations of pci_domain_nr().
> I think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
> eventually.

Agreed. I actually still have a (not really tested) patch series to clean up
the pci host bridge registration, and this should make this a lot easier
to add on top.

I should dig that out of my backlog and post for review.

         Arnd
Arnd Bergmann March 20, 2021, 1:03 p.m. UTC | #5
On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> On Fri, 19 Mar 2021 21:12:46 +0000,

> >
> > Ugh.  pci_root_bus_fwnode() is another callback to find the
> > irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
> > itself is only called from pci_set_bus_msi_domain().  This feels like
> > another case where we could simplify things by having the host bridge
> > driver figure out the irq_domain explicitly when it creates the
> > pci_host_bridge.  It seems like that's where we have the most
> > information about how to find the irq_domain.
>
> Urgh. This is a perfect copy paste of the x86 horror, warts and all.
> I can't say I'm thrilled (another way to say "Gawd, Noes! Never!").
>
> One thing I am sure of is that I do not want to add more custom
> indirection to build the MSI topology. We barely got rid of the
> msi_controller structure, and this is the same thing by another
> name. Probably worse, actually.
>
> In this case, I don't see the point in going via a fwnode indirection
> given that there is no firmware tables the first place.
>
> As for finding the irq domain from the host bridge, that's not doable
> in most cases on arm64, as it is pretty likely that the host bridge
> knows nothing about MSIs when they are implemented in the GIC (see my
> recent msi_controller removal series that has a few patches about
> that).
>
> Having an optional callback to host bridges to obtain the MSI domain
> may be possible in some cases though (there might be a chicken/egg
> problem for some drivers though...).

I would expect that the host bridge driver can find the MSI domain
at probe time and just add a pointer into the pci_host_bridge
structure.

        Arnd
Marc Zyngier March 20, 2021, 1:23 p.m. UTC | #6
On Sat, 20 Mar 2021 13:03:13 +0000,
Arnd Bergmann <arnd@arndb.de> wrote:
> 
> On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> > On Fri, 19 Mar 2021 21:12:46 +0000,
> 
> > >
> > > Ugh.  pci_root_bus_fwnode() is another callback to find the
> > > irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
> > > itself is only called from pci_set_bus_msi_domain().  This feels like
> > > another case where we could simplify things by having the host bridge
> > > driver figure out the irq_domain explicitly when it creates the
> > > pci_host_bridge.  It seems like that's where we have the most
> > > information about how to find the irq_domain.
> >
> > Urgh. This is a perfect copy paste of the x86 horror, warts and all.
> > I can't say I'm thrilled (another way to say "Gawd, Noes! Never!").
> >
> > One thing I am sure of is that I do not want to add more custom
> > indirection to build the MSI topology. We barely got rid of the
> > msi_controller structure, and this is the same thing by another
> > name. Probably worse, actually.
> >
> > In this case, I don't see the point in going via a fwnode indirection
> > given that there is no firmware tables the first place.
> >
> > As for finding the irq domain from the host bridge, that's not doable
> > in most cases on arm64, as it is pretty likely that the host bridge
> > knows nothing about MSIs when they are implemented in the GIC (see my
> > recent msi_controller removal series that has a few patches about
> > that).
> >
> > Having an optional callback to host bridges to obtain the MSI domain
> > may be possible in some cases though (there might be a chicken/egg
> > problem for some drivers though...).
> 
> I would expect that the host bridge driver can find the MSI domain
> at probe time and just add a pointer into the pci_host_bridge
> structure.

In most cases, it doesn't implement it itself, and I'd be reluctant to
duplicate information that can already be retrieved from somewhere
else in a generic way (i.e. no PCI specific).

	M.
Arnd Bergmann March 20, 2021, 2:24 p.m. UTC | #7
On Sat, Mar 20, 2021 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Sat, 20 Mar 2021 13:03:13 +0000,
> Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> > > On Fri, 19 Mar 2021 21:12:46 +0000,
> > >
> > > Having an optional callback to host bridges to obtain the MSI domain
> > > may be possible in some cases though (there might be a chicken/egg
> > > problem for some drivers though...).
> >
> > I would expect that the host bridge driver can find the MSI domain
> > at probe time and just add a pointer into the pci_host_bridge
> > structure.
>
> In most cases, it doesn't implement it itself, and I'd be reluctant to
> duplicate information that can already be retrieved from somewhere
> else in a generic way (i.e. no PCI specific).

At the moment, the information is retried through a maze of different
functions, and already duplicated in both the pci_host_bridge and the
pci_bus structures.  If we can change everything to use
CONFIG_GENERIC_MSI_IRQ_DOMAIN, then most of that code
can probably just go away, leaving only the part in the phb.

       Arnd
Arnd Bergmann March 20, 2021, 4:09 p.m. UTC | #8
On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann <arnd@arndb.de> wrote:
>      I actually still have a (not really tested) patch series to clean up
> the pci host bridge registration, and this should make this a lot easier
> to add on top.
>
> I should dig that out of my backlog and post for review.

I've uploaded my series to
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
pci-probe-rework-20210320

The purpose of this series is mostly to simplify what variations of
host probe methods exist, towards using pci_host_probe() as the
only method. It does provide some simplifications based on that
that, including a way to universally have access to the pci_host_bridge
pointer during the probe function.

         Arnd
Marc Zyngier March 20, 2021, 5:14 p.m. UTC | #9
On Sat, 20 Mar 2021 14:24:06 +0000,
Arnd Bergmann <arnd@arndb.de> wrote:
> 
> On Sat, Mar 20, 2021 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Sat, 20 Mar 2021 13:03:13 +0000,
> > Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > On Fri, 19 Mar 2021 21:12:46 +0000,
> > > >
> > > > Having an optional callback to host bridges to obtain the MSI domain
> > > > may be possible in some cases though (there might be a chicken/egg
> > > > problem for some drivers though...).
> > >
> > > I would expect that the host bridge driver can find the MSI domain
> > > at probe time and just add a pointer into the pci_host_bridge
> > > structure.
> >
> > In most cases, it doesn't implement it itself, and I'd be reluctant to
> > duplicate information that can already be retrieved from somewhere
> > else in a generic way (i.e. no PCI specific).
> 
> At the moment, the information is retried through a maze of different
> functions, and already duplicated in both the pci_host_bridge and the
> pci_bus structures.  If we can change everything to use
> CONFIG_GENERIC_MSI_IRQ_DOMAIN, then most of that code
> can probably just go away, leaving only the part in the phb.

Fine by me, as long as you don't assume that there is a single MSI
domain per PHB (both OF and IORT mandate that you can segment the RID
space to hit multiple controllers).

	M.
Boqun Feng March 29, 2021, 2:32 p.m. UTC | #10
Hi Arnd,

On Sat, Mar 20, 2021 at 05:09:10PM +0100, Arnd Bergmann wrote:
> On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >      I actually still have a (not really tested) patch series to clean up
> > the pci host bridge registration, and this should make this a lot easier
> > to add on top.
> >
> > I should dig that out of my backlog and post for review.
> 
> I've uploaded my series to
> https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
> pci-probe-rework-20210320
> 
> The purpose of this series is mostly to simplify what variations of
> host probe methods exist, towards using pci_host_probe() as the
> only method. It does provide some simplifications based on that
> that, including a way to universally have access to the pci_host_bridge
> pointer during the probe function.
> 

Thanks for the suggestion and code. I spend some time to catch up. Yes,
Bjorn and you are correct, the better way is having a 'domain_nr' in the
'pci_host_bridge' and making sure every driver fill that correctly
before probe. I definitly will use this approach.

However, I may start small: I plan to introduce 'domain_nr' and only
fill the field at probe time for PCI_DOMAINS_GENERIC=y archs, and leave
other archs and driver alone. (honestly, I was shocked by the number of
pci_scan_root_bus_bridge() and pci_host_probe() that I need to adjust if
I really want to unify the 'domain_nr' handling for every arch and
driver ;-)). This will fulfil my requirement for Hyper-V PCI controller
on ARM64. And later on, we can switch each arch to this approach one by
one and keep the rest still working.

Thoughts?

Regards,
Boqun

>          Arnd
Arnd Bergmann March 29, 2021, 2:43 p.m. UTC | #11
On Mon, Mar 29, 2021 at 4:32 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> Hi Arnd,
>
> On Sat, Mar 20, 2021 at 05:09:10PM +0100, Arnd Bergmann wrote:
> > On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > >      I actually still have a (not really tested) patch series to clean up
> > > the pci host bridge registration, and this should make this a lot easier
> > > to add on top.
> > >
> > > I should dig that out of my backlog and post for review.
> >
> > I've uploaded my series to
> > https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
> > pci-probe-rework-20210320
> >
> > The purpose of this series is mostly to simplify what variations of
> > host probe methods exist, towards using pci_host_probe() as the
> > only method. It does provide some simplifications based on that
> > that, including a way to universally have access to the pci_host_bridge
> > pointer during the probe function.
> >
>
> Thanks for the suggestion and code. I spend some time to catch up. Yes,
> Bjorn and you are correct, the better way is having a 'domain_nr' in the
> 'pci_host_bridge' and making sure every driver fill that correctly
> before probe. I definitly will use this approach.
>
> However, I may start small: I plan to introduce 'domain_nr' and only
> fill the field at probe time for PCI_DOMAINS_GENERIC=y archs, and leave
> other archs and driver alone. (honestly, I was shocked by the number of
> pci_scan_root_bus_bridge() and pci_host_probe() that I need to adjust if
> I really want to unify the 'domain_nr' handling for every arch and
> driver ;-)). This will fulfil my requirement for Hyper-V PCI controller
> on ARM64. And later on, we can switch each arch to this approach one by
> one and keep the rest still working.
>
> Thoughts?

That sounds reasonable to me, yes. I would also suggest you look
at my hyperv patch from the branch I mentioned [1] and try to integrate
that first. I suspect this makes it easier to do the domain rework and
possibly other changes on top.

       Arnd

https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/?h=pci-probe-rework-20210320&id=44db8df9d729d
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index b33ca260e3c9..dade061a0658 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -22,6 +22,16 @@ 
 
 extern int isa_dma_bridge_buggy;
 
+struct pci_sysdata {
+	int domain;	/* PCI domain */
+	int node;	/* NUMA Node */
+#ifdef CONFIG_ACPI
+	struct acpi_device *companion;	/* ACPI companion device */
+#endif
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+	void *fwnode;			/* IRQ domain for MSI assignment */
+#endif
+};
 #ifdef CONFIG_PCI
 static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
 {
@@ -31,8 +41,27 @@  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
 
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
+	if (bus->ops->use_arch_sysdata)
+		return pci_domain_nr(bus);
 	return 1;
 }
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
+{
+	struct pci_sysdata *sd = bus->sysdata;
+
+	if (bus->ops->use_arch_sysdata)
+		return sd->fwnode;
+
+	/*
+	 * bus->sysdata is not struct pci_sysdata, fwnode should be able to
+	 * be queried from of/acpi.
+	 */
+	return NULL;
+}
+#define pci_root_bus_fwnode	_pci_root_bus_fwnode
+#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
+
 #endif  /* CONFIG_PCI */
 
 #endif  /* __ASM_PCI_H */
diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index 1006ed2d7c60..63d420d57e63 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -74,15 +74,24 @@  struct acpi_pci_generic_root_info {
 int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
 {
 	struct pci_config_window *cfg = bus->sysdata;
-	struct acpi_device *adev = to_acpi_device(cfg->parent);
-	struct acpi_pci_root *root = acpi_driver_data(adev);
+	struct pci_sysdata *sd = bus->sysdata;
+	struct acpi_device *adev;
+	struct acpi_pci_root *root;
+
+	/* struct pci_sysdata has domain nr in it */
+	if (bus->ops->use_arch_sysdata)
+		return sd->domain;
+
+	/* or pci_config_window is used as sysdata */
+	adev = to_acpi_device(cfg->parent);
+	root = acpi_driver_data(adev);
 
 	return root->segment;
 }
 
 int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 {
-	if (!acpi_disabled) {
+	if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
 		struct pci_config_window *cfg = bridge->bus->sysdata;
 		struct acpi_device *adev = to_acpi_device(cfg->parent);
 		struct device *bus_dev = &bridge->bus->dev;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97b77..4036aac40361 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -740,6 +740,9 @@  struct pci_ops {
 	void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
 	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+	int use_arch_sysdata;	/* ->sysdata is arch-specific */
+#endif
 };
 
 /*