diff mbox

[v2,1/2] PCI: Add new method for registering PCI hosts

Message ID 20160630151931.29216-1-thierry.reding@gmail.com
State Changes Requested
Headers show

Commit Message

Thierry Reding June 30, 2016, 3:19 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

This patch makes the existing pci_host_bridge structure a proper device
that is usable by PCI host drivers in a more standard way. In addition
to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
and pci_create_root_bus interfaces, this unfortunately means having to
add yet another interface doing basically the same thing, and add some
extra code in the initial step.

However, this time it's more likely to be extensible enough that we
won't have to do another one again in the future, and we should be
able to reduce code much more as a result.

The main idea is to pull the allocation of 'struct pci_host_bridge' out
of the registration, and let individual host drivers and architecture
code fill the members before calling the registration function.

There are a number of things we can do based on this:

* Use a single memory allocation for the driver-specific structure
  and the generic PCI host bridge
* consolidate the contents of driver specific structures by moving
  them into pci_host_bridge
* Add a consistent interface for removing a PCI host bridge again
  when unloading a host driver module
* Replace the architecture specific __weak pcibios_* functions with
  callbacks in a pci_host_bridge device
* Move common boilerplate code from host drivers into the generic
  function, based on contents of the structure
* Extend pci_host_bridge with additional members when needed without
  having to add arguments to pci_scan_*.
* Move members of struct pci_bus into pci_host_bridge to avoid
  having lots of identical copies.

As mentioned in a previous email, one open question is whether we want
to export a function for allocating a pci_host_bridge device in
combination with the per-device structure or let the driver itself
call kzalloc.

Changes in v2 (Thierry Reding):
- add a pci_host_bridge_init() helper that drivers can use to perform
  all the necessary steps to initialize the bridge
- rename pci_register_host() to pci_host_bridge_register() to reflect
  the naming used by other functions
- plug memory leak on registration failure

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/pci/probe.c | 111 ++++++++++++++++++++++++++++++++--------------------
 include/linux/pci.h |   8 +++-
 2 files changed, 75 insertions(+), 44 deletions(-)

Comments

Arnd Bergmann June 30, 2016, 3:37 p.m. UTC | #1
On Thursday, June 30, 2016 5:19:30 PM CEST Thierry Reding wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> This patch makes the existing pci_host_bridge structure a proper device
> that is usable by PCI host drivers in a more standard way. In addition
> to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
> and pci_create_root_bus interfaces, this unfortunately means having to
> add yet another interface doing basically the same thing, and add some
> extra code in the initial step.

Thanks a lot for following up on my patches and getting them to work!

> As mentioned in a previous email, one open question is whether we want
> to export a function for allocating a pci_host_bridge device in
> combination with the per-device structure or let the driver itself
> call kzalloc.
> 
> Changes in v2 (Thierry Reding):
> - add a pci_host_bridge_init() helper that drivers can use to perform
>   all the necessary steps to initialize the bridge

For these two points, the addition of pci_host_bridge_init() to me
tips the balance in favour of having a separate allocation function
that also does the initialization, something like

struct struct pci_host_bridge *pci_host_bridge_alloc(size_t priv)
{
	struct pci_host_bridge *bridge;

	bridge = kzalloc(sizeof(*bridge) + priv, GFP_KERNEL);
	if (!bridge)
		return NULL;

	INIT_LIST_HEAD(&bridge->windows);

	if (priv)
		bridge->private = &bridge[1];

	return bridge;
}

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau July 1, 2016, 2:14 p.m. UTC | #2
On Thu, Jun 30, 2016 at 05:37:35PM +0200, Arnd Bergmann wrote:
> On Thursday, June 30, 2016 5:19:30 PM CEST Thierry Reding wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> > 
> > This patch makes the existing pci_host_bridge structure a proper device
> > that is usable by PCI host drivers in a more standard way. In addition
> > to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
> > and pci_create_root_bus interfaces, this unfortunately means having to
> > add yet another interface doing basically the same thing, and add some
> > extra code in the initial step.
> 
> Thanks a lot for following up on my patches and getting them to work!
> 
> > As mentioned in a previous email, one open question is whether we want
> > to export a function for allocating a pci_host_bridge device in
> > combination with the per-device structure or let the driver itself
> > call kzalloc.
> > 
> > Changes in v2 (Thierry Reding):
> > - add a pci_host_bridge_init() helper that drivers can use to perform
> >   all the necessary steps to initialize the bridge
> 
> For these two points, the addition of pci_host_bridge_init() to me
> tips the balance in favour of having a separate allocation function
> that also does the initialization, something like
> 
> struct struct pci_host_bridge *pci_host_bridge_alloc(size_t priv)
> {
> 	struct pci_host_bridge *bridge;
> 
> 	bridge = kzalloc(sizeof(*bridge) + priv, GFP_KERNEL);
> 	if (!bridge)
> 		return NULL;
> 
> 	INIT_LIST_HEAD(&bridge->windows);
> 
> 	if (priv)
> 		bridge->private = &bridge[1];
> 
> 	return bridge;
> }

I'm confused. Why are we butchering the old pci_alloc_host_bridge() function
to then go back and add re-add it in a different form?

The old pci_alloc_host_bridge() was doing mostly what the above
function does minus the bridge->private line. And the &bridge[1] construct
could very well be pointing to the next struct pci_host_bridge aligned address,
which means parts of priv area are ignored.

Best regards,
Liviu

> 
> 	Arnd
>
Arnd Bergmann July 1, 2016, 2:24 p.m. UTC | #3
On Friday, July 1, 2016 3:14:47 PM CEST Liviu Dudau wrote:
> 
> I'm confused. Why are we butchering the old pci_alloc_host_bridge() function
> to then go back and add re-add it in a different form?

Good point, we could just keep the existing implementation and make it
an exported function without the bus argument.

> The old pci_alloc_host_bridge() was doing mostly what the above
> function does minus the bridge->private line. And the &bridge[1] construct
> could very well be pointing to the next struct pci_host_bridge aligned address,
> which means parts of priv area are ignored.

I don't know what you mean with that. Would you rather write this as?

	bridge->private = bridge + 1;

or leave it out and add a helper function

void *pci_host_bridge_private(struct pci_host_bridge *bridge)
{
	return &bridge[1];
}

or do you mean we should have extra alignment in there so the
private pointer has a minimum alignment higher than the
alignment of struct pci_host_bridge?

I'm absolutely fine with any of those suggestions, whichever
makes the nicest API.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau July 1, 2016, 2:46 p.m. UTC | #4
On Thu, Jun 30, 2016 at 05:19:30PM +0200, Thierry Reding wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> This patch makes the existing pci_host_bridge structure a proper device
> that is usable by PCI host drivers in a more standard way. In addition
> to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
> and pci_create_root_bus interfaces, this unfortunately means having to
> add yet another interface doing basically the same thing, and add some
> extra code in the initial step.
> 
> However, this time it's more likely to be extensible enough that we
> won't have to do another one again in the future, and we should be
> able to reduce code much more as a result.

I am really disapointed by the direction in which the whole pci_host_bridge
structure and associated functions is going. When I started to get involved
in this mess that is the creation of a root PCI bus I was hoping that we
are moving towards one or few functions that create the root bus by using
the details provided in the host bridge without having so many variants for
the functionality.

One reason for this holy mess is the duplication of information. Both root
bus and the pci_host_bridge hold pointers to useful information (msi_controller,
pci_ops) and coherency should be guaranteed when one or the other gets created.
Hence the ridiculous dance being done to find if root bus hasn't already been
created and if not reusing the scrap bus we have created at the top of
pci_create_root_bus() to actually set the data, then create the pci_host_bridge,
oops we need more information, and so on. (apologies for the incoherent style,
I'm trying to duplicate the spirit of the probe.c code :) )

I think this patchset has the right intention but it is not doing it in
the right way. To me, the right way is to separate the whole allocation
of the pci_host_bridge structure from the scanning or the root bus (keeping the
INIT_LIST_HEAD(&bridge->windows) call inside), not add the sysdata pointer
*at all* to pci_host_bridge, move the information from pci_bus into pci_host_bridge
and make pci_scan_root_bus take a pci_host_bridge pointer once that has been
done.

Think of the code structure as a reflection of how the system is structured: the
PCI-to-host bridge is a structure that holds the information required to connect
the functionality of the host code (msi_controller, host-to-bus resource windows)
to the PCI(e) bus. The root pci_bus should only have PCI(e) bus information if
possible.

> 
> The main idea is to pull the allocation of 'struct pci_host_bridge' out
> of the registration, and let individual host drivers and architecture
> code fill the members before calling the registration function.

That's how we got into the arch/arm mess that we currently have. The host driver
should not drive the instantiation of the pci_host_bridge structure because it
will prevent you from having two drivers running at the same time.

> 
> There are a number of things we can do based on this:
> 
> * Use a single memory allocation for the driver-specific structure
>   and the generic PCI host bridge
> * consolidate the contents of driver specific structures by moving
>   them into pci_host_bridge
> * Add a consistent interface for removing a PCI host bridge again
>   when unloading a host driver module
> * Replace the architecture specific __weak pcibios_* functions with
>   callbacks in a pci_host_bridge device
> * Move common boilerplate code from host drivers into the generic
>   function, based on contents of the structure
> * Extend pci_host_bridge with additional members when needed without
>   having to add arguments to pci_scan_*.
> * Move members of struct pci_bus into pci_host_bridge to avoid
>   having lots of identical copies.
> 
> As mentioned in a previous email, one open question is whether we want
> to export a function for allocating a pci_host_bridge device in
> combination with the per-device structure or let the driver itself
> call kzalloc.
> 
> Changes in v2 (Thierry Reding):
> - add a pci_host_bridge_init() helper that drivers can use to perform
>   all the necessary steps to initialize the bridge
> - rename pci_register_host() to pci_host_bridge_register() to reflect
>   the naming used by other functions
> - plug memory leak on registration failure
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/pci/probe.c | 111 ++++++++++++++++++++++++++++++++--------------------
>  include/linux/pci.h |   8 +++-
>  2 files changed, 75 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 77f0bbb05c3f..ba40514acd32 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -521,19 +521,6 @@ static void pci_release_host_bridge_dev(struct device *dev)
>  	kfree(bridge);
>  }
>  
> -static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
> -{
> -	struct pci_host_bridge *bridge;
> -
> -	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
> -	if (!bridge)
> -		return NULL;
> -
> -	INIT_LIST_HEAD(&bridge->windows);
> -	bridge->bus = b;
> -	return bridge;
> -}
> -

You might have strong arguments for removing this function but they are not properly
explained here and I feel that they should. Specially as ....

>  static const unsigned char pcix_bus_speed[] = {
>  	PCI_SPEED_UNKNOWN,		/* 0 */
>  	PCI_SPEED_66MHz_PCIX,		/* 1 */
> @@ -2117,53 +2104,56 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
>  {
>  }
>  
> -struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> -		struct pci_ops *ops, void *sysdata, struct list_head *resources)
> +void pci_host_bridge_init(struct pci_host_bridge *bridge)
> +{
> +	INIT_LIST_HEAD(&bridge->windows);
> +}
> +EXPORT_SYMBOL_GPL(pci_host_bridge_init);

I have no idea why it is useful to re-initialise the bridge->windows list at any
time other than allocation time.

> +
> +int pci_host_bridge_register(struct pci_host_bridge *bridge)
>  {
>  	int error;
> -	struct pci_host_bridge *bridge;
>  	struct pci_bus *b, *b2;
>  	struct resource_entry *window, *n;
> +	LIST_HEAD(resources);
>  	struct resource *res;
>  	resource_size_t offset;
>  	char bus_addr[64];
>  	char *fmt;
> +	struct device *parent = bridge->dev.parent;
>  
>  	b = pci_alloc_bus(NULL);
>  	if (!b)
> -		return NULL;
> +		return -ENOMEM;
> +	bridge->bus = b;
>  
> -	b->sysdata = sysdata;
> -	b->ops = ops;
> -	b->number = b->busn_res.start = bus;
> +	/* temporarily move resources off the list */
> +	list_splice_init(&bridge->windows, &resources);

What are you trying to accomplish here with the moving around of the bridge->windows list
elements? This also leaves the bridge->windows list empty afterwards, is that intended?

> +	b->sysdata = bridge->sysdata;
> +	b->msi = bridge->msi;
> +	b->ops = bridge->ops;
> +	b->number = b->busn_res.start = bridge->busnr;
>  #ifdef CONFIG_PCI_DOMAINS_GENERIC
>  	b->domain_nr = pci_bus_find_domain_nr(b, parent);
>  #endif
> -	b2 = pci_find_bus(pci_domain_nr(b), bus);
> +	b2 = pci_find_bus(pci_domain_nr(b), bridge->busnr);
>  	if (b2) {
>  		/* If we already got to this bus through a different bridge, ignore it */
>  		dev_dbg(&b2->dev, "bus already known\n");
> +		error = -EEXIST;
>  		goto err_out;
>  	}
>  
> -	bridge = pci_alloc_host_bridge(b);
> -	if (!bridge)
> -		goto err_out;
> -
> -	bridge->dev.parent = parent;
> -	bridge->dev.release = pci_release_host_bridge_dev;
> -	dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
> +	dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b),
> +		     bridge->busnr);
>  	error = pcibios_root_bridge_prepare(bridge);
> -	if (error) {
> -		kfree(bridge);
> +	if (error)
>  		goto err_out;
> -	}
>  
>  	error = device_register(&bridge->dev);
> -	if (error) {
> +	if (error)
>  		put_device(&bridge->dev);
> -		goto err_out;
> -	}
> +
>  	b->bridge = get_device(&bridge->dev);
>  	device_enable_async_suspend(b->bridge);
>  	pci_set_bus_of_node(b);
> @@ -2174,7 +2164,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>  
>  	b->dev.class = &pcibus_class;
>  	b->dev.parent = b->bridge;
> -	dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
> +	dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bridge->busnr);
>  	error = device_register(&b->dev);
>  	if (error)
>  		goto class_dev_reg_err;
> @@ -2190,12 +2180,12 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>  		printk(KERN_INFO "PCI host bridge to bus %s\n", dev_name(&b->dev));
>  
>  	/* Add initial resources to the bus */
> -	resource_list_for_each_entry_safe(window, n, resources) {
> +	resource_list_for_each_entry_safe(window, n, &resources) {
>  		list_move_tail(&window->node, &bridge->windows);
>  		res = window->res;
>  		offset = window->offset;
>  		if (res->flags & IORESOURCE_BUS)
> -			pci_bus_insert_busn_res(b, bus, res->end);
> +			pci_bus_insert_busn_res(b, bridge->busnr, res->end);
>  		else
>  			pci_bus_add_resource(b, res, 0);
>  		if (offset) {
> @@ -2215,16 +2205,16 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>  	list_add_tail(&b->node, &pci_root_buses);
>  	up_write(&pci_bus_sem);
>  
> -	return b;
> +	return 0;
>  
>  class_dev_reg_err:
>  	put_device(&bridge->dev);
>  	device_unregister(&bridge->dev);
>  err_out:
>  	kfree(b);
> -	return NULL;
> +	return error;
>  }
> -EXPORT_SYMBOL_GPL(pci_create_root_bus);
> +EXPORT_SYMBOL_GPL(pci_host_bridge_register);
>  
>  int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
>  {
> @@ -2289,6 +2279,43 @@ void pci_bus_release_busn_res(struct pci_bus *b)
>  			res, ret ? "can not be" : "is");
>  }
>  
> +static struct pci_bus *pci_create_root_bus_msi(struct device *parent,
> +		int bus, struct pci_ops *ops, void *sysdata,
> +		struct list_head *resources, struct msi_controller *msi)
> +{
> +	struct pci_host_bridge *bridge;
> +	int ret;
> +
> +	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
> +	if (!bridge)
> +		return NULL;
> +
> +	pci_host_bridge_init(bridge);
> +	bridge->dev.parent = parent;
> +	bridge->dev.release = pci_release_host_bridge_dev;
> +	bridge->busnr = bus;
> +	bridge->ops = ops;
> +	bridge->sysdata = sysdata;
> +	bridge->msi = msi;
> +	list_splice_init(resources, &bridge->windows);
> +
> +	ret = pci_host_bridge_register(bridge);
> +	if (ret) {
> +		kfree(bridge);
> +		return NULL;
> +	}
> +
> +	return bridge->bus;
> +}
> +
> +struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> +		struct pci_ops *ops, void *sysdata, struct list_head *resources)
> +{
> +	return pci_create_root_bus_msi(parent, bus, ops, sysdata, resources,
> +				       NULL);
> +}
> +EXPORT_SYMBOL_GPL(pci_create_root_bus);
> +
>  struct pci_bus *pci_scan_root_bus_msi(struct device *parent, int bus,
>  		struct pci_ops *ops, void *sysdata,
>  		struct list_head *resources, struct msi_controller *msi)
> @@ -2304,12 +2331,10 @@ struct pci_bus *pci_scan_root_bus_msi(struct device *parent, int bus,
>  			break;
>  		}
>  
> -	b = pci_create_root_bus(parent, bus, ops, sysdata, resources);
> +	b = pci_create_root_bus_msi(parent, bus, ops, sysdata, resources, msi);
>  	if (!b)
>  		return NULL;
>  
> -	b->msi = msi;
> -
>  	if (!found) {
>  		dev_info(&b->dev,
>  		 "No busn resource found for root bus, will use [bus %02x-ff]\n",
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 8badb664be00..5bf04e20c1cd 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -405,10 +405,14 @@ static inline int pci_channel_offline(struct pci_dev *pdev)
>  
>  struct pci_host_bridge {
>  	struct device dev;
> -	struct pci_bus *bus;		/* root bus */
> +	struct pci_ops *ops;
> +	void *sysdata;
> +	int busnr;
> +	struct pci_bus *bus;		/* points to root */
>  	struct list_head windows;	/* resource_entry */
>  	void (*release_fn)(struct pci_host_bridge *);
>  	void *release_data;
> +	struct msi_controller *msi;
>  	unsigned int ignore_reset_delay:1;	/* for entire hierarchy */
>  	/* Resource alignment requirements */
>  	resource_size_t (*align_resource)(struct pci_dev *dev,
> @@ -817,6 +821,8 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
>  struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>  				    struct pci_ops *ops, void *sysdata,
>  				    struct list_head *resources);
> +void pci_host_bridge_init(struct pci_host_bridge *bridge);
> +int pci_host_bridge_register(struct pci_host_bridge *bridge);
>  int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
>  int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
>  void pci_bus_release_busn_res(struct pci_bus *b);
> -- 
> 2.8.3
> 

There is too much stuff moving around. I know it is needed and the temptation is to get it
done as quickly as possible, but it makes the reviewing and the commenting on this series hard.
Can you split the patch into one that adds the new members into pci_host_bridge, then do the
renaming/re-organisation in another patch?

Best regards,
Liviu
Liviu Dudau July 1, 2016, 2:52 p.m. UTC | #5
On Fri, Jul 01, 2016 at 04:24:50PM +0200, Arnd Bergmann wrote:
> On Friday, July 1, 2016 3:14:47 PM CEST Liviu Dudau wrote:
> > 
> > I'm confused. Why are we butchering the old pci_alloc_host_bridge() function
> > to then go back and add re-add it in a different form?
> 
> Good point, we could just keep the existing implementation and make it
> an exported function without the bus argument.
> 
> > The old pci_alloc_host_bridge() was doing mostly what the above
> > function does minus the bridge->private line. And the &bridge[1] construct
> > could very well be pointing to the next struct pci_host_bridge aligned address,
> > which means parts of priv area are ignored.
> 
> I don't know what you mean with that. Would you rather write this as?
> 
> 	bridge->private = bridge + 1;
> 
> or leave it out and add a helper function
> 
> void *pci_host_bridge_private(struct pci_host_bridge *bridge)
> {
> 	return &bridge[1];
> }

none of the above

> 
> or do you mean we should have extra alignment in there so the
> private pointer has a minimum alignment higher than the
> alignment of struct pci_host_bridge?

but this ^. bridge pointer arithmetic means +1 is not necessarily +sizeof(struct pci_host_bridge)
bytes. AFAIK that can be rounded to the nearest natural alignment for pointers on that
architecture.

> 
> I'm absolutely fine with any of those suggestions, whichever
> makes the nicest API.

Does anyone need to subclass the pci_host_bridge structure? And is appending data right
after the structure useful?

Best regards,
Liviu

> 
> 	Arnd
>
Arnd Bergmann July 1, 2016, 3:17 p.m. UTC | #6
On Friday, July 1, 2016 3:52:44 PM CEST Liviu Dudau wrote:
> 
> > 
> > or do you mean we should have extra alignment in there so the
> > private pointer has a minimum alignment higher than the
> > alignment of struct pci_host_bridge?
> 
> but this ^. bridge pointer arithmetic means +1 is not necessarily +sizeof(struct pci_host_bridge)
> bytes. AFAIK that can be rounded to the nearest natural alignment for pointers on that
> architecture.

No, that's not how it works.

> > I'm absolutely fine with any of those suggestions, whichever
> > makes the nicest API.
> 
> Does anyone need to subclass the pci_host_bridge structure? And is appending data right
> after the structure useful?

I was basically following the way alloc_etherdev() and a lot of other
subsystems handle it.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau July 1, 2016, 3:40 p.m. UTC | #7
On Fri, Jul 01, 2016 at 05:17:11PM +0200, Arnd Bergmann wrote:
> On Friday, July 1, 2016 3:52:44 PM CEST Liviu Dudau wrote:
> > 
> > > 
> > > or do you mean we should have extra alignment in there so the
> > > private pointer has a minimum alignment higher than the
> > > alignment of struct pci_host_bridge?
> > 
> > but this ^. bridge pointer arithmetic means +1 is not necessarily +sizeof(struct pci_host_bridge)
> > bytes. AFAIK that can be rounded to the nearest natural alignment for pointers on that
> > architecture.
> 
> No, that's not how it works.

Really? If struct foo takes 31 bytes, and struct foo *p = (struct foo*)64, what's p's
value after p++ ? 95? I thought the compiler is allowed to consider the structure padded so that
p++ is 96.

> 
> > > I'm absolutely fine with any of those suggestions, whichever
> > > makes the nicest API.
> > 
> > Does anyone need to subclass the pci_host_bridge structure? And is appending data right
> > after the structure useful?
> 
> I was basically following the way alloc_etherdev() and a lot of other
> subsystems handle it.

Probably valid if casted to char* ?

Liviu

> 
> 	Arnd
>
Arnd Bergmann July 1, 2016, 3:44 p.m. UTC | #8
On Friday, July 1, 2016 3:46:48 PM CEST Liviu Dudau wrote:
> On Thu, Jun 30, 2016 at 05:19:30PM +0200, Thierry Reding wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> > 
> > This patch makes the existing pci_host_bridge structure a proper device
> > that is usable by PCI host drivers in a more standard way. In addition
> > to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
> > and pci_create_root_bus interfaces, this unfortunately means having to
> > add yet another interface doing basically the same thing, and add some
> > extra code in the initial step.
> > 
> > However, this time it's more likely to be extensible enough that we
> > won't have to do another one again in the future, and we should be
> > able to reduce code much more as a result.
> 
> I am really disapointed by the direction in which the whole pci_host_bridge
> structure and associated functions is going. When I started to get involved
> in this mess that is the creation of a root PCI bus I was hoping that we
> are moving towards one or few functions that create the root bus by using
> the details provided in the host bridge without having so many variants for
> the functionality.

That is the plan, we just never made a lot of progress here.

> One reason for this holy mess is the duplication of information. Both root
> bus and the pci_host_bridge hold pointers to useful information (msi_controller,
> pci_ops) and coherency should be guaranteed when one or the other gets created.
> Hence the ridiculous dance being done to find if root bus hasn't already been
> created and if not reusing the scrap bus we have created at the top of
> pci_create_root_bus() to actually set the data, then create the pci_host_bridge,
> oops we need more information, and so on. (apologies for the incoherent style,
> I'm trying to duplicate the spirit of the probe.c code :) )

Agreed.

> I think this patchset has the right intention but it is not doing it in
> the right way. To me, the right way is to separate the whole allocation
> of the pci_host_bridge structure from the scanning or the root bus (keeping the
> INIT_LIST_HEAD(&bridge->windows) call inside), not add the sysdata pointer
> *at all* to pci_host_bridge, move the information from pci_bus into pci_host_bridge
> and make pci_scan_root_bus take a pci_host_bridge pointer once that has been
> done.

Again, this is what I'm doing here, unfortunately we cannot remove the
sysdata pointer altogether as we still have over a hundred existing PCI
host bridge implementations that all use sysdata.

> Think of the code structure as a reflection of how the system is structured: the
> PCI-to-host bridge is a structure that holds the information required to connect
> the functionality of the host code (msi_controller, host-to-bus resource windows)
> to the PCI(e) bus. The root pci_bus should only have PCI(e) bus information if
> possible.

Exactly.

> > 
> > The main idea is to pull the allocation of 'struct pci_host_bridge' out
> > of the registration, and let individual host drivers and architecture
> > code fill the members before calling the registration function.
> 
> That's how we got into the arch/arm mess that we currently have. The host driver
> should not drive the instantiation of the pci_host_bridge structure because it
> will prevent you from having two drivers running at the same time.

Can you clarify that? I don't see what prevents two drivers from each
creating a pci_host_bridge structure and passing it into
pci_host_bridge_register().

> > -static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
> > -{
> > -	struct pci_host_bridge *bridge;
> > -
> > -	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
> > -	if (!bridge)
> > -		return NULL;
> > -
> > -	INIT_LIST_HEAD(&bridge->windows);
> > -	bridge->bus = b;
> > -	return bridge;
> > -}
> > -
> 
> You might have strong arguments for removing this function but they are not properly
> explained here and I feel that they should. Specially as ....

In my initial version, it simply became unnecessary because all callers
of pci_host_bridge_register() would have to allocate it anyway, and with
the pci_bus assignment gone, it didn't do much at all.

> >  static const unsigned char pcix_bus_speed[] = {
> >  	PCI_SPEED_UNKNOWN,		/* 0 */
> >  	PCI_SPEED_66MHz_PCIX,		/* 1 */
> > @@ -2117,53 +2104,56 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
> >  {
> >  }
> >  
> > -struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> > -		struct pci_ops *ops, void *sysdata, struct list_head *resources)
> > +void pci_host_bridge_init(struct pci_host_bridge *bridge)
> > +{
> > +	INIT_LIST_HEAD(&bridge->windows);
> > +}
> > +EXPORT_SYMBOL_GPL(pci_host_bridge_init);
> 
> I have no idea why it is useful to re-initialise the bridge->windows list at any
> time other than allocation time.

That's why I suggested doing this in the allocation function now.

> > +int pci_host_bridge_register(struct pci_host_bridge *bridge)
> >  {
> >  	int error;
> > -	struct pci_host_bridge *bridge;
> >  	struct pci_bus *b, *b2;
> >  	struct resource_entry *window, *n;
> > +	LIST_HEAD(resources);
> >  	struct resource *res;
> >  	resource_size_t offset;
> >  	char bus_addr[64];
> >  	char *fmt;
> > +	struct device *parent = bridge->dev.parent;
> >  
> >  	b = pci_alloc_bus(NULL);
> >  	if (!b)
> > -		return NULL;
> > +		return -ENOMEM;
> > +	bridge->bus = b;
> >  
> > -	b->sysdata = sysdata;
> > -	b->ops = ops;
> > -	b->number = b->busn_res.start = bus;
> > +	/* temporarily move resources off the list */
> > +	list_splice_init(&bridge->windows, &resources);
> 
> What are you trying to accomplish here with the moving around of the bridge->windows list
> elements? This also leaves the bridge->windows list empty afterwards, is that intended?

I was trying to preserve the existing behavior, this can probably
be simplified, but as my initial version was completely untested
I decided not to touch it.

Later in the function, the list is filled one entry at a time from
the local 'resources' list that we traditionally passed as a function
argument before.

Ideally we'd just walk the list instead of doing the
split/list_del/list_add dance, and this would be a logical
cleanup on top of this patch.

> > @@ -817,6 +821,8 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
> >  struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> >  				    struct pci_ops *ops, void *sysdata,
> >  				    struct list_head *resources);
> > +void pci_host_bridge_init(struct pci_host_bridge *bridge);
> > +int pci_host_bridge_register(struct pci_host_bridge *bridge);
> >  int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
> >  int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
> >  void pci_bus_release_busn_res(struct pci_bus *b);
> 
> There is too much stuff moving around. I know it is needed and the temptation is to get it
> done as quickly as possible, but it makes the reviewing and the commenting on this series hard.

This is why I left the list handling you mentioned above unchanged
in my original patch instead of rewriting it.

> Can you split the patch into one that adds the new members
> into pci_host_bridge, then do the renaming/re-organisation in
> another patch?

That should be possible, I guess Thierry can deal with that when respinning
the series.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann July 1, 2016, 3:58 p.m. UTC | #9
On Friday, July 1, 2016 4:40:46 PM CEST Liviu Dudau wrote:
> On Fri, Jul 01, 2016 at 05:17:11PM +0200, Arnd Bergmann wrote:
> > On Friday, July 1, 2016 3:52:44 PM CEST Liviu Dudau wrote:
> > > 
> > > > 
> > > > or do you mean we should have extra alignment in there so the
> > > > private pointer has a minimum alignment higher than the
> > > > alignment of struct pci_host_bridge?
> > > 
> > > but this ^. bridge pointer arithmetic means +1 is not necessarily +sizeof(struct pci_host_bridge)
> > > bytes. AFAIK that can be rounded to the nearest natural alignment for pointers on that
> > > architecture.
> > 
> > No, that's not how it works.
> 
> Really? If struct foo takes 31 bytes, and struct foo *p = (struct foo*)64, what's p's
> value after p++ ? 95? I thought the compiler is allowed to consider the structure padded so that
> p++ is 96.

In that example, sizeof(struct foo) is 32, so we get to the right result,
for any type, this is true:

	(char *)((struct foo *)p + 1) == (char *)p + sizeof(struct foo);

However, there is indeed a problem in the case that the private structure
requires a larger alignment than struct pci_host_bridge, so adding some
bytes for alignment the way that alloc_etherdev() does is probably
a good idea to be on the safe side, so we could do

	bridge = kzalloc(ALIGN(sizeof(struct pci_host_bridge), ARCH_KMALLOC_MINALIGN) +
					     sizeof_priv, GFP_KERNEL);
	priv = PTR_ALIGN(bridge + 1, ARCH_KMALLOC_MINALIGN);

which will guarantee that both bridge and priv are aligned to
ARCH_KMALLOC_MINALIGN and the allocation is large enough to hold
both.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau July 1, 2016, 4:09 p.m. UTC | #10
On Fri, Jul 01, 2016 at 05:44:09PM +0200, Arnd Bergmann wrote:
> On Friday, July 1, 2016 3:46:48 PM CEST Liviu Dudau wrote:
> > On Thu, Jun 30, 2016 at 05:19:30PM +0200, Thierry Reding wrote:
> > > From: Arnd Bergmann <arnd@arndb.de>
> > > 
> > > This patch makes the existing pci_host_bridge structure a proper device
> > > that is usable by PCI host drivers in a more standard way. In addition
> > > to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
> > > and pci_create_root_bus interfaces, this unfortunately means having to
> > > add yet another interface doing basically the same thing, and add some
> > > extra code in the initial step.
> > > 
> > > However, this time it's more likely to be extensible enough that we
> > > won't have to do another one again in the future, and we should be
> > > able to reduce code much more as a result.
> > 
> > I am really disapointed by the direction in which the whole pci_host_bridge
> > structure and associated functions is going. When I started to get involved
> > in this mess that is the creation of a root PCI bus I was hoping that we
> > are moving towards one or few functions that create the root bus by using
> > the details provided in the host bridge without having so many variants for
> > the functionality.
> 
> That is the plan, we just never made a lot of progress here.
> 
> > One reason for this holy mess is the duplication of information. Both root
> > bus and the pci_host_bridge hold pointers to useful information (msi_controller,
> > pci_ops) and coherency should be guaranteed when one or the other gets created.
> > Hence the ridiculous dance being done to find if root bus hasn't already been
> > created and if not reusing the scrap bus we have created at the top of
> > pci_create_root_bus() to actually set the data, then create the pci_host_bridge,
> > oops we need more information, and so on. (apologies for the incoherent style,
> > I'm trying to duplicate the spirit of the probe.c code :) )
> 
> Agreed.
> 
> > I think this patchset has the right intention but it is not doing it in
> > the right way. To me, the right way is to separate the whole allocation
> > of the pci_host_bridge structure from the scanning or the root bus (keeping the
> > INIT_LIST_HEAD(&bridge->windows) call inside), not add the sysdata pointer
> > *at all* to pci_host_bridge, move the information from pci_bus into pci_host_bridge
> > and make pci_scan_root_bus take a pci_host_bridge pointer once that has been
> > done.
> 
> Again, this is what I'm doing here, unfortunately we cannot remove the
> sysdata pointer altogether as we still have over a hundred existing PCI
> host bridge implementations that all use sysdata.
> 
> > Think of the code structure as a reflection of how the system is structured: the
> > PCI-to-host bridge is a structure that holds the information required to connect
> > the functionality of the host code (msi_controller, host-to-bus resource windows)
> > to the PCI(e) bus. The root pci_bus should only have PCI(e) bus information if
> > possible.
> 
> Exactly.
> 
> > > 
> > > The main idea is to pull the allocation of 'struct pci_host_bridge' out
> > > of the registration, and let individual host drivers and architecture
> > > code fill the members before calling the registration function.
> > 
> > That's how we got into the arch/arm mess that we currently have. The host driver
> > should not drive the instantiation of the pci_host_bridge structure because it
> > will prevent you from having two drivers running at the same time.
> 
> Can you clarify that? I don't see what prevents two drivers from each
> creating a pci_host_bridge structure and passing it into
> pci_host_bridge_register().

Maybe I'm exagerating a bit, but I always disliked the pci_common_init() function and
the way it is creating a pci_host_bridge while relying on hw_pci hooks to get
things ready. I'm pretty sure there are still issues around the fact that a lot of
platform drivers have a subsys_initcall() function that calls pci_common_init(). I
don't want us to go down that path again.

Don't get me wrong, clearly someone needs to create an instance of pci_host_bridge.
What I want people to accept is that in the PCI(e) architecture there is nothing that
stops a piece of HW that used to be the bridge between host and underlying bus into
becoming the bridge between a higher PCI(e) bus and the bus underneath. If the writes
to the configuration registers happens somehow, the Host Bridge doesn't even know if
it is talking to the actual host. Can the driver in that case make sure it did not made
assumptions that were tied to a specific SoC implementation?

> 
> > > -static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
> > > -{
> > > -	struct pci_host_bridge *bridge;
> > > -
> > > -	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
> > > -	if (!bridge)
> > > -		return NULL;
> > > -
> > > -	INIT_LIST_HEAD(&bridge->windows);
> > > -	bridge->bus = b;
> > > -	return bridge;
> > > -}
> > > -
> > 
> > You might have strong arguments for removing this function but they are not properly
> > explained here and I feel that they should. Specially as ....
> 
> In my initial version, it simply became unnecessary because all callers
> of pci_host_bridge_register() would have to allocate it anyway, and with
> the pci_bus assignment gone, it didn't do much at all.
> 
> > >  static const unsigned char pcix_bus_speed[] = {
> > >  	PCI_SPEED_UNKNOWN,		/* 0 */
> > >  	PCI_SPEED_66MHz_PCIX,		/* 1 */
> > > @@ -2117,53 +2104,56 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
> > >  {
> > >  }
> > >  
> > > -struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> > > -		struct pci_ops *ops, void *sysdata, struct list_head *resources)
> > > +void pci_host_bridge_init(struct pci_host_bridge *bridge)
> > > +{
> > > +	INIT_LIST_HEAD(&bridge->windows);
> > > +}
> > > +EXPORT_SYMBOL_GPL(pci_host_bridge_init);
> > 
> > I have no idea why it is useful to re-initialise the bridge->windows list at any
> > time other than allocation time.
> 
> That's why I suggested doing this in the allocation function now.

Understood. Thanks for the explanations, I did not manage to keep track of all the old
discussions around PCIe, specially if they were prefixed with ACPI ;)

Best regards,
Liviu

> 
> > > +int pci_host_bridge_register(struct pci_host_bridge *bridge)
> > >  {
> > >  	int error;
> > > -	struct pci_host_bridge *bridge;
> > >  	struct pci_bus *b, *b2;
> > >  	struct resource_entry *window, *n;
> > > +	LIST_HEAD(resources);
> > >  	struct resource *res;
> > >  	resource_size_t offset;
> > >  	char bus_addr[64];
> > >  	char *fmt;
> > > +	struct device *parent = bridge->dev.parent;
> > >  
> > >  	b = pci_alloc_bus(NULL);
> > >  	if (!b)
> > > -		return NULL;
> > > +		return -ENOMEM;
> > > +	bridge->bus = b;
> > >  
> > > -	b->sysdata = sysdata;
> > > -	b->ops = ops;
> > > -	b->number = b->busn_res.start = bus;
> > > +	/* temporarily move resources off the list */
> > > +	list_splice_init(&bridge->windows, &resources);
> > 
> > What are you trying to accomplish here with the moving around of the bridge->windows list
> > elements? This also leaves the bridge->windows list empty afterwards, is that intended?
> 
> I was trying to preserve the existing behavior, this can probably
> be simplified, but as my initial version was completely untested
> I decided not to touch it.
> 
> Later in the function, the list is filled one entry at a time from
> the local 'resources' list that we traditionally passed as a function
> argument before.
> 
> Ideally we'd just walk the list instead of doing the
> split/list_del/list_add dance, and this would be a logical
> cleanup on top of this patch.
> 
> > > @@ -817,6 +821,8 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
> > >  struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> > >  				    struct pci_ops *ops, void *sysdata,
> > >  				    struct list_head *resources);
> > > +void pci_host_bridge_init(struct pci_host_bridge *bridge);
> > > +int pci_host_bridge_register(struct pci_host_bridge *bridge);
> > >  int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
> > >  int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
> > >  void pci_bus_release_busn_res(struct pci_bus *b);
> > 
> > There is too much stuff moving around. I know it is needed and the temptation is to get it
> > done as quickly as possible, but it makes the reviewing and the commenting on this series hard.
> 
> This is why I left the list handling you mentioned above unchanged
> in my original patch instead of rewriting it.
> 
> > Can you split the patch into one that adds the new members
> > into pci_host_bridge, then do the renaming/re-organisation in
> > another patch?
> 
> That should be possible, I guess Thierry can deal with that when respinning
> the series.
> 
> 	Arnd
>
Arnd Bergmann July 1, 2016, 4:30 p.m. UTC | #11
On Friday, July 1, 2016 5:09:23 PM CEST Liviu Dudau wrote:
> > > > 
> > > > The main idea is to pull the allocation of 'struct pci_host_bridge' out
> > > > of the registration, and let individual host drivers and architecture
> > > > code fill the members before calling the registration function.
> > > 
> > > That's how we got into the arch/arm mess that we currently have. The host driver
> > > should not drive the instantiation of the pci_host_bridge structure because it
> > > will prevent you from having two drivers running at the same time.
> > 
> > Can you clarify that? I don't see what prevents two drivers from each
> > creating a pci_host_bridge structure and passing it into
> > pci_host_bridge_register().
> 
> Maybe I'm exagerating a bit, but I always disliked the pci_common_init() function and
> the way it is creating a pci_host_bridge while relying on hw_pci hooks to get
> things ready. I'm pretty sure there are still issues around the fact that a lot of
> platform drivers have a subsys_initcall() function that calls pci_common_init(). I
> don't want us to go down that path again.

I see what you mean now, and I agree we don't want to do this like
pci_common_init(), but the patch here does a number of things very
differently:

- it's not architecture specific
- it is not tied into architecture specific pcibios_* functions
- it does not create multiple bridges at once
- it does not have three levels of callbacks going back and forth,
  the idea is to eventually end up with one structure for the
  callback pointers including those that today are done as __weak
  functions.

Today we only have four callers of pci_common_init_dev that probe the
PCI host from DT: cns3xxx, versatile, mvebu and rcar-gen2. I hope
we can replace all of them with the new method here directly and
not take any intermediate steps.

The users of pci_common_init() (not _dev) are limited to board files
on really old platforms that are either not getting updated (ixp, iop,
footbridge, ks8695, mv78xx0, sa1100) or that have DT based replacement
code coming (dove, orion5x, cns3xxx, integrator).

> Don't get me wrong, clearly someone needs to create an instance of pci_host_bridge.
> What I want people to accept is that in the PCI(e) architecture there is nothing that
> stops a piece of HW that used to be the bridge between host and underlying bus into
> becoming the bridge between a higher PCI(e) bus and the bus underneath. If the writes
> to the configuration registers happens somehow, the Host Bridge doesn't even know if
> it is talking to the actual host. Can the driver in that case make sure it did not made
> assumptions that were tied to a specific SoC implementation?

Sorry, I'm not really following what you mean with that, or what the
consequence is for the Linux implementation. Can you try to rephrase this?

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau July 4, 2016, 9:56 a.m. UTC | #12
On Fri, Jul 01, 2016 at 06:30:24PM +0200, Arnd Bergmann wrote:
> On Friday, July 1, 2016 5:09:23 PM CEST Liviu Dudau wrote:
> > > > > 
> > > > > The main idea is to pull the allocation of 'struct pci_host_bridge' out
> > > > > of the registration, and let individual host drivers and architecture
> > > > > code fill the members before calling the registration function.
> > > > 
> > > > That's how we got into the arch/arm mess that we currently have. The host driver
> > > > should not drive the instantiation of the pci_host_bridge structure because it
> > > > will prevent you from having two drivers running at the same time.
> > > 
> > > Can you clarify that? I don't see what prevents two drivers from each
> > > creating a pci_host_bridge structure and passing it into
> > > pci_host_bridge_register().
> > 
> > Maybe I'm exagerating a bit, but I always disliked the pci_common_init() function and
> > the way it is creating a pci_host_bridge while relying on hw_pci hooks to get
> > things ready. I'm pretty sure there are still issues around the fact that a lot of
> > platform drivers have a subsys_initcall() function that calls pci_common_init(). I
> > don't want us to go down that path again.
> 
> I see what you mean now, and I agree we don't want to do this like
> pci_common_init(), but the patch here does a number of things very
> differently:
> 
> - it's not architecture specific
> - it is not tied into architecture specific pcibios_* functions
> - it does not create multiple bridges at once
> - it does not have three levels of callbacks going back and forth,
>   the idea is to eventually end up with one structure for the
>   callback pointers including those that today are done as __weak
>   functions.
> 
> Today we only have four callers of pci_common_init_dev that probe the
> PCI host from DT: cns3xxx, versatile, mvebu and rcar-gen2. I hope
> we can replace all of them with the new method here directly and
> not take any intermediate steps.
> 
> The users of pci_common_init() (not _dev) are limited to board files
> on really old platforms that are either not getting updated (ixp, iop,
> footbridge, ks8695, mv78xx0, sa1100) or that have DT based replacement
> code coming (dove, orion5x, cns3xxx, integrator).
> 
> > Don't get me wrong, clearly someone needs to create an instance of pci_host_bridge.
> > What I want people to accept is that in the PCI(e) architecture there is nothing that
> > stops a piece of HW that used to be the bridge between host and underlying bus into
> > becoming the bridge between a higher PCI(e) bus and the bus underneath. If the writes
> > to the configuration registers happens somehow, the Host Bridge doesn't even know if
> > it is talking to the actual host. Can the driver in that case make sure it did not made
> > assumptions that were tied to a specific SoC implementation?
> 
> Sorry, I'm not really following what you mean with that, or what the
> consequence is for the Linux implementation. Can you try to rephrase this?

I'm thinking drivers expecting to drive the bridge between the host and the PCI bus.
When that HW is used to bridge between another bus (PCI or PCIe) because the
functionality was there all the time in terms of signals, the driver's assumptions
break down. But maybe I'm being too theoretical and such beasts don't exist? (I remember
seeing a network card that had native PCI chip plus an added PCIe-to-PCI bridge and
the driver was tripping badly, but I can't remember the exact details.

Best regards,
Liviu

> 
> 	Arnd
>
Arnd Bergmann July 4, 2016, 1:46 p.m. UTC | #13
On Monday, July 4, 2016 10:56:27 AM CEST Liviu Dudau wrote:
> On Fri, Jul 01, 2016 at 06:30:24PM +0200, Arnd Bergmann wrote:
> > On Friday, July 1, 2016 5:09:23 PM CEST Liviu Dudau wrote:

> > > Don't get me wrong, clearly someone needs to create an instance of pci_host_bridge.
> > > What I want people to accept is that in the PCI(e) architecture there is nothing that
> > > stops a piece of HW that used to be the bridge between host and underlying bus into
> > > becoming the bridge between a higher PCI(e) bus and the bus underneath. If the writes
> > > to the configuration registers happens somehow, the Host Bridge doesn't even know if
> > > it is talking to the actual host. Can the driver in that case make sure it did not made
> > > assumptions that were tied to a specific SoC implementation?
> > 
> > Sorry, I'm not really following what you mean with that, or what the
> > consequence is for the Linux implementation. Can you try to rephrase this?
> 
> I'm thinking drivers expecting to drive the bridge between the host and the PCI bus.
> When that HW is used to bridge between another bus (PCI or PCIe) because the
> functionality was there all the time in terms of signals, the driver's assumptions
> break down. But maybe I'm being too theoretical and such beasts don't exist? (I remember
> seeing a network card that had native PCI chip plus an added PCIe-to-PCI bridge and
> the driver was tripping badly, but I can't remember the exact details.

I think we should expect all non-host PCI bridges to behave according to the
normal PCI spec and get handled by the PCI core.

There are some handlers for broken bridges in drivers/pci/quirks.c, which
is not nice but there is little alternative, and I don't think that
hooking into the PCI host bridge driver code would help here.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 28, 2016, 8:43 p.m. UTC | #14
Hi Thierry and Arnd,

Thanks a lot for pushing this forward.  I would like to have gotten
this into the v4.8 merge window, but I didn't get to it soon enough.

On Thu, Jun 30, 2016 at 05:19:30PM +0200, Thierry Reding wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> This patch makes the existing pci_host_bridge structure a proper device
> that is usable by PCI host drivers in a more standard way. In addition
> to the existing pci_scan_bus, pci_scan_root_bus, pci_scan_root_bus_msi,
> and pci_create_root_bus interfaces, this unfortunately means having to
> add yet another interface doing basically the same thing, and add some
> extra code in the initial step.
> 
> However, this time it's more likely to be extensible enough that we
> won't have to do another one again in the future, and we should be
> able to reduce code much more as a result.
> 
> The main idea is to pull the allocation of 'struct pci_host_bridge' out
> of the registration, and let individual host drivers and architecture
> code fill the members before calling the registration function.
> 
> There are a number of things we can do based on this:
> 
> * Use a single memory allocation for the driver-specific structure
>   and the generic PCI host bridge
> * consolidate the contents of driver specific structures by moving
>   them into pci_host_bridge
> * Add a consistent interface for removing a PCI host bridge again
>   when unloading a host driver module
> * Replace the architecture specific __weak pcibios_* functions with
>   callbacks in a pci_host_bridge device
> * Move common boilerplate code from host drivers into the generic
>   function, based on contents of the structure
> * Extend pci_host_bridge with additional members when needed without
>   having to add arguments to pci_scan_*.
> * Move members of struct pci_bus into pci_host_bridge to avoid
>   having lots of identical copies.
> 
> As mentioned in a previous email, one open question is whether we want
> to export a function for allocating a pci_host_bridge device in
> combination with the per-device structure or let the driver itself
> call kzalloc.

I agree with Arnd that a pci_host_bridge_alloc() would be nice.  The
pointer to alloc_etherdev() was very useful and probably worth
including in the changelog.  It helps a lot if we can copy the style
of some other subsystem, but my head is buried so deep in PCI that I
don't know what they do.

Looking forward to the next iteration.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 77f0bbb05c3f..ba40514acd32 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -521,19 +521,6 @@  static void pci_release_host_bridge_dev(struct device *dev)
 	kfree(bridge);
 }
 
-static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
-{
-	struct pci_host_bridge *bridge;
-
-	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
-	if (!bridge)
-		return NULL;
-
-	INIT_LIST_HEAD(&bridge->windows);
-	bridge->bus = b;
-	return bridge;
-}
-
 static const unsigned char pcix_bus_speed[] = {
 	PCI_SPEED_UNKNOWN,		/* 0 */
 	PCI_SPEED_66MHz_PCIX,		/* 1 */
@@ -2117,53 +2104,56 @@  void __weak pcibios_remove_bus(struct pci_bus *bus)
 {
 }
 
-struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
-		struct pci_ops *ops, void *sysdata, struct list_head *resources)
+void pci_host_bridge_init(struct pci_host_bridge *bridge)
+{
+	INIT_LIST_HEAD(&bridge->windows);
+}
+EXPORT_SYMBOL_GPL(pci_host_bridge_init);
+
+int pci_host_bridge_register(struct pci_host_bridge *bridge)
 {
 	int error;
-	struct pci_host_bridge *bridge;
 	struct pci_bus *b, *b2;
 	struct resource_entry *window, *n;
+	LIST_HEAD(resources);
 	struct resource *res;
 	resource_size_t offset;
 	char bus_addr[64];
 	char *fmt;
+	struct device *parent = bridge->dev.parent;
 
 	b = pci_alloc_bus(NULL);
 	if (!b)
-		return NULL;
+		return -ENOMEM;
+	bridge->bus = b;
 
-	b->sysdata = sysdata;
-	b->ops = ops;
-	b->number = b->busn_res.start = bus;
+	/* temporarily move resources off the list */
+	list_splice_init(&bridge->windows, &resources);
+	b->sysdata = bridge->sysdata;
+	b->msi = bridge->msi;
+	b->ops = bridge->ops;
+	b->number = b->busn_res.start = bridge->busnr;
 #ifdef CONFIG_PCI_DOMAINS_GENERIC
 	b->domain_nr = pci_bus_find_domain_nr(b, parent);
 #endif
-	b2 = pci_find_bus(pci_domain_nr(b), bus);
+	b2 = pci_find_bus(pci_domain_nr(b), bridge->busnr);
 	if (b2) {
 		/* If we already got to this bus through a different bridge, ignore it */
 		dev_dbg(&b2->dev, "bus already known\n");
+		error = -EEXIST;
 		goto err_out;
 	}
 
-	bridge = pci_alloc_host_bridge(b);
-	if (!bridge)
-		goto err_out;
-
-	bridge->dev.parent = parent;
-	bridge->dev.release = pci_release_host_bridge_dev;
-	dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
+	dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b),
+		     bridge->busnr);
 	error = pcibios_root_bridge_prepare(bridge);
-	if (error) {
-		kfree(bridge);
+	if (error)
 		goto err_out;
-	}
 
 	error = device_register(&bridge->dev);
-	if (error) {
+	if (error)
 		put_device(&bridge->dev);
-		goto err_out;
-	}
+
 	b->bridge = get_device(&bridge->dev);
 	device_enable_async_suspend(b->bridge);
 	pci_set_bus_of_node(b);
@@ -2174,7 +2164,7 @@  struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 
 	b->dev.class = &pcibus_class;
 	b->dev.parent = b->bridge;
-	dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
+	dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bridge->busnr);
 	error = device_register(&b->dev);
 	if (error)
 		goto class_dev_reg_err;
@@ -2190,12 +2180,12 @@  struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 		printk(KERN_INFO "PCI host bridge to bus %s\n", dev_name(&b->dev));
 
 	/* Add initial resources to the bus */
-	resource_list_for_each_entry_safe(window, n, resources) {
+	resource_list_for_each_entry_safe(window, n, &resources) {
 		list_move_tail(&window->node, &bridge->windows);
 		res = window->res;
 		offset = window->offset;
 		if (res->flags & IORESOURCE_BUS)
-			pci_bus_insert_busn_res(b, bus, res->end);
+			pci_bus_insert_busn_res(b, bridge->busnr, res->end);
 		else
 			pci_bus_add_resource(b, res, 0);
 		if (offset) {
@@ -2215,16 +2205,16 @@  struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 	list_add_tail(&b->node, &pci_root_buses);
 	up_write(&pci_bus_sem);
 
-	return b;
+	return 0;
 
 class_dev_reg_err:
 	put_device(&bridge->dev);
 	device_unregister(&bridge->dev);
 err_out:
 	kfree(b);
-	return NULL;
+	return error;
 }
-EXPORT_SYMBOL_GPL(pci_create_root_bus);
+EXPORT_SYMBOL_GPL(pci_host_bridge_register);
 
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
 {
@@ -2289,6 +2279,43 @@  void pci_bus_release_busn_res(struct pci_bus *b)
 			res, ret ? "can not be" : "is");
 }
 
+static struct pci_bus *pci_create_root_bus_msi(struct device *parent,
+		int bus, struct pci_ops *ops, void *sysdata,
+		struct list_head *resources, struct msi_controller *msi)
+{
+	struct pci_host_bridge *bridge;
+	int ret;
+
+	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
+	if (!bridge)
+		return NULL;
+
+	pci_host_bridge_init(bridge);
+	bridge->dev.parent = parent;
+	bridge->dev.release = pci_release_host_bridge_dev;
+	bridge->busnr = bus;
+	bridge->ops = ops;
+	bridge->sysdata = sysdata;
+	bridge->msi = msi;
+	list_splice_init(resources, &bridge->windows);
+
+	ret = pci_host_bridge_register(bridge);
+	if (ret) {
+		kfree(bridge);
+		return NULL;
+	}
+
+	return bridge->bus;
+}
+
+struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
+		struct pci_ops *ops, void *sysdata, struct list_head *resources)
+{
+	return pci_create_root_bus_msi(parent, bus, ops, sysdata, resources,
+				       NULL);
+}
+EXPORT_SYMBOL_GPL(pci_create_root_bus);
+
 struct pci_bus *pci_scan_root_bus_msi(struct device *parent, int bus,
 		struct pci_ops *ops, void *sysdata,
 		struct list_head *resources, struct msi_controller *msi)
@@ -2304,12 +2331,10 @@  struct pci_bus *pci_scan_root_bus_msi(struct device *parent, int bus,
 			break;
 		}
 
-	b = pci_create_root_bus(parent, bus, ops, sysdata, resources);
+	b = pci_create_root_bus_msi(parent, bus, ops, sysdata, resources, msi);
 	if (!b)
 		return NULL;
 
-	b->msi = msi;
-
 	if (!found) {
 		dev_info(&b->dev,
 		 "No busn resource found for root bus, will use [bus %02x-ff]\n",
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8badb664be00..5bf04e20c1cd 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -405,10 +405,14 @@  static inline int pci_channel_offline(struct pci_dev *pdev)
 
 struct pci_host_bridge {
 	struct device dev;
-	struct pci_bus *bus;		/* root bus */
+	struct pci_ops *ops;
+	void *sysdata;
+	int busnr;
+	struct pci_bus *bus;		/* points to root */
 	struct list_head windows;	/* resource_entry */
 	void (*release_fn)(struct pci_host_bridge *);
 	void *release_data;
+	struct msi_controller *msi;
 	unsigned int ignore_reset_delay:1;	/* for entire hierarchy */
 	/* Resource alignment requirements */
 	resource_size_t (*align_resource)(struct pci_dev *dev,
@@ -817,6 +821,8 @@  struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 				    struct pci_ops *ops, void *sysdata,
 				    struct list_head *resources);
+void pci_host_bridge_init(struct pci_host_bridge *bridge);
+int pci_host_bridge_register(struct pci_host_bridge *bridge);
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
 int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
 void pci_bus_release_busn_res(struct pci_bus *b);