diff mbox

PCI: Only enable IO window if supported

Message ID 1432342336-25832-1-git-send-email-linux@roeck-us.net
State Changes Requested
Headers show

Commit Message

Guenter Roeck May 23, 2015, 12:52 a.m. UTC
The PCI subsystem always assumes that I/O is supported on PCIe bridges
and tries to assign an I/O window to each port even if that is not
the case.

This may result in messages such as

pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
					get_res_add_size add_size 1000
pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]

for each bridge port, even if a port or its parent does not support
I/O in the first place.

To avoid this message, check if a port supports I/O before trying to
enable it. Also check if port's parent supports I/O, and only modify
a port's I/O resource size if both the port and its parent support I/O.

If IO is disabled after the initial port scan, the IO base and size
registers are set to 0x00f0 to indicate that IO is disabled. A later
rescan interprets this as "IO supported" and enables the IO range,
even if the parent does not support IO. Handle this situation as well.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/pci/probe.c     | 14 ++++++++++++++
 drivers/pci/setup-bus.c |  4 ++--
 include/linux/pci.h     |  9 +++++++++
 3 files changed, 25 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas May 27, 2015, 9:04 p.m. UTC | #1
[+cc Lorenzo, Suravee, Will]

I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
pci_read_bases() from the PCI core instead of from arch code, and there are
likely some dependencies between these two things.

On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
> The PCI subsystem always assumes that I/O is supported on PCIe bridges
> and tries to assign an I/O window to each port even if that is not
> the case.
> 
> This may result in messages such as
> 
> pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
> 					get_res_add_size add_size 1000
> pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
> pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
> 
> for each bridge port, even if a port or its parent does not support
> I/O in the first place.
> 
> To avoid this message, check if a port supports I/O before trying to
> enable it. Also check if port's parent supports I/O, and only modify
> a port's I/O resource size if both the port and its parent support I/O.
> 
> If IO is disabled after the initial port scan, the IO base and size
> registers are set to 0x00f0 to indicate that IO is disabled. A later
> rescan interprets this as "IO supported" and enables the IO range,
> even if the parent does not support IO. Handle this situation as well.
> 
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/pci/probe.c     | 14 ++++++++++++++
>  drivers/pci/setup-bus.c |  4 ++--
>  include/linux/pci.h     |  9 +++++++++
>  3 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 6675a7a1b9fc..f4944ef45148 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus *child)
>  	base = (io_base_lo & io_mask) << 8;
>  	limit = (io_limit_lo & io_mask) << 8;
>  
> +	/* If necessary, check if the bridge supports an I/O aperture */
> +	if (!io_base_lo && !io_limit_lo) {
> +		u16 io;
> +
> +		if (!pci_parent_supports_io(child))
> +			return;
> +
> +		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
> +		pci_read_config_word(dev, PCI_IO_BASE, &io);
> +		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
> +		if (!io)
> +			return;
> +	}

I really like the idea of pushing this into pci_read_bridge_io().

I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
somehow get rid of pci_bridge_check_ranges() altogether?

I think I looked at doing that a while back, and it seems like there was
some wrinkle, but I don't remember what it was.

It does make sense that if the bridge supports an I/O aperture, but there's
no possibility of I/O resources on the primary side, we should pretend the
bridge has no I/O aperture.  But I think it might be nice to emit a
diagnostic about *why* we're ignoring it.  Otherwise there's a little
discrepancy between dmesg and lspci.

> +
>  	if ((io_base_lo & PCI_IO_RANGE_TYPE_MASK) == PCI_IO_RANGE_TYPE_32) {
>  		u16 io_base_hi, io_limit_hi;
>  
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 4fd0cacf7ca0..963b31a109a9 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -750,12 +750,12 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
>  	b_res[1].flags |= IORESOURCE_MEM;
>  
>  	pci_read_config_word(bridge, PCI_IO_BASE, &io);
> -	if (!io) {
> +	if (!io && pci_parent_supports_io(bus)) {
>  		pci_write_config_word(bridge, PCI_IO_BASE, 0xe0f0);
>  		pci_read_config_word(bridge, PCI_IO_BASE, &io);
>  		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
>  	}
> -	if (io)
> +	if (io && (io != 0x00f0 || pci_parent_supports_io(bus)))

I *think* this 0x00f0 depends on what pci_setup_bridge_io() writes to
PCI_IO_BASE when it disables an I/O aperture.  Depending on that particular
value here is sort of ugly and would need at least a comment if we can't
figure out a better way to do it.

But it would be ideal if we could get rid of pci_bridge_check_ranges()
altogether and have the rule that we read bridge window characteristics
(IORESOURCE_IO, IORESOURCE_MEM, IORESOURCE_PREFETCH, IORESOURCE_MEM_64)
once when we enumerate the bridge.  After that, the only changes would be
to change res->start and res->end and update the hardware correspondingly.

I'd like res->flags to reflect the capabilities of the hardware, not
whether the window is currently enabled.

>  		b_res[0].flags |= IORESOURCE_IO;
>  
>  	/*  DECchip 21050 pass 2 errata: the bridge may miss an address
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 353db8dc4c6e..f3de9e24aab1 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -489,6 +489,15 @@ static inline bool pci_is_root_bus(struct pci_bus *pbus)
>  	return !(pbus->parent);
>  }
>  
> +/*
> + * Returns true if the parent bus supports an I/O aperture.
> + */
> +static inline bool pci_parent_supports_io(struct pci_bus *pbus)
> +{
> +	return pci_is_root_bus(pbus) || pci_is_root_bus(pbus->parent) ||
> +	       (pbus->parent->resource[0]->flags & IORESOURCE_IO);

This is not obvious to me.  There are host bridges that do not have I/O
apertures, so I don't see what the pci_is_root_bus() tests have to do with
this.  The resource[0]->flags & IORESOURCE_IO part does make sense to me.

I think at the root bus, we'd have to iterate through all the host bridge
resources to figure out whether there are any I/O apertures.

> +}
> +
>  /**
>   * pci_is_bridge - check if the PCI device is a bridge
>   * @dev: PCI device
> -- 
> 2.1.0
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck May 28, 2015, 2:23 a.m. UTC | #2
On Wed, May 27, 2015 at 04:04:47PM -0500, Bjorn Helgaas wrote:
> [+cc Lorenzo, Suravee, Will]
> 
> I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
> pci_read_bases() from the PCI core instead of from arch code, and there are
> likely some dependencies between these two things.
> 
> On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
> > The PCI subsystem always assumes that I/O is supported on PCIe bridges
> > and tries to assign an I/O window to each port even if that is not
> > the case.
> > 
> > This may result in messages such as
> > 
> > pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
> > 					get_res_add_size add_size 1000
> > pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
> > pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
> > 
> > for each bridge port, even if a port or its parent does not support
> > I/O in the first place.
> > 
> > To avoid this message, check if a port supports I/O before trying to
> > enable it. Also check if port's parent supports I/O, and only modify
> > a port's I/O resource size if both the port and its parent support I/O.
> > 
> > If IO is disabled after the initial port scan, the IO base and size
> > registers are set to 0x00f0 to indicate that IO is disabled. A later
> > rescan interprets this as "IO supported" and enables the IO range,
> > even if the parent does not support IO. Handle this situation as well.
> > 
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> >  drivers/pci/probe.c     | 14 ++++++++++++++
> >  drivers/pci/setup-bus.c |  4 ++--
> >  include/linux/pci.h     |  9 +++++++++
> >  3 files changed, 25 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 6675a7a1b9fc..f4944ef45148 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus *child)
> >  	base = (io_base_lo & io_mask) << 8;
> >  	limit = (io_limit_lo & io_mask) << 8;
> >  
> > +	/* If necessary, check if the bridge supports an I/O aperture */
> > +	if (!io_base_lo && !io_limit_lo) {
> > +		u16 io;
> > +
> > +		if (!pci_parent_supports_io(child))
> > +			return;
> > +
> > +		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
> > +		pci_read_config_word(dev, PCI_IO_BASE, &io);
> > +		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
> > +		if (!io)
> > +			return;
> > +	}
> 
> I really like the idea of pushing this into pci_read_bridge_io().
> 
> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
> somehow get rid of pci_bridge_check_ranges() altogether?
> 
Sure, I just figured I'd start with IO, and do the rest after
I have a better idea if I am going into the right direction.

> I think I looked at doing that a while back, and it seems like there was
> some wrinkle, but I don't remember what it was.
> 
> It does make sense that if the bridge supports an I/O aperture, but there's
> no possibility of I/O resources on the primary side, we should pretend the
> bridge has no I/O aperture.  But I think it might be nice to emit a
> diagnostic about *why* we're ignoring it.  Otherwise there's a little
> discrepancy between dmesg and lspci.
> 
Ok, makes sense. Would you want to see that message for every port ?
Guess I can check how it looks like, to make sure that I don't end up
getting a lot of noise again.

> > +
> >  	if ((io_base_lo & PCI_IO_RANGE_TYPE_MASK) == PCI_IO_RANGE_TYPE_32) {
> >  		u16 io_base_hi, io_limit_hi;
> >  
> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > index 4fd0cacf7ca0..963b31a109a9 100644
> > --- a/drivers/pci/setup-bus.c
> > +++ b/drivers/pci/setup-bus.c
> > @@ -750,12 +750,12 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
> >  	b_res[1].flags |= IORESOURCE_MEM;
> >  
> >  	pci_read_config_word(bridge, PCI_IO_BASE, &io);
> > -	if (!io) {
> > +	if (!io && pci_parent_supports_io(bus)) {
> >  		pci_write_config_word(bridge, PCI_IO_BASE, 0xe0f0);
> >  		pci_read_config_word(bridge, PCI_IO_BASE, &io);
> >  		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
> >  	}
> > -	if (io)
> > +	if (io && (io != 0x00f0 || pci_parent_supports_io(bus)))
> 
> I *think* this 0x00f0 depends on what pci_setup_bridge_io() writes to
> PCI_IO_BASE when it disables an I/O aperture.  Depending on that particular

Correct. I could have checked if io is disabled (limit < base),
but at least for the time being I wanted the impact to be minimal.
So far the code auto-enables IO if it was disabled (eg by the BIOS)
but the bridge chip supports it. I only wanted to keep it disabled
if it was likely that it was disabled by pci_setup_bridge_io().

Of course,
	if (io && pci_parent_supports_io(bus))
might just be sufficient.

> value here is sort of ugly and would need at least a comment if we can't
> figure out a better way to do it.
> 
> But it would be ideal if we could get rid of pci_bridge_check_ranges()
> altogether and have the rule that we read bridge window characteristics
> (IORESOURCE_IO, IORESOURCE_MEM, IORESOURCE_PREFETCH, IORESOURCE_MEM_64)
> once when we enumerate the bridge.  After that, the only changes would be
> to change res->start and res->end and update the hardware correspondingly.
> 
Would be great - this should solve the above problem automatically.
I was hesitant to do that, because I don't know if there would be side
effects. I could take out the io handling from pci_bridge_check_ranges()
and see what happens, but obviously my test coverage would be somewhat
limited.

> I'd like res->flags to reflect the capabilities of the hardware, not
> whether the window is currently enabled.
> 
Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
purpose, or could that cause conflicts elsewhere ?

> >  		b_res[0].flags |= IORESOURCE_IO;
> >  
> >  	/*  DECchip 21050 pass 2 errata: the bridge may miss an address
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 353db8dc4c6e..f3de9e24aab1 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -489,6 +489,15 @@ static inline bool pci_is_root_bus(struct pci_bus *pbus)
> >  	return !(pbus->parent);
> >  }
> >  
> > +/*
> > + * Returns true if the parent bus supports an I/O aperture.
> > + */
> > +static inline bool pci_parent_supports_io(struct pci_bus *pbus)
> > +{
> > +	return pci_is_root_bus(pbus) || pci_is_root_bus(pbus->parent) ||
> > +	       (pbus->parent->resource[0]->flags & IORESOURCE_IO);
> 
> This is not obvious to me.  There are host bridges that do not have I/O
> apertures, so I don't see what the pci_is_root_bus() tests have to do with
> this.  The resource[0]->flags & IORESOURCE_IO part does make sense to me.
> 
More a matter of me not knowing what I need to do. resource[0] is NULL
for the root bus, at least on the powerpc system I used for testing.

> I think at the root bus, we'd have to iterate through all the host bridge
> resources to figure out whether there are any I/O apertures.
> 
Can you give me a hint on how to do that, hopefully in a platform
independent way ? Walk through bus->resources and search for an
IO resource ? Or does resource[0] == NULL already indicate
that there is no IO aperture ?

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi June 2, 2015, 2:55 p.m. UTC | #3
Bjorn, Guenter,

On Wed, May 27, 2015 at 10:04:47PM +0100, Bjorn Helgaas wrote:
> [+cc Lorenzo, Suravee, Will]
> 
> I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
> pci_read_bases() from the PCI core instead of from arch code, and there are
> likely some dependencies between these two things.
> 
> On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
> > The PCI subsystem always assumes that I/O is supported on PCIe bridges
> > and tries to assign an I/O window to each port even if that is not
> > the case.
> > 
> > This may result in messages such as
> > 
> > pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
> > 					get_res_add_size add_size 1000
> > pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
> > pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
> > 
> > for each bridge port, even if a port or its parent does not support
> > I/O in the first place.
> > 
> > To avoid this message, check if a port supports I/O before trying to
> > enable it. Also check if port's parent supports I/O, and only modify
> > a port's I/O resource size if both the port and its parent support I/O.
> > 
> > If IO is disabled after the initial port scan, the IO base and size
> > registers are set to 0x00f0 to indicate that IO is disabled. A later
> > rescan interprets this as "IO supported" and enables the IO range,
> > even if the parent does not support IO. Handle this situation as well.
> > 
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> >  drivers/pci/probe.c     | 14 ++++++++++++++
> >  drivers/pci/setup-bus.c |  4 ++--
> >  include/linux/pci.h     |  9 +++++++++
> >  3 files changed, 25 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 6675a7a1b9fc..f4944ef45148 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus *child)
> >  	base = (io_base_lo & io_mask) << 8;
> >  	limit = (io_limit_lo & io_mask) << 8;
> >  
> > +	/* If necessary, check if the bridge supports an I/O aperture */
> > +	if (!io_base_lo && !io_limit_lo) {
> > +		u16 io;
> > +
> > +		if (!pci_parent_supports_io(child))
> > +			return;
> > +
> > +		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
> > +		pci_read_config_word(dev, PCI_IO_BASE, &io);
> > +		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
> > +		if (!io)
> > +			return;
> > +	}
> 
> I really like the idea of pushing this into pci_read_bridge_io().
> 
> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
> somehow get rid of pci_bridge_check_ranges() altogether?
> 
> I think I looked at doing that a while back, and it seems like there was
> some wrinkle, but I don't remember what it was.

While at it, do you think it is reasonable to also claim the bridge
windows (resources) in the respective pci_read_bridge_* calls ?

Is there a reason why we don't/can't do it ? I noticed that on
PROBE_ONLY systems on ARM/ARM64 at the moment we do not claim
the bridge apertures and this is not correct, see below:

[5.980127] pcieport 0000:00:02.1: can't enable device: BAR 8
[mem 0xbff00000 - 0xbfffffff] not claimed
[5.988056] pcieport: probe of 0000:00:02.1 failed with error -22

Thanks,
Lorenzo

> It does make sense that if the bridge supports an I/O aperture, but there's
> no possibility of I/O resources on the primary side, we should pretend the
> bridge has no I/O aperture.  But I think it might be nice to emit a
> diagnostic about *why* we're ignoring it.  Otherwise there's a little
> discrepancy between dmesg and lspci.
> 
> > +
> >  	if ((io_base_lo & PCI_IO_RANGE_TYPE_MASK) == PCI_IO_RANGE_TYPE_32) {
> >  		u16 io_base_hi, io_limit_hi;
> >  
> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > index 4fd0cacf7ca0..963b31a109a9 100644
> > --- a/drivers/pci/setup-bus.c
> > +++ b/drivers/pci/setup-bus.c
> > @@ -750,12 +750,12 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
> >  	b_res[1].flags |= IORESOURCE_MEM;
> >  
> >  	pci_read_config_word(bridge, PCI_IO_BASE, &io);
> > -	if (!io) {
> > +	if (!io && pci_parent_supports_io(bus)) {
> >  		pci_write_config_word(bridge, PCI_IO_BASE, 0xe0f0);
> >  		pci_read_config_word(bridge, PCI_IO_BASE, &io);
> >  		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
> >  	}
> > -	if (io)
> > +	if (io && (io != 0x00f0 || pci_parent_supports_io(bus)))
> 
> I *think* this 0x00f0 depends on what pci_setup_bridge_io() writes to
> PCI_IO_BASE when it disables an I/O aperture.  Depending on that particular
> value here is sort of ugly and would need at least a comment if we can't
> figure out a better way to do it.
> 
> But it would be ideal if we could get rid of pci_bridge_check_ranges()
> altogether and have the rule that we read bridge window characteristics
> (IORESOURCE_IO, IORESOURCE_MEM, IORESOURCE_PREFETCH, IORESOURCE_MEM_64)
> once when we enumerate the bridge.  After that, the only changes would be
> to change res->start and res->end and update the hardware correspondingly.
> 
> I'd like res->flags to reflect the capabilities of the hardware, not
> whether the window is currently enabled.
> 
> >  		b_res[0].flags |= IORESOURCE_IO;
> >  
> >  	/*  DECchip 21050 pass 2 errata: the bridge may miss an address
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 353db8dc4c6e..f3de9e24aab1 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -489,6 +489,15 @@ static inline bool pci_is_root_bus(struct pci_bus *pbus)
> >  	return !(pbus->parent);
> >  }
> >  
> > +/*
> > + * Returns true if the parent bus supports an I/O aperture.
> > + */
> > +static inline bool pci_parent_supports_io(struct pci_bus *pbus)
> > +{
> > +	return pci_is_root_bus(pbus) || pci_is_root_bus(pbus->parent) ||
> > +	       (pbus->parent->resource[0]->flags & IORESOURCE_IO);
> 
> This is not obvious to me.  There are host bridges that do not have I/O
> apertures, so I don't see what the pci_is_root_bus() tests have to do with
> this.  The resource[0]->flags & IORESOURCE_IO part does make sense to me.
> 
> I think at the root bus, we'd have to iterate through all the host bridge
> resources to figure out whether there are any I/O apertures.
> 
> > +}
> > +
> >  /**
> >   * pci_is_bridge - check if the PCI device is a bridge
> >   * @dev: PCI device
> > -- 
> > 2.1.0
> > 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 2, 2015, 4:32 p.m. UTC | #4
On Tue, Jun 2, 2015 at 9:55 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Wed, May 27, 2015 at 10:04:47PM +0100, Bjorn Helgaas wrote:

>> I really like the idea of pushing this into pci_read_bridge_io().
>>
>> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
>> somehow get rid of pci_bridge_check_ranges() altogether?
>>
>> I think I looked at doing that a while back, and it seems like there was
>> some wrinkle, but I don't remember what it was.
>
> While at it, do you think it is reasonable to also claim the bridge
> windows (resources) in the respective pci_read_bridge_* calls ?

I *do* think that's reasonable, and I think it would simplify some
things.  Of course, we can only claim them if firmware has already
programmed them.  If firmware hasn't done anything, the claim should
fail and we can treat it as an unassigned resource and allocate and
claim the space later.

I'm sure we'll trip over some issues when trying this, but it seems
like a logical thing to do, so I think it'd be great if you tried it
out.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck June 2, 2015, 5:02 p.m. UTC | #5
On 06/02/2015 07:55 AM, Lorenzo Pieralisi wrote:
> Bjorn, Guenter,
>
> On Wed, May 27, 2015 at 10:04:47PM +0100, Bjorn Helgaas wrote:
>> [+cc Lorenzo, Suravee, Will]
>>
>> I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
>> pci_read_bases() from the PCI core instead of from arch code, and there are
>> likely some dependencies between these two things.
>>
>> On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
>>> The PCI subsystem always assumes that I/O is supported on PCIe bridges
>>> and tries to assign an I/O window to each port even if that is not
>>> the case.
>>>
>>> This may result in messages such as
>>>
>>> pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
>>> 					get_res_add_size add_size 1000
>>> pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
>>> pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
>>>
>>> for each bridge port, even if a port or its parent does not support
>>> I/O in the first place.
>>>
>>> To avoid this message, check if a port supports I/O before trying to
>>> enable it. Also check if port's parent supports I/O, and only modify
>>> a port's I/O resource size if both the port and its parent support I/O.
>>>
>>> If IO is disabled after the initial port scan, the IO base and size
>>> registers are set to 0x00f0 to indicate that IO is disabled. A later
>>> rescan interprets this as "IO supported" and enables the IO range,
>>> even if the parent does not support IO. Handle this situation as well.
>>>
>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>> ---
>>>   drivers/pci/probe.c     | 14 ++++++++++++++
>>>   drivers/pci/setup-bus.c |  4 ++--
>>>   include/linux/pci.h     |  9 +++++++++
>>>   3 files changed, 25 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>> index 6675a7a1b9fc..f4944ef45148 100644
>>> --- a/drivers/pci/probe.c
>>> +++ b/drivers/pci/probe.c
>>> @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus *child)
>>>   	base = (io_base_lo & io_mask) << 8;
>>>   	limit = (io_limit_lo & io_mask) << 8;
>>>
>>> +	/* If necessary, check if the bridge supports an I/O aperture */
>>> +	if (!io_base_lo && !io_limit_lo) {
>>> +		u16 io;
>>> +
>>> +		if (!pci_parent_supports_io(child))
>>> +			return;
>>> +
>>> +		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
>>> +		pci_read_config_word(dev, PCI_IO_BASE, &io);
>>> +		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
>>> +		if (!io)
>>> +			return;
>>> +	}
>>
>> I really like the idea of pushing this into pci_read_bridge_io().
>>
>> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
>> somehow get rid of pci_bridge_check_ranges() altogether?
>>
>> I think I looked at doing that a while back, and it seems like there was
>> some wrinkle, but I don't remember what it was.
>

After looking into this some more, I think the wrinkle may be that
pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
on probe-only systems (if PCI_PROBE_ONLY is set). A secondary
problem is that pci_read_bridge_io() does not enable a resource
if it is explicitly disabled (base > limit), but the subsequent call
to pci_bridge_check_ranges() unconditionally enables it.

Not really sure how to address this; my current code checks IO support
in both pci_read_bridge_io() and pci_bridge_check_ranges(). And since
pci_read_bridge_io() is not always called, I don't see how it might
be possible to get rid of pci_bridge_check_ranges(), or even the check
for IO support in pci_bridge_check_ranges().

> While at it, do you think it is reasonable to also claim the bridge
> windows (resources) in the respective pci_read_bridge_* calls ?
>
> Is there a reason why we don't/can't do it ? I noticed that on
> PROBE_ONLY systems on ARM/ARM64 at the moment we do not claim
> the bridge apertures and this is not correct, see below:
>
> [5.980127] pcieport 0000:00:02.1: can't enable device: BAR 8
> [mem 0xbff00000 - 0xbfffffff] not claimed
> [5.988056] pcieport: probe of 0000:00:02.1 failed with error -22
>
Is this when trying my patches or with the current upstream code ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 2, 2015, 7:58 p.m. UTC | #6
On Tue, Jun 2, 2015 at 12:02 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> On 06/02/2015 07:55 AM, Lorenzo Pieralisi wrote:
>>
>> Bjorn, Guenter,
>>
>> On Wed, May 27, 2015 at 10:04:47PM +0100, Bjorn Helgaas wrote:
>>>
>>> [+cc Lorenzo, Suravee, Will]
>>>
>>> I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
>>> pci_read_bases() from the PCI core instead of from arch code, and there
>>> are
>>> likely some dependencies between these two things.
>>>
>>> On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
>>>>
>>>> The PCI subsystem always assumes that I/O is supported on PCIe bridges
>>>> and tries to assign an I/O window to each port even if that is not
>>>> the case.
>>>>
>>>> This may result in messages such as
>>>>
>>>> pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
>>>>                                         get_res_add_size add_size 1000
>>>> pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
>>>> pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
>>>>
>>>> for each bridge port, even if a port or its parent does not support
>>>> I/O in the first place.
>>>>
>>>> To avoid this message, check if a port supports I/O before trying to
>>>> enable it. Also check if port's parent supports I/O, and only modify
>>>> a port's I/O resource size if both the port and its parent support I/O.
>>>>
>>>> If IO is disabled after the initial port scan, the IO base and size
>>>> registers are set to 0x00f0 to indicate that IO is disabled. A later
>>>> rescan interprets this as "IO supported" and enables the IO range,
>>>> even if the parent does not support IO. Handle this situation as well.
>>>>
>>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>>> ---
>>>>   drivers/pci/probe.c     | 14 ++++++++++++++
>>>>   drivers/pci/setup-bus.c |  4 ++--
>>>>   include/linux/pci.h     |  9 +++++++++
>>>>   3 files changed, 25 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>>> index 6675a7a1b9fc..f4944ef45148 100644
>>>> --- a/drivers/pci/probe.c
>>>> +++ b/drivers/pci/probe.c
>>>> @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus
>>>> *child)
>>>>         base = (io_base_lo & io_mask) << 8;
>>>>         limit = (io_limit_lo & io_mask) << 8;
>>>>
>>>> +       /* If necessary, check if the bridge supports an I/O aperture */
>>>> +       if (!io_base_lo && !io_limit_lo) {
>>>> +               u16 io;
>>>> +
>>>> +               if (!pci_parent_supports_io(child))
>>>> +                       return;
>>>> +
>>>> +               pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
>>>> +               pci_read_config_word(dev, PCI_IO_BASE, &io);
>>>> +               pci_write_config_word(dev, PCI_IO_BASE, 0x0);
>>>> +               if (!io)
>>>> +                       return;
>>>> +       }
>>>
>>>
>>> I really like the idea of pushing this into pci_read_bridge_io().
>>>
>>> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
>>> somehow get rid of pci_bridge_check_ranges() altogether?
>>>
>>> I think I looked at doing that a while back, and it seems like there was
>>> some wrinkle, but I don't remember what it was.
>
> After looking into this some more, I think the wrinkle may be that
> pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
> on probe-only systems (if PCI_PROBE_ONLY is set). A secondary
> problem is that pci_read_bridge_io() does not enable a resource
> if it is explicitly disabled (base > limit), but the subsequent call
> to pci_bridge_check_ranges() unconditionally enables it.

I haven't researched this, but it sounds wrong that we skip
pci_read_bridge_bases() if PCI_PROBE_ONLY is set.  I think
PCI_PROBE_ONLY should mean "look, but don't touch."  So I think we
should always look at the bridge windows, and my advice is to see if
it looks reasonable to change this aspect of PCI_PROBE_ONLY.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi June 3, 2015, 10:32 a.m. UTC | #7
On Tue, Jun 02, 2015 at 06:02:49PM +0100, Guenter Roeck wrote:
> On 06/02/2015 07:55 AM, Lorenzo Pieralisi wrote:
> > Bjorn, Guenter,
> >
> > On Wed, May 27, 2015 at 10:04:47PM +0100, Bjorn Helgaas wrote:
> >> [+cc Lorenzo, Suravee, Will]
> >>
> >> I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
> >> pci_read_bases() from the PCI core instead of from arch code, and there are
> >> likely some dependencies between these two things.
> >>
> >> On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
> >>> The PCI subsystem always assumes that I/O is supported on PCIe bridges
> >>> and tries to assign an I/O window to each port even if that is not
> >>> the case.
> >>>
> >>> This may result in messages such as
> >>>
> >>> pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
> >>> 					get_res_add_size add_size 1000
> >>> pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
> >>> pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
> >>>
> >>> for each bridge port, even if a port or its parent does not support
> >>> I/O in the first place.
> >>>
> >>> To avoid this message, check if a port supports I/O before trying to
> >>> enable it. Also check if port's parent supports I/O, and only modify
> >>> a port's I/O resource size if both the port and its parent support I/O.
> >>>
> >>> If IO is disabled after the initial port scan, the IO base and size
> >>> registers are set to 0x00f0 to indicate that IO is disabled. A later
> >>> rescan interprets this as "IO supported" and enables the IO range,
> >>> even if the parent does not support IO. Handle this situation as well.
> >>>
> >>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> >>> ---
> >>>   drivers/pci/probe.c     | 14 ++++++++++++++
> >>>   drivers/pci/setup-bus.c |  4 ++--
> >>>   include/linux/pci.h     |  9 +++++++++
> >>>   3 files changed, 25 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >>> index 6675a7a1b9fc..f4944ef45148 100644
> >>> --- a/drivers/pci/probe.c
> >>> +++ b/drivers/pci/probe.c
> >>> @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus *child)
> >>>   	base = (io_base_lo & io_mask) << 8;
> >>>   	limit = (io_limit_lo & io_mask) << 8;
> >>>
> >>> +	/* If necessary, check if the bridge supports an I/O aperture */
> >>> +	if (!io_base_lo && !io_limit_lo) {
> >>> +		u16 io;
> >>> +
> >>> +		if (!pci_parent_supports_io(child))
> >>> +			return;
> >>> +
> >>> +		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
> >>> +		pci_read_config_word(dev, PCI_IO_BASE, &io);
> >>> +		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
> >>> +		if (!io)
> >>> +			return;
> >>> +	}
> >>
> >> I really like the idea of pushing this into pci_read_bridge_io().
> >>
> >> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
> >> somehow get rid of pci_bridge_check_ranges() altogether?
> >>
> >> I think I looked at doing that a while back, and it seems like there was
> >> some wrinkle, but I don't remember what it was.
> >
> 
> After looking into this some more, I think the wrinkle may be that
> pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
> on probe-only systems (if PCI_PROBE_ONLY is set). A secondary

That's what we would like to change :)

https://lkml.org/lkml/2015/5/21/359

> problem is that pci_read_bridge_io() does not enable a resource
> if it is explicitly disabled (base > limit), but the subsequent call
> to pci_bridge_check_ranges() unconditionally enables it.
> 
> Not really sure how to address this; my current code checks IO support
> in both pci_read_bridge_io() and pci_bridge_check_ranges(). And since
> pci_read_bridge_io() is not always called, I don't see how it might
> be possible to get rid of pci_bridge_check_ranges(), or even the check
> for IO support in pci_bridge_check_ranges().
> 
> > While at it, do you think it is reasonable to also claim the bridge
> > windows (resources) in the respective pci_read_bridge_* calls ?
> >
> > Is there a reason why we don't/can't do it ? I noticed that on
> > PROBE_ONLY systems on ARM/ARM64 at the moment we do not claim
> > the bridge apertures and this is not correct, see below:
> >
> > [5.980127] pcieport 0000:00:02.1: can't enable device: BAR 8
> > [mem 0xbff00000 - 0xbfffffff] not claimed
> > [5.988056] pcieport: probe of 0000:00:02.1 failed with error -22
> >
> Is this when trying my patches or with the current upstream code ?

It is upstream code with a couple of ARM64 related patches not yet
merged. Still, it shows an issue that must be tackled.

It is not caused by your patches but it can be solved by them.
On PROBE_ONLY systems, all resources must be claimed (since they
are not reassigned, hence not claimed by the code that reassigns them),
otherwise we can't enable a device resources (ie pcibios_enable_device
calls pci_enable_resources that fails, since resources are not claimed).

That's why we are suggesting claiming the bridge apertures as soon
as they are read from the base registers, even on PROBE_ONLY systems.

I think that's the only approach Bjorn would accept, otherwise
we will have to fiddle with PROBE_ONLY on ARM64, and either avoid calling
pci_enable_resources or avoid checking if a resource is claimed in
pci_enable_resources, neither solution seems sane to me.

Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck June 3, 2015, 3:12 p.m. UTC | #8
On 06/03/2015 03:32 AM, Lorenzo Pieralisi wrote:
> On Tue, Jun 02, 2015 at 06:02:49PM +0100, Guenter Roeck wrote:
>> On 06/02/2015 07:55 AM, Lorenzo Pieralisi wrote:
>>> Bjorn, Guenter,
>>>
>>> On Wed, May 27, 2015 at 10:04:47PM +0100, Bjorn Helgaas wrote:
>>>> [+cc Lorenzo, Suravee, Will]
>>>>
>>>> I cc'd Lorenzo, Suravee, and Will because Lorenzo is working on calling
>>>> pci_read_bases() from the PCI core instead of from arch code, and there are
>>>> likely some dependencies between these two things.
>>>>
>>>> On Fri, May 22, 2015 at 05:52:16PM -0700, Guenter Roeck wrote:
>>>>> The PCI subsystem always assumes that I/O is supported on PCIe bridges
>>>>> and tries to assign an I/O window to each port even if that is not
>>>>> the case.
>>>>>
>>>>> This may result in messages such as
>>>>>
>>>>> pcieport 0000:02:00.0: res[7]=[io  0x1000-0x0fff]
>>>>> 					get_res_add_size add_size 1000
>>>>> pcieport 0000:02:00.0: BAR 7: no space for [io  size 0x1000]
>>>>> pcieport 0000:02:00.0: BAR 7: failed to assign [io  size 0x1000]
>>>>>
>>>>> for each bridge port, even if a port or its parent does not support
>>>>> I/O in the first place.
>>>>>
>>>>> To avoid this message, check if a port supports I/O before trying to
>>>>> enable it. Also check if port's parent supports I/O, and only modify
>>>>> a port's I/O resource size if both the port and its parent support I/O.
>>>>>
>>>>> If IO is disabled after the initial port scan, the IO base and size
>>>>> registers are set to 0x00f0 to indicate that IO is disabled. A later
>>>>> rescan interprets this as "IO supported" and enables the IO range,
>>>>> even if the parent does not support IO. Handle this situation as well.
>>>>>
>>>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>>>> ---
>>>>>    drivers/pci/probe.c     | 14 ++++++++++++++
>>>>>    drivers/pci/setup-bus.c |  4 ++--
>>>>>    include/linux/pci.h     |  9 +++++++++
>>>>>    3 files changed, 25 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>>>> index 6675a7a1b9fc..f4944ef45148 100644
>>>>> --- a/drivers/pci/probe.c
>>>>> +++ b/drivers/pci/probe.c
>>>>> @@ -354,6 +354,20 @@ static void pci_read_bridge_io(struct pci_bus *child)
>>>>>    	base = (io_base_lo & io_mask) << 8;
>>>>>    	limit = (io_limit_lo & io_mask) << 8;
>>>>>
>>>>> +	/* If necessary, check if the bridge supports an I/O aperture */
>>>>> +	if (!io_base_lo && !io_limit_lo) {
>>>>> +		u16 io;
>>>>> +
>>>>> +		if (!pci_parent_supports_io(child))
>>>>> +			return;
>>>>> +
>>>>> +		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
>>>>> +		pci_read_config_word(dev, PCI_IO_BASE, &io);
>>>>> +		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
>>>>> +		if (!io)
>>>>> +			return;
>>>>> +	}
>>>>
>>>> I really like the idea of pushing this into pci_read_bridge_io().
>>>>
>>>> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
>>>> somehow get rid of pci_bridge_check_ranges() altogether?
>>>>
>>>> I think I looked at doing that a while back, and it seems like there was
>>>> some wrinkle, but I don't remember what it was.
>>>
>>
>> After looking into this some more, I think the wrinkle may be that
>> pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
>> on probe-only systems (if PCI_PROBE_ONLY is set). A secondary
>
> That's what we would like to change :)
>
> https://lkml.org/lkml/2015/5/21/359

Yes, that should help. I had a brief look last night and concluded
that this would require changes all over the place, which your patch
pretty much confirms. Glad that you are tackling it - changes all over
the place spell trouble and would probably require more time than I have
available to spend on the problem.

>
>> problem is that pci_read_bridge_io() does not enable a resource
>> if it is explicitly disabled (base > limit), but the subsequent call
>> to pci_bridge_check_ranges() unconditionally enables it.
>>
>> Not really sure how to address this; my current code checks IO support
>> in both pci_read_bridge_io() and pci_bridge_check_ranges(). And since
>> pci_read_bridge_io() is not always called, I don't see how it might
>> be possible to get rid of pci_bridge_check_ranges(), or even the check
>> for IO support in pci_bridge_check_ranges().
>>
>>> While at it, do you think it is reasonable to also claim the bridge
>>> windows (resources) in the respective pci_read_bridge_* calls ?
>>>
>>> Is there a reason why we don't/can't do it ? I noticed that on
>>> PROBE_ONLY systems on ARM/ARM64 at the moment we do not claim
>>> the bridge apertures and this is not correct, see below:
>>>
>>> [5.980127] pcieport 0000:00:02.1: can't enable device: BAR 8
>>> [mem 0xbff00000 - 0xbfffffff] not claimed
>>> [5.988056] pcieport: probe of 0000:00:02.1 failed with error -22
>>>
>> Is this when trying my patches or with the current upstream code ?
>
> It is upstream code with a couple of ARM64 related patches not yet
> merged. Still, it shows an issue that must be tackled.
>
> It is not caused by your patches but it can be solved by them.
> On PROBE_ONLY systems, all resources must be claimed (since they
> are not reassigned, hence not claimed by the code that reassigns them),
> otherwise we can't enable a device resources (ie pcibios_enable_device
> calls pci_enable_resources that fails, since resources are not claimed).
>
> That's why we are suggesting claiming the bridge apertures as soon
> as they are read from the base registers, even on PROBE_ONLY systems.
>
> I think that's the only approach Bjorn would accept, otherwise
> we will have to fiddle with PROBE_ONLY on ARM64, and either avoid calling
> pci_enable_resources or avoid checking if a resource is claimed in
> pci_enable_resources, neither solution seems sane to me.
>

Looks like I'll need one of those arm64 systems at some point ;-).

Where is your patch in respect to acceptance ? Would it make sense to
merge it into my code and base my patch(es) on it, or do you expect
major changes which would make that difficult ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck June 3, 2015, 3:15 p.m. UTC | #9
Bjorn,

On 06/02/2015 12:58 PM, Bjorn Helgaas wrote:
> On Tue, Jun 2, 2015 at 12:02 PM, Guenter Roeck <linux@roeck-us.net> wrote:
[ ... ]
>>>>
>>>> I wonder if we can do the same with pci_read_bridge_mmio_pref(), and
>>>> somehow get rid of pci_bridge_check_ranges() altogether?
>>>>
>>>> I think I looked at doing that a while back, and it seems like there was
>>>> some wrinkle, but I don't remember what it was.
>>
>> After looking into this some more, I think the wrinkle may be that
>> pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
>> on probe-only systems (if PCI_PROBE_ONLY is set). A secondary
>> problem is that pci_read_bridge_io() does not enable a resource
>> if it is explicitly disabled (base > limit), but the subsequent call
>> to pci_bridge_check_ranges() unconditionally enables it.
>
> I haven't researched this, but it sounds wrong that we skip
> pci_read_bridge_bases() if PCI_PROBE_ONLY is set.  I think
> PCI_PROBE_ONLY should mean "look, but don't touch."  So I think we
> should always look at the bridge windows, and my advice is to see if
> it looks reasonable to change this aspect of PCI_PROBE_ONLY.
>

Looks like Lorenzo's patch is going to take care of that problem.

How about the second problem mentioned above ? One option might be to
keep pci_bridge_check_ranges() but only use it to _enable_ the various
ranges (and maybe rename it accordingly) ?

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi June 3, 2015, 4:55 p.m. UTC | #10
On Wed, Jun 03, 2015 at 04:12:24PM +0100, Guenter Roeck wrote:

[...]

> >> After looking into this some more, I think the wrinkle may be that
> >> pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
> >> on probe-only systems (if PCI_PROBE_ONLY is set). A secondary
> >
> > That's what we would like to change :)
> >
> > https://lkml.org/lkml/2015/5/21/359
> 
> Yes, that should help. I had a brief look last night and concluded
> that this would require changes all over the place, which your patch
> pretty much confirms. Glad that you are tackling it - changes all over
> the place spell trouble and would probably require more time than I have
> available to spend on the problem.

Eh, trouble did not even start because we have just tested it on ARM/ARM64
systems (that's all I can do no sign of testing on any other arch), so I do
not expect it will be merged quickly, it will take me time to get all the
required acks.

I should be able to send a v2 beginning of next week.

> >> problem is that pci_read_bridge_io() does not enable a resource
> >> if it is explicitly disabled (base > limit), but the subsequent call
> >> to pci_bridge_check_ranges() unconditionally enables it.
> >>
> >> Not really sure how to address this; my current code checks IO support
> >> in both pci_read_bridge_io() and pci_bridge_check_ranges(). And since
> >> pci_read_bridge_io() is not always called, I don't see how it might
> >> be possible to get rid of pci_bridge_check_ranges(), or even the check
> >> for IO support in pci_bridge_check_ranges().
> >>
> >>> While at it, do you think it is reasonable to also claim the bridge
> >>> windows (resources) in the respective pci_read_bridge_* calls ?
> >>>
> >>> Is there a reason why we don't/can't do it ? I noticed that on
> >>> PROBE_ONLY systems on ARM/ARM64 at the moment we do not claim
> >>> the bridge apertures and this is not correct, see below:
> >>>
> >>> [5.980127] pcieport 0000:00:02.1: can't enable device: BAR 8
> >>> [mem 0xbff00000 - 0xbfffffff] not claimed
> >>> [5.988056] pcieport: probe of 0000:00:02.1 failed with error -22
> >>>
> >> Is this when trying my patches or with the current upstream code ?
> >
> > It is upstream code with a couple of ARM64 related patches not yet
> > merged. Still, it shows an issue that must be tackled.
> >
> > It is not caused by your patches but it can be solved by them.
> > On PROBE_ONLY systems, all resources must be claimed (since they
> > are not reassigned, hence not claimed by the code that reassigns them),
> > otherwise we can't enable a device resources (ie pcibios_enable_device
> > calls pci_enable_resources that fails, since resources are not claimed).
> >
> > That's why we are suggesting claiming the bridge apertures as soon
> > as they are read from the base registers, even on PROBE_ONLY systems.
> >
> > I think that's the only approach Bjorn would accept, otherwise
> > we will have to fiddle with PROBE_ONLY on ARM64, and either avoid calling
> > pci_enable_resources or avoid checking if a resource is claimed in
> > pci_enable_resources, neither solution seems sane to me.
> >
> 
> Looks like I'll need one of those arm64 systems at some point ;-).
> 
> Where is your patch in respect to acceptance ? Would it make sense to
> merge it into my code and base my patch(es) on it, or do you expect
> major changes which would make that difficult ?

I have a tweak to v1, I will post v2 next week and copy you in.
Acceptance, I think it received review only from ARM guys/platforms
so we are still far from merging it.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck June 3, 2015, 6:07 p.m. UTC | #11
On Wed, Jun 03, 2015 at 05:55:35PM +0100, Lorenzo Pieralisi wrote:
> On Wed, Jun 03, 2015 at 04:12:24PM +0100, Guenter Roeck wrote:
> 
> [...]
> 
> > >> After looking into this some more, I think the wrinkle may be that
> > >> pci_read_bridge_bases() and thus pci_read_bridge_io() isn't called
> > >> on probe-only systems (if PCI_PROBE_ONLY is set). A secondary
> > >
> > > That's what we would like to change :)
> > >
> > > https://lkml.org/lkml/2015/5/21/359
> > 
> > Yes, that should help. I had a brief look last night and concluded
> > that this would require changes all over the place, which your patch
> > pretty much confirms. Glad that you are tackling it - changes all over
> > the place spell trouble and would probably require more time than I have
> > available to spend on the problem.
> 
> Eh, trouble did not even start because we have just tested it on ARM/ARM64
> systems (that's all I can do no sign of testing on any other arch), so I do
> not expect it will be merged quickly, it will take me time to get all the
> required acks.
> 
> I should be able to send a v2 beginning of next week.
> 
Ok.

> > >> problem is that pci_read_bridge_io() does not enable a resource
> > >> if it is explicitly disabled (base > limit), but the subsequent call
> > >> to pci_bridge_check_ranges() unconditionally enables it.
> > >>
> > >> Not really sure how to address this; my current code checks IO support
> > >> in both pci_read_bridge_io() and pci_bridge_check_ranges(). And since
> > >> pci_read_bridge_io() is not always called, I don't see how it might
> > >> be possible to get rid of pci_bridge_check_ranges(), or even the check
> > >> for IO support in pci_bridge_check_ranges().
> > >>
> > >>> While at it, do you think it is reasonable to also claim the bridge
> > >>> windows (resources) in the respective pci_read_bridge_* calls ?
> > >>>
> > >>> Is there a reason why we don't/can't do it ? I noticed that on
> > >>> PROBE_ONLY systems on ARM/ARM64 at the moment we do not claim
> > >>> the bridge apertures and this is not correct, see below:
> > >>>
> > >>> [5.980127] pcieport 0000:00:02.1: can't enable device: BAR 8
> > >>> [mem 0xbff00000 - 0xbfffffff] not claimed
> > >>> [5.988056] pcieport: probe of 0000:00:02.1 failed with error -22
> > >>>
> > >> Is this when trying my patches or with the current upstream code ?
> > >
> > > It is upstream code with a couple of ARM64 related patches not yet
> > > merged. Still, it shows an issue that must be tackled.
> > >
> > > It is not caused by your patches but it can be solved by them.
> > > On PROBE_ONLY systems, all resources must be claimed (since they
> > > are not reassigned, hence not claimed by the code that reassigns them),
> > > otherwise we can't enable a device resources (ie pcibios_enable_device
> > > calls pci_enable_resources that fails, since resources are not claimed).
> > >
> > > That's why we are suggesting claiming the bridge apertures as soon
> > > as they are read from the base registers, even on PROBE_ONLY systems.
> > >
> > > I think that's the only approach Bjorn would accept, otherwise
> > > we will have to fiddle with PROBE_ONLY on ARM64, and either avoid calling
> > > pci_enable_resources or avoid checking if a resource is claimed in
> > > pci_enable_resources, neither solution seems sane to me.
> > >
> > 
> > Looks like I'll need one of those arm64 systems at some point ;-).
> > 
> > Where is your patch in respect to acceptance ? Would it make sense to
> > merge it into my code and base my patch(es) on it, or do you expect
> > major changes which would make that difficult ?
> 
> I have a tweak to v1, I will post v2 next week and copy you in.

Thanks!

> Acceptance, I think it received review only from ARM guys/platforms
> so we are still far from merging it.
> 
I should be able to test it on powerpc (p2020/p5020) and x86.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck June 18, 2015, 6:01 p.m. UTC | #12
On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
> 
> > > I'd like res->flags to reflect the capabilities of the hardware, not
> > > whether the window is currently enabled.
> > > 
> > Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
> > purpose, or could that cause conflicts elsewhere ?
> 
> Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
> below a host bridge that doesn't support I/O space.
> 
I integrated Lorenzo's patch and tried to get this working.

Problem is that the use of a resource is widely checked with "!res->flags"
throughout the code. That would have to be changed to something like
"(!res->flags || (res->flags & IORESOURCE_DISABLED))" whereever it is used.

I tried going with "!res->flags" instead, but have not been able to get it
to work realiably; it is just very difficult to distinguish if "!res->flags"
means that the resource has not yet been assigned or if it means that it is not
supported.

The correct approach, in my opinion, would be to go with IORESOURCE_DISABLED
and make the necessary changes whereever needed. Effectively this means to
replace the "!res->flags" check with something like pci_res_used() [ pick
your preferred name ] and define it as

#define pci_res_used(res) ((res)->flags && !((res)->flags & IORESOURCE_DISABLED))

Is that the right direction ?

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 18, 2015, 7:51 p.m. UTC | #13
On Thu, Jun 18, 2015 at 1:01 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
>>
>> > > I'd like res->flags to reflect the capabilities of the hardware, not
>> > > whether the window is currently enabled.
>> > >
>> > Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
>> > purpose, or could that cause conflicts elsewhere ?
>>
>> Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
>> below a host bridge that doesn't support I/O space.
>>
> I integrated Lorenzo's patch and tried to get this working.
>
> Problem is that the use of a resource is widely checked with "!res->flags"
> throughout the code. That would have to be changed to something like
> "(!res->flags || (res->flags & IORESOURCE_DISABLED))" whereever it is used.
>
> I tried going with "!res->flags" instead, but have not been able to get it
> to work realiably; it is just very difficult to distinguish if "!res->flags"
> means that the resource has not yet been assigned or if it means that it is not
> supported.
>
> The correct approach, in my opinion, would be to go with IORESOURCE_DISABLED
> and make the necessary changes whereever needed. Effectively this means to
> replace the "!res->flags" check with something like pci_res_used() [ pick
> your preferred name ] and define it as
>
> #define pci_res_used(res) ((res)->flags && !((res)->flags & IORESOURCE_DISABLED))

I think that makes sense.  Maybe "res_valid()"?  It's not really
PCI-specific, and "used" is a little ambiguous.  So is "valid", I
admit.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck June 18, 2015, 8:53 p.m. UTC | #14
On Thu, Jun 18, 2015 at 02:51:52PM -0500, Bjorn Helgaas wrote:
> On Thu, Jun 18, 2015 at 1:01 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> > On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
> >>
> >> > > I'd like res->flags to reflect the capabilities of the hardware, not
> >> > > whether the window is currently enabled.
> >> > >
> >> > Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
> >> > purpose, or could that cause conflicts elsewhere ?
> >>
> >> Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
> >> below a host bridge that doesn't support I/O space.
> >>
> > I integrated Lorenzo's patch and tried to get this working.
> >
> > Problem is that the use of a resource is widely checked with "!res->flags"
> > throughout the code. That would have to be changed to something like
> > "(!res->flags || (res->flags & IORESOURCE_DISABLED))" whereever it is used.
> >
> > I tried going with "!res->flags" instead, but have not been able to get it
> > to work realiably; it is just very difficult to distinguish if "!res->flags"
> > means that the resource has not yet been assigned or if it means that it is not
> > supported.
> >
> > The correct approach, in my opinion, would be to go with IORESOURCE_DISABLED
> > and make the necessary changes whereever needed. Effectively this means to
> > replace the "!res->flags" check with something like pci_res_used() [ pick
> > your preferred name ] and define it as
> >
> > #define pci_res_used(res) ((res)->flags && !((res)->flags & IORESOURCE_DISABLED))
> 
> I think that makes sense.  Maybe "res_valid()"?  It's not really
> PCI-specific, and "used" is a little ambiguous.  So is "valid", I
> admit.
> 
res_valid() sounds good to me. It is also nice and short.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi June 19, 2015, 4:24 p.m. UTC | #15
Hi Guenter,

On Thu, Jun 18, 2015 at 07:01:03PM +0100, Guenter Roeck wrote:
> On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
> > 
> > > > I'd like res->flags to reflect the capabilities of the hardware, not
> > > > whether the window is currently enabled.
> > > > 
> > > Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
> > > purpose, or could that cause conflicts elsewhere ?
> > 
> > Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
> > below a host bridge that doesn't support I/O space.
> > 
> I integrated Lorenzo's patch and tried to get this working.

Thanks. How do you want to proceed with this ? Are you taking my patch
and post it along with your updated series ? We need to extend test
coverage to platforms we could not test on, as you know my series
affects all archs but SPARC (I mean it should *not* affect them, this
has to be tested though, I do not have the HW needed, your coverage
for x86 and PowerPC is great but I do not think it can be deemed
sufficient).

Please let me know, thanks !

Lorenzo

> Problem is that the use of a resource is widely checked with "!res->flags"
> throughout the code. That would have to be changed to something like
> "(!res->flags || (res->flags & IORESOURCE_DISABLED))" whereever it is used.
> 
> I tried going with "!res->flags" instead, but have not been able to get it
> to work realiably; it is just very difficult to distinguish if "!res->flags"
> means that the resource has not yet been assigned or if it means that it is not
> supported.
> 
> The correct approach, in my opinion, would be to go with IORESOURCE_DISABLED
> and make the necessary changes whereever needed. Effectively this means to
> replace the "!res->flags" check with something like pci_res_used() [ pick
> your preferred name ] and define it as
> 
> #define pci_res_used(res) ((res)->flags && !((res)->flags & IORESOURCE_DISABLED))
> 
> Is that the right direction ?
> 
> Thanks,
> Guenter
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
Benjamin Herrenschmidt June 23, 2015, 10:46 p.m. UTC | #16
On Tue, 2015-06-02 at 15:55 +0100, Lorenzo Pieralisi wrote:
> While at it, do you think it is reasonable to also claim the bridge
> windows (resources) in the respective pci_read_bridge_* calls ?

No, don't claim in read. There's a clear distinction between gathering
resources and claiming them, and we need to keep that.

Some fixups might happen in between the two for example.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 23, 2015, 11:02 p.m. UTC | #17
On Tue, Jun 23, 2015 at 5:46 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2015-06-02 at 15:55 +0100, Lorenzo Pieralisi wrote:
>> While at it, do you think it is reasonable to also claim the bridge
>> windows (resources) in the respective pci_read_bridge_* calls ?
>
> No, don't claim in read. There's a clear distinction between gathering
> resources and claiming them, and we need to keep that.
>
> Some fixups might happen in between the two for example.

Are there any existing fixups like that?  Concrete examples would help
figure out the best way forward.

Most arches call pci_read_bridge_bases() from pcibios_fixup_bus().  I
think that's a poor place to do it because it's code that normally
doesn't have to be arch-specific.  Resource claiming is also usually
done from arch code, and it shouldn't be arch-specific either.

If we move both the read and claim into generic code, then we might
need to make sure there's a fixup phase in between or something.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Herrenschmidt June 23, 2015, 11:14 p.m. UTC | #18
On Tue, 2015-06-23 at 18:02 -0500, Bjorn Helgaas wrote:
> On Tue, Jun 23, 2015 at 5:46 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Tue, 2015-06-02 at 15:55 +0100, Lorenzo Pieralisi wrote:
> >> While at it, do you think it is reasonable to also claim the bridge
> >> windows (resources) in the respective pci_read_bridge_* calls ?
> >
> > No, don't claim in read. There's a clear distinction between gathering
> > resources and claiming them, and we need to keep that.
> >
> > Some fixups might happen in between the two for example.
> 
> Are there any existing fixups like that?  Concrete examples would help
> figure out the best way forward.

Not off the top of my mind, it's been a long time since I wrote the
resource claiming stuff in arch/powerpc but it does make me nervous. We
collect resources when probing and we claim in the survey, those have
been historically very distinct steps.

> Most arches call pci_read_bridge_bases() from pcibios_fixup_bus().  I
> think that's a poor place to do it because it's code that normally
> doesn't have to be arch-specific.  Resource claiming is also usually
> done from arch code, and it shouldn't be arch-specific either.

Claiming as in putting in the resource tree etc... is different from
actually reading the values from the HW and is traditionally done much
later, no ?

> If we move both the read and claim into generic code, then we might
> need to make sure there's a fixup phase in between or something.

Well, then there's a more general argument to be made as to whether we
want the claiming to be "merged" as part of the probing/reading I
suppose...

Then there's also the case where everything gets fully reassigned, like
powernv, where the "read" phase is really only about sizing device
BARs...

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi June 25, 2015, 11:27 a.m. UTC | #19
On Wed, Jun 24, 2015 at 12:14:43AM +0100, Benjamin Herrenschmidt wrote:
> On Tue, 2015-06-23 at 18:02 -0500, Bjorn Helgaas wrote:
> > On Tue, Jun 23, 2015 at 5:46 PM, Benjamin Herrenschmidt
> > <benh@kernel.crashing.org> wrote:
> > > On Tue, 2015-06-02 at 15:55 +0100, Lorenzo Pieralisi wrote:
> > >> While at it, do you think it is reasonable to also claim the bridge
> > >> windows (resources) in the respective pci_read_bridge_* calls ?
> > >
> > > No, don't claim in read. There's a clear distinction between gathering
> > > resources and claiming them, and we need to keep that.
> > >
> > > Some fixups might happen in between the two for example.
> > 
> > Are there any existing fixups like that?  Concrete examples would help
> > figure out the best way forward.
> 
> Not off the top of my mind, it's been a long time since I wrote the
> resource claiming stuff in arch/powerpc but it does make me nervous. We
> collect resources when probing and we claim in the survey, those have
> been historically very distinct steps.

Yes, that makes me nervous too, that's why I posted my patch as an
RFC/RFT, I think there is little debate in moving
pci_read_bridge_bases() to core PCI, claiming the resources is a
different question though, and I can't have the overall picture
since it _seems_ arch specific (I know Bjorn does not agree with
this though - it might be due to platform specific quirks) even if
it should not.

> > Most arches call pci_read_bridge_bases() from pcibios_fixup_bus().  I
> > think that's a poor place to do it because it's code that normally
> > doesn't have to be arch-specific.  Resource claiming is also usually
> > done from arch code, and it shouldn't be arch-specific either.
> 
> Claiming as in putting in the resource tree etc... is different from
> actually reading the values from the HW and is traditionally done much
> later, no ?
> 
> > If we move both the read and claim into generic code, then we might
> > need to make sure there's a fixup phase in between or something.
> 
> Well, then there's a more general argument to be made as to whether we
> want the claiming to be "merged" as part of the probing/reading I
> suppose...

On PROBE_ONLY systems (that are the systems I really wanted to cover
by claiming as soon as pci_read_bridge_bases() is executed) I think
we all agree that merging the claiming/reading is sane (but I also
think that Bjorn is not happy with that :), I mean it should not
be PROBE_ONLY dependent).

> Then there's also the case where everything gets fully reassigned, like
> powernv, where the "read" phase is really only about sizing device
> BARs...

Exactly, there claiming right after reading should not be a problem
either. How do you want me to proceed ? Should I make bridge resources
claiming in PCI core PROBE_ONLY ? Or move it to ARM specific hooks ?

I will certainly move pci_read_bridge_bases() to core code since this
changes nothing to current behaviour and must be consolidated
regardless.

Thoughts appreciated.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi July 7, 2015, 2:40 p.m. UTC | #20
On Fri, Jun 19, 2015 at 05:24:13PM +0100, Lorenzo Pieralisi wrote:
> Hi Guenter,
> 
> On Thu, Jun 18, 2015 at 07:01:03PM +0100, Guenter Roeck wrote:
> > On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
> > > 
> > > > > I'd like res->flags to reflect the capabilities of the hardware, not
> > > > > whether the window is currently enabled.
> > > > > 
> > > > Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
> > > > purpose, or could that cause conflicts elsewhere ?
> > > 
> > > Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
> > > below a host bridge that doesn't support I/O space.
> > > 
> > I integrated Lorenzo's patch and tried to get this working.
> 
> Thanks. How do you want to proceed with this ? Are you taking my patch
> and post it along with your updated series ? We need to extend test
> coverage to platforms we could not test on, as you know my series
> affects all archs but SPARC (I mean it should *not* affect them, this
> has to be tested though, I do not have the HW needed, your coverage
> for x86 and PowerPC is great but I do not think it can be deemed
> sufficient).
> 
> Please let me know, thanks !

Any comment on this ? I will have to remove the bridge resource claiming
from my patch according to Ben's concerns for PowerPC, which requires
a v3.

How do you want me to go on with this ?

Thanks,
Lorenzo

> > Problem is that the use of a resource is widely checked with "!res->flags"
> > throughout the code. That would have to be changed to something like
> > "(!res->flags || (res->flags & IORESOURCE_DISABLED))" whereever it is used.
> > 
> > I tried going with "!res->flags" instead, but have not been able to get it
> > to work realiably; it is just very difficult to distinguish if "!res->flags"
> > means that the resource has not yet been assigned or if it means that it is not
> > supported.
> > 
> > The correct approach, in my opinion, would be to go with IORESOURCE_DISABLED
> > and make the necessary changes whereever needed. Effectively this means to
> > replace the "!res->flags" check with something like pci_res_used() [ pick
> > your preferred name ] and define it as
> > 
> > #define pci_res_used(res) ((res)->flags && !((res)->flags & IORESOURCE_DISABLED))
> > 
> > Is that the right direction ?
> > 
> > Thanks,
> > Guenter
> > 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck July 7, 2015, 3:01 p.m. UTC | #21
Hi Lorenzo,

On 07/07/2015 07:40 AM, Lorenzo Pieralisi wrote:
> On Fri, Jun 19, 2015 at 05:24:13PM +0100, Lorenzo Pieralisi wrote:
>> Hi Guenter,
>>
>> On Thu, Jun 18, 2015 at 07:01:03PM +0100, Guenter Roeck wrote:
>>> On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
>>>>
>>>>>> I'd like res->flags to reflect the capabilities of the hardware, not
>>>>>> whether the window is currently enabled.
>>>>>>
>>>>> Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
>>>>> purpose, or could that cause conflicts elsewhere ?
>>>>
>>>> Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
>>>> below a host bridge that doesn't support I/O space.
>>>>
>>> I integrated Lorenzo's patch and tried to get this working.
>>
>> Thanks. How do you want to proceed with this ? Are you taking my patch
>> and post it along with your updated series ? We need to extend test
>> coverage to platforms we could not test on, as you know my series
>> affects all archs but SPARC (I mean it should *not* affect them, this
>> has to be tested though, I do not have the HW needed, your coverage
>> for x86 and PowerPC is great but I do not think it can be deemed
>> sufficient).
>>
>> Please let me know, thanks !
>
> Any comment on this ? I will have to remove the bridge resource claiming
> from my patch according to Ben's concerns for PowerPC, which requires
> a v3.
>
> How do you want me to go on with this ?
>

Can you send your v3 ?

I didn't submit my latest version because I recalled Ben's objections,
and I never got around asking you if you plan to send a new version
of your patch.

I had to drop the idea of using IORESOURCE_DISABLED; pretty much all
kernel code uses the "!flags" test to identify unused resources.
I tried to change that, but just could not get it to work.
I ended up introducing a new bus flag instead, PCI_BUS_FLAGS_NO_IO,
which works quite nicely since it propagates to child buses.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi July 7, 2015, 5:28 p.m. UTC | #22
On Tue, Jul 07, 2015 at 04:01:45PM +0100, Guenter Roeck wrote:
> Hi Lorenzo,
> 
> On 07/07/2015 07:40 AM, Lorenzo Pieralisi wrote:
> > On Fri, Jun 19, 2015 at 05:24:13PM +0100, Lorenzo Pieralisi wrote:
> >> Hi Guenter,
> >>
> >> On Thu, Jun 18, 2015 at 07:01:03PM +0100, Guenter Roeck wrote:
> >>> On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
> >>>>
> >>>>>> I'd like res->flags to reflect the capabilities of the hardware, not
> >>>>>> whether the window is currently enabled.
> >>>>>>
> >>>>> Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
> >>>>> purpose, or could that cause conflicts elsewhere ?
> >>>>
> >>>> Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
> >>>> below a host bridge that doesn't support I/O space.
> >>>>
> >>> I integrated Lorenzo's patch and tried to get this working.
> >>
> >> Thanks. How do you want to proceed with this ? Are you taking my patch
> >> and post it along with your updated series ? We need to extend test
> >> coverage to platforms we could not test on, as you know my series
> >> affects all archs but SPARC (I mean it should *not* affect them, this
> >> has to be tested though, I do not have the HW needed, your coverage
> >> for x86 and PowerPC is great but I do not think it can be deemed
> >> sufficient).
> >>
> >> Please let me know, thanks !
> >
> > Any comment on this ? I will have to remove the bridge resource claiming
> > from my patch according to Ben's concerns for PowerPC, which requires
> > a v3.
> >
> > How do you want me to go on with this ?
> >
> 
> Can you send your v3 ?

Yes, I have to figure out though where I can claim bridge resources
on PROBE_ONLY arm/arm64 systems, which is proving interesting, anyway
I will send it out asap.

> I didn't submit my latest version because I recalled Ben's objections,
> and I never got around asking you if you plan to send a new version
> of your patch.

No worries, let's get this sorted.

> I had to drop the idea of using IORESOURCE_DISABLED; pretty much all
> kernel code uses the "!flags" test to identify unused resources.
> I tried to change that, but just could not get it to work.
> I ended up introducing a new bus flag instead, PCI_BUS_FLAGS_NO_IO,
> which works quite nicely since it propagates to child buses.

Ok, great, I can test it too.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck July 7, 2015, 6:13 p.m. UTC | #23
On 07/07/2015 10:28 AM, Lorenzo Pieralisi wrote:
> On Tue, Jul 07, 2015 at 04:01:45PM +0100, Guenter Roeck wrote:
>> Hi Lorenzo,
>>
>> On 07/07/2015 07:40 AM, Lorenzo Pieralisi wrote:
>>> On Fri, Jun 19, 2015 at 05:24:13PM +0100, Lorenzo Pieralisi wrote:
>>>> Hi Guenter,
>>>>
>>>> On Thu, Jun 18, 2015 at 07:01:03PM +0100, Guenter Roeck wrote:
>>>>> On Thu, May 28, 2015 at 07:41:12AM -0500, Bjorn Helgaas wrote:
>>>>>>
>>>>>>>> I'd like res->flags to reflect the capabilities of the hardware, not
>>>>>>>> whether the window is currently enabled.
>>>>>>>>
>>>>>>> Flag bits seem to be all taken. Could we use IORESOURCE_DISABLED for that
>>>>>>> purpose, or could that cause conflicts elsewhere ?
>>>>>>
>>>>>> Yes, I think IORESOURCE_DISABLED would be appropriate for any I/O windows
>>>>>> below a host bridge that doesn't support I/O space.
>>>>>>
>>>>> I integrated Lorenzo's patch and tried to get this working.
>>>>
>>>> Thanks. How do you want to proceed with this ? Are you taking my patch
>>>> and post it along with your updated series ? We need to extend test
>>>> coverage to platforms we could not test on, as you know my series
>>>> affects all archs but SPARC (I mean it should *not* affect them, this
>>>> has to be tested though, I do not have the HW needed, your coverage
>>>> for x86 and PowerPC is great but I do not think it can be deemed
>>>> sufficient).
>>>>
>>>> Please let me know, thanks !
>>>
>>> Any comment on this ? I will have to remove the bridge resource claiming
>>> from my patch according to Ben's concerns for PowerPC, which requires
>>> a v3.
>>>
>>> How do you want me to go on with this ?
>>>
>>
>> Can you send your v3 ?
>
> Yes, I have to figure out though where I can claim bridge resources
> on PROBE_ONLY arm/arm64 systems, which is proving interesting, anyway
> I will send it out asap.
>
>> I didn't submit my latest version because I recalled Ben's objections,
>> and I never got around asking you if you plan to send a new version
>> of your patch.
>
> No worries, let's get this sorted.
>
>> I had to drop the idea of using IORESOURCE_DISABLED; pretty much all
>> kernel code uses the "!flags" test to identify unused resources.
>> I tried to change that, but just could not get it to work.
>> I ended up introducing a new bus flag instead, PCI_BUS_FLAGS_NO_IO,
>> which works quite nicely since it propagates to child buses.
>
> Ok, great, I can test it too.
>

I just sent out the curent version of my patch as RFC to get some feedback.
I'll be out of the office for the next two weeks, so I won't be able
to do much if any testing during that time. So take time for your v3.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi July 8, 2015, 8:38 a.m. UTC | #24
On Wed, Jun 24, 2015 at 12:02:14AM +0100, Bjorn Helgaas wrote:
> On Tue, Jun 23, 2015 at 5:46 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Tue, 2015-06-02 at 15:55 +0100, Lorenzo Pieralisi wrote:
> >> While at it, do you think it is reasonable to also claim the bridge
> >> windows (resources) in the respective pci_read_bridge_* calls ?
> >
> > No, don't claim in read. There's a clear distinction between gathering
> > resources and claiming them, and we need to keep that.
> >
> > Some fixups might happen in between the two for example.
> 
> Are there any existing fixups like that?  Concrete examples would help
> figure out the best way forward.
> 
> Most arches call pci_read_bridge_bases() from pcibios_fixup_bus().  I
> think that's a poor place to do it because it's code that normally
> doesn't have to be arch-specific.  Resource claiming is also usually
> done from arch code, and it shouldn't be arch-specific either.
> 
> If we move both the read and claim into generic code, then we might
> need to make sure there's a fixup phase in between or something.

Yes, that's where I am at the moment. On arm/arm64 PROBE_ONLY systems, if
I can't claim bridge apertures upon pci_read_bridge_bases, I can't
claim device resources in pcibios_add_device() since the bridge apertures
have not been claimed at that point, hence resulting in failures.

Given current code I see the following options:

(1) Claim bridge resources in pci_read_bridge_bases()
(2) Claim bridge resources in pcibios_add_device() (but that's horrible,
    since it requires looking up device upstream bridge and claim its
    resources)
(3) Do not claim resources on PROBE_ONLY systems (that's what arm does at
    present) and do not enable resources in pcibios_enable_device
(4) Add an initcall to arm/arm64 that carries out a resource survey,
    that's what's done on powerPC and also x86 it seems
    (eg pcibios_init in arch/powerpc/kernel/pci_64.c)

Personally I think (1) is by far the cleanest solution, I understand
Ben's concern but we need a way forward.

I will have to revert to (3) unless we find another solution, I would
like to make progress on this since it became a blocking issue.

As I said before, I will move pci_read_bridge_bases() to generic code
regardless.

Comments appreciated.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6675a7a1b9fc..f4944ef45148 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -354,6 +354,20 @@  static void pci_read_bridge_io(struct pci_bus *child)
 	base = (io_base_lo & io_mask) << 8;
 	limit = (io_limit_lo & io_mask) << 8;
 
+	/* If necessary, check if the bridge supports an I/O aperture */
+	if (!io_base_lo && !io_limit_lo) {
+		u16 io;
+
+		if (!pci_parent_supports_io(child))
+			return;
+
+		pci_write_config_word(dev, PCI_IO_BASE, 0xe0f0);
+		pci_read_config_word(dev, PCI_IO_BASE, &io);
+		pci_write_config_word(dev, PCI_IO_BASE, 0x0);
+		if (!io)
+			return;
+	}
+
 	if ((io_base_lo & PCI_IO_RANGE_TYPE_MASK) == PCI_IO_RANGE_TYPE_32) {
 		u16 io_base_hi, io_limit_hi;
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4fd0cacf7ca0..963b31a109a9 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -750,12 +750,12 @@  static void pci_bridge_check_ranges(struct pci_bus *bus)
 	b_res[1].flags |= IORESOURCE_MEM;
 
 	pci_read_config_word(bridge, PCI_IO_BASE, &io);
-	if (!io) {
+	if (!io && pci_parent_supports_io(bus)) {
 		pci_write_config_word(bridge, PCI_IO_BASE, 0xe0f0);
 		pci_read_config_word(bridge, PCI_IO_BASE, &io);
 		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
 	}
-	if (io)
+	if (io && (io != 0x00f0 || pci_parent_supports_io(bus)))
 		b_res[0].flags |= IORESOURCE_IO;
 
 	/*  DECchip 21050 pass 2 errata: the bridge may miss an address
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 353db8dc4c6e..f3de9e24aab1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -489,6 +489,15 @@  static inline bool pci_is_root_bus(struct pci_bus *pbus)
 	return !(pbus->parent);
 }
 
+/*
+ * Returns true if the parent bus supports an I/O aperture.
+ */
+static inline bool pci_parent_supports_io(struct pci_bus *pbus)
+{
+	return pci_is_root_bus(pbus) || pci_is_root_bus(pbus->parent) ||
+	       (pbus->parent->resource[0]->flags & IORESOURCE_IO);
+}
+
 /**
  * pci_is_bridge - check if the PCI device is a bridge
  * @dev: PCI device