diff mbox

[RFC] PCI-E broken on PPC (regression)

Message ID 4B5D9FC5.5070600@linux.vnet.ibm.com (mailing list archive)
State Superseded
Headers show

Commit Message

Breno Leitao Jan. 25, 2010, 1:42 p.m. UTC
Hello, 

I found that qlge is broken on PPC, and it got broken after commit 
06a1cbafb253c4c60d6a54a994887f5fbceabcc0. It happens because dev->pcie 
is not set on PPC, because the function set_pcie_port_type(), who sets
dev->pcie, is not being called on PPC PCI code.

So, I have two ideas to fix it, the first one is to call 
set_pcie_port_type() on pci_device_add() instead of pci_setup_device(). 
Since that PPC device flow calls pci_device_add(), it fixes the problem
without duplicating the caller for this function. 

OTOH, it's also possible to add set_pcie_port_type() on pci.h and call
it inside the PPC PCI files, specifically on of_create_pci_dev().

I tested both ideas and they work perfect, so I'd like to figure out 
which one is the most correct one. Both patches are attached here. 

Thanks, 
Breno

----- First idea -----

Comments

Jesse Barnes Jan. 25, 2010, 8:38 p.m. UTC | #1
On Mon, 25 Jan 2010 11:42:29 -0200
Breno Leitao <leitao@linux.vnet.ibm.com> wrote:

> Hello, 
> 
> I found that qlge is broken on PPC, and it got broken after commit 
> 06a1cbafb253c4c60d6a54a994887f5fbceabcc0. It happens because
> dev->pcie is not set on PPC, because the function
> set_pcie_port_type(), who sets dev->pcie, is not being called on PPC
> PCI code.

You mean dev->is_pcie?

Why isn't pci_scan_device calling pci_setup_device for you?  That
should do the proper PCIe init depending on the device, along with
extracting other device info...
Jesse Barnes Jan. 26, 2010, 1:50 a.m. UTC | #2
On Mon, 25 Jan 2010 12:38:49 -0800
Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> On Mon, 25 Jan 2010 11:42:29 -0200
> Breno Leitao <leitao@linux.vnet.ibm.com> wrote:
> 
> > Hello, 
> > 
> > I found that qlge is broken on PPC, and it got broken after commit 
> > 06a1cbafb253c4c60d6a54a994887f5fbceabcc0. It happens because
> > dev->pcie is not set on PPC, because the function
> > set_pcie_port_type(), who sets dev->pcie, is not being called on PPC
> > PCI code.
> 
> You mean dev->is_pcie?
> 
> Why isn't pci_scan_device calling pci_setup_device for you?  That
> should do the proper PCIe init depending on the device, along with
> extracting other device info...

Cc'ing Ben for PPC.  Ben, should PPC use pci_scan_device when probing
its root busses?  Sounds like it just uses pci_device_add for each one
it finds instead?

If you don't actually need scanning (though what about hotplug?) we can
move the call to device_add instead...
Benjamin Herrenschmidt Jan. 26, 2010, 2:59 a.m. UTC | #3
On Mon, 2010-01-25 at 17:50 -0800, Jesse Barnes wrote:
> > You mean dev->is_pcie?
> > 
> > Why isn't pci_scan_device calling pci_setup_device for you?  That
> > should do the proper PCIe init depending on the device, along with
> > extracting other device info...
> 
> Cc'ing Ben for PPC.  Ben, should PPC use pci_scan_device when probing
> its root busses?  Sounds like it just uses pci_device_add for each one
> it finds instead?
> 
> If you don't actually need scanning (though what about hotplug?) we can
> move the call to device_add instead...

I need to give it more brain cells than I have available today :-) I'll
have a look tomorrow. In some case we build the PCI stuff up from the
firmware device-tree without actually doing any config space scanning,
so the root busses are treated a bit special.

Cheers,
Ben.
Kenji Kaneshige Jan. 26, 2010, 4:18 a.m. UTC | #4
Breno Leitao wrote:
> Hello, 
> 
> I found that qlge is broken on PPC, and it got broken after commit 
> 06a1cbafb253c4c60d6a54a994887f5fbceabcc0. It happens because dev->pcie 
> is not set on PPC, because the function set_pcie_port_type(), who sets
> dev->pcie, is not being called on PPC PCI code.

Hi Breno,

I'm sorry for the regression. I didn't realize that PPC uses open-coded
PCI scan.

I guess the problem is caused by dev->pcie_cap, right? But I don't
understand what is happening clearly. Could you send me the detailed
information about the problem?

> 
> So, I have two ideas to fix it, the first one is to call 
> set_pcie_port_type() on pci_device_add() instead of pci_setup_device(). 
> Since that PPC device flow calls pci_device_add(), it fixes the problem
> without duplicating the caller for this function. 
> 

The set_pcie_port_type() needs to be called before pci_fixup_device() in
pci_setup_device() because fixup code might refer dev->pcie_cap. So I
don't think the first one is a good idea.

> OTOH, it's also possible to add set_pcie_port_type() on pci.h and call
> it inside the PPC PCI files, specifically on of_create_pci_dev().
> 
> I tested both ideas and they work perfect, so I'd like to figure out 
> which one is the most correct one. Both patches are attached here. 

I think the second one is better. But to prevent similar problems, I
think it would be better to remove open-coded PCI scan in PPC in a
long term, though I don't know the background about that.

Thanks,
Kenji Kaneshige


> 
> Thanks, 
> Breno
> 
> ----- First idea -----
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 98ffb2d..328c3ab 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -746,7 +746,6 @@ int pci_setup_device(struct pci_dev *dev)
>  	dev->hdr_type = hdr_type & 0x7f;
>  	dev->multifunction = !!(hdr_type & 0x80);
>  	dev->error_state = pci_channel_io_normal;
> -	set_pcie_port_type(dev);
>  	set_pci_aer_firmware_first(dev);
>  
>  	list_for_each_entry(slot, &dev->bus->slots, list)
> @@ -1052,6 +1051,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  	/* Initialize various capabilities */
>  	pci_init_capabilities(dev);
>  
> +	set_pcie_port_type(dev);
>  	/*
>  	 * Add the device to our list of discovered devices
>  	 * and the bus list for fixup functions, etc.
> 
> ----- Second idea -----
> 
> diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
> index 7311fdf..f8820e8 100644
> --- a/arch/powerpc/kernel/pci_of_scan.c
> +++ b/arch/powerpc/kernel/pci_of_scan.c
> @@ -160,6 +160,7 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
>  	dev->error_state = pci_channel_io_normal;
>  	dev->dma_mask = 0xffffffff;
>  
> +	set_pcie_port_type(dev);
>  	if (!strcmp(type, "pci") || !strcmp(type, "pciex")) {
>  		/* a PCI-PCI bridge */
>  		dev->hdr_type = PCI_HEADER_TYPE_BRIDGE;
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 98ffb2d..f787eea 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -681,7 +681,7 @@ static void pci_read_irq(struct pci_dev *dev)
>  	dev->irq = irq;
>  }
>  
> -static void set_pcie_port_type(struct pci_dev *pdev)
> +void set_pcie_port_type(struct pci_dev *pdev)
>  {
>  	int pos;
>  	u16 reg16;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 174e539..765095b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1269,6 +1269,7 @@ int pcibios_add_platform_entries(struct pci_dev *dev);
>  void pcibios_disable_device(struct pci_dev *dev);
>  int pcibios_set_pcie_reset_state(struct pci_dev *dev,
>  				 enum pcie_reset_state state);
> +extern void set_pcie_port_type(struct pci_dev *pdev);
>  
>  #ifdef CONFIG_PCI_MMCONFIG
>  extern void __init pci_mmcfg_early_init(void);
> 
>
Kenji Kaneshige Jan. 26, 2010, 4:36 a.m. UTC | #5
Jesse Barnes wrote:
> On Mon, 25 Jan 2010 12:38:49 -0800
> Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> 
>> On Mon, 25 Jan 2010 11:42:29 -0200
>> Breno Leitao <leitao@linux.vnet.ibm.com> wrote:
>>
>>> Hello, 
>>>
>>> I found that qlge is broken on PPC, and it got broken after commit 
>>> 06a1cbafb253c4c60d6a54a994887f5fbceabcc0. It happens because
>>> dev->pcie is not set on PPC, because the function
>>> set_pcie_port_type(), who sets dev->pcie, is not being called on PPC
>>> PCI code.
>> You mean dev->is_pcie?
>>
>> Why isn't pci_scan_device calling pci_setup_device for you?  That
>> should do the proper PCIe init depending on the device, along with
>> extracting other device info...
> 
> Cc'ing Ben for PPC.  Ben, should PPC use pci_scan_device when probing
> its root busses?  Sounds like it just uses pci_device_add for each one
> it finds instead?
> 
> If you don't actually need scanning (though what about hotplug?) we can
> move the call to device_add instead...
> 

As I mentioned in the other e-mail, I think the set_pcie_port_type() needs
to be called before pci_fixup_device() call in the pci_setup_device()
because some fixup handler might refer the dev->is_pcie, dev->pcie_cap,
or dev->pcie_type.

Thanks,
Kenji Kaneshige
Benjamin Herrenschmidt Jan. 27, 2010, 2:10 a.m. UTC | #6
> Cc'ing Ben for PPC.  Ben, should PPC use pci_scan_device when probing
> its root busses?  Sounds like it just uses pci_device_add for each one
> it finds instead?
> 
> If you don't actually need scanning (though what about hotplug?) we can
> move the call to device_add instead...

Ok so I looked at the code and the problem goes way beyond root busses.

Basically, powerpc can use the code in arch/powerpc/kernel/pci_of_scan.c
to "generate" the pci_dev without using config space probing or at least
using as little of it as possible, using the firmware device-tree
information instead.

This is also probably going to be moved to a more generic place and
extended to be used optionally by other architectures.

I think sparc does something similar in fact in arch/sparc/kernel/pci.c
(of_create_pci_dev()) though it would be logical to have sparc and
powerpc share the same implementation here in the long run and I believe
Grant Likely is working on it.

That means that potentially, pci_dev will be created on those archs for
which pci_setup_device() is never called. Thus we need to be very
careful when adding initializations there that at least we (myself and
davem) are notified of that so we can mirror them in our code, or even
better, if people doing so put them there too...

So as far as I can tell, we are missing that set_pcie_port_type(), so we
need to add it to sparc and powerpc (and so make the function non-static
in drivers/pci/probe.c). We are also missing the manipulation of
dev->slot in fact, so that will need to be fixed too.

set_pcie_hotplug_bridge() might be something we want to add too, it's
not totally clear yet due to possible issues with our firmwares.
pci_fixup_device(pci_fixup_early,...) as well in fact.

I'll try do make ppc catch up with some of that see how it goes.

Cheers,
Ben.
Jesse Barnes Jan. 27, 2010, 4:26 p.m. UTC | #7
On Wed, 27 Jan 2010 13:10:56 +1100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> 
> > Cc'ing Ben for PPC.  Ben, should PPC use pci_scan_device when probing
> > its root busses?  Sounds like it just uses pci_device_add for each one
> > it finds instead?
> > 
> > If you don't actually need scanning (though what about hotplug?) we can
> > move the call to device_add instead...
> 
> Ok so I looked at the code and the problem goes way beyond root busses.
> 
> Basically, powerpc can use the code in arch/powerpc/kernel/pci_of_scan.c
> to "generate" the pci_dev without using config space probing or at least
> using as little of it as possible, using the firmware device-tree
> information instead.
> 
> This is also probably going to be moved to a more generic place and
> extended to be used optionally by other architectures.
> 
> I think sparc does something similar in fact in arch/sparc/kernel/pci.c
> (of_create_pci_dev()) though it would be logical to have sparc and
> powerpc share the same implementation here in the long run and I believe
> Grant Likely is working on it.
> 
> That means that potentially, pci_dev will be created on those archs for
> which pci_setup_device() is never called. Thus we need to be very
> careful when adding initializations there that at least we (myself and
> davem) are notified of that so we can mirror them in our code, or even
> better, if people doing so put them there too...
> 
> So as far as I can tell, we are missing that set_pcie_port_type(), so we
> need to add it to sparc and powerpc (and so make the function non-static
> in drivers/pci/probe.c). We are also missing the manipulation of
> dev->slot in fact, so that will need to be fixed too.
> 
> set_pcie_hotplug_bridge() might be something we want to add too, it's
> not totally clear yet due to possible issues with our firmwares.
> pci_fixup_device(pci_fixup_early,...) as well in fact.
> 
> I'll try do make ppc catch up with some of that see how it goes.

Thanks Ben.  Any refactoring we need to handle this stuff better is
fine with me too.  I guess on some platforms calling pci_setup_device
may cause problems with special platform devices?
Benjamin Herrenschmidt Jan. 27, 2010, 10 p.m. UTC | #8
On Wed, 2010-01-27 at 08:26 -0800, Jesse Barnes wrote:
> 
> Thanks Ben.  Any refactoring we need to handle this stuff better is
> fine with me too.  I guess on some platforms calling pci_setup_device
> may cause problems with special platform devices? 

Well, we don't call pci_setup_device() because part of the deal is to
avoid all of that config space reading that it does :-) Especially in
the case of some of the IBM EADS bridges which don't let you access
everything we may want.

Cheers,
Ben.
David Miller Jan. 28, 2010, 12:01 a.m. UTC | #9
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Thu, 28 Jan 2010 09:00:12 +1100

> On Wed, 2010-01-27 at 08:26 -0800, Jesse Barnes wrote:
>> 
>> Thanks Ben.  Any refactoring we need to handle this stuff better is
>> fine with me too.  I guess on some platforms calling pci_setup_device
>> may cause problems with special platform devices? 
> 
> Well, we don't call pci_setup_device() because part of the deal is to
> avoid all of that config space reading that it does :-) Especially in
> the case of some of the IBM EADS bridges which don't let you access
> everything we may want.

Same problem on sparc64, it's not safe to poke config space
arbitrarily.  Some PCI controllers even have bugs which cause them to
hang if you try to access some parts of the host controller's PCI
config space.
Jesse Barnes Jan. 28, 2010, 12:03 a.m. UTC | #10
On Wed, 27 Jan 2010 16:01:21 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Thu, 28 Jan 2010 09:00:12 +1100
> 
> > On Wed, 2010-01-27 at 08:26 -0800, Jesse Barnes wrote:
> >> 
> >> Thanks Ben.  Any refactoring we need to handle this stuff better is
> >> fine with me too.  I guess on some platforms calling pci_setup_device
> >> may cause problems with special platform devices? 
> > 
> > Well, we don't call pci_setup_device() because part of the deal is to
> > avoid all of that config space reading that it does :-) Especially in
> > the case of some of the IBM EADS bridges which don't let you access
> > everything we may want.
> 
> Same problem on sparc64, it's not safe to poke config space
> arbitrarily.  Some PCI controllers even have bugs which cause them to
> hang if you try to access some parts of the host controller's PCI
> config space.

Ok, that's what I thought.  We'll need to make these functions
available then...
Matthew Wilcox Jan. 29, 2010, 3:45 a.m. UTC | #11
On Wed, Jan 27, 2010 at 01:10:56PM +1100, Benjamin Herrenschmidt wrote:
> 
> > Cc'ing Ben for PPC.  Ben, should PPC use pci_scan_device when probing
> > its root busses?  Sounds like it just uses pci_device_add for each one
> > it finds instead?
> > 
> > If you don't actually need scanning (though what about hotplug?) we can
> > move the call to device_add instead...
> 
> Ok so I looked at the code and the problem goes way beyond root busses.
> 
> Basically, powerpc can use the code in arch/powerpc/kernel/pci_of_scan.c
> to "generate" the pci_dev without using config space probing or at least
> using as little of it as possible, using the firmware device-tree
> information instead.
> 
> This is also probably going to be moved to a more generic place and
> extended to be used optionally by other architectures.

Yes, having it under drivers/pci/ somewhere would be a big improvement,
that way we'd actually see it when trying to do cleanups and wouldn't
accidentally break your architectures.
Benjamin Herrenschmidt Jan. 29, 2010, 3:54 a.m. UTC | #12
On Thu, 2010-01-28 at 20:45 -0700, Matthew Wilcox wrote:
> > This is also probably going to be moved to a more generic place and
> > extended to be used optionally by other architectures.
> 
> Yes, having it under drivers/pci/ somewhere would be a big improvement,
> that way we'd actually see it when trying to do cleanups and wouldn't
> accidentally break your architectures. 

Yup, and you'll notice that I didn't complain about the breakage for
that precise reason :-)

Cheers,
Ben.
David Miller Feb. 18, 2010, 12:22 a.m. UTC | #13
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Wed, 27 Jan 2010 13:10:56 +1100

> I'll try do make ppc catch up with some of that see how it goes.

FWIW I'm taking care of syncing up sparc in this area
right now.

I just noticed the ->dma_mask assignment got moved as well.

Really, any change made to drivers/pci/probe.c is going to
require powerpc and sparc changes these days.

We should and will commonize our OpenFirmware PCI device probing code
under driver/pci but until that happens a real effort needs to be put
in place to at least ping Benjamin and myself when changes are going
in to drivers/pci/probe.c

Thanks.
Jesse Barnes Feb. 18, 2010, 12:29 a.m. UTC | #14
On Wed, 17 Feb 2010 16:22:59 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Wed, 27 Jan 2010 13:10:56 +1100
> 
> > I'll try do make ppc catch up with some of that see how it goes.
> 
> FWIW I'm taking care of syncing up sparc in this area
> right now.
> 
> I just noticed the ->dma_mask assignment got moved as well.
> 
> Really, any change made to drivers/pci/probe.c is going to
> require powerpc and sparc changes these days.
> 
> We should and will commonize our OpenFirmware PCI device probing code
> under driver/pci but until that happens a real effort needs to be put
> in place to at least ping Benjamin and myself when changes are going
> in to drivers/pci/probe.c

Ok I'll include you guys on further changes.
diff mbox

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 98ffb2d..328c3ab 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -746,7 +746,6 @@  int pci_setup_device(struct pci_dev *dev)
 	dev->hdr_type = hdr_type & 0x7f;
 	dev->multifunction = !!(hdr_type & 0x80);
 	dev->error_state = pci_channel_io_normal;
-	set_pcie_port_type(dev);
 	set_pci_aer_firmware_first(dev);
 
 	list_for_each_entry(slot, &dev->bus->slots, list)
@@ -1052,6 +1051,7 @@  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 	/* Initialize various capabilities */
 	pci_init_capabilities(dev);
 
+	set_pcie_port_type(dev);
 	/*
 	 * Add the device to our list of discovered devices
 	 * and the bus list for fixup functions, etc.

----- Second idea -----

diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index 7311fdf..f8820e8 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -160,6 +160,7 @@  struct pci_dev *of_create_pci_dev(struct device_node *node,
 	dev->error_state = pci_channel_io_normal;
 	dev->dma_mask = 0xffffffff;
 
+	set_pcie_port_type(dev);
 	if (!strcmp(type, "pci") || !strcmp(type, "pciex")) {
 		/* a PCI-PCI bridge */
 		dev->hdr_type = PCI_HEADER_TYPE_BRIDGE;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 98ffb2d..f787eea 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -681,7 +681,7 @@  static void pci_read_irq(struct pci_dev *dev)
 	dev->irq = irq;
 }
 
-static void set_pcie_port_type(struct pci_dev *pdev)
+void set_pcie_port_type(struct pci_dev *pdev)
 {
 	int pos;
 	u16 reg16;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 174e539..765095b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1269,6 +1269,7 @@  int pcibios_add_platform_entries(struct pci_dev *dev);
 void pcibios_disable_device(struct pci_dev *dev);
 int pcibios_set_pcie_reset_state(struct pci_dev *dev,
 				 enum pcie_reset_state state);
+extern void set_pcie_port_type(struct pci_dev *pdev);
 
 #ifdef CONFIG_PCI_MMCONFIG
 extern void __init pci_mmcfg_early_init(void);