diff mbox

[v2,2/4] powerpc/powernv: Fix stale PE primary bus

Message ID 1454993431-17068-2-git-send-email-gwshan@linux.vnet.ibm.com (mailing list archive)
State Accepted
Headers show

Commit Message

Gavin Shan Feb. 9, 2016, 4:50 a.m. UTC
When PCI bus is unplugged during full hotplug for EEH recovery,
the platform PE instance (struct pnv_ioda_pe) isn't released and
it dereferences the stale PCI bus that has been released. It leads
to kernel crash when referring to the stale PCI bus.

This fixes the issue by correcting the PE's primary bus when it's
oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
be called by pcibios_fixup_bus().

Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |  1 +
 arch/powerpc/platforms/powernv/pci.c      | 20 ++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci.h      |  1 +
 3 files changed, 22 insertions(+)

Comments

Andrew Donnellan Feb. 12, 2016, 6:02 a.m. UTC | #1
On 09/02/16 15:50, Gavin Shan wrote:
> When PCI bus is unplugged during full hotplug for EEH recovery,
> the platform PE instance (struct pnv_ioda_pe) isn't released and
> it dereferences the stale PCI bus that has been released. It leads
> to kernel crash when referring to the stale PCI bus.
>
> This fixes the issue by correcting the PE's primary bus when it's
> oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
> be called by pcibios_fixup_bus().
>
> Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

I realise this has already been merged, but the following was found by 
Coverity:

> +void pnv_pci_dma_bus_setup(struct pci_bus *bus)
> +{
> +	struct pci_controller *hose = bus->sysdata;
> +	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_ioda_pe *pe;
> +
> +	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> +		if (!(pe->flags | (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
> +			continue;

This condition is always false. I think the first "|" is supposed to be "&".
Gavin Shan Feb. 12, 2016, 6:09 a.m. UTC | #2
On Fri, Feb 12, 2016 at 05:02:46PM +1100, Andrew Donnellan wrote:
>On 09/02/16 15:50, Gavin Shan wrote:
>>When PCI bus is unplugged during full hotplug for EEH recovery,
>>the platform PE instance (struct pnv_ioda_pe) isn't released and
>>it dereferences the stale PCI bus that has been released. It leads
>>to kernel crash when referring to the stale PCI bus.
>>
>>This fixes the issue by correcting the PE's primary bus when it's
>>oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
>>be called by pcibios_fixup_bus().
>>
>>Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
>>Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
>>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
>
>I realise this has already been merged, but the following was found by
>Coverity:
>
>>+void pnv_pci_dma_bus_setup(struct pci_bus *bus)
>>+{
>>+	struct pci_controller *hose = bus->sysdata;
>>+	struct pnv_phb *phb = hose->private_data;
>>+	struct pnv_ioda_pe *pe;
>>+
>>+	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>>+		if (!(pe->flags | (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
>>+			continue;
>
>This condition is always false. I think the first "|" is supposed to be "&".
>

Yeah, that should be "&". I think the problem isn't found when doing
testing in non-SRIOV environment. Thanks for pointing it out.

Michael, please let me know if I need send a follow-up revision to
correct this one? I found the first two patches have been put into
linux-next branch. I think we probably just need repost the correct
version for this one only.

Thanks,
Gavin

>-- 
>Andrew Donnellan              Software Engineer, OzLabs
>andrew.donnellan@au1.ibm.com  Australia Development Lab, Canberra
>+61 2 6201 8874 (work)        IBM Australia Limited
Michael Ellerman Feb. 15, 2016, 10:38 a.m. UTC | #3
On Fri, 2016-02-12 at 17:09 +1100, Gavin Shan wrote:

> On Fri, Feb 12, 2016 at 05:02:46PM +1100, Andrew Donnellan wrote:

> > On 09/02/16 15:50, Gavin Shan wrote:

> > > When PCI bus is unplugged during full hotplug for EEH recovery,
> > > the platform PE instance (struct pnv_ioda_pe) isn't released and
> > > it dereferences the stale PCI bus that has been released. It leads
> > > to kernel crash when referring to the stale PCI bus.
> > > 
> > > This fixes the issue by correcting the PE's primary bus when it's
> > > oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
> > > be called by pcibios_fixup_bus().
> > > 
> > > Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> > > Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
> > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> > > Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> > 
> > I realise this has already been merged, but the following was found by
> > Coverity:
> > 

> > > +void pnv_pci_dma_bus_setup(struct pci_bus *bus)
> > > +{
> > > +	struct pci_controller *hose = bus->sysdata;
> > > +	struct pnv_phb *phb = hose->private_data;
> > > +	struct pnv_ioda_pe *pe;
> > > +
> > > +	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> > > +		if (!(pe->flags | (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
> > > +			continue;
> > 
> > This condition is always false. I think the first "|" is supposed to be "&".
> > 
> 
> Yeah, that should be "&". I think the problem isn't found when doing
> testing in non-SRIOV environment. Thanks for pointing it out.
> 
> Michael, please let me know if I need send a follow-up revision to
> correct this one? I found the first two patches have been put into
> linux-next branch. I think we probably just need repost the correct
> version for this one only.

As it happens I needed to rebase my fixes branch anyway, so I've updated it and
rebased. No further action required :)

cheers
Gavin Shan Feb. 15, 2016, 10:43 p.m. UTC | #4
On Mon, Feb 15, 2016 at 09:38:35PM +1100, Michael Ellerman wrote:
>On Fri, 2016-02-12 at 17:09 +1100, Gavin Shan wrote:
>
>> On Fri, Feb 12, 2016 at 05:02:46PM +1100, Andrew Donnellan wrote:
>
>> > On 09/02/16 15:50, Gavin Shan wrote:
>
>> > > When PCI bus is unplugged during full hotplug for EEH recovery,
>> > > the platform PE instance (struct pnv_ioda_pe) isn't released and
>> > > it dereferences the stale PCI bus that has been released. It leads
>> > > to kernel crash when referring to the stale PCI bus.
>> > > 
>> > > This fixes the issue by correcting the PE's primary bus when it's
>> > > oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
>> > > be called by pcibios_fixup_bus().
>> > > 
>> > > Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
>> > > Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
>> > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> > > Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
>> > 
>> > I realise this has already been merged, but the following was found by
>> > Coverity:
>> > 
>
>> > > +void pnv_pci_dma_bus_setup(struct pci_bus *bus)
>> > > +{
>> > > +	struct pci_controller *hose = bus->sysdata;
>> > > +	struct pnv_phb *phb = hose->private_data;
>> > > +	struct pnv_ioda_pe *pe;
>> > > +
>> > > +	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>> > > +		if (!(pe->flags | (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
>> > > +			continue;
>> > 
>> > This condition is always false. I think the first "|" is supposed to be "&".
>> > 
>> 
>> Yeah, that should be "&". I think the problem isn't found when doing
>> testing in non-SRIOV environment. Thanks for pointing it out.
>> 
>> Michael, please let me know if I need send a follow-up revision to
>> correct this one? I found the first two patches have been put into
>> linux-next branch. I think we probably just need repost the correct
>> version for this one only.
>
>As it happens I needed to rebase my fixes branch anyway, so I've updated it and
>rebased. No further action required :)
>

Thanks Michael. Sorry for the rebase :-)

Thanks,
Gavin
diff mbox

Patch

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 573ae19..f90dc04 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3180,6 +3180,7 @@  static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
 
 static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
        .dma_dev_setup = pnv_pci_dma_dev_setup,
+       .dma_bus_setup = pnv_pci_dma_bus_setup,
 #ifdef CONFIG_PCI_MSI
        .setup_msi_irqs = pnv_setup_msi_irqs,
        .teardown_msi_irqs = pnv_teardown_msi_irqs,
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 8de0140..2441dec 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -757,6 +757,26 @@  void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 		phb->dma_dev_setup(phb, pdev);
 }
 
+void pnv_pci_dma_bus_setup(struct pci_bus *bus)
+{
+	struct pci_controller *hose = bus->sysdata;
+	struct pnv_phb *phb = hose->private_data;
+	struct pnv_ioda_pe *pe;
+
+	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
+		if (!(pe->flags | (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
+			continue;
+
+		if (!pe->pbus)
+			continue;
+
+		if (bus->number == ((pe->rid >> 8) & 0xFF)) {
+			pe->pbus = bus;
+			break;
+		}
+	}
+}
+
 void pnv_pci_shutdown(void)
 {
 	struct pci_controller *hose;
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 32cae3d..3f814f3 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -232,6 +232,7 @@  extern void pnv_pci_reset_secondary_bus(struct pci_dev *dev);
 extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);
 
 extern void pnv_pci_dma_dev_setup(struct pci_dev *pdev);
+extern void pnv_pci_dma_bus_setup(struct pci_bus *bus);
 extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
 extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);