Patchwork eeh: Fixing a bug when pci structure is null

Submitter Breno Leitao
Date Feb. 3, 2010, 3:56 p.m.
Message ID <4B699CB9.3030605@linux.vnet.ibm.com>
Permalink /patch/44388/
State Accepted, archived
Commit 8d3d50bf1913561ef3b1f5b53115c5a481ba9b1e
Delegated to: Benjamin Herrenschmidt

Comments

Breno Leitao - Feb. 3, 2010, 3:56 p.m.
During EEH recovery, the pci_dev structure can be NULL, mainly if an
EEH event is detected during a PCI config operation. In this case the
pci_dev is not known (it is NULL), and the kernel crashes with the
following message:

Unable to handle kernel paging request for data at address 0x000000a0
Faulting instruction address: 0xc00000000006b8b4
Oops: Kernel access of bad area, sig: 11 [#1]

NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0
LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
Call Trace:
[c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
[c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70

The bug occurs because pci_name() dereferences a NULL pointer.
This patch guarantees that pci_name() is not called on NULL pointers.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
---
 arch/powerpc/include/asm/ppc-pci.h          |    5 +++++
 arch/powerpc/platforms/pseries/eeh.c        |    4 ++--
 arch/powerpc/platforms/pseries/eeh_driver.c |    4 ++--
 arch/powerpc/platforms/pseries/eeh_event.c  |    2 +-
 4 files changed, 10 insertions(+), 5 deletions(-)
Breno Leitao - Feb. 19, 2010, 4:43 p.m.
Hi Ben, 

I'd like to ask about this patch ? Should I re-submit ?

Thanks, 

Breno Leitao wrote:
> During EEH recovery, the pci_dev structure can be NULL, mainly if an
> EEH event is detected during a PCI config operation. In this case the
> pci_dev is not known (it is NULL), and the kernel crashes with the
> following message:
Linas Vepstas - Feb. 19, 2010, 5:05 p.m.
Hi Paul, Breno,

Some confusion -- I've been out of the loop for a while -- I assume
it's still Paul who is pushing these patches upstream, and not Ben?
So Breno, maybe you should resend the patch to Paul?

--linas

On 19 February 2010 10:43, Breno Leitao <leitao@linux.vnet.ibm.com> wrote:
> Hi Ben,
>
> I'd like to ask about this patch ? Should I re-submit ?
>
> Thanks,
>
> Breno Leitao wrote:
>> During EEH recovery, the pci_dev structure can be NULL, mainly if an
>> EEH event is detected during a PCI config operation. In this case the
>> pci_dev is not known (it is NULL), and the kernel crashes with the
>> following message:
>
Benjamin Herrenschmidt - Feb. 19, 2010, 9:54 p.m.
On Fri, 2010-02-19 at 14:43 -0200, Breno Leitao wrote:
> Hi Ben, 
> 
> I'd like to ask about this patch ? Should I re-submit ?
> 
> Thanks, 
> 
> Breno Leitao wrote:
> > During EEH recovery, the pci_dev structure can be NULL, mainly if an
> > EEH event is detected during a PCI config operation. In this case the
> > pci_dev is not known (it is NULL), and the kernel crashes with the
> > following message:

It should be in -next, can you dbl check ?

Cheers,
Ben.
Benjamin Herrenschmidt - Feb. 19, 2010, 9:55 p.m.
On Fri, 2010-02-19 at 11:05 -0600, Linas Vepstas wrote:
> 
> Some confusion -- I've been out of the loop for a while -- I assume
> it's still Paul who is pushing these patches upstream, and not Ben?
> So Breno, maybe you should resend the patch to Paul?

No, it's me.

Cheers,
Ben.
Mike Mason - Feb. 24, 2010, 10:13 p.m.
On 2/19/2010 1:54 PM, Benjamin Herrenschmidt wrote:
> On Fri, 2010-02-19 at 14:43 -0200, Breno Leitao wrote:
>> Hi Ben,
>>
>> I'd like to ask about this patch ? Should I re-submit ?
>>
>> Thanks,
>>
>> Breno Leitao wrote:
>>> During EEH recovery, the pci_dev structure can be NULL, mainly if an
>>> EEH event is detected during a PCI config operation. In this case the
>>> pci_dev is not known (it is NULL), and the kernel crashes with the
>>> following message:
>
> It should be in -next, can you dbl check ?

I just confirmed the patch is in the -next tree.

Mike

>
> Cheers,
> Ben.

Patch

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 2828f9d..42fdff0 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -137,6 +137,11 @@  struct device_node * find_device_pe(struct device_node *dn);
 void eeh_sysfs_add_device(struct pci_dev *pdev);
 void eeh_sysfs_remove_device(struct pci_dev *pdev);
 
+static inline const char *eeh_pci_name(struct pci_dev *pdev) 
+{ 
+	return pdev ? pci_name(pdev) : "<null>";
+} 
+
 #endif /* CONFIG_EEH */
 
 #else /* CONFIG_PCI */
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index ccd8dd0..3304f32 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -491,7 +491,7 @@  int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 	    pdn->eeh_mode & EEH_MODE_NOCHECK) {
 		ignored_check++;
 		pr_debug("EEH: Ignored check (%x) for %s %s\n",
-			 pdn->eeh_mode, pci_name (dev), dn->full_name);
+			 pdn->eeh_mode, eeh_pci_name(dev), dn->full_name);
 		return 0;
 	}
 
@@ -515,7 +515,7 @@  int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 			printk (KERN_ERR "EEH: %d reads ignored for recovering device at "
 				"location=%s driver=%s pci addr=%s\n",
 				pdn->eeh_check_count, location,
-				dev->driver->name, pci_name(dev));
+				dev->driver->name, eeh_pci_name(dev));
 			printk (KERN_ERR "EEH: Might be infinite loop in %s driver\n",
 				dev->driver->name);
 			dump_stack();
diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index ef8e454..977d87d 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -337,7 +337,7 @@  struct pci_dn * handle_eeh_events (struct eeh_event *event)
 		location = location ? location : "unknown";
 		printk(KERN_ERR "EEH: Error: Cannot find partition endpoint "
 		                "for location=%s pci addr=%s\n",
-		        location, pci_name(event->dev));
+		        location, eeh_pci_name(event->dev));
 		return NULL;
 	}
 
@@ -368,7 +368,7 @@  struct pci_dn * handle_eeh_events (struct eeh_event *event)
 		pci_str = pci_name (frozen_pdn->pcidev);
 		drv_str = pcid_name (frozen_pdn->pcidev);
 	} else {
-		pci_str = pci_name (event->dev);
+		pci_str = eeh_pci_name(event->dev);
 		drv_str = pcid_name (event->dev);
 	}
 	
diff --git a/arch/powerpc/platforms/pseries/eeh_event.c b/arch/powerpc/platforms/pseries/eeh_event.c
index ddb80f5..ec5df8f 100644
--- a/arch/powerpc/platforms/pseries/eeh_event.c
+++ b/arch/powerpc/platforms/pseries/eeh_event.c
@@ -80,7 +80,7 @@  static int eeh_event_handler(void * dummy)
 	eeh_mark_slot(event->dn, EEH_MODE_RECOVERING);
 
 	printk(KERN_INFO "EEH: Detected PCI bus error on device %s\n",
-	       pci_name(event->dev));
+	       eeh_pci_name(event->dev));
 
 	pdn = handle_eeh_events(event);