Message ID | 20171122054615.18092-1-vaibhav@linux.vnet.ibm.com |
---|---|
State | Superseded |
Series | cxl: Check if vphb exists before iterating over AFU devices |
On 22/11/17 16:46, Vaibhav Jain wrote:
> During an EEH a kernel-oops is reported if no vPHB is allocated to the
> AFU. This happens because, during AFU init, a failure to create the
> vPHB is a non-fatal error. Hence afu->phb should always be checked for
> NULL before iterating over it for the virtual AFU pci devices.
>
> This patch fixes the kernel-oops by adding a NULL pointer check for
> afu->phb before it is dereferenced.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

Looks to me like we might need the same fix in
cxl_vphb_error_detected()? It's called twice in
cxl_pci_error_detected(), and in only one of those cases is it
surrounded by an afu->phb NULL check.

> ---
>  drivers/misc/cxl/pci.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
> index bb7fd3f4edab..80ac40cdc31b 100644
> --- a/drivers/misc/cxl/pci.c
> +++ b/drivers/misc/cxl/pci.c
> @@ -2265,6 +2265,9 @@ static pci_ers_result_t cxl_pci_slot_reset(struct pci_dev *pdev)
>  		if (cxl_afu_select_best_mode(afu))
>  			goto err;
>
> +		if (afu->phb == NULL)
> +			continue;
> +
>  		list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
>  			/* Reset the device context.
>  			 * TODO: make this less disruptive
> @@ -2327,6 +2330,9 @@ static void cxl_pci_resume(struct pci_dev *pdev)
>  	for (i = 0; i < adapter->slices; i++) {
>  		afu = adapter->afu[i];
>
> +		if (afu->phb == NULL)
> +			continue;
> +
>  		list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
>  			if (afu_dev->driver && afu_dev->driver->err_handler &&
>  			    afu_dev->driver->err_handler->resume)
>
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:

> Looks to me like we might need the same fix in
> cxl_vphb_error_detected()? It's called twice in
> cxl_pci_error_detected(), and in only one of those cases is it
> surrounded by an afu->phb NULL check.

Thanks for catching this. Will send a v2 with update.
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index bb7fd3f4edab..80ac40cdc31b 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -2265,6 +2265,9 @@ static pci_ers_result_t cxl_pci_slot_reset(struct pci_dev *pdev)
 		if (cxl_afu_select_best_mode(afu))
 			goto err;
 
+		if (afu->phb == NULL)
+			continue;
+
 		list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
 			/* Reset the device context.
 			 * TODO: make this less disruptive
@@ -2327,6 +2330,9 @@ static void cxl_pci_resume(struct pci_dev *pdev)
 	for (i = 0; i < adapter->slices; i++) {
 		afu = adapter->afu[i];
 
+		if (afu->phb == NULL)
+			continue;
+
 		list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
 			if (afu_dev->driver && afu_dev->driver->err_handler &&
 			    afu_dev->driver->err_handler->resume)
During an EEH a kernel-oops is reported if no vPHB is allocated to the
AFU. This happens because, during AFU init, a failure to create the
vPHB is a non-fatal error. Hence afu->phb should always be checked for
NULL before iterating over it for the virtual AFU pci devices.

This patch fixes the kernel-oops by adding a NULL pointer check for
afu->phb before it is dereferenced.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
---
 drivers/misc/cxl/pci.c | 6 ++++++
 1 file changed, 6 insertions(+)
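For readers who want to try the guard outside the kernel tree, here is a minimal userspace sketch of the pattern described in the commit message: skip any AFU whose phb pointer is NULL before walking its device list. All of the demo_* types and the demo_resume() function are hypothetical stand-ins invented for illustration; they are not the cxl driver's structures, and a plain linked list replaces list_for_each_entry().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures; not the real cxl types. */
struct demo_dev {
	int resumed;
	struct demo_dev *next;
};

struct demo_phb {
	struct demo_dev *devices;	/* singly linked list of virtual devices */
};

struct demo_afu {
	struct demo_phb *phb;		/* NULL when vPHB creation failed at init */
};

#define DEMO_MAX_SLICES 4

struct demo_adapter {
	int slices;
	struct demo_afu *afu[DEMO_MAX_SLICES];
};

/*
 * Mark every virtual device as resumed, skipping AFUs that have no vPHB.
 * Returns the number of devices visited.
 */
static int demo_resume(struct demo_adapter *adapter)
{
	int i, visited = 0;

	assert(adapter != NULL);

	for (i = 0; i < adapter->slices; i++) {
		struct demo_afu *afu = adapter->afu[i];
		struct demo_dev *dev;

		/* Guard: nothing to iterate if the vPHB was never created. */
		if (afu->phb == NULL)
			continue;

		for (dev = afu->phb->devices; dev != NULL; dev = dev->next) {
			dev->resumed = 1;
			visited++;
		}
	}

	return visited;
}
```

Note that the guard uses continue, not break or an error return, so one AFU without a vPHB does not stop the remaining slices from being processed.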