[v4] sparc/PCI: Fix for panic while enabling SR-IOV

Message ID 1458849742-162399-1-git-send-email-babu.moger@oracle.com
State Accepted
Delegated to: David Miller

Commit Message

Babu Moger March 24, 2016, 8:02 p.m. UTC
We noticed this panic while enabling SR-IOV on sparc.

mlx4_core: Mellanox ConnectX core driver v2.2-1 (Jan  1 2015)
mlx4_core: Initializing 0007:01:00.0
mlx4_core 0007:01:00.0: Enabling SR-IOV with 5 VFs
mlx4_core: Initializing 0007:01:00.1
Unable to handle kernel NULL pointer dereference
insmod(10010): Oops [#1]
CPU: 391 PID: 10010 Comm: insmod Not tainted
		4.1.12-32.el6uek.kdump2.sparc64 #1
TPC: <dma_supported+0x20/0x80>
I7: <__mlx4_init_one+0x324/0x500 [mlx4_core]>
Call Trace:
 [00000000104c5ea4] __mlx4_init_one+0x324/0x500 [mlx4_core]
 [00000000104c613c] mlx4_init_one+0xbc/0x120 [mlx4_core]
 [0000000000725f14] local_pci_probe+0x34/0xa0
 [0000000000726028] pci_call_probe+0xa8/0xe0
 [0000000000726310] pci_device_probe+0x50/0x80
 [000000000079f700] really_probe+0x140/0x420
 [000000000079fa24] driver_probe_device+0x44/0xa0
 [000000000079fb5c] __device_attach+0x3c/0x60
 [000000000079d85c] bus_for_each_drv+0x5c/0xa0
 [000000000079f588] device_attach+0x88/0xc0
 [000000000071acd0] pci_bus_add_device+0x30/0x80
 [0000000000736090] virtfn_add.clone.1+0x210/0x360
 [00000000007364a4] sriov_enable+0x2c4/0x520
 [000000000073672c] pci_enable_sriov+0x2c/0x40
 [00000000104c2d58] mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
 [00000000104c49ac] mlx4_load_one+0x42c/0xd40 [mlx4_core]
Disabling lock debugging due to kernel taint
Caller[00000000104c5ea4]: __mlx4_init_one+0x324/0x500 [mlx4_core]
Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
Caller[0000000000726310]: pci_device_probe+0x50/0x80
Caller[000000000079f700]: really_probe+0x140/0x420
Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
Caller[000000000079fb5c]: __device_attach+0x3c/0x60
Caller[000000000079d85c]: bus_for_each_drv+0x5c/0xa0
Caller[000000000079f588]: device_attach+0x88/0xc0
Caller[000000000071acd0]: pci_bus_add_device+0x30/0x80
Caller[0000000000736090]: virtfn_add.clone.1+0x210/0x360
Caller[00000000007364a4]: sriov_enable+0x2c4/0x520
Caller[000000000073672c]: pci_enable_sriov+0x2c/0x40
Caller[00000000104c2d58]: mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
Caller[00000000104c49ac]: mlx4_load_one+0x42c/0xd40 [mlx4_core]
Caller[00000000104c5f90]: __mlx4_init_one+0x410/0x500 [mlx4_core]
Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
Caller[0000000000726310]: pci_device_probe+0x50/0x80
Caller[000000000079f700]: really_probe+0x140/0x420
Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
Caller[000000000079fb08]: __driver_attach+0x88/0xa0
Caller[000000000079d90c]: bus_for_each_dev+0x6c/0xa0
Caller[000000000079f29c]: driver_attach+0x1c/0x40
Caller[000000000079e35c]: bus_add_driver+0x17c/0x220
Caller[00000000007a02d4]: driver_register+0x74/0x120
Caller[00000000007263fc]: __pci_register_driver+0x3c/0x60
Caller[00000000104f62bc]: mlx4_init+0x60/0xcc [mlx4_core]
Kernel panic - not syncing: Fatal exception
Press Stop-A (L1-A) to return to the boot prom
---[ end Kernel panic - not syncing: Fatal exception

Details:
Here is the call sequence:
virtfn_add -> __mlx4_init_one -> dma_set_mask -> dma_supported

The panic happened at line 760 of arch/sparc/kernel/iommu.c:

758 int dma_supported(struct device *dev, u64 device_mask)
759 {
760         struct iommu *iommu = dev->archdata.iommu;
761         u64 dma_addr_mask = iommu->dma_addr_mask;
762
763         if (device_mask >= (1UL << 32UL))
764                 return 0;
765
766         if ((device_mask & dma_addr_mask) == dma_addr_mask)
767                 return 1;
768
769 #ifdef CONFIG_PCI
770         if (dev_is_pci(dev))
771                 return pci64_dma_supported(to_pci_dev(dev), device_mask);
772 #endif
773
774         return 0;
775 }
776 EXPORT_SYMBOL(dma_supported);

The same panic also happens with the Intel ixgbe driver.

The SR-IOV code relies on arch-specific data while enabling VFs.
When a VF device is added, the driver probe function makes a set
of calls to initialize the PCI device. Because a VF device is
added differently from a normal PF device (which on sparc is
created via of_create_pci_dev), some of the arch-specific
initialization does not happen for the VF device. The VF's archdata
is therefore left unset, which causes the panic when it is accessed.

To fix this, implement the already-defined weak function
pcibios_add_device to copy archdata from the PF to the VF.
The fix has been verified.
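
The failure and the fix described above can be modeled in user space.
The following is only a sketch: every name prefixed with "model_" is
hypothetical, the structures are reduced stand-ins for the kernel's
pci_dev, dev_archdata and iommu, and the NULL case returns -1 where the
real dma_supported() would oops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-ins for the kernel structures (hypothetical names). */
struct model_iommu {
	unsigned long dma_addr_mask;
};

struct model_archdata {
	struct model_iommu *iommu;
	int numa_node;
};

struct model_pci_dev {
	struct model_archdata archdata;
	int is_virtfn;
	struct model_pci_dev *physfn;	/* parent PF, set only for a VF */
};

/* Same shape as dma_supported(): fetch archdata.iommu, then use it.
 * Returns -1 for a NULL iommu, the point where the real code faults.
 */
static int model_dma_supported(const struct model_pci_dev *dev,
			       unsigned long device_mask)
{
	const struct model_iommu *iommu = dev->archdata.iommu;

	if (!iommu)
		return -1;
	return (device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask;
}

/* Models the fix: a VF takes a copy of its PF's archdata. */
static int model_add_device(struct model_pci_dev *dev)
{
	if (dev->is_virtfn && dev->physfn)
		memcpy(&dev->archdata, &dev->physfn->archdata,
		       sizeof(dev->archdata));
	return 0;
}

/* Walks through the sequence: VF unusable before the copy, usable after. */
static void model_demo(void)
{
	struct model_iommu iommu = { .dma_addr_mask = 0xffffffffUL };
	struct model_pci_dev pf = {
		.archdata = { .iommu = &iommu, .numa_node = 0 },
	};
	struct model_pci_dev vf = { .is_virtfn = 1, .physfn = &pf };

	assert(model_dma_supported(&vf, 0xffffffffUL) == -1);
	model_add_device(&vf);
	assert(model_dma_supported(&vf, 0xffffffffUL) == 1);
}
```

Calling model_demo() from a test harness exercises both states. The
copy is shallow, so the VF ends up sharing the PF's pointers rather
than owning separate objects.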

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
---
v2:
 Removed RFC.
 Made changes per comments from Ethan Zhao.
 The changes are now only in sparc-specific code.
 Removed the changes from drivers/pci.
 Implemented the already-defined weak function pcibios_add_device
 in arch/sparc/kernel/pci.c to initialize the SR-IOV archdata.

v3:
 Fixed the compile error reported by the kbuild test robot.

v4:
 Fixed indentation per comments from David Miller.

 arch/sparc/kernel/pci.c |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

Comments

David Miller March 30, 2016, 12:57 a.m. UTC | #1
From: Babu Moger <babu.moger@oracle.com>
Date: Thu, 24 Mar 2016 13:02:22 -0700

> We noticed this panic while enabling SR-IOV in sparc.
 ...
> SR-IOV code looks for arch specific data while enabling
> VFs. When VF device is added, driver probe function makes set
> of calls to initialize the pci device. Because the VF device is
> added different way than the normal PF device(which happens via
> of_create_pci_dev for sparc), some of the arch specific initialization
> does not happen for VF device.  That causes panic when archdata is
> accessed.
> 
> To fix this, I have used already defined weak function
> pcibios_setup_device to copy archdata from PF to VF.
> Also verified the fix.
> 
> Signed-off-by: Babu Moger <babu.moger@oracle.com>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>

Looks good, applied and queued up for -stable, thanks.

Just a note, I am assuming that the VFs are not instantiated in the
device tree.  Because when you just memcpy the arch data over from the
PF, one thing we end up doing is using the device node of the PF.

I slightly cringed at the memcpy, because at least one of these
pointers are to objects which are reference counted, the OF device.

Generally speaking we don't really support hot-plug for OF probed
devices, but if we did all of the device tree pointers have to be
refcounted properly.

So in the long term that whole sequence where we go:

	struct dev_archdata *sd;
 ...
	sd = &dev->dev.archdata;
	sd->iommu = pbm->iommu;
	sd->stc = &pbm->stc;
	sd->host_controller = pbm;
	sd->op = op = of_find_device_by_node(node);
	sd->numa_node = pbm->numa_node;

should be encapsulated into a helper function, and both
of_create_pci_dev() and this new pcibios_setup_device() can
invoke it.
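
A user-space sketch of what such a helper could look like is given
below. It is an illustration under assumptions, not kernel code: all
"model_" names are hypothetical, the structures are reduced stand-ins,
and a plain integer counter stands in for the OF device's reference
count. The point it shows is that each archdata holding the OF device
pointer takes its own reference instead of sharing one through memcpy.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel objects (hypothetical names). */
struct model_of_device {
	int refcount;
};

struct model_strbuf {
	int enabled;
};

struct model_pbm {
	void *iommu;
	struct model_strbuf stc;
	int numa_node;
};

struct model_archdata {
	void *iommu;
	struct model_strbuf *stc;
	struct model_pbm *host_controller;
	struct model_of_device *op;
	int numa_node;
};

static struct model_of_device *model_of_dev_get(struct model_of_device *op)
{
	if (op)
		op->refcount++;
	return op;
}

static void model_of_dev_put(struct model_of_device *op)
{
	if (op)
		op->refcount--;
}

/* Hypothetical shared helper: fills archdata from the host controller
 * and takes a reference on the OF device for this archdata.
 */
static void model_init_archdata(struct model_archdata *sd,
				struct model_pbm *pbm,
				struct model_of_device *op)
{
	sd->iommu = pbm->iommu;
	sd->stc = &pbm->stc;
	sd->host_controller = pbm;
	sd->op = model_of_dev_get(op);
	sd->numa_node = pbm->numa_node;
}

/* Counterpart for device removal: drops the reference taken above. */
static void model_release_archdata(struct model_archdata *sd)
{
	model_of_dev_put(sd->op);
	sd->op = NULL;
}

/* PF and VF both go through the helper; the VF reuses the PF's OF device. */
static void model_demo(void)
{
	struct model_of_device op = { .refcount = 1 };
	struct model_pbm pbm = { .numa_node = 2 };
	struct model_archdata pf_sd, vf_sd;

	model_init_archdata(&pf_sd, &pbm, &op);
	model_init_archdata(&vf_sd, pf_sd.host_controller, pf_sd.op);
	assert(op.refcount == 3);

	model_release_archdata(&vf_sd);
	assert(op.refcount == 2);
}
```

With a helper of this shape, the PF path and the VF path added by this
patch would share one initialization sequence, and removing a VF would
drop only the reference that the VF took.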
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Babu Moger March 30, 2016, 3:31 p.m. UTC | #2
Hi David,

On 3/29/2016 7:57 PM, David Miller wrote:
> From: Babu Moger <babu.moger@oracle.com>
> Date: Thu, 24 Mar 2016 13:02:22 -0700
> 
>> We noticed this panic while enabling SR-IOV in sparc.
>  ...
>> SR-IOV code looks for arch specific data while enabling
>> VFs. When VF device is added, driver probe function makes set
>> of calls to initialize the pci device. Because the VF device is
>> added different way than the normal PF device(which happens via
>> of_create_pci_dev for sparc), some of the arch specific initialization
>> does not happen for VF device.  That causes panic when archdata is
>> accessed.
>>
>> To fix this, I have used already defined weak function
>> pcibios_setup_device to copy archdata from PF to VF.
>> Also verified the fix.
>>
>> Signed-off-by: Babu Moger <babu.moger@oracle.com>
>> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
>> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
> 
> Looks good, applied and queued up for -stable, thanks.

Thanks.

> 
> Just a note, I am assuming that the VFs are not instantiated in the
> device tree.  Because when you just memcpy the arch data over from the
> PF, one thing we end up doing is using the device node of the PF.

Right, VFs are not instantiated in the device tree (/proc/device-tree).

> 
> I slightly cringed at the memcpy, because at least one of these
> pointers are to objects which are reference counted, the OF device.
> 
> Generally speaking we don't really support hot-plug for OF probed
> devices, but if we did all of the device tree pointers have to be
> refcounted properly.
> 
> So in the long term that whole sequence where we go:
> 
> 	struct dev_archdata *sd;
>  ...
> 	sd = &dev->dev.archdata;
> 	sd->iommu = pbm->iommu;
> 	sd->stc = &pbm->stc;
> 	sd->host_controller = pbm;
> 	sd->op = op = of_find_device_by_node(node);
> 	sd->numa_node = pbm->numa_node;
> 
> should be encapsulated into a helper function, and both
> of_create_pci_dev() and this new pcibios_setup_device() can
> invoke it.
> 

Yes, agreed. In the long term we need to refactor the whole
of_create_pci_dev path to support hot-plug. I will start looking at it.
For now we should be fine with the current patch. Thanks.
Bjorn Helgaas March 30, 2016, 7:37 p.m. UTC | #3
On Wed, Mar 30, 2016 at 10:31:18AM -0500, Babu Moger wrote:
> Hi David,
> 
> On 3/29/2016 7:57 PM, David Miller wrote:
> > From: Babu Moger <babu.moger@oracle.com>
> > Date: Thu, 24 Mar 2016 13:02:22 -0700
> > 
> >> We noticed this panic while enabling SR-IOV in sparc.
> >  ...
> >> SR-IOV code looks for arch specific data while enabling
> >> VFs. When VF device is added, driver probe function makes set
> >> of calls to initialize the pci device. Because the VF device is
> >> added different way than the normal PF device(which happens via
> >> of_create_pci_dev for sparc), some of the arch specific initialization
> >> does not happen for VF device.  That causes panic when archdata is
> >> accessed.
> >>
> >> To fix this, I have used already defined weak function
> >> pcibios_setup_device to copy archdata from PF to VF.
> >> Also verified the fix.
> >>
> >> Signed-off-by: Babu Moger <babu.moger@oracle.com>
> >> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> >> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
> > 
> > Looks good, applied and queued up for -stable, thanks.
> 
> Thanks.
> 
> > 
> > Just a note, I am assuming that the VFs are not instantiated in the
> > device tree.  Because when you just memcpy the arch data over from the
> > PF, one thing we end up doing is using the device node of the PF.
> 
> No. VFs are not instantiated in device tree(/proc/device-tree)
> 
> > 
> > I slightly cringed at the memcpy, because at least one of these
> > pointers are to objects which are reference counted, the OF device.
> > 
> > Generally speaking we don't really support hot-plug for OF probed
> > devices, but if we did all of the device tree pointers have to be
> > refcounted properly.
> > 
> > So in the long term that whole sequence where we go:
> > 
> > 	struct dev_archdata *sd;
> >  ...
> > 	sd = &dev->dev.archdata;
> > 	sd->iommu = pbm->iommu;
> > 	sd->stc = &pbm->stc;
> > 	sd->host_controller = pbm;
> > 	sd->op = op = of_find_device_by_node(node);
> > 	sd->numa_node = pbm->numa_node;
> > 
> > should be encapsulated into a helper function, and both
> > of_create_pci_dev() and this new pcibios_setup_device() can
> > invoke it.
> > 
> 
> Yes. Agree. We need to refactor the whole of_create_pci_dev path to support
> hot-plug for the long term. I will start looking at it. For now we should be
> fine with the current patch. thanks

of_create_pci_dev() duplicates a lot of the code in
pci_setup_device().  I wish we didn't have to do that
because it's easy to let them get out of sync, but I
don't know if there are any reasonable alternatives.

I've wondered in the past whether it would be possible
to use the pci_setup_device() path on sparc & powerpc by
writing PCI config accessors that look up OF properties
as needed to fabricate responses to config reads.  Several
of the drivers in drivers/pci/host/* do a little bit of
this fabrication, although I don't think any go to the
extent of using OF.
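
As a rough user-space illustration of that idea, the sketch below
fabricates 16-bit config reads from a node's properties. It is not
kernel code: the "model_" names are hypothetical, the property table
stands in for an OF node, and only the vendor and device ID registers
(config offsets 0x00 and 0x02) are handled. The "vendor-id" and
"device-id" property names follow the OF PCI binding; the ID values in
the demo are example data only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A tiny stand-in for an OF node: a flat table of integer properties. */
struct model_prop {
	const char *name;
	uint32_t value;
};

struct model_node {
	const struct model_prop *props;
	size_t nprops;
};

#define MODEL_PCI_VENDOR_ID	0x00
#define MODEL_PCI_DEVICE_ID	0x02

/* Looks up an integer property by name; returns 0 on success. */
static int model_get_prop(const struct model_node *np, const char *name,
			  uint32_t *out)
{
	size_t i;

	for (i = 0; i < np->nprops; i++) {
		if (strcmp(np->props[i].name, name) == 0) {
			*out = np->props[i].value;
			return 0;
		}
	}
	return -1;
}

/* Fabricates a 16-bit config read from OF properties. Unknown offsets
 * and missing properties read back as all ones and return -1.
 */
static int model_config_read16(const struct model_node *np, int where,
			       uint16_t *val)
{
	const char *name;
	uint32_t v;

	switch (where) {
	case MODEL_PCI_VENDOR_ID:
		name = "vendor-id";
		break;
	case MODEL_PCI_DEVICE_ID:
		name = "device-id";
		break;
	default:
		*val = 0xffff;
		return -1;
	}

	if (model_get_prop(np, name, &v)) {
		*val = 0xffff;
		return -1;
	}
	*val = (uint16_t)v;
	return 0;
}

/* Reads both IDs back through the fabricated accessor. */
static void model_demo(void)
{
	static const struct model_prop props[] = {
		{ "vendor-id", 0x15b3 },	/* example values only */
		{ "device-id", 0x1003 },
	};
	struct model_node np = { props, 2 };
	uint16_t val;

	assert(model_config_read16(&np, MODEL_PCI_VENDOR_ID, &val) == 0);
	assert(val == 0x15b3);
	assert(model_config_read16(&np, MODEL_PCI_DEVICE_ID, &val) == 0);
	assert(val == 0x1003);
}
```

A real implementation would have to cover many more registers (class
code, BARs, interrupt routing), which is where most of the difficulty
with this approach would lie.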

Bjorn

Patch

diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index badf095..9f9614d 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -994,6 +994,23 @@  void pcibios_set_master(struct pci_dev *dev)
 	/* No special bus mastering setup handling */
 }
 
+#ifdef CONFIG_PCI_IOV
+int pcibios_add_device(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+
+	/* Add sriov arch specific initialization here.
+	 * Copy dev_archdata from PF to VF
+	 */
+	if (dev->is_virtfn) {
+		pdev = dev->physfn;
+		memcpy(&dev->dev.archdata, &pdev->dev.archdata,
+		       sizeof(struct dev_archdata));
+	}
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static int __init pcibios_init(void)
 {
 	pci_dfl_cache_line_size = 64 >> 2;