PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

Message ID 20181206041951.22413-1-david@gibson.dropbear.id.au
State New

Checks

Context                         Check    Description
snowpatch_ozlabs/checkpatch     warning  total: 0 errors, 2 warnings, 0 checks, 40 lines checked
snowpatch_ozlabs/build-pmac32   success  build succeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64e   success  build succeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64be  success  build succeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64le  success  build succeded & removed 0 sparse warning(s)
snowpatch_ozlabs/apply_patch    success  next/apply_patch Successfully applied

Commit Message

David Gibson Dec. 6, 2018, 4:19 a.m.
Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when
unbound from their regular driver and attached to vfio-pci in order to pass
them through to a guest.

This goes away if the disable_idle_d3 option is used, so it looks like a
problem with the hardware handling D3 state.  To fix that more permanently,
use a device quirk to disable D3 state for these devices.

We do this by renaming the existing quirk_no_ata_d3() more generally and
attaching it to the ConnectX-[45] devices (0x15b3:0x1013).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 drivers/pci/quirks.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
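
The mechanism described in the commit message (a fixup entry matched by vendor/device ID whose hook sets a "no D3" flag that later power-management code consults) can be illustrated with a small userspace sketch. This is not kernel code: every `mock_`/`MOCK_` name is hypothetical, the flag bit value is illustrative only, and the real matching is done by the kernel's `DECLARE_PCI_FIXUP_*` machinery. The only values taken from the patch are the IDs 0x15b3:0x1013.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_ANY_ID      0xffffu    /* stand-in for PCI_ANY_ID (hypothetical) */
#define MOCK_FLAG_NO_D3  (1u << 1)  /* illustrative bit for PCI_DEV_FLAGS_NO_D3 */

/* Minimal stand-in for struct pci_dev: only the fields the sketch needs. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
};

/* One fixup table entry: IDs to match plus the hook to run on a match. */
struct mock_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct mock_pci_dev *pdev);
};

/* Same body as quirk_no_d3() in the patch, applied to the mock device. */
void mock_quirk_no_d3(struct mock_pci_dev *pdev)
{
	pdev->dev_flags |= MOCK_FLAG_NO_D3;
}

/* The entry the patch adds, using the IDs quoted in the commit message. */
static const struct mock_fixup mock_fixups[] = {
	{ 0x15b3, 0x1013, mock_quirk_no_d3 },
};

/* Run every hook whose vendor and device IDs match (or are wildcards). */
void mock_run_fixups(struct mock_pci_dev *pdev)
{
	size_t i;

	for (i = 0; i < sizeof(mock_fixups) / sizeof(mock_fixups[0]); i++) {
		const struct mock_fixup *f = &mock_fixups[i];

		if ((f->vendor == MOCK_ANY_ID || f->vendor == pdev->vendor) &&
		    (f->device == MOCK_ANY_ID || f->device == pdev->device))
			f->hook(pdev);
	}
}

/* What a power-management caller would check before entering D3. */
int mock_may_enter_d3(const struct mock_pci_dev *pdev)
{
	return !(pdev->dev_flags & MOCK_FLAG_NO_D3);
}
```

Calling `mock_run_fixups()` on a device with IDs 0x15b3:0x1013 sets the flag, so `mock_may_enter_d3()` returns 0 for it; a device with any other ID pair is left untouched.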

Comments

Leon Romanovsky Dec. 6, 2018, 6:45 a.m. | #1
On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote:
> Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when
> unbound from their regular driver and attached to vfio-pci in order to pass
> them through to a guest.
>
> This goes away if the disable_idle_d3 option is used, so it looks like a
> problem with the hardware handling D3 state.  To fix that more permanently,
> use a device quirk to disable D3 state for these devices.
>
> We do this by renaming the existing quirk_no_ata_d3() more generally and
> attaching it to the ConnectX-[45] devices (0x15b3:0x1013).
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  drivers/pci/quirks.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>

Hi David,

Thanks for your patch.

I would like to reproduce the call trace before moving forward,
but I am having trouble reproducing the original issue.

I work with vfio-pci and CX-4/5 cards on a daily basis. I just tried
manually entering the D3 state, and it worked for me.

Can you please post your full call trace and the "lspci -s PCI_ID -vv"
output?

Thanks
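
When comparing results like the ones requested above, the current power state of a function is reported in the low two bits of the control/status register (PMCSR) of its PCI Power Management capability. A minimal sketch of that decoding follows; the helper and macro names are hypothetical, and reading the register from a real device is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* PMCSR bits 1:0 encode the power state (hypothetical macro name). */
#define PMCSR_STATE_MASK 0x0003u

/* Map a raw PMCSR value to the name of the power state it encodes. */
const char *pmcsr_state_name(uint16_t pmcsr)
{
	switch (pmcsr & PMCSR_STATE_MASK) {
	case 0:
		return "D0";
	case 1:
		return "D1";
	case 2:
		return "D2";
	default:
		return "D3hot";
	}
}
```

A device that has actually entered D3hot shows 3 in those bits, so a value of 0 after a requested transition suggests the transition did not take effect.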

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4700d24e5d55..add3f516ca12 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1315,23 +1315,24 @@  static void quirk_ide_samemode(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_10, quirk_ide_samemode);
 
-/* Some ATA devices break if put into D3 */
-static void quirk_no_ata_d3(struct pci_dev *pdev)
+/* Some devices (including a number of ATA cards) break if put into D3 */
+static void quirk_no_d3(struct pci_dev *pdev)
 {
 	pdev->dev_flags |= PCI_DEV_FLAGS_NO_D3;
 }
+
 /* Quirk the legacy ATA devices only. The AHCI ones are ok */
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_ANY_ID,
-				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
+				PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3);
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_ATI, PCI_ANY_ID,
-				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
+				PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3);
 /* ALi loses some register settings that we cannot then restore */
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
-				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
+				PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3);
 /* VIA comes back fine but we need to keep it alive or ACPI GTM failures
    occur when mode detecting */
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
-				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
+				PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3);
 
 /*
  * This was originally an Alpha-specific thing, but it really fits here.
@@ -3367,6 +3368,10 @@  static void mellanox_check_broken_intx_masking(struct pci_dev *pdev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID,
 			mellanox_check_broken_intx_masking);
 
+/* Mellanox MT27800 (ConnectX-5) IB card seems to break with D3
+ * In particular this shows up when the device is bound to the vfio-pci driver */
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_CONNECTX4, quirk_no_d3);
+
 static void quirk_no_bus_reset(struct pci_dev *dev)
 {
 	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;