diff mbox

PCI: quirks: DMA alias quirk for Adaptec 3405

Message ID 20150113182605.15479.9020.stgit@gimli.home
State Accepted
Headers show

Commit Message

Alex Williamson Jan. 13, 2015, 6:26 p.m. UTC
As noted in the added comment, this device is actually an Intel 80333
I/O processor where the exposed device at 0e.0 is actually the address
translation unit of the I/O processor and a hidden, private device at
01.0 masters the DMA for the device.  In order to enable the IOMMU, we
therefore need to create a fixed alias between the exposed and hidden
devfn.

Scenarios like this are potentially likely for any device incorporating
this I/O processor, so this little bit of abstraction with the fixed
alias table should make future additions trivial.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Adaptec OEM Raid Solutions <aacraid@adaptec.com>
---

 drivers/pci/quirks.c |   37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bjorn Helgaas Jan. 23, 2015, 9:50 p.m. UTC | #1
On Tue, Jan 13, 2015 at 11:26:50AM -0700, Alex Williamson wrote:
> As noted in the added comment, this device is actually an Intel 80333
> I/O processor where the exposed device at 0e.0 is actually the address
> translation unit of the I/O processor and a hidden, private device at
> 01.0 masters the DMA for the device.  In order to enable the IOMMU, we
> therefore need to create a fixed alias between the exposed and hidden
> devfn.
> 
> Scenarios like this are potentially likely for any device incorporating
> this I/O processor, so this little bit of abstraction with the fixed
> alias table should make future additions trivial.

This sounds like the result of some serious debugging :)  Do you have a
pointer to a bugzilla or some email discussion?  I'd like to make the fix
discoverable starting with a failure symptom.

> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Cc: Adaptec OEM Raid Solutions <aacraid@adaptec.com>
> ---
> 
>  drivers/pci/quirks.c |   37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ed6f89b..19bdb17 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3528,6 +3528,43 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
>  			 quirk_dma_func1_alias);
>  
>  /*
> + * Some devices DMA with the wrong devfn, not just the wrong function.
> + * quirk_fixed_dma_alias() uses this table to create fixed aliass, where
> + * the alias is "fixed" and independent of the device devfn.
> + *
> + * For example, the Adaptec 3405 is a PCIe card making use of an Intel 80333
> + * I/O processor.  To software, this appears as a straightforward PCIe-to-PCI/X
> + * bridge with a single device on the subordinate bus.  In reality, the exposed
> + * device at 0e.0 is the Address Translation Unit (ATU) of the controller that
> + * provides a bridge to the internal bus of the I/O processor.  The controller
> + * supports private devices, which can be hidden from PCI config space.  In the
> + * case of the Adaptec 3405, a private device at 01.0 appears to be the DMA
> + * engine, which therefore needs to become a DMA alias for the device.
> + */
> +static const struct pci_device_id fixed_dma_alias_tbl[] = {
> +	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285,
> +			 PCI_VENDOR_ID_ADAPTEC2, 0x02bb), /* Adaptec 3405 */
> +	  .driver_data = PCI_DEVFN(1, 0) },
> +	{ 0 }
> +};
> +
> +static void quirk_fixed_dma_alias(struct pci_dev *dev)
> +{
> +	const struct pci_device_id *id;
> +
> +	id = pci_match_id(fixed_dma_alias_tbl, dev);
> +	if (id) {
> +		dev->dma_alias_devfn = id->driver_data;
> +		dev->dev_flags |= PCI_DEV_FLAGS_DMA_ALIAS_DEVFN;
> +		dev_info(&dev->dev, "Enabling fixed DMA alias to %02x.%d\n",
> +			 PCI_SLOT(dev->dma_alias_devfn),
> +			 PCI_FUNC(dev->dma_alias_devfn));
> +	}
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
> +
> +/*
>   * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in
>   * using the wrong DMA alias for the device.  Some of these devices can be
>   * used as either forward or reverse bridges, so we need to test whether the
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Williamson Jan. 23, 2015, 10:30 p.m. UTC | #2
On Fri, 2015-01-23 at 15:50 -0600, Bjorn Helgaas wrote:
> On Tue, Jan 13, 2015 at 11:26:50AM -0700, Alex Williamson wrote:
> > As noted in the added comment, this device is actually an Intel 80333
> > I/O processor where the exposed device at 0e.0 is actually the address
> > translation unit of the I/O processor and a hidden, private device at
> > 01.0 masters the DMA for the device.  In order to enable the IOMMU, we
> > therefore need to create a fixed alias between the exposed and hidden
> > devfn.
> > 
> > Scenarios like this are potentially likely for any device incorporating
> > this I/O processor, so this little bit of abstraction with the fixed
> > alias table should make future additions trivial.
> 
> This sounds like the result of some serious debugging :)  Do you have a
> pointer to a bugzilla or some email discussion?  I'd like to make the fix
> discoverable starting with a failure symptom.

Sorry, this one comes from a private bugzilla.  The symptom is simply a
slew of IOMMU faults when trying to boot the system with the IOMMU
enabled.  For example, with the Adaptec 3405 showing up at 02:0e.0 on an
Intel system, the log is filled with:

dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Write] Request device [02:01.0] fault addr ffbff000 
DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Write] Request device [02:01.0] fault addr ffbfe000 
DMAR:[fault reason 02] Present bit in context entry is clear

As expected, the device is non-functional.  In the log I have there are
only DMA write faults and the addresses are all ffbxxxxx, which falls
into a reserved memory area on the system and owned by a PNP0c02 device
according to iomem.  With the change here the device, of course, works
without any issues with VT-d enabled.  Thanks,

Alex

> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Adaptec OEM Raid Solutions <aacraid@adaptec.com>
> > ---
> > 
> >  drivers/pci/quirks.c |   37 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index ed6f89b..19bdb17 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3528,6 +3528,43 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
> >  			 quirk_dma_func1_alias);
> >  
> >  /*
> > + * Some devices DMA with the wrong devfn, not just the wrong function.
> > + * quirk_fixed_dma_alias() uses this table to create fixed aliass, where
> > + * the alias is "fixed" and independent of the device devfn.
> > + *
> > + * For example, the Adaptec 3405 is a PCIe card making use of an Intel 80333
> > + * I/O processor.  To software, this appears as a straightforward PCIe-to-PCI/X
> > + * bridge with a single device on the subordinate bus.  In reality, the exposed
> > + * device at 0e.0 is the Address Translation Unit (ATU) of the controller that
> > + * provides a bridge to the internal bus of the I/O processor.  The controller
> > + * supports private devices, which can be hidden from PCI config space.  In the
> > + * case of the Adaptec 3405, a private device at 01.0 appears to be the DMA
> > + * engine, which therefore needs to become a DMA alias for the device.
> > + */
> > +static const struct pci_device_id fixed_dma_alias_tbl[] = {
> > +	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285,
> > +			 PCI_VENDOR_ID_ADAPTEC2, 0x02bb), /* Adaptec 3405 */
> > +	  .driver_data = PCI_DEVFN(1, 0) },
> > +	{ 0 }
> > +};
> > +
> > +static void quirk_fixed_dma_alias(struct pci_dev *dev)
> > +{
> > +	const struct pci_device_id *id;
> > +
> > +	id = pci_match_id(fixed_dma_alias_tbl, dev);
> > +	if (id) {
> > +		dev->dma_alias_devfn = id->driver_data;
> > +		dev->dev_flags |= PCI_DEV_FLAGS_DMA_ALIAS_DEVFN;
> > +		dev_info(&dev->dev, "Enabling fixed DMA alias to %02x.%d\n",
> > +			 PCI_SLOT(dev->dma_alias_devfn),
> > +			 PCI_FUNC(dev->dma_alias_devfn));
> > +	}
> > +}
> > +
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
> > +
> > +/*
> >   * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in
> >   * using the wrong DMA alias for the device.  Some of these devices can be
> >   * used as either forward or reverse bridges, so we need to test whether the
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Jan. 23, 2015, 10:46 p.m. UTC | #3
On Tue, Jan 13, 2015 at 11:26:50AM -0700, Alex Williamson wrote:
> As noted in the added comment, this device is actually an Intel 80333
> I/O processor where the exposed device at 0e.0 is actually the address
> translation unit of the I/O processor and a hidden, private device at
> 01.0 masters the DMA for the device.  In order to enable the IOMMU, we
> therefore need to create a fixed alias between the exposed and hidden
> devfn.
> 
> Scenarios like this are potentially likely for any device incorporating
> this I/O processor, so this little bit of abstraction with the fixed
> alias table should make future additions trivial.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Cc: Adaptec OEM Raid Solutions <aacraid@adaptec.com>

I added the symptoms (log messages) to the changelog and applied this to
pci/virtualization for v3.20, thanks!

> ---
> 
>  drivers/pci/quirks.c |   37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ed6f89b..19bdb17 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3528,6 +3528,43 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
>  			 quirk_dma_func1_alias);
>  
>  /*
> + * Some devices DMA with the wrong devfn, not just the wrong function.
> + * quirk_fixed_dma_alias() uses this table to create fixed aliass, where
> + * the alias is "fixed" and independent of the device devfn.
> + *
> + * For example, the Adaptec 3405 is a PCIe card making use of an Intel 80333
> + * I/O processor.  To software, this appears as a straightforward PCIe-to-PCI/X
> + * bridge with a single device on the subordinate bus.  In reality, the exposed
> + * device at 0e.0 is the Address Translation Unit (ATU) of the controller that
> + * provides a bridge to the internal bus of the I/O processor.  The controller
> + * supports private devices, which can be hidden from PCI config space.  In the
> + * case of the Adaptec 3405, a private device at 01.0 appears to be the DMA
> + * engine, which therefore needs to become a DMA alias for the device.
> + */
> +static const struct pci_device_id fixed_dma_alias_tbl[] = {
> +	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285,
> +			 PCI_VENDOR_ID_ADAPTEC2, 0x02bb), /* Adaptec 3405 */
> +	  .driver_data = PCI_DEVFN(1, 0) },
> +	{ 0 }
> +};
> +
> +static void quirk_fixed_dma_alias(struct pci_dev *dev)
> +{
> +	const struct pci_device_id *id;
> +
> +	id = pci_match_id(fixed_dma_alias_tbl, dev);
> +	if (id) {
> +		dev->dma_alias_devfn = id->driver_data;
> +		dev->dev_flags |= PCI_DEV_FLAGS_DMA_ALIAS_DEVFN;
> +		dev_info(&dev->dev, "Enabling fixed DMA alias to %02x.%d\n",
> +			 PCI_SLOT(dev->dma_alias_devfn),
> +			 PCI_FUNC(dev->dma_alias_devfn));
> +	}
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
> +
> +/*
>   * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in
>   * using the wrong DMA alias for the device.  Some of these devices can be
>   * used as either forward or reverse bridges, so we need to test whether the
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Jan. 23, 2015, 10:53 p.m. UTC | #4
On Fri, Jan 23, 2015 at 4:30 PM, Alex Williamson
<alex.williamson@redhat.com> wrote:
> On Fri, 2015-01-23 at 15:50 -0600, Bjorn Helgaas wrote:
>> On Tue, Jan 13, 2015 at 11:26:50AM -0700, Alex Williamson wrote:
>> > As noted in the added comment, this device is actually an Intel 80333
>> > I/O processor where the exposed device at 0e.0 is actually the address
>> > translation unit of the I/O processor and a hidden, private device at
>> > 01.0 masters the DMA for the device.  In order to enable the IOMMU, we
>> > therefore need to create a fixed alias between the exposed and hidden
>> > devfn.
>> >
>> > Scenarios like this are potentially likely for any device incorporating
>> > this I/O processor, so this little bit of abstraction with the fixed
>> > alias table should make future additions trivial.
>>
>> This sounds like the result of some serious debugging :)  Do you have a
>> pointer to a bugzilla or some email discussion?  I'd like to make the fix
>> discoverable starting with a failure symptom.
>
> Sorry, this one comes from a private bugzilla.  The symptom is simply a
> slew of IOMMU faults when trying to boot the system with the IOMMU
> enabled.  For example, with the Adaptec 3405 showing up at 02:0e.0 on an
> Intel system, the log is filled with:
>
> dmar: DRHD: handling fault status reg 3
> dmar: DMAR:[DMA Write] Request device [02:01.0] fault addr ffbff000
> DMAR:[fault reason 02] Present bit in context entry is clear
> dmar: DRHD: handling fault status reg 3
> dmar: DMAR:[DMA Write] Request device [02:01.0] fault addr ffbfe000
> DMAR:[fault reason 02] Present bit in context entry is clear

Huh.  I applied it, but I really prefer to have public bugzillas for
public fixes.  If you want to keep the proprietary details in a
private bugzilla and just open a corresponding kernel.org bugzilla
with the secret things omitted, that would be fine with me.  In fact,
I'd prefer that, because I'd rather keep references to kernel.org
bugzillas than to redhat.com ones because then we don't have to depend
on RedHat's continuing existence or bugzilla maintenance.

Sometimes the details of the problem are really useful when we get a
better understanding of the issue later, or when we want to do some
refactoring, or simply when we want to test a change to the code.

Bjorn

> As expected, the device is non-functional.  In the log I have there are
> only DMA write faults and the addresses are all ffbxxxxx, which falls
> into a reserved memory area on the system and owned by a PNP0c02 device
> according to iomem.  With the change here the device, of course, works
> without any issues with VT-d enabled.  Thanks,
>
> Alex
>
>> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>> > Cc: Adaptec OEM Raid Solutions <aacraid@adaptec.com>
>> > ---
>> >
>> >  drivers/pci/quirks.c |   37 +++++++++++++++++++++++++++++++++++++
>> >  1 file changed, 37 insertions(+)
>> >
>> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> > index ed6f89b..19bdb17 100644
>> > --- a/drivers/pci/quirks.c
>> > +++ b/drivers/pci/quirks.c
>> > @@ -3528,6 +3528,43 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
>> >                      quirk_dma_func1_alias);
>> >
>> >  /*
>> > + * Some devices DMA with the wrong devfn, not just the wrong function.
>> > + * quirk_fixed_dma_alias() uses this table to create fixed aliass, where
>> > + * the alias is "fixed" and independent of the device devfn.
>> > + *
>> > + * For example, the Adaptec 3405 is a PCIe card making use of an Intel 80333
>> > + * I/O processor.  To software, this appears as a straightforward PCIe-to-PCI/X
>> > + * bridge with a single device on the subordinate bus.  In reality, the exposed
>> > + * device at 0e.0 is the Address Translation Unit (ATU) of the controller that
>> > + * provides a bridge to the internal bus of the I/O processor.  The controller
>> > + * supports private devices, which can be hidden from PCI config space.  In the
>> > + * case of the Adaptec 3405, a private device at 01.0 appears to be the DMA
>> > + * engine, which therefore needs to become a DMA alias for the device.
>> > + */
>> > +static const struct pci_device_id fixed_dma_alias_tbl[] = {
>> > +   { PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285,
>> > +                    PCI_VENDOR_ID_ADAPTEC2, 0x02bb), /* Adaptec 3405 */
>> > +     .driver_data = PCI_DEVFN(1, 0) },
>> > +   { 0 }
>> > +};
>> > +
>> > +static void quirk_fixed_dma_alias(struct pci_dev *dev)
>> > +{
>> > +   const struct pci_device_id *id;
>> > +
>> > +   id = pci_match_id(fixed_dma_alias_tbl, dev);
>> > +   if (id) {
>> > +           dev->dma_alias_devfn = id->driver_data;
>> > +           dev->dev_flags |= PCI_DEV_FLAGS_DMA_ALIAS_DEVFN;
>> > +           dev_info(&dev->dev, "Enabling fixed DMA alias to %02x.%d\n",
>> > +                    PCI_SLOT(dev->dma_alias_devfn),
>> > +                    PCI_FUNC(dev->dma_alias_devfn));
>> > +   }
>> > +}
>> > +
>> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
>> > +
>> > +/*
>> >   * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in
>> >   * using the wrong DMA alias for the device.  Some of these devices can be
>> >   * used as either forward or reverse bridges, so we need to test whether the
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ed6f89b..19bdb17 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3528,6 +3528,43 @@  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
 			 quirk_dma_func1_alias);
 
 /*
+ * Some devices DMA with the wrong devfn, not just the wrong function.
+ * quirk_fixed_dma_alias() uses this table to create fixed aliass, where
+ * the alias is "fixed" and independent of the device devfn.
+ *
+ * For example, the Adaptec 3405 is a PCIe card making use of an Intel 80333
+ * I/O processor.  To software, this appears as a straightforward PCIe-to-PCI/X
+ * bridge with a single device on the subordinate bus.  In reality, the exposed
+ * device at 0e.0 is the Address Translation Unit (ATU) of the controller that
+ * provides a bridge to the internal bus of the I/O processor.  The controller
+ * supports private devices, which can be hidden from PCI config space.  In the
+ * case of the Adaptec 3405, a private device at 01.0 appears to be the DMA
+ * engine, which therefore needs to become a DMA alias for the device.
+ */
+static const struct pci_device_id fixed_dma_alias_tbl[] = {
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285,
+			 PCI_VENDOR_ID_ADAPTEC2, 0x02bb), /* Adaptec 3405 */
+	  .driver_data = PCI_DEVFN(1, 0) },
+	{ 0 }
+};
+
+static void quirk_fixed_dma_alias(struct pci_dev *dev)
+{
+	const struct pci_device_id *id;
+
+	id = pci_match_id(fixed_dma_alias_tbl, dev);
+	if (id) {
+		dev->dma_alias_devfn = id->driver_data;
+		dev->dev_flags |= PCI_DEV_FLAGS_DMA_ALIAS_DEVFN;
+		dev_info(&dev->dev, "Enabling fixed DMA alias to %02x.%d\n",
+			 PCI_SLOT(dev->dma_alias_devfn),
+			 PCI_FUNC(dev->dma_alias_devfn));
+	}
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
+
+/*
  * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in
  * using the wrong DMA alias for the device.  Some of these devices can be
  * used as either forward or reverse bridges, so we need to test whether the