[PATCH] Quirk to support Marvell 88SE91xx SATA controllers with Intel IOMMU.

Submitter Andrew Cooks
Date March 1, 2013, 8:26 a.m.
Message ID <1362126373-32318-1-git-send-email-acooks@gmail.com>
Permalink /patch/224248/
State Not Applicable

Comments

Andrew Cooks - March 1, 2013, 8:26 a.m.
This is my third submitted patch to make Marvell 88SE91xx SATA controllers work when IOMMU is enabled.[1][2]

What's changed:
* Adopt David Woodhouse's terminology by referring to the quirky functions as 'ghost' functions.
* Unmap ghost functions when device is detached from IOMMU.
* Stub function for when CONFIG_PCI_QUIRKS is not enabled.

The bad:
* Still no AMD support.
* The table of affected chip IDs is as complete as I can make it by googling for bug reports.

This patch was generated against commit b0af9cd9aab60ceb17d3ebabb9fdf4ff0a99cf50, but will also apply cleanly to 3.7.10.

Bug reports:
1. https://bugzilla.redhat.com/show_bug.cgi?id=757166
2. https://bugzilla.kernel.org/show_bug.cgi?id=42679

Signed-off-by: Andrew Cooks <acooks@gmail.com>
---
 drivers/iommu/intel-iommu.c |   50 +++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/quirks.c        |   47 +++++++++++++++++++++++++++++++++++++++-
 include/linux/pci.h         |    5 ++++
 include/linux/pci_ids.h     |    1 +
 4 files changed, 102 insertions(+), 1 deletions(-)
Justin Piszcz - March 1, 2013, 5:54 p.m.
-----Original Message-----
From: Andrew Cooks [mailto:acooks@gmail.com] 
Sent: Friday, March 01, 2013 3:26 AM
To: acooks@gmail.com; joro@8bytes.org; xjtuychu@hotmail.com;
gm.ychu@gmail.com; alex.williamson@redhat.com; bhelgaas@google.com;
jpiszcz@lucidpixels.com; dwmw2@infradead.org
Cc: open list:INTEL IOMMU (VT-d); open list; open list:PCI SUBSYSTEM
Subject: [PATCH] Quirk to support Marvell 88SE91xx SATA controllers with
Intel IOMMU.

This is my third submitted patch to make Marvell 88SE91xx SATA controllers
work when IOMMU is enabled.[1][2]

What's changed:
* Adopt David Woodhouse's terminology by referring to the quirky functions
as 'ghost' functions.
* Unmap ghost functions when device is detached from IOMMU.
* Stub function for when CONFIG_PCI_QUIRKS is not enabled.

The bad:
* Still no AMD support.
* The table of affected chip IDs is as complete as I can make it by googling
for bug reports.

This patch was generated against commit
b0af9cd9aab60ceb17d3ebabb9fdf4ff0a99cf50, but will also apply cleanly to
3.7.10.

--

Hi,

Against 3.7.10:

# patch -p1 < ../RFC-Fix-Intel-IOMMU-support-for-Marvell-88SE91xx-SATA-controllers..patch
patching file drivers/iommu/intel-iommu.c
patching file drivers/pci/quirks.c
Hunk #1 succeeded at 3230 (offset 3 lines).
patching file include/linux/pci.h
#

Recompile kernel, reboot..

Shutdown host, re-attach to Marvell Controller w/IOMMU.

The host still failed to boot, dmesg/panic here:
http://home.comcast.net/~jpiszcz/20130301/boot_failure.JPG

(The root disk is /dev/sdc)

I recompiled again with IOMMU off and it booted ok:

# uname -a
Linux host 3.7.10 #2 SMP Fri Mar 1 12:44:25 EST 2013 x86_64 GNU/Linux

Here is the relevant part of dmesg from a successful boot with IOMMU=off:

[    4.288113] input: American Megatrends Inc. Virtual Keyboard and Mouse as /devices/pci0000:00/0000:00:1a.1/usb4/4-2/4-2:1.0/input/input3
[    4.289025] hid-generic 0003:046B:FF10.0001: input,hidraw0: USB HID v1.10 Keyboard [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1a.1-2/input0
[    4.305993] input: American Megatrends Inc. Virtual Keyboard and Mouse as /devices/pci0000:00/0000:00:1a.1/usb4/4-2/4-2:1.1/input/input4
[    4.307106] hid-generic 0003:046B:FF10.0002: input,hidraw1: USB HID v1.10 Mouse [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1a.1-2/input1
[    4.326481] ata6: SATA link down (SStatus 0 SControl 300)
[    4.327324] scsi 7:0:0:0: Direct-Access     ATA      INTEL SSDSC2MH25 PWG4 PQ: 0 ANSI: 5
[    4.329953] sd 7:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[    4.330639] scsi 14:0:0:0: Processor         Marvell  91xx Config 1.01 PQ: 0 ANSI: 5
[    4.333276] sd 7:0:0:0: [sdc] Write Protect is off
[    4.334746] sd 7:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    4.334921] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    4.345622]  sdc: sdc1 sdc2
[    4.347493] sd 7:0:0:0: [sdc] Attached SCSI disk

Justin.
 


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Cooks - March 1, 2013, 10:19 p.m.
On Sat, Mar 2, 2013 at 1:51 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>
> On Fri, Mar 1, 2013 at 3:26 AM, Andrew Cooks <acooks@gmail.com> wrote:
>>
>> This is my third submitted patch to make Marvell 88SE91xx SATA controllers
>> work when IOMMU is enabled.[1][2]
>>
>
> Hi,
>
> Against 3.7.10:
>
> # patch -p1 <
> ../RFC-Fix-Intel-IOMMU-support-for-Marvell-88SE91xx-SATA-controllers..patch
> patching file drivers/iommu/intel-iommu.c
> patching file drivers/pci/quirks.c
> Hunk #1 succeeded at 3230 (offset 3 lines).
> patching file include/linux/pci.h
> #

Thanks for testing!

Patching 3.7.10 looks somewhat different here. I'm using the
linux-3.7.y branch from
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
at commit 356d8c6fb2a7cf49e836742738a8b9a47e77cfea. The output I get
is:

$ patch -p1 < ~/devel/lk_patches_dma_source_maps-20130301/marvell_ghost_funcs.patch
patching file drivers/iommu/intel-iommu.c
Hunk #1 succeeded at 1672 (offset -2 lines).
Hunk #2 succeeded at 1729 (offset -2 lines).
Hunk #3 succeeded at 3833 (offset -2 lines).
patching file drivers/pci/quirks.c
Hunk #1 succeeded at 3210 (offset -39 lines).
Hunk #2 succeeded at 3240 (offset -39 lines).
Hunk #3 succeeded at 3258 (offset -39 lines).
patching file include/linux/pci.h
Hunk #1 succeeded at 1546 (offset -32 lines).
Hunk #2 succeeded at 1555 (offset -32 lines).
patching file include/linux/pci_ids.h

>
> Recompile kernel, reboot..
>
> Shutdown host, re-attach to Marvell Controller w/IOMMU.
>
> The host still failed to boot, dmesg/panic here:
> http://home.comcast.net/~jpiszcz/20130301/boot_failure.JPG
>
> (The root disk is /dev/sdc)

Sorry it doesn't work for you, but I can't really tell what's failing
from the photo you posted. I'm running 3.7.10 with this patch right
now.

According to the dmesg output you posted at
https://home.comcast.net/~jpiszcz/20121128/dmesg.txt and the lspci
output in http://www.kernelhub.org/?p=2&msg=171877, you've run into
two separate DMAR issues:

Problem A - nvidia graphics:
[    2.968248] dmar: DRHD: handling fault status reg 202
[    2.968253] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[    2.968253] DMAR:[fault reason 06] PTE Read access is not set

Problem B - Marvell 88SE9123:
[    2.974534] ata9: SATA link down (SStatus 0 SControl 300)
[    2.976297] ata7: SATA link down (SStatus 0 SControl 300)
[    2.977939] ata13: SATA link down (SStatus 0 SControl 300)
[    2.979689] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    2.981419] dmar: DRHD: handling fault status reg 2
[    2.981431] ata8: SATA link down (SStatus 0 SControl 300)
[    2.981449] ata11: SATA link down (SStatus 0 SControl 300)
[    2.988479] dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr fff00000
[    2.988479] DMAR:[fault reason 02] Present bit in context entry is clear

Can you capture a full dmesg with this patch applied?
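
For readers following the log: the faulting requester in Problem B is
written as bus:slot.function, so [84:00.1] is function 1 of slot 00 on
bus 84. The sketch below shows how such a triple relates to the devfn
byte that the patch builds with PCI_DEVFN(PCI_SLOT(pdev->devfn), fn).
The three macros are user-space copies of the kernel's definitions;
sibling_devfn() is a hypothetical helper name used only for
illustration and does not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* User-space copies of the kernel's devfn helpers: 5 bits of slot,
 * 3 bits of function, packed into one byte. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: return the devfn of function `fn` in the same
 * slot as `devfn`. This is the expression the patch uses to address a
 * ghost function next to the real device.
 */
static uint8_t sibling_devfn(uint8_t devfn, uint8_t fn)
{
	assert(fn < 8);	/* a PCI slot has at most 8 functions */
	return (uint8_t)PCI_DEVFN(PCI_SLOT(devfn), fn);
}
```

With these definitions, a device at slot 00 function 0 has devfn 0x00,
and its function-1 sibling has devfn 0x01, which is the requester shown
in the fault above.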

--
a.
Justin Piszcz - March 1, 2013, 11:18 p.m.
-----Original Message-----
From: Andrew Cooks [mailto:acooks@gmail.com] 
Sent: Friday, March 01, 2013 5:19 PM
To: Justin Piszcz
Cc: Joerg Roedel; YingChu; Chu Ying; Alex Williamson; bhelgaas@google.com;
David Woodhouse; open list:INTEL IOMMU (VT-d); open list; open list:PCI
SUBSYSTEM
Subject: Re: [PATCH] Quirk to support Marvell 88SE91xx SATA controllers with
Intel IOMMU.

On Sat, Mar 2, 2013 at 1:51 AM, Justin Piszcz <jpiszcz@lucidpixels.com>
wrote:
>
>

> Thanks for testing!
No problem.

Against a clean 3.7.10 (from ftp.kernel.org)

# patch -p1 < ../patch/RFC-Fix-Intel-IOMMU-support-for-Marvell-88SE91xx-SATA-controllers..patch
patching file drivers/iommu/intel-iommu.c
patching file drivers/pci/quirks.c
Hunk #1 succeeded at 3230 (offset 3 lines).
patching file include/linux/pci.h
# pwd
/usr/src/linux-3.7.10

Full dmesg with the patch applied: (but with IOMMU off)
http://home.comcast.net/~jpiszcz/20130301/dmesg-full.txt

Full dmesg (as much as possible through netconsole with IOMMU on)
http://home.comcast.net/~jpiszcz/20130301/dmesg-iommu-on.txt

Let me know if anything else is needed, thanks.

Justin.


Andrew Cooks - March 4, 2013, 1:35 a.m.
On Sat, Mar 2, 2013 at 7:18 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
> Against a clean 3.7.10 (from ftp.kernel.org)
>
> # patch -p1 <
> ../patch/RFC-Fix-Intel-IOMMU-support-for-Marvell-88SE91xx-SATA-controllers..
> patch
> patching file drivers/iommu/intel-iommu.c
> patching file drivers/pci/quirks.c
> Hunk #1 succeeded at 3230 (offset 3 lines).
> patching file include/linux/pci.h
> # pwd
> /usr/src/linux-3.7.10
>

I've downloaded and patched the 3.7.10 tarball and still get the same
output I got before; different output from yours. I'm not sure the
patch is complete or applying correctly, are you?
Could you please check whether the patch you're applying is the same
as the attached file?

> Full dmesg with the patch applied: (but with IOMMU off)
> http://home.comcast.net/~jpiszcz/20130301/dmesg-full.txt
>
> Full dmesg (as much as possible through netconsole with IOMMU on)
> http://home.comcast.net/~jpiszcz/20130301/dmesg-iommu-on.txt
>
> Let me know if anything else is needed, thanks.

I think some important error messages might have been lost here.

You captured a complete dmesg at
https://home.comcast.net/~jpiszcz/20121128/dmesg.txt
with iommu on. Are you still able to get the same with 3.7.10 if you
exclude this patch, or has something else changed?

a.
Justin Piszcz - March 4, 2013, 11:32 a.m.
On Sat, Mar 2, 2013 at 7:18 AM, Justin Piszcz <jpiszcz@lucidpixels.com>
wrote:
>
> Against a clean 3.7.10 (from ftp.kernel.org)
>
> # patch -p1 <
> ../patch/RFC-Fix-Intel-IOMMU-support-for-Marvell-88SE91xx-SATA-controllers..
> patch
> patching file drivers/iommu/intel-iommu.c
> patching file drivers/pci/quirks.c
> Hunk #1 succeeded at 3230 (offset 3 lines).
> patching file include/linux/pci.h
> # pwd
> /usr/src/linux-3.7.10
>

> I've downloaded and patched the 3.7.10 tarball and still get the same
> output I got before; different output from yours. I'm not sure the
> patch is complete or applying correctly, are you?
> Could you please check whether the patch you're applying is the same
> as the attached file?

Hi,

Success!

Patch from e-mail:
# md5sum marvell_ghost_funcs.patch 
718bfb5876e3538ec23a516ef28d03f5  marvell_ghost_funcs.patch

Kernel from ftp.kernel.org:
# md5sum linux-3.7.10.tar.bz2
56ec294a922b6112a1ef129668f38a83  linux-3.7.10.tar.bz2

Decompress, patch, re-compile w/IOMMU=on.

# tar jxf linux-3.7.10.tar.bz2 ; ln -s linux-3.7.10 linux   
# cd linux; patch -p1 < ../marvell_ghost_funcs.patch
patching file drivers/iommu/intel-iommu.c
Hunk #1 succeeded at 1672 (offset -2 lines).
Hunk #2 succeeded at 1729 (offset -2 lines).
Hunk #3 succeeded at 3833 (offset -2 lines).
patching file drivers/pci/quirks.c
Hunk #1 succeeded at 3210 (offset -39 lines).
Hunk #2 succeeded at 3240 (offset -39 lines).
Hunk #3 succeeded at 3258 (offset -39 lines).
patching file include/linux/pci.h
Hunk #1 succeeded at 1546 (offset -32 lines).
Hunk #2 succeeded at 1555 (offset -32 lines).
patching file include/linux/pci_ids.h

Reboot, re-test.

# lilo
Added 3.7.7-1
Added 3.7.10-5-ioff
Added 3.7.10-7    (iommu=off w/patch) = OK
Added 3.7.10-8  * (iommu=on w/patch) = OK

dmesg w/patch + iommu
http://home.comcast.net/~jpiszcz/20130304/dmesg-success-patch.txt

Thanks!

Justin.


Alex Williamson - March 6, 2013, 4:04 a.m.
On Fri, 2013-03-01 at 16:26 +0800, Andrew Cooks wrote:
> This is my third submitted patch to make Marvell 88SE91xx SATA controllers work when IOMMU is enabled.[1][2]
> 
> What's changed:
> * Adopt David Woodhouse's terminology by referring to the quirky functions as 'ghost' functions.
> * Unmap ghost functions when device is detached from IOMMU.
> * Stub function for when CONFIG_PCI_QUIRKS is not enabled.
> 
> The bad:
> * Still no AMD support.
> * The table of affected chip IDs is as complete as I can make it by googling for bug reports.
> 
> This patch was generated against commit b0af9cd9aab60ceb17d3ebabb9fdf4ff0a99cf50, but will also apply cleanly to 3.7.10.
> 
> Bug reports:
> 1. https://bugzilla.redhat.com/show_bug.cgi?id=757166
> 2. https://bugzilla.kernel.org/show_bug.cgi?id=42679
> 
> Signed-off-by: Andrew Cooks <acooks@gmail.com>
> ---
>  drivers/iommu/intel-iommu.c |   50 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/pci/quirks.c        |   47 +++++++++++++++++++++++++++++++++++++++-
>  include/linux/pci.h         |    5 ++++
>  include/linux/pci_ids.h     |    1 +
>  4 files changed, 102 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0099667..13323f2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1674,6 +1674,50 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment,
>  	return 0;
>  }
>  
> +/* For quirky devices like Marvell 88SE91xx chips that use ghost functions. */
> +static int map_ghost_dma_fn(struct dmar_domain *domain,
> +		struct pci_dev *pdev,
> +		int translation)
> +{
> +	u8 fn, fn_map;
> +	int err = 0;
> +
> +	fn_map = pci_get_dma_source_map(pdev);

if (!fn_map)
	return 0;

> +
> +	for (fn = 1; fn < 8; fn++) {

Wouldn't you want to do 0 to 7, then add:

if (fn == PCI_FUNC(pdev->devfn))
	continue;

You could also get more creative with the loop using shifts and exit
when the remaining map is 0.

> +		if (fn_map & (1 << fn)) {
> +			err = domain_context_mapping_one(domain,
> +					pci_domain_nr(pdev->bus),
> +					pdev->bus->number,
> +					PCI_DEVFN(PCI_SLOT(pdev->devfn), fn),
> +					translation);
> +			if (err)
> +				return err;

I'd be worried that there's missing cleanup here, what if you were
mapping multiple ghost functions and the 2nd one failed, leaving one
attached?

> +			dev_dbg(&pdev->dev, "dma quirk; func %d mapped", fn);
> +		}
> +	}
> +	return 0;
> +}
> +
> +static void iommu_detach_dev(struct intel_iommu *iommu, u8 bus, u8 devfn);
> +
> +static void unmap_ghost_dma_fn(struct intel_iommu *iommu,
> +		struct pci_dev *pdev)
> +{
> +	u8 fn, fn_map;
> +
> +	fn_map = pci_get_dma_source_map(pdev);
> +
> +	for (fn = 1; fn < 8; fn++) {

Same early exit and loop comments as above.

> +		if (fn_map & (1 << fn)) {
> +			iommu_detach_dev(iommu,
> +					pdev->bus->number,
> +					PCI_DEVFN(PCI_SLOT(pdev->devfn), fn));
> +			dev_dbg(&pdev->dev, "dma quirk; func %d unmapped", fn);
> +		}
> +	}
> +}
> +
>  static int
>  domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev,
>  			int translation)
> @@ -1687,6 +1731,11 @@ domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev,
>  	if (ret)
>  		return ret;
>  
> +	/* quirk for undeclared/ghost pci functions */
> +	ret = map_ghost_dma_fn(domain, pdev, translation);
> +	if (ret)
> +		return ret;
> +
>  	/* dependent device mapping */
>  	tmp = pci_find_upstream_pcie_bridge(pdev);
>  	if (!tmp)
> @@ -3786,6 +3835,7 @@ static void domain_remove_one_dev_info(struct dmar_domain *domain,
>  			iommu_disable_dev_iotlb(info);
>  			iommu_detach_dev(iommu, info->bus, info->devfn);
>  			iommu_detach_dependent_devices(iommu, pdev);
> +			unmap_ghost_dma_fn(iommu, pdev);
>  			free_devinfo_mem(info);
>  
>  			spin_lock_irqsave(&device_domain_lock, flags);
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 0369fb6..d311100 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3249,6 +3249,10 @@ static struct pci_dev *pci_func_0_dma_source(struct pci_dev *dev)
>  	return pci_get_slot(dev->bus, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
>  }
>  
> +/* Table of source functions for real devices. The DMA requests for the
> + * device are tagged with a different real function as source. This is
> + * relevant to multifunction devices.
> + */
>  static const struct pci_dev_dma_source {
>  	u16 vendor;
>  	u16 device;
> @@ -3275,7 +3279,8 @@ static const struct pci_dev_dma_source {
>   * the device doing the DMA, but sometimes hardware is broken and will
>   * tag the DMA as being sourced from a different device.  This function
>   * allows that translation.  Note that the reference count of the
> - * returned device is incremented on all paths.
> + * returned device is incremented on all paths. Translation is done when
> + * the device is added to an IOMMU group.
>   */
>  struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
>  {
> @@ -3292,6 +3297,46 @@ struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
>  	return pci_dev_get(dev);
>  }
>  
> +/* Table of multiple (ghost) source functions. This is similar to the
> + * translated sources above, but with the following differences:
> + * 1. the device may use multiple functions as DMA sources,
> + * 2. these functions cannot be assumed to be actual devices,
> + * 3. the specific ghost function for a request can not be exactly predicted.
> + * The bitmap only contains the additional quirk functions.
> + */
> +static const struct pci_dev_dma_multi_func_sources {
> +	u16 vendor;
> +	u16 device;
> +	u8 func_map;	/* bit map. lsb is fn 0. */
> +} pci_dev_dma_multi_func_sources[] = {
> +	{ PCI_VENDOR_ID_MARVELL_2, 0x9123, (1<<0)|(1<<1)},
> +	{ PCI_VENDOR_ID_MARVELL_2, 0x9125, (1<<0)|(1<<1)},
> +	{ PCI_VENDOR_ID_MARVELL_2, 0x9128, (1<<0)|(1<<1)},
> +	{ PCI_VENDOR_ID_MARVELL_2, 0x9130, (1<<0)|(1<<1)},
> +	{ PCI_VENDOR_ID_MARVELL_2, 0x9172, (1<<0)|(1<<1)},

Links to bug reports in the comments might be useful for future
workarounds.  I'm also not sure what you mean in the 3rd bullet, the
non-ghost function of some of these is sometimes 0, sometimes 1?  And
the ghost function is the other?  Skipping fn 0 above, I assume all
cases are fn 0 exists and fn 1 is the ghost function, right?  If so,
then we probably only want bit 1 set.  I'm afraid to ask whether there
are configurations of this vendor/device that have a fn 1.  Thanks,

Alex

> +	{ 0 }
> +};
> +
> +/*
> + * The mapping of fake/ghost functions is used when the real device is
> + * attached to an IOMMU domain. IOMMU groups are not aware of these
> + * functions, because they're not real devices.
> + */
> +u8 pci_get_dma_source_map(struct pci_dev *dev)
> +{
> +	const struct pci_dev_dma_multi_func_sources *i;
> +
> +	for (i = pci_dev_dma_multi_func_sources; i->func_map; i++) {
> +		if ((i->vendor == dev->vendor ||
> +		     i->vendor == (u16)PCI_ANY_ID) &&
> +		    (i->device == dev->device ||
> +		     i->device == (u16)PCI_ANY_ID)) {
> +			return i->func_map;
> +		}
> +	}
> +	return 0;
> +}
> +
>  static const struct pci_dev_acs_enabled {
>  	u16 vendor;
>  	u16 device;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2461033..5ad3822 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1578,6 +1578,7 @@ enum pci_fixup_pass {
>  #ifdef CONFIG_PCI_QUIRKS
>  void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev);
>  struct pci_dev *pci_get_dma_source(struct pci_dev *dev);
> +u8 pci_get_dma_source_map(struct pci_dev *dev);
>  int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
>  #else
>  static inline void pci_fixup_device(enum pci_fixup_pass pass,
> @@ -1586,6 +1587,10 @@ static inline struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
>  {
>  	return pci_dev_get(dev);
>  }
> +u8 pci_get_dma_source_map(struct pci_dev *dev)
> +{
> +	return 0;
> +}
>  static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
>  					       u16 acs_flags)
>  {
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index f11c1c2..df57496 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -1604,6 +1604,7 @@
>  #define PCI_SUBDEVICE_ID_KEYSPAN_SX2	0x5334
>  
>  #define PCI_VENDOR_ID_MARVELL		0x11ab
> +#define PCI_VENDOR_ID_MARVELL_2	0x1b4b
>  #define PCI_DEVICE_ID_MARVELL_GT64111	0x4146
>  #define PCI_DEVICE_ID_MARVELL_GT64260	0x6430
>  #define PCI_DEVICE_ID_MARVELL_MV64360	0x6460



Andrew Cooks - March 6, 2013, 6:59 a.m.
On Wed, Mar 6, 2013 at 12:04 PM, Alex Williamson
<alex.williamson@redhat.com> wrote:
> On Fri, 2013-03-01 at 16:26 +0800, Andrew Cooks wrote:
>
>> +
>> +     for (fn = 1; fn < 8; fn++) {
>
> Wouldn't you want to do 0 to 7, then add:
>
> if (fn == PCI_FUNC(pdev->devfn))
>         continue;
>
> You could also get more creative with the loop using shifts and exit
> when the remaining map is 0.

Thanks, I'll use a shift instead.

Up to now I've assumed that function 0 will always be the real device
and that function 1-7 may be ghost functions, but as we saw in the
case of the Marvell 88NV9143 in the Super Talent CoreStore MV 64GB
mini-PCIe SSD, that assumption is probably wrong.

To handle the case where the real device is function 1 and function 0
needs to be mapped as a ghost function, would it be acceptable to
iterate over 0 to 7 and let domain_context_mapping_one take care of
preventing duplicates, or should I duplicate the
device_to_context_entry and context_present function calls?
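
One possible shape for the shift-based loop discussed above, written as
a user-space sketch rather than as the revised patch. The names
for_each_ghost_fn(), ghost_fn_cb and record_fn() are hypothetical; the
callback stands in for domain_context_mapping_one(). The sketch skips
the device's own function explicitly, which is only one of the two
options raised in the question above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type standing in for domain_context_mapping_one(). */
typedef int (*ghost_fn_cb)(uint8_t fn, void *ctx);

/*
 * Walk a ghost-function bitmap (bit 0 is function 0) by shifting it
 * right, and stop as soon as no set bits remain. The device's own
 * function is skipped because the normal path already maps it.
 * Returns the first non-zero callback result, or 0 on success.
 */
static int for_each_ghost_fn(uint8_t fn_map, uint8_t own_fn,
			     ghost_fn_cb cb, void *ctx)
{
	uint8_t fn;
	int err;

	assert(own_fn < 8);

	for (fn = 0; fn_map; fn++, fn_map >>= 1) {
		if (!(fn_map & 1) || fn == own_fn)
			continue;
		err = cb(fn, ctx);
		if (err)
			return err;
	}
	return 0;
}

/* Example callback: records each visited function number in a bitmask. */
static int record_fn(uint8_t fn, void *ctx)
{
	*(uint8_t *)ctx |= (uint8_t)(1u << fn);
	return 0;
}
```

With a map of (1<<0)|(1<<1), a real device at function 0 visits only
function 1, and a real device at function 1 visits only function 0, so
one table entry covers both layouts.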

>
>> +             if (fn_map & (1 << fn)) {
>> +                     err = domain_context_mapping_one(domain,
>> +                                     pci_domain_nr(pdev->bus),
>> +                                     pdev->bus->number,
>> +                                     PCI_DEVFN(PCI_SLOT(pdev->devfn), fn),
>> +                                     translation);
>> +                     if (err)
>> +                             return err;
>
> I'd be worried that there's missing cleanup here, what if you were
> mapping multiple ghost functions and the 2nd one failed, leaving one
> attached?

I don't understand the failure cases sufficiently, but I understand
that it's better to have all mappings succeed or fail together. Will
fix it.
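
A minimal sketch of the all-or-nothing behaviour described above,
assuming a map operation that can fail and an unmap operation that
cannot. struct ghost_ops, map_ghost_fns_atomic() and the toy_* helpers
are hypothetical names for illustration; in the patch the two
operations would correspond to domain_context_mapping_one() and
iommu_detach_dev(). This is not the fix that was eventually written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operations; signatures are illustrative only. */
struct ghost_ops {
	int  (*map)(uint8_t fn, void *ctx);
	void (*unmap)(uint8_t fn, void *ctx);
};

/*
 * Map every function set in fn_map. If any mapping fails, undo the
 * mappings made so far and return the error, so the caller never sees
 * a partially mapped device.
 */
static int map_ghost_fns_atomic(uint8_t fn_map, const struct ghost_ops *ops,
				void *ctx)
{
	uint8_t done = 0;	/* bitmap of functions mapped so far */
	uint8_t fn;
	int err;

	assert(ops && ops->map && ops->unmap);

	for (fn = 0; fn < 8; fn++) {
		if (!(fn_map & (1u << fn)))
			continue;
		err = ops->map(fn, ctx);
		if (err)
			goto rollback;
		done |= (uint8_t)(1u << fn);
	}
	return 0;

rollback:
	for (fn = 0; fn < 8; fn++)
		if (done & (1u << fn))
			ops->unmap(fn, ctx);
	return err;
}

/* Toy backend for demonstration: tracks mapped functions in a bitmap
 * and fails when asked to map function fail_fn (-1 means never fail). */
struct toy_state {
	uint8_t mapped;
	int fail_fn;
};

static int toy_map(uint8_t fn, void *ctx)
{
	struct toy_state *s = ctx;

	if (fn == s->fail_fn)
		return -1;
	s->mapped |= (uint8_t)(1u << fn);
	return 0;
}

static void toy_unmap(uint8_t fn, void *ctx)
{
	struct toy_state *s = ctx;

	s->mapped &= (uint8_t)~(1u << fn);
}
```

Tracking the successfully mapped functions in a second bitmap keeps the
rollback independent of the order in which the map was walked.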

>> +/* Table of multiple (ghost) source functions. This is similar to the
>> + * translated sources above, but with the following differences:
>> + * 1. the device may use multiple functions as DMA sources,
>> + * 2. these functions cannot be assumed to be actual devices,
>> + * 3. the specific ghost function for a request can not be exactly predicted.
>> + * The bitmap only contains the additional quirk functions.
>> + */
>> +static const struct pci_dev_dma_multi_func_sources {
>> +     u16 vendor;
>> +     u16 device;
>> +     u8 func_map;    /* bit map. lsb is fn 0. */
>> +} pci_dev_dma_multi_func_sources[] = {
>> +     { PCI_VENDOR_ID_MARVELL_2, 0x9123, (1<<0)|(1<<1)},
>> +     { PCI_VENDOR_ID_MARVELL_2, 0x9125, (1<<0)|(1<<1)},
>> +     { PCI_VENDOR_ID_MARVELL_2, 0x9128, (1<<0)|(1<<1)},
>> +     { PCI_VENDOR_ID_MARVELL_2, 0x9130, (1<<0)|(1<<1)},
>> +     { PCI_VENDOR_ID_MARVELL_2, 0x9172, (1<<0)|(1<<1)},
>
> Links to bug reports in the comments might be useful for future
> workarounds.  I'm also not sure what you mean in the 3rd bullet, the
> non-ghost function of some of these is sometimes 0, sometimes 1?  And
> the ghost function is the other?

The ghost function is the one that doesn't correspond to an actual
device, but the actual device could be either 0 or 1 and it could use
both 0 and 1 for different requests, with no obvious way to tell when
it will use 0 and when it will use 1. I'll reword the bullet.

> Skipping fn 0 above, I assume all
> cases are fn 0 exists and fn 1 is the ghost function, right?  If so,
> then we probably only want bit 1 set.  I'm afraid to ask whether there
> are configurations of this vendor/device that have a fn 1.

See comment about Marvell 88NV9143 above.

Thanks for reviewing!

a.
Justin Piszcz - March 29, 2013, 2:36 p.m.
On Sun, Mar 3, 2013 at 8:35 PM, Andrew Cooks <acooks@gmail.com> wrote:
> On Sat, Mar 2, 2013 at 7:18 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>>
>> Against a clean 3.7.10 (from ftp.kernel.org)
>>
>> # patch -p1 <
>> ../patch/RFC-Fix-Intel-IOMMU-support-for-Marvell-88SE91xx-SATA-controllers..
>> patch
>> patching file drivers/iommu/intel-iommu.c
>> patching file drivers/pci/quirks.c
>> Hunk #1 succeeded at 3230 (offset 3 lines).
>> patching file include/linux/pci.h
>> # pwd
>> /usr/src/linux-3.7.10
>>
>
> I've downloaded and patched the 3.7.10 tarball and still get the same
> output I got before; different output from yours. I'm not sure the
> patch is complete or applying correctly, are you?
> Could you please check whether the patch you're applying is the same

Hi,

As this patch has now been working for some time (against 3.7.x), I was
wondering when it will be included in mainline.

I upgraded to 3.8.x and rebooted, the same problem recurred, and I had
to revert to 3.7.x.

Thanks,

Justin.

Patch

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0099667..13323f2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1674,6 +1674,50 @@  static int domain_context_mapping_one(struct dmar_domain *domain, int segment,
 	return 0;
 }
 
+/* For quirky devices like Marvell 88SE91xx chips that use ghost functions. */
+static int map_ghost_dma_fn(struct dmar_domain *domain,
+		struct pci_dev *pdev,
+		int translation)
+{
+	u8 fn, fn_map;
+	int err = 0;
+
+	fn_map = pci_get_dma_source_map(pdev);
+
+	for (fn = 1; fn < 8; fn++) {
+		if (fn_map & (1 << fn)) {
+			err = domain_context_mapping_one(domain,
+					pci_domain_nr(pdev->bus),
+					pdev->bus->number,
+					PCI_DEVFN(PCI_SLOT(pdev->devfn), fn),
+					translation);
+			if (err)
+				return err;
+			dev_dbg(&pdev->dev, "dma quirk; func %d mapped", fn);
+		}
+	}
+	return 0;
+}
+
+static void iommu_detach_dev(struct intel_iommu *iommu, u8 bus, u8 devfn);
+
+static void unmap_ghost_dma_fn(struct intel_iommu *iommu,
+		struct pci_dev *pdev)
+{
+	u8 fn, fn_map;
+
+	fn_map = pci_get_dma_source_map(pdev);
+
+	for (fn = 1; fn < 8; fn++) {
+		if (fn_map & (1 << fn)) {
+			iommu_detach_dev(iommu,
+					pdev->bus->number,
+					PCI_DEVFN(PCI_SLOT(pdev->devfn), fn));
+			dev_dbg(&pdev->dev, "dma quirk; func %d unmapped", fn);
+		}
+	}
+}
+
 static int
 domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev,
 			int translation)
@@ -1687,6 +1731,11 @@  domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev,
 	if (ret)
 		return ret;
 
+	/* quirk for undeclared/ghost pci functions */
+	ret = map_ghost_dma_fn(domain, pdev, translation);
+	if (ret)
+		return ret;
+
 	/* dependent device mapping */
 	tmp = pci_find_upstream_pcie_bridge(pdev);
 	if (!tmp)
@@ -3786,6 +3835,7 @@  static void domain_remove_one_dev_info(struct dmar_domain *domain,
 			iommu_disable_dev_iotlb(info);
 			iommu_detach_dev(iommu, info->bus, info->devfn);
 			iommu_detach_dependent_devices(iommu, pdev);
+			unmap_ghost_dma_fn(iommu, pdev);
 			free_devinfo_mem(info);
 
 			spin_lock_irqsave(&device_domain_lock, flags);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0369fb6..d311100 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3249,6 +3249,10 @@  static struct pci_dev *pci_func_0_dma_source(struct pci_dev *dev)
 	return pci_get_slot(dev->bus, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
 }
 
+/* Table of source functions for real devices. The DMA requests for the
+ * device are tagged with a different real function as source. This is
+ * relevant to multifunction devices.
+ */
 static const struct pci_dev_dma_source {
 	u16 vendor;
 	u16 device;
@@ -3275,7 +3279,8 @@  static const struct pci_dev_dma_source {
  * the device doing the DMA, but sometimes hardware is broken and will
  * tag the DMA as being sourced from a different device.  This function
  * allows that translation.  Note that the reference count of the
- * returned device is incremented on all paths.
+ * returned device is incremented on all paths. Translation is done when
+ * the device is added to an IOMMU group.
  */
 struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
 {
@@ -3292,6 +3297,46 @@  struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
 	return pci_dev_get(dev);
 }
 
+/* Table of multiple (ghost) source functions. This is similar to the
+ * translated sources above, but with the following differences:
+ * 1. the device may use multiple functions as DMA sources,
+ * 2. these functions cannot be assumed to be actual devices,
+ * 3. the specific ghost function for a request can not be exactly predicted.
+ * The bitmap only contains the additional quirk functions.
+ */
+static const struct pci_dev_dma_multi_func_sources {
+	u16 vendor;
+	u16 device;
+	u8 func_map;	/* bit map. lsb is fn 0. */
+} pci_dev_dma_multi_func_sources[] = {
+	{ PCI_VENDOR_ID_MARVELL_2, 0x9123, (1<<0)|(1<<1)},
+	{ PCI_VENDOR_ID_MARVELL_2, 0x9125, (1<<0)|(1<<1)},
+	{ PCI_VENDOR_ID_MARVELL_2, 0x9128, (1<<0)|(1<<1)},
+	{ PCI_VENDOR_ID_MARVELL_2, 0x9130, (1<<0)|(1<<1)},
+	{ PCI_VENDOR_ID_MARVELL_2, 0x9172, (1<<0)|(1<<1)},
+	{ 0 }
+};
+
+/*
+ * The mapping of fake/ghost functions is used when the real device is
+ * attached to an IOMMU domain. IOMMU groups are not aware of these
+ * functions, because they're not real devices.
+ */
+u8 pci_get_dma_source_map(struct pci_dev *dev)
+{
+	const struct pci_dev_dma_multi_func_sources *i;
+
+	for (i = pci_dev_dma_multi_func_sources; i->func_map; i++) {
+		if ((i->vendor == dev->vendor ||
+		     i->vendor == (u16)PCI_ANY_ID) &&
+		    (i->device == dev->device ||
+		     i->device == (u16)PCI_ANY_ID)) {
+			return i->func_map;
+		}
+	}
+	return 0;
+}
+
 static const struct pci_dev_acs_enabled {
 	u16 vendor;
 	u16 device;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2461033..5ad3822 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1578,6 +1578,7 @@  enum pci_fixup_pass {
 #ifdef CONFIG_PCI_QUIRKS
 void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev);
 struct pci_dev *pci_get_dma_source(struct pci_dev *dev);
+u8 pci_get_dma_source_map(struct pci_dev *dev);
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
 #else
 static inline void pci_fixup_device(enum pci_fixup_pass pass,
@@ -1586,6 +1587,10 @@  static inline struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
 {
 	return pci_dev_get(dev);
 }
+static inline u8 pci_get_dma_source_map(struct pci_dev *dev)
+{
+	return 0;
+}
 static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
 					       u16 acs_flags)
 {
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index f11c1c2..df57496 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -1604,6 +1604,7 @@ 
 #define PCI_SUBDEVICE_ID_KEYSPAN_SX2	0x5334
 
 #define PCI_VENDOR_ID_MARVELL		0x11ab
+#define PCI_VENDOR_ID_MARVELL_2	0x1b4b
 #define PCI_DEVICE_ID_MARVELL_GT64111	0x4146
 #define PCI_DEVICE_ID_MARVELL_GT64260	0x6430
 #define PCI_DEVICE_ID_MARVELL_MV64360	0x6460