[RFC/RFT] Add noats flag to boot parameters

Message ID 1525025808-2365-1-git-send-email-gilkup@cs.technion.ac.il
State Accepted
Delegated to: Bjorn Helgaas

Commit Message

Gil Kupfer April 29, 2018, 6:16 p.m. UTC
This patch adds a noats option to the pci boot parameter.
When noats is selected, all ATS-related functions fail immediately and
the IOMMU is configured not to use the device-IOTLB.

Any function that checks for ATS capabilities directly against the
devices should also check this flag. (Currently, such functions exist
only in IOMMU drivers, and they are covered by this patch.)

The motivation behind this patch is the existence of malicious devices.
Lots of research has been done on how to utilize the IOMMU as
protection against such devices. When ATS is supported, any I/O device can
access any physical address by faking device-IOTLB entries.
Adding the ability to ignore these entries lets sysadmins enhance system
security.

Signed-off-by: Gil Kupfer <gilkup@cs.technion.ac.il>

---
This patch is intended to add the ability to disable ATS at boot time.

My IOMMU has the ATS ecap but I don't have any PCI device that supports
ATS so I can't fully test it. However, I did run it (with and without the
new boot flag) on QEMU with a virtualized IOMMU with the device-iotlb flag
and it seems that at least the machine does not crash.

QEMU version:
	master branch from Jul 11 2017
	commit aa916e409c04 ("Merge 29741be b876804")
---
 Documentation/admin-guide/kernel-parameters.txt |  2 ++
 drivers/iommu/amd_iommu.c                       | 11 ++++++++---
 drivers/iommu/intel-iommu.c                     |  3 ++-
 drivers/pci/ats.c                               |  3 +++
 drivers/pci/pci.c                               |  7 +++++++
 include/linux/pci.h                             |  2 ++
 6 files changed, 24 insertions(+), 4 deletions(-)
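
A minimal userspace sketch of the gating described in the commit message, for illustration only. It assumes that the enable path rejects a device whose cached ATS capability offset is zero, so skipping the capability lookup at init time is enough to make later enable calls fail. The `model_*` names and `struct model_dev` are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PCI device; not the kernel's struct pci_dev. */
struct model_dev {
	int hw_ats_offset;	/* offset the hardware advertises, 0 if no ATS */
	int ats_cap;		/* cached offset, filled in by model_ats_init() */
	bool ats_enabled;
};

/* Stand-in for the boot-time flag set by "pci=noats". */
static bool model_ats_disabled;

/* Mirrors the early return the patch adds to the ATS init path. */
static void model_ats_init(struct model_dev *dev)
{
	assert(dev != NULL);

	if (model_ats_disabled)
		return;		/* capability is never cached */

	dev->ats_cap = dev->hw_ats_offset;
}

/* Assumed behavior: enabling fails when no capability was cached. */
static int model_enable_ats(struct model_dev *dev)
{
	assert(dev != NULL);

	if (!dev->ats_cap)
		return -EINVAL;

	dev->ats_enabled = true;
	return 0;
}
```

With the flag set before `model_ats_init()` runs, `model_enable_ats()` returns `-EINVAL` even for a device that advertises ATS. This is the sense in which ATS-related functions "fail immediately".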

Comments

Joerg Roedel May 3, 2018, 1:35 p.m. UTC | #1
On Sun, Apr 29, 2018 at 09:16:48PM +0300, Gil Kupfer wrote:
> This patch adds noats option to the pci boot parameter.
> When noats is selected, all ATS related functions fail immediately and
> the IOMMU is configured to not use device-iotlb.
> 
> Any function that checks for ATS capabilities directly against the
> devices should also check this flag. (Currently, such functions exist
> only in IOMMU drivers, and they are covered by this patch.)
> 
> The motivation behind this patch is the existence of malicious devices.
> Lots of research has been done on how to utilize the IOMMU as
> protection against such devices. When ATS is supported, any I/O device can
> access any physical address by faking device-IOTLB entries.
> Adding the ability to ignore these entries lets sysadmins enhance system
> security.
> 
> Signed-off-by: Gil Kupfer <gilkup@cs.technion.ac.il>

This has also been on my list, thanks for doing that.

Acked-by: Joerg Roedel <jroedel@suse.de>
Sinan Kaya May 3, 2018, 1:46 p.m. UTC | #2
On 5/3/2018 9:35 AM, Joerg Roedel wrote:
> On Sun, Apr 29, 2018 at 09:16:48PM +0300, Gil Kupfer wrote:
>> This patch adds noats option to the pci boot parameter.
>> When noats is selected, all ATS related functions fail immediately and
>> the IOMMU is configured to not use device-iotlb.
>>
>> Any function that checks for ATS capabilities directly against the
>> devices should also check this flag. (Currently, such functions exist
>> only in IOMMU drivers, and they are covered by this patch.)
>>
>> The motivation behind this patch is the existence of malicious devices.
>> Lots of research has been done on how to utilize the IOMMU as
>> protection against such devices. When ATS is supported, any I/O device can
>> access any physical address by faking device-IOTLB entries.
>> Adding the ability to ignore these entries lets sysadmins enhance system
>> security.
>>
>> Signed-off-by: Gil Kupfer <gilkup@cs.technion.ac.il>
> 
> This has also been on my list, thanks for doing that.
> 
> Acked-by: Joerg Roedel <jroedel@suse.de>
> 

I also like the idea in general.
Minor nit..

Shouldn't this be an iommu parameter rather than a PCI kernel command line parameter?
We now have an iommu.passthrough argument that prevents page translation.

Doesn't this fit into the same category, especially since it is the IOMMU drivers,
not the PCI drivers, that call the ATS functions to enable it?
Joerg Roedel May 3, 2018, 1:59 p.m. UTC | #3
On Thu, May 03, 2018 at 09:46:34AM -0400, Sinan Kaya wrote:
> I also like the idea in general.
> Minor nit..
> 
> Shouldn't this be an iommu parameter rather than a PCI kernel command line parameter?
> We now have an iommu.passthrough argument that prevents page translation.
> 
> Doesn't this fit into the same category especially when it is the IOMMU drivers that
> call ATS functions for enablement not the PCI drivers.

ATS is a bit of a grey area between PCI and IOMMU, but since ATS is
PCI-specific and the code to enable/disable it is in PCI as well, I
think the parameter makes sense for PCI too.


	Joerg
Sinan Kaya May 3, 2018, 2:23 p.m. UTC | #4
+Bjorn,

On 5/3/2018 9:59 AM, Joerg Roedel wrote:
> On Thu, May 03, 2018 at 09:46:34AM -0400, Sinan Kaya wrote:
>> I also like the idea in general.
>> Minor nit..
>>
>> Shouldn't this be an iommu parameter rather than a PCI kernel command line parameter?
>> We now have an iommu.passthrough argument that prevents page translation.
>>
>> Doesn't this fit into the same category especially when it is the IOMMU drivers that
>> call ATS functions for enablement not the PCI drivers.
> 
> ATS is a bit of a grey area between PCI and IOMMU, but since ATS is
> PCI-specific and the code to enable/disable it is in PCI as well, I
> think the parameter makes sense for PCI too.
> 

OK. Bjorn was interested in having command-line-driven feature enables in the
drivers/pci directory, with a bitmask bit for each optional PCI spec capability,
rather than a separate noXYZ flag per feature.

This would let us turn off all optional features when troubleshooting code
breakage or during platform bring-up.

Sounds like this would be a good match for that work.


> 
> 	Joerg
> 
>
Nadav Amit May 3, 2018, 10:15 p.m. UTC | #5
Sinan Kaya <okaya@codeaurora.org> wrote:

> +Bjorn,
> 
> On 5/3/2018 9:59 AM, Joerg Roedel wrote:
>> On Thu, May 03, 2018 at 09:46:34AM -0400, Sinan Kaya wrote:
>>> I also like the idea in general.
>>> Minor nit..
>>> 
>>> Shouldn't this be an iommu parameter rather than a PCI kernel command line parameter?
>>> We now have an iommu.passthrough argument that prevents page translation.
>>> 
>>> Doesn't this fit into the same category especially when it is the IOMMU drivers that
>>> call ATS functions for enablement not the PCI drivers.
>> 
>> ATS is a bit of a grey area between PCI and IOMMU, but since ATS is
>> PCI-specific and the code to enable/disable it is in PCI as well, I
>> think the parameter makes sense for PCI too.
> 
> OK. Bjorn was interested in having a command line driven feature enables in driver/pci
> directory with bitmasks for each optional PCI spec capability rather than noXYZ feature.
> 
> This would allow us to troubleshoot code breakage as well as the platform bring up to
> turn off all optional features.
> 
> Sounds like this would be a good match for that work.

I think that since this feature (ATS) has security implications, it should
be controllable through the kernel boot parameters. Otherwise, it can be
potentially too late to turn it off.

Regards,
Nadav
Bjorn Helgaas May 3, 2018, 10:52 p.m. UTC | #6
On Thu, May 03, 2018 at 10:23:02AM -0400, Sinan Kaya wrote:
> +Bjorn,
> 
> On 5/3/2018 9:59 AM, Joerg Roedel wrote:
> > On Thu, May 03, 2018 at 09:46:34AM -0400, Sinan Kaya wrote:
> >> I also like the idea in general.
> >> Minor nit..
> >>
> >> Shouldn't this be an iommu parameter rather than a PCI kernel command line parameter?
> >> We now have an iommu.passthrough argument that prevents page translation.
> >>
> >> Doesn't this fit into the same category especially when it is the IOMMU drivers that
> >> call ATS functions for enablement not the PCI drivers.
> > 
> > ATS is a bit of a grey area between PCI and IOMMU, but since ATS is
> > PCI-specific and the code to enable/disable it is in PCI as well, I
> > think the parameter makes sense for PCI too.
> 
> OK. Bjorn was interested in having a command line driven feature enables
> in driver/pci directory with bitmasks for each optional PCI spec
> capability rather than noXYZ feature.

It's true that I try to avoid adding *any* kernel parameters as much
as possible because they're usually not practical for end-users.

I think it's unreasonable to expect users to use "pci=" parameters
based on what specific hardware they have.  That's too hard to
discover and too hard to use.  I did wonder about a "pci=safe"
parameter that would disable potentially risky features just as a
debugging feature [1].

This ATS case is a security question and the parameter is not
something that would have to be used to get certain hardware to work,
so I think it's probably reasonable to add.  I would maybe expand the
documentation so it includes the reason somebody might want it, i.e.,
to defend against malicious PCIe devices.

A parameter using bitmasks could be conceivable for developers but
sounds too unwieldy for end-users.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=196197#c53
Bjorn Helgaas May 10, 2018, 11:09 p.m. UTC | #7
On Sun, Apr 29, 2018 at 09:16:48PM +0300, Gil Kupfer wrote:
> This patch adds noats option to the pci boot parameter.
> When noats is selected, all ATS related functions fail immediately and
> the IOMMU is configured to not use device-iotlb.
> 
> Any function that checks for ATS capabilities directly against the
> devices should also check this flag. (Currently, such functions exist
> only in IOMMU drivers, and they are covered by this patch.)
> 
> The motivation behind this patch is the existence of malicious devices.
> Lots of research has been done on how to utilize the IOMMU as
> protection against such devices. When ATS is supported, any I/O device can
> access any physical address by faking device-IOTLB entries.
> Adding the ability to ignore these entries lets sysadmins enhance system
> security.
> 
> Signed-off-by: Gil Kupfer <gilkup@cs.technion.ac.il>

Applied with Joerg's ack to pci/virtualization for v4.18, thanks!

> ---
> This patch is intended to add the ability to disable ATS at boot time.
> 
> My IOMMU has the ATS ecap but I don't have any PCI device that supports
> ATS so I can't fully test it. However, I did run it (with and without the
> new boot flag) on QEMU with a virtualized IOMMU with the device-iotlb flag
> and it seems that at least the machine does not crash.
> 
> QEMU version:
> 	master branch from Jul 11 2017
> 	commit aa916e409c04 ("Merge 29741be b876804")
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  2 ++
>  drivers/iommu/amd_iommu.c                       | 11 ++++++++---
>  drivers/iommu/intel-iommu.c                     |  3 ++-
>  drivers/pci/ats.c                               |  3 +++
>  drivers/pci/pci.c                               |  7 +++++++
>  include/linux/pci.h                             |  2 ++
>  6 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 7737ab5..f443362 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3000,6 +3000,8 @@
>  		pcie_scan_all	Scan all possible PCIe devices.  Otherwise we
>  				only look for one device below a PCIe downstream
>  				port.
> +		noats		[PCIE, Intel-IOMMU, AMD-IOMMU]
> +				do not use PCIe ATS (and IOMMU device-iotlb).
>  
>  	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
>  			Management.
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 0f1219f..2aa757e 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -388,6 +388,9 @@ static bool pci_iommuv2_capable(struct pci_dev *pdev)
>  	};
>  	int i, pos;
>  
> +	if (is_pcie_ats_disabled())
> +		return false;
> +
>  	for (i = 0; i < 3; ++i) {
>  		pos = pci_find_ext_capability(pdev, caps[i]);
>  		if (pos == 0)
> @@ -3602,9 +3605,11 @@ int amd_iommu_device_info(struct pci_dev *pdev,
>  
>  	memset(info, 0, sizeof(*info));
>  
> -	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
> -	if (pos)
> -		info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
> +	if (!is_pcie_ats_disabled()) {
> +		pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
> +		if (pos)
> +			info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
> +	}
>  
>  	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
>  	if (pos)
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index fc2765c..7ac4adc 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2434,7 +2434,8 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>  	if (dev && dev_is_pci(dev)) {
>  		struct pci_dev *pdev = to_pci_dev(info->dev);
>  
> -		if (ecap_dev_iotlb_support(iommu->ecap) &&
> +		if (!is_pcie_ats_disabled() &&
> +		    ecap_dev_iotlb_support(iommu->ecap) &&
>  		    pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS) &&
>  		    dmar_find_matched_atsr_unit(pdev))
>  			info->ats_supported = 1;
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index eeb9fb2..619024d 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -21,6 +21,9 @@ void pci_ats_init(struct pci_dev *dev)
>  {
>  	int pos;
>  
> +	if (is_pcie_ats_disabled())
> +		return;
> +
>  	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
>  	if (!pos)
>  		return;
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 563901c..eb77590 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -109,6 +109,10 @@ unsigned int pcibios_max_latency = 255;
>  /* If set, the PCIe ARI capability will not be used. */
>  static bool pcie_ari_disabled;
>  
> +/* If set, the PCIe ATS capability will not be used. */
> +static bool pcie_ats_disabled;
> +bool is_pcie_ats_disabled(void) { return pcie_ats_disabled; }
> +
>  /* Disable bridge_d3 for all PCIe ports */
>  static bool pci_bridge_d3_disable;
>  /* Force bridge_d3 for all PCIe ports */
> @@ -5430,6 +5434,9 @@ static int __init pci_setup(char *str)
>  		if (*str && (str = pcibios_setup(str)) && *str) {
>  			if (!strcmp(str, "nomsi")) {
>  				pci_no_msi();
> +			} else if (!strncmp(str, "noats", 5)) {
> +				pr_info("PCIe: ATS is disabled\n");
> +				pcie_ats_disabled = true;
>  			} else if (!strcmp(str, "noaer")) {
>  				pci_no_aer();
>  			} else if (!strncmp(str, "realloc=", 8)) {
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 8039f9f..58fe5fb 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1420,6 +1420,8 @@ int  ht_create_irq(struct pci_dev *dev, int idx);
>  void ht_destroy_irq(unsigned int irq);
>  #endif /* CONFIG_HT_IRQ */
>  
> +bool is_pcie_ats_disabled(void);
> +
>  #ifdef CONFIG_PCI_ATS
>  /* Address Translation Service */
>  void pci_ats_init(struct pci_dev *dev);
> -- 
> 2.7.4
>
Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7737ab5..f443362 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3000,6 +3000,8 @@ 
 		pcie_scan_all	Scan all possible PCIe devices.  Otherwise we
 				only look for one device below a PCIe downstream
 				port.
+		noats		[PCIE, Intel-IOMMU, AMD-IOMMU]
+				do not use PCIe ATS (and IOMMU device-iotlb).
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 0f1219f..2aa757e 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -388,6 +388,9 @@  static bool pci_iommuv2_capable(struct pci_dev *pdev)
 	};
 	int i, pos;
 
+	if (is_pcie_ats_disabled())
+		return false;
+
 	for (i = 0; i < 3; ++i) {
 		pos = pci_find_ext_capability(pdev, caps[i]);
 		if (pos == 0)
@@ -3602,9 +3605,11 @@  int amd_iommu_device_info(struct pci_dev *pdev,
 
 	memset(info, 0, sizeof(*info));
 
-	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
-	if (pos)
-		info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
+	if (!is_pcie_ats_disabled()) {
+		pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
+		if (pos)
+			info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
+	}
 
 	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
 	if (pos)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index fc2765c..7ac4adc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2434,7 +2434,8 @@  static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	if (dev && dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(info->dev);
 
-		if (ecap_dev_iotlb_support(iommu->ecap) &&
+		if (!is_pcie_ats_disabled() &&
+		    ecap_dev_iotlb_support(iommu->ecap) &&
 		    pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS) &&
 		    dmar_find_matched_atsr_unit(pdev))
 			info->ats_supported = 1;
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index eeb9fb2..619024d 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -21,6 +21,9 @@  void pci_ats_init(struct pci_dev *dev)
 {
 	int pos;
 
+	if (is_pcie_ats_disabled())
+		return;
+
 	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
 	if (!pos)
 		return;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 563901c..eb77590 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -109,6 +109,10 @@  unsigned int pcibios_max_latency = 255;
 /* If set, the PCIe ARI capability will not be used. */
 static bool pcie_ari_disabled;
 
+/* If set, the PCIe ATS capability will not be used. */
+static bool pcie_ats_disabled;
+bool is_pcie_ats_disabled(void) { return pcie_ats_disabled; }
+
 /* Disable bridge_d3 for all PCIe ports */
 static bool pci_bridge_d3_disable;
 /* Force bridge_d3 for all PCIe ports */
@@ -5430,6 +5434,9 @@  static int __init pci_setup(char *str)
 		if (*str && (str = pcibios_setup(str)) && *str) {
 			if (!strcmp(str, "nomsi")) {
 				pci_no_msi();
+			} else if (!strncmp(str, "noats", 5)) {
+				pr_info("PCIe: ATS is disabled\n");
+				pcie_ats_disabled = true;
 			} else if (!strcmp(str, "noaer")) {
 				pci_no_aer();
 			} else if (!strncmp(str, "realloc=", 8)) {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8039f9f..58fe5fb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1420,6 +1420,8 @@  int  ht_create_irq(struct pci_dev *dev, int idx);
 void ht_destroy_irq(unsigned int irq);
 #endif /* CONFIG_HT_IRQ */
 
+bool is_pcie_ats_disabled(void);
+
 #ifdef CONFIG_PCI_ATS
 /* Address Translation Service */
 void pci_ats_init(struct pci_dev *dev);
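
The option matching in the pci_setup() hunk can be tried in userspace with a sketch like the one below. It is a simplified model rather than the kernel's parser: `parse_pci_options()`, `is_ats_disabled()` and `ats_disabled` are hypothetical names, and the loop only approximates how the comma-separated `pci=` string is walked. It keeps the patch's 5-byte prefix comparison, so an option such as `noatsfoo` is accepted as well as `noats`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's static pcie_ats_disabled flag. */
static bool ats_disabled;

/* Userspace model of the accessor added by the patch. */
static bool is_ats_disabled(void)
{
	return ats_disabled;
}

/*
 * Walk a comma-separated "pci=" option string and set the flag when an
 * option starts with "noats". Like the patch, this is a prefix match on
 * the first 5 bytes, not an exact string comparison.
 */
static void parse_pci_options(const char *options)
{
	assert(options != NULL);
	ats_disabled = false;

	while (*options) {
		size_t len = strcspn(options, ",");

		if (len >= 5 && !strncmp(options, "noats", 5))
			ats_disabled = true;

		options += len;
		if (*options == ',')
			options++;
	}
}
```

Calling `parse_pci_options("nomsi,noats")` and then `is_ats_disabled()` reports the flag as set. An exact match with `strcmp()` on each token would be the stricter alternative if trailing characters should be rejected.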