diff mbox

[2/4] IOMMU: Force the VFs of an untrusted PF device to be in the PFs IOMMU group

Message ID 1465474173-53960-3-git-send-email-ilyal@mellanox.com
State Superseded
Headers show

Commit Message

Ilya Lesokhin June 9, 2016, 12:09 p.m. UTC
Add a new PCI_DEV_FLAGS_UNTRUSTED to indicate that a PCI device
is probed by a driver that gives untrusted entities access to that device.
Make iommu_group_get_for_pci_dev check the new flag when an IOMMU
group is selected for a virtual function.
Mark VFIO devices with the new flag.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
---
 drivers/iommu/iommu.c       | 4 ++++
 drivers/vfio/pci/vfio_pci.c | 3 +++
 include/linux/pci.h         | 3 +++
 3 files changed, 10 insertions(+)

Comments

kernel test robot June 9, 2016, 12:56 p.m. UTC | #1
Hi,

[auto build test ERROR on vfio/next]
[also build test ERROR on v4.7-rc2 next-20160609]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/VFIO-SR-IOV-support/20160609-202117
base:   https://github.com/awilliam/linux-vfio.git next
config: s390-default_defconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 5.3.1-8) 5.3.1 20160205
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=s390 

All errors (new ones prefixed by >>):

   drivers/iommu/iommu.c: In function 'pci_device_group':
>> drivers/iommu/iommu.c:758:10: error: 'struct pci_dev' has no member named 'physfn'
        (pdev->physfn->dev_flags & PCI_DEV_FLAGS_UNTRUSTED))
             ^
   drivers/iommu/iommu.c:759:31: error: 'struct pci_dev' has no member named 'physfn'
      return iommu_group_get(&pdev->physfn->dev);
                                  ^

vim +758 drivers/iommu/iommu.c

   752		struct group_for_pci_data data;
   753		struct pci_bus *bus;
   754		struct iommu_group *group = NULL;
   755		u64 devfns[4] = { 0 };
   756		
   757		if (pdev->is_virtfn && 
 > 758		   (pdev->physfn->dev_flags & PCI_DEV_FLAGS_UNTRUSTED))
   759			return iommu_group_get(&pdev->physfn->dev);
   760	
   761		if (WARN_ON(!dev_is_pci(dev)))

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
kernel test robot June 9, 2016, 3:21 p.m. UTC | #2
Hi,

[auto build test ERROR on vfio/next]
[also build test ERROR on v4.7-rc2 next-20160609]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/VFIO-SR-IOV-support/20160609-202117
base:   https://github.com/awilliam/linux-vfio.git next
config: x86_64-randconfig-s4-06092146 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/iommu/iommu.c: In function 'pci_device_group':
>> drivers/iommu/iommu.c:758:10: error: 'struct pci_dev' has no member named 'physfn'; did you mean 'is_physfn'?
        (pdev->physfn->dev_flags & PCI_DEV_FLAGS_UNTRUSTED))
             ^~
   drivers/iommu/iommu.c:759:31: error: 'struct pci_dev' has no member named 'physfn'; did you mean 'is_physfn'?
      return iommu_group_get(&pdev->physfn->dev);
                                  ^~

vim +758 drivers/iommu/iommu.c

   752		struct group_for_pci_data data;
   753		struct pci_bus *bus;
   754		struct iommu_group *group = NULL;
   755		u64 devfns[4] = { 0 };
   756		
   757		if (pdev->is_virtfn && 
 > 758		   (pdev->physfn->dev_flags & PCI_DEV_FLAGS_UNTRUSTED))
   759			return iommu_group_get(&pdev->physfn->dev);
   760	
   761		if (WARN_ON(!dev_is_pci(dev)))

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Alex Williamson June 9, 2016, 10:21 p.m. UTC | #3
On Thu,  9 Jun 2016 15:09:30 +0300
Ilya Lesokhin <ilyal@mellanox.com> wrote:

> Add a new PCI_DEV_FLAGS_UNTRUSTED to indicate that a PCI device
> is probed by a driver that gives untrusted entities access to that device.
> Make iommu_group_get_for_pci_dev check the new flag when an IOMMU
> group is selected for a virtual function.
> Mark VFIO devices with the new flag.
> 
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> ---
>  drivers/iommu/iommu.c       | 4 ++++
>  drivers/vfio/pci/vfio_pci.c | 3 +++
>  include/linux/pci.h         | 3 +++
>  3 files changed, 10 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 3000051..9bb914c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -749,6 +749,10 @@ struct iommu_group *pci_device_group(struct device *dev)
>  	struct pci_bus *bus;
>  	struct iommu_group *group = NULL;
>  	u64 devfns[4] = { 0 };
> +	
> +	if (pdev->is_virtfn && 
> +	   (pdev->physfn->dev_flags & PCI_DEV_FLAGS_UNTRUSTED))
> +		return iommu_group_get(&pdev->physfn->dev);

This deserves a comment in the code as well as the commit log, but more
importantly the side effect of this is that the user can't make use of
separate IOMMU domains for the PF vs the VF.  I think this ends up
making the entire solution incompatible with things like vIOMMU since
we really need to be able to create separate address spaces per device
to make that work.  What's the point of enabling SR-IOV from userspace
if we can't do things like assign VFs to nested guests or userspace in
the guest?  This is an incomplete feature with that restriction.
Thanks,

Alex

>  
>  	if (WARN_ON(!dev_is_pci(dev)))
>  		return ERR_PTR(-EINVAL);
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 188b1ff..72d048e 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1180,6 +1180,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		pci_set_power_state(pdev, PCI_D3hot);
>  	}
>  
> +	pdev->dev_flags |= PCI_DEV_FLAGS_UNTRUSTED;
> +
>  	return ret;
>  }
>  
> @@ -1187,6 +1189,7 @@ static void vfio_pci_remove(struct pci_dev *pdev)
>  {
>  	struct vfio_pci_device *vdev;
>  
> +	pdev->dev_flags &= ~PCI_DEV_FLAGS_UNTRUSTED;
>  	vdev = vfio_del_group_dev(&pdev->dev);
>  	if (!vdev)
>  		return;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b67e4df..bef9115 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -174,6 +174,9 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
>  	/* Get VPD from function 0 VPD */
>  	PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8),
> +	/* Untrusted software controls this device
> +	 * The VFs of this device should be put in the device's IOMMUs group*/
> +	PCI_DEV_FLAGS_UNTRUSTED = (__force pci_dev_flags_t) (1 << 9),
>  };
>  
>  enum pci_irq_reroute_variant {

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilya Lesokhin June 13, 2016, 7:08 a.m. UTC | #4
> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Friday, June 10, 2016 1:21 AM
> To: Ilya Lesokhin <ilyal@mellanox.com>
> Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org; bhelgaas@google.com;
> Noa Osherovich <noaos@mellanox.com>; Haggai Eran
> <haggaie@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>; Liran Liss
> <liranl@mellanox.com>
> Subject: Re: [PATCH 2/4] IOMMU: Force the VFs of an untrusted PF device to
> be in the PFs IOMMU group
...
> This deserves a comment in the code as well as the commit log, but more
> importantly the side effect of this is that the user can't make use of separate
> IOMMU domains for the PF vs the VF.  I think this ends up making the entire
> solution incompatible with things like vIOMMU since we really need to be
> able to create separate address spaces per device to make that work.  What's
> the point of enabling SR-IOV from userspace if we can't do things like assign
> VFs to nested guests or userspace in the guest?  This is an incomplete
> feature with that restriction.

I agree that this is a problem and I will mention this limitation in the commit log.
However to address this properly you need nested IOMMU group which don't really exist.
This feature is still useful, at least for us, as you can enable sriov in a guest and work with
probed VFs in the same guest.
Furthermore, if you have a real nested IOMMU you should be able to do nested device 
assignment even though the PF and VF's belong to the same group.

I think we should push this feature with the limitation and hopefully
it will be addressed in the future, agree?

Thanks,
Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Williamson June 13, 2016, 4 p.m. UTC | #5
On Mon, 13 Jun 2016 07:08:03 +0000
Ilya Lesokhin <ilyal@mellanox.com> wrote:

> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Friday, June 10, 2016 1:21 AM
> > To: Ilya Lesokhin <ilyal@mellanox.com>
> > Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org; bhelgaas@google.com;
> > Noa Osherovich <noaos@mellanox.com>; Haggai Eran
> > <haggaie@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>; Liran Liss
> > <liranl@mellanox.com>
> > Subject: Re: [PATCH 2/4] IOMMU: Force the VFs of an untrusted PF device to
> > be in the PFs IOMMU group  
> ...
> > This deserves a comment in the code as well as the commit log, but more
> > importantly the side effect of this is that the user can't make use of separate
> > IOMMU domains for the PF vs the VF.  I think this ends up making the entire
> > solution incompatible with things like vIOMMU since we really need to be
> > able to create separate address spaces per device to make that work.  What's
> > the point of enabling SR-IOV from userspace if we can't do things like assign
> > VFs to nested guests or userspace in the guest?  This is an incomplete
> > feature with that restriction.  
> 
> I agree that this is a problem and I will mention this limitation in the commit log.
> However to address this properly you need nested IOMMU group which don't really exist.
> This feature is still useful, at least for us, as you can enable sriov in a guest and work with
> probed VFs in the same guest.
> Furthermore, if you have a real nested IOMMU you should be able to do nested device 
> assignment even though the PF and VF's belong to the same group.
> 
> I think we should push this feature with the limitation and hopefully
> it will be addressed in the future, agree?

Sorry, I don't agree, nor do I think that nested IOMMU groups are the
solution to the problem (or even really understand what nest IOMMU
groups are).  It seems we have an issue that an untrusted user is
creating devices and we're trying to overload the concept of an IOMMU
group to handle that.  An IOMMU group is meant to describe the DMA
isolation of a set of devices, so it really has no business being
overloaded to enforce ownership like this, nor can we assume that we
can support multiple IOMMU contexts within a group regardless of a "real
nested IOMMU".  We already see coming a very serious usage restriction
that a user cannot create independent IOMMU contexts as a direct result
of this hack, which severely limits future usefulness.  I don't know
what the solution is here, but I'm pretty sure this is not it.  Thanks,

Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilya Lesokhin June 16, 2016, 12:46 p.m. UTC | #6
You've convinced us that we should use separate groups.
But If I do that I can't think of a clean way to force the VFs to be probed by VFIO.

The ideas I have are:
	1. Add a driver_override parameter to pci_enable_sriov, sriov_enable  and virtfn_add.
	2. Register to the bus notifier of the PF's bus before calling pci_enable_sriov and then override the
	Driver in BUS_NOTIFY_ADD_DEVICE event like I did in the IOMMU group notifier when the group was shared.

Do you have a better idea? or a recommendation as to which of my ideas I should use?

Thank,
Ilya
	
> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Monday, June 13, 2016 7:01 PM
> To: Ilya Lesokhin <ilyal@mellanox.com>
> Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org; bhelgaas@google.com;
> Noa Osherovich <noaos@mellanox.com>; Haggai Eran
> <haggaie@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>; Liran Liss
> <liranl@mellanox.com>
> Subject: Re: [PATCH 2/4] IOMMU: Force the VFs of an untrusted PF device to
> be in the PFs IOMMU group
> 
> On Mon, 13 Jun 2016 07:08:03 +0000
> Ilya Lesokhin <ilyal@mellanox.com> wrote:
> 
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Friday, June 10, 2016 1:21 AM
> > > To: Ilya Lesokhin <ilyal@mellanox.com>
> > > Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org;
> > > bhelgaas@google.com; Noa Osherovich <noaos@mellanox.com>; Haggai
> > > Eran <haggaie@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>;
> > > Liran Liss <liranl@mellanox.com>
> > > Subject: Re: [PATCH 2/4] IOMMU: Force the VFs of an untrusted PF
> > > device to be in the PFs IOMMU group
> > ...
> > > This deserves a comment in the code as well as the commit log, but
> > > more importantly the side effect of this is that the user can't make
> > > use of separate IOMMU domains for the PF vs the VF.  I think this
> > > ends up making the entire solution incompatible with things like
> > > vIOMMU since we really need to be able to create separate address
> > > spaces per device to make that work.  What's the point of enabling
> > > SR-IOV from userspace if we can't do things like assign VFs to
> > > nested guests or userspace in the guest?  This is an incomplete feature
> with that restriction.
> >
> > I agree that this is a problem and I will mention this limitation in the commit
> log.
> > However to address this properly you need nested IOMMU group which
> don't really exist.
> > This feature is still useful, at least for us, as you can enable sriov
> > in a guest and work with probed VFs in the same guest.
> > Furthermore, if you have a real nested IOMMU you should be able to do
> > nested device assignment even though the PF and VF's belong to the same
> group.
> >
> > I think we should push this feature with the limitation and hopefully
> > it will be addressed in the future, agree?
> 
> Sorry, I don't agree, nor do I think that nested IOMMU groups are the
> solution to the problem (or even really understand what nest IOMMU
> groups are).  It seems we have an issue that an untrusted user is creating
> devices and we're trying to overload the concept of an IOMMU group to
> handle that.  An IOMMU group is meant to describe the DMA isolation of a
> set of devices, so it really has no business being overloaded to enforce
> ownership like this, nor can we assume that we can support multiple IOMMU
> contexts within a group regardless of a "real nested IOMMU".  We already
> see coming a very serious usage restriction that a user cannot create
> independent IOMMU contexts as a direct result of this hack, which severely
> limits future usefulness.  I don't know what the solution is here, but I'm
> pretty sure this is not it.  Thanks,
> 
> Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3000051..9bb914c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -749,6 +749,10 @@  struct iommu_group *pci_device_group(struct device *dev)
 	struct pci_bus *bus;
 	struct iommu_group *group = NULL;
 	u64 devfns[4] = { 0 };
+	
+	if (pdev->is_virtfn && 
+	   (pdev->physfn->dev_flags & PCI_DEV_FLAGS_UNTRUSTED))
+		return iommu_group_get(&pdev->physfn->dev);
 
 	if (WARN_ON(!dev_is_pci(dev)))
 		return ERR_PTR(-EINVAL);
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 188b1ff..72d048e 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1180,6 +1180,8 @@  static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		pci_set_power_state(pdev, PCI_D3hot);
 	}
 
+	pdev->dev_flags |= PCI_DEV_FLAGS_UNTRUSTED;
+
 	return ret;
 }
 
@@ -1187,6 +1189,7 @@  static void vfio_pci_remove(struct pci_dev *pdev)
 {
 	struct vfio_pci_device *vdev;
 
+	pdev->dev_flags &= ~PCI_DEV_FLAGS_UNTRUSTED;
 	vdev = vfio_del_group_dev(&pdev->dev);
 	if (!vdev)
 		return;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b67e4df..bef9115 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -174,6 +174,9 @@  enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
 	/* Get VPD from function 0 VPD */
 	PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8),
+	/* Untrusted software controls this device
+	 * The VFs of this device should be put in the device's IOMMUs group*/
+	PCI_DEV_FLAGS_UNTRUSTED = (__force pci_dev_flags_t) (1 << 9),
 };
 
 enum pci_irq_reroute_variant {