
[v2,0/7] vfio/pci: SR-IOV support

Message ID 158213716959.17090.8399427017403507114.stgit@gimli.home

Alex Williamson Feb. 19, 2020, 6:53 p.m. UTC
Changes since v1 are primarily to patch 3/7 where the commit log is
rewritten, along with option parsing and failure logging based on
upstream discussions.  The primary user-visible difference is that
option parsing is now much more strict.  If a vf_token option is
provided that cannot be used, we generate an error.  As a result of
this, opening a PF with a vf_token option will serve as a mechanism of
setting the vf_token.  This seems like a more user-friendly API than
the alternative of sometimes requiring the option (VFs in use) and
sometimes rejecting it, and upholds our desire that the option is
always either used or rejected.
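As I read patch 3/7, the token rides on the device-name string handed to VFIO_GROUP_GET_DEVICE_FD. A minimal sketch of building that string follows; the helper name and the UUID are illustrative, not part of the ABI:

```c
#include <stdio.h>

/*
 * Build the device-name string passed to VFIO_GROUP_GET_DEVICE_FD with
 * the vf_token option appended, e.g. "0000:03:00.0 vf_token=<uuid>".
 * Returns 0 on success, -1 if the result would not fit in buf.
 */
static int build_vfio_device_name(char *buf, size_t len,
                                  const char *bdf, const char *token_uuid)
{
    int n = snprintf(buf, len, "%s vf_token=%s", bdf, token_uuid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A PF driver opening its device with such a string sets (or, with VFs in use, must match) the token; a VF driver must supply the PF's current token the same way.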

This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
means of setting the VF token, which might call into question whether
we absolutely need this new ioctl.  Currently I'm keeping it because I
can imagine use cases, for example if a hypervisor were to support
SR-IOV, the PF device might be opened without consideration for a VF
token and we'd require the hypervisor to close and re-open the PF in
order to set a known VF token, which is impractical.
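For concreteness, a userspace sketch of setting the token through the proposed ioctl. The structure layout and flag values below are transcribed from the patch 4/7 proposal and should be taken from the final <linux/vfio.h> rather than redefined as done here:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Proposed uapi (patch 4/7); normally provided by <linux/vfio.h>. */
struct vfio_device_feature {
    uint32_t argsz;
    uint32_t flags;
#define VFIO_DEVICE_FEATURE_MASK   (0xffff)   /* low bits select the feature */
#define VFIO_DEVICE_FEATURE_GET    (1 << 16)
#define VFIO_DEVICE_FEATURE_SET    (1 << 17)
#define VFIO_DEVICE_FEATURE_PROBE  (1 << 18)
    uint8_t data[];                           /* feature-specific payload */
};
#define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN (0)

/* Allocate and fill a SET request carrying a 16-byte UUID token. */
static struct vfio_device_feature *make_vf_token_set(const uint8_t uuid[16])
{
    size_t sz = sizeof(struct vfio_device_feature) + 16;
    struct vfio_device_feature *f = malloc(sz);

    if (!f)
        return NULL;
    f->argsz = sz;
    f->flags = VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_PCI_VF_TOKEN;
    memcpy(f->data, uuid, 16);
    return f;
}
```

The caller would then issue ioctl(device_fd, VFIO_DEVICE_FEATURE, f); note there is deliberately no GET path for this feature, matching the "no means to retrieve the current token" design below.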

Series overview (same as provided with v1):

The synopsis of this series is that we have an ongoing desire to drive
PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
for this with DPDK drivers and potentially interesting future use
cases in virtualization.  We've been reluctant to add this support
previously due to the dependency and trust relationship between the
VF device and PF driver.  Minimally the PF driver can induce a denial
of service to the VF, but depending on the specific implementation,
the PF driver might also be responsible for moving data between VFs
or have direct access to the state of the VF, including data or state
otherwise private to the VF or VF driver.

To help resolve these concerns, we introduce a VF token into the VFIO
PCI ABI, which acts as a shared secret key between drivers.  The
userspace PF driver is required to set the VF token to a known value
and userspace VF drivers are required to provide the token to access
the VF device.  If a PF driver is restarted with VF drivers in use, it
must also provide the current token in order to prevent a rogue
untrusted PF driver from replacing a known driver.  The degree to
which this new token is considered secret is left to the userspace
drivers, the kernel intentionally provides no means to retrieve the
current token.

Note that the above token is only required for this new model where
both the PF and VF devices are usable through vfio-pci.  Existing
models of VFIO drivers where the PF is used without SR-IOV enabled
or the VF is bound to a userspace driver with an in-kernel, host PF
driver are unaffected.

The latter configuration above also highlights a new inverted scenario
that is now possible: a userspace PF driver with in-kernel VF drivers.
I believe this is a scenario that should be allowed, but should not be
enabled by default.  This series includes code to set a default
driver_override for VFs sourced from a vfio-pci user owned PF, such
that the VFs are also bound to vfio-pci.  This model is compatible
with tools like driverctl and allows the system administrator to
decide if other bindings should be enabled.  The VF token interface
above exists only between vfio-pci PF and VF drivers; once a VF is
bound to another driver, the administrator has effectively pronounced
the device trusted.  The vfio-pci driver will note the alternate
binding in dmesg for logging and debugging purposes.
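The driver_override the series sets is the standard sysfs mechanism, so an administrator can perform the equivalent binding by hand. A hedged userspace sketch follows; the helper is illustrative and is parameterized on the device's sysfs directory rather than a real PCI address:

```c
#include <stdio.h>

/*
 * Write a driver name into <sysfs_dev_dir>/driver_override, e.g.
 * /sys/bus/pci/devices/0000:03:00.1/driver_override, so that only the
 * named driver (typically "vfio-pci" here) may bind the device.
 */
static int set_driver_override(const char *sysfs_dev_dir, const char *driver)
{
    char path[256];
    FILE *f;
    int rc;

    if (snprintf(path, sizeof(path), "%s/driver_override",
                 sysfs_dev_dir) >= (int)sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    rc = fprintf(f, "%s\n", driver) < 0 ? -1 : 0;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

After clearing the override (writing an empty string), normal driver matching resumes, which is the point where the administrator's trust decision takes effect.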

Please review, comment, and test.  The example QEMU implementation
provided with the RFC is still current for this version.  Thanks,

Alex

RFC: https://lore.kernel.org/lkml/158085337582.9445.17682266437583505502.stgit@gimli.home/
v1: https://lore.kernel.org/lkml/158145472604.16827.15751375540102298130.stgit@gimli.home/

---

Alex Williamson (7):
      vfio: Include optional device match in vfio_device_ops callbacks
      vfio/pci: Implement match ops
      vfio/pci: Introduce VF token
      vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
      vfio/pci: Add sriov_configure support
      vfio/pci: Remove dev_fmt definition
      vfio/pci: Cleanup .probe() exit paths


 drivers/vfio/pci/vfio_pci.c         |  383 +++++++++++++++++++++++++++++++++--
 drivers/vfio/pci/vfio_pci_private.h |   10 +
 drivers/vfio/vfio.c                 |   20 +-
 include/linux/vfio.h                |    4 
 include/uapi/linux/vfio.h           |   37 +++
 5 files changed, 426 insertions(+), 28 deletions(-)
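Patch 5/7 routes VF enabling through the standard sriov_numvfs sysfs attribute. A hedged sketch of driving it from userspace; the helper name is illustrative and the sysfs directory is a parameter rather than a real device path:

```c
#include <stdio.h>

/*
 * Write a VF count into <sysfs_dev_dir>/sriov_numvfs, e.g.
 * /sys/bus/pci/devices/0000:03:00.0/sriov_numvfs.  Per the PCI sysfs
 * ABI, an existing nonzero count must be reset to 0 before a new
 * nonzero count can be written.
 */
static int set_sriov_numvfs(const char *sysfs_dev_dir, unsigned int num_vfs)
{
    char path[256];
    FILE *f;
    int rc;

    if (snprintf(path, sizeof(path), "%s/sriov_numvfs",
                 sysfs_dev_dir) >= (int)sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    rc = fprintf(f, "%u\n", num_vfs) < 0 ? -1 : 0;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

With this series applied, VFs created this way on a vfio-pci owned PF come up with a vfio-pci driver_override, per the overview above.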

Comments

Tian, Kevin Feb. 25, 2020, 2:33 a.m. UTC | #1
> From: Alex Williamson
> Sent: Thursday, February 20, 2020 2:54 AM
> 
> Changes since v1 are primarily to patch 3/7 where the commit log is
> rewritten, along with option parsing and failure logging based on
> upstream discussions.  The primary user visible difference is that
> option parsing is now much more strict.  If a vf_token option is
> provided that cannot be used, we generate an error.  As a result of
> this, opening a PF with a vf_token option will serve as a mechanism of
> setting the vf_token.  This seems like a more user friendly API than
> the alternative of sometimes requiring the option (VFs in use) and
> sometimes rejecting it, and upholds our desire that the option is
> always either used or rejected.
> 
> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
> means of setting the VF token, which might call into question whether
> we absolutely need this new ioctl.  Currently I'm keeping it because I
> can imagine use cases, for example if a hypervisor were to support
> SR-IOV, the PF device might be opened without consideration for a VF
> token and we'd require the hypervisor to close and re-open the PF in
> order to set a known VF token, which is impractical.
> 
> Series overview (same as provided with v1):

Thanks for doing this! 

> 
> The synopsis of this series is that we have an ongoing desire to drive
> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> for this with DPDK drivers and potentially interesting future use

Can you provide a link to the DPDK discussion? 

> cases in virtualization.  We've been reluctant to add this support
> previously due to the dependency and trust relationship between the
> VF device and PF driver.  Minimally the PF driver can induce a denial
> of service to the VF, but depending on the specific implementation,
> the PF driver might also be responsible for moving data between VFs
> or have direct access to the state of the VF, including data or state
> otherwise private to the VF or VF driver.

Just thinking out loud. While the motivation of the VF token sounds
reasonable to me, I'm curious why the same concern is not raised in other
usages. For example, there is no such design in the virtio framework, where
the virtio device could also be restarted, put in a separate process
(vhost-user), and even in a separate VM (virtio-vhost-user), etc. Of course
the para-virtualized nature of virtio implies some degree of trust, but as
you mentioned many SR-IOV implementations support VF->PF communication,
which also implies some level of trust. It's perfectly fine if VFIO just
tries to do better than other sub-systems, but knowing how other people
tackle similar problems may make the whole picture clearer. 😊

+Jason.

> 
> To help resolve these concerns, we introduce a VF token into the VFIO
> PCI ABI, which acts as a shared secret key between drivers.  The
> userspace PF driver is required to set the VF token to a known value
> and userspace VF drivers are required to provide the token to access
> the VF device.  If a PF driver is restarted with VF drivers in use, it
> must also provide the current token in order to prevent a rogue
> untrusted PF driver from replacing a known driver.  The degree to
> which this new token is considered secret is left to the userspace
> drivers, the kernel intentionally provides no means to retrieve the
> current token.

I'm wondering whether the token idea can be used beyond SR-IOV, e.g.
(1) we may allow vfio user space to manage Scalable IOV in the future,
which faces the similar challenge between the PF and mdev; (2) the
token might be used as a canonical way to replace off-tree acs-override
workaround, say, allowing the admin to assign devices within the 
same iommu group to different VMs which trust each other. I'm not
sure how much complexity will be further introduced, but it's greatly
appreciated if you can help think a bit and if feasible abstract some 
logic in vfio core layer for such potential usages...

> 
> Note that the above token is only required for this new model where
> both the PF and VF devices are usable through vfio-pci.  Existing
> models of VFIO drivers where the PF is used without SR-IOV enabled
> or the VF is bound to a userspace driver with an in-kernel, host PF
> driver are unaffected.
> 
> The latter configuration above also highlights a new inverted scenario
> that is now possible, a userspace PF driver with in-kernel VF drivers.
> I believe this is a scenario that should be allowed, but should not be
> enabled by default.  This series includes code to set a default
> driver_override for VFs sourced from a vfio-pci user owned PF, such
> that the VFs are also bound to vfio-pci.  This model is compatible
> with tools like driverctl and allows the system administrator to
> decide if other bindings should be enabled.  The VF token interface
> above exists only between vfio-pci PF and VF drivers, once a VF is
> bound to another driver, the administrator has effectively pronounced
> the device as trusted.  The vfio-pci driver will note alternate
> binding in dmesg for logging and debugging purposes.
> 
> Please review, comment, and test.  The example QEMU implementation
> provided with the RFC is still current for this version.  Thanks,
> 
> Alex
> 
> RFC: https://lore.kernel.org/lkml/158085337582.9445.17682266437583505502.stgit@gimli.home/
> v1: https://lore.kernel.org/lkml/158145472604.16827.15751375540102298130.stgit@gimli.home/
> 
> ---
> 
> Alex Williamson (7):
>       vfio: Include optional device match in vfio_device_ops callbacks
>       vfio/pci: Implement match ops
>       vfio/pci: Introduce VF token
>       vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
>       vfio/pci: Add sriov_configure support
>       vfio/pci: Remove dev_fmt definition
>       vfio/pci: Cleanup .probe() exit paths
> 
> 
>  drivers/vfio/pci/vfio_pci.c         |  383 +++++++++++++++++++++++++++++++++--
>  drivers/vfio/pci/vfio_pci_private.h |   10 +
>  drivers/vfio/vfio.c                 |   20 +-
>  include/linux/vfio.h                |    4
>  include/uapi/linux/vfio.h           |   37 +++
>  5 files changed, 426 insertions(+), 28 deletions(-)
Jason Wang Feb. 25, 2020, 6:09 a.m. UTC | #2
On 2020/2/25 10:33 AM, Tian, Kevin wrote:
>> From: Alex Williamson
>> Sent: Thursday, February 20, 2020 2:54 AM
>>
>> Changes since v1 are primarily to patch 3/7 where the commit log is
>> rewritten, along with option parsing and failure logging based on
>> upstream discussions.  The primary user visible difference is that
>> option parsing is now much more strict.  If a vf_token option is
>> provided that cannot be used, we generate an error.  As a result of
>> this, opening a PF with a vf_token option will serve as a mechanism of
>> setting the vf_token.  This seems like a more user friendly API than
>> the alternative of sometimes requiring the option (VFs in use) and
>> sometimes rejecting it, and upholds our desire that the option is
>> always either used or rejected.
>>
>> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
>> means of setting the VF token, which might call into question whether
>> we absolutely need this new ioctl.  Currently I'm keeping it because I
>> can imagine use cases, for example if a hypervisor were to support
>> SR-IOV, the PF device might be opened without consideration for a VF
>> token and we'd require the hypervisor to close and re-open the PF in
>> order to set a known VF token, which is impractical.
>>
>> Series overview (same as provided with v1):
> Thanks for doing this!
>
>> The synopsis of this series is that we have an ongoing desire to drive
>> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
>> for this with DPDK drivers and potentially interesting future use
> Can you provide a link to the DPDK discussion?
>
>> cases in virtualization.  We've been reluctant to add this support
>> previously due to the dependency and trust relationship between the
>> VF device and PF driver.  Minimally the PF driver can induce a denial
>> of service to the VF, but depending on the specific implementation,
>> the PF driver might also be responsible for moving data between VFs
>> or have direct access to the state of the VF, including data or state
>> otherwise private to the VF or VF driver.
> Just a loud thinking. While the motivation of VF token sounds reasonable
> to me, I'm curious why the same concern is not raised in other usages.
> For example, there is no such design in virtio framework, where the
> virtio device could also be restarted, putting in separate process (vhost-user),
> and even in separate VM (virtio-vhost-user), etc.


AFAIK, the restart can only be triggered by either the VM or qemu. But 
yes, the datapath could be offloaded.

But I'm not sure introducing another dedicated mechanism is better than 
using the existing generic POSIX mechanisms to make sure the connection 
(AF_UNIX) is secure.


>   Of course the para-
> virtualized attribute of virtio implies some degree of trust, but as you
> mentioned many SR-IOV implementations support VF->PF communication
> which also implies some level of trust. It's perfectly fine if VFIO just tries
> to do better than other sub-systems, but knowing how other people
> tackle the similar problem may make the whole picture clearer. 😊
>
> +Jason.


I'm not quite sure that, e.g., allowing a userspace PF driver with a 
kernel VF driver would not break the assumptions of the kernel security 
model. At the least, we should forbid an unprivileged PF driver running 
in userspace.

Thanks
Vamsi Krishna Attunuru March 5, 2020, 6:38 a.m. UTC | #3
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Alex Williamson
> Sent: Thursday, February 20, 2020 12:24 AM
> To: kvm@vger.kernel.org
> Cc: linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; dev@dpdk.org;
> mtosatti@redhat.com; thomas@monjalon.net; bluca@debian.org;
> jerinjacobk@gmail.com; bruce.richardson@intel.com; cohuck@redhat.com
> Subject: [dpdk-dev] [PATCH v2 0/7] vfio/pci: SR-IOV support
> 
> Changes since v1 are primarily to patch 3/7 where the commit log is
> rewritten, along with option parsing and failure logging based on upstream
> discussions.  The primary user visible difference is that option parsing is now
> much more strict.  If a vf_token option is provided that cannot be used, we
> generate an error.  As a result of this, opening a PF with a vf_token option
> will serve as a mechanism of setting the vf_token.  This seems like a more
> user friendly API than the alternative of sometimes requiring the option (VFs
> in use) and sometimes rejecting it, and upholds our desire that the option is
> always either used or rejected.
> 
> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only means
> of setting the VF token, which might call into question whether we absolutely
> need this new ioctl.  Currently I'm keeping it because I can imagine use cases,
> for example if a hypervisor were to support SR-IOV, the PF device might be
> opened without consideration for a VF token and we'd require the
> hypervisor to close and re-open the PF in order to set a known VF token,
> which is impractical.
> 
> Series overview (same as provided with v1):
> 
> The synopsis of this series is that we have an ongoing desire to drive PCIe
> SR-IOV PFs from userspace with VFIO.  There's an immediate need for this with
> DPDK drivers and potentially interesting future use cases in virtualization.
> We've been reluctant to add this support previously due to the dependency
> and trust relationship between the VF device and PF driver.  Minimally the PF
> driver can induce a denial of service to the VF, but depending on the specific
> implementation, the PF driver might also be responsible for moving data
> between VFs or have direct access to the state of the VF, including data or
> state otherwise private to the VF or VF driver.
> 
> To help resolve these concerns, we introduce a VF token into the VFIO PCI
> ABI, which acts as a shared secret key between drivers.  The userspace PF
> driver is required to set the VF token to a known value and userspace VF
> drivers are required to provide the token to access the VF device.  If a PF
> driver is restarted with VF drivers in use, it must also provide the current
> token in order to prevent a rogue untrusted PF driver from replacing a known
> driver.  The degree to which this new token is considered secret is left to the
> userspace drivers, the kernel intentionally provides no means to retrieve the
> current token.
> 
> Note that the above token is only required for this new model where both
> the PF and VF devices are usable through vfio-pci.  Existing models of VFIO
> drivers where the PF is used without SR-IOV enabled or the VF is bound to a
> userspace driver with an in-kernel, host PF driver are unaffected.
> 
> The latter configuration above also highlights a new inverted scenario that is
> now possible, a userspace PF driver with in-kernel VF drivers.
> I believe this is a scenario that should be allowed, but should not be enabled
> by default.  This series includes code to set a default driver_override for VFs
> sourced from a vfio-pci user owned PF, such that the VFs are also bound to
> vfio-pci.  This model is compatible with tools like driverctl and allows the
> system administrator to decide if other bindings should be enabled.  The VF
> token interface above exists only between vfio-pci PF and VF drivers, once a
> VF is bound to another driver, the administrator has effectively pronounced
> the device as trusted.  The vfio-pci driver will note alternate binding in dmesg
> for logging and debugging purposes.
> 
> Please review, comment, and test.  The example QEMU implementation
> provided with the RFC is still current for this version.  Thanks,
> 
> Alex

Hi Alex,

Thanks for enabling this feature support.

Tested-by: Vamsi Attunuru <vattunuru@marvell.com>

Tested v2 patch set with below DPDK patch.
http://patches.dpdk.org/patch/66281/

Regards
A Vamsi

> 
> RFC: https://lore.kernel.org/lkml/158085337582.9445.17682266437583505502.stgit@gimli.home/
> v1: https://lore.kernel.org/lkml/158145472604.16827.15751375540102298130.stgit@gimli.home/
> 
> ---
> 
> Alex Williamson (7):
>       vfio: Include optional device match in vfio_device_ops callbacks
>       vfio/pci: Implement match ops
>       vfio/pci: Introduce VF token
>       vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
>       vfio/pci: Add sriov_configure support
>       vfio/pci: Remove dev_fmt definition
>       vfio/pci: Cleanup .probe() exit paths
> 
> 
>  drivers/vfio/pci/vfio_pci.c         |  383 +++++++++++++++++++++++++++++++++--
>  drivers/vfio/pci/vfio_pci_private.h |   10 +
>  drivers/vfio/vfio.c                 |   20 +-
>  include/linux/vfio.h                |    4
>  include/uapi/linux/vfio.h           |   37 +++
>  5 files changed, 426 insertions(+), 28 deletions(-)
Alex Williamson March 5, 2020, 5:14 p.m. UTC | #4
On Tue, 25 Feb 2020 14:09:07 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2020/2/25 10:33 AM, Tian, Kevin wrote:
> >> From: Alex Williamson
> >> Sent: Thursday, February 20, 2020 2:54 AM
> >>
> >> Changes since v1 are primarily to patch 3/7 where the commit log is
> >> rewritten, along with option parsing and failure logging based on
> >> upstream discussions.  The primary user visible difference is that
> >> option parsing is now much more strict.  If a vf_token option is
> >> provided that cannot be used, we generate an error.  As a result of
> >> this, opening a PF with a vf_token option will serve as a mechanism of
> >> setting the vf_token.  This seems like a more user friendly API than
> >> the alternative of sometimes requiring the option (VFs in use) and
> >> sometimes rejecting it, and upholds our desire that the option is
> >> always either used or rejected.
> >>
> >> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
> >> means of setting the VF token, which might call into question whether
> >> we absolutely need this new ioctl.  Currently I'm keeping it because I
> >> can imagine use cases, for example if a hypervisor were to support
> >> SR-IOV, the PF device might be opened without consideration for a VF
> >> token and we'd require the hypervisor to close and re-open the PF in
> >> order to set a known VF token, which is impractical.
> >>
> >> Series overview (same as provided with v1):  
> > Thanks for doing this!
> >  
> >> The synopsis of this series is that we have an ongoing desire to drive
> >> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> >> for this with DPDK drivers and potentially interesting future use  
> > Can you provide a link to the DPDK discussion?
> >  
> >> cases in virtualization.  We've been reluctant to add this support
> >> previously due to the dependency and trust relationship between the
> >> VF device and PF driver.  Minimally the PF driver can induce a denial
> >> of service to the VF, but depending on the specific implementation,
> >> the PF driver might also be responsible for moving data between VFs
> >> or have direct access to the state of the VF, including data or state
> >> otherwise private to the VF or VF driver.  
> > Just a loud thinking. While the motivation of VF token sounds reasonable
> > to me, I'm curious why the same concern is not raised in other usages.
> > For example, there is no such design in virtio framework, where the
> > virtio device could also be restarted, putting in separate process (vhost-user),
> > and even in separate VM (virtio-vhost-user), etc.  
> 
> 
> AFAIK, the restart could only be triggered by either VM or qemu. But 
> yes, the datapath could be offloaded.
> 
> But I'm not sure introducing another dedicated mechanism is better than 
> using the existing generic POSIX mechanisms to make sure the connection 
> (AF_UNIX) is secure.
> 
> 
> >   Of course the para-
> > virtualized attribute of virtio implies some degree of trust, but as you
> > mentioned many SR-IOV implementations support VF->PF communication
> > which also implies some level of trust. It's perfectly fine if VFIO just tries
> > to do better than other sub-systems, but knowing how other people
> > tackle the similar problem may make the whole picture clearer. 😊
> >
> > +Jason.  
> 
> 
> I'm not quite sure that, e.g., allowing a userspace PF driver with a 
> kernel VF driver would not break the assumptions of the kernel security 
> model. At the least, we should forbid an unprivileged PF driver running 
> in userspace.

It might be useful to have your opinion on this series, because that's
exactly what we're trying to do here.  Various environments, DPDK
specifically, want a userspace PF driver.  This series takes steps to
mitigate the risk of having such a driver, such as requiring this VF
token interface to extend the VFIO interface and validate participation
around a PF that is not considered trusted by the kernel.  We also set
a driver_override to try to make sure no host kernel driver can
automatically bind to a VF of a user owned PF, only vfio-pci, but we
don't prevent the admin from creating configurations where the VFs are
used by other host kernel drivers.

I think the question Kevin is inquiring about is whether virtio devices
are susceptible to the type of collaborative, shared key environment
we're creating here.  For example, can a VM or qemu have access to
reset a virtio device in a way that could affect other devices, ex. FLR
on a PF that could interfere with VF operation.  Thanks,

Alex
Alex Williamson March 5, 2020, 5:33 p.m. UTC | #5
Hi Kevin,

Sorry for the delay, I've been out on PTO...

On Tue, 25 Feb 2020 02:33:27 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson
> > Sent: Thursday, February 20, 2020 2:54 AM
> > 
> > Changes since v1 are primarily to patch 3/7 where the commit log is
> > rewritten, along with option parsing and failure logging based on
> > upstream discussions.  The primary user visible difference is that
> > option parsing is now much more strict.  If a vf_token option is
> > provided that cannot be used, we generate an error.  As a result of
> > this, opening a PF with a vf_token option will serve as a mechanism of
> > setting the vf_token.  This seems like a more user friendly API than
> > the alternative of sometimes requiring the option (VFs in use) and
> > sometimes rejecting it, and upholds our desire that the option is
> > always either used or rejected.
> > 
> > This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
> > means of setting the VF token, which might call into question whether
> > we absolutely need this new ioctl.  Currently I'm keeping it because I
> > can imagine use cases, for example if a hypervisor were to support
> > SR-IOV, the PF device might be opened without consideration for a VF
> > token and we'd require the hypervisor to close and re-open the PF in
> > order to set a known VF token, which is impractical.
> > 
> > Series overview (same as provided with v1):  
> 
> Thanks for doing this! 
> 
> > 
> > The synopsis of this series is that we have an ongoing desire to drive
> > PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> > for this with DPDK drivers and potentially interesting future use  
> 
> Can you provide a link to the DPDK discussion?

There's a thread here which proposed an out-of-tree driver that enables
a parallel SR-IOV enabling interface for a vfio-pci owned device.
Clearly I felt strongly about it ;)

https://patches.dpdk.org/patch/58810/

Also, documentation for making use of an Intel FPGA device with DPDK
requires the PF bound to igb_uio to support enabling SR-IOV:

https://doc.dpdk.org/guides/bbdevs/fpga_lte_fec.html

> > cases in virtualization.  We've been reluctant to add this support
> > previously due to the dependency and trust relationship between the
> > VF device and PF driver.  Minimally the PF driver can induce a denial
> > of service to the VF, but depending on the specific implementation,
> > the PF driver might also be responsible for moving data between VFs
> > or have direct access to the state of the VF, including data or state
> > otherwise private to the VF or VF driver.  
> 
> Just a loud thinking. While the motivation of VF token sounds reasonable
> to me, I'm curious why the same concern is not raised in other usages.
> For example, there is no such design in virtio framework, where the
> virtio device could also be restarted, putting in separate process (vhost-user),
> and even in separate VM (virtio-vhost-user), etc. Of course the para-
> virtualized attribute of virtio implies some degree of trust, but as you
> mentioned many SR-IOV implementations support VF->PF communication
> which also implies some level of trust. It's perfectly fine if VFIO just tries
> to do better than other sub-systems, but knowing how other people
> tackle the similar problem may make the whole picture clearer. 😊
> 
> +Jason.

We can follow the thread with Jason, but I can't really speak to
whether virtio needs something similar or doesn't provide enough PF
access to be concerned.  If they need a similar solution, we can
collaborate, but the extension we're defining here is specifically part
of the vfio-pci ABI, so it might not be easily portable to virtio.

> > To help resolve these concerns, we introduce a VF token into the VFIO
> > PCI ABI, which acts as a shared secret key between drivers.  The
> > userspace PF driver is required to set the VF token to a known value
> > and userspace VF drivers are required to provide the token to access
> > the VF device.  If a PF driver is restarted with VF drivers in use, it
> > must also provide the current token in order to prevent a rogue
> > untrusted PF driver from replacing a known driver.  The degree to
> > which this new token is considered secret is left to the userspace
> > drivers, the kernel intentionally provides no means to retrieve the
> > current token.  
> 
> I'm wondering whether the token idea can be used beyond SR-IOV, e.g.
> (1) we may allow vfio user space to manage Scalable IOV in the future,
> which faces the similar challenge between the PF and mdev; (2) the
> token might be used as a canonical way to replace off-tree acs-override
> workaround, say, allowing the admin to assign devices within the 
> same iommu group to different VMs which trust each other. I'm not
> sure how much complexity will be further introduced, but it's greatly
> appreciated if you can help think a bit and if feasible abstract some 
> logic in vfio core layer for such potential usages...

I don't see how this can be used for ACS override.  Lacking ACS, we
must assume lack of DMA isolation, which results in our IOMMU grouping.
If we split IOMMU groups, that implies something that doesn't exist.  A
user can already create a process that can own the vfio group and pass
vfio devices to other tasks, with the restriction of having a single
DMA address space.  If there is DMA isolation, then an mdev solution
might be better, but given the IOMMU integration of SIOV, I'm not sure
why the devices wouldn't simply be placed in separate groups by the
IOMMU driver.  Thanks,

Alex
 
> > Note that the above token is only required for this new model where
> > both the PF and VF devices are usable through vfio-pci.  Existing
> > models of VFIO drivers where the PF is used without SR-IOV enabled
> > or the VF is bound to a userspace driver with an in-kernel, host PF
> > driver are unaffected.
> > 
> > The latter configuration above also highlights a new inverted scenario
> > that is now possible, a userspace PF driver with in-kernel VF drivers.
> > I believe this is a scenario that should be allowed, but should not be
> > enabled by default.  This series includes code to set a default
> > driver_override for VFs sourced from a vfio-pci user owned PF, such
> > that the VFs are also bound to vfio-pci.  This model is compatible
> > with tools like driverctl and allows the system administrator to
> > decide if other bindings should be enabled.  The VF token interface
> > above exists only between vfio-pci PF and VF drivers, once a VF is
> > bound to another driver, the administrator has effectively pronounced
> > the device as trusted.  The vfio-pci driver will note alternate
> > binding in dmesg for logging and debugging purposes.
> > 
> > Please review, comment, and test.  The example QEMU implementation
> > provided with the RFC is still current for this version.  Thanks,
> > 
> > Alex
> > 
> > RFC: https://lore.kernel.org/lkml/158085337582.9445.17682266437583505502.stgit@gimli.home/
> > v1: https://lore.kernel.org/lkml/158145472604.16827.15751375540102298130.stgit@gimli.home/
> > 
> > ---
> > 
> > Alex Williamson (7):
> >       vfio: Include optional device match in vfio_device_ops callbacks
> >       vfio/pci: Implement match ops
> >       vfio/pci: Introduce VF token
> >       vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
> >       vfio/pci: Add sriov_configure support
> >       vfio/pci: Remove dev_fmt definition
> >       vfio/pci: Cleanup .probe() exit paths
> > 
> > 
> >  drivers/vfio/pci/vfio_pci.c         |  383
> > +++++++++++++++++++++++++++++++++--
> >  drivers/vfio/pci/vfio_pci_private.h |   10 +
> >  drivers/vfio/vfio.c                 |   20 +-
> >  include/linux/vfio.h                |    4
> >  include/uapi/linux/vfio.h           |   37 +++
> >  5 files changed, 426 insertions(+), 28 deletions(-)  
>
Jason Wang March 6, 2020, 3:35 a.m. UTC | #6
On 2020/3/6 1:14 AM, Alex Williamson wrote:
> On Tue, 25 Feb 2020 14:09:07 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> On 2020/2/25 10:33 AM, Tian, Kevin wrote:
>>>> From: Alex Williamson
>>>> Sent: Thursday, February 20, 2020 2:54 AM
>>>>
>>>> Changes since v1 are primarily to patch 3/7 where the commit log is
>>>> rewritten, along with option parsing and failure logging based on
>>>> upstream discussions.  The primary user visible difference is that
>>>> option parsing is now much more strict.  If a vf_token option is
>>>> provided that cannot be used, we generate an error.  As a result of
>>>> this, opening a PF with a vf_token option will serve as a mechanism of
>>>> setting the vf_token.  This seems like a more user friendly API than
>>>> the alternative of sometimes requiring the option (VFs in use) and
>>>> sometimes rejecting it, and upholds our desire that the option is
>>>> always either used or rejected.
>>>>
>>>> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
>>>> means of setting the VF token, which might call into question whether
>>>> we absolutely need this new ioctl.  Currently I'm keeping it because I
>>>> can imagine use cases, for example if a hypervisor were to support
>>>> SR-IOV, the PF device might be opened without consideration for a VF
>>>> token and we'd require the hypervisor to close and re-open the PF in
>>>> order to set a known VF token, which is impractical.
>>>>
>>>> Series overview (same as provided with v1):
>>> Thanks for doing this!
>>>   
>>>> The synopsis of this series is that we have an ongoing desire to drive
>>>> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
>>>> for this with DPDK drivers and potentially interesting future use
>>> Can you provide a link to the DPDK discussion?
>>>   
>>>> cases in virtualization.  We've been reluctant to add this support
>>>> previously due to the dependency and trust relationship between the
>>>> VF device and PF driver.  Minimally the PF driver can induce a denial
>>>> of service to the VF, but depending on the specific implementation,
>>>> the PF driver might also be responsible for moving data between VFs
>>>> or have direct access to the state of the VF, including data or state
>>>> otherwise private to the VF or VF driver.
>>> Just a loud thinking. While the motivation of VF token sounds reasonable
>>> to me, I'm curious why the same concern is not raised in other usages.
>>> For example, there is no such design in virtio framework, where the
>>> virtio device could also be restarted, putting in separate process (vhost-user),
>>> and even in separate VM (virtio-vhost-user), etc.
>>
>> AFAIK, the restart could only be triggered by either VM or qemu. But
>> yes, the datapath could be offloaded.
>>
>> But I'm not sure introducing another dedicated mechanism is better than
>> using the existing generic POSIX mechanism to make sure the connection
>> (AF_UNIX) is secure.
>>
>>
>>>    Of course the para-
>>> virtualized attribute of virtio implies some degree of trust, but as you
>>> mentioned many SR-IOV implementations support VF->PF communication
>>> which also implies some level of trust. It's perfectly fine if VFIO just tries
>>> to do better than other sub-systems, but knowing how other people
>>> tackle the similar problem may make the whole picture clearer. 😊
>>>
>>> +Jason.
>>
>> I'm not quite sure e.g allowing userspace PF driver with kernel VF
>> driver would not break the assumption of kernel security model. At least
>> we should forbid an unprivileged PF driver running in userspace.
> It might be useful to have your opinion on this series, because that's
> exactly what we're trying to do here.  Various environments, DPDK
> specifically, want a userspace PF driver.  This series takes steps to
> mitigate the risk of having such a driver, such as requiring this VF
> token interface to extend the VFIO interface and validate participation
> around a PF that is not considered trusted by the kernel.


I may be missing something, but what happens if:

- the PF driver is run by an unprivileged user
- the PF is programmed to send translated DMA requests
- then the unprivileged user can mangle kernel data


> We also set
> a driver_override to try to make sure no host kernel driver can
> automatically bind to a VF of a user owned PF, only vfio-pci, but we
> don't prevent the admin from creating configurations where the VFs are
> used by other host kernel drivers.
>
> I think the question Kevin is inquiring about is whether virtio devices
> are susceptible to the type of collaborative, shared key environment
> we're creating here.  For example, can a VM or qemu have access to
> reset a virtio device in a way that could affect other devices, ex. FLR
> on a PF that could interfere with VF operation.  Thanks,


Right, but I'm not sure whether it can be done via virtio alone or
needs support from the transport (e.g. PCI).

Thanks


>
> Alex
>
Tian, Kevin March 6, 2020, 9:21 a.m. UTC | #7
> From: Alex Williamson
> Sent: Friday, March 6, 2020 1:34 AM
> 
> Hi Kevin,
> 
> Sorry for the delay, I've been out on PTO...
> 
> On Tue, 25 Feb 2020 02:33:27 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Alex Williamson
> > > Sent: Thursday, February 20, 2020 2:54 AM
> > >
> > > Changes since v1 are primarily to patch 3/7 where the commit log is
> > > rewritten, along with option parsing and failure logging based on
> > > upstream discussions.  The primary user visible difference is that
> > > option parsing is now much more strict.  If a vf_token option is
> > > provided that cannot be used, we generate an error.  As a result of
> > > this, opening a PF with a vf_token option will serve as a mechanism of
> > > setting the vf_token.  This seems like a more user friendly API than
> > > the alternative of sometimes requiring the option (VFs in use) and
> > > sometimes rejecting it, and upholds our desire that the option is
> > > always either used or rejected.
> > >
> > > This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
> > > means of setting the VF token, which might call into question whether
> > > we absolutely need this new ioctl.  Currently I'm keeping it because I
> > > can imagine use cases, for example if a hypervisor were to support
> > > SR-IOV, the PF device might be opened without consideration for a VF
> > > token and we'd require the hypervisor to close and re-open the PF in
> > > order to set a known VF token, which is impractical.
> > >
> > > Series overview (same as provided with v1):
> >
> > Thanks for doing this!
> >
> > >
> > > The synopsis of this series is that we have an ongoing desire to drive
> > > PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> > > for this with DPDK drivers and potentially interesting future use
> >
> > Can you provide a link to the DPDK discussion?
> 
> There's a thread here which proposed an out-of-tree driver that enables
> a parallel sr-iov enabling interface for a vfio-pci own device.
> Clearly I felt strongly about it ;)
> 
> https://patches.dpdk.org/patch/58810/
> 
> Also, documentation for making use of an Intel FPGA device with DPDK
> requires the PF bound to igb_uio to support enabling SR-IOV:
> 
> https://doc.dpdk.org/guides/bbdevs/fpga_lte_fec.html

Thanks, that is useful.

> 
> > > cases in virtualization.  We've been reluctant to add this support
> > > previously due to the dependency and trust relationship between the
> > > VF device and PF driver.  Minimally the PF driver can induce a denial
> > > of service to the VF, but depending on the specific implementation,
> > > the PF driver might also be responsible for moving data between VFs
> > > or have direct access to the state of the VF, including data or state
> > > otherwise private to the VF or VF driver.
> >
> > Just a loud thinking. While the motivation of VF token sounds reasonable
> > to me, I'm curious why the same concern is not raised in other usages.
> > For example, there is no such design in virtio framework, where the
> > virtio device could also be restarted, putting in separate process (vhost-
> user),
> > and even in separate VM (virtio-vhost-user), etc. Of course the para-
> > virtualized attribute of virtio implies some degree of trust, but as you
> > mentioned many SR-IOV implementations support VF->PF communication
> > which also implies some level of trust. It's perfectly fine if VFIO just tries
> > to do better than other sub-systems, but knowing how other people
> > tackle the similar problem may make the whole picture clearer. 😊
> >
> > +Jason.
> 
> We can follow the thread with Jason, but I can't really speak to
> whether virtio needs something similar or doesn't provide enough PF
> access to be concerned.  If they need a similar solution, we can
> collaborate, but the extension we're defining here is specifically part
> of the vfio-pci ABI, so it might not be easily portable to virtio.
> 
> > > To help resolve these concerns, we introduce a VF token into the VFIO
> > > PCI ABI, which acts as a shared secret key between drivers.  The
> > > userspace PF driver is required to set the VF token to a known value
> > > and userspace VF drivers are required to provide the token to access
> > > the VF device.  If a PF driver is restarted with VF drivers in use, it
> > > must also provide the current token in order to prevent a rogue
> > > untrusted PF driver from replacing a known driver.  The degree to
> > > which this new token is considered secret is left to the userspace
> > > drivers, the kernel intentionally provides no means to retrieve the
> > > current token.
> >
> > I'm wondering whether the token idea can be used beyond SR-IOV, e.g.
> > (1) we may allow vfio user space to manage Scalable IOV in the future,
> > which faces the similar challenge between the PF and mdev; (2) the
> > token might be used as a canonical way to replace off-tree acs-override
> > workaround, say, allowing the admin to assign devices within the
> > same iommu group to different VMs which trust each other. I'm not
> > sure how much complexity will be further introduced, but it's greatly
> > appreciated if you can help think a bit and if feasible abstract some
> > logic in vfio core layer for such potential usages...
> 
> I don't see how this can be used for ACS override.  Lacking ACS, we
> must assume lack of DMA isolation, which results in our IOMMU grouping.
> If we split IOMMU groups, we imply an isolation that doesn't exist.  A
> user can already create a process that can own the vfio group and pass
> vfio devices to other tasks, with the restriction of having a single
> DMA address space.  If there is DMA isolation, then an mdev solution
> might be better, but given the IOMMU integration of SIOV, I'm not sure
> why the devices wouldn't simply be placed in separate groups by the
> IOMMU driver.  Thanks,

You are right. I overlooked the single DMA address space limitation.

> 
> Alex
> 
> > > Note that the above token is only required for this new model where
> > > both the PF and VF devices are usable through vfio-pci.  Existing
> > > models of VFIO drivers where the PF is used without SR-IOV enabled
> > > or the VF is bound to a userspace driver with an in-kernel, host PF
> > > driver are unaffected.
> > >
> > > The latter configuration above also highlights a new inverted scenario
> > > that is now possible, a userspace PF driver with in-kernel VF drivers.
> > > I believe this is a scenario that should be allowed, but should not be
> > > enabled by default.  This series includes code to set a default
> > > driver_override for VFs sourced from a vfio-pci user owned PF, such
> > > that the VFs are also bound to vfio-pci.  This model is compatible
> > > with tools like driverctl and allows the system administrator to
> > > decide if other bindings should be enabled.  The VF token interface
> > > above exists only between vfio-pci PF and VF drivers, once a VF is
> > > bound to another driver, the administrator has effectively pronounced
> > > the device as trusted.  The vfio-pci driver will note alternate
> > > binding in dmesg for logging and debugging purposes.
> > >
> > > Please review, comment, and test.  The example QEMU implementation
> > > provided with the RFC is still current for this version.  Thanks,
> > >
> > > Alex
> > >
> > > RFC:
> > >
> https://lore.kernel.org/lkml/158085337582.9445.17682266437583505502.stg
> > > it@gimli.home/
> > > v1:
> > >
> https://lore.kernel.org/lkml/158145472604.16827.15751375540102298130.st
> > > git@gimli.home/
> > >
> > > ---
> > >
> > > Alex Williamson (7):
> > >       vfio: Include optional device match in vfio_device_ops callbacks
> > >       vfio/pci: Implement match ops
> > >       vfio/pci: Introduce VF token
> > >       vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
> > >       vfio/pci: Add sriov_configure support
> > >       vfio/pci: Remove dev_fmt definition
> > >       vfio/pci: Cleanup .probe() exit paths
> > >
> > >
> > >  drivers/vfio/pci/vfio_pci.c         |  383
> > > +++++++++++++++++++++++++++++++++--
> > >  drivers/vfio/pci/vfio_pci_private.h |   10 +
> > >  drivers/vfio/vfio.c                 |   20 +-
> > >  include/linux/vfio.h                |    4
> > >  include/uapi/linux/vfio.h           |   37 +++
> > >  5 files changed, 426 insertions(+), 28 deletions(-)
> >
Alex Williamson March 6, 2020, 4:24 p.m. UTC | #8
On Fri, 6 Mar 2020 11:35:21 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2020/3/6 1:14 AM, Alex Williamson wrote:
> > On Tue, 25 Feb 2020 14:09:07 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >  
> >> On 2020/2/25 10:33 AM, Tian, Kevin wrote:
> >>>> From: Alex Williamson
> >>>> Sent: Thursday, February 20, 2020 2:54 AM
> >>>>
> >>>> Changes since v1 are primarily to patch 3/7 where the commit log is
> >>>> rewritten, along with option parsing and failure logging based on
> >>>> upstream discussions.  The primary user visible difference is that
> >>>> option parsing is now much more strict.  If a vf_token option is
> >>>> provided that cannot be used, we generate an error.  As a result of
> >>>> this, opening a PF with a vf_token option will serve as a mechanism of
> >>>> setting the vf_token.  This seems like a more user friendly API than
> >>>> the alternative of sometimes requiring the option (VFs in use) and
> >>>> sometimes rejecting it, and upholds our desire that the option is
> >>>> always either used or rejected.
> >>>>
> >>>> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
> >>>> means of setting the VF token, which might call into question whether
> >>>> we absolutely need this new ioctl.  Currently I'm keeping it because I
> >>>> can imagine use cases, for example if a hypervisor were to support
> >>>> SR-IOV, the PF device might be opened without consideration for a VF
> >>>> token and we'd require the hypervisor to close and re-open the PF in
> >>>> order to set a known VF token, which is impractical.
> >>>>
> >>>> Series overview (same as provided with v1):  
> >>> Thanks for doing this!
> >>>     
> >>>> The synopsis of this series is that we have an ongoing desire to drive
> >>>> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> >>>> for this with DPDK drivers and potentially interesting future use  
> >>> Can you provide a link to the DPDK discussion?
> >>>     
> >>>> cases in virtualization.  We've been reluctant to add this support
> >>>> previously due to the dependency and trust relationship between the
> >>>> VF device and PF driver.  Minimally the PF driver can induce a denial
> >>>> of service to the VF, but depending on the specific implementation,
> >>>> the PF driver might also be responsible for moving data between VFs
> >>>> or have direct access to the state of the VF, including data or state
> >>>> otherwise private to the VF or VF driver.  
> >>> Just a loud thinking. While the motivation of VF token sounds reasonable
> >>> to me, I'm curious why the same concern is not raised in other usages.
> >>> For example, there is no such design in virtio framework, where the
> >>> virtio device could also be restarted, putting in separate process (vhost-user),
> >>> and even in separate VM (virtio-vhost-user), etc.  
> >>
> >> AFAIK, the restart could only be triggered by either VM or qemu. But
> >> yes, the datapath could be offloaded.
> >>
> >> But I'm not sure introducing another dedicated mechanism is better than
> >> using the existing generic POSIX mechanism to make sure the connection
> >> (AF_UNIX) is secure.
> >>
> >>  
> >>>    Of course the para-
> >>> virtualized attribute of virtio implies some degree of trust, but as you
> >>> mentioned many SR-IOV implementations support VF->PF communication
> >>> which also implies some level of trust. It's perfectly fine if VFIO just tries
> >>> to do better than other sub-systems, but knowing how other people
> >>> tackle the similar problem may make the whole picture clearer. 😊
> >>>
> >>> +Jason.  
> >>
> >> I'm not quite sure e.g allowing userspace PF driver with kernel VF
> >> driver would not break the assumption of kernel security model. At least
> >> we should forbid an unprivileged PF driver running in userspace.
> > It might be useful to have your opinion on this series, because that's
> > exactly what we're trying to do here.  Various environments, DPDK
> > specifically, want a userspace PF driver.  This series takes steps to
> > mitigate the risk of having such a driver, such as requiring this VF
> > token interface to extend the VFIO interface and validate participation
> > around a PF that is not considered trusted by the kernel.  
> 
> 
> I may be missing something, but what happens if:
> 
> - the PF driver is run by an unprivileged user
> - the PF is programmed to send translated DMA requests
> - then the unprivileged user can mangle kernel data

ATS is a security risk regardless of SR-IOV; how does this change it?
Thanks,

Alex

> > We also set
> > a driver_override to try to make sure no host kernel driver can
> > automatically bind to a VF of a user owned PF, only vfio-pci, but we
> > don't prevent the admin from creating configurations where the VFs are
> > used by other host kernel drivers.
> >
> > I think the question Kevin is inquiring about is whether virtio devices
> > are susceptible to the type of collaborative, shared key environment
> > we're creating here.  For example, can a VM or qemu have access to
> > reset a virtio device in a way that could affect other devices, ex. FLR
> > on a PF that could interfere with VF operation.  Thanks,  
> 
> 
> Right, but I'm not sure whether it can be done via virtio alone or
> needs support from the transport (e.g. PCI).
> 
> Thanks
> 
> 
> >
> > Alex
> >
Jason Wang March 9, 2020, 3:36 a.m. UTC | #9
On 2020/3/7 12:24 AM, Alex Williamson wrote:
> On Fri, 6 Mar 2020 11:35:21 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> On 2020/3/6 1:14 AM, Alex Williamson wrote:
>>> On Tue, 25 Feb 2020 14:09:07 +0800
>>> Jason Wang <jasowang@redhat.com> wrote:
>>>   
>>>> On 2020/2/25 10:33 AM, Tian, Kevin wrote:
>>>>>> From: Alex Williamson
>>>>>> Sent: Thursday, February 20, 2020 2:54 AM
>>>>>>
>>>>>> Changes since v1 are primarily to patch 3/7 where the commit log is
>>>>>> rewritten, along with option parsing and failure logging based on
>>>>>> upstream discussions.  The primary user visible difference is that
>>>>>> option parsing is now much more strict.  If a vf_token option is
>>>>>> provided that cannot be used, we generate an error.  As a result of
>>>>>> this, opening a PF with a vf_token option will serve as a mechanism of
>>>>>> setting the vf_token.  This seems like a more user friendly API than
>>>>>> the alternative of sometimes requiring the option (VFs in use) and
>>>>>> sometimes rejecting it, and upholds our desire that the option is
>>>>>> always either used or rejected.
>>>>>>
>>>>>> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
>>>>>> means of setting the VF token, which might call into question whether
>>>>>> we absolutely need this new ioctl.  Currently I'm keeping it because I
>>>>>> can imagine use cases, for example if a hypervisor were to support
>>>>>> SR-IOV, the PF device might be opened without consideration for a VF
>>>>>> token and we'd require the hypervisor to close and re-open the PF in
>>>>>> order to set a known VF token, which is impractical.
>>>>>>
>>>>>> Series overview (same as provided with v1):
>>>>> Thanks for doing this!
>>>>>      
>>>>>> The synopsis of this series is that we have an ongoing desire to drive
>>>>>> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
>>>>>> for this with DPDK drivers and potentially interesting future use
>>>>> Can you provide a link to the DPDK discussion?
>>>>>      
>>>>>> cases in virtualization.  We've been reluctant to add this support
>>>>>> previously due to the dependency and trust relationship between the
>>>>>> VF device and PF driver.  Minimally the PF driver can induce a denial
>>>>>> of service to the VF, but depending on the specific implementation,
>>>>>> the PF driver might also be responsible for moving data between VFs
>>>>>> or have direct access to the state of the VF, including data or state
>>>>>> otherwise private to the VF or VF driver.
>>>>> Just a loud thinking. While the motivation of VF token sounds reasonable
>>>>> to me, I'm curious why the same concern is not raised in other usages.
>>>>> For example, there is no such design in virtio framework, where the
>>>>> virtio device could also be restarted, putting in separate process (vhost-user),
>>>>> and even in separate VM (virtio-vhost-user), etc.
>>>> AFAIK, the restart could only be triggered by either VM or qemu. But
>>>> yes, the datapath could be offloaded.
>>>>
>>>> But I'm not sure introducing another dedicated mechanism is better than
>>>> using the existing generic POSIX mechanism to make sure the connection
>>>> (AF_UNIX) is secure.
>>>>
>>>>   
>>>>>     Of course the para-
>>>>> virtualized attribute of virtio implies some degree of trust, but as you
>>>>> mentioned many SR-IOV implementations support VF->PF communication
>>>>> which also implies some level of trust. It's perfectly fine if VFIO just tries
>>>>> to do better than other sub-systems, but knowing how other people
>>>>> tackle the similar problem may make the whole picture clearer. 😊
>>>>>
>>>>> +Jason.
>>>> I'm not quite sure e.g allowing userspace PF driver with kernel VF
>>>> driver would not break the assumption of kernel security model. At least
>>>> we should forbid an unprivileged PF driver running in userspace.
>>> It might be useful to have your opinion on this series, because that's
>>> exactly what we're trying to do here.  Various environments, DPDK
>>> specifically, want a userspace PF driver.  This series takes steps to
>>> mitigate the risk of having such a driver, such as requiring this VF
>>> token interface to extend the VFIO interface and validate participation
>>> around a PF that is not considered trusted by the kernel.
>>
>> I may be missing something, but what happens if:
>>
>> - the PF driver is run by an unprivileged user
>> - the PF is programmed to send translated DMA requests
>> - then the unprivileged user can mangle kernel data
> ATS is a security risk regardless of SR-IOV; how does this change it?
> Thanks,


My understanding is that ATS only happens for some buggy devices. Some
hardware has an on-chip IOMMU, which probably means an unprivileged
userspace PF driver can control the on-chip IOMMU in this case.

Thanks


>
> Alex
>
>>> We also set
>>> a driver_override to try to make sure no host kernel driver can
>>> automatically bind to a VF of a user owned PF, only vfio-pci, but we
>>> don't prevent the admin from creating configurations where the VFs are
>>> used by other host kernel drivers.
>>>
>>> I think the question Kevin is inquiring about is whether virtio devices
>>> are susceptible to the type of collaborative, shared key environment
>>> we're creating here.  For example, can a VM or qemu have access to
>>> reset a virtio device in a way that could affect other devices, ex. FLR
>>> on a PF that could interfere with VF operation.  Thanks,
>>
> >> Right, but I'm not sure whether it can be done via virtio alone or
> >> needs support from the transport (e.g. PCI).
>>
>> Thanks
>>
>>
>>> Alex
>>>
Alex Williamson March 9, 2020, 2:45 p.m. UTC | #10
On Mon, 9 Mar 2020 11:36:46 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2020/3/7 12:24 AM, Alex Williamson wrote:
> > On Fri, 6 Mar 2020 11:35:21 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >  
> >> On 2020/3/6 1:14 AM, Alex Williamson wrote:
> >>> On Tue, 25 Feb 2020 14:09:07 +0800
> >>> Jason Wang <jasowang@redhat.com> wrote:
> >>>     
> >>>> On 2020/2/25 10:33 AM, Tian, Kevin wrote:
> >>>>>> From: Alex Williamson
> >>>>>> Sent: Thursday, February 20, 2020 2:54 AM
> >>>>>>
> >>>>>> Changes since v1 are primarily to patch 3/7 where the commit log is
> >>>>>> rewritten, along with option parsing and failure logging based on
> >>>>>> upstream discussions.  The primary user visible difference is that
> >>>>>> option parsing is now much more strict.  If a vf_token option is
> >>>>>> provided that cannot be used, we generate an error.  As a result of
> >>>>>> this, opening a PF with a vf_token option will serve as a mechanism of
> >>>>>> setting the vf_token.  This seems like a more user friendly API than
> >>>>>> the alternative of sometimes requiring the option (VFs in use) and
> >>>>>> sometimes rejecting it, and upholds our desire that the option is
> >>>>>> always either used or rejected.
> >>>>>>
> >>>>>> This also means that the VFIO_DEVICE_FEATURE ioctl is not the only
> >>>>>> means of setting the VF token, which might call into question whether
> >>>>>> we absolutely need this new ioctl.  Currently I'm keeping it because I
> >>>>>> can imagine use cases, for example if a hypervisor were to support
> >>>>>> SR-IOV, the PF device might be opened without consideration for a VF
> >>>>>> token and we'd require the hypervisor to close and re-open the PF in
> >>>>>> order to set a known VF token, which is impractical.
> >>>>>>
> >>>>>> Series overview (same as provided with v1):  
> >>>>> Thanks for doing this!
> >>>>>        
> >>>>>> The synopsis of this series is that we have an ongoing desire to drive
> >>>>>> PCIe SR-IOV PFs from userspace with VFIO.  There's an immediate need
> >>>>>> for this with DPDK drivers and potentially interesting future use  
> >>>>> Can you provide a link to the DPDK discussion?
> >>>>>        
> >>>>>> cases in virtualization.  We've been reluctant to add this support
> >>>>>> previously due to the dependency and trust relationship between the
> >>>>>> VF device and PF driver.  Minimally the PF driver can induce a denial
> >>>>>> of service to the VF, but depending on the specific implementation,
> >>>>>> the PF driver might also be responsible for moving data between VFs
> >>>>>> or have direct access to the state of the VF, including data or state
> >>>>>> otherwise private to the VF or VF driver.  
> >>>>> Just a loud thinking. While the motivation of VF token sounds reasonable
> >>>>> to me, I'm curious why the same concern is not raised in other usages.
> >>>>> For example, there is no such design in virtio framework, where the
> >>>>> virtio device could also be restarted, putting in separate process (vhost-user),
> >>>>> and even in separate VM (virtio-vhost-user), etc.  
> >>>> AFAIK, the restart could only be triggered by either VM or qemu. But
> >>>> yes, the datapath could be offloaded.
> >>>>
> >>>> But I'm not sure introducing another dedicated mechanism is better than
> >>>> using the existing generic POSIX mechanism to make sure the connection
> >>>> (AF_UNIX) is secure.
> >>>>
> >>>>     
> >>>>>     Of course the para-
> >>>>> virtualized attribute of virtio implies some degree of trust, but as you
> >>>>> mentioned many SR-IOV implementations support VF->PF communication
> >>>>> which also implies some level of trust. It's perfectly fine if VFIO just tries
> >>>>> to do better than other sub-systems, but knowing how other people
> >>>>> tackle the similar problem may make the whole picture clearer. 😊
> >>>>>
> >>>>> +Jason.  
> >>>> I'm not quite sure e.g allowing userspace PF driver with kernel VF
> >>>> driver would not break the assumption of kernel security model. At least
> >>>> we should forbid an unprivileged PF driver running in userspace.
> >>> It might be useful to have your opinion on this series, because that's
> >>> exactly what we're trying to do here.  Various environments, DPDK
> >>> specifically, want a userspace PF driver.  This series takes steps to
> >>> mitigate the risk of having such a driver, such as requiring this VF
> >>> token interface to extend the VFIO interface and validate participation
> >>> around a PF that is not considered trusted by the kernel.  
> >>
> >> I may be missing something, but what happens if:
> >>
> >> - the PF driver is run by an unprivileged user
> >> - the PF is programmed to send translated DMA requests
> >> - then the unprivileged user can mangle kernel data
> > ATS is a security risk regardless of SR-IOV; how does this change it?
> > Thanks,  
> 
> 
> > My understanding is that ATS only happens for some buggy devices. Some
> > hardware has an on-chip IOMMU, which probably means an unprivileged
> > userspace PF driver can control the on-chip IOMMU in this case.

Again, how does this relate to SR-IOV?  A PF is currently assignable
regardless of the support in this series.  Thanks,

Alex