
[ovs-dev,2/2] netdev-dpdk: add support for vhost IOMMU feature

Message ID 1510830105-157433-3-git-send-email-mark.b.kavanagh@intel.com
State Changes Requested
Series: DPDK v17.11 Support

Commit Message

Mark Kavanagh Nov. 16, 2017, 11:01 a.m. UTC
DPDK v17.11 introduces support for the vHost IOMMU feature.
This is a security feature that restricts the vhost memory
that a virtio device may access.

This feature also enables the vhost REPLY_ACK protocol, the
implementation of which is known to work in newer versions of
QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
v2.9.0, inclusive). As such, the feature is disabled by default
(and should remain so for the aforementioned older QEMU
versions). Starting with QEMU v2.9.1, vhost-iommu-support can
safely be enabled, even without having an IOMMU device, with
no performance penalty.

This patch adds a new vhost port option, vhost-iommu-support,
to allow enablement of the vhost IOMMU feature:

    $ ovs-vsctl add-port br0 vhost-client-1 \
        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
             options:vhost-iommu-support=true

Note that support for this feature is only implemented for vhost
user client ports (since vhost user ports are considered deprecated).

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
---
 Documentation/topics/dpdk/vhost-user.rst | 21 +++++++++++++++++++++
 NEWS                                     |  1 +
 lib/netdev-dpdk.c                        | 29 ++++++++++++++++++++++++++---
 vswitchd/vswitch.xml                     | 10 ++++++++++
 4 files changed, 58 insertions(+), 3 deletions(-)

Comments

Ilya Maximets Nov. 24, 2017, 8:06 a.m. UTC | #1
Hello, Mark.

Thanks for the patch. Few comments:

1. We cannot change the RTE_VHOST_USER_IOMMU_SUPPORT flag if the device is
   already registered, i.e. if (dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT) != 0.
   IMHO, we should print a warning message in that case, and should not update
   the device flags or request reconfiguration.

   It should be clearly described in the commit message and documentation that
   vhost-iommu-support can't be changed once the device is fully initialized.
   It should be clear that this option must be set before, or simultaneously
   with, vhost-server-path.

2. General style comment for the patch: according to the coding style, you need
   to use braces around if statements, even when the body is a single statement.

3. Few more comments inline.

> DPDK v17.11 introduces support for the vHost IOMMU feature.
> This is a security feature that restricts the vhost memory
> that a virtio device may access.
> 
> This feature also enables the vhost REPLY_ACK protocol, the
> implementation of which is known to work in newer versions of
> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> v2.9.0, inclusive). As such, the feature is disabled by default
> (and should remain so for the aforementioned older QEMU
> versions). Starting with QEMU v2.9.1, vhost-iommu-support can
> safely be enabled, even without having an IOMMU device, with
> no performance penalty.
> 
> This patch adds a new vhost port option, vhost-iommu-support,
> to allow enablement of the vhost IOMMU feature:
> 
>     $ ovs-vsctl add-port br0 vhost-client-1 \
>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>              options:vhost-iommu-support=true
> 
> Note that support for this feature is only implemented for vhost
> user client ports (since vhost user ports are considered deprecated).
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
> Acked-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> Acked-by: Ciara Loftus <ciara.loftus at intel.com>
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 21 +++++++++++++++++++++
>  NEWS                                     |  1 +
>  lib/netdev-dpdk.c                        | 29 ++++++++++++++++++++++++++---
>  vswitchd/vswitch.xml                     | 10 ++++++++++
>  4 files changed, 58 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> index 5347995..8dff901 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -250,6 +250,27 @@ Once the vhost-user-client ports have been added to the switch, they must be
>  added to the guest. Like vhost-user ports, there are two ways to do this: using
>  QEMU directly, or using libvirt. Only the QEMU case is covered here.
>  
> +vhost-user client IOMMU
> +~~~~~~~~~~~~~~~~~~~~~~~
> +It is possible to enable IOMMU support for vHost User client ports. This is
> +a feature which restricts the vhost memory that a virtio device can access, and
> +as such is useful in deployments in which security is a concern. IOMMU mode may
> +be enabled on the command line::
> +
> +    $ ovs-vsctl add-port br0 vhost-client-1 \
> +        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
> +             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
> +             options:vhost-iommu-support=true
> +
> +.. important::
> +
> +    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feature is disabled by
> +    default (and should remain so if using the aforementioned versions of QEMU).
> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
> +    without having an IOMMU device, with no performance penalty.
> +
>  Adding vhost-user-client ports to the guest (QEMU)
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/NEWS b/NEWS
> index 74e59bf..c15dc24 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -14,6 +14,7 @@ Post-v2.8.0
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>     - DPDK:
>       * Add support for DPDK v17.11
> +     * Add support for vHost IOMMU feature
>  
>  v2.8.0 - 31 Aug 2017
>  --------------------
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ed5bf62..2e9633a 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1424,15 +1424,29 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      const char *path;
> +    bool iommu_enable;
> +    bool request_reconfigure = false;
> +    uint64_t vhost_driver_flags_prev = dev->vhost_driver_flags;

This should be done under the mutex.

>  
>      ovs_mutex_lock(&dev->mutex);
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
>          path = smap_get(args, "vhost-server-path");
>          if (path && strcmp(path, dev->vhost_id)) {
>              strcpy(dev->vhost_id, path);
> -            netdev_request_reconfigure(netdev);
> +            request_reconfigure = true;
>          }
>      }
> +
> +    iommu_enable = smap_get_bool(args, "vhost-iommu-support", false);
> +    if (iommu_enable)
> +        dev->vhost_driver_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +    else
> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_IOMMU_SUPPORT;
> +    if (vhost_driver_flags_prev != dev->vhost_driver_flags)
> +        request_reconfigure = true;
> +
> +    if (request_reconfigure)
> +        netdev_request_reconfigure(netdev);
>      ovs_mutex_unlock(&dev->mutex);
>  
>      return 0;
> @@ -3326,9 +3340,18 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>       */
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>              && strlen(dev->vhost_id)) {
> +        /* Enable vhost IOMMU, if it was requested.

The comment above is connected to the 'if', not to 'flags'. I suggest splitting
the comment: move its first line to between the 'flags' declaration and the 'if'.

> +         * XXX: the 'flags' variable is required, as not all vhost backend
> +         * features are currently supported by OvS; in time, it should be
> +         * possible to invoke rte_vhost_driver_register(), passing
> +         * dev->vhost_driver_flags directly as a parameter to same.
> +         */
> +        uint64_t flags = RTE_VHOST_USER_CLIENT;
In any case, there should be a blank line between the variable declaration and
the other code.

> +        if (dev->vhost_driver_flags & RTE_VHOST_USER_IOMMU_SUPPORT)
> +            flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +
>          /* Register client-mode device */
> -        err = rte_vhost_driver_register(dev->vhost_id,
> -                                        RTE_VHOST_USER_CLIENT);
> +        err = rte_vhost_driver_register(dev->vhost_id, flags);
>          if (err) {
>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>                       dev->vhost_id);
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index c145e1a..a633226 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2654,6 +2654,16 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>          </p>
>        </column>
>  
> +      <column name="options" key="vhost-iommu-support"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          The value specifies whether IOMMU support is enabled for a vHost User
> +          client mode device that has been or will be created by QEMU.
> +          Only supported by dpdkvhostuserclient interfaces. If not specified or
> +          an incorrect value is specified, defaults to 'false'.
> +        </p>
> +      </column>
> +
>        <column name="options" key="n_rxq_desc"
>                type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
>          <p>
> -- 
> 1.9.3
Stokes, Ian Nov. 24, 2017, 8:18 a.m. UTC | #2
> Hello, Mark.
> 
> Thanks for the patch. Few comments:
> 

Thanks for the feedback Ilya, great minds :). The feedback is quite timely; from my point of view, I'd like to see these issues resolved as quickly as possible. I don't think there's anything show-stopper-wise in what you've raised; we should be able to make these adjustments.


@Mark, what do you think?

Ian


Stokes, Ian Nov. 24, 2017, 9:01 a.m. UTC | #3
> -----Original Message-----
> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> bounces@openvswitch.org] On Behalf Of Stokes, Ian
> Sent: Friday, November 24, 2017 8:19 AM
> To: Ilya Maximets <i.maximets@samsung.com>; ovs-dev@openvswitch.org;
> Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
> IOMMU feature
Kevin Traynor Nov. 24, 2017, 11:31 a.m. UTC | #4
On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
> DPDK v17.11 introduces support for the vHost IOMMU feature.
> This is a security feature that restricts the vhost memory
> that a virtio device may access.
> 
> This feature also enables the vhost REPLY_ACK protocol, the
> implementation of which is known to work in newer versions of
> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> v2.9.0, inclusive). As such, the feature is disabled by default
> (and should remain so for the aforementioned older QEMU
> versions). Starting with QEMU v2.9.1, vhost-iommu-support can
> safely be enabled, even without having an IOMMU device, with
> no performance penalty.
> 
> This patch adds a new vhost port option, vhost-iommu-support,
> to allow enablement of the vhost IOMMU feature:
> 
>     $ ovs-vsctl add-port br0 vhost-client-1 \
>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>              options:vhost-iommu-support=true
> 
> Note that support for this feature is only implemented for vhost
> user client ports (since vhost user ports are considered deprecated).
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Acked-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 21 +++++++++++++++++++++
>  NEWS                                     |  1 +
>  lib/netdev-dpdk.c                        | 29 ++++++++++++++++++++++++++---
>  vswitchd/vswitch.xml                     | 10 ++++++++++
>  4 files changed, 58 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> index 5347995..8dff901 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -250,6 +250,27 @@ Once the vhost-user-client ports have been added to the switch, they must be
>  added to the guest. Like vhost-user ports, there are two ways to do this: using
>  QEMU directly, or using libvirt. Only the QEMU case is covered here.
>  
> +vhost-user client IOMMU
> +~~~~~~~~~~~~~~~~~~~~~~~
> +It is possible to enable IOMMU support for vHost User client ports. This is
> +a feature which restricts the vhost memory that a virtio device can access, and
> +as such is useful in deployments in which security is a concern. IOMMU mode may
> +be enabled on the command line::
> +
> +    $ ovs-vsctl add-port br0 vhost-client-1 \
> +        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
> +             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
> +             options:vhost-iommu-support=true
> +
> +.. important::
> +
> +    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled by
> +    default (and should remain so if using the aforementioned versions of QEMU).
> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
> +    without having an IOMMU device, with no performance penalty.
> +
>  Adding vhost-user-client ports to the guest (QEMU)
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/NEWS b/NEWS
> index 74e59bf..c15dc24 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -14,6 +14,7 @@ Post-v2.8.0
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>     - DPDK:
>       * Add support for DPDK v17.11
> +     * Add support for vHost IOMMU feature
>  
>  v2.8.0 - 31 Aug 2017
>  --------------------
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ed5bf62..2e9633a 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1424,15 +1424,29 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      const char *path;
> +    bool iommu_enable;
> +    bool request_reconfigure = false;
> +    uint64_t vhost_driver_flags_prev = dev->vhost_driver_flags;
>  
>      ovs_mutex_lock(&dev->mutex);
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
>          path = smap_get(args, "vhost-server-path");
>          if (path && strcmp(path, dev->vhost_id)) {
>              strcpy(dev->vhost_id, path);
> -            netdev_request_reconfigure(netdev);
> +            request_reconfigure = true;
>          }
>      }
> +
> +    iommu_enable = smap_get_bool(args, "vhost-iommu-support", false);
> +    if (iommu_enable)
> +        dev->vhost_driver_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +    else
> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_IOMMU_SUPPORT;
> +    if (vhost_driver_flags_prev != dev->vhost_driver_flags)
> +        request_reconfigure = true;
> +
> +    if (request_reconfigure)
> +        netdev_request_reconfigure(netdev);
>      ovs_mutex_unlock(&dev->mutex);
>  
>      return 0;
> @@ -3326,9 +3340,18 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>       */
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>              && strlen(dev->vhost_id)) {
> +        /* Enable vhost IOMMU, if it was requested.
> +         * XXX: the 'flags' variable is required, as not all vhost backend
> +         * features are currently supported by OvS; in time, it should be
> +         * possible to invoke rte_vhost_driver_register(), passing
> +         * dev->vhost_driver_flags directly as a parameter to same.
> +         */

Can you elaborate on this? I only see vhost_driver_flags used to
indicate support for client and iommu, so I'm not sure the need for the
flags variable.

> +        uint64_t flags = RTE_VHOST_USER_CLIENT;
> +        if (dev->vhost_driver_flags & RTE_VHOST_USER_IOMMU_SUPPORT)
> +            flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +
>          /* Register client-mode device */
> -        err = rte_vhost_driver_register(dev->vhost_id,
> -                                        RTE_VHOST_USER_CLIENT);
> +        err = rte_vhost_driver_register(dev->vhost_id, flags);
>          if (err) {
>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>                       dev->vhost_id);
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index c145e1a..a633226 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2654,6 +2654,16 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>          </p>
>        </column>
>  
> +      <column name="options" key="vhost-iommu-support"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          The value specifies whether IOMMU support is enabled for a vHost User
> +          client mode device that has been or will be created by QEMU.
> +          Only supported by dpdkvhostuserclient interfaces. If not specified or
> +          an incorrect value is specified, defaults to 'false'.
> +        </p>
> +      </column>
> +
>        <column name="options" key="n_rxq_desc"
>                type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
>          <p>
>
Mark Kavanagh Nov. 24, 2017, 12:58 p.m. UTC | #5
>From: Kevin Traynor [mailto:ktraynor@redhat.com]
>Sent: Friday, November 24, 2017 11:31 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>Cc: maxime.coquelin@redhat.com
>Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>feature
>
>On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>> This is a security feature, that restricts the vhost memory
>> that a virtio device may access.
>>
>> This feature also enables the vhost REPLY_ACK protocol, the
>> implementation of which is known to work in newer versions of
>> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
>> v2.9.0, inclusive). As such, the feature is disabled by default
>> in (and should remain so, for the aforementioned older QEMU
> versions). Starting with QEMU v2.9.1, vhost-iommu-support can
>> safely be enabled, even without having an IOMMU device, with
>> no performance penalty.
>>
>> This patch adds a new vhost port option, vhost-iommu-support,
>> to allow enablement of the vhost IOMMU feature:
>>
>>     $ ovs-vsctl add-port br0 vhost-client-1 \
>>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>>              options:vhost-iommu-support=true
>>
>> Note that support for this feature is only implemented for vhost
>> user client ports (since vhost user ports are considered deprecated).
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Acked-by: Ciara Loftus <ciara.loftus@intel.com>
>> ---
>>  Documentation/topics/dpdk/vhost-user.rst | 21 +++++++++++++++++++++
>>  NEWS                                     |  1 +
>>  lib/netdev-dpdk.c                        | 29 ++++++++++++++++++++++++++---
>>  vswitchd/vswitch.xml                     | 10 ++++++++++
>>  4 files changed, 58 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>b/Documentation/topics/dpdk/vhost-user.rst
>> index 5347995..8dff901 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -250,6 +250,27 @@ Once the vhost-user-client ports have been added to the
>switch, they must be
>>  added to the guest. Like vhost-user ports, there are two ways to do this:
>using
>>  QEMU directly, or using libvirt. Only the QEMU case is covered here.
>>
>> +vhost-user client IOMMU
>> +~~~~~~~~~~~~~~~~~~~~~~~
>> +It is possible to enable IOMMU support for vHost User client ports. This is
>> +a feature which restricts the vhost memory that a virtio device can access,
>and
>> +as such is useful in deployments in which security is a concern. IOMMU mode
>may
>> +be enabled on the command line::
>> +
>> +    $ ovs-vsctl add-port br0 vhost-client-1 \
>> +        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>> +             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>> +             options:vhost-iommu-support=true
>> +
>> +.. important::
>> +
>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>protocol;
>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feature is disabled
>by
>> +    default (and should remain so if using the aforementioned versions of
>QEMU).
>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>even
>> +    without having an IOMMU device, with no performance penalty.
>> +
>>  Adding vhost-user-client ports to the guest (QEMU)
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> diff --git a/NEWS b/NEWS
>> index 74e59bf..c15dc24 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -14,6 +14,7 @@ Post-v2.8.0
>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>     - DPDK:
>>       * Add support for DPDK v17.11
>> +     * Add support for vHost IOMMU feature
>>
>>  v2.8.0 - 31 Aug 2017
>>  --------------------
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index ed5bf62..2e9633a 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -1424,15 +1424,29 @@ netdev_dpdk_vhost_client_set_config(struct netdev
>*netdev,
>>  {
>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>      const char *path;
>> +    bool iommu_enable;
>> +    bool request_reconfigure = false;
>> +    uint64_t vhost_driver_flags_prev = dev->vhost_driver_flags;
>>
>>      ovs_mutex_lock(&dev->mutex);
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
>>          path = smap_get(args, "vhost-server-path");
>>          if (path && strcmp(path, dev->vhost_id)) {
>>              strcpy(dev->vhost_id, path);
>> -            netdev_request_reconfigure(netdev);
>> +            request_reconfigure = true;
>>          }
>>      }
>> +
>> +    iommu_enable = smap_get_bool(args, "vhost-iommu-support", false);
>> +    if (iommu_enable)
>> +        dev->vhost_driver_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +    else
>> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_IOMMU_SUPPORT;
>> +    if (vhost_driver_flags_prev != dev->vhost_driver_flags)
>> +        request_reconfigure = true;
>> +
>> +    if (request_reconfigure)
>> +        netdev_request_reconfigure(netdev);
>>      ovs_mutex_unlock(&dev->mutex);
>>
>>      return 0;
>> @@ -3326,9 +3340,18 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>*netdev)
>>       */
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>              && strlen(dev->vhost_id)) {
>> +        /* Enable vhost IOMMU, if it was requested.
>> +         * XXX: the 'flags' variable is required, as not all vhost backend
>> +         * features are currently supported by OvS; in time, it should be
>> +         * possible to invoke rte_vhost_driver_register(), passing
>> +         * dev->vhost_driver_flags directly as a parameter to same.
>> +         */
>
>Can you elaborate on this? I only see vhost_driver_flags used to
>indicate support for client and iommu, so I'm not sure the need for the
>flags variable.

Hey Kevin,

There's also RTE_VHOST_USER_DEQUEUE_ZERO_COPY, which isn't supported yet.

Thanks,
Mark

>
>> +        uint64_t flags = RTE_VHOST_USER_CLIENT;
>> +        if (dev->vhost_driver_flags & RTE_VHOST_USER_IOMMU_SUPPORT)
>> +            flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +
>>          /* Register client-mode device */
>> -        err = rte_vhost_driver_register(dev->vhost_id,
>> -                                        RTE_VHOST_USER_CLIENT);
>> +        err = rte_vhost_driver_register(dev->vhost_id, flags);
>>          if (err) {
>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>                       dev->vhost_id);
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index c145e1a..a633226 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -2654,6 +2654,16 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>type=patch options:peer=p1 \
>>          </p>
>>        </column>
>>
>> +      <column name="options" key="vhost-iommu-support"
>> +              type='{"type": "boolean"}'>
>> +        <p>
>> +          The value specifies whether IOMMU support is enabled for a vHost
>User
>> +          client mode device that has been or will be created by QEMU.
>> +          Only supported by dpdkvhostuserclient interfaces. If not
>specified or
>> +          an incorrect value is specified, defaults to 'false'.
>> +        </p>
>> +      </column>
>> +
>>        <column name="options" key="n_rxq_desc"
>>                type='{"type": "integer", "minInteger": 1, "maxInteger":
>4096}'>
>>          <p>
>>
Kevin Traynor Nov. 24, 2017, 2:38 p.m. UTC | #6
On 11/24/2017 08:06 AM, Ilya Maximets wrote:
> Hello, Mark.
> 
> Thanks for the patch. Few comments:
> 
> 1. We can not change the RTE_VHOST_USER_IOMMU_SUPPORT flag if device already
>    registered, i.e. if (dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT) != 0.
>    IMHO, we should print a warning message in that case and should not update
>    the device flags and request reconfiguration.
> 

Also, I'm not sure there's any point to request a reconfigure on change
of iommu regardless of the state of the client flag.
rte_vhost_driver_register() will not happen anyway until there is a
vhost-server-path. The only thing I can think of is that it would
decouple the config and reconfig fns.

>    This should be clearly described in commit message and documentation that
>    vhost-iommu-support can't be changed if device already fully initialized.
>    It should be clear that this option should be set before or simultaneously
>    with vhost-server-path.
> 
> 2. General style comment for the patch: According to coding style, you need to
>    use braces for if statements even if there is only one statement in it.
> 
> 3. Few more comments inline.
> 
>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>> This is a security feature, that restricts the vhost memory
>> that a virtio device may access.
>>
>> This feature also enables the vhost REPLY_ACK protocol, the
>> implementation of which is known to work in newer versions of
>> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
>> v2.9.0, inclusive). As such, the feature is disabled by default
>> in (and should remain so, for the aforementioned older QEMU
>> versions). Starting with QEMU v2.9.1, vhost-iommu-support can
>> safely be enabled, even without having an IOMMU device, with
>> no performance penalty.
>>
>> This patch adds a new vhost port option, vhost-iommu-support,
>> to allow enablement of the vhost IOMMU feature:
>>
>>     $ ovs-vsctl add-port br0 vhost-client-1 \
>>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>>              options:vhost-iommu-support=true
>>
>> Note that support for this feature is only implemented for vhost
>> user client ports (since vhost user ports are considered deprecated).
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
>> Acked-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>> Acked-by: Ciara Loftus <ciara.loftus at intel.com>
>> ---
>>  Documentation/topics/dpdk/vhost-user.rst | 21 +++++++++++++++++++++
>>  NEWS                                     |  1 +
>>  lib/netdev-dpdk.c                        | 29 ++++++++++++++++++++++++++---
>>  vswitchd/vswitch.xml                     | 10 ++++++++++
>>  4 files changed, 58 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
>> index 5347995..8dff901 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -250,6 +250,27 @@ Once the vhost-user-client ports have been added to the switch, they must be
>>  added to the guest. Like vhost-user ports, there are two ways to do this: using
>>  QEMU directly, or using libvirt. Only the QEMU case is covered here.
>>  
>> +vhost-user client IOMMU
>> +~~~~~~~~~~~~~~~~~~~~~~~
>> +It is possible to enable IOMMU support for vHost User client ports. This is
>> +a feature which restricts the vhost memory that a virtio device can access, and
>> +as such is useful in deployments in which security is a concern. IOMMU mode may
>> +be enabled on the command line::
>> +
>> +    $ ovs-vsctl add-port br0 vhost-client-1 \
>> +        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>> +             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>> +             options:vhost-iommu-support=true
>> +
>> +.. important::
>> +
>> +    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feature is disabled by
>> +    default (and should remain so if using the aforementioned versions of QEMU).
>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
>> +    without having an IOMMU device, with no performance penalty.
>> +
>>  Adding vhost-user-client ports to the guest (QEMU)
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>  
>> diff --git a/NEWS b/NEWS
>> index 74e59bf..c15dc24 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -14,6 +14,7 @@ Post-v2.8.0
>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>     - DPDK:
>>       * Add support for DPDK v17.11
>> +     * Add support for vHost IOMMU feature
>>  
>>  v2.8.0 - 31 Aug 2017
>>  --------------------
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index ed5bf62..2e9633a 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -1424,15 +1424,29 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>>  {
>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>      const char *path;
>> +    bool iommu_enable;
>> +    bool request_reconfigure = false;
>> +    uint64_t vhost_driver_flags_prev = dev->vhost_driver_flags;
> 
> This should be done under the mutex.
> 
>>  
>>      ovs_mutex_lock(&dev->mutex);
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
>>          path = smap_get(args, "vhost-server-path");
>>          if (path && strcmp(path, dev->vhost_id)) {
>>              strcpy(dev->vhost_id, path);
>> -            netdev_request_reconfigure(netdev);
>> +            request_reconfigure = true;
>>          }
>>      }
>> +
>> +    iommu_enable = smap_get_bool(args, "vhost-iommu-support", false);
>> +    if (iommu_enable)
>> +        dev->vhost_driver_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +    else
>> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_IOMMU_SUPPORT;
>> +    if (vhost_driver_flags_prev != dev->vhost_driver_flags)
>> +        request_reconfigure = true;
>> +
>> +    if (request_reconfigure)
>> +        netdev_request_reconfigure(netdev);
>>      ovs_mutex_unlock(&dev->mutex);
>>  
>>      return 0;
>> @@ -3326,9 +3340,18 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>>       */
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>              && strlen(dev->vhost_id)) {
>> +        /* Enable vhost IOMMU, if it was requested.
> 
> Above comment connected to 'if', but not the 'flags'. I'm suggesting to
> split the comment by moving above line between 'flags' declaration and the 'if'.
> 
>> +         * XXX: the 'flags' variable is required, as not all vhost backend
>> +         * features are currently supported by OvS; in time, it should be
>> +         * possible to invoke rte_vhost_driver_register(), passing
>> +         * dev->vhost_driver_flags directly as a parameter to same.
>> +         */
>> +        uint64_t flags = RTE_VHOST_USER_CLIENT;
> Anyway, there should be blank line between variable declaration and the other code.
> 
>> +        if (dev->vhost_driver_flags & RTE_VHOST_USER_IOMMU_SUPPORT)
>> +            flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +
>>          /* Register client-mode device */
>> -        err = rte_vhost_driver_register(dev->vhost_id,
>> -                                        RTE_VHOST_USER_CLIENT);
>> +        err = rte_vhost_driver_register(dev->vhost_id, flags);
>>          if (err) {
>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>                       dev->vhost_id);
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index c145e1a..a633226 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -2654,6 +2654,16 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>>          </p>
>>        </column>
>>  
>> +      <column name="options" key="vhost-iommu-support"
>> +              type='{"type": "boolean"}'>
>> +        <p>
>> +          The value specifies whether IOMMU support is enabled for a vHost User
>> +          client mode device that has been or will be created by QEMU.
>> +          Only supported by dpdkvhostuserclient interfaces. If not specified or
>> +          an incorrect value is specified, defaults to 'false'.
>> +        </p>
>> +      </column>
>> +
>>        <column name="options" key="n_rxq_desc"
>>                type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
>>          <p>
>> -- 
>> 1.9.3
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Kevin Traynor Nov. 24, 2017, 2:42 p.m. UTC | #7
On 11/24/2017 12:58 PM, Kavanagh, Mark B wrote:
>> From: Kevin Traynor [mailto:ktraynor@redhat.com]
>> Sent: Friday, November 24, 2017 11:31 AM
>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>> Cc: maxime.coquelin@redhat.com
>> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>> feature
>>
>> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>> This is a security feature, that restricts the vhost memory
>>> that a virtio device may access.
>>>
>>> This feature also enables the vhost REPLY_ACK protocol, the
>>> implementation of which is known to work in newer versions of
>>> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
>>> v2.9.0, inclusive). As such, the feature is disabled by default
>>> in (and should remain so, for the aforementioned older QEMU
>>> versions). Starting with QEMU v2.9.1, vhost-iommu-support can
>>> safely be enabled, even without having an IOMMU device, with
>>> no performance penalty.
>>>
>>> This patch adds a new vhost port option, vhost-iommu-support,
>>> to allow enablement of the vhost IOMMU feature:
>>>
>>>     $ ovs-vsctl add-port br0 vhost-client-1 \
>>>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>>>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>>>              options:vhost-iommu-support=true
>>>
>>> Note that support for this feature is only implemented for vhost
>>> user client ports (since vhost user ports are considered deprecated).
>>>
>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Acked-by: Ciara Loftus <ciara.loftus@intel.com>
>>> ---
>>>  Documentation/topics/dpdk/vhost-user.rst | 21 +++++++++++++++++++++
>>>  NEWS                                     |  1 +
>>>  lib/netdev-dpdk.c                        | 29 ++++++++++++++++++++++++++---
>>>  vswitchd/vswitch.xml                     | 10 ++++++++++
>>>  4 files changed, 58 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>> b/Documentation/topics/dpdk/vhost-user.rst
>>> index 5347995..8dff901 100644
>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>> @@ -250,6 +250,27 @@ Once the vhost-user-client ports have been added to the
>> switch, they must be
>>>  added to the guest. Like vhost-user ports, there are two ways to do this:
>> using
>>>  QEMU directly, or using libvirt. Only the QEMU case is covered here.
>>>
>>> +vhost-user client IOMMU
>>> +~~~~~~~~~~~~~~~~~~~~~~~
>>> +It is possible to enable IOMMU support for vHost User client ports. This is
>>> +a feature which restricts the vhost memory that a virtio device can access,
>> and
>>> +as such is useful in deployments in which security is a concern. IOMMU mode
>> may
>>> +be enabled on the command line::
>>> +
>>> +    $ ovs-vsctl add-port br0 vhost-client-1 \
>>> +        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>>> +             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>>> +             options:vhost-iommu-support=true
>>> +
>>> +.. important::
>>> +
>>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>> protocol;
>>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feature is disabled
>> by
>>> +    default (and should remain so if using the aforementioned versions of
>> QEMU).
>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>> even
>>> +    without having an IOMMU device, with no performance penalty.
>>> +
>>>  Adding vhost-user-client ports to the guest (QEMU)
>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> diff --git a/NEWS b/NEWS
>>> index 74e59bf..c15dc24 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -14,6 +14,7 @@ Post-v2.8.0
>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>>     - DPDK:
>>>       * Add support for DPDK v17.11
>>> +     * Add support for vHost IOMMU feature
>>>
>>>  v2.8.0 - 31 Aug 2017
>>>  --------------------
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index ed5bf62..2e9633a 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -1424,15 +1424,29 @@ netdev_dpdk_vhost_client_set_config(struct netdev
>> *netdev,
>>>  {
>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>      const char *path;
>>> +    bool iommu_enable;
>>> +    bool request_reconfigure = false;
>>> +    uint64_t vhost_driver_flags_prev = dev->vhost_driver_flags;
>>>
>>>      ovs_mutex_lock(&dev->mutex);
>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
>>>          path = smap_get(args, "vhost-server-path");
>>>          if (path && strcmp(path, dev->vhost_id)) {
>>>              strcpy(dev->vhost_id, path);
>>> -            netdev_request_reconfigure(netdev);
>>> +            request_reconfigure = true;
>>>          }
>>>      }
>>> +
>>> +    iommu_enable = smap_get_bool(args, "vhost-iommu-support", false);
>>> +    if (iommu_enable)
>>> +        dev->vhost_driver_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>> +    else
>>> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_IOMMU_SUPPORT;
>>> +    if (vhost_driver_flags_prev != dev->vhost_driver_flags)
>>> +        request_reconfigure = true;
>>> +
>>> +    if (request_reconfigure)
>>> +        netdev_request_reconfigure(netdev);
>>>      ovs_mutex_unlock(&dev->mutex);
>>>
>>>      return 0;
>>> @@ -3326,9 +3340,18 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>> *netdev)
>>>       */
>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>>              && strlen(dev->vhost_id)) {
>>> +        /* Enable vhost IOMMU, if it was requested.
>>> +         * XXX: the 'flags' variable is required, as not all vhost backend
>>> +         * features are currently supported by OvS; in time, it should be
>>> +         * possible to invoke rte_vhost_driver_register(), passing
>>> +         * dev->vhost_driver_flags directly as a parameter to same.
>>> +         */
>>
>> Can you elaborate on this? I only see vhost_driver_flags used to
>> indicate support for client and iommu, so I'm not sure the need for the
>> flags variable.
> 
> Hey Kevin,
> 
> There's also RTE_VHOST_USER_DEQUEUE_ZERO_COPY, which isn't supported yet.
> 

Sure, but that is never set in OVS, so it would not be enabled by using
the vhost_driver_flags directly. That is what is done for the vhostuser
port case. If support (and the bit) was enabled in the future, then we
would want to pass it down to the rte_vhost_driver_register() also.

thanks,
Kevin.

> Thanks,
> Mark
> 
>>
>>> +        uint64_t flags = RTE_VHOST_USER_CLIENT;
>>> +        if (dev->vhost_driver_flags & RTE_VHOST_USER_IOMMU_SUPPORT)
>>> +            flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>> +
>>>          /* Register client-mode device */
>>> -        err = rte_vhost_driver_register(dev->vhost_id,
>>> -                                        RTE_VHOST_USER_CLIENT);
>>> +        err = rte_vhost_driver_register(dev->vhost_id, flags);
>>>          if (err) {
>>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>>                       dev->vhost_id);
>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>>> index c145e1a..a633226 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -2654,6 +2654,16 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>> type=patch options:peer=p1 \
>>>          </p>
>>>        </column>
>>>
>>> +      <column name="options" key="vhost-iommu-support"
>>> +              type='{"type": "boolean"}'>
>>> +        <p>
>>> +          The value specifies whether IOMMU support is enabled for a vHost
>> User
>>> +          client mode device that has been or will be created by QEMU.
>>> +          Only supported by dpdkvhostuserclient interfaces. If not
>> specified or
>>> +          an incorrect value is specified, defaults to 'false'.
>>> +        </p>
>>> +      </column>
>>> +
>>>        <column name="options" key="n_rxq_desc"
>>>                type='{"type": "integer", "minInteger": 1, "maxInteger":
>> 4096}'>
>>>          <p>
>>>
>
Kevin Traynor Nov. 24, 2017, 4:11 p.m. UTC | #8
On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
> DPDK v17.11 introduces support for the vHost IOMMU feature.
> This is a security feature, that restricts the vhost memory
> that a virtio device may access.
> 
> This feature also enables the vhost REPLY_ACK protocol, the
> implementation of which is known to work in newer versions of
> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> v2.9.0, inclusive). As such, the feature is disabled by default
> in (and should remain so, for the aforementioned older QEMU
> versions). Starting with QEMU v2.9.1, vhost-iommu-support can
> safely be enabled, even without having an IOMMU device, with
> no performance penalty.
> 
> This patch adds a new vhost port option, vhost-iommu-support,
> to allow enablement of the vhost IOMMU feature:
> 
>     $ ovs-vsctl add-port br0 vhost-client-1 \
>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>              options:vhost-iommu-support=true
> 

Hi Mark, All,

I'm thinking about this and whether the current approach provides more
than what is actually needed by users at the cost of making the user
interface more complex.

As an alternative, how about having a global other_config (to be set
like vhost-socket-dir can be) for this instead of having to set it for
each individual interface. It would mean that it would only have to be
set once, instead of having this (ugly?!) option every time a vhost port
is added, so it's a less intrusive change and I can't really think that
a user would need to do this per vhostclient interface??? It's a pain
to have to add this at all for a bug in QEMU and I'm sure in 1/2/3 years
time someone will say that users could still be using QEMU < 2.9.1 and
we can't remove it, so it would be nice to keep it as discreet as
possible as we're going to be stuck with it for a while.

I assume that a user would only use one version of QEMU at a time and
would either want or not want this feature. In the worst case, if that
proved completely wrong in the future, then a per interface override
could easily be added. Once there's a way to maintain backwards
compatibility (which there would be) I'd rather err on the side of
introducing just enough functionality over increasing complexity
for the user.

What do you think?

thanks,
Kevin.
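The global-option alternative suggested above would look something like the following. This is purely illustrative: no such global key existed at the time of this thread, and the name 'vhost-iommu-support' under other_config is an assumption for the sketch; it mirrors how vhost-sock-dir is already set once for the whole switch.

```
# Hypothetical: enable vhost IOMMU once, globally (option name assumed).
$ ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true

# Ports would then be added without any per-interface IOMMU option:
$ ovs-vsctl add-port br0 vhost-client-1 \
    -- set Interface vhost-client-1 type=dpdkvhostuserclient \
         options:vhost-server-path=$VHOST_USER_SOCKET_PATH
```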
Jan Scheurich Nov. 24, 2017, 4:20 p.m. UTC | #9
+1 
Jan

> -----Original Message-----
> From: Kevin Traynor [mailto:ktraynor@redhat.com]
> Sent: Friday, 24 November, 2017 17:11
> To: Mark Kavanagh <mark.b.kavanagh@intel.com>; dev@openvswitch.org
> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck Baudin <fbaudin@redhat.com>; Mooney, Sean K
> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>; Ian Stokes <ian.stokes@intel.com>; Loftus, Ciara
> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>; Jan Scheurich
> <jan.scheurich@ericsson.com>
> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU feature
> 
> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
> > DPDK v17.11 introduces support for the vHost IOMMU feature.
> > This is a security feature, that restricts the vhost memory
> > that a virtio device may access.
> >
> > This feature also enables the vhost REPLY_ACK protocol, the
> > implementation of which is known to work in newer versions of
> > QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> > v2.9.0, inclusive). As such, the feature is disabled by default
> > in (and should remain so, for the aforementioned older QEMU
> > versions). Starting with QEMU v2.9.1, vhost-iommu-support can
> > safely be enabled, even without having an IOMMU device, with
> > no performance penalty.
> >
> > This patch adds a new vhost port option, vhost-iommu-support,
> > to allow enablement of the vhost IOMMU feature:
> >
> >     $ ovs-vsctl add-port br0 vhost-client-1 \
> >         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
> >              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
> >              options:vhost-iommu-support=true
> >
> 
> Hi Mark, All,
> 
> I'm thinking about this and whether the current approach provides more
> than what is actually needed by users at the cost of making the user
> interface more complex.
> 
> As an alternative, how about having a global other_config (to be set
> like vhost-socket-dir can be) for this instead of having to set it for
> each individual interface. It would mean that it would only have to be
> set once, instead of having this (ugly?!) option every time a vhost port
> is added, so it's a less intrusive change and I can't really think that
> a user would require to do this per vhostclient interface??? It's a pain
> to have to add this at all for a bug in QEMU and I'm sure in 1/2/3 years
> time someone will say that users could still be using QEMU < 2.9.1 and
> we can't remove it, so it would be nice to keep it as discreet as
> possible as we're going to be stuck with it for a while.
> 
> I assume that a user would only use one version of QEMU at a time and
> would either want or not want this feature. In the worst case, if that
> proved completely wrong in the future, then a per interface override
> could easily be added. Once there's a way to maintain backwards
> compatibility (which there would be) I'd rather err on the side of
> introducing just enough functionality over increasing complexity
> for the user.
> 
> What do you think?
> 
> thanks,
> Kevin.
Mark Kavanagh Nov. 24, 2017, 4:45 p.m. UTC | #10
Sounds good guys - I'll get cracking on this on Monday.

Cheers,
Mark

>-----Original Message-----
>From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
>Sent: Friday, November 24, 2017 4:21 PM
>To: Kevin Traynor <ktraynor@redhat.com>; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>; dev@openvswitch.org
>Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck Baudin
><fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>; Ilya Maximets
><i.maximets@samsung.com>; Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
><ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron Conole
><aconole@redhat.com>
>Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>feature
>
>+1
>Jan
>
>> -----Original Message-----
>> From: Kevin Traynor [mailto:ktraynor@redhat.com]
>> Sent: Friday, 24 November, 2017 17:11
>> To: Mark Kavanagh <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck
>Baudin <fbaudin@redhat.com>; Mooney, Sean K
>> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>; Ian
>Stokes <ian.stokes@intel.com>; Loftus, Ciara
>> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron Conole
><aconole@redhat.com>; Jan Scheurich
>> <jan.scheurich@ericsson.com>
>> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>feature
>>
>> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
>> > DPDK v17.11 introduces support for the vHost IOMMU feature.
>> > This is a security feature, that restricts the vhost memory
>> > that a virtio device may access.
>> >
>> > This feature also enables the vhost REPLY_ACK protocol, the
>> > implementation of which is known to work in newer versions of
>> > QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
>> > v2.9.0, inclusive). As such, the feature is disabled by default
>> > in (and should remain so, for the aforementioned older QEMU
>> > versions). Starting with QEMU v2.9.1, vhost-iommu-support can
>> > safely be enabled, even without having an IOMMU device, with
>> > no performance penalty.
>> >
>> > This patch adds a new vhost port option, vhost-iommu-support,
>> > to allow enablement of the vhost IOMMU feature:
>> >
>> >     $ ovs-vsctl add-port br0 vhost-client-1 \
>> >         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>> >              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>> >              options:vhost-iommu-support=true
>> >
>>
>> Hi Mark, All,
>>
>> I'm thinking about this and whether the current approach provides more
>> than what is actually needed by users at the cost of making the user
>> interface more complex.
>>
>> As an alternative, how about having a global other_config (to be set
>> like vhost-socket-dir can be) for this instead of having to set it for
>> each individual interface. It would mean that it would only have to be
>> set once, instead of having this (ugly?!) option every time a vhost port
>> is added, so it's a less intrusive change and I can't really think that
>> a user would require to do this per vhostclient interface??? It's a pain
>> to have to add this at all for a bug in QEMU and I'm sure in 1/2/3 years
>> time someone will say that users could still be using QEMU < 2.9.1 and
>> we can't remove it, so it would be nice to keep it as discreet as
>> possible as we're going to be stuck with it for a while.
>>
>> I assume that a user would only use one version of QEMU at a time and
>> would either want or not want this feature. In the worst case, if that
>> proved completely wrong in the future, then a per interface override
>> could easily be added. Once there's a way to maintain backwards
>> compatibility (which there would be) I'd rather err on the side of
>> introducing just enough enough functionality over increasing complexity
>> for the user.
>>
>> What do you think?
>>
>> thanks,
>> Kevin.
Mooney, Sean K Nov. 26, 2017, 11:27 p.m. UTC | #11
> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Friday, November 24, 2017 4:45 PM
> To: Jan Scheurich <jan.scheurich@ericsson.com>; Kevin Traynor
> <ktraynor@redhat.com>; dev@openvswitch.org
> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck
> Baudin <fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>;
> Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
> <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Darrell
> Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
> IOMMU feature
> 
> Sounds good guys - I'll get cracking on this on Monday.
> 
> Cheers,
> Mark
> 
> >-----Original Message-----
> >From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
> >Sent: Friday, November 24, 2017 4:21 PM
> >To: Kevin Traynor <ktraynor@redhat.com>; Kavanagh, Mark B
> ><mark.b.kavanagh@intel.com>; dev@openvswitch.org
> >Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
> Franck
> >Baudin <fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>;
> >Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
> ><ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
> Darrell
> >Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
> >Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
> >IOMMU feature
> >
> >+1
> >Jan
> >
> >> -----Original Message-----
> >> From: Kevin Traynor [mailto:ktraynor@redhat.com]
> >> Sent: Friday, 24 November, 2017 17:11
> >> To: Mark Kavanagh <mark.b.kavanagh@intel.com>; dev@openvswitch.org
> >> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
> >> Franck
> >Baudin <fbaudin@redhat.com>; Mooney, Sean K
> >> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>;
> >> Ian
> >Stokes <ian.stokes@intel.com>; Loftus, Ciara
> >> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron
> >> Conole
> ><aconole@redhat.com>; Jan Scheurich
> >> <jan.scheurich@ericsson.com>
> >> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
> vhost
> >> IOMMU
> >feature
> >>
> >> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
> >> > DPDK v17.11 introduces support for the vHost IOMMU feature.
> >> > This is a security feature, that restricts the vhost memory that a
> >> > virtio device may access.
> >> >
> >> > This feature also enables the vhost REPLY_ACK protocol, the
> >> > implementation of which is known to work in newer versions of QEMU
> >> > (i.e. v2.10.0), but is buggy in older versions (v2.7.0 - v2.9.0,
> >> > inclusive). As such, the feature is disabled by default in (and
> >> > should remain so, for the aforementioned older QEMU versions).
> >> > Starting with QEMU v2.9.1, vhost-iommu-support can safely be
> >> > enabled, even without having an IOMMU device, with no performance
> >> > penalty.
> >> >
> >> > This patch adds a new vhost port option, vhost-iommu-support, to
> >> > allow enablement of the vhost IOMMU feature:
> >> >
> >> >     $ ovs-vsctl add-port br0 vhost-client-1 \
> >> >         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
> >> >              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
> >> >              options:vhost-iommu-support=true
> >> >
> >>
> >> Hi Mark, All,
> >>
> >> I'm thinking about this and whether the current approach provides
> >> more than what is actually needed by users at the cost of making the
> >> user interface more complex.
[Mooney, Sean K] I am personally split on this. To enable IOMMU support in
OpenStack with the above interface we would have to do two things: first, extend the image metadata
or flavor extra specs to allow requesting a vIOMMU; second, modify os-vif to produce
the add-port command above. Using this interface gives us a nice parity in our API,
as we only enable IOMMU support in OVS if we enable it for QEMU. E.g. if the flavor/image does not request
a virtualized IOMMU, we won't declare its support in the options.
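As a rough sketch of those two steps (the flavor extra-spec key below is purely hypothetical, no such key existed at the time, and the socket path is likewise illustrative):

```shell
# Step 1 (hypothetical): request a vIOMMU for the guest via a flavor
# extra spec; the key name "hw:viommu" is invented for illustration.
openstack flavor set m1.dpdk --property hw:viommu=true

# Step 2: os-vif would then emit the per-port option from this patch
# when plugging the VM's vhost-user client port:
ovs-vsctl add-port br0 vhost-client-1 \
    -- set Interface vhost-client-1 type=dpdkvhostuserclient \
         options:vhost-server-path=/tmp/vhost-client-1.sock \
         options:vhost-iommu-support=true
```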
> >>
> >> As an alternative, how about having a global other_config (to be set
> >> like vhost-socket-dir can be) for this instead of having to set it
> >> for each individual interface. It would mean that it would only have
> >> to be set once, instead of having this (ugly?!) option every time a
> >> vhost port is added, so it's a less intrusive change and I can't
> >> really think that a user would require to do this per vhostclient
[Mooney, Sean K] Well, one thought that instantly comes to mind: if I set
the global other_config option, what happens if I do not request an IOMMU in QEMU,
or if I have an old QEMU? If it would have any negative effect on network connectivity,
or prevent the VM from functioning, that would require the Nova scheduler to be
aware of which node had this option set and take that into account when placing VMs.
I assume that if it had no negative effects, then we would not need an option, global or per
interface.
> >> interface??? It's a pain to have to add this at all for a bug in QEMU
> >> and I'm sure in 1/2/3 years time someone will say that users could
> >> still be using QEMU < 2.9.1 and we can't remove it, so it would be
> >> nice to keep it as discreet as possible as we're going to be stuck
> with it for a while.
> >>
> >> I assume that a user would only use one version of QEMU at a time
> and
> >> would either want or not want this feature. In the worst case, if
> >> that proved completely wrong in the future, then a per interface
> >> override could easily be added. Once there's a way to maintain
> >> backwards compatibility (which there would be) I'd rather err on the
> >> side of introducing just enough functionality over increasing
> >> complexity for the user.
> >>
> >> What do you think?
[Mooney, Sean K] I'm not sure that a single QEMU version is a valid assumption to make,
at least in an OpenStack context, where rolling upgrades are a thing. But you are right
that it will be less common; however, if it were non-existent, we would not have the issue with
live migration between nodes with different feature sets, which is also being addressed this
cycle. If we add a global config option for IOMMU support, that is yet another thing that needs
to be accounted for during live migration.
> >>
> >> thanks,
> >> Kevin.
Mark Kavanagh Nov. 27, 2017, 12:55 p.m. UTC | #12
>From: Mooney, Sean K
>Sent: Sunday, November 26, 2017 11:28 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Jan Scheurich
><jan.scheurich@ericsson.com>; Kevin Traynor <ktraynor@redhat.com>;
>dev@openvswitch.org
>Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck Baudin
><fbaudin@redhat.com>; Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
><ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Darrell Ball
><dball@vmware.com>; Aaron Conole <aconole@redhat.com>; Mooney, Sean K
><sean.k.mooney@intel.com>
>Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>feature
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Friday, November 24, 2017 4:45 PM
>> To: Jan Scheurich <jan.scheurich@ericsson.com>; Kevin Traynor
>> <ktraynor@redhat.com>; dev@openvswitch.org
>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck
>> Baudin <fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>;
>> Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
>> <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Darrell
>> Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
>> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
>> IOMMU feature
>>
>> Sounds good guys - I'll get cracking on this on Monday.
>>
>> Cheers,
>> Mark
>>
>> >-----Original Message-----
>> >From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
>> >Sent: Friday, November 24, 2017 4:21 PM
>> >To: Kevin Traynor <ktraynor@redhat.com>; Kavanagh, Mark B
>> ><mark.b.kavanagh@intel.com>; dev@openvswitch.org
>> >Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
>> Franck
>> >Baudin <fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>;
>> >Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
>> ><ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
>> Darrell
>> >Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
>> >Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
>> >IOMMU feature
>> >
>> >+1
>> >Jan
>> >
>> >> -----Original Message-----
>> >> From: Kevin Traynor [mailto:ktraynor@redhat.com]
>> >> Sent: Friday, 24 November, 2017 17:11
>> >> To: Mark Kavanagh <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>> >> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
>> >> Franck
>> >Baudin <fbaudin@redhat.com>; Mooney, Sean K
>> >> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>;
>> >> Ian
>> >Stokes <ian.stokes@intel.com>; Loftus, Ciara
>> >> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron
>> >> Conole
>> ><aconole@redhat.com>; Jan Scheurich
>> >> <jan.scheurich@ericsson.com>
>> >> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
>> vhost
>> >> IOMMU
>> >feature
>> >>
>> >> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
>> >> > DPDK v17.11 introduces support for the vHost IOMMU feature.
>> >> > This is a security feature, that restricts the vhost memory that a
>> >> > virtio device may access.
>> >> >
>> >> > This feature also enables the vhost REPLY_ACK protocol, the
>> >> > implementation of which is known to work in newer versions of QEMU
>> >> > (i.e. v2.10.0), but is buggy in older versions (v2.7.0 - v2.9.0,
>> >> > inclusive). As such, the feature is disabled by default in (and
>> >> > should remain so, for the aforementioned older QEMU versions).
>> >> > Starting with QEMU v2.9.1, vhost-iommu-support can safely be
>> >> > enabled, even without having an IOMMU device, with no performance
>> >> > penalty.
>> >> >
>> >> > This patch adds a new vhost port option, vhost-iommu-support, to
>> >> > allow enablement of the vhost IOMMU feature:
>> >> >
>> >> >     $ ovs-vsctl add-port br0 vhost-client-1 \
>> >> >         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>> >> >              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>> >> >              options:vhost-iommu-support=true
>> >> >
>> >>
>> >> Hi Mark, All,
>> >>
>> >> I'm thinking about this and whether the current approach provides
>> >> more than what is actually needed by users at the cost of making the
>> >> user interface more complex.
>[Mooney, Sean K] I am personally split on this. To enable iommu support in
>openstack with the above interface we would have to do two things. 1 extend
>the image metatdata
>or flavor extra specs to allow requesting a vIOMMU. Second we would have to
>modify os-vif to produce
>the add-port command above. Using this interfaces gives us a nice parity in
>our api
>as we only enable iommu support in ovs if we enable for qemu. E.g. if the
>falvor/image does not request
>a virtualized iommu we wont declare its support in the options.
>> >>
>> >> As an alternative, how about having a global other_config (to be set
>> >> like vhost-socket-dir can be) for this instead of having to set it
>> >> for each individual interface. It would mean that it would only have
>> >> to be set once, instead of having this (ugly?!) option every time a
>> >> vhost port is added, so it's a less intrusive change and I can't
>> >> really think that a user would require to do this per vhostclient
>[Mooney, Sean K]  well one taught that instantly comes to mind is if I set
>The global other_config option what happens if I do not request a iommu in
>qemu
>Or I have an old qemu.  If it would have any negative effect on network
>connectivity
>Or prevent the vm from functioning, that would require the nova scheduler to
>be
>Aware of which node had this option set and take that into account when
>placing vms.
>I assume if it had no negative effects  then we would not need a option,
>global or per
>Interface.
>> >> interface??? It's a pain to have to add this at all for a bug in QEMU
>> >> and I'm sure in 1/2/3 years time someone will say that users could
>> >> still be using QEMU < 2.9.1 and we can't remove it, so it would be
>> >> nice to keep it as discreet as possible as we're going to be stuck
>> with it for a while.
>> >>
>> >> I assume that a user would only use one version of QEMU at a time
>> and
>> >> would either want or not want this feature. In the worst case, if
>> >> that proved completely wrong in the future, then a per interface
>> >> override could easily be added. Once there's a way to maintain
>> >> backwards compatibility (which there would be) I'd rather err on the
>> >> side of introducing just enough functionality over increasing
>> >> complexity for the user.
>> >>
>> >> What do you think?
>[Mooney, Sean K] I'm not sure that a single qemu version is a valid assumption
>to make.
>At least in an OpenStack context where rolling upgrades are a thing. But you
>are right
>That it will be less common however if it was no existent we would not have
>the issue with
>Live migration between nodes with different feature sets that is also trying
>to be addressed this
>Cycle. If we add a global config option for iommu support that is yet another
>thing that needs
>To be accounted for during live migration.
>> >>

Hi Kevin, Jan,

Any thoughts on Sean's concerns?

I'll hold off on implementation until we have consensus.

Thanks,
Mark




>> >> thanks,
>> >> Kevin.
Kevin Traynor Nov. 27, 2017, 1:31 p.m. UTC | #13
On 11/27/2017 12:55 PM, Kavanagh, Mark B wrote:
> 
> 
>> From: Mooney, Sean K
>> Sent: Sunday, November 26, 2017 11:28 PM
>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Jan Scheurich
>> <jan.scheurich@ericsson.com>; Kevin Traynor <ktraynor@redhat.com>;
>> dev@openvswitch.org
>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck Baudin
>> <fbaudin@redhat.com>; Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
>> <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Darrell Ball
>> <dball@vmware.com>; Aaron Conole <aconole@redhat.com>; Mooney, Sean K
>> <sean.k.mooney@intel.com>
>> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>> feature
>>
>>
>>
>>> -----Original Message-----
>>> From: Kavanagh, Mark B
>>> Sent: Friday, November 24, 2017 4:45 PM
>>> To: Jan Scheurich <jan.scheurich@ericsson.com>; Kevin Traynor
>>> <ktraynor@redhat.com>; dev@openvswitch.org
>>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck
>>> Baudin <fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>;
>>> Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
>>> <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Darrell
>>> Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
>>> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
>>> IOMMU feature
>>>
>>> Sounds good guys - I'll get cracking on this on Monday.
>>>
>>> Cheers,
>>> Mark
>>>
>>>> -----Original Message-----
>>>> From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
>>>> Sent: Friday, November 24, 2017 4:21 PM
>>>> To: Kevin Traynor <ktraynor@redhat.com>; Kavanagh, Mark B
>>>> <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>>>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
>>> Franck
>>>> Baudin <fbaudin@redhat.com>; Mooney, Sean K <sean.k.mooney@intel.com>;
>>>> Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
>>>> <ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>;
>>> Darrell
>>>> Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
>>>> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
>>>> IOMMU feature
>>>>
>>>> +1
>>>> Jan
>>>>
>>>>> -----Original Message-----
>>>>> From: Kevin Traynor [mailto:ktraynor@redhat.com]
>>>>> Sent: Friday, 24 November, 2017 17:11
>>>>> To: Mark Kavanagh <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>>>>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
>>>>> Franck
>>>> Baudin <fbaudin@redhat.com>; Mooney, Sean K
>>>>> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>;
>>>>> Ian
>>>> Stokes <ian.stokes@intel.com>; Loftus, Ciara
>>>>> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron
>>>>> Conole
>>>> <aconole@redhat.com>; Jan Scheurich
>>>>> <jan.scheurich@ericsson.com>
>>>>> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
>>> vhost
>>>>> IOMMU
>>>> feature
>>>>>
>>>>> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
>>>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>>>>> This is a security feature, that restricts the vhost memory that a
>>>>>> virtio device may access.
>>>>>>
>>>>>> This feature also enables the vhost REPLY_ACK protocol, the
>>>>>> implementation of which is known to work in newer versions of QEMU
>>>>>> (i.e. v2.10.0), but is buggy in older versions (v2.7.0 - v2.9.0,
>>>>>> inclusive). As such, the feature is disabled by default in (and
>>>>>> should remain so, for the aforementioned older QEMU versions).
>>>>>> Starting with QEMU v2.9.1, vhost-iommu-support can safely be
>>>>>> enabled, even without having an IOMMU device, with no performance
>>>>>> penalty.
>>>>>>
>>>>>> This patch adds a new vhost port option, vhost-iommu-support, to
>>>>>> allow enablement of the vhost IOMMU feature:
>>>>>>
>>>>>>     $ ovs-vsctl add-port br0 vhost-client-1 \
>>>>>>         -- set Interface vhost-client-1 type=dpdkvhostuserclient \
>>>>>>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
>>>>>>              options:vhost-iommu-support=true
>>>>>>
>>>>>
>>>>> Hi Mark, All,
>>>>>
>>>>> I'm thinking about this and whether the current approach provides
>>>>> more than what is actually needed by users at the cost of making the
>>>>> user interface more complex.
>> [Mooney, Sean K] I am personally split on this. To enable iommu support in
>> openstack with the above interface we would have to do two things. 1 extend
>> the image metatdata
>> or flavor extra specs to allow requesting a vIOMMU. Second we would have to
>> modify os-vif to produce
>> the add-port command above. Using this interfaces gives us a nice parity in
>> our api
>> as we only enable iommu support in ovs if we enable for qemu. E.g. if the
>> falvor/image does not request
>> a virtualized iommu we wont declare its support in the options.
>>>>>
>>>>> As an alternative, how about having a global other_config (to be set
>>>>> like vhost-socket-dir can be) for this instead of having to set it
>>>>> for each individual interface. It would mean that it would only have
>>>>> to be set once, instead of having this (ugly?!) option every time a
>>>>> vhost port is added, so it's a less intrusive change and I can't
>>>>> really think that a user would require to do this per vhostclient
>> [Mooney, Sean K]  well one taught that instantly comes to mind is if I set
>> The global other_config option what happens if I do not request a iommu in
>> qemu
>> Or I have an old qemu.  If it would have any negative effect on network
>> connectivity
>> Or prevent the vm from functioning, that would require the nova scheduler to
>> be
>> Aware of which node had this option set and take that into account when
>> placing vms.
>> I assume if it had no negative effects  then we would not need a option,
>> global or per
>> Interface.
>>>>> interface??? It's a pain to have to add this at all for a bug in QEMU
>>>>> and I'm sure in 1/2/3 years time someone will say that users could
>>>>> still be using QEMU < 2.9.1 and we can't remove it, so it would be
>>>>> nice to keep it as discreet as possible as we're going to be stuck
>>> with it for a while.
>>>>>
>>>>> I assume that a user would only use one version of QEMU at a time
>>> and
>>>>> would either want or not want this feature. In the worst case, if
>>>>> that proved completely wrong in the future, then a per interface
>>>>> override could easily be added. Once there's a way to maintain
>>>>> backwards compatibility (which there would be) I'd rather err on the
>>>>> side of introducing just enough functionality over increasing
>>>>> complexity for the user.
>>>>>
>>>>> What do you think?
>> [Mooney, Sean K] I'm not sure that a single qemu version is a valid assumption
>> to make.

Hi Sean, the solution in the current patch is effectively to allow
iommu/different QEMU versions on a *per port* basis. I am assuming that
level of granularity is not needed and instead proposing to allow
iommu/different QEMU versions on a *per OVS instance* basis. It's not
making any assumption about QEMU versions across multiple nodes.

>> At least in an OpenStack context where rolling upgrades are a thing. But you
>> are right
>> That it will be less common however if it was no existent we would not have
>> the issue with
>> Live migration between nodes with different feature sets that is also trying
>> to be addressed this
>> Cycle. If we add a global config option for iommu support that is yet another
>> thing that needs
>> To be accounted for during live migration.

Yes, that's true, but it's a difficult problem anyway and I don't think an
additional binary field would make it much more complex. The alternative
is to never allow the iommu feature to be enabled, or to drop support for QEMU
versions < 2.9.1. It's default off, so you should have backwards
compatibility at least.

>>>>>
> 
> Hi Kevin, Jan,
> 
> Any thoughts on Sean's concerns?
> 

Sean can confirm, but my reading is that Sean's primary concern is with
the proposal to add any config option to enable iommu - not with the
proposal to change from a per port to per OVS basis.

Kevin.

> I'll hold off on implementation until we have consensus.
> 
> Thanks,
> Mark
> 
> 
> 
> 
>>>>> thanks,
>>>>> Kevin.
Jan Scheurich Nov. 27, 2017, 2:15 p.m. UTC | #14
> >> >> Hi Mark, All,
> >> >>
> >> >> I'm thinking about this and whether the current approach provides
> >> >> more than what is actually needed by users at the cost of making the
> >> >> user interface more complex.
> >[Mooney, Sean K] I am personally split on this. To enable iommu support in
> >openstack with the above interface we would have to do two things. 1 extend
> >the image metatdata
> >or flavor extra specs to allow requesting a vIOMMU. Second we would have to
> >modify os-vif to produce
> >the add-port command above. Using this interfaces gives us a nice parity in
> >our api
> >as we only enable iommu support in ovs if we enable for qemu. E.g. if the
> >falvor/image does not request
> >a virtualized iommu we wont declare its support in the options.
> >> >>
> >> >> As an alternative, how about having a global other_config (to be set
> >> >> like vhost-socket-dir can be) for this instead of having to set it
> >> >> for each individual interface. It would mean that it would only have
> >> >> to be set once, instead of having this (ugly?!) option every time a
> >> >> vhost port is added, so it's a less intrusive change and I can't
> >> >> really think that a user would require to do this per vhostclient
> >[Mooney, Sean K]  well one taught that instantly comes to mind is if I set
> >The global other_config option what happens if I do not request a iommu in
> >qemu
> >Or I have an old qemu.  If it would have any negative effect on network
> >connectivity
> >Or prevent the vm from functioning, that would require the nova scheduler to
> >be
> >Aware of which node had this option set and take that into account when
> >placing vms.
> >I assume if it had no negative effects  then we would not need a option,
> >global or per
> >Interface.
> >> >> interface??? It's a pain to have to add this at all for a bug in QEMU
> >> >> and I'm sure in 1/2/3 years time someone will say that users could
> >> >> still be using QEMU < 2.9.1 and we can't remove it, so it would be
> >> >> nice to keep it as discreet as possible as we're going to be stuck
> >> with it for a while.
> >> >>
> >> >> I assume that a user would only use one version of QEMU at a time
> >> and
> >> >> would either want or not want this feature. In the worst case, if
> >> >> that proved completely wrong in the future, then a per interface
> >> >> override could easily be added. Once there's a way to maintain
> >> >> backwards compatibility (which there would be) I'd rather err on the
> >> >> side of introducing just enough enough functionality over increasing
> >> >> complexity for the user.
> >> >>
> >> >> What do you think?
> >[Mooney, Sean K] I'm not sure that a single qemu version is a valid assumption
> >to make.
> >At least in an OpenStack context where rolling upgrades are a thing. But you
> >are right
> >That it will be less common however if it was no existent we would not have
> >the issue with
> >Live migration between nodes with different feature sets that is also trying
> >to be addressed this
> >Cycle. If we add a global config option for iommu support that is yet another
> >thing that needs
> >To be accounted for during live migration.
> >> >>
> 
> Hi Kevin, Jan,
> 
> Any thoughts on Sean's concerns?
> 
> I'll hold off on implementation until we have consensus.
> 
> Thanks,
> Mark

Here are my 2cts:

As I see it, vIOMMU for vhostuser is a pure infrastructure feature negotiated between the guest driver, Qemu and OVS. If both Qemu and OVS support it and the driver requests it, it should be used, otherwise not.

My understanding is that the vhostuser library in DPDK 17.11 supports vIOMMU, such that OVS could always signal support for this feature. The only reason not to do so is to work around the problem that Qemu claims vIOMMU support in certain releases but the vhostuser backend implementation is broken (prior to 2.9.1). For these cases it should be sufficient to globally disable vIOMMU support in OVS.

I don't see the need for one OVS instance to interwork with multiple different Qemu versions on the same host. And even if that were the case in an upgrade scenario, one could keep vIOMMU disabled until the old Qemu is removed.

The specific enabling of vIOMMU per tenant VM port is done by supplying an iommu device to the VM in the Libvirt XML and enabling iommu for the device in the driver element of the interface section. These are the places that Nova/OpenStack would use to control the vIOMMU per VM based on image properties or flavor extra specs. I do not see any value to additionally configure the iommu support per vhostuser port through OS-VIF.

Conclusion: A global other_config parameter to enable/disable vhostuser IOMMU is sufficient. By default this could be OFF for now and changed to ON when broken Qemu versions are largely gone.

BR, Jan
Mark Kavanagh Nov. 27, 2017, 2:24 p.m. UTC | #15
>From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
>Sent: Monday, November 27, 2017 2:15 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Mooney, Sean K
><sean.k.mooney@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
>dev@openvswitch.org
>Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck Baudin
><fbaudin@redhat.com>; Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
><ian.stokes@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Darrell Ball
><dball@vmware.com>; Aaron Conole <aconole@redhat.com>
>Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost IOMMU
>feature
>
>> >> >> Hi Mark, All,
>> >> >>
>> >> >> I'm thinking about this and whether the current approach provides
>> >> >> more than what is actually needed by users at the cost of making the
>> >> >> user interface more complex.
>> >[Mooney, Sean K] I am personally split on this. To enable iommu support in
>> >openstack with the above interface we would have to do two things. 1 extend
>> >the image metatdata
>> >or flavor extra specs to allow requesting a vIOMMU. Second we would have to
>> >modify os-vif to produce
>> >the add-port command above. Using this interfaces gives us a nice parity in
>> >our api
>> >as we only enable iommu support in ovs if we enable for qemu. E.g. if the
>> >falvor/image does not request
>> >a virtualized iommu we wont declare its support in the options.
>> >> >>
>> >> >> As an alternative, how about having a global other_config (to be set
>> >> >> like vhost-socket-dir can be) for this instead of having to set it
>> >> >> for each individual interface. It would mean that it would only have
>> >> >> to be set once, instead of having this (ugly?!) option every time a
>> >> >> vhost port is added, so it's a less intrusive change and I can't
>> >> >> really think that a user would require to do this per vhostclient
>> >[Mooney, Sean K]  well one taught that instantly comes to mind is if I set
>> >The global other_config option what happens if I do not request a iommu in
>> >qemu
>> >Or I have an old qemu.  If it would have any negative effect on network
>> >connectivity
>> >Or prevent the vm from functioning, that would require the nova scheduler
>to
>> >be
>> >Aware of which node had this option set and take that into account when
>> >placing vms.
>> >I assume if it had no negative effects  then we would not need a option,
>> >global or per
>> >Interface.
>> >> >> interface??? It's pain to have to add this at all for a bug in QEMU
>> >> >> and I'm sure in 1/2/3 years time someone will say that users could
>> >> >> still be using QEMU < 2.9.1 and we can't remove it, so it would be
>> >> >> nice to keep it as discreet as possible as we're going to be stuck
>> >> with it for a while.
>> >> >>
>> >> >> I assume that a user would only use one version of QEMU at a time
>> >> and
>> >> >> would either want or not want this feature. In the worst case, if
>> >> >> that proved completely wrong in the future, then a per interface
>> >> >> override could easily be added. Once there's a way to maintain
>> >> >> backwards compatibility (which there would be) I'd rather err on the
>> >> >> side of introducing just enough enough functionality over increasing
>> >> >> complexity for the user.
>> >> >>
>> >> >> What do you think?
>> >[Mooney, Sean K] I'm not sure that a single qemu version is a valid
>assumption
>> >to make.
>> >At least in an OpenStack context where rolling upgrades are a thing. But
>you
>> >are right
>> >That it will be less common however if it was no existent we would not have
>> >the issue with
>> >Live migration between nodes with different feature sets that is also
>trying
>> >to be addressed this
>> >Cycle. If we add a global config option for iommu support that is yet
>another
>> >thing that needs
>> >To be accounted for during live migration.
>> >> >>
>>
>> Hi Kevin, Jan,
>>
>> Any thoughts on Sean's concerns?
>>
>> I'll hold off on implementation until we have consensus.
>>
>> Thanks,
>> Mark
>
>Here are my 2cts:
>
>As I see it, vIOMMU for vhostuser is a pure infrastructure feature negotiated
>between guest driver,  Qemu and OVS. If both Qemu and OVS support it and the
>driver requests it, it should be used, otherwise not.
>
>My understanding is that the vhostuser library in DPDK 17.11 supports vIOMMU,
>such that OVS could always signal support for this feature. The only reason
>not to do so is to work around the problem that Qemu claims vIOMMU support in
>certain releases but the vhostuser backend implementation is broken (prior to
>2.9.1). For these cases it should be sufficient to globally disable vIOMMU
>support in OVS.
>
>I don't see the need for one OVS instance to interwork with multiple different
>Qemu versions on the same host. And even if that were the case in an upgrade
>scenario, one could keep vIOMMU disabled until the old Qemu is removed.
>
>The specific enabling of vIOMMU per tenant VM port is done by supplying an
>iommu device to the VM in the Libvirt XML and enabling iommu for the device in
>the driver element of the interface section. These are the places that
>Nova/OpenStack would use to control the vIOMMU per VM based on image
>properties or flavor extra specs. I do not see any value to additionally
>configure the iommu support per vhostuser port through OS-VIF.
>
>Conclusion: A global other_config parameter to enable/disable vhostuser IOMMU
>is sufficient. By default this could be OFF for now and changed to ON when
>broken Qemu versions are largely gone.
>
>BR, Jan

Thanks Jan, Kevin for your respective inputs.

I've spoken to Sean about this offline, and we came to the same conclusion regarding the global config option.

I'll proceed with same, as discussed.

Thanks again,
Mark
Mooney, Sean K Nov. 27, 2017, 2:25 p.m. UTC | #16
> -----Original Message-----
> From: Kevin Traynor [mailto:ktraynor@redhat.com]
> Sent: Monday, November 27, 2017 1:32 PM
> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Mooney, Sean K
> <sean.k.mooney@intel.com>; Jan Scheurich <jan.scheurich@ericsson.com>;
> dev@openvswitch.org
> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck
> Baudin <fbaudin@redhat.com>; Ilya Maximets <i.maximets@samsung.com>;
> Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron Conole
> <aconole@redhat.com>
> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
> IOMMU feature
> 
> On 11/27/2017 12:55 PM, Kavanagh, Mark B wrote:
> >
> >
> >> From: Mooney, Sean K
> >> Sent: Sunday, November 26, 2017 11:28 PM
> >> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Jan Scheurich
> >> <jan.scheurich@ericsson.com>; Kevin Traynor <ktraynor@redhat.com>;
> >> dev@openvswitch.org
> >> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
> >> Franck Baudin <fbaudin@redhat.com>; Ilya Maximets
> >> <i.maximets@samsung.com>; Stokes, Ian <ian.stokes@intel.com>;
> Loftus,
> >> Ciara <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>;
> >> Aaron Conole <aconole@redhat.com>; Mooney, Sean K
> >> <sean.k.mooney@intel.com>
> >> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
> vhost
> >> IOMMU feature
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Kavanagh, Mark B
> >>> Sent: Friday, November 24, 2017 4:45 PM
> >>> To: Jan Scheurich <jan.scheurich@ericsson.com>; Kevin Traynor
> >>> <ktraynor@redhat.com>; dev@openvswitch.org
> >>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
> >>> Franck Baudin <fbaudin@redhat.com>; Mooney, Sean K
> >>> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>;
> >>> Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
> >>> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron
> >>> Conole <aconole@redhat.com>
> >>> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
> >>> vhost IOMMU feature
> >>>
> >>> Sounds good guys - I'll get cracking on this on Monday.
> >>>
> >>> Cheers,
> >>> Mark
> >>>
> >>>> -----Original Message-----
> >>>> From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
> >>>> Sent: Friday, November 24, 2017 4:21 PM
> >>>> To: Kevin Traynor <ktraynor@redhat.com>; Kavanagh, Mark B
> >>>> <mark.b.kavanagh@intel.com>; dev@openvswitch.org
> >>>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
> >>> Franck
> >>>> Baudin <fbaudin@redhat.com>; Mooney, Sean K
> >>>> <sean.k.mooney@intel.com>; Ilya Maximets <i.maximets@samsung.com>;
> >>>> Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
> >>>> <ciara.loftus@intel.com>;
> >>> Darrell
> >>>> Ball <dball@vmware.com>; Aaron Conole <aconole@redhat.com>
> >>>> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
> >>>> vhost IOMMU feature
> >>>>
> >>>> +1
> >>>> Jan
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Kevin Traynor [mailto:ktraynor@redhat.com]
> >>>>> Sent: Friday, 24 November, 2017 17:11
> >>>>> To: Mark Kavanagh <mark.b.kavanagh@intel.com>;
> dev@openvswitch.org
> >>>>> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>;
> >>>>> Franck
> >>>> Baudin <fbaudin@redhat.com>; Mooney, Sean K
> >>>>> <sean.k.mooney@intel.com>; Ilya Maximets
> <i.maximets@samsung.com>;
> >>>>> Ian
> >>>> Stokes <ian.stokes@intel.com>; Loftus, Ciara
> >>>>> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron
> >>>>> Conole
> >>>> <aconole@redhat.com>; Jan Scheurich
> >>>>> <jan.scheurich@ericsson.com>
> >>>>> Subject: Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for
> >>> vhost
> >>>>> IOMMU
> >>>> feature
> >>>>>
> >>>>> On 11/16/2017 11:01 AM, Mark Kavanagh wrote:
> >>>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
> >>>>>> This is a security feature, that restricts the vhost memory that
> >>>>>> a virtio device may access.
> >>>>>>
> >>>>>> This feature also enables the vhost REPLY_ACK protocol, the
> >>>>>> implementation of which is known to work in newer versions of
> >>>>>> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> >>>>>> v2.9.0, inclusive). As such, the feature is disabled by default
> >>>>>> in (and should remain so, for the aforementioned older QEMU
> verions).
> >>>>>> Starting with QEMU v2.9.1, vhost-iommu-support can safely be
> >>>>>> enabled, even without having an IOMMU device, with no
> performance
> >>>>>> penalty.
> >>>>>>
> >>>>>> This patch adds a new vhost port option, vhost-iommu-support, to
> >>>>>> allow enablement of the vhost IOMMU feature:
> >>>>>>
> >>>>>>     $ ovs-vsctl add-port br0 vhost-client-1 \
> >>>>>>         -- set Interface vhost-client-1 type=dpdkvhostuserclient
> \
> >>>>>>              options:vhost-server-path=$VHOST_USER_SOCKET_PATH
> \
> >>>>>>              options:vhost-iommu-support=true
> >>>>>>
> >>>>>
> >>>>> Hi Mark, All,
> >>>>>
> >>>>> I'm thinking about this and whether the current approach provides
> >>>>> more than what is actually needed by users at the cost of making
> >>>>> the user interface more complex.
> >> [Mooney, Sean K] I am personally split on this. To enable iommu
> >> support in openstack with the above interface we would have to do
> two
> >> things. 1 extend the image metatdata or flavor extra specs to allow
> >> requesting a vIOMMU. Second we would have to modify os-vif to
> produce
> >> the add-port command above. Using this interfaces gives us a nice
> >> parity in our api as we only enable iommu support in ovs if we
> enable
> >> for qemu. E.g. if the falvor/image does not request a virtualized
> >> iommu we wont declare its support in the options.
> >>>>>
> >>>>> As an alternative, how about having a global other_config (to be
> >>>>> set like vhost-socket-dir can be) for this instead of having to
> >>>>> set it for each individual interface. It would mean that it would
> >>>>> only have to be set once, instead of having this (ugly?!) option
> >>>>> every time a vhost port is added, so it's a less intrusive change
> >>>>> and I can't really think that a user would require to do this per
> >>>>> vhostclient
> >> [Mooney, Sean K]  well one taught that instantly comes to mind is if
> >> I set The global other_config option what happens if I do not
> request
> >> a iommu in qemu Or I have an old qemu.  If it would have any
> negative
> >> effect on network connectivity Or prevent the vm from functioning,
> >> that would require the nova scheduler to be Aware of which node had
> >> this option set and take that into account when placing vms.
> >> I assume if it had no negative effects  then we would not need a
> >> option, global or per Interface.
> >>>>> interface??? It's pain to have to add this at all for a bug in
> >>>>> QEMU and I'm sure in 1/2/3 years time someone will say that users
> >>>>> could still be using QEMU < 2.9.1 and we can't remove it, so it
> >>>>> would be nice to keep it as discreet as possible as we're going
> to
> >>>>> be stuck
> >>> with it for a while.
> >>>>>
> >>>>> I assume that a user would only use one version of QEMU at a time
> >>> and
> >>>>> would either want or not want this feature. In the worst case, if
> >>>>> that proved completely wrong in the future, then a per interface
> >>>>> override could easily be added. Once there's a way to maintain
> >>>>> backwards compatibility (which there would be) I'd rather err on
> >>>>> the side of introducing just enough enough functionality over
> >>>>> increasing complexity for the user.
> >>>>>
> >>>>> What do you think?
> >> [Mooney, Sean K] I'm not sure that a single qemu version is a valid
> >> assumption to make.
> 
> Hi Sean, the solution in the current patch is effectively to allow
> iommu/different QEMU versions on a *per port* basis. I am assuming that
> level of granularity is not needed and instead proposing to allow
> iommu/different QEMU versions on a *per OVS instance* basis. It's not
> making any assumption about QEMU versions across multiple nodes.
> 
> >> At least in an OpenStack context where rolling upgrades are a thing.
> >> But you are right That it will be less common however if it was no
> >> existent we would not have the issue with Live migration between
> >> nodes with different feature sets that is also trying to be
> addressed
> >> this Cycle. If we add a global config option for iommu support that
> >> is yet another thing that needs To be accounted for during live
> >> migration.
> 
> Yes that's true, but it's difficult problem anyway and I don't think an
> additional binary field would make it much more complex. The
> alternative is never allow the iommu feature to be enabled, or drop
> support for QEMU versions < 2.9.1. It's default off, so you should have
> backwards compatibility at least.
> 
> >>>>>
> >
> > Hi Kevin, Jan,
> >
> > Any thoughts on Sean's concerns?
> >
> 
> Sean can confirm, but my reading is that Sean's primary concern is with
> the proposal to add any config option to enable iommu - not with the
> proposal to change from a per port to per OVS basis.
[Mooney, Sean K] so I had a quick chat with Mark about the background to this.
My current understanding is that from QEMU 2.6-2.9 there is a bug that causes the
guest to hang, due to a lack of support for a required ack to correctly negotiate the
use of the iommu feature. As a result a config option, either global or per interface, is
required. As an aside, OpenStack does not currently support requesting iommu virtualization, so
I don't see why we could not put a requirement on our side that you use QEMU 2.9.1 or newer
to avail of that feature. Going back to OVS then, assuming we used a single global config value
in, for example, the other_config dictionary of the Open_vSwitch table in the ovsdb,
e.g. ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
I would be fine with that approach if we can confirm 1 thing first.

Assuming QEMU >= 2.9.1 + other_config:vhost-iommu-support=true, it is possible
to boot a vm both with and without "-device intel-iommu,intremap=on,eim=off,caching-mode=on"
specified on the QEMU commandline.

That is to say, assuming all software versions have support for this feature,
if we globally enable vhost iommu support negotiation in OVS, and we spawn a
vm with or without a virtualized iommu, then it will spawn correctly with network connectivity
in both cases.

If we cannot assert the above to be true I would prefer to have the per interface option, though
I tend to agree that, since this is just working around a QEMU bug, the global option is less intrusive and
simpler to remove long term. For example, if OVS chooses in the future to drop support for
QEMU < 2.9.1, it can deprecate support for old QEMU and the other_config:vhost-iommu-support option, changing
its default from false to true in release n; then in release n+1 it can drop support for old QEMU versions
and remove the other_config:vhost-iommu-support option entirely.


> 
> Kevin.
> 
> > I'll hold off on implementation until we have consensus.
> >
> > Thanks,
> > Mark
> >
> >
> >
> >
> >>>>> thanks,
> >>>>> Kevin.
Mooney, Sean K Nov. 27, 2017, 2:30 p.m. UTC | #17
> -----Original Message-----
> From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]
> Sent: Monday, November 27, 2017 2:15 PM
> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Mooney, Sean K
> <sean.k.mooney@intel.com>; Kevin Traynor <ktraynor@redhat.com>;
> dev@openvswitch.org
> Cc: maxime.coquelin@redhat.com; Flavio Leitner <fbl@redhat.com>; Franck
> Baudin <fbaudin@redhat.com>; Ilya Maximets <i.maximets@samsung.com>;
> Stokes, Ian <ian.stokes@intel.com>; Loftus, Ciara
> <ciara.loftus@intel.com>; Darrell Ball <dball@vmware.com>; Aaron Conole
> <aconole@redhat.com>
> Subject: RE: [ovs-dev] [PATCH 2/2] netdev-dpdk: add support for vhost
> IOMMU feature
> 
> > >> >> Hi Mark, All,
> > >> >>
> > >> >> I'm thinking about this and whether the current approach
> > >> >> provides more than what is actually needed by users at the cost
> > >> >> of making the user interface more complex.
> > >[Mooney, Sean K] I am personally split on this. To enable iommu
> > >support in openstack with the above interface we would have to do
> two
> > >things. 1 extend the image metatdata or flavor extra specs to allow
> > >requesting a vIOMMU. Second we would have to modify os-vif to
> produce
> > >the add-port command above. Using this interfaces gives us a nice
> > >parity in our api as we only enable iommu support in ovs if we
> enable
> > >for qemu. E.g. if the falvor/image does not request a virtualized
> > >iommu we wont declare its support in the options.
> > >> >>
> > >> >> As an alternative, how about having a global other_config (to
> be
> > >> >> set like vhost-socket-dir can be) for this instead of having to
> > >> >> set it for each individual interface. It would mean that it
> > >> >> would only have to be set once, instead of having this (ugly?!)
> > >> >> option every time a vhost port is added, so it's a less
> > >> >> intrusive change and I can't really think that a user would
> > >> >> require to do this per vhostclient
> > >[Mooney, Sean K]  well one taught that instantly comes to mind is if
> > >I set The global other_config option what happens if I do not
> request
> > >a iommu in qemu Or I have an old qemu.  If it would have any
> negative
> > >effect on network connectivity Or prevent the vm from functioning,
> > >that would require the nova scheduler to be Aware of which node had
> > >this option set and take that into account when placing vms.
> > >I assume if it had no negative effects  then we would not need a
> > >option, global or per Interface.
> > >> >> interface??? It's pain to have to add this at all for a bug in
> > >> >> QEMU and I'm sure in 1/2/3 years time someone will say that
> > >> >> users could still be using QEMU < 2.9.1 and we can't remove it,
> > >> >> so it would be nice to keep it as discreet as possible as we're
> > >> >> going to be stuck
> > >> with it for a while.
> > >> >>
> > >> >> I assume that a user would only use one version of QEMU at a
> > >> >> time
> > >> and
> > >> >> would either want or not want this feature. In the worst case,
> > >> >> if that proved completely wrong in the future, then a per
> > >> >> interface override could easily be added. Once there's a way to
> > >> >> maintain backwards compatibility (which there would be) I'd
> > >> >> rather err on the side of introducing just enough enough
> > >> >> functionality over increasing complexity for the user.
> > >> >>
> > >> >> What do you think?
> > >[Mooney, Sean K] I'm not sure that a single qemu version is a valid
> > >assumption to make.
> > >At least in an OpenStack context where rolling upgrades are a thing.
> > >But you are right That it will be less common however if it was no
> > >existent we would not have the issue with Live migration between
> > >nodes with different feature sets that is also trying to be
> addressed
> > >this Cycle. If we add a global config option for iommu support that
> > >is yet another thing that needs To be accounted for during live
> > >migration.
> > >> >>
> >
> > Hi Kevin, Jan,
> >
> > Any thoughts on Sean's concerns?
> >
> > I'll hold off on implementation until we have consensus.
> >
> > Thanks,
> > Mark
> 
> Here are my 2cts:
> 
> As I see it, vIOMMU for vhostuser is a pure infrastructure feature
> negotiated between guest driver,  Qemu and OVS. If both Qemu and OVS
> support it and the driver requests it, it should be used, otherwise
> not.
> 
> My understanding is that the vhostuser library in DPDK 17.11 supports
> vIOMMU, such that OVS could always signal support for this feature. The
> only reason not to do so is to work around the problem that Qemu claims
> vIOMMU support in certain releases but the vhostuser backend
> implementation is broken (prior to 2.9.1). For these cases it should be
> sufficient to globally disable vIOMMU support in OVS.
> 
> I don't see the need for one OVS instance to interwork with multiple
> different Qemu versions on the same host. And even if that were the
> case in an upgrade scenario, one could keep vIOMMU disabled until the
> old Qemu is removed.
> 
> The specific enabling of vIOMMU per tenant VM port is done by supplying
> an iommu device to the VM in the Libvirt XML and enabling iommu for the
> device in the driver element of the interface section. These are the
> places that Nova/OpenStack would use to control the vIOMMU per VM based
> on image properties or flavor extra specs. I do not see any value to
> additionally configure the iommu support per vhostuser port through OS-
> VIF.
> 
> Conclusion: A global other_config parameter to enable/disable vhostuser
> IOMMU is sufficient. By default this could be OFF for now and changed
> to ON when broken Qemu versions are largely gone.
[Mooney, Sean K] hi, yes, I just responded to this before I saw your reply.
A global option will be fine if we can confirm that enabling the iommu feature negotiation
in OVS will not impact vms that do not have a virtualized iommu enabled on the Libvirt/QEMU commandline.
I would personally consider it a driver bug if OVS advertised support for using a virtualized iommu and
the driver in the guest requested it without having an actual iommu present in the vm.
> 
> BR, Jan
Jan Scheurich Nov. 27, 2017, 2:40 p.m. UTC | #18
> > Conclusion: A global other_config parameter to enable/disable vhostuser
> > IOMMU is sufficient. By default this could be OFF for now and changed
> > to ON when broken Qemu versions are largely gone.

> [Mooney, Sean K] hi yes I just responded to this before I saw your reply.
> A global option will be fine if we can confirm that enabling the iommu feature negotiation
> In ovs will not impact vms that do not have a virtualized iommu enabled in the Libvirt/ qemu commandline.
> I would personally consider it a driver bug if ovs advertised support for using a virtualized iommu and
> The driver in the guest requested it with our have an actual iommu present in the vm.
> 

I agree. I think it is up to the driver and the Qemu VM to agree on whether it is OK to use vIOMMU in the first place. If OVS, as the vhostuser backend, supports vIOMMU, it will be used; otherwise I would expect either of two things to happen:
1. Qemu falls back to vhostuser without IOMMU, or 
2. Qemu fails to start the VM.

My guess would be the latter, but I do not know. 

BR, Jan
Maxime Coquelin Nov. 28, 2017, 1:03 p.m. UTC | #19
Hi Sean,

On 11/27/2017 03:25 PM, Mooney, Sean K wrote:
>> Sean can confirm, but my reading is that Sean's primary concern is with
>> the proposal to add any config option to enable iommu - not with the
>> proposal to change from a per port to per OVS basis.
> [Mooney, Sean K] so I had a quick chat with mark about the background to this.
> My current understanding is that form qemu 2.6-2.9 there is a bug that causes the
> Guest to hang due to a lack of support for a required ack to correctly negotiate the
> Use of the iommu feature. As a result a config option either global or per interfaces is
> Required. As an aside OpenStack does not currently support requesting iommu virtualization so
> I don’t see why we could not put a requirement on our side that you use qemu 2.9.1 or newer
> To avail of that feature. Going back to ovs then assuming we used a single global config vaule
> In for example the other_config dictionary of the Open_vSwitch table in the ovsdb.
> e.g. ovs-vsctl set Open_vswitch . other_config:vhost-iommu-support=true
> I would be fine with that approach if we can confirm 1 thing first.

I confirm your understanding:
1. Before Qemu v2.9.1, we must not enable the iommu-support option,
because it enables a vhost-user protocol feature (reply-ack) that
suffers from a bug in these Qemu versions. And this is not a problem,
because iommu isn't supported anyway in these Qemu versions.
2. Since Qemu v2.9.1, the reply-ack bug has been fixed, so we can enable
the iommu-support option safely, whether using an iommu or not.

Regards,
Maxime

Patch

diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
index 5347995..8dff901 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -250,6 +250,27 @@  Once the vhost-user-client ports have been added to the switch, they must be
 added to the guest. Like vhost-user ports, there are two ways to do this: using
 QEMU directly, or using libvirt. Only the QEMU case is covered here.
 
+vhost-user client IOMMU
+~~~~~~~~~~~~~~~~~~~~~~~
+It is possible to enable IOMMU support for vHost User client ports. This is
+a feature which restricts the vhost memory that a virtio device can access, and
+as such is useful in deployments in which security is a concern. IOMMU mode may
+be enabled on the command line::
+
+    $ ovs-vsctl add-port br0 vhost-client-1 \
+        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
+             options:vhost-server-path=$VHOST_USER_SOCKET_PATH   \
+             options:vhost-iommu-support=true
+
+.. important::
+
+    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
+    this is known to work on QEMU v2.10.0, but is buggy on older versions
+    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feature is disabled by
+    default (and should remain so if using the aforementioned versions of QEMU).
+    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
+    without having an IOMMU device, with no performance penalty.
+
 Adding vhost-user-client ports to the guest (QEMU)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/NEWS b/NEWS
index 74e59bf..c15dc24 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,7 @@  Post-v2.8.0
      * Add support for compiling OVS with the latest Linux 4.13 kernel
    - DPDK:
      * Add support for DPDK v17.11
+     * Add support for vHost IOMMU feature
 
 v2.8.0 - 31 Aug 2017
 --------------------
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ed5bf62..2e9633a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1424,15 +1424,29 @@  netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     const char *path;
+    bool iommu_enable;
+    bool request_reconfigure = false;
+    uint64_t vhost_driver_flags_prev = dev->vhost_driver_flags;
 
     ovs_mutex_lock(&dev->mutex);
     if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
         path = smap_get(args, "vhost-server-path");
         if (path && strcmp(path, dev->vhost_id)) {
             strcpy(dev->vhost_id, path);
-            netdev_request_reconfigure(netdev);
+            request_reconfigure = true;
         }
     }
+
+    iommu_enable = smap_get_bool(args, "vhost-iommu-support", false);
+    if (iommu_enable)
+        dev->vhost_driver_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
+    else
+        dev->vhost_driver_flags &= ~RTE_VHOST_USER_IOMMU_SUPPORT;
+    if (vhost_driver_flags_prev != dev->vhost_driver_flags)
+        request_reconfigure = true;
+
+    if (request_reconfigure)
+        netdev_request_reconfigure(netdev);
     ovs_mutex_unlock(&dev->mutex);
 
     return 0;
@@ -3326,9 +3340,18 @@  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
      */
     if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
             && strlen(dev->vhost_id)) {
+        /* Enable vhost IOMMU, if it was requested.
+         * XXX: the 'flags' variable is required, as not all vhost backend
+         * features are currently supported by OvS; in time, it should be
+         * possible to invoke rte_vhost_driver_register(), passing
+         * dev->vhost_driver_flags directly as a parameter to same.
+         */
+        uint64_t flags = RTE_VHOST_USER_CLIENT;
+        if (dev->vhost_driver_flags & RTE_VHOST_USER_IOMMU_SUPPORT)
+            flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
+
         /* Register client-mode device */
-        err = rte_vhost_driver_register(dev->vhost_id,
-                                        RTE_VHOST_USER_CLIENT);
+        err = rte_vhost_driver_register(dev->vhost_id, flags);
         if (err) {
             VLOG_ERR("vhost-user device setup failure for device %s\n",
                      dev->vhost_id);
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index c145e1a..a633226 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -2654,6 +2654,16 @@  ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
         </p>
       </column>
 
+      <column name="options" key="vhost-iommu-support"
+              type='{"type": "boolean"}'>
+        <p>
+          The value specifies whether IOMMU support is enabled for a vHost User
+          client mode device that has been or will be created by QEMU.
+          Only supported by dpdkvhostuserclient interfaces. If not specified or
+          an incorrect value is specified, defaults to 'false'.
+        </p>
+      </column>
+
       <column name="options" key="n_rxq_desc"
               type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
         <p>