diff mbox series

[ovs-dev,V4,2/2] netdev-dpdk: vHost IOMMU support

Message ID 1512386159-236028-3-git-send-email-mark.b.kavanagh@intel.com
State Changes Requested
Delegated to: Ian Stokes
Headers show
Series DPDK v17.11 Support | expand

Commit Message

Mark Kavanagh Dec. 4, 2017, 11:15 a.m. UTC
DPDK v17.11 introduces support for the vHost IOMMU feature.
This is a security feature, which restricts the vhost memory
that a virtio device may access.

This feature also enables the vhost REPLY_ACK protocol, the
implementation of which is known to work in newer versions of
QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
v2.9.0, inclusive). As such, the feature is disabled by default
in (and should remain so), for the aforementioned older QEMU
verions. Starting with QEMU v2.9.1, vhost-iommu-support can
safely be enabled, even without having an IOMMU device, with
no performance penalty.

This patch adds a new global config option, vhost-iommu-support,
that controls enablement of the vhost IOMMU feature:

    ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true

This value defaults to false; to enable IOMMU support, this field
should be set to true when setting other global parameters on init
(such as "dpdk-socket-mem", for example). Changing the value at
runtime is not supported, and requires restarting the vswitch daemon.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>

---

v4:  - rebase to HEAD of master
     - clarify that IOMMU support applies exclusively to vhost user
       client ports.
     - reword caveats regarding modifying vhost-iommu-support at runtime
     - use smap_get_bool() to parse vhost-iommu-support, instead of
       process_vhost_flags().
     - add stub implementation of dpdk_vhost_iommu_enabled().
     - move stdbool.h #include outside of DPDK compiler guards.
     - remove mention of vhost IOMMU mode from
       netdev_dpdk_vhost_client_configure() log.

v3:  - erroneous; disregard.

v2:  - rebase to HEAD of master
     - refactor vHost IOMMU enablement mechanism (use a global
       config option, instead of the previous per-port approach).
---
 Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
 NEWS                                     |  1 +
 lib/dpdk-stub.c                          |  6 ++++++
 lib/dpdk.c                               | 12 ++++++++++++
 lib/dpdk.h                               |  3 +++
 lib/netdev-dpdk.c                        | 14 ++++++++++----
 vswitchd/vswitch.xml                     | 15 +++++++++++++++
 7 files changed, 74 insertions(+), 4 deletions(-)

Comments

Stokes, Ian Dec. 7, 2017, 8:42 a.m. UTC | #1
> DPDK v17.11 introduces support for the vHost IOMMU feature.
> This is a security feature, which restricts the vhost memory that a virtio
> device may access.

Hi all,

There were a few requests for changes in the v4 of this patch and it's an important aspect of the DPDK 17.11 support for OVS.

I'd like to get this series in to the pull request this week. If people have flagged issues for the latest revision I'd appreciate if they could review the latest revision and flag any new issues that need to be raised.

Ian

> 
> This feature also enables the vhost REPLY_ACK protocol, the implementation
> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but is
> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the feature
> is disabled by default in (and should remain so), for the aforementioned
> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
> safely be enabled, even without having an IOMMU device, with no
> performance penalty.
> 
> This patch adds a new global config option, vhost-iommu-support, that
> controls enablement of the vhost IOMMU feature:
> 
>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
> 
> This value defaults to false; to enable IOMMU support, this field should
> be set to true when setting other global parameters on init (such as
> "dpdk-socket-mem", for example). Changing the value at runtime is not
> supported, and requires restarting the vswitch daemon.
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> 
> ---
> 
> v4:  - rebase to HEAD of master
>      - clarify that IOMMU support applies exclusively to vhost user
>        client ports.
>      - reword caveats regarding modifying vhost-iommu-support at runtime
>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>        process_vhost_flags().
>      - add stub implementation of dpdk_vhost_iommu_enabled().
>      - move stdbool.h #include outside of DPDK compiler guards.
>      - remove mention of vhost IOMMU mode from
>        netdev_dpdk_vhost_client_configure() log.
> 
> v3:  - erroneous; disregard.
> 
> v2:  - rebase to HEAD of master
>      - refactor vHost IOMMU enablement mechanism (use a global
>        config option, instead of the previous per-port approach).
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
>  NEWS                                     |  1 +
>  lib/dpdk-stub.c                          |  6 ++++++
>  lib/dpdk.c                               | 12 ++++++++++++
>  lib/dpdk.h                               |  3 +++
>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>  7 files changed, 74 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst
> b/Documentation/topics/dpdk/vhost-user.rst
> index 5347995..ffef653 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
> vHost ports to 'reconnect' in  event of the switch crashing or being
> brought down. Once it is brought back up,  the vHost ports will reconnect
> automatically and normal service will resume.
> 
> +vhost-user-client IOMMU Support
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
> +device can access, and as such is useful in deployments in which security
> is a concern.
> +
> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
> support```.
> +Setting this to true enables vhost IOMMU support for all vhost ports
> +when/where
> +available::
> +
> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
> +
> +The default value is false.
> +
> +.. important::
> +
> +    Changing this value requires restarting the daemon.
> +
> +.. important::
> +
> +    Enabling the IOMMU feature also enables the vhost user reply-ack
> protocol;
> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
> disabled by
> +    default (and should remain so if using the aforementioned versions of
> QEMU).
> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
> even
> +    without having an IOMMU device, with no performance penalty.
> +
>  .. _dpdk-testpmd:
> 
>  DPDK in the Guest
> diff --git a/NEWS b/NEWS
> index d4a1c9a..99c47d8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -15,6 +15,7 @@ Post-v2.8.0
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>     - DPDK:
>       * Add support for DPDK v17.11
> +     * Add support for vHost IOMMU
> 
>  v2.8.0 - 31 Aug 2017
>  --------------------
> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
> 100644
> --- a/lib/dpdk-stub.c
> +++ b/lib/dpdk-stub.c
> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>      return NULL;
>  }
> +
> +bool
> +dpdk_vhost_iommu_enabled(void)
> +{
> +    return false;
> +}
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 8da6c32..6710d10 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
> */
> 
>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
> */
> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
> +support */
> 
>  static int
>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>          vhost_sock_dir = sock_dir_subcomponent;
>      }
> 
> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
> +                                        "vhost-iommu-support", false);
> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
> +               vhost_iommu_enabled ? "enabled" : "disabled");
> +
>      argv = grow_argv(&argv, 0, 1);
>      argc = 1;
>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
> dpdk_get_vhost_sock_dir(void)
>      return vhost_sock_dir;
>  }
> 
> +bool
> +dpdk_vhost_iommu_enabled(void)
> +{
> +    return vhost_iommu_enabled;
> +}
> +
>  void
>  dpdk_set_lcore_id(unsigned cpu)
>  {
> diff --git a/lib/dpdk.h b/lib/dpdk.h
> index 673a1f1..dc58d96 100644
> --- a/lib/dpdk.h
> +++ b/lib/dpdk.h
> @@ -17,6 +17,8 @@
>  #ifndef DPDK_H
>  #define DPDK_H
> 
> +#include <stdbool.h>
> +
>  #ifdef DPDK_NETDEV
> 
>  #include <rte_config.h>
> @@ -35,5 +37,6 @@ struct smap;
>  void dpdk_init(const struct smap *ovs_other_config);  void
> dpdk_set_lcore_id(unsigned cpu);  const char
> *dpdk_get_vhost_sock_dir(void);
> +bool dpdk_vhost_iommu_enabled(void);
> 
>  #endif /* dpdk.h */
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
> *netdev)  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int err;
> +    uint64_t vhost_flags = 0;
> 
>      ovs_mutex_lock(&dev->mutex);
> 
> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
> *netdev)
>       */
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>              && strlen(dev->vhost_id)) {
> -        /* Register client-mode device */
> -        err = rte_vhost_driver_register(dev->vhost_id,
> -                                        RTE_VHOST_USER_CLIENT);
> +        /* Register client-mode device. */
> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
> +
> +        /* Enable IOMMU support, if explicitly requested. */
> +        if (dpdk_vhost_iommu_enabled()) {
> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +        }
> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>          if (err) {
>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>                       dev->vhost_id);
>              goto unlock;
>          } else {
>              /* Configuration successful */
> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
> +            dev->vhost_driver_flags |= vhost_flags;
>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>                        "using client socket '%s'",
>                        dev->up.name, dev->vhost_id); diff --git
> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
> 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -344,6 +344,21 @@
>          </p>
>        </column>
> 
> +      <column name="other_config" key="vhost-iommu-support"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          vHost IOMMU is a  security feature, which restricts the vhost
> memory
> +          that a virtio device may access. vHost IOMMU support is
> disabled by
> +          default, due to a bug in QEMU implementations of the vhost
> REPLY_ACK
> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
> Setting this
> +          value to <code>true</code> enables vHost IOMMU support for
> vHost User
> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
> +        </p>
> +        <p>
> +          Changing this value requires restarting the daemon.
> +        </p>
> +      </column>
> +
>        <column name="other_config" key="n-handler-threads"
>                type='{"type": "integer", "minInteger": 1}'>
>          <p>
> --
> 2.7.4
Ilya Maximets Dec. 7, 2017, 9:48 a.m. UTC | #2
Looks good to me in general, but I prefer if you'll fix following
checkpatch warnings:

== Checking 2e9587f27add ("netdev-dpdk: vHost IOMMU support") ==
WARNING: Line length is >79-characters long
#52 FILE: Documentation/topics/dpdk/vhost-user.rst:280:
can access, and as such is useful in deployments in which security is a concern.

WARNING: Line length is >79-characters long
#54 FILE: Documentation/topics/dpdk/vhost-user.rst:282:
IOMMU support may be enabled via a global config value, ```vhost-iommu-support```.

WARNING: Line length is >79-characters long
#71 FILE: Documentation/topics/dpdk/vhost-user.rst:299:
    default (and should remain so if using the aforementioned versions of QEMU).

One more comment inline.

Best regards, Ilya Maximets.

On 04.12.2017 14:15, Mark Kavanagh wrote:
> DPDK v17.11 introduces support for the vHost IOMMU feature.
> This is a security feature, which restricts the vhost memory
> that a virtio device may access.
> 
> This feature also enables the vhost REPLY_ACK protocol, the
> implementation of which is known to work in newer versions of
> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> v2.9.0, inclusive). As such, the feature is disabled by default
> in (and should remain so), for the aforementioned older QEMU
> verions. Starting with QEMU v2.9.1, vhost-iommu-support can
> safely be enabled, even without having an IOMMU device, with
> no performance penalty.
> 
> This patch adds a new global config option, vhost-iommu-support,
> that controls enablement of the vhost IOMMU feature:
> 
>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
> 
> This value defaults to false; to enable IOMMU support, this field
> should be set to true when setting other global parameters on init
> (such as "dpdk-socket-mem", for example). Changing the value at
> runtime is not supported, and requires restarting the vswitch daemon.
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> 
> ---
> 
> v4:  - rebase to HEAD of master
>      - clarify that IOMMU support applies exclusively to vhost user
>        client ports.
>      - reword caveats regarding modifying vhost-iommu-support at runtime
>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>        process_vhost_flags().
>      - add stub implementation of dpdk_vhost_iommu_enabled().
>      - move stdbool.h #include outside of DPDK compiler guards.
>      - remove mention of vhost IOMMU mode from
>        netdev_dpdk_vhost_client_configure() log.
> 
> v3:  - erroneous; disregard.
> 
> v2:  - rebase to HEAD of master
>      - refactor vHost IOMMU enablement mechanism (use a global
>        config option, instead of the previous per-port approach).
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
>  NEWS                                     |  1 +
>  lib/dpdk-stub.c                          |  6 ++++++
>  lib/dpdk.c                               | 12 ++++++++++++
>  lib/dpdk.h                               |  3 +++
>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>  7 files changed, 74 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> index 5347995..ffef653 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for vHost ports to 'reconnect' in
>  event of the switch crashing or being brought down. Once it is brought back up,
>  the vHost ports will reconnect automatically and normal service will resume.
>  
> +vhost-user-client IOMMU Support
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +vhost IOMMU is a feature which restricts the vhost memory that a virtio device
> +can access, and as such is useful in deployments in which security is a concern.
> +
> +IOMMU support may be enabled via a global config value, ```vhost-iommu-support```.
> +Setting this to true enables vhost IOMMU support for all vhost ports when/where
> +available::
> +
> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true

Few spaces missing in above command.

> +
> +The default value is false.
> +
> +.. important::
> +
> +    Changing this value requires restarting the daemon.
> +
> +.. important::
> +
> +    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled by
> +    default (and should remain so if using the aforementioned versions of QEMU).
> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
> +    without having an IOMMU device, with no performance penalty.
> +
>  .. _dpdk-testpmd:
>  
>  DPDK in the Guest
> diff --git a/NEWS b/NEWS
> index d4a1c9a..99c47d8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -15,6 +15,7 @@ Post-v2.8.0
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>     - DPDK:
>       * Add support for DPDK v17.11
> +     * Add support for vHost IOMMU
>  
>  v2.8.0 - 31 Aug 2017
>  --------------------
> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
> index daef729..3602180 100644
> --- a/lib/dpdk-stub.c
> +++ b/lib/dpdk-stub.c
> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)
>  {
>      return NULL;
>  }
> +
> +bool
> +dpdk_vhost_iommu_enabled(void)
> +{
> +    return false;
> +}
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 8da6c32..6710d10 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
>  
>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
>  
>  static int
>  process_vhost_flags(char *flag, const char *default_val, int size,
> @@ -345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>          vhost_sock_dir = sock_dir_subcomponent;
>      }
>  
> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
> +                                        "vhost-iommu-support", false);
> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
> +               vhost_iommu_enabled ? "enabled" : "disabled");
> +
>      argv = grow_argv(&argv, 0, 1);
>      argc = 1;
>      argv[0] = xstrdup(ovs_get_program_name());
> @@ -482,6 +488,12 @@ dpdk_get_vhost_sock_dir(void)
>      return vhost_sock_dir;
>  }
>  
> +bool
> +dpdk_vhost_iommu_enabled(void)
> +{
> +    return vhost_iommu_enabled;
> +}
> +
>  void
>  dpdk_set_lcore_id(unsigned cpu)
>  {
> diff --git a/lib/dpdk.h b/lib/dpdk.h
> index 673a1f1..dc58d96 100644
> --- a/lib/dpdk.h
> +++ b/lib/dpdk.h
> @@ -17,6 +17,8 @@
>  #ifndef DPDK_H
>  #define DPDK_H
>  
> +#include <stdbool.h>
> +
>  #ifdef DPDK_NETDEV
>  
>  #include <rte_config.h>
> @@ -35,5 +37,6 @@ struct smap;
>  void dpdk_init(const struct smap *ovs_other_config);
>  void dpdk_set_lcore_id(unsigned cpu);
>  const char *dpdk_get_vhost_sock_dir(void);
> +bool dpdk_vhost_iommu_enabled(void);
>  
>  #endif /* dpdk.h */
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index f552444..9715c39 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int err;
> +    uint64_t vhost_flags = 0;
>  
>      ovs_mutex_lock(&dev->mutex);
>  
> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>       */
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>              && strlen(dev->vhost_id)) {
> -        /* Register client-mode device */
> -        err = rte_vhost_driver_register(dev->vhost_id,
> -                                        RTE_VHOST_USER_CLIENT);
> +        /* Register client-mode device. */
> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
> +
> +        /* Enable IOMMU support, if explicitly requested. */
> +        if (dpdk_vhost_iommu_enabled()) {
> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +        }
> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>          if (err) {
>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>                       dev->vhost_id);
>              goto unlock;
>          } else {
>              /* Configuration successful */
> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
> +            dev->vhost_driver_flags |= vhost_flags;
>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>                        "using client socket '%s'",
>                        dev->up.name, dev->vhost_id);
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index c145e1a..6c6df50 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -344,6 +344,21 @@
>          </p>
>        </column>
>  
> +      <column name="other_config" key="vhost-iommu-support"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          vHost IOMMU is a  security feature, which restricts the vhost memory
> +          that a virtio device may access. vHost IOMMU support is disabled by
> +          default, due to a bug in QEMU implementations of the vhost REPLY_ACK
> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting this
> +          value to <code>true</code> enables vHost IOMMU support for vHost User
> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
> +        </p>
> +        <p>
> +          Changing this value requires restarting the daemon.
> +        </p>
> +      </column>
> +
>        <column name="other_config" key="n-handler-threads"
>                type='{"type": "integer", "minInteger": 1}'>
>          <p>
>
Mark Kavanagh Dec. 7, 2017, 11:24 a.m. UTC | #3
Hi folks,

Thanks for all of your respective reviews and testing of this patchset.

It seems, however, that an issue in DPDK v17.11 has come to light that affects guests which use testpmd. 
Specifically, traffic does not reach a guest when traffic is live prior to kicking off the testpmd app within said guest, and at least two forwarding cores are used. [1]

This is explained in additional detail is [2], an extract of which is reproduced below:

	"the vector Rx could be broken if backend has consumed all the avail descs before the
	 device is started. Because in current implementation, the vector Rx will return immediately
 	 without refilling the avail ring if the used ring is empty. So we have to refill
	 the avail ring after flushing the elements in the used ring."

This issue was initially uncovered by Antonio Fischetti, as part of the 17.11 patchset validation, and has since been localized to DPDK, rather than OvS.
As a result, it seems now that we should not move to 17.11, but seek an out-of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on this - should we:
	a) move to 17.11 now, note the issue above in the 'errata' section of the documentation, and move to 17.11.1 when it becomes available in February of next year
	b) request the early release of the 17.11.1 stable branch, which incorporates a fix for this issue (the possibility, and availability, of which are both TBD).

Thanks in advance,
Mark

[1] http://dpdk.org/ml/archives/dev/2017-December/082801.html 
[2] http://dpdk.org/ml/archives/dev/2017-December/082874.html 


>-----Original Message-----
>From: Stokes, Ian
>Sent: Thursday, December 7, 2017 8:43 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com; i.maximets@samsung.com;
>jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>
>Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>
>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>> This is a security feature, which restricts the vhost memory that a virtio
>> device may access.
>
>Hi all,
>
>There were a few requests for changes in the v4 of this patch and it's an
>important aspect of the DPDK 17.11 support for OVS.
>
>I'd like to get this series in to the pull request this week. If people have
>flagged issues for the latest revision I'd appreciate if they could review the
>latest revision and flag any new issues that need to be raised.
>
>Ian
>
>>
>> This feature also enables the vhost REPLY_ACK protocol, the implementation
>> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but is
>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the feature
>> is disabled by default in (and should remain so), for the aforementioned
>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>> safely be enabled, even without having an IOMMU device, with no
>> performance penalty.
>>
>> This patch adds a new global config option, vhost-iommu-support, that
>> controls enablement of the vhost IOMMU feature:
>>
>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>
>> This value defaults to false; to enable IOMMU support, this field should
>> be set to true when setting other global parameters on init (such as
>> "dpdk-socket-mem", for example). Changing the value at runtime is not
>> supported, and requires restarting the vswitch daemon.
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>
>> ---
>>
>> v4:  - rebase to HEAD of master
>>      - clarify that IOMMU support applies exclusively to vhost user
>>        client ports.
>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>        process_vhost_flags().
>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>      - move stdbool.h #include outside of DPDK compiler guards.
>>      - remove mention of vhost IOMMU mode from
>>        netdev_dpdk_vhost_client_configure() log.
>>
>> v3:  - erroneous; disregard.
>>
>> v2:  - rebase to HEAD of master
>>      - refactor vHost IOMMU enablement mechanism (use a global
>>        config option, instead of the previous per-port approach).
>> ---
>>  Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
>>  NEWS                                     |  1 +
>>  lib/dpdk-stub.c                          |  6 ++++++
>>  lib/dpdk.c                               | 12 ++++++++++++
>>  lib/dpdk.h                               |  3 +++
>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>> b/Documentation/topics/dpdk/vhost-user.rst
>> index 5347995..ffef653 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
>> vHost ports to 'reconnect' in  event of the switch crashing or being
>> brought down. Once it is brought back up,  the vHost ports will reconnect
>> automatically and normal service will resume.
>>
>> +vhost-user-client IOMMU Support
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>> +device can access, and as such is useful in deployments in which security
>> is a concern.
>> +
>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>> support```.
>> +Setting this to true enables vhost IOMMU support for all vhost ports
>> +when/where
>> +available::
>> +
>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>> +
>> +The default value is false.
>> +
>> +.. important::
>> +
>> +    Changing this value requires restarting the daemon.
>> +
>> +.. important::
>> +
>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>> protocol;
>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
>> disabled by
>> +    default (and should remain so if using the aforementioned versions of
>> QEMU).
>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>> even
>> +    without having an IOMMU device, with no performance penalty.
>> +
>>  .. _dpdk-testpmd:
>>
>>  DPDK in the Guest
>> diff --git a/NEWS b/NEWS
>> index d4a1c9a..99c47d8 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>     - DPDK:
>>       * Add support for DPDK v17.11
>> +     * Add support for vHost IOMMU
>>
>>  v2.8.0 - 31 Aug 2017
>>  --------------------
>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
>> 100644
>> --- a/lib/dpdk-stub.c
>> +++ b/lib/dpdk-stub.c
>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>>      return NULL;
>>  }
>> +
>> +bool
>> +dpdk_vhost_iommu_enabled(void)
>> +{
>> +    return false;
>> +}
>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>> index 8da6c32..6710d10 100644
>> --- a/lib/dpdk.c
>> +++ b/lib/dpdk.c
>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
>> */
>>
>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>> */
>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
>> +support */
>>
>>  static int
>>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>          vhost_sock_dir = sock_dir_subcomponent;
>>      }
>>
>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>> +                                        "vhost-iommu-support", false);
>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>> +
>>      argv = grow_argv(&argv, 0, 1);
>>      argc = 1;
>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
>> dpdk_get_vhost_sock_dir(void)
>>      return vhost_sock_dir;
>>  }
>>
>> +bool
>> +dpdk_vhost_iommu_enabled(void)
>> +{
>> +    return vhost_iommu_enabled;
>> +}
>> +
>>  void
>>  dpdk_set_lcore_id(unsigned cpu)
>>  {
>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>> index 673a1f1..dc58d96 100644
>> --- a/lib/dpdk.h
>> +++ b/lib/dpdk.h
>> @@ -17,6 +17,8 @@
>>  #ifndef DPDK_H
>>  #define DPDK_H
>>
>> +#include <stdbool.h>
>> +
>>  #ifdef DPDK_NETDEV
>>
>>  #include <rte_config.h>
>> @@ -35,5 +37,6 @@ struct smap;
>>  void dpdk_init(const struct smap *ovs_other_config);  void
>> dpdk_set_lcore_id(unsigned cpu);  const char
>> *dpdk_get_vhost_sock_dir(void);
>> +bool dpdk_vhost_iommu_enabled(void);
>>
>>  #endif /* dpdk.h */
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
>> 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>> *netdev)  {
>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>      int err;
>> +    uint64_t vhost_flags = 0;
>>
>>      ovs_mutex_lock(&dev->mutex);
>>
>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>> *netdev)
>>       */
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>              && strlen(dev->vhost_id)) {
>> -        /* Register client-mode device */
>> -        err = rte_vhost_driver_register(dev->vhost_id,
>> -                                        RTE_VHOST_USER_CLIENT);
>> +        /* Register client-mode device. */
>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>> +
>> +        /* Enable IOMMU support, if explicitly requested. */
>> +        if (dpdk_vhost_iommu_enabled()) {
>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +        }
>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>          if (err) {
>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>                       dev->vhost_id);
>>              goto unlock;
>>          } else {
>>              /* Configuration successful */
>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>> +            dev->vhost_driver_flags |= vhost_flags;
>>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>>                        "using client socket '%s'",
>>                        dev->up.name, dev->vhost_id); diff --git
>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
>> 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -344,6 +344,21 @@
>>          </p>
>>        </column>
>>
>> +      <column name="other_config" key="vhost-iommu-support"
>> +              type='{"type": "boolean"}'>
>> +        <p>
>> +          vHost IOMMU is a  security feature, which restricts the vhost
>> memory
>> +          that a virtio device may access. vHost IOMMU support is
>> disabled by
>> +          default, due to a bug in QEMU implementations of the vhost
>> REPLY_ACK
>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
>> Setting this
>> +          value to <code>true</code> enables vHost IOMMU support for
>> vHost User
>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>> +        </p>
>> +        <p>
>> +          Changing this value requires restarting the daemon.
>> +        </p>
>> +      </column>
>> +
>>        <column name="other_config" key="n-handler-threads"
>>                type='{"type": "integer", "minInteger": 1}'>
>>          <p>
>> --
>> 2.7.4
Mark Kavanagh Dec. 7, 2017, 11:32 a.m. UTC | #4
Yuanhan's old email address was used in the previous mail - updated correctly here.
-Mark

>-----Original Message-----
>From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-bounces@openvswitch.org]
>On Behalf Of Kavanagh, Mark B
>Sent: Thursday, December 7, 2017 11:24 AM
>To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
>Cc: Liu, Yuanhan <yuanhan.liu@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
>Mcnamara, John <john.mcnamara@intel.com>; maxime.coquelin@redhat.com;
>i.maximets@samsung.com
>Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>
>Hi folks,
>
>Thanks for all of your respective reviews and testing of this patchset.
>
>It seems, however, that an issue in DPDK v17.11 has come to light that affects
>guests which use testpmd.
>Specifically, traffic does not reach a guest when traffic is live prior to
>kicking off the testpmd app within said guest, and at least two forwarding
>cores are used. [1]
>
>This is explained in additional detail is [2], an extract of which is
>reproduced below:
>
>	"the vector Rx could be broken if backend has consumed all the avail
>descs before the
>	 device is started. Because in current implementation, the vector Rx will
>return immediately
> 	 without refilling the avail ring if the used ring is empty. So we have
>to refill
>	 the avail ring after flushing the elements in the used ring."
>
>This issue was initially uncovered by Antonio Fischetti, as part of the 17.11
>patchset validation, and has since been localized to DPDK, rather than OvS.
>As a result, it seems now that we should not move to 17.11, but seek an out-
>of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on this -
>should we:
>	a) move to 17.11 now, note the issue above in the 'errata' section of the
>documentation, and move to 17.11.1 when it becomes available in February of
>next year
>	b) request the early release of the 17.11.1 stable branch, which
>incorporates a fix for this issue (the possibility, and availability, of which
>are both TBD).
>
>Thanks in advance,
>Mark
>
>[1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
>[2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
>
>
>>-----Original Message-----
>>From: Stokes, Ian
>>Sent: Thursday, December 7, 2017 8:43 AM
>>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>>Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com; i.maximets@samsung.com;
>>jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>
>>Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>
>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>> This is a security feature, which restricts the vhost memory that a virtio
>>> device may access.
>>
>>Hi all,
>>
>>There were a few requests for changes in the v4 of this patch and it's an
>>important aspect of the DPDK 17.11 support for OVS.
>>
>>I'd like to get this series in to the pull request this week. If people have
>>flagged issues for the latest revision I'd appreciate if they could review
>the
>>latest revision and flag any new issues that need to be raised.
>>
>>Ian
>>
>>>
>>> This feature also enables the vhost REPLY_ACK protocol, the implementation
>>> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but is
>>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the feature
>>> is disabled by default in (and should remain so), for the aforementioned
>>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>>> safely be enabled, even without having an IOMMU device, with no
>>> performance penalty.
>>>
>>> This patch adds a new global config option, vhost-iommu-support, that
>>> controls enablement of the vhost IOMMU feature:
>>>
>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>>
>>> This value defaults to false; to enable IOMMU support, this field should
>>> be set to true when setting other global parameters on init (such as
>>> "dpdk-socket-mem", for example). Changing the value at runtime is not
>>> supported, and requires restarting the vswitch daemon.
>>>
>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>>
>>> ---
>>>
>>> v4:  - rebase to HEAD of master
>>>      - clarify that IOMMU support applies exclusively to vhost user
>>>        client ports.
>>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>>        process_vhost_flags().
>>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>>      - move stdbool.h #include outside of DPDK compiler guards.
>>>      - remove mention of vhost IOMMU mode from
>>>        netdev_dpdk_vhost_client_configure() log.
>>>
>>> v3:  - erroneous; disregard.
>>>
>>> v2:  - rebase to HEAD of master
>>>      - refactor vHost IOMMU enablement mechanism (use a global
>>>        config option, instead of the previous per-port approach).
>>> ---
>>>  Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
>>>  NEWS                                     |  1 +
>>>  lib/dpdk-stub.c                          |  6 ++++++
>>>  lib/dpdk.c                               | 12 ++++++++++++
>>>  lib/dpdk.h                               |  3 +++
>>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>>> b/Documentation/topics/dpdk/vhost-user.rst
>>> index 5347995..ffef653 100644
>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
>>> vHost ports to 'reconnect' in  event of the switch crashing or being
>>> brought down. Once it is brought back up,  the vHost ports will reconnect
>>> automatically and normal service will resume.
>>>
>>> +vhost-user-client IOMMU Support
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>>> +device can access, and as such is useful in deployments in which security
>>> is a concern.
>>> +
>>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>>> support```.
>>> +Setting this to true enables vhost IOMMU support for all vhost ports
>>> +when/where
>>> +available::
>>> +
>>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>>> +
>>> +The default value is false.
>>> +
>>> +.. important::
>>> +
>>> +    Changing this value requires restarting the daemon.
>>> +
>>> +.. important::
>>> +
>>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>>> protocol;
>>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
>>> disabled by
>>> +    default (and should remain so if using the aforementioned versions of
>>> QEMU).
>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>>> even
>>> +    without having an IOMMU device, with no performance penalty.
>>> +
>>>  .. _dpdk-testpmd:
>>>
>>>  DPDK in the Guest
>>> diff --git a/NEWS b/NEWS
>>> index d4a1c9a..99c47d8 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>>     - DPDK:
>>>       * Add support for DPDK v17.11
>>> +     * Add support for vHost IOMMU
>>>
>>>  v2.8.0 - 31 Aug 2017
>>>  --------------------
>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
>>> 100644
>>> --- a/lib/dpdk-stub.c
>>> +++ b/lib/dpdk-stub.c
>>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>>>      return NULL;
>>>  }
>>> +
>>> +bool
>>> +dpdk_vhost_iommu_enabled(void)
>>> +{
>>> +    return false;
>>> +}
>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>> index 8da6c32..6710d10 100644
>>> --- a/lib/dpdk.c
>>> +++ b/lib/dpdk.c
>>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
>>> */
>>>
>>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>>> */
>>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
>>> +support */
>>>
>>>  static int
>>>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
>>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>          vhost_sock_dir = sock_dir_subcomponent;
>>>      }
>>>
>>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>>> +                                        "vhost-iommu-support", false);
>>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>>> +
>>>      argv = grow_argv(&argv, 0, 1);
>>>      argc = 1;
>>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
>>> dpdk_get_vhost_sock_dir(void)
>>>      return vhost_sock_dir;
>>>  }
>>>
>>> +bool
>>> +dpdk_vhost_iommu_enabled(void)
>>> +{
>>> +    return vhost_iommu_enabled;
>>> +}
>>> +
>>>  void
>>>  dpdk_set_lcore_id(unsigned cpu)
>>>  {
>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>>> index 673a1f1..dc58d96 100644
>>> --- a/lib/dpdk.h
>>> +++ b/lib/dpdk.h
>>> @@ -17,6 +17,8 @@
>>>  #ifndef DPDK_H
>>>  #define DPDK_H
>>>
>>> +#include <stdbool.h>
>>> +
>>>  #ifdef DPDK_NETDEV
>>>
>>>  #include <rte_config.h>
>>> @@ -35,5 +37,6 @@ struct smap;
>>>  void dpdk_init(const struct smap *ovs_other_config);  void
>>> dpdk_set_lcore_id(unsigned cpu);  const char
>>> *dpdk_get_vhost_sock_dir(void);
>>> +bool dpdk_vhost_iommu_enabled(void);
>>>
>>>  #endif /* dpdk.h */
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
>>> 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>> *netdev)  {
>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>      int err;
>>> +    uint64_t vhost_flags = 0;
>>>
>>>      ovs_mutex_lock(&dev->mutex);
>>>
>>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>> *netdev)
>>>       */
>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>>              && strlen(dev->vhost_id)) {
>>> -        /* Register client-mode device */
>>> -        err = rte_vhost_driver_register(dev->vhost_id,
>>> -                                        RTE_VHOST_USER_CLIENT);
>>> +        /* Register client-mode device. */
>>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>>> +
>>> +        /* Enable IOMMU support, if explicitly requested. */
>>> +        if (dpdk_vhost_iommu_enabled()) {
>>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>> +        }
>>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>>          if (err) {
>>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>>                       dev->vhost_id);
>>>              goto unlock;
>>>          } else {
>>>              /* Configuration successful */
>>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>>> +            dev->vhost_driver_flags |= vhost_flags;
>>>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>>>                        "using client socket '%s'",
>>>                        dev->up.name, dev->vhost_id); diff --git
>>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
>>> 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -344,6 +344,21 @@
>>>          </p>
>>>        </column>
>>>
>>> +      <column name="other_config" key="vhost-iommu-support"
>>> +              type='{"type": "boolean"}'>
>>> +        <p>
>>> +          vHost IOMMU is a  security feature, which restricts the vhost
>>> memory
>>> +          that a virtio device may access. vHost IOMMU support is
>>> disabled by
>>> +          default, due to a bug in QEMU implementations of the vhost
>>> REPLY_ACK
>>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
>>> Setting this
>>> +          value to <code>true</code> enables vHost IOMMU support for
>>> vHost User
>>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>>> +        </p>
>>> +        <p>
>>> +          Changing this value requires restarting the daemon.
>>> +        </p>
>>> +      </column>
>>> +
>>>        <column name="other_config" key="n-handler-threads"
>>>                type='{"type": "integer", "minInteger": 1}'>
>>>          <p>
>>> --
>>> 2.7.4
>
>_______________________________________________
>dev mailing list
>dev@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Mark Kavanagh Dec. 7, 2017, 11:33 a.m. UTC | #5
>From: Ilya Maximets [mailto:i.maximets@samsung.com]
>Sent: Thursday, December 7, 2017 9:49 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com;
>jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>; Stokes,
>Ian <ian.stokes@intel.com>
>Subject: Re: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>
>Looks good to me in general, but I prefer if you'll fix following
>checkpatch warnings:

Thanks for your comments Ilya - I'll address same in the next version of this patchset, depending on the outcome of this: https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341701.html 

Best regards,
Mark

>
>== Checking 2e9587f27add ("netdev-dpdk: vHost IOMMU support") ==
>WARNING: Line length is >79-characters long
>#52 FILE: Documentation/topics/dpdk/vhost-user.rst:280:
>can access, and as such is useful in deployments in which security is a
>concern.
>
>WARNING: Line length is >79-characters long
>#54 FILE: Documentation/topics/dpdk/vhost-user.rst:282:
>IOMMU support may be enabled via a global config value, ```vhost-iommu-
>support```.
>
>WARNING: Line length is >79-characters long
>#71 FILE: Documentation/topics/dpdk/vhost-user.rst:299:
>    default (and should remain so if using the aforementioned versions of
>QEMU).
>
>One more comment inline.
>
>Best regards, Ilya Maximets.
>
>On 04.12.2017 14:15, Mark Kavanagh wrote:
>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>> This is a security feature, which restricts the vhost memory
>> that a virtio device may access.
>>
>> This feature also enables the vhost REPLY_ACK protocol, the
>> implementation of which is known to work in newer versions of
>> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
>> v2.9.0, inclusive). As such, the feature is disabled by default
>> in (and should remain so), for the aforementioned older QEMU
>> verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>> safely be enabled, even without having an IOMMU device, with
>> no performance penalty.
>>
>> This patch adds a new global config option, vhost-iommu-support,
>> that controls enablement of the vhost IOMMU feature:
>>
>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>
>> This value defaults to false; to enable IOMMU support, this field
>> should be set to true when setting other global parameters on init
>> (such as "dpdk-socket-mem", for example). Changing the value at
>> runtime is not supported, and requires restarting the vswitch daemon.
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>
>> ---
>>
>> v4:  - rebase to HEAD of master
>>      - clarify that IOMMU support applies exclusively to vhost user
>>        client ports.
>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>        process_vhost_flags().
>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>      - move stdbool.h #include outside of DPDK compiler guards.
>>      - remove mention of vhost IOMMU mode from
>>        netdev_dpdk_vhost_client_configure() log.
>>
>> v3:  - erroneous; disregard.
>>
>> v2:  - rebase to HEAD of master
>>      - refactor vHost IOMMU enablement mechanism (use a global
>>        config option, instead of the previous per-port approach).
>> ---
>>  Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
>>  NEWS                                     |  1 +
>>  lib/dpdk-stub.c                          |  6 ++++++
>>  lib/dpdk.c                               | 12 ++++++++++++
>>  lib/dpdk.h                               |  3 +++
>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>b/Documentation/topics/dpdk/vhost-user.rst
>> index 5347995..ffef653 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for vHost
>ports to 'reconnect' in
>>  event of the switch crashing or being brought down. Once it is brought back
>up,
>>  the vHost ports will reconnect automatically and normal service will
>resume.
>>
>> +vhost-user-client IOMMU Support
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>device
>> +can access, and as such is useful in deployments in which security is a
>concern.
>> +
>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>support```.
>> +Setting this to true enables vhost IOMMU support for all vhost ports
>when/where
>> +available::
>> +
>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>
>Few spaces missing in above command.
>
>> +
>> +The default value is false.
>> +
>> +.. important::
>> +
>> +    Changing this value requires restarting the daemon.
>> +
>> +.. important::
>> +
>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>protocol;
>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled
>by
>> +    default (and should remain so if using the aforementioned versions of
>QEMU).
>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>even
>> +    without having an IOMMU device, with no performance penalty.
>> +
>>  .. _dpdk-testpmd:
>>
>>  DPDK in the Guest
>> diff --git a/NEWS b/NEWS
>> index d4a1c9a..99c47d8 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>     - DPDK:
>>       * Add support for DPDK v17.11
>> +     * Add support for vHost IOMMU
>>
>>  v2.8.0 - 31 Aug 2017
>>  --------------------
>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
>> index daef729..3602180 100644
>> --- a/lib/dpdk-stub.c
>> +++ b/lib/dpdk-stub.c
>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)
>>  {
>>      return NULL;
>>  }
>> +
>> +bool
>> +dpdk_vhost_iommu_enabled(void)
>> +{
>> +    return false;
>> +}
>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>> index 8da6c32..6710d10 100644
>> --- a/lib/dpdk.c
>> +++ b/lib/dpdk.c
>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
>>
>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support
>*/
>>
>>  static int
>>  process_vhost_flags(char *flag, const char *default_val, int size,
>> @@ -345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>          vhost_sock_dir = sock_dir_subcomponent;
>>      }
>>
>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>> +                                        "vhost-iommu-support", false);
>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>> +
>>      argv = grow_argv(&argv, 0, 1);
>>      argc = 1;
>>      argv[0] = xstrdup(ovs_get_program_name());
>> @@ -482,6 +488,12 @@ dpdk_get_vhost_sock_dir(void)
>>      return vhost_sock_dir;
>>  }
>>
>> +bool
>> +dpdk_vhost_iommu_enabled(void)
>> +{
>> +    return vhost_iommu_enabled;
>> +}
>> +
>>  void
>>  dpdk_set_lcore_id(unsigned cpu)
>>  {
>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>> index 673a1f1..dc58d96 100644
>> --- a/lib/dpdk.h
>> +++ b/lib/dpdk.h
>> @@ -17,6 +17,8 @@
>>  #ifndef DPDK_H
>>  #define DPDK_H
>>
>> +#include <stdbool.h>
>> +
>>  #ifdef DPDK_NETDEV
>>
>>  #include <rte_config.h>
>> @@ -35,5 +37,6 @@ struct smap;
>>  void dpdk_init(const struct smap *ovs_other_config);
>>  void dpdk_set_lcore_id(unsigned cpu);
>>  const char *dpdk_get_vhost_sock_dir(void);
>> +bool dpdk_vhost_iommu_enabled(void);
>>
>>  #endif /* dpdk.h */
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index f552444..9715c39 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>*netdev)
>>  {
>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>      int err;
>> +    uint64_t vhost_flags = 0;
>>
>>      ovs_mutex_lock(&dev->mutex);
>>
>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>*netdev)
>>       */
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>              && strlen(dev->vhost_id)) {
>> -        /* Register client-mode device */
>> -        err = rte_vhost_driver_register(dev->vhost_id,
>> -                                        RTE_VHOST_USER_CLIENT);
>> +        /* Register client-mode device. */
>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>> +
>> +        /* Enable IOMMU support, if explicitly requested. */
>> +        if (dpdk_vhost_iommu_enabled()) {
>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +        }
>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>          if (err) {
>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>                       dev->vhost_id);
>>              goto unlock;
>>          } else {
>>              /* Configuration successful */
>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>> +            dev->vhost_driver_flags |= vhost_flags;
>>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>>                        "using client socket '%s'",
>>                        dev->up.name, dev->vhost_id);
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index c145e1a..6c6df50 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -344,6 +344,21 @@
>>          </p>
>>        </column>
>>
>> +      <column name="other_config" key="vhost-iommu-support"
>> +              type='{"type": "boolean"}'>
>> +        <p>
>> +          vHost IOMMU is a  security feature, which restricts the vhost
>memory
>> +          that a virtio device may access. vHost IOMMU support is disabled
>by
>> +          default, due to a bug in QEMU implementations of the vhost
>REPLY_ACK
>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting
>this
>> +          value to <code>true</code> enables vHost IOMMU support for vHost
>User
>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>> +        </p>
>> +        <p>
>> +          Changing this value requires restarting the daemon.
>> +        </p>
>> +      </column>
>> +
>>        <column name="other_config" key="n-handler-threads"
>>                type='{"type": "integer", "minInteger": 1}'>
>>          <p>
>>
Stokes, Ian Dec. 7, 2017, 11:40 a.m. UTC | #6
> Hi folks,
> 
> Thanks for all of your respective reviews and testing of this patchset.
> 
> It seems, however, that an issue in DPDK v17.11 has come to light that
> affects guests which use testpmd.
> Specifically, traffic does not reach a guest when traffic is live prior to
> kicking off the testpmd app within said guest, and at least two forwarding
> cores are used. [1]
> 
> This is explained in additional detail is [2], an extract of which is
> reproduced below:
> 
> 	"the vector Rx could be broken if backend has consumed all the avail
> descs before the
> 	 device is started. Because in current implementation, the vector Rx
> will return immediately
>  	 without refilling the avail ring if the used ring is empty. So we
> have to refill
> 	 the avail ring after flushing the elements in the used ring."
> 

Thanks for flagging this Mark.

> This issue was initially uncovered by Antonio Fischetti, as part of the
> 17.11 patchset validation, and has since been localized to DPDK, rather
> than OvS.
> As a result, it seems now that we should not move to 17.11, but seek an
> out-of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on
> this - should we:
> 	a) move to 17.11 now, note the issue above in the 'errata' section of
> the documentation, and move to 17.11.1 when it becomes available in
> February of next year

I'm not sure this would work, this seems like a pretty common use case, it would be quite an important errata for users to be aware of.

> 	b) request the early release of the 17.11.1 stable branch, which
> incorporates a fix for this issue (the possibility, and availability, of
> which are both TBD).

+1 for the above option. I'd like to see this issue resolved before moving but I'm open to others input.

Thanks
Ian
> 
> Thanks in advance,
> Mark
> 
> [1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
> [2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
> 
> 
> >-----Original Message-----
> >From: Stokes, Ian
> >Sent: Thursday, December 7, 2017 8:43 AM
> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
> >Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com;
> i.maximets@samsung.com;
> >jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>
> >Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
> >
> >> DPDK v17.11 introduces support for the vHost IOMMU feature.
> >> This is a security feature, which restricts the vhost memory that a
> virtio
> >> device may access.
> >
> >Hi all,
> >
> >There were a few requests for changes in the v4 of this patch and it's an
> >important aspect of the DPDK 17.11 support for OVS.
> >
> >I'd like to get this series in to the pull request this week. If people
> have
> >flagged issues for the latest revision I'd appreciate if they could
> review the
> >latest revision and flag any new issues that need to be raised.
> >
> >Ian
> >
> >>
> >> This feature also enables the vhost REPLY_ACK protocol, the
> implementation
> >> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but
> is
> >> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the
> feature
> >> is disabled by default in (and should remain so), for the
> aforementioned
> >> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
> >> safely be enabled, even without having an IOMMU device, with no
> >> performance penalty.
> >>
> >> This patch adds a new global config option, vhost-iommu-support, that
> >> controls enablement of the vhost IOMMU feature:
> >>
> >>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
> >>
> >> This value defaults to false; to enable IOMMU support, this field
> should
> >> be set to true when setting other global parameters on init (such as
> >> "dpdk-socket-mem", for example). Changing the value at runtime is not
> >> supported, and requires restarting the vswitch daemon.
> >>
> >> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >>
> >> ---
> >>
> >> v4:  - rebase to HEAD of master
> >>      - clarify that IOMMU support applies exclusively to vhost user
> >>        client ports.
> >>      - reword caveats regarding modifying vhost-iommu-support at
> runtime
> >>      - use smap_get_bool() to parse vhost-iommu-support, instead of
> >>        process_vhost_flags().
> >>      - add stub implementation of dpdk_vhost_iommu_enabled().
> >>      - move stdbool.h #include outside of DPDK compiler guards.
> >>      - remove mention of vhost IOMMU mode from
> >>        netdev_dpdk_vhost_client_configure() log.
> >>
> >> v3:  - erroneous; disregard.
> >>
> >> v2:  - rebase to HEAD of master
> >>      - refactor vHost IOMMU enablement mechanism (use a global
> >>        config option, instead of the previous per-port approach).
> >> ---
> >>  Documentation/topics/dpdk/vhost-user.rst | 27
> +++++++++++++++++++++++++++
> >>  NEWS                                     |  1 +
> >>  lib/dpdk-stub.c                          |  6 ++++++
> >>  lib/dpdk.c                               | 12 ++++++++++++
> >>  lib/dpdk.h                               |  3 +++
> >>  lib/netdev-dpdk.c                        | 14 ++++++++++----
> >>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
> >>  7 files changed, 74 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/Documentation/topics/dpdk/vhost-user.rst
> >> b/Documentation/topics/dpdk/vhost-user.rst
> >> index 5347995..ffef653 100644
> >> --- a/Documentation/topics/dpdk/vhost-user.rst
> >> +++ b/Documentation/topics/dpdk/vhost-user.rst
> >> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
> >> vHost ports to 'reconnect' in  event of the switch crashing or being
> >> brought down. Once it is brought back up,  the vHost ports will
> reconnect
> >> automatically and normal service will resume.
> >>
> >> +vhost-user-client IOMMU Support
> >> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> +
> >> +vhost IOMMU is a feature which restricts the vhost memory that a
> virtio
> >> +device can access, and as such is useful in deployments in which
> security
> >> is a concern.
> >> +
> >> +IOMMU support may be enabled via a global config value, ```vhost-
> iommu-
> >> support```.
> >> +Setting this to true enables vhost IOMMU support for all vhost ports
> >> +when/where
> >> +available::
> >> +
> >> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
> >> +
> >> +The default value is false.
> >> +
> >> +.. important::
> >> +
> >> +    Changing this value requires restarting the daemon.
> >> +
> >> +.. important::
> >> +
> >> +    Enabling the IOMMU feature also enables the vhost user reply-ack
> >> protocol;
> >> +    this is known to work on QEMU v2.10.0, but is buggy on older
> versions
> >> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
> >> disabled by
> >> +    default (and should remain so if using the aforementioned versions
> of
> >> QEMU).
> >> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be
> enabled,
> >> even
> >> +    without having an IOMMU device, with no performance penalty.
> >> +
> >>  .. _dpdk-testpmd:
> >>
> >>  DPDK in the Guest
> >> diff --git a/NEWS b/NEWS
> >> index d4a1c9a..99c47d8 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -15,6 +15,7 @@ Post-v2.8.0
> >>       * Add support for compiling OVS with the latest Linux 4.13 kernel
> >>     - DPDK:
> >>       * Add support for DPDK v17.11
> >> +     * Add support for vHost IOMMU
> >>
> >>  v2.8.0 - 31 Aug 2017
> >>  --------------------
> >> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
> >> 100644
> >> --- a/lib/dpdk-stub.c
> >> +++ b/lib/dpdk-stub.c
> >> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
> >>      return NULL;
> >>  }
> >> +
> >> +bool
> >> +dpdk_vhost_iommu_enabled(void)
> >> +{
> >> +    return false;
> >> +}
> >> diff --git a/lib/dpdk.c b/lib/dpdk.c
> >> index 8da6c32..6710d10 100644
> >> --- a/lib/dpdk.c
> >> +++ b/lib/dpdk.c
> >> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
> >>  static FILE *log_stream = NULL;       /* Stream for DPDK log
> redirection
> >> */
> >>
> >>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user
> sockets
> >> */
> >> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
> >> +support */
> >>
> >>  static int
> >>  process_vhost_flags(char *flag, const char *default_val, int size, @@
> -
> >> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
> >>          vhost_sock_dir = sock_dir_subcomponent;
> >>      }
> >>
> >> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
> >> +                                        "vhost-iommu-support", false);
> >> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
> >> +               vhost_iommu_enabled ? "enabled" : "disabled");
> >> +
> >>      argv = grow_argv(&argv, 0, 1);
> >>      argc = 1;
> >>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
> >> dpdk_get_vhost_sock_dir(void)
> >>      return vhost_sock_dir;
> >>  }
> >>
> >> +bool
> >> +dpdk_vhost_iommu_enabled(void)
> >> +{
> >> +    return vhost_iommu_enabled;
> >> +}
> >> +
> >>  void
> >>  dpdk_set_lcore_id(unsigned cpu)
> >>  {
> >> diff --git a/lib/dpdk.h b/lib/dpdk.h
> >> index 673a1f1..dc58d96 100644
> >> --- a/lib/dpdk.h
> >> +++ b/lib/dpdk.h
> >> @@ -17,6 +17,8 @@
> >>  #ifndef DPDK_H
> >>  #define DPDK_H
> >>
> >> +#include <stdbool.h>
> >> +
> >>  #ifdef DPDK_NETDEV
> >>
> >>  #include <rte_config.h>
> >> @@ -35,5 +37,6 @@ struct smap;
> >>  void dpdk_init(const struct smap *ovs_other_config);  void
> >> dpdk_set_lcore_id(unsigned cpu);  const char
> >> *dpdk_get_vhost_sock_dir(void);
> >> +bool dpdk_vhost_iommu_enabled(void);
> >>
> >>  #endif /* dpdk.h */
> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> f552444..9715c39
> >> 100644
> >> --- a/lib/netdev-dpdk.c
> >> +++ b/lib/netdev-dpdk.c
> >> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct
> netdev
> >> *netdev)  {
> >>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >>      int err;
> >> +    uint64_t vhost_flags = 0;
> >>
> >>      ovs_mutex_lock(&dev->mutex);
> >>
> >> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct
> netdev
> >> *netdev)
> >>       */
> >>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
> >>              && strlen(dev->vhost_id)) {
> >> -        /* Register client-mode device */
> >> -        err = rte_vhost_driver_register(dev->vhost_id,
> >> -                                        RTE_VHOST_USER_CLIENT);
> >> +        /* Register client-mode device. */
> >> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
> >> +
> >> +        /* Enable IOMMU support, if explicitly requested. */
> >> +        if (dpdk_vhost_iommu_enabled()) {
> >> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> >> +        }
> >> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
> >>          if (err) {
> >>              VLOG_ERR("vhost-user device setup failure for device
> %s\n",
> >>                       dev->vhost_id);
> >>              goto unlock;
> >>          } else {
> >>              /* Configuration successful */
> >> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
> >> +            dev->vhost_driver_flags |= vhost_flags;
> >>              VLOG_INFO("vHost User device '%s' created in 'client'
> mode, "
> >>                        "using client socket '%s'",
> >>                        dev->up.name, dev->vhost_id); diff --git
> >> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
> >> 100644
> >> --- a/vswitchd/vswitch.xml
> >> +++ b/vswitchd/vswitch.xml
> >> @@ -344,6 +344,21 @@
> >>          </p>
> >>        </column>
> >>
> >> +      <column name="other_config" key="vhost-iommu-support"
> >> +              type='{"type": "boolean"}'>
> >> +        <p>
> >> +          vHost IOMMU is a  security feature, which restricts the
> vhost
> >> memory
> >> +          that a virtio device may access. vHost IOMMU support is
> >> disabled by
> >> +          default, due to a bug in QEMU implementations of the vhost
> >> REPLY_ACK
> >> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
> >> Setting this
> >> +          value to <code>true</code> enables vHost IOMMU support for
> >> vHost User
> >> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
> >> +        </p>
> >> +        <p>
> >> +          Changing this value requires restarting the daemon.
> >> +        </p>
> >> +      </column>
> >> +
> >>        <column name="other_config" key="n-handler-threads"
> >>                type='{"type": "integer", "minInteger": 1}'>
> >>          <p>
> >> --
> >> 2.7.4
Ilya Maximets Dec. 7, 2017, 12:19 p.m. UTC | #7
On 07.12.2017 14:32, Kavanagh, Mark B wrote:
> Yuanhan's old email address was used in the previous mail - updated correctly here.
> -Mark
> 
>> -----Original Message-----
>> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-bounces@openvswitch.org]
>> On Behalf Of Kavanagh, Mark B
>> Sent: Thursday, December 7, 2017 11:24 AM
>> To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
>> Cc: Liu, Yuanhan <yuanhan.liu@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
>> Mcnamara, John <john.mcnamara@intel.com>; maxime.coquelin@redhat.com;
>> i.maximets@samsung.com
>> Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>
>> Hi folks,
>>
>> Thanks for all of your respective reviews and testing of this patchset.
>>
>> It seems, however, that an issue in DPDK v17.11 has come to light that affects
>> guests which use testpmd.
>> Specifically, traffic does not reach a guest when traffic is live prior to
>> kicking off the testpmd app within said guest, and at least two forwarding
>> cores are used. [1]
>>
>> This is explained in additional detail is [2], an extract of which is
>> reproduced below:
>>
>> 	"the vector Rx could be broken if backend has consumed all the avail
>> descs before the
>> 	 device is started. Because in current implementation, the vector Rx will
>> return immediately
>> 	 without refilling the avail ring if the used ring is empty. So we have
>> to refill
>> 	 the avail ring after flushing the elements in the used ring."
>>
>> This issue was initially uncovered by Antonio Fischetti, as part of the 17.11
>> patchset validation, and has since been localized to DPDK, rather than OvS.
>> As a result, it seems now that we should not move to 17.11, but seek an out-
>> of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on this -
>> should we:
>> 	a) move to 17.11 now, note the issue above in the 'errata' section of the
>> documentation, and move to 17.11.1 when it becomes available in February of
>> next year
>> 	b) request the early release of the 17.11.1 stable branch, which
>> incorporates a fix for this issue (the possibility, and availability, of which
>> are both TBD).
>>
>> Thanks in advance,
>> Mark

Hmm. Isn't it a guest driver issue? Do we need to care so much about running
OVS inside the VM? If I assumed right, if we're running OVS not inside the VM,
there will be no issues on the OVS side. The issue happens if guest application
uses DPDK 17.11, but it will happen regardless of the DPDK version on the host
side.

So, the only bad scenario is running OVS inside the VM with virtio-net PMD driver.
My question is: Do we need to care about this scenario so much?
If the answer is "not really" then we can use variant a).
But if it's very important thing we need to support than b) will be preferable.

Am I missing something?

Best regards, Ilya Maximets.


>>
>> [1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
>> [2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
>>
>>
>>> -----Original Message-----
>>> From: Stokes, Ian
>>> Sent: Thursday, December 7, 2017 8:43 AM
>>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>>> Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com; i.maximets@samsung.com;
>>> jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>
>>> Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>
>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>>> This is a security feature, which restricts the vhost memory that a virtio
>>>> device may access.
>>>
>>> Hi all,
>>>
>>> There were a few requests for changes in the v4 of this patch and it's an
>>> important aspect of the DPDK 17.11 support for OVS.
>>>
>>> I'd like to get this series in to the pull request this week. If people have
>>> flagged issues for the latest revision I'd appreciate if they could review
>> the
>>> latest revision and flag any new issues that need to be raised.
>>>
>>> Ian
>>>
>>>>
>>>> This feature also enables the vhost REPLY_ACK protocol, the implementation
>>>> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but is
>>>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the feature
>>>> is disabled by default in (and should remain so), for the aforementioned
>>>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>>>> safely be enabled, even without having an IOMMU device, with no
>>>> performance penalty.
>>>>
>>>> This patch adds a new global config option, vhost-iommu-support, that
>>>> controls enablement of the vhost IOMMU feature:
>>>>
>>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>>>
>>>> This value defaults to false; to enable IOMMU support, this field should
>>>> be set to true when setting other global parameters on init (such as
>>>> "dpdk-socket-mem", for example). Changing the value at runtime is not
>>>> supported, and requires restarting the vswitch daemon.
>>>>
>>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>>>
>>>> ---
>>>>
>>>> v4:  - rebase to HEAD of master
>>>>      - clarify that IOMMU support applies exclusively to vhost user
>>>>        client ports.
>>>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>>>        process_vhost_flags().
>>>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>>>      - move stdbool.h #include outside of DPDK compiler guards.
>>>>      - remove mention of vhost IOMMU mode from
>>>>        netdev_dpdk_vhost_client_configure() log.
>>>>
>>>> v3:  - erroneous; disregard.
>>>>
>>>> v2:  - rebase to HEAD of master
>>>>      - refactor vHost IOMMU enablement mechanism (use a global
>>>>        config option, instead of the previous per-port approach).
>>>> ---
>>>>  Documentation/topics/dpdk/vhost-user.rst | 27 +++++++++++++++++++++++++++
>>>>  NEWS                                     |  1 +
>>>>  lib/dpdk-stub.c                          |  6 ++++++
>>>>  lib/dpdk.c                               | 12 ++++++++++++
>>>>  lib/dpdk.h                               |  3 +++
>>>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>>>> b/Documentation/topics/dpdk/vhost-user.rst
>>>> index 5347995..ffef653 100644
>>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
>>>> vHost ports to 'reconnect' in  event of the switch crashing or being
>>>> brought down. Once it is brought back up,  the vHost ports will reconnect
>>>> automatically and normal service will resume.
>>>>
>>>> +vhost-user-client IOMMU Support
>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> +
>>>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>>>> +device can access, and as such is useful in deployments in which security
>>>> is a concern.
>>>> +
>>>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>>>> support```.
>>>> +Setting this to true enables vhost IOMMU support for all vhost ports
>>>> +when/where
>>>> +available::
>>>> +
>>>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>>>> +
>>>> +The default value is false.
>>>> +
>>>> +.. important::
>>>> +
>>>> +    Changing this value requires restarting the daemon.
>>>> +
>>>> +.. important::
>>>> +
>>>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>>>> protocol;
>>>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
>>>> disabled by
>>>> +    default (and should remain so if using the aforementioned versions of
>>>> QEMU).
>>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>>>> even
>>>> +    without having an IOMMU device, with no performance penalty.
>>>> +
>>>>  .. _dpdk-testpmd:
>>>>
>>>>  DPDK in the Guest
>>>> diff --git a/NEWS b/NEWS
>>>> index d4a1c9a..99c47d8 100644
>>>> --- a/NEWS
>>>> +++ b/NEWS
>>>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>>>     - DPDK:
>>>>       * Add support for DPDK v17.11
>>>> +     * Add support for vHost IOMMU
>>>>
>>>>  v2.8.0 - 31 Aug 2017
>>>>  --------------------
>>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
>>>> 100644
>>>> --- a/lib/dpdk-stub.c
>>>> +++ b/lib/dpdk-stub.c
>>>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>>>>      return NULL;
>>>>  }
>>>> +
>>>> +bool
>>>> +dpdk_vhost_iommu_enabled(void)
>>>> +{
>>>> +    return false;
>>>> +}
>>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>>> index 8da6c32..6710d10 100644
>>>> --- a/lib/dpdk.c
>>>> +++ b/lib/dpdk.c
>>>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
>>>> */
>>>>
>>>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>>>> */
>>>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
>>>> +support */
>>>>
>>>>  static int
>>>>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
>>>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>>          vhost_sock_dir = sock_dir_subcomponent;
>>>>      }
>>>>
>>>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>>>> +                                        "vhost-iommu-support", false);
>>>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>>>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>>>> +
>>>>      argv = grow_argv(&argv, 0, 1);
>>>>      argc = 1;
>>>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
>>>> dpdk_get_vhost_sock_dir(void)
>>>>      return vhost_sock_dir;
>>>>  }
>>>>
>>>> +bool
>>>> +dpdk_vhost_iommu_enabled(void)
>>>> +{
>>>> +    return vhost_iommu_enabled;
>>>> +}
>>>> +
>>>>  void
>>>>  dpdk_set_lcore_id(unsigned cpu)
>>>>  {
>>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>>>> index 673a1f1..dc58d96 100644
>>>> --- a/lib/dpdk.h
>>>> +++ b/lib/dpdk.h
>>>> @@ -17,6 +17,8 @@
>>>>  #ifndef DPDK_H
>>>>  #define DPDK_H
>>>>
>>>> +#include <stdbool.h>
>>>> +
>>>>  #ifdef DPDK_NETDEV
>>>>
>>>>  #include <rte_config.h>
>>>> @@ -35,5 +37,6 @@ struct smap;
>>>>  void dpdk_init(const struct smap *ovs_other_config);  void
>>>> dpdk_set_lcore_id(unsigned cpu);  const char
>>>> *dpdk_get_vhost_sock_dir(void);
>>>> +bool dpdk_vhost_iommu_enabled(void);
>>>>
>>>>  #endif /* dpdk.h */
>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
>>>> 100644
>>>> --- a/lib/netdev-dpdk.c
>>>> +++ b/lib/netdev-dpdk.c
>>>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>>> *netdev)  {
>>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>      int err;
>>>> +    uint64_t vhost_flags = 0;
>>>>
>>>>      ovs_mutex_lock(&dev->mutex);
>>>>
>>>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>>> *netdev)
>>>>       */
>>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>>>              && strlen(dev->vhost_id)) {
>>>> -        /* Register client-mode device */
>>>> -        err = rte_vhost_driver_register(dev->vhost_id,
>>>> -                                        RTE_VHOST_USER_CLIENT);
>>>> +        /* Register client-mode device. */
>>>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>>>> +
>>>> +        /* Enable IOMMU support, if explicitly requested. */
>>>> +        if (dpdk_vhost_iommu_enabled()) {
>>>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>>> +        }
>>>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>>>          if (err) {
>>>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>>>                       dev->vhost_id);
>>>>              goto unlock;
>>>>          } else {
>>>>              /* Configuration successful */
>>>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>>>> +            dev->vhost_driver_flags |= vhost_flags;
>>>>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>>>>                        "using client socket '%s'",
>>>>                        dev->up.name, dev->vhost_id); diff --git
>>>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
>>>> 100644
>>>> --- a/vswitchd/vswitch.xml
>>>> +++ b/vswitchd/vswitch.xml
>>>> @@ -344,6 +344,21 @@
>>>>          </p>
>>>>        </column>
>>>>
>>>> +      <column name="other_config" key="vhost-iommu-support"
>>>> +              type='{"type": "boolean"}'>
>>>> +        <p>
>>>> +          vHost IOMMU is a  security feature, which restricts the vhost
>>>> memory
>>>> +          that a virtio device may access. vHost IOMMU support is
>>>> disabled by
>>>> +          default, due to a bug in QEMU implementations of the vhost
>>>> REPLY_ACK
>>>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
>>>> Setting this
>>>> +          value to <code>true</code> enables vHost IOMMU support for
>>>> vHost User
>>>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>>>> +        </p>
>>>> +        <p>
>>>> +          Changing this value requires restarting the daemon.
>>>> +        </p>
>>>> +      </column>
>>>> +
>>>>        <column name="other_config" key="n-handler-threads"
>>>>                type='{"type": "integer", "minInteger": 1}'>
>>>>          <p>
>>>> --
>>>> 2.7.4
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> 
>
Mark Kavanagh Dec. 7, 2017, 12:27 p.m. UTC | #8
>From: Ilya Maximets [mailto:i.maximets@samsung.com]
>Sent: Thursday, December 7, 2017 12:19 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Stokes, Ian
><ian.stokes@intel.com>; dev@openvswitch.org
>Cc: maxime.coquelin@redhat.com; jan.scheurich@ericsson.com; Mooney, Sean K
><sean.k.mooney@intel.com>; Fischetti, Antonio <antonio.fischetti@intel.com>;
>Bie, Tiwei <tiwei.bie@intel.com>; Mcnamara, John <john.mcnamara@intel.com>;
>Guoshuai Li <ligs@dtdream.com>; ktraynor@redhat.com; Loftus, Ciara
><ciara.loftus@intel.com>; Yuanhan Liu <yliu@fridaylinux.org>
>Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>
>On 07.12.2017 14:32, Kavanagh, Mark B wrote:
>> Yuanhan's old email address was used in the previous mail - updated
>correctly here.
>> -Mark
>>
>>> -----Original Message-----
>>> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
>bounces@openvswitch.org]
>>> On Behalf Of Kavanagh, Mark B
>>> Sent: Thursday, December 7, 2017 11:24 AM
>>> To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
>>> Cc: Liu, Yuanhan <yuanhan.liu@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
>>> Mcnamara, John <john.mcnamara@intel.com>; maxime.coquelin@redhat.com;
>>> i.maximets@samsung.com
>>> Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>
>>> Hi folks,
>>>
>>> Thanks for all of your respective reviews and testing of this patchset.
>>>
>>> It seems, however, that an issue in DPDK v17.11 has come to light that
>affects
>>> guests which use testpmd.
>>> Specifically, traffic does not reach a guest when traffic is live prior to
>>> kicking off the testpmd app within said guest, and at least two forwarding
>>> cores are used. [1]
>>>
>>> This is explained in additional detail is [2], an extract of which is
>>> reproduced below:
>>>
>>> 	"the vector Rx could be broken if backend has consumed all the avail
>>> descs before the
>>> 	 device is started. Because in current implementation, the vector Rx will
>>> return immediately
>>> 	 without refilling the avail ring if the used ring is empty. So we have
>>> to refill
>>> 	 the avail ring after flushing the elements in the used ring."
>>>
>>> This issue was initially uncovered by Antonio Fischetti, as part of the
>17.11
>>> patchset validation, and has since been localized to DPDK, rather than OvS.
>>> As a result, it seems now that we should not move to 17.11, but seek an
>out-
>>> of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on this -
>>> should we:
>>> 	a) move to 17.11 now, note the issue above in the 'errata' section of the
>>> documentation, and move to 17.11.1 when it becomes available in February of
>>> next year
>>> 	b) request the early release of the 17.11.1 stable branch, which
>>> incorporates a fix for this issue (the possibility, and availability, of
>which
>>> are both TBD).
>>>
>>> Thanks in advance,
>>> Mark
>
>Hmm. Isn't it a guest driver issue? Do we need to care so much about running
>OVS inside the VM? If I assumed right, if we're running OVS not inside the VM,
>there will be no issues on the OVS side. The issue happens if guest
>application
>uses DPDK 17.11, but it will happen regardless of the DPDK version on the host
>side.
>
>So, the only bad scenario is running OVS inside the VM with virtio-net PMD
>driver.
>My question is: Do we need to care about this scenario so much?
>If the answer is "not really" then we can use variant a).
>But if it's very important thing we need to support than b) will be
>preferable.
>
>Am I missing something?

Hey Ilya,

I've just had the exact same conversation with Sean Mooney locally.

I think the point that both yourself and Sean has made is completely valid, which puts option a) back on the table.

Antonio has agreed to test that this works; once he confirms I can push v5 of the 17.11 patchet, and simply document the guest DPDK issue.

Thanks again,
Mark


>
>Best regards, Ilya Maximets.
>
>
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
>>> [2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
>>>
>>>
>>>> -----Original Message-----
>>>> From: Stokes, Ian
>>>> Sent: Thursday, December 7, 2017 8:43 AM
>>>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>>>> Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com;
>i.maximets@samsung.com;
>>>> jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>
>>>> Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>>
>>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>>>> This is a security feature, which restricts the vhost memory that a
>virtio
>>>>> device may access.
>>>>
>>>> Hi all,
>>>>
>>>> There were a few requests for changes in the v4 of this patch and it's an
>>>> important aspect of the DPDK 17.11 support for OVS.
>>>>
>>>> I'd like to get this series in to the pull request this week. If people
>have
>>>> flagged issues for the latest revision I'd appreciate if they could review
>>> the
>>>> latest revision and flag any new issues that need to be raised.
>>>>
>>>> Ian
>>>>
>>>>>
>>>>> This feature also enables the vhost REPLY_ACK protocol, the
>implementation
>>>>> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but
>is
>>>>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the
>feature
>>>>> is disabled by default in (and should remain so), for the aforementioned
>>>>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>>>>> safely be enabled, even without having an IOMMU device, with no
>>>>> performance penalty.
>>>>>
>>>>> This patch adds a new global config option, vhost-iommu-support, that
>>>>> controls enablement of the vhost IOMMU feature:
>>>>>
>>>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>>>>
>>>>> This value defaults to false; to enable IOMMU support, this field should
>>>>> be set to true when setting other global parameters on init (such as
>>>>> "dpdk-socket-mem", for example). Changing the value at runtime is not
>>>>> supported, and requires restarting the vswitch daemon.
>>>>>
>>>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>>>>
>>>>> ---
>>>>>
>>>>> v4:  - rebase to HEAD of master
>>>>>      - clarify that IOMMU support applies exclusively to vhost user
>>>>>        client ports.
>>>>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>>>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>>>>        process_vhost_flags().
>>>>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>>>>      - move stdbool.h #include outside of DPDK compiler guards.
>>>>>      - remove mention of vhost IOMMU mode from
>>>>>        netdev_dpdk_vhost_client_configure() log.
>>>>>
>>>>> v3:  - erroneous; disregard.
>>>>>
>>>>> v2:  - rebase to HEAD of master
>>>>>      - refactor vHost IOMMU enablement mechanism (use a global
>>>>>        config option, instead of the previous per-port approach).
>>>>> ---
>>>>>  Documentation/topics/dpdk/vhost-user.rst | 27
>+++++++++++++++++++++++++++
>>>>>  NEWS                                     |  1 +
>>>>>  lib/dpdk-stub.c                          |  6 ++++++
>>>>>  lib/dpdk.c                               | 12 ++++++++++++
>>>>>  lib/dpdk.h                               |  3 +++
>>>>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>>>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>>>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>>>>> b/Documentation/topics/dpdk/vhost-user.rst
>>>>> index 5347995..ffef653 100644
>>>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>>>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
>>>>> vHost ports to 'reconnect' in  event of the switch crashing or being
>>>>> brought down. Once it is brought back up,  the vHost ports will reconnect
>>>>> automatically and normal service will resume.
>>>>>
>>>>> +vhost-user-client IOMMU Support
>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> +
>>>>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>>>>> +device can access, and as such is useful in deployments in which
>security
>>>>> is a concern.
>>>>> +
>>>>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>>>>> support```.
>>>>> +Setting this to true enables vhost IOMMU support for all vhost ports
>>>>> +when/where
>>>>> +available::
>>>>> +
>>>>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>>>>> +
>>>>> +The default value is false.
>>>>> +
>>>>> +.. important::
>>>>> +
>>>>> +    Changing this value requires restarting the daemon.
>>>>> +
>>>>> +.. important::
>>>>> +
>>>>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>>>>> protocol;
>>>>> +    this is known to work on QEMU v2.10.0, but is buggy on older
>versions
>>>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
>>>>> disabled by
>>>>> +    default (and should remain so if using the aforementioned versions
>of
>>>>> QEMU).
>>>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be
>enabled,
>>>>> even
>>>>> +    without having an IOMMU device, with no performance penalty.
>>>>> +
>>>>>  .. _dpdk-testpmd:
>>>>>
>>>>>  DPDK in the Guest
>>>>> diff --git a/NEWS b/NEWS
>>>>> index d4a1c9a..99c47d8 100644
>>>>> --- a/NEWS
>>>>> +++ b/NEWS
>>>>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>>>>     - DPDK:
>>>>>       * Add support for DPDK v17.11
>>>>> +     * Add support for vHost IOMMU
>>>>>
>>>>>  v2.8.0 - 31 Aug 2017
>>>>>  --------------------
>>>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
>>>>> 100644
>>>>> --- a/lib/dpdk-stub.c
>>>>> +++ b/lib/dpdk-stub.c
>>>>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>>>>>      return NULL;
>>>>>  }
>>>>> +
>>>>> +bool
>>>>> +dpdk_vhost_iommu_enabled(void)
>>>>> +{
>>>>> +    return false;
>>>>> +}
>>>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>>>> index 8da6c32..6710d10 100644
>>>>> --- a/lib/dpdk.c
>>>>> +++ b/lib/dpdk.c
>>>>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>>>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
>>>>> */
>>>>>
>>>>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>>>>> */
>>>>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
>>>>> +support */
>>>>>
>>>>>  static int
>>>>>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
>>>>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>>>          vhost_sock_dir = sock_dir_subcomponent;
>>>>>      }
>>>>>
>>>>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>>>>> +                                        "vhost-iommu-support", false);
>>>>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>>>>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>>>>> +
>>>>>      argv = grow_argv(&argv, 0, 1);
>>>>>      argc = 1;
>>>>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
>>>>> dpdk_get_vhost_sock_dir(void)
>>>>>      return vhost_sock_dir;
>>>>>  }
>>>>>
>>>>> +bool
>>>>> +dpdk_vhost_iommu_enabled(void)
>>>>> +{
>>>>> +    return vhost_iommu_enabled;
>>>>> +}
>>>>> +
>>>>>  void
>>>>>  dpdk_set_lcore_id(unsigned cpu)
>>>>>  {
>>>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>>>>> index 673a1f1..dc58d96 100644
>>>>> --- a/lib/dpdk.h
>>>>> +++ b/lib/dpdk.h
>>>>> @@ -17,6 +17,8 @@
>>>>>  #ifndef DPDK_H
>>>>>  #define DPDK_H
>>>>>
>>>>> +#include <stdbool.h>
>>>>> +
>>>>>  #ifdef DPDK_NETDEV
>>>>>
>>>>>  #include <rte_config.h>
>>>>> @@ -35,5 +37,6 @@ struct smap;
>>>>>  void dpdk_init(const struct smap *ovs_other_config);  void
>>>>> dpdk_set_lcore_id(unsigned cpu);  const char
>>>>> *dpdk_get_vhost_sock_dir(void);
>>>>> +bool dpdk_vhost_iommu_enabled(void);
>>>>>
>>>>>  #endif /* dpdk.h */
>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
>>>>> 100644
>>>>> --- a/lib/netdev-dpdk.c
>>>>> +++ b/lib/netdev-dpdk.c
>>>>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>>>> *netdev)  {
>>>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>>      int err;
>>>>> +    uint64_t vhost_flags = 0;
>>>>>
>>>>>      ovs_mutex_lock(&dev->mutex);
>>>>>
>>>>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct
>netdev
>>>>> *netdev)
>>>>>       */
>>>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>>>>              && strlen(dev->vhost_id)) {
>>>>> -        /* Register client-mode device */
>>>>> -        err = rte_vhost_driver_register(dev->vhost_id,
>>>>> -                                        RTE_VHOST_USER_CLIENT);
>>>>> +        /* Register client-mode device. */
>>>>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>>>>> +
>>>>> +        /* Enable IOMMU support, if explicitly requested. */
>>>>> +        if (dpdk_vhost_iommu_enabled()) {
>>>>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>>>> +        }
>>>>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>>>>          if (err) {
>>>>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>>>>                       dev->vhost_id);
>>>>>              goto unlock;
>>>>>          } else {
>>>>>              /* Configuration successful */
>>>>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>>>>> +            dev->vhost_driver_flags |= vhost_flags;
>>>>>              VLOG_INFO("vHost User device '%s' created in 'client' mode,
>"
>>>>>                        "using client socket '%s'",
>>>>>                        dev->up.name, dev->vhost_id); diff --git
>>>>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
>>>>> 100644
>>>>> --- a/vswitchd/vswitch.xml
>>>>> +++ b/vswitchd/vswitch.xml
>>>>> @@ -344,6 +344,21 @@
>>>>>          </p>
>>>>>        </column>
>>>>>
>>>>> +      <column name="other_config" key="vhost-iommu-support"
>>>>> +              type='{"type": "boolean"}'>
>>>>> +        <p>
>>>>> +          vHost IOMMU is a  security feature, which restricts the vhost
>>>>> memory
>>>>> +          that a virtio device may access. vHost IOMMU support is
>>>>> disabled by
>>>>> +          default, due to a bug in QEMU implementations of the vhost
>>>>> REPLY_ACK
>>>>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
>>>>> Setting this
>>>>> +          value to <code>true</code> enables vHost IOMMU support for
>>>>> vHost User
>>>>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>>>>> +        </p>
>>>>> +        <p>
>>>>> +          Changing this value requires restarting the daemon.
>>>>> +        </p>
>>>>> +      </column>
>>>>> +
>>>>>        <column name="other_config" key="n-handler-threads"
>>>>>                type='{"type": "integer", "minInteger": 1}'>
>>>>>          <p>
>>>>> --
>>>>> 2.7.4
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
>>
Fischetti, Antonio Dec. 7, 2017, 12:49 p.m. UTC | #9
I just checked there's no issue if we use dpdk v17.11 on the host
and dpdk 17.05.2 in the guest.

Antonio

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, December 7, 2017 12:28 PM
> To: Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian
> <ian.stokes@intel.com>; dev@openvswitch.org
> Cc: maxime.coquelin@redhat.com; jan.scheurich@ericsson.com; Mooney, Sean
> K <sean.k.mooney@intel.com>; Fischetti, Antonio
> <antonio.fischetti@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> Mcnamara, John <john.mcnamara@intel.com>; Guoshuai Li
> <ligs@dtdream.com>; ktraynor@redhat.com; Loftus, Ciara
> <ciara.loftus@intel.com>; Yuanhan Liu <yliu@fridaylinux.org>
> Subject: RE: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
> 
> >From: Ilya Maximets [mailto:i.maximets@samsung.com]
> >Sent: Thursday, December 7, 2017 12:19 PM
> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Stokes, Ian
> ><ian.stokes@intel.com>; dev@openvswitch.org
> >Cc: maxime.coquelin@redhat.com; jan.scheurich@ericsson.com; Mooney,
> Sean K
> ><sean.k.mooney@intel.com>; Fischetti, Antonio
> <antonio.fischetti@intel.com>;
> >Bie, Tiwei <tiwei.bie@intel.com>; Mcnamara, John
> <john.mcnamara@intel.com>;
> >Guoshuai Li <ligs@dtdream.com>; ktraynor@redhat.com; Loftus, Ciara
> ><ciara.loftus@intel.com>; Yuanhan Liu <yliu@fridaylinux.org>
> >Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
> >
> >On 07.12.2017 14:32, Kavanagh, Mark B wrote:
> >> Yuanhan's old email address was used in the previous mail - updated
> >correctly here.
> >> -Mark
> >>
> >>> -----Original Message-----
> >>> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> >bounces@openvswitch.org]
> >>> On Behalf Of Kavanagh, Mark B
> >>> Sent: Thursday, December 7, 2017 11:24 AM
> >>> To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
> >>> Cc: Liu, Yuanhan <yuanhan.liu@intel.com>; Bie, Tiwei
> <tiwei.bie@intel.com>;
> >>> Mcnamara, John <john.mcnamara@intel.com>;
> maxime.coquelin@redhat.com;
> >>> i.maximets@samsung.com
> >>> Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU
> support
> >>>
> >>> Hi folks,
> >>>
> >>> Thanks for all of your respective reviews and testing of this
> patchset.
> >>>
> >>> It seems, however, that an issue in DPDK v17.11 has come to light
> that
> >affects
> >>> guests which use testpmd.
> >>> Specifically, traffic does not reach a guest when traffic is live
> prior to
> >>> kicking off the testpmd app within said guest, and at least two
> forwarding
> >>> cores are used. [1]
> >>>
> >>> This is explained in additional detail is [2], an extract of which
> is
> >>> reproduced below:
> >>>
> >>> 	"the vector Rx could be broken if backend has consumed all the
> avail
> >>> descs before the
> >>> 	 device is started. Because in current implementation, the vector
> Rx will
> >>> return immediately
> >>> 	 without refilling the avail ring if the used ring is empty. So we
> have
> >>> to refill
> >>> 	 the avail ring after flushing the elements in the used ring."
> >>>
> >>> This issue was initially uncovered by Antonio Fischetti, as part of
> the
> >17.11
> >>> patchset validation, and has since been localized to DPDK, rather
> than OvS.
> >>> As a result, it seems now that we should not move to 17.11, but seek
> an
> >out-
> >>> of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on
> this -
> >>> should we:
> >>> 	a) move to 17.11 now, note the issue above in the 'errata' section
> of the
> >>> documentation, and move to 17.11.1 when it becomes available in
> February of
> >>> next year
> >>> 	b) request the early release of the 17.11.1 stable branch, which
> >>> incorporates a fix for this issue (the possibility, and
> availability, of
> >which
> >>> are both TBD).
> >>>
> >>> Thanks in advance,
> >>> Mark
> >
> >Hmm. Isn't it a guest driver issue? Do we need to care so much about
> running
> >OVS inside the VM? If I assumed right, if we're running OVS not inside
> the VM,
> >there will be no issues on the OVS side. The issue happens if guest
> >application
> >uses DPDK 17.11, but it will happen regardless of the DPDK version on
> the host
> >side.
> >
> >So, the only bad scenario is running OVS inside the VM with virtio-net
> PMD
> >driver.
> >My question is: Do we need to care about this scenario so much?
> >If the answer is "not really" then we can use variant a).
> >But if it's very important thing we need to support than b) will be
> >preferable.
> >
> >Am I missing something?
> 
> Hey Ilya,
> 
> I've just had the exact same conversation with Sean Mooney locally.
> 
> I think the point that both yourself and Sean has made is completely
> valid, which puts option a) back on the table.
> 
> Antonio has agreed to test that this works; once he confirms I can push
> v5 of the 17.11 patchet, and simply document the guest DPDK issue.
> 
> Thanks again,
> Mark
> 
> 
> >
> >Best regards, Ilya Maximets.
> >
> >
> >>>
> >>> [1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
> >>> [2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Stokes, Ian
> >>>> Sent: Thursday, December 7, 2017 8:43 AM
> >>>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> dev@openvswitch.org
> >>>> Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com;
> >i.maximets@samsung.com;
> >>>> jan.scheurich@ericsson.com; Mooney, Sean K
> <sean.k.mooney@intel.com>
> >>>> Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU
> support
> >>>>
> >>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
> >>>>> This is a security feature, which restricts the vhost memory that
> a
> >virtio
> >>>>> device may access.
> >>>>
> >>>> Hi all,
> >>>>
> >>>> There were a few requests for changes in the v4 of this patch and
> it's an
> >>>> important aspect of the DPDK 17.11 support for OVS.
> >>>>
> >>>> I'd like to get this series in to the pull request this week. If
> people
> >have
> >>>> flagged issues for the latest revision I'd appreciate if they could
> review
> >>> the
> >>>> latest revision and flag any new issues that need to be raised.
> >>>>
> >>>> Ian
> >>>>
> >>>>>
> >>>>> This feature also enables the vhost REPLY_ACK protocol, the
> >implementation
> >>>>> of which is known to work in newer versions of QEMU (i.e.
> v2.10.0), but
> >is
> >>>>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the
> >feature
> >>>>> is disabled by default in (and should remain so), for the
> aforementioned
> >>>>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support
> can
> >>>>> safely be enabled, even without having an IOMMU device, with no
> >>>>> performance penalty.
> >>>>>
> >>>>> This patch adds a new global config option, vhost-iommu-support,
> that
> >>>>> controls enablement of the vhost IOMMU feature:
> >>>>>
> >>>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-
> support=true
> >>>>>
> >>>>> This value defaults to false; to enable IOMMU support, this field
> should
> >>>>> be set to true when setting other global parameters on init (such
> as
> >>>>> "dpdk-socket-mem", for example). Changing the value at runtime is
> not
> >>>>> supported, and requires restarting the vswitch daemon.
> >>>>>
> >>>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >>>>>
> >>>>> ---
> >>>>>
> >>>>> v4:  - rebase to HEAD of master
> >>>>>      - clarify that IOMMU support applies exclusively to vhost
> user
> >>>>>        client ports.
> >>>>>      - reword caveats regarding modifying vhost-iommu-support at
> runtime
> >>>>>      - use smap_get_bool() to parse vhost-iommu-support, instead
> of
> >>>>>        process_vhost_flags().
> >>>>>      - add stub implementation of dpdk_vhost_iommu_enabled().
> >>>>>      - move stdbool.h #include outside of DPDK compiler guards.
> >>>>>      - remove mention of vhost IOMMU mode from
> >>>>>        netdev_dpdk_vhost_client_configure() log.
> >>>>>
> >>>>> v3:  - erroneous; disregard.
> >>>>>
> >>>>> v2:  - rebase to HEAD of master
> >>>>>      - refactor vHost IOMMU enablement mechanism (use a global
> >>>>>        config option, instead of the previous per-port approach).
> >>>>> ---
> >>>>>  Documentation/topics/dpdk/vhost-user.rst | 27
> >+++++++++++++++++++++++++++
> >>>>>  NEWS                                     |  1 +
> >>>>>  lib/dpdk-stub.c                          |  6 ++++++
> >>>>>  lib/dpdk.c                               | 12 ++++++++++++
> >>>>>  lib/dpdk.h                               |  3 +++
> >>>>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
> >>>>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
> >>>>>  7 files changed, 74 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
> >>>>> b/Documentation/topics/dpdk/vhost-user.rst
> >>>>> index 5347995..ffef653 100644
> >>>>> --- a/Documentation/topics/dpdk/vhost-user.rst
> >>>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
> >>>>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability
> for
> >>>>> vHost ports to 'reconnect' in  event of the switch crashing or
> being
> >>>>> brought down. Once it is brought back up,  the vHost ports will
> reconnect
> >>>>> automatically and normal service will resume.
> >>>>>
> >>>>> +vhost-user-client IOMMU Support
> >>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>>> +
> >>>>> +vhost IOMMU is a feature which restricts the vhost memory that a
> virtio
> >>>>> +device can access, and as such is useful in deployments in which
> >security
> >>>>> is a concern.
> >>>>> +
> >>>>> +IOMMU support may be enabled via a global config value, ```vhost-
> iommu-
> >>>>> support```.
> >>>>> +Setting this to true enables vhost IOMMU support for all vhost
> ports
> >>>>> +when/where
> >>>>> +available::
> >>>>> +
> >>>>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-
> support=true
> >>>>> +
> >>>>> +The default value is false.
> >>>>> +
> >>>>> +.. important::
> >>>>> +
> >>>>> +    Changing this value requires restarting the daemon.
> >>>>> +
> >>>>> +.. important::
> >>>>> +
> >>>>> +    Enabling the IOMMU feature also enables the vhost user reply-
> ack
> >>>>> protocol;
> >>>>> +    this is known to work on QEMU v2.10.0, but is buggy on older
> >versions
> >>>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
> >>>>> disabled by
> >>>>> +    default (and should remain so if using the aforementioned
> versions
> >of
> >>>>> QEMU).
> >>>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be
> >enabled,
> >>>>> even
> >>>>> +    without having an IOMMU device, with no performance penalty.
> >>>>> +
> >>>>>  .. _dpdk-testpmd:
> >>>>>
> >>>>>  DPDK in the Guest
> >>>>> diff --git a/NEWS b/NEWS
> >>>>> index d4a1c9a..99c47d8 100644
> >>>>> --- a/NEWS
> >>>>> +++ b/NEWS
> >>>>> @@ -15,6 +15,7 @@ Post-v2.8.0
> >>>>>       * Add support for compiling OVS with the latest Linux 4.13
> kernel
> >>>>>     - DPDK:
> >>>>>       * Add support for DPDK v17.11
> >>>>> +     * Add support for vHost IOMMU
> >>>>>
> >>>>>  v2.8.0 - 31 Aug 2017
> >>>>>  --------------------
> >>>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index
> daef729..3602180
> >>>>> 100644
> >>>>> --- a/lib/dpdk-stub.c
> >>>>> +++ b/lib/dpdk-stub.c
> >>>>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
> >>>>>      return NULL;
> >>>>>  }
> >>>>> +
> >>>>> +bool
> >>>>> +dpdk_vhost_iommu_enabled(void)
> >>>>> +{
> >>>>> +    return false;
> >>>>> +}
> >>>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
> >>>>> index 8da6c32..6710d10 100644
> >>>>> --- a/lib/dpdk.c
> >>>>> +++ b/lib/dpdk.c
> >>>>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
> >>>>>  static FILE *log_stream = NULL;       /* Stream for DPDK log
> redirection
> >>>>> */
> >>>>>
> >>>>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user
> sockets
> >>>>> */
> >>>>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
> >>>>> +support */
> >>>>>
> >>>>>  static int
> >>>>>  process_vhost_flags(char *flag, const char *default_val, int
> size, @@ -
> >>>>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
> >>>>>          vhost_sock_dir = sock_dir_subcomponent;
> >>>>>      }
> >>>>>
> >>>>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
> >>>>> +                                        "vhost-iommu-support",
> false);
> >>>>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
> >>>>> +               vhost_iommu_enabled ? "enabled" : "disabled");
> >>>>> +
> >>>>>      argv = grow_argv(&argv, 0, 1);
> >>>>>      argc = 1;
> >>>>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12
> @@
> >>>>> dpdk_get_vhost_sock_dir(void)
> >>>>>      return vhost_sock_dir;
> >>>>>  }
> >>>>>
> >>>>> +bool
> >>>>> +dpdk_vhost_iommu_enabled(void)
> >>>>> +{
> >>>>> +    return vhost_iommu_enabled;
> >>>>> +}
> >>>>> +
> >>>>>  void
> >>>>>  dpdk_set_lcore_id(unsigned cpu)
> >>>>>  {
> >>>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
> >>>>> index 673a1f1..dc58d96 100644
> >>>>> --- a/lib/dpdk.h
> >>>>> +++ b/lib/dpdk.h
> >>>>> @@ -17,6 +17,8 @@
> >>>>>  #ifndef DPDK_H
> >>>>>  #define DPDK_H
> >>>>>
> >>>>> +#include <stdbool.h>
> >>>>> +
> >>>>>  #ifdef DPDK_NETDEV
> >>>>>
> >>>>>  #include <rte_config.h>
> >>>>> @@ -35,5 +37,6 @@ struct smap;
> >>>>>  void dpdk_init(const struct smap *ovs_other_config);  void
> >>>>> dpdk_set_lcore_id(unsigned cpu);  const char
> >>>>> *dpdk_get_vhost_sock_dir(void);
> >>>>> +bool dpdk_vhost_iommu_enabled(void);
> >>>>>
> >>>>>  #endif /* dpdk.h */
> >>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> f552444..9715c39
> >>>>> 100644
> >>>>> --- a/lib/netdev-dpdk.c
> >>>>> +++ b/lib/netdev-dpdk.c
> >>>>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct
> netdev
> >>>>> *netdev)  {
> >>>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >>>>>      int err;
> >>>>> +    uint64_t vhost_flags = 0;
> >>>>>
> >>>>>      ovs_mutex_lock(&dev->mutex);
> >>>>>
> >>>>> @@ -3263,16 +3264,21 @@
> netdev_dpdk_vhost_client_reconfigure(struct
> >netdev
> >>>>> *netdev)
> >>>>>       */
> >>>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
> >>>>>              && strlen(dev->vhost_id)) {
> >>>>> -        /* Register client-mode device */
> >>>>> -        err = rte_vhost_driver_register(dev->vhost_id,
> >>>>> -                                        RTE_VHOST_USER_CLIENT);
> >>>>> +        /* Register client-mode device. */
> >>>>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
> >>>>> +
> >>>>> +        /* Enable IOMMU support, if explicitly requested. */
> >>>>> +        if (dpdk_vhost_iommu_enabled()) {
> >>>>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> >>>>> +        }
> >>>>> +        err = rte_vhost_driver_register(dev->vhost_id,
> vhost_flags);
> >>>>>          if (err) {
> >>>>>              VLOG_ERR("vhost-user device setup failure for device
> %s\n",
> >>>>>                       dev->vhost_id);
> >>>>>              goto unlock;
> >>>>>          } else {
> >>>>>              /* Configuration successful */
> >>>>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
> >>>>> +            dev->vhost_driver_flags |= vhost_flags;
> >>>>>              VLOG_INFO("vHost User device '%s' created in 'client'
> mode,
> >"
> >>>>>                        "using client socket '%s'",
> >>>>>                        dev->up.name, dev->vhost_id); diff --git
> >>>>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index
> c145e1a..6c6df50
> >>>>> 100644
> >>>>> --- a/vswitchd/vswitch.xml
> >>>>> +++ b/vswitchd/vswitch.xml
> >>>>> @@ -344,6 +344,21 @@
> >>>>>          </p>
> >>>>>        </column>
> >>>>>
> >>>>> +      <column name="other_config" key="vhost-iommu-support"
> >>>>> +              type='{"type": "boolean"}'>
> >>>>> +        <p>
> >>>>> +          vHost IOMMU is a  security feature, which restricts the
> vhost
> >>>>> memory
> >>>>> +          that a virtio device may access. vHost IOMMU support is
> >>>>> disabled by
> >>>>> +          default, due to a bug in QEMU implementations of the
> vhost
> >>>>> REPLY_ACK
> >>>>> +          protocol, (on which vHost IOMMU relies) prior to
> v2.9.1.
> >>>>> Setting this
> >>>>> +          value to <code>true</code> enables vHost IOMMU support
> for
> >>>>> vHost User
> >>>>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
> >>>>> +        </p>
> >>>>> +        <p>
> >>>>> +          Changing this value requires restarting the daemon.
> >>>>> +        </p>
> >>>>> +      </column>
> >>>>> +
> >>>>>        <column name="other_config" key="n-handler-threads"
> >>>>>                type='{"type": "integer", "minInteger": 1}'>
> >>>>>          <p>
> >>>>> --
> >>>>> 2.7.4
> >>>
> >>> _______________________________________________
> >>> dev mailing list
> >>> dev@openvswitch.org
> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >>
> >>
Kevin Traynor Dec. 7, 2017, 2:54 p.m. UTC | #10
On 12/07/2017 12:27 PM, Kavanagh, Mark B wrote:
>> From: Ilya Maximets [mailto:i.maximets@samsung.com]
>> Sent: Thursday, December 7, 2017 12:19 PM
>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Stokes, Ian
>> <ian.stokes@intel.com>; dev@openvswitch.org
>> Cc: maxime.coquelin@redhat.com; jan.scheurich@ericsson.com; Mooney, Sean K
>> <sean.k.mooney@intel.com>; Fischetti, Antonio <antonio.fischetti@intel.com>;
>> Bie, Tiwei <tiwei.bie@intel.com>; Mcnamara, John <john.mcnamara@intel.com>;
>> Guoshuai Li <ligs@dtdream.com>; ktraynor@redhat.com; Loftus, Ciara
>> <ciara.loftus@intel.com>; Yuanhan Liu <yliu@fridaylinux.org>
>> Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>
>> On 07.12.2017 14:32, Kavanagh, Mark B wrote:
>>> Yuanhan's old email address was used in the previous mail - updated
>> correctly here.
>>> -Mark
>>>
>>>> -----Original Message-----
>>>> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
>> bounces@openvswitch.org]
>>>> On Behalf Of Kavanagh, Mark B
>>>> Sent: Thursday, December 7, 2017 11:24 AM
>>>> To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
>>>> Cc: Liu, Yuanhan <yuanhan.liu@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
>>>> Mcnamara, John <john.mcnamara@intel.com>; maxime.coquelin@redhat.com;
>>>> i.maximets@samsung.com
>>>> Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>>
>>>> Hi folks,
>>>>
>>>> Thanks for all of your respective reviews and testing of this patchset.
>>>>
>>>> It seems, however, that an issue in DPDK v17.11 has come to light that
>> affects
>>>> guests which use testpmd.
>>>> Specifically, traffic does not reach a guest when traffic is live prior to
>>>> kicking off the testpmd app within said guest, and at least two forwarding
>>>> cores are used. [1]
>>>>
>>>> This is explained in additional detail is [2], an extract of which is
>>>> reproduced below:
>>>>
>>>> 	"the vector Rx could be broken if backend has consumed all the avail
>>>> descs before the
>>>> 	 device is started. Because in current implementation, the vector Rx will
>>>> return immediately
>>>> 	 without refilling the avail ring if the used ring is empty. So we have
>>>> to refill
>>>> 	 the avail ring after flushing the elements in the used ring."
>>>>
>>>> This issue was initially uncovered by Antonio Fischetti, as part of the
>> 17.11
>>>> patchset validation, and has since been localized to DPDK, rather than OvS.
>>>> As a result, it seems now that we should not move to 17.11, but seek an
>> out-
>>>> of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on this -
>>>> should we:
>>>> 	a) move to 17.11 now, note the issue above in the 'errata' section of the
>>>> documentation, and move to 17.11.1 when it becomes available in February of
>>>> next year
>>>> 	b) request the early release of the 17.11.1 stable branch, which
>>>> incorporates a fix for this issue (the possibility, and availability, of
>> which
>>>> are both TBD).
>>>>
>>>> Thanks in advance,
>>>> Mark
>>
>> Hmm. Isn't it a guest driver issue? Do we need to care so much about running
>> OVS inside the VM? If I assumed right, if we're running OVS not inside the VM,
>> there will be no issues on the OVS side. The issue happens if guest
>> application
>> uses DPDK 17.11, but it will happen regardless of the DPDK version on the host
>> side.
>>
>> So, the only bad scenario is running OVS inside the VM with virtio-net PMD
>> driver.
>> My question is: Do we need to care about this scenario so much?
>> If the answer is "not really" then we can use variant a).
>> But if it's very important thing we need to support than b) will be
>> preferable.
>>
>> Am I missing something?
> 
> Hey Ilya,
> 
> I've just had the exact same conversation with Sean Mooney locally.
> 
> I think the point that both yourself and Sean has made is completely valid, which puts option a) back on the table.
> 

a) Sounds ok to me. I think an early DPDK17.11.1 before OVS 2.9 would be
good in addition though. It is nicer that an OVS 2.9 user doesn't have
to know they can't use the latest DPDK in the guest.

Other than the outstanding documentation, the patches LGTM.

> Antonio has agreed to test that this works; once he confirms I can push v5 of the 17.11 patchet, and simply document the guest DPDK issue.
> 
> Thanks again,
> Mark
> 
> 
>>
>> Best regards, Ilya Maximets.
>>
>>
>>>>
>>>> [1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
>>>> [2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Stokes, Ian
>>>>> Sent: Thursday, December 7, 2017 8:43 AM
>>>>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>>>>> Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com;
>> i.maximets@samsung.com;
>>>>> jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>
>>>>> Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>>>
>>>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>>>>> This is a security feature, which restricts the vhost memory that a
>> virtio
>>>>>> device may access.
>>>>>
>>>>> Hi all,
>>>>>
>>>>> There were a few requests for changes in the v4 of this patch and it's an
>>>>> important aspect of the DPDK 17.11 support for OVS.
>>>>>
>>>>> I'd like to get this series in to the pull request this week. If people
>> have
>>>>> flagged issues for the latest revision I'd appreciate if they could review
>>>> the
>>>>> latest revision and flag any new issues that need to be raised.
>>>>>
>>>>> Ian
>>>>>
>>>>>>
>>>>>> This feature also enables the vhost REPLY_ACK protocol, the
>> implementation
>>>>>> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but
>> is
>>>>>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the
>> feature
>>>>>> is disabled by default in (and should remain so), for the aforementioned
>>>>>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>>>>>> safely be enabled, even without having an IOMMU device, with no
>>>>>> performance penalty.
>>>>>>
>>>>>> This patch adds a new global config option, vhost-iommu-support, that
>>>>>> controls enablement of the vhost IOMMU feature:
>>>>>>
>>>>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>>>>>
>>>>>> This value defaults to false; to enable IOMMU support, this field should
>>>>>> be set to true when setting other global parameters on init (such as
>>>>>> "dpdk-socket-mem", for example). Changing the value at runtime is not
>>>>>> supported, and requires restarting the vswitch daemon.
>>>>>>
>>>>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> v4:  - rebase to HEAD of master
>>>>>>      - clarify that IOMMU support applies exclusively to vhost user
>>>>>>        client ports.
>>>>>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>>>>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>>>>>        process_vhost_flags().
>>>>>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>>>>>      - move stdbool.h #include outside of DPDK compiler guards.
>>>>>>      - remove mention of vhost IOMMU mode from
>>>>>>        netdev_dpdk_vhost_client_configure() log.
>>>>>>
>>>>>> v3:  - erroneous; disregard.
>>>>>>
>>>>>> v2:  - rebase to HEAD of master
>>>>>>      - refactor vHost IOMMU enablement mechanism (use a global
>>>>>>        config option, instead of the previous per-port approach).
>>>>>> ---
>>>>>>  Documentation/topics/dpdk/vhost-user.rst | 27
>> +++++++++++++++++++++++++++
>>>>>>  NEWS                                     |  1 +
>>>>>>  lib/dpdk-stub.c                          |  6 ++++++
>>>>>>  lib/dpdk.c                               | 12 ++++++++++++
>>>>>>  lib/dpdk.h                               |  3 +++
>>>>>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>>>>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>>>>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>>>>>> b/Documentation/topics/dpdk/vhost-user.rst
>>>>>> index 5347995..ffef653 100644
>>>>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>>>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>>>>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
>>>>>> vHost ports to 'reconnect' in  event of the switch crashing or being
>>>>>> brought down. Once it is brought back up,  the vHost ports will reconnect
>>>>>> automatically and normal service will resume.
>>>>>>
>>>>>> +vhost-user-client IOMMU Support
>>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>> +
>>>>>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>>>>>> +device can access, and as such is useful in deployments in which
>> security
>>>>>> is a concern.
>>>>>> +
>>>>>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>>>>>> support```.
>>>>>> +Setting this to true enables vhost IOMMU support for all vhost ports
>>>>>> +when/where
>>>>>> +available::
>>>>>> +
>>>>>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>>>>>> +
>>>>>> +The default value is false.
>>>>>> +
>>>>>> +.. important::
>>>>>> +
>>>>>> +    Changing this value requires restarting the daemon.
>>>>>> +
>>>>>> +.. important::
>>>>>> +
>>>>>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>>>>>> protocol;
>>>>>> +    this is known to work on QEMU v2.10.0, but is buggy on older
>> versions
>>>>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
>>>>>> disabled by
>>>>>> +    default (and should remain so if using the aforementioned versions
>> of
>>>>>> QEMU).
>>>>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be
>> enabled,
>>>>>> even
>>>>>> +    without having an IOMMU device, with no performance penalty.
>>>>>> +
>>>>>>  .. _dpdk-testpmd:
>>>>>>
>>>>>>  DPDK in the Guest
>>>>>> diff --git a/NEWS b/NEWS
>>>>>> index d4a1c9a..99c47d8 100644
>>>>>> --- a/NEWS
>>>>>> +++ b/NEWS
>>>>>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>>>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>>>>>     - DPDK:
>>>>>>       * Add support for DPDK v17.11
>>>>>> +     * Add support for vHost IOMMU
>>>>>>
>>>>>>  v2.8.0 - 31 Aug 2017
>>>>>>  --------------------
>>>>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
>>>>>> 100644
>>>>>> --- a/lib/dpdk-stub.c
>>>>>> +++ b/lib/dpdk-stub.c
>>>>>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>>>>>>      return NULL;
>>>>>>  }
>>>>>> +
>>>>>> +bool
>>>>>> +dpdk_vhost_iommu_enabled(void)
>>>>>> +{
>>>>>> +    return false;
>>>>>> +}
>>>>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>>>>> index 8da6c32..6710d10 100644
>>>>>> --- a/lib/dpdk.c
>>>>>> +++ b/lib/dpdk.c
>>>>>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>>>>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
>>>>>> */
>>>>>>
>>>>>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>>>>>> */
>>>>>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
>>>>>> +support */
>>>>>>
>>>>>>  static int
>>>>>>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
>>>>>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>>>>          vhost_sock_dir = sock_dir_subcomponent;
>>>>>>      }
>>>>>>
>>>>>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>>>>>> +                                        "vhost-iommu-support", false);
>>>>>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>>>>>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>>>>>> +
>>>>>>      argv = grow_argv(&argv, 0, 1);
>>>>>>      argc = 1;
>>>>>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
>>>>>> dpdk_get_vhost_sock_dir(void)
>>>>>>      return vhost_sock_dir;
>>>>>>  }
>>>>>>
>>>>>> +bool
>>>>>> +dpdk_vhost_iommu_enabled(void)
>>>>>> +{
>>>>>> +    return vhost_iommu_enabled;
>>>>>> +}
>>>>>> +
>>>>>>  void
>>>>>>  dpdk_set_lcore_id(unsigned cpu)
>>>>>>  {
>>>>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>>>>>> index 673a1f1..dc58d96 100644
>>>>>> --- a/lib/dpdk.h
>>>>>> +++ b/lib/dpdk.h
>>>>>> @@ -17,6 +17,8 @@
>>>>>>  #ifndef DPDK_H
>>>>>>  #define DPDK_H
>>>>>>
>>>>>> +#include <stdbool.h>
>>>>>> +
>>>>>>  #ifdef DPDK_NETDEV
>>>>>>
>>>>>>  #include <rte_config.h>
>>>>>> @@ -35,5 +37,6 @@ struct smap;
>>>>>>  void dpdk_init(const struct smap *ovs_other_config);  void
>>>>>> dpdk_set_lcore_id(unsigned cpu);  const char
>>>>>> *dpdk_get_vhost_sock_dir(void);
>>>>>> +bool dpdk_vhost_iommu_enabled(void);
>>>>>>
>>>>>>  #endif /* dpdk.h */
>>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
>>>>>> 100644
>>>>>> --- a/lib/netdev-dpdk.c
>>>>>> +++ b/lib/netdev-dpdk.c
>>>>>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>>>>> *netdev)  {
>>>>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>>>      int err;
>>>>>> +    uint64_t vhost_flags = 0;
>>>>>>
>>>>>>      ovs_mutex_lock(&dev->mutex);
>>>>>>
>>>>>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct
>> netdev
>>>>>> *netdev)
>>>>>>       */
>>>>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>>>>>              && strlen(dev->vhost_id)) {
>>>>>> -        /* Register client-mode device */
>>>>>> -        err = rte_vhost_driver_register(dev->vhost_id,
>>>>>> -                                        RTE_VHOST_USER_CLIENT);
>>>>>> +        /* Register client-mode device. */
>>>>>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>>>>>> +
>>>>>> +        /* Enable IOMMU support, if explicitly requested. */
>>>>>> +        if (dpdk_vhost_iommu_enabled()) {
>>>>>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>>>>> +        }
>>>>>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>>>>>          if (err) {
>>>>>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>>>>>                       dev->vhost_id);
>>>>>>              goto unlock;
>>>>>>          } else {
>>>>>>              /* Configuration successful */
>>>>>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>>>>>> +            dev->vhost_driver_flags |= vhost_flags;
>>>>>>              VLOG_INFO("vHost User device '%s' created in 'client' mode,
>> "
>>>>>>                        "using client socket '%s'",
>>>>>>                        dev->up.name, dev->vhost_id); diff --git
>>>>>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
>>>>>> 100644
>>>>>> --- a/vswitchd/vswitch.xml
>>>>>> +++ b/vswitchd/vswitch.xml
>>>>>> @@ -344,6 +344,21 @@
>>>>>>          </p>
>>>>>>        </column>
>>>>>>
>>>>>> +      <column name="other_config" key="vhost-iommu-support"
>>>>>> +              type='{"type": "boolean"}'>
>>>>>> +        <p>
>>>>>> +          vHost IOMMU is a  security feature, which restricts the vhost
>>>>>> memory
>>>>>> +          that a virtio device may access. vHost IOMMU support is
>>>>>> disabled by
>>>>>> +          default, due to a bug in QEMU implementations of the vhost
>>>>>> REPLY_ACK
>>>>>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
>>>>>> Setting this
>>>>>> +          value to <code>true</code> enables vHost IOMMU support for
>>>>>> vHost User
>>>>>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>>>>>> +        </p>
>>>>>> +        <p>
>>>>>> +          Changing this value requires restarting the daemon.
>>>>>> +        </p>
>>>>>> +      </column>
>>>>>> +
>>>>>>        <column name="other_config" key="n-handler-threads"
>>>>>>                type='{"type": "integer", "minInteger": 1}'>
>>>>>>          <p>
>>>>>> --
>>>>>> 2.7.4
>>>>
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@openvswitch.org
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>>
>>>
Kevin Traynor Dec. 7, 2017, 2:57 p.m. UTC | #11
On 12/04/2017 11:15 AM, Mark Kavanagh wrote:
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index c145e1a..6c6df50 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -344,6 +344,21 @@
>          </p>
>        </column>
>  
> +      <column name="other_config" key="vhost-iommu-support"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          vHost IOMMU is a  security feature, which restricts the vhost memory

very minor nit if you are still reworking anyway, there's some
additional spaces above

> +          that a virtio device may access. vHost IOMMU support is disabled by
> +          default, due to a bug in QEMU implementations of the vhost REPLY_ACK
> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting this
> +          value to <code>true</code> enables vHost IOMMU support for vHost User
> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
> +        </p>
> +        <p>
> +          Changing this value requires restarting the daemon.
> +        </p>
> +      </column>
> +
>        <column name="other_config" key="n-handler-threads"
>                type='{"type": "integer", "minInteger": 1}'>
>          <p>
Jan Scheurich Dec. 7, 2017, 3:09 p.m. UTC | #12
> > I think the point that both yourself and Sean has made is completely valid, which puts option a) back on the table.
> >
> 
> a) Sounds ok to me. I think an early DPDK17.11.1 before OVS 2.9 would be
> good in addition though. It is nicer that an OVS 2.9 user doesn't have
> to know they can't use the latest DPDK in the guest.
> 

Would the virtio PMD bug in DPDK 17.11 in the guest actually be mitigated by running DPDK 17.05 or a fixed 17.11.1 as vhostuser backend inside OVS on the host?

If not, I would prefer if we decoupled the DPDK life cycle of OVS and DPDK applications in the guest. Guests should update their DPDK version if  it contains a critical bug.

BR, Jan
Kevin Traynor Dec. 7, 2017, 3:29 p.m. UTC | #13
On 12/07/2017 03:09 PM, Jan Scheurich wrote:
>>> I think the point that both yourself and Sean has made is completely valid, which puts option a) back on the table.
>>>
>>
>> a) Sounds ok to me. I think an early DPDK17.11.1 before OVS 2.9 would be
>> good in addition though. It is nicer that an OVS 2.9 user doesn't have
>> to know they can't use the latest DPDK in the guest.
>>
> 
> Would the virtio PMD bug in DPDK 17.11 in the guest actually be mitigated by running DPDK 17.05 or a fixed 17.11.1 as vhostuser backend inside OVS on the host?
> 
> If not, I would prefer if we decoupled the DPDK life cycle of OVS and DPDK applications in the guest. Guests should update their DPDK version if  it contains a critical bug.
> 

I don't think there is any documented coupling between host and guest
DPDK versions. I doubt anyone tests lots of combinations but hopefully
virtio provides the necessary means to run multiple combos. I think it's
reasonable in this case to document a warning for an OVS user about a
known bad combination that is likely to be selected (i.e. latest
upstream/releases).

Kevin.

> BR, Jan
> 
>
Mooney, Sean K Dec. 7, 2017, 3:30 p.m. UTC | #14
> -----Original Message-----

> From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]

> Sent: Thursday, December 7, 2017 3:09 PM

> To: Kevin Traynor <ktraynor@redhat.com>; Kavanagh, Mark B

> <mark.b.kavanagh@intel.com>; Ilya Maximets <i.maximets@samsung.com>;

> Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org

> Cc: maxime.coquelin@redhat.com; Mooney, Sean K

> <sean.k.mooney@intel.com>; Fischetti, Antonio

> <antonio.fischetti@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;

> Mcnamara, John <john.mcnamara@intel.com>; Guoshuai Li

> <ligs@dtdream.com>; Loftus, Ciara <ciara.loftus@intel.com>; Yuanhan Liu

> <yliu@fridaylinux.org>

> Subject: RE: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support

> 

> > > I think the point that both yourself and Sean has made is

> completely valid, which puts option a) back on the table.

> > >

> >

> > a) Sounds ok to me. I think an early DPDK17.11.1 before OVS 2.9 would

> > be good in addition though. It is nicer that an OVS 2.9 user doesn't

> > have to know they can't use the latest DPDK in the guest.

> >

> 

> Would the virtio PMD bug in DPDK 17.11 in the guest actually be

> mitigated by running DPDK 17.05 or a fixed 17.11.1 as vhostuser backend

> inside OVS on the host?

[Mooney, Sean K] from talking to mark about this eairlier I don’t belive so.
I think if you used 17.11 testpmd in the guest with kernel ovs you should
get the same behavior e.g. it does not forward packet. The guest should not
be able to know with certainty what vhost backend is in use on the host.
> 

> If not, I would prefer if we decoupled the DPDK life cycle of OVS and

> DPDK applications in the guest. Guests should update their DPDK version

> if  it contains a critical bug.

> 

> BR, Jan

>
Mark Kavanagh Dec. 7, 2017, 3:35 p.m. UTC | #15
>From: Kevin Traynor [mailto:ktraynor@redhat.com]
>Sent: Thursday, December 7, 2017 3:29 PM
>To: Jan Scheurich <jan.scheurich@ericsson.com>; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>; Ilya Maximets <i.maximets@samsung.com>; Stokes,
>Ian <ian.stokes@intel.com>; dev@openvswitch.org
>Cc: maxime.coquelin@redhat.com; Mooney, Sean K <sean.k.mooney@intel.com>;
>Fischetti, Antonio <antonio.fischetti@intel.com>; Bie, Tiwei
><tiwei.bie@intel.com>; Mcnamara, John <john.mcnamara@intel.com>; Guoshuai Li
><ligs@dtdream.com>; Loftus, Ciara <ciara.loftus@intel.com>; Yuanhan Liu
><yliu@fridaylinux.org>
>Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>
>On 12/07/2017 03:09 PM, Jan Scheurich wrote:
>>>> I think the point that both yourself and Sean has made is completely
>valid, which puts option a) back on the table.
>>>>
>>>
>>> a) Sounds ok to me. I think an early DPDK17.11.1 before OVS 2.9 would be
>>> good in addition though. It is nicer that an OVS 2.9 user doesn't have
>>> to know they can't use the latest DPDK in the guest.
>>>
>>
>> Would the virtio PMD bug in DPDK 17.11 in the guest actually be mitigated by
>running DPDK 17.05 or a fixed 17.11.1 as vhostuser backend inside OVS on the
>host?
>>
>> If not, I would prefer if we decoupled the DPDK life cycle of OVS and DPDK
>applications in the guest. Guests should update their DPDK version if  it
>contains a critical bug.
>>
>
>I don't think there is any documented coupling between host and guest
>DPDK versions. I doubt anyone tests lots of combinations but hopefully
>virtio provides the necessary means to run multiple combos. I think it's
>reasonable in this case to document a warning for an OVS user about a
>known bad combination that is likely to be selected (i.e. latest
>upstream/releases).
>
>Kevin.
>
>> BR, Jan
>>

Hi Jan,

DPDK v17.11 in the host is fine; the observed issue is present in the guest's virtio driver in DPDK v17.11 (which is not present in v17.05.2).

As Kevin mentioned, I don't think there is any explicit coupling - or expectation of same - regarding the combination of DPDK versions used in guest and host; however, in this case I think it's certainly sensible to document the issue presented by the inclusion of 17.11 DPDK in the guest (at least when DPDK applications are used therein).

Thanks,
Mark


>>
Jan Scheurich Dec. 7, 2017, 4:06 p.m. UTC | #16
> > Would the virtio PMD bug in DPDK 17.11 in the guest actually be

> > mitigated by running DPDK 17.05 or a fixed 17.11.1 as vhostuser backend

> > inside OVS on the host?


> [Mooney, Sean K] from talking to mark about this eairlier I don’t belive so.

> I think if you used 17.11 testpmd in the guest with kernel ovs you should

> get the same behavior e.g. it does not forward packet. The guest should not

> be able to know with certainty what vhost backend is in use on the host.


If that is the case why should OVS even consider documenting that there are known problems when running a DPDK 17.11 application in a guest, irrespective of the OVS version or datapath?

If anything, this is a critical DPDK bug and DPDK need to warn their virtio PMD users not to use 17.11 until this bug is fixed in a 17.11.1 maintenance release.

The real issue I see is that OVS-DPDK may also be deployed in in a VM with virtio ports, and that kind of deployment will indeed be broken with DPDK 17.11. Can we accept and document that as a limitation in OVS 2.9 that will be fixed in OVS 2.9.1 as soon as DPDK 17.11.1 is out?

Or should we as a temporary solution apply a patch to DPDK 17.11 to fix this when building OVS?

BR, Jan
Mooney, Sean K Dec. 7, 2017, 5:03 p.m. UTC | #17
> -----Original Message-----

> From: Jan Scheurich [mailto:jan.scheurich@ericsson.com]

> Sent: Thursday, December 7, 2017 4:07 PM

> To: Mooney, Sean K <sean.k.mooney@intel.com>; Kevin Traynor

> <ktraynor@redhat.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;

> Ilya Maximets <i.maximets@samsung.com>; Stokes, Ian

> <ian.stokes@intel.com>; dev@openvswitch.org

> Cc: maxime.coquelin@redhat.com; Fischetti, Antonio

> <antonio.fischetti@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;

> Mcnamara, John <john.mcnamara@intel.com>; Guoshuai Li

> <ligs@dtdream.com>; Loftus, Ciara <ciara.loftus@intel.com>; Yuanhan Liu

> <yliu@fridaylinux.org>

> Subject: RE: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support

> 

> > > Would the virtio PMD bug in DPDK 17.11 in the guest actually be

> > > mitigated by running DPDK 17.05 or a fixed 17.11.1 as vhostuser

> > > backend inside OVS on the host?

> 

> > [Mooney, Sean K] from talking to mark about this eairlier I don’t

> belive so.

> > I think if you used 17.11 testpmd in the guest with kernel ovs you

> > should get the same behavior e.g. it does not forward packet. The

> > guest should not be able to know with certainty what vhost backend is

> in use on the host.

> 

> If that is the case why should OVS even consider documenting that there

> are known problems when running a DPDK 17.11 application in a guest,

> irrespective of the OVS version or datapath?

> 

> If anything, this is a critical DPDK bug and DPDK need to warn their

> virtio PMD users not to use 17.11 until this bug is fixed in a 17.11.1

> maintenance release.

> 

> The real issue I see is that OVS-DPDK may also be deployed in in a VM

> with virtio ports, and that kind of deployment will indeed be broken

> with DPDK 17.11. Can we accept and document that as a limitation in OVS

> 2.9 that will be fixed in OVS 2.9.1 as soon as DPDK 17.11.1 is out?

[Mooney, Sean K] that is a concern for me also. I run ovs-dpdk in a vm on a daily
Basis but I typical configure the vm with e1000 virtual nics but that then requires
Me to use kernel ovs on the host. I started using this partly to work around old
virtio pmd bugs that are have since been fixed.

I ran ovs-dpdk in a vm  on top of ovs-dpdk with vhost-user nics via openstack for years
However using Virtio-net-pci devices for ovs-dpdk requires special handeling
Due to some limitations in the choice of pf driver.

Without a virtualized iommu in a guest its not possible to use the vfio-pci
Driver with virtio-net-interfaces. qemu/kvm did not support
vIOMMUs in the past so this has only recently become an option but I have not
validated that it works myself.

Uio support is compiled out by default by Ubuntu,fedora,centos and presumable rhel.
That mean even on distros that package ovs-dpdk to run it in a vm you have to build
From source.

Igb_uio was previously known to be buggy with vitio-net-pci devices and dpdk had advised
against using it as you could not reliably unbind the virtio interface form igb_uio and
back to the kernel driver.

So igb_uio worked but you had to reboot the vm to get the kernel dirver working again
which was annoying for my development and ci usecases.hence my use of e1000 vNICs
This may have been fixed.

Uio_pci_generic works with virtio-net-pci devices as far as I know but unlike
Vfio-pci or igb_uio its memory access is over privaldged so its not a great choice
If you care about you applications security. It did however avoid the buggy igb_uio
Rebinding issue and it does not need an iommu to work so has been the most reliable
If least secure option for nested ovs-dpdk.

Since this bug is in the dpdk virtio pmd and not the pf driver however it will effect
Each one equally so I think the onus remain with the vnf vendors that build on ovs and dpdk 
to validate their dependencies before choosing to rev the ovs and/or dpdk versions they use.
Having a warning about this particular issue in the ovs docs however may help some of them
Avoid this trap and it will give us something I can refer people to between now and the dpdk 17.11.1
Release.

> 

> Or should we as a temporary solution apply a patch to DPDK 17.11 to fix

> this when building OVS?

[Mooney, Sean K] dpdk is not built via the ovs make file we just link to it so
I don’t think that is a good option for upstream ovs. That said I have patching support
In my networking-ovs-dpdk devstack plugin for just this reason. If people are building
Ovs form source I think just specifying which dpdk commit fixes the issue should be
Enough for them to be able to work around this issue until 17.11.1 is released.
> 

> BR, Jan

> 

>
diff mbox series

Patch

diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
index 5347995..ffef653 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -273,6 +273,33 @@  One benefit of using this mode is the ability for vHost ports to 'reconnect' in
 event of the switch crashing or being brought down. Once it is brought back up,
 the vHost ports will reconnect automatically and normal service will resume.
 
+vhost-user-client IOMMU Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+vhost IOMMU is a feature which restricts the vhost memory that a virtio device
+can access, and as such is useful in deployments in which security is a concern.
+
+IOMMU support may be enabled via a global config value, ```vhost-iommu-support```.
+Setting this to true enables vhost IOMMU support for all vhost ports when/where
+available::
+
+    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
+
+The default value is false.
+
+.. important::
+
+    Changing this value requires restarting the daemon.
+
+.. important::
+
+    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
+    this is known to work on QEMU v2.10.0, but is buggy on older versions
+    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled by
+    default (and should remain so if using the aforementioned versions of QEMU).
+    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
+    without having an IOMMU device, with no performance penalty.
+
 .. _dpdk-testpmd:
 
 DPDK in the Guest
diff --git a/NEWS b/NEWS
index d4a1c9a..99c47d8 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,7 @@  Post-v2.8.0
      * Add support for compiling OVS with the latest Linux 4.13 kernel
    - DPDK:
      * Add support for DPDK v17.11
+     * Add support for vHost IOMMU
 
 v2.8.0 - 31 Aug 2017
 --------------------
diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
index daef729..3602180 100644
--- a/lib/dpdk-stub.c
+++ b/lib/dpdk-stub.c
@@ -48,3 +48,9 @@  dpdk_get_vhost_sock_dir(void)
 {
     return NULL;
 }
+
+bool
+dpdk_vhost_iommu_enabled(void)
+{
+    return false;
+}
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 8da6c32..6710d10 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -41,6 +41,7 @@  VLOG_DEFINE_THIS_MODULE(dpdk);
 static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
 
 static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
+static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
 
 static int
 process_vhost_flags(char *flag, const char *default_val, int size,
@@ -345,6 +346,11 @@  dpdk_init__(const struct smap *ovs_other_config)
         vhost_sock_dir = sock_dir_subcomponent;
     }
 
+    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
+                                        "vhost-iommu-support", false);
+    VLOG_INFO("IOMMU support for vhost-user-client %s.",
+               vhost_iommu_enabled ? "enabled" : "disabled");
+
     argv = grow_argv(&argv, 0, 1);
     argc = 1;
     argv[0] = xstrdup(ovs_get_program_name());
@@ -482,6 +488,12 @@  dpdk_get_vhost_sock_dir(void)
     return vhost_sock_dir;
 }
 
+bool
+dpdk_vhost_iommu_enabled(void)
+{
+    return vhost_iommu_enabled;
+}
+
 void
 dpdk_set_lcore_id(unsigned cpu)
 {
diff --git a/lib/dpdk.h b/lib/dpdk.h
index 673a1f1..dc58d96 100644
--- a/lib/dpdk.h
+++ b/lib/dpdk.h
@@ -17,6 +17,8 @@ 
 #ifndef DPDK_H
 #define DPDK_H
 
+#include <stdbool.h>
+
 #ifdef DPDK_NETDEV
 
 #include <rte_config.h>
@@ -35,5 +37,6 @@  struct smap;
 void dpdk_init(const struct smap *ovs_other_config);
 void dpdk_set_lcore_id(unsigned cpu);
 const char *dpdk_get_vhost_sock_dir(void);
+bool dpdk_vhost_iommu_enabled(void);
 
 #endif /* dpdk.h */
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f552444..9715c39 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -3253,6 +3253,7 @@  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int err;
+    uint64_t vhost_flags = 0;
 
     ovs_mutex_lock(&dev->mutex);
 
@@ -3263,16 +3264,21 @@  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
      */
     if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
             && strlen(dev->vhost_id)) {
-        /* Register client-mode device */
-        err = rte_vhost_driver_register(dev->vhost_id,
-                                        RTE_VHOST_USER_CLIENT);
+        /* Register client-mode device. */
+        vhost_flags |= RTE_VHOST_USER_CLIENT;
+
+        /* Enable IOMMU support, if explicitly requested. */
+        if (dpdk_vhost_iommu_enabled()) {
+            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
+        }
+        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
         if (err) {
             VLOG_ERR("vhost-user device setup failure for device %s\n",
                      dev->vhost_id);
             goto unlock;
         } else {
             /* Configuration successful */
-            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
+            dev->vhost_driver_flags |= vhost_flags;
             VLOG_INFO("vHost User device '%s' created in 'client' mode, "
                       "using client socket '%s'",
                       dev->up.name, dev->vhost_id);
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index c145e1a..6c6df50 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -344,6 +344,21 @@ 
         </p>
       </column>
 
+      <column name="other_config" key="vhost-iommu-support"
+              type='{"type": "boolean"}'>
+        <p>
+          vHost IOMMU is a  security feature, which restricts the vhost memory
+          that a virtio device may access. vHost IOMMU support is disabled by
+          default, due to a bug in QEMU implementations of the vhost REPLY_ACK
+          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting this
+          value to <code>true</code> enables vHost IOMMU support for vHost User
+          Client ports in OvS-DPDK, starting from DPDK v17.11.
+        </p>
+        <p>
+          Changing this value requires restarting the daemon.
+        </p>
+      </column>
+
       <column name="other_config" key="n-handler-threads"
               type='{"type": "integer", "minInteger": 1}'>
         <p>