diff mbox series

[ovs-dev,V2,2/2] netdev-dpdk: vHost IOMMU support

Message ID 1512036198-16810-3-git-send-email-mark.b.kavanagh@intel.com
State Superseded
Headers show
Series DPDK v17.11 Support | expand

Commit Message

Mark Kavanagh Nov. 30, 2017, 10:03 a.m. UTC
DPDK v17.11 introduces support for the vHost IOMMU feature.
This is a security feature, which restricts the vhost memory
that a virtio device may access.

This feature also enables the vhost REPLY_ACK protocol, the
implementation of which is known to work in newer versions of
QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
v2.9.0, inclusive). As such, the feature is disabled by default
in (and should remain so), for the aforementioned older QEMU
verions. Starting with QEMU v2.9.1, vhost-iommu-support can
safely be enabled, even without having an IOMMU device, with
no performance penalty.

This patch adds a new global config option, vhost-iommu-support,
that controls enablement of the vhost IOMMU feature:

    ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true

Note that changing this value after guest devices have already been
initialized will not toggle IOMMU support. To that end, if IOMMU
support is required, this field should be set to true when setting
other global parameters on init (such as "dpdk-socket-mem", for
example).

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>

---

v2->v1:
    - rebase to HEAD of master
    - refactor vHost IOMMU enablement mechanism (use a global
      config option, instead of the previous per-port approach).
---
 Documentation/topics/dpdk/vhost-user.rst | 29 +++++++++++++++++++++++++++++
 NEWS                                     |  1 +
 lib/dpdk.c                               | 16 ++++++++++++++++
 lib/dpdk.h                               |  3 +++
 lib/netdev-dpdk.c                        | 19 +++++++++++++------
 vswitchd/vswitch.xml                     | 19 +++++++++++++++++++
 6 files changed, 81 insertions(+), 6 deletions(-)

Comments

Ilya Maximets Nov. 30, 2017, 11:24 a.m. UTC | #1
Thanks for the patches. Comments inline.

Best regards, Ilya Maximets.

On 30.11.2017 13:03, Mark Kavanagh wrote:
> DPDK v17.11 introduces support for the vHost IOMMU feature.
> This is a security feature, which restricts the vhost memory
> that a virtio device may access.
> 
> This feature also enables the vhost REPLY_ACK protocol, the
> implementation of which is known to work in newer versions of
> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
> v2.9.0, inclusive). As such, the feature is disabled by default
> in (and should remain so), for the aforementioned older QEMU
> verions. Starting with QEMU v2.9.1, vhost-iommu-support can
> safely be enabled, even without having an IOMMU device, with
> no performance penalty.
> 
> This patch adds a new global config option, vhost-iommu-support,
> that controls enablement of the vhost IOMMU feature:
> 
>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
> 
> Note that changing this value after guest devices have already been
> initialized will not toggle IOMMU support. To that end, if IOMMU
> support is required, this field should be set to true when setting
> other global parameters on init (such as "dpdk-socket-mem", for
> example).
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> 
> ---
> 
> v2->v1:
>     - rebase to HEAD of master
>     - refactor vHost IOMMU enablement mechanism (use a global
>       config option, instead of the previous per-port approach).
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 29 +++++++++++++++++++++++++++++
>  NEWS                                     |  1 +
>  lib/dpdk.c                               | 16 ++++++++++++++++
>  lib/dpdk.h                               |  3 +++
>  lib/netdev-dpdk.c                        | 19 +++++++++++++------
>  vswitchd/vswitch.xml                     | 19 +++++++++++++++++++
>  6 files changed, 81 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> index 5347995..814c50b 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -273,6 +273,35 @@ One benefit of using this mode is the ability for vHost ports to 'reconnect' in
>  event of the switch crashing or being brought down. Once it is brought back up,
>  the vHost ports will reconnect automatically and normal service will resume.
>  
> +vhost IOMMU Support
> +-------------------

We need to state here that this supported only by vhost-user-client ports.
Maybe you need to change section name to "vhost-user-client IOMMU support"?
And add some clues to the text below.

Also, as soon as we're describing how to start QEMU with vhost-user* ports
in this document it'll be nice to have section about how to add IOMMU device
to QEMU cmdline/libvirt xml.

> +
> +vhost IOMMU is a feature which restricts the vhost memory that a virtio device
> +can access, and as such is useful in deployments in which security is a concern.
> +
> +IOMMU support may be enabled via a global config value, ```vhost-iommu-support```.
> +Setting this to true enables vhost IOMMU support for all vhost ports when/where
> +available::
> +
> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
> +
> +.. important::
> +
> +    Changing this value after guest devices have already been initialized
> +    will not toggle vHost IOMMU support. To that end, if vHost IOMMU
> +    support is required, this field should be set to ```true```
> +    when setting other global parameters on init (such as ```dpdk-socket-mem```,
> +    for example).

As we can't actually configure this in runtime, we could just state that restart
is required on change. How about just this:
"The default is 'false'. Changing this value requires restarting the daemon."

> +
> +.. important::
> +
> +    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled by
> +    default (and should remain so if using the aforementioned versions of QEMU).
> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
> +    without having an IOMMU device, with no performance penalty.
> +
>  .. _dpdk-testpmd:
>  
>  DPDK in the Guest
> diff --git a/NEWS b/NEWS
> index 74e59bf..3e1a073 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -14,6 +14,7 @@ Post-v2.8.0
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>     - DPDK:
>       * Add support for DPDK v17.11
> +     * Add support for vHost IOMMU
>  
>  v2.8.0 - 31 Aug 2017
>  --------------------
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 8da6c32..c5120f7 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
>  
>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
>  
>  static int
>  process_vhost_flags(char *flag, const char *default_val, int size,
> @@ -312,6 +313,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>      int err = 0;
>      cpu_set_t cpuset;
>      char *sock_dir_subcomponent;
> +    char *enable_vhost_iommu;
>  
>      log_stream = fopencookie(NULL, "w+", dpdk_log_func);
>      if (log_stream == NULL) {
> @@ -345,6 +347,14 @@ dpdk_init__(const struct smap *ovs_other_config)
>          vhost_sock_dir = sock_dir_subcomponent;
>      }
>  
> +    if (process_vhost_flags("vhost-iommu-support", "false",
> +                            strlen("vhost-iommu-support"), ovs_other_config,
> +                            &enable_vhost_iommu)) {
> +        vhost_iommu_enabled = (strncmp(enable_vhost_iommu, "true",
> +                                        strlen("true")) == 0) ?
> +                               true : false;
> +    }
> +

We may replace above code with just something like this:
----------------------------------------------------------------
    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
                                        "vhost-iommu-support", false);
    VLOG_INFO("IOMMU support for vhost-user-client %s",
              vhost_iommu_enabled ? "enabled" : "disabled");
----------------------------------------------------------------

IHMO, it looks much better and properly handles upper/lowercase characters.

Beside that,
I guess, we should rename 'process_vhost_flags' somehow and use it only for
string parameters for which it was implemented. But we can do this in a separate
change someday.

>      argv = grow_argv(&argv, 0, 1);
>      argc = 1;
>      argv[0] = xstrdup(ovs_get_program_name());
> @@ -482,6 +492,12 @@ dpdk_get_vhost_sock_dir(void)
>      return vhost_sock_dir;
>  }
>  
> +bool
> +dpdk_vhost_iommu_enabled(void)
> +{
> +    return vhost_iommu_enabled;
> +}
> +

You also need to add dummy implementation of this function to lib/dpdk-stub.c

>  void
>  dpdk_set_lcore_id(unsigned cpu)
>  {
> diff --git a/lib/dpdk.h b/lib/dpdk.h
> index 673a1f1..83f0fac 100644
> --- a/lib/dpdk.h
> +++ b/lib/dpdk.h
> @@ -19,6 +19,8 @@
>  
>  #ifdef DPDK_NETDEV
>  
> +#include <stdbool.h>
> +

Looks like you need to include this above the '#ifdef DPDK_NETDEV'.

>  #include <rte_config.h>
>  #include <rte_lcore.h>
>  
> @@ -35,5 +37,6 @@ struct smap;
>  void dpdk_init(const struct smap *ovs_other_config);
>  void dpdk_set_lcore_id(unsigned cpu);
>  const char *dpdk_get_vhost_sock_dir(void);
> +bool dpdk_vhost_iommu_enabled(void);
>  
>  #endif /* dpdk.h */
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index f552444..e190e0c 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int err;
> +    uint64_t vhost_flags = 0;
>  
>      ovs_mutex_lock(&dev->mutex);
>  
> @@ -3263,19 +3264,25 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>       */
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>              && strlen(dev->vhost_id)) {
> -        /* Register client-mode device */
> -        err = rte_vhost_driver_register(dev->vhost_id,
> -                                        RTE_VHOST_USER_CLIENT);
> +        /* Register client-mode device. */
> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
> +
> +        /* Enable IOMMU support, if explicitly requested. */
> +        if (dpdk_vhost_iommu_enabled()) {
> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> +        }
> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>          if (err) {
>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>                       dev->vhost_id);
>              goto unlock;
>          } else {
>              /* Configuration successful */
> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
> +            dev->vhost_driver_flags |= vhost_flags;
>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
> -                      "using client socket '%s'",
> -                      dev->up.name, dev->vhost_id);
> +                      "using client socket '%s'. vHost IOMMU support is %s.",
> +                      dev->up.name, dev->vhost_id, dpdk_vhost_iommu_enabled() ?
> +                      "enabled" : "disabled");

We don't need to print IOMMU status here because we already printed it on sartup.
Also, this may confuse users, because they may start thinking that IOMMU is really
in use while QEMU is not configured for that.

>          }
>  
>          err = rte_vhost_driver_callback_register(dev->vhost_id,
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index c145e1a..d8e767b 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -344,6 +344,25 @@
>          </p>
>        </column>
>  
> +      <column name="other_config" key="vhost-iommu-support"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          vHost IOMMU is a  security feature, which restricts the vhost memory
> +          that a virtio device may access. vHost IOMMU support is disabled by
> +          default, due to a bug in QEMU implementations of the vhost REPLY_ACK
> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting this
> +          value to <code>true</code> enables vHost IOMMU support for vHost User
> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
> +        </p>
> +        <p>
> +          Changing this value after guest devices have already been initialized
> +          does not toggle vHost IOMMU support. To that end, if vHost IOMMU
> +          support is required, this field should be set to <code>true</code>
> +          when setting other global parameters on init (such as
> +          <ref column="other_config" key="dpdk-socket-mem"/>, for example).

Same as above. "Changing this value requires restarting the daemon." should be enough.

> +        </p>
> +      </column>
> +
>        <column name="other_config" key="n-handler-threads"
>                type='{"type": "integer", "minInteger": 1}'>
>          <p>
>
Mark Kavanagh Nov. 30, 2017, 11:48 a.m. UTC | #2
>From: Ilya Maximets [mailto:i.maximets@samsung.com]
>Sent: Thursday, November 30, 2017 11:24 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>Cc: ktraynor@redhat.com; maxime.coquelin@redhat.com;
>jan.scheurich@ericsson.com; Mooney, Sean K <sean.k.mooney@intel.com>; Stokes,
>Ian <ian.stokes@intel.com>
>Subject: Re: [ovs-dev][PATCH V2 2/2] netdev-dpdk: vHost IOMMU support
>
>Thanks for the patches. Comments inline.

Thanks for the quick turnaround Ilya - I'll incorporate your feedback into V3!

Best,
Mark

>
>Best regards, Ilya Maximets.
>
>On 30.11.2017 13:03, Mark Kavanagh wrote:
>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>> This is a security feature, which restricts the vhost memory
>> that a virtio device may access.
>>
>> This feature also enables the vhost REPLY_ACK protocol, the
>> implementation of which is known to work in newer versions of
>> QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
>> v2.9.0, inclusive). As such, the feature is disabled by default
>> in (and should remain so), for the aforementioned older QEMU
>> verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>> safely be enabled, even without having an IOMMU device, with
>> no performance penalty.
>>
>> This patch adds a new global config option, vhost-iommu-support,
>> that controls enablement of the vhost IOMMU feature:
>>
>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>
>> Note that changing this value after guest devices have already been
>> initialized will not toggle IOMMU support. To that end, if IOMMU
>> support is required, this field should be set to true when setting
>> other global parameters on init (such as "dpdk-socket-mem", for
>> example).
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>
>> ---
>>
>> v2->v1:
>>     - rebase to HEAD of master
>>     - refactor vHost IOMMU enablement mechanism (use a global
>>       config option, instead of the previous per-port approach).
>> ---
>>  Documentation/topics/dpdk/vhost-user.rst | 29 +++++++++++++++++++++++++++++
>>  NEWS                                     |  1 +
>>  lib/dpdk.c                               | 16 ++++++++++++++++
>>  lib/dpdk.h                               |  3 +++
>>  lib/netdev-dpdk.c                        | 19 +++++++++++++------
>>  vswitchd/vswitch.xml                     | 19 +++++++++++++++++++
>>  6 files changed, 81 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>b/Documentation/topics/dpdk/vhost-user.rst
>> index 5347995..814c50b 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -273,6 +273,35 @@ One benefit of using this mode is the ability for vHost
>ports to 'reconnect' in
>>  event of the switch crashing or being brought down. Once it is brought back
>up,
>>  the vHost ports will reconnect automatically and normal service will
>resume.
>>
>> +vhost IOMMU Support
>> +-------------------
>
>We need to state here that this supported only by vhost-user-client ports.
>Maybe you need to change section name to "vhost-user-client IOMMU support"?
>And add some clues to the text below.
>
>Also, as soon as we're describing how to start QEMU with vhost-user* ports
>in this document it'll be nice to have section about how to add IOMMU device
>to QEMU cmdline/libvirt xml.
>
>> +
>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>device
>> +can access, and as such is useful in deployments in which security is a
>concern.
>> +
>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>support```.
>> +Setting this to true enables vhost IOMMU support for all vhost ports
>when/where
>> +available::
>> +
>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>> +
>> +.. important::
>> +
>> +    Changing this value after guest devices have already been initialized
>> +    will not toggle vHost IOMMU support. To that end, if vHost IOMMU
>> +    support is required, this field should be set to ```true```
>> +    when setting other global parameters on init (such as ```dpdk-socket-
>mem```,
>> +    for example).
>
>As we can't actually configure this in runtime, we could just state that
>restart
>is required on change. How about just this:
>"The default is 'false'. Changing this value requires restarting the daemon."
>
>> +
>> +.. important::
>> +
>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>protocol;
>> +    this is known to work on QEMU v2.10.0, but is buggy on older versions
>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled
>by
>> +    default (and should remain so if using the aforementioned versions of
>QEMU).
>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled,
>even
>> +    without having an IOMMU device, with no performance penalty.
>> +
>>  .. _dpdk-testpmd:
>>
>>  DPDK in the Guest
>> diff --git a/NEWS b/NEWS
>> index 74e59bf..3e1a073 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -14,6 +14,7 @@ Post-v2.8.0
>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>     - DPDK:
>>       * Add support for DPDK v17.11
>> +     * Add support for vHost IOMMU
>>
>>  v2.8.0 - 31 Aug 2017
>>  --------------------
>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>> index 8da6c32..c5120f7 100644
>> --- a/lib/dpdk.c
>> +++ b/lib/dpdk.c
>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
>>
>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support
>*/
>>
>>  static int
>>  process_vhost_flags(char *flag, const char *default_val, int size,
>> @@ -312,6 +313,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>>      int err = 0;
>>      cpu_set_t cpuset;
>>      char *sock_dir_subcomponent;
>> +    char *enable_vhost_iommu;
>>
>>      log_stream = fopencookie(NULL, "w+", dpdk_log_func);
>>      if (log_stream == NULL) {
>> @@ -345,6 +347,14 @@ dpdk_init__(const struct smap *ovs_other_config)
>>          vhost_sock_dir = sock_dir_subcomponent;
>>      }
>>
>> +    if (process_vhost_flags("vhost-iommu-support", "false",
>> +                            strlen("vhost-iommu-support"),
>ovs_other_config,
>> +                            &enable_vhost_iommu)) {
>> +        vhost_iommu_enabled = (strncmp(enable_vhost_iommu, "true",
>> +                                        strlen("true")) == 0) ?
>> +                               true : false;
>> +    }
>> +
>
>We may replace above code with just something like this:
>----------------------------------------------------------------
>    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>                                        "vhost-iommu-support", false);
>    VLOG_INFO("IOMMU support for vhost-user-client %s",
>              vhost_iommu_enabled ? "enabled" : "disabled");
>----------------------------------------------------------------
>
>IHMO, it looks much better and properly handles upper/lowercase characters.
>
>Beside that,
>I guess, we should rename 'process_vhost_flags' somehow and use it only for
>string parameters for which it was implemented. But we can do this in a
>separate
>change someday.
>
>>      argv = grow_argv(&argv, 0, 1);
>>      argc = 1;
>>      argv[0] = xstrdup(ovs_get_program_name());
>> @@ -482,6 +492,12 @@ dpdk_get_vhost_sock_dir(void)
>>      return vhost_sock_dir;
>>  }
>>
>> +bool
>> +dpdk_vhost_iommu_enabled(void)
>> +{
>> +    return vhost_iommu_enabled;
>> +}
>> +
>
>You also need to add dummy implementation of this function to lib/dpdk-stub.c
>
>>  void
>>  dpdk_set_lcore_id(unsigned cpu)
>>  {
>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>> index 673a1f1..83f0fac 100644
>> --- a/lib/dpdk.h
>> +++ b/lib/dpdk.h
>> @@ -19,6 +19,8 @@
>>
>>  #ifdef DPDK_NETDEV
>>
>> +#include <stdbool.h>
>> +
>
>Looks like you need to include this above the '#ifdef DPDK_NETDEV'.
>
>>  #include <rte_config.h>
>>  #include <rte_lcore.h>
>>
>> @@ -35,5 +37,6 @@ struct smap;
>>  void dpdk_init(const struct smap *ovs_other_config);
>>  void dpdk_set_lcore_id(unsigned cpu);
>>  const char *dpdk_get_vhost_sock_dir(void);
>> +bool dpdk_vhost_iommu_enabled(void);
>>
>>  #endif /* dpdk.h */
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index f552444..e190e0c 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>*netdev)
>>  {
>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>      int err;
>> +    uint64_t vhost_flags = 0;
>>
>>      ovs_mutex_lock(&dev->mutex);
>>
>> @@ -3263,19 +3264,25 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>*netdev)
>>       */
>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>              && strlen(dev->vhost_id)) {
>> -        /* Register client-mode device */
>> -        err = rte_vhost_driver_register(dev->vhost_id,
>> -                                        RTE_VHOST_USER_CLIENT);
>> +        /* Register client-mode device. */
>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>> +
>> +        /* Enable IOMMU support, if explicitly requested. */
>> +        if (dpdk_vhost_iommu_enabled()) {
>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>> +        }
>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>          if (err) {
>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>                       dev->vhost_id);
>>              goto unlock;
>>          } else {
>>              /* Configuration successful */
>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>> +            dev->vhost_driver_flags |= vhost_flags;
>>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>> -                      "using client socket '%s'",
>> -                      dev->up.name, dev->vhost_id);
>> +                      "using client socket '%s'. vHost IOMMU support is
>%s.",
>> +                      dev->up.name, dev->vhost_id,
>dpdk_vhost_iommu_enabled() ?
>> +                      "enabled" : "disabled");
>
>We don't need to print IOMMU status here because we already printed it on
>sartup.
>Also, this may confuse users, because they may start thinking that IOMMU is
>really
>in use while QEMU is not configured for that.
>
>>          }
>>
>>          err = rte_vhost_driver_callback_register(dev->vhost_id,
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index c145e1a..d8e767b 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -344,6 +344,25 @@
>>          </p>
>>        </column>
>>
>> +      <column name="other_config" key="vhost-iommu-support"
>> +              type='{"type": "boolean"}'>
>> +        <p>
>> +          vHost IOMMU is a  security feature, which restricts the vhost
>memory
>> +          that a virtio device may access. vHost IOMMU support is disabled
>by
>> +          default, due to a bug in QEMU implementations of the vhost
>REPLY_ACK
>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting
>this
>> +          value to <code>true</code> enables vHost IOMMU support for vHost
>User
>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>> +        </p>
>> +        <p>
>> +          Changing this value after guest devices have already been
>initialized
>> +          does not toggle vHost IOMMU support. To that end, if vHost IOMMU
>> +          support is required, this field should be set to
><code>true</code>
>> +          when setting other global parameters on init (such as
>> +          <ref column="other_config" key="dpdk-socket-mem"/>, for example).
>
>Same as above. "Changing this value requires restarting the daemon." should be
>enough.
>
>> +        </p>
>> +      </column>
>> +
>>        <column name="other_config" key="n-handler-threads"
>>                type='{"type": "integer", "minInteger": 1}'>
>>          <p>
>>
diff mbox series

Patch

diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
index 5347995..814c50b 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -273,6 +273,35 @@  One benefit of using this mode is the ability for vHost ports to 'reconnect' in
 event of the switch crashing or being brought down. Once it is brought back up,
 the vHost ports will reconnect automatically and normal service will resume.
 
+vhost IOMMU Support
+-------------------
+
+vhost IOMMU is a feature which restricts the vhost memory that a virtio device
+can access, and as such is useful in deployments in which security is a concern.
+
+IOMMU support may be enabled via a global config value, ```vhost-iommu-support```.
+Setting this to true enables vhost IOMMU support for all vhost ports when/where
+available::
+
+    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
+
+.. important::
+
+    Changing this value after guest devices have already been initialized
+    will not toggle vHost IOMMU support. To that end, if vHost IOMMU
+    support is required, this field should be set to ```true```
+    when setting other global parameters on init (such as ```dpdk-socket-mem```,
+    for example).
+
+.. important::
+
+    Enabling the IOMMU feature also enables the vhost user reply-ack protocol;
+    this is known to work on QEMU v2.10.0, but is buggy on older versions
+    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is disabled by
+    default (and should remain so if using the aforementioned versions of QEMU).
+    Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even
+    without having an IOMMU device, with no performance penalty.
+
 .. _dpdk-testpmd:
 
 DPDK in the Guest
diff --git a/NEWS b/NEWS
index 74e59bf..3e1a073 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,7 @@  Post-v2.8.0
      * Add support for compiling OVS with the latest Linux 4.13 kernel
    - DPDK:
      * Add support for DPDK v17.11
+     * Add support for vHost IOMMU
 
 v2.8.0 - 31 Aug 2017
 --------------------
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 8da6c32..c5120f7 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -41,6 +41,7 @@  VLOG_DEFINE_THIS_MODULE(dpdk);
 static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
 
 static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
+static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
 
 static int
 process_vhost_flags(char *flag, const char *default_val, int size,
@@ -312,6 +313,7 @@  dpdk_init__(const struct smap *ovs_other_config)
     int err = 0;
     cpu_set_t cpuset;
     char *sock_dir_subcomponent;
+    char *enable_vhost_iommu;
 
     log_stream = fopencookie(NULL, "w+", dpdk_log_func);
     if (log_stream == NULL) {
@@ -345,6 +347,14 @@  dpdk_init__(const struct smap *ovs_other_config)
         vhost_sock_dir = sock_dir_subcomponent;
     }
 
+    if (process_vhost_flags("vhost-iommu-support", "false",
+                            strlen("vhost-iommu-support"), ovs_other_config,
+                            &enable_vhost_iommu)) {
+        vhost_iommu_enabled = (strncmp(enable_vhost_iommu, "true",
+                                        strlen("true")) == 0) ?
+                               true : false;
+    }
+
     argv = grow_argv(&argv, 0, 1);
     argc = 1;
     argv[0] = xstrdup(ovs_get_program_name());
@@ -482,6 +492,12 @@  dpdk_get_vhost_sock_dir(void)
     return vhost_sock_dir;
 }
 
+bool
+dpdk_vhost_iommu_enabled(void)
+{
+    return vhost_iommu_enabled;
+}
+
 void
 dpdk_set_lcore_id(unsigned cpu)
 {
diff --git a/lib/dpdk.h b/lib/dpdk.h
index 673a1f1..83f0fac 100644
--- a/lib/dpdk.h
+++ b/lib/dpdk.h
@@ -19,6 +19,8 @@ 
 
 #ifdef DPDK_NETDEV
 
+#include <stdbool.h>
+
 #include <rte_config.h>
 #include <rte_lcore.h>
 
@@ -35,5 +37,6 @@  struct smap;
 void dpdk_init(const struct smap *ovs_other_config);
 void dpdk_set_lcore_id(unsigned cpu);
 const char *dpdk_get_vhost_sock_dir(void);
+bool dpdk_vhost_iommu_enabled(void);
 
 #endif /* dpdk.h */
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f552444..e190e0c 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -3253,6 +3253,7 @@  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int err;
+    uint64_t vhost_flags = 0;
 
     ovs_mutex_lock(&dev->mutex);
 
@@ -3263,19 +3264,25 @@  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
      */
     if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
             && strlen(dev->vhost_id)) {
-        /* Register client-mode device */
-        err = rte_vhost_driver_register(dev->vhost_id,
-                                        RTE_VHOST_USER_CLIENT);
+        /* Register client-mode device. */
+        vhost_flags |= RTE_VHOST_USER_CLIENT;
+
+        /* Enable IOMMU support, if explicitly requested. */
+        if (dpdk_vhost_iommu_enabled()) {
+            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
+        }
+        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
         if (err) {
             VLOG_ERR("vhost-user device setup failure for device %s\n",
                      dev->vhost_id);
             goto unlock;
         } else {
             /* Configuration successful */
-            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
+            dev->vhost_driver_flags |= vhost_flags;
             VLOG_INFO("vHost User device '%s' created in 'client' mode, "
-                      "using client socket '%s'",
-                      dev->up.name, dev->vhost_id);
+                      "using client socket '%s'. vHost IOMMU support is %s.",
+                      dev->up.name, dev->vhost_id, dpdk_vhost_iommu_enabled() ?
+                      "enabled" : "disabled");
         }
 
         err = rte_vhost_driver_callback_register(dev->vhost_id,
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index c145e1a..d8e767b 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -344,6 +344,25 @@ 
         </p>
       </column>
 
+      <column name="other_config" key="vhost-iommu-support"
+              type='{"type": "boolean"}'>
+        <p>
+          vHost IOMMU is a  security feature, which restricts the vhost memory
+          that a virtio device may access. vHost IOMMU support is disabled by
+          default, due to a bug in QEMU implementations of the vhost REPLY_ACK
+          protocol, (on which vHost IOMMU relies) prior to v2.9.1. Setting this
+          value to <code>true</code> enables vHost IOMMU support for vHost User
+          Client ports in OvS-DPDK, starting from DPDK v17.11.
+        </p>
+        <p>
+          Changing this value after guest devices have already been initialized
+          does not toggle vHost IOMMU support. To that end, if vHost IOMMU
+          support is required, this field should be set to <code>true</code>
+          when setting other global parameters on init (such as
+          <ref column="other_config" key="dpdk-socket-mem"/>, for example).
+        </p>
+      </column>
+
       <column name="other_config" key="n-handler-threads"
               type='{"type": "integer", "minInteger": 1}'>
         <p>