diff mbox

[ovs-dev,2/2] netdev-dpdk: Enable optional dequeue zero copy for vHost User

Message ID 1503659923-24322-3-git-send-email-ciara.loftus@intel.com
State Superseded
Headers show

Commit Message

Ciara Loftus Aug. 25, 2017, 11:18 a.m. UTC
Enabled per port like so:
ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=true

Zero Copy can only be enabled/disabled when a vHost port is down.

When packets from a vHost device with zero copy enabled are destined for
a 'dpdk' port, the number of tx descriptors on that 'dpdk' port must be
set to a smaller value. 128 is recommended. This can be achieved like
so:

ovs-vsctl set Interface dpdkport options:n_txq_desc=128

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 Documentation/topics/dpdk/vhost-user.rst | 24 +++++++++
 NEWS                                     |  4 +-
 lib/netdev-dpdk.c                        | 89 +++++++++++++++++++++++++++++++-
 vswitchd/vswitch.xml                     | 11 ++++
 4 files changed, 126 insertions(+), 2 deletions(-)

Comments

Stokes, Ian Sept. 13, 2017, 8:30 a.m. UTC | #1
> Enabled per port like so:
> ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=true
> 
> Zero Copy can only be enabled/disabled when a vHost port is down.
> 
> When packets from a vHost device with zero copy enabled are destined for a
> 'dpdk' port, the number of tx descriptors on that 'dpdk' port must be set
> to a smaller value. 128 is recommended. This can be achieved like
> so:
> 
> ovs-vsctl set Interface dpdkport options:n_txq_desc=128

Thanks for working on this Ciara, very useful feature. I'm still testing a few cases but have some initial first pass feedback below.

A few comments below
 
> 
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 24 +++++++++
>  NEWS                                     |  4 +-
>  lib/netdev-dpdk.c                        | 89
> +++++++++++++++++++++++++++++++-
>  vswitchd/vswitch.xml                     | 11 ++++
>  4 files changed, 126 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst
> b/Documentation/topics/dpdk/vhost-user.rst
> index a3d5de3..937d83a 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -408,3 +408,27 @@ Sample XML
>      </domain>
> 
>  .. _QEMU documentation: http://git.qemu-
> project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-
> user.txt;h=7890d7169;hb=HEAD
> +
> +vhost-user Dequeue Zero Copy
> +-------------------------------------
> +
> +Normally when dequeuing a packet from a vHost User device, a memcpy
> +operation must be used to copy that packet from guest address space to
> +host address space. This memcpy can be removed by enabling dequeue zero-
> copy like so:

Although it's implied by the argument options:dq-zero-copy=true, I think it would be good to explicitly say that by default the feature is disabled in the above paragraph.

> +
> +    $ ovs-vsctl set Interface dpdkvhostuserclient0
> + options:dq-zero-copy=true
> +
> +With this feature enabled, a reference (pointer) to the packet is
> +passed to the host, instead of a copy of the packet. Removing this
> +memcpy can give a performance improvement for some use cases, for
> +example switching large packets between different VMs.
> +
> +A limitation exists whereby if packets from a vHost port with
> +dq-zero-copy=true are destined for a 'dpdk' type port, the number of tx
> +descriptors (n_txq_desc) for that port must be reduced to a smaller
> +number, 128 being the recommended value. This can be achieved by issuing
> the following command:
> +
> +    $ ovs-vsctl set Interface dpdkport options:n_txq_desc=128
> +
> +Further information can be found in the `DPDK documentation
> +<http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html#vhost-a
> +pi-overview>`__

In addition to the options explanation above, I wonder would an example under either the DPDK vhostuser section or the PHY-VM-PHY example be beneficial to a user to show how the limitation above would be dealt with in a deployment? As this is an addition to a key feature I think people would be interested in seeing how to deploy it in a basic configuration where it interacts with both vhostuser and DPDK phy ports. 

Check patch utility throws up the following for the URL above

WARNING: Line length is >79-characters long
#52 FILE: Documentation/topics/dpdk/vhost-user.rst:434:
`DPDK documentation <http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html#vhost-api-overview>`__

Could be worth breaking the URL down to avoid the error. 

> diff --git a/NEWS b/NEWS
> index cbcfd8e..750e08a 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,6 +1,8 @@
>  Post-v2.8.0
>  --------------------
> -   - Nothing yet.
> +   - DPDK:
> +     * Optional dequeue zero copy feature for vHost ports enabled per
> port
> +       via the boolean 'dq-zero-copy' option.
> 
>  v2.8.0 - xx xxx xxxx
>  ---------------------
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index ade813b..b9c7fc4
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -381,6 +381,9 @@ struct netdev_dpdk {
>      /* True if vHost device is 'up' and has been reconfigured at least
> once */
>      bool vhost_reconfigured;
> 
> +    /* True if dq-zero-copy feature has successfully been enabled */
> +    bool zc_enabled;
> +

Will zero copy be enabled for enqueue at some stage in the future? If so, just wondering would it make sense to change the bool name to 'zc_dq_enabled' or is it the case in the future both enqueue and dequeue are enabled together?

>      /* Identifier used to distinguish vhost devices from each other. */
>      char vhost_id[PATH_MAX];
> 
> @@ -879,6 +882,7 @@ common_construct(struct netdev *netdev, dpdk_port_t
> port_no,
>      dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
>      ovsrcu_index_init(&dev->vid, -1);
>      dev->vhost_reconfigured = false;
> +    dev->zc_enabled = false;
>      dev->attached = false;
> 
>      ovsrcu_init(&dev->qos_conf, NULL);
> @@ -1387,6 +1391,29 @@ netdev_dpdk_ring_set_config(struct netdev *netdev,
> const struct smap *args,
>      return 0;
>  }
> 
> +static void
> +dpdk_vhost_set_config_helper(struct netdev_dpdk *dev,
> +                             const struct smap *args) {
> +    bool needs_reconfigure = false;
> +    bool zc_requested = smap_get_bool(args, "dq-zero-copy", false);
> +
> +    if (zc_requested &&
> +            !(dev->vhost_driver_flags &
> RTE_VHOST_USER_DEQUEUE_ZERO_COPY)) {
> +        dev->vhost_driver_flags |= RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> +        needs_reconfigure = true;
> +    } else if (!zc_requested &&
> +            (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY))
> {
> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> +        needs_reconfigure = true;
> +    }
> +
> +    /* Only try to change ZC mode when device is down */
> +    if (needs_reconfigure && (netdev_dpdk_get_vid(dev) == -1)) {
> +        netdev_request_reconfigure(&dev->up);
> +    }
> +}
> +
>  static int
>  netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>                                      const struct smap *args, @@ -1403,6
> +1430,23 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>              netdev_request_reconfigure(netdev);
>          }
>      }
> +
> +    dpdk_vhost_set_config_helper(dev, args);
> +
> +    ovs_mutex_unlock(&dev->mutex);
> +
> +    return 0;
> +}
> +
> +static int
> +netdev_dpdk_vhost_set_config(struct netdev *netdev,
> +                             const struct smap *args,
> +                             char **errp OVS_UNUSED) {
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +
> +    ovs_mutex_lock(&dev->mutex);
> +    dpdk_vhost_set_config_helper(dev, args);

I had assumed to this point that zero copy was specific to vhost user client? But it seems that it works for both vhost user and vhost user client, if so I'd like to see that called out in the docs (even though vhost user ports are depreciated, I think it's good to have a comment regarding ZC support if it exists). 
>      ovs_mutex_unlock(&dev->mutex);
> 
>      return 0;
> @@ -2717,6 +2761,46 @@ netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
>      }
>  }
> 
> +static void
> +vhost_change_zero_copy_mode(struct netdev_dpdk *dev, bool client_mode,
> +                            bool enable) {
> +    int err = rte_vhost_driver_unregister(dev->vhost_id);
> +
> +    if (err) {
> +        VLOG_ERR("Error unregistering vHost socket %s; can't change zero
> copy "
> +                "mode", dev->vhost_id);
> +    } else {
> +        err = dpdk_setup_vhost_device(dev, client_mode);
> +        if (err) {
> +            VLOG_ERR("Error changing zero copy mode for vHost socket %s",
> +                    dev->vhost_id);
> +        } else if (enable) {
> +            dev->zc_enabled = true;
> +            VLOG_INFO("Zero copy enabled for vHost socket %s", dev-
> >vhost_id);
> +        } else {
> +            dev->zc_enabled = false;
> +            VLOG_INFO("Zero copy disabled for vHost socket %s", dev-
> >vhost_id);
> +        }
> +    }
> +}
> +
> +static void
> +vhost_check_zero_copy_status(struct netdev_dpdk *dev) {
> +    bool mode = dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT;
> +
> +    if ((dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY)
> +                && !dev->zc_enabled) {
> +        /* ZC disabled but requested to be enabled, enable it. */
> +        vhost_change_zero_copy_mode(dev, mode, true);
> +    } else if (!(dev->vhost_driver_flags &
> +            RTE_VHOST_USER_DEQUEUE_ZERO_COPY) && dev->zc_enabled) {
> +        /* ZC enabled but requested to be disabled, disable it. */
> +        vhost_change_zero_copy_mode(dev, mode, false);
> +    }
> +}
> +
>  /*
>   * Remove a virtio-net device from the specific vhost port.  Use dev-
> >remove
>   * flag to stop any more packets from being sent or received to/from a VM
> and @@ -2762,6 +2846,7 @@ destroy_device(int vid)
>           */
>          ovsrcu_quiesce_start();
>          VLOG_INFO("vHost Device '%s' has been removed", ifname);
> +        netdev_request_reconfigure(&dev->up);
>      } else {
>          VLOG_INFO("vHost Device '%s' not found", ifname);
>      }
> @@ -3253,6 +3338,8 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk
> *dev)
> 
>      if (netdev_dpdk_get_vid(dev) >= 0) {
>          dev->vhost_reconfigured = true;
> +    } else {
> +        vhost_check_zero_copy_status(dev);
>      }
> 
>      return 0;
> @@ -3414,7 +3501,7 @@ static const struct netdev_class dpdk_vhost_class =
>          NULL,
>          netdev_dpdk_vhost_construct,
>          netdev_dpdk_vhost_destruct,
> -        NULL,
> +        netdev_dpdk_vhost_set_config,
>          NULL,
>          netdev_dpdk_vhost_send,
>          netdev_dpdk_vhost_get_carrier,
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index
> 074535b..660b46f 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2621,6 +2621,17 @@
>          </p>
>        </column>
> 
> +      <column name="options" key="dq-zero-copy"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          The value specifies whether or not to enable dequeue zero copy
> on
> +          the given interface.
> +          The port must be in an inactive state in order to enable or
> disable
> +          this feature.

Should the port being inactive be mentioned in the documentation as well?

> +          Only supported by dpdkvhostuserclient and dpdkvhostuser
> interfaces.
> +        </p>
> +      </column>
> +
>        <column name="options" key="n_rxq_desc"
>                type='{"type": "integer", "minInteger": 1, "maxInteger":
> 4096}'>
>          <p>
> --
> 2.7.5
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff mbox

Patch

diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
index a3d5de3..937d83a 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -408,3 +408,27 @@  Sample XML
     </domain>
 
 .. _QEMU documentation: http://git.qemu-project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-user.txt;h=7890d7169;hb=HEAD
+
+vhost-user Dequeue Zero Copy
+-------------------------------------
+
+Normally when dequeuing a packet from a vHost User device, a memcpy operation
+must be used to copy that packet from guest address space to host address
+space. This memcpy can be removed by enabling dequeue zero-copy like so:
+
+    $ ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=true
+
+With this feature enabled, a reference (pointer) to the packet is passed to
+the host, instead of a copy of the packet. Removing this memcpy can give a
+performance improvement for some use cases, for example switching large packets
+between different VMs.
+
+A limitation exists whereby if packets from a vHost port with dq-zero-copy=true
+are destined for a 'dpdk' type port, the number of tx descriptors (n_txq_desc)
+for that port must be reduced to a smaller number, 128 being the recommended
+value. This can be achieved by issuing the following command:
+
+    $ ovs-vsctl set Interface dpdkport options:n_txq_desc=128
+
+Further information can be found in the
+`DPDK documentation <http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html#vhost-api-overview>`__
diff --git a/NEWS b/NEWS
index cbcfd8e..750e08a 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,8 @@ 
 Post-v2.8.0
 --------------------
-   - Nothing yet.
+   - DPDK:
+     * Optional dequeue zero copy feature for vHost ports enabled per port
+       via the boolean 'dq-zero-copy' option.
 
 v2.8.0 - xx xxx xxxx
 ---------------------
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ade813b..b9c7fc4 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -381,6 +381,9 @@  struct netdev_dpdk {
     /* True if vHost device is 'up' and has been reconfigured at least once */
     bool vhost_reconfigured;
 
+    /* True if dq-zero-copy feature has successfully been enabled */
+    bool zc_enabled;
+
     /* Identifier used to distinguish vhost devices from each other. */
     char vhost_id[PATH_MAX];
 
@@ -879,6 +882,7 @@  common_construct(struct netdev *netdev, dpdk_port_t port_no,
     dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
     ovsrcu_index_init(&dev->vid, -1);
     dev->vhost_reconfigured = false;
+    dev->zc_enabled = false;
     dev->attached = false;
 
     ovsrcu_init(&dev->qos_conf, NULL);
@@ -1387,6 +1391,29 @@  netdev_dpdk_ring_set_config(struct netdev *netdev, const struct smap *args,
     return 0;
 }
 
+static void
+dpdk_vhost_set_config_helper(struct netdev_dpdk *dev,
+                             const struct smap *args)
+{
+    bool needs_reconfigure = false;
+    bool zc_requested = smap_get_bool(args, "dq-zero-copy", false);
+
+    if (zc_requested &&
+            !(dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY)) {
+        dev->vhost_driver_flags |= RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
+        needs_reconfigure = true;
+    } else if (!zc_requested &&
+            (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY)) {
+        dev->vhost_driver_flags &= ~RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
+        needs_reconfigure = true;
+    }
+
+    /* Only try to change ZC mode when device is down */
+    if (needs_reconfigure && (netdev_dpdk_get_vid(dev) == -1)) {
+        netdev_request_reconfigure(&dev->up);
+    }
+}
+
 static int
 netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
                                     const struct smap *args,
@@ -1403,6 +1430,23 @@  netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
             netdev_request_reconfigure(netdev);
         }
     }
+
+    dpdk_vhost_set_config_helper(dev, args);
+
+    ovs_mutex_unlock(&dev->mutex);
+
+    return 0;
+}
+
+static int
+netdev_dpdk_vhost_set_config(struct netdev *netdev,
+                             const struct smap *args,
+                             char **errp OVS_UNUSED)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    ovs_mutex_lock(&dev->mutex);
+    dpdk_vhost_set_config_helper(dev, args);
     ovs_mutex_unlock(&dev->mutex);
 
     return 0;
@@ -2717,6 +2761,46 @@  netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
     }
 }
 
+static void
+vhost_change_zero_copy_mode(struct netdev_dpdk *dev, bool client_mode,
+                            bool enable)
+{
+    int err = rte_vhost_driver_unregister(dev->vhost_id);
+
+    if (err) {
+        VLOG_ERR("Error unregistering vHost socket %s; can't change zero copy "
+                "mode", dev->vhost_id);
+    } else {
+        err = dpdk_setup_vhost_device(dev, client_mode);
+        if (err) {
+            VLOG_ERR("Error changing zero copy mode for vHost socket %s",
+                    dev->vhost_id);
+        } else if (enable) {
+            dev->zc_enabled = true;
+            VLOG_INFO("Zero copy enabled for vHost socket %s", dev->vhost_id);
+        } else {
+            dev->zc_enabled = false;
+            VLOG_INFO("Zero copy disabled for vHost socket %s", dev->vhost_id);
+        }
+    }
+}
+
+static void
+vhost_check_zero_copy_status(struct netdev_dpdk *dev)
+{
+    bool mode = dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT;
+
+    if ((dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY)
+                && !dev->zc_enabled) {
+        /* ZC disabled but requested to be enabled, enable it. */
+        vhost_change_zero_copy_mode(dev, mode, true);
+    } else if (!(dev->vhost_driver_flags &
+            RTE_VHOST_USER_DEQUEUE_ZERO_COPY) && dev->zc_enabled) {
+        /* ZC enabled but requested to be disabled, disable it. */
+        vhost_change_zero_copy_mode(dev, mode, false);
+    }
+}
+
 /*
  * Remove a virtio-net device from the specific vhost port.  Use dev->remove
  * flag to stop any more packets from being sent or received to/from a VM and
@@ -2762,6 +2846,7 @@  destroy_device(int vid)
          */
         ovsrcu_quiesce_start();
         VLOG_INFO("vHost Device '%s' has been removed", ifname);
+        netdev_request_reconfigure(&dev->up);
     } else {
         VLOG_INFO("vHost Device '%s' not found", ifname);
     }
@@ -3253,6 +3338,8 @@  dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
 
     if (netdev_dpdk_get_vid(dev) >= 0) {
         dev->vhost_reconfigured = true;
+    } else {
+        vhost_check_zero_copy_status(dev);
     }
 
     return 0;
@@ -3414,7 +3501,7 @@  static const struct netdev_class dpdk_vhost_class =
         NULL,
         netdev_dpdk_vhost_construct,
         netdev_dpdk_vhost_destruct,
-        NULL,
+        netdev_dpdk_vhost_set_config,
         NULL,
         netdev_dpdk_vhost_send,
         netdev_dpdk_vhost_get_carrier,
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 074535b..660b46f 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -2621,6 +2621,17 @@ 
         </p>
       </column>
 
+      <column name="options" key="dq-zero-copy"
+              type='{"type": "boolean"}'>
+        <p>
+          The value specifies whether or not to enable dequeue zero copy on
+          the given interface.
+          The port must be in an inactive state in order to enable or disable
+          this feature.
+          Only supported by dpdkvhostuserclient and dpdkvhostuser interfaces.
+        </p>
+      </column>
+
       <column name="options" key="n_rxq_desc"
               type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
         <p>