
[ovs-dev,RFC] netdev-dpdk: Integrate vHost User PMD

Message ID 1526917453-17997-1-git-send-email-ciara.loftus@intel.com
State Rejected
Delegated to: Ian Stokes
Series [ovs-dev,RFC] netdev-dpdk: Integrate vHost User PMD

Commit Message

Ciara Loftus May 21, 2018, 3:44 p.m. UTC
The vHost PMD brings vHost User port types ('dpdkvhostuser' and
'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
calls to DPDK's librte_vhost library are removed and replaced with
librte_ether API calls, for which most of the infrastructure is already
in place.

This change has a number of benefits, including:
* Reduced codebase (~200 LOC removed)
* More features automatically enabled for vHost ports, e.g. custom stats
  and additional get_status information.
* OVS can remain ignorant of changes in the librte_vhost API between DPDK
  releases, potentially making upgrades easier and the OVS codebase less
  susceptible to change.

The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS which is
set and can be modified in the DPDK configuration. Prior to this patch
this only applied to 'dpdk' and 'dpdkr' ports, but now applies to all
DPDK port types including vHost User.

Performance (pps) of the different topologies p2p, pvp, pvvp and vv has
been measured to remain within a +/- 5% margin of existing performance.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
To function correctly, this patch requires the following patches to be
applied to DPDK and OVS respectively:
1. http://dpdk.org/dev/patchwork/patch/39315/ (due to be backported to
DPDK 17.11.3)
2. https://patchwork.ozlabs.org/patch/914653/
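
For reviewers unfamiliar with the vhost PMD: instead of registering the
socket with librte_vhost directly, the port is hot-plugged through the
librte_ether API by passing a 'net_vhost' devargs string, after which it
behaves like any other ethdev port. A minimal sketch of the idea is below;
the helper name, buffer size and queue count are illustrative only, and the
real devargs string (including the zero-copy and IOMMU options) is built in
dpdk_attach_vhost_pmd() in the diff.

    #include <stdio.h>
    #include <rte_ethdev.h>

    /* Illustrative helper: expose a vhost-user socket as a DPDK ethdev. */
    static int
    attach_vhost_port(const char *sock_path, unsigned int driver_id,
                      int client_mode, uint16_t *port_id)
    {
        char devargs[256];

        snprintf(devargs, sizeof devargs,
                 "net_vhost%u,iface=%s,queues=%d,client=%d",
                 driver_id, sock_path, 1, client_mode);

        /* On success the new port follows the usual librte_ether lifecycle:
         * rte_eth_dev_configure(), queue setup, rte_eth_dev_start(), and
         * rte_eth_rx_burst()/rte_eth_tx_burst() in the datapath. */
        return rte_eth_dev_attach(devargs, port_id);
    }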

 NEWS                 |   3 +
 lib/dpdk.c           |  11 +
 lib/dpdk.h           |   1 +
 lib/netdev-dpdk.c    | 920 ++++++++++++++++++++-------------------------------
 tests/system-dpdk.at |   1 +
 5 files changed, 372 insertions(+), 564 deletions(-)

Comments

Flavio Leitner May 24, 2018, 8:35 p.m. UTC | #1
On Mon, May 21, 2018 at 04:44:13PM +0100, Ciara Loftus wrote:
> The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> calls to DPDK's librte_vhost library are removed and replaced with
> librte_ether API calls, for which most of the infrastructure is already
> in place.
> 
> This change has a number of benefits, including:
> * Reduced codebase (~200LOC removed)
> * More features automatically enabled for vHost ports eg. custom stats
>   and additional get_status information.
> * OVS can be ignorant to changes in the librte_vhost API between DPDK
>   releases potentially making upgrades easier and the OVS codebase less
>   susceptible to change.
> 
> The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS which is
> set and can be modified in the DPDK configuration. Prior to this patch
> this only applied to 'dpdk' and 'dpdkr' ports, but now applies to all
> DPDK port types including vHost User.
> 
> Performance (pps) of the different topologies p2p, pvp, pvvp and vv has
> been measured to remain within a +/- 5% margin of existing performance.

Thanks for putting this together.

I think when this idea was discussed, at least in my head, we would
pretty much kill any vhost-specific info and use a standard eth API
instead.  However, that doesn't look to be the case: we still have the
MTU and queue issues, the special construct/destruct, send, etc., which
IMHO defeats the initial goal.

Leaving that aside for a moment, I also wonder about the limitations
imposed if we switch to the eth API, that is, things we can do today
because OVS is managing vhost that we won't be able to do after the
API switch.

Thanks,
fbl
Stokes, Ian May 29, 2018, 4:14 p.m. UTC | #2
> The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> calls to DPDK's librte_vhost library are removed and replaced with
> librte_ether API calls, for which most of the infrastructure is already
> in place.
> 
> This change has a number of benefits, including:
> * Reduced codebase (~200LOC removed)
> * More features automatically enabled for vHost ports eg. custom stats
>   and additional get_status information.
> * OVS can be ignorant to changes in the librte_vhost API between DPDK
>   releases potentially making upgrades easier and the OVS codebase less
>   susceptible to change.
> 
> The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS which is
> set and can be modified in the DPDK configuration. Prior to this patch
> this only applied to 'dpdk' and 'dpdkr' ports, but now applies to all
> DPDK port types including vHost User.
> 
> Performance (pps) of the different topologies p2p, pvp, pvvp and vv has
> been measured to remain within a +/- 5% margin of existing performance.
> 

Thanks for working on this, Ciara. I'm aware of a separate thread from Flavio, which I will respond to as well.

For the purposes of code review, and in case this progresses to a v2, I've included comments below.

Also, there are a few coding standard errors in the patch. I haven't flagged them here, but a run of the checkpatch utility will identify them for the next version.

Thanks
Ian

> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
> To function correctly, this patch requires the following patches to be
> applied to DPDK and OVS respectively:
> 1. http://dpdk.org/dev/patchwork/patch/39315/ (due to be backported to
> DPDK 17.11.3)
> 2. https://patchwork.ozlabs.org/patch/914653/
> 
>  NEWS                 |   3 +
>  lib/dpdk.c           |  11 +
>  lib/dpdk.h           |   1 +
>  lib/netdev-dpdk.c    | 920 ++++++++++++++++++++-------------------------------
>  tests/system-dpdk.at |   1 +
>  5 files changed, 372 insertions(+), 564 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index ec548b0..55dc513 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -30,6 +30,9 @@ Post-v2.9.0
>       * New 'check-dpdk' Makefile target to run a new system testsuite.
>         See Testing topic for the details.
>       * Add LSC interrupt support for DPDK physical devices.
> +     * Use DPDK's vHost PMD instead of direct library calls. This means the
> +       maximum number of vHost ports is equal to RTE_MAX_ETHPORTS as defined
> +       in the DPDK configuration.
>     - Userspace datapath:
>       * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a single PMD
>       * Detailed PMD performance metrics available with new command
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 00dd974..6cfc6fc 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -22,6 +22,7 @@
>  #include <sys/stat.h>
>  #include <getopt.h>
> 
> +#include <rte_ethdev.h>
>  #include <rte_log.h>
>  #include <rte_memzone.h>
>  #include <rte_version.h>
> @@ -32,6 +33,7 @@
> 
>  #include "dirs.h"
>  #include "fatal-signal.h"
> +#include "id-pool.h"
>  #include "netdev-dpdk.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
> @@ -43,6 +45,7 @@ static FILE *log_stream = NULL;       /* Stream for DPDK
> log redirection */
> 
>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
>  static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
> +static struct id_pool *vhost_driver_ids;  /* Pool of IDs for vHost PMDs */
> 
>  static int
>  process_vhost_flags(char *flag, const char *default_val, int size,
> @@ -457,6 +460,8 @@ dpdk_init__(const struct smap *ovs_other_config)
>      }
>  #endif
> 
> +    vhost_driver_ids = id_pool_create(0, RTE_MAX_ETHPORTS);
> +
>      /* Finally, register the dpdk classes */
>      netdev_dpdk_register();
>  }
> @@ -498,6 +503,12 @@ dpdk_vhost_iommu_enabled(void)
>      return vhost_iommu_enabled;
>  }
> 
> +struct id_pool *
> +dpdk_get_vhost_id_pool(void)
> +{
> +    return vhost_driver_ids;
> +}
> +
>  void
>  dpdk_set_lcore_id(unsigned cpu)
>  {
> diff --git a/lib/dpdk.h b/lib/dpdk.h
> index b041535..c7143f7 100644
> --- a/lib/dpdk.h
> +++ b/lib/dpdk.h
> @@ -39,5 +39,6 @@ void dpdk_set_lcore_id(unsigned cpu);
>  const char *dpdk_get_vhost_sock_dir(void);
>  bool dpdk_vhost_iommu_enabled(void);
>  void print_dpdk_version(void);
> +struct id_pool *dpdk_get_vhost_id_pool(void);
> 
>  #endif /* dpdk.h */
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index afddf6d..defc51d 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -31,6 +31,7 @@
>  #include <rte_cycles.h>
>  #include <rte_errno.h>
>  #include <rte_eth_ring.h>
> +#include <rte_eth_vhost.h>
>  #include <rte_ethdev.h>
>  #include <rte_malloc.h>
>  #include <rte_mbuf.h>
> @@ -44,6 +45,7 @@
>  #include "dpdk.h"
>  #include "dpif-netdev.h"
>  #include "fatal-signal.h"
> +#include "id-pool.h"
>  #include "netdev-provider.h"
>  #include "netdev-vport.h"
>  #include "odp-util.h"
> @@ -63,6 +65,7 @@
>  #include "unixctl.h"
> 
>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> +enum {VHOST_SERVER_MODE, VHOST_CLIENT_MODE};
> 
>  VLOG_DEFINE_THIS_MODULE(netdev_dpdk);
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
> @@ -122,6 +125,7 @@ static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 20);
>  #define XSTAT_RX_BROADCAST_PACKETS       "rx_broadcast_packets"
>  #define XSTAT_TX_BROADCAST_PACKETS       "tx_broadcast_packets"
>  #define XSTAT_RX_UNDERSIZED_ERRORS       "rx_undersized_errors"
> +#define XSTAT_RX_UNDERSIZE_PACKETS       "rx_undersize_packets"
>  #define XSTAT_RX_OVERSIZE_ERRORS         "rx_oversize_errors"
>  #define XSTAT_RX_FRAGMENTED_ERRORS       "rx_fragmented_errors"
>  #define XSTAT_RX_JABBER_ERRORS           "rx_jabber_errors"
> @@ -135,7 +139,7 @@ static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 20);
>  /* Maximum size of Physical NIC Queues */
>  #define NIC_PORT_MAX_Q_SIZE 4096
> 
> -#define OVS_VHOST_MAX_QUEUE_NUM 1024  /* Maximum number of vHost TX queues. */
> +#define OVS_VHOST_MAX_QUEUE_NUM RTE_MAX_QUEUES_PER_PORT /* Max vHost TXQs */
>  #define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not initialized. */
>  #define OVS_VHOST_QUEUE_DISABLED    (-2) /* Queue was disabled by guest and not
>                                            * yet mapped to another queue. */
> @@ -170,21 +174,6 @@ static const struct rte_eth_conf port_conf = {
>      },
>  };
> 
> -/*
> - * These callbacks allow virtio-net devices to be added to vhost ports when
> - * configuration has been fully completed.
> - */

Just a general comment: one of the difficulties I found when reviewing was keeping track of where existing functionality was being moved to.
Would the comments above be suited to some of the new functions introduced below? I'll call it out explicitly below.
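
(For cross-reference while reviewing: the librte_vhost callbacks removed here
are replaced by ethdev event callbacks registered in dpdk_attach_vhost_pmd()
further down. Roughly, new_device()/destroy_device() map to the link-status
event and vring_state_changed() to the queue-state event, i.e. something like
the following, as in the patch:)

    rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_QUEUE_STATE,
                                  vring_state_changed_callback, NULL);
    rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC,
                                  link_status_changed_callback, NULL);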

> -static int new_device(int vid);
> -static void destroy_device(int vid);
> -static int vring_state_changed(int vid, uint16_t queue_id, int enable);
> -static const struct vhost_device_ops virtio_net_device_ops =
> -{
> -    .new_device =  new_device,
> -    .destroy_device = destroy_device,
> -    .vring_state_changed = vring_state_changed,
> -    .features_changed = NULL
> -};
> -
>  enum { DPDK_RING_SIZE = 256 };
>  BUILD_ASSERT_DECL(IS_POW2(DPDK_RING_SIZE));
>  enum { DRAIN_TSC = 200000ULL };
> @@ -379,6 +368,8 @@ struct netdev_dpdk {
>          char *devargs;  /* Device arguments for dpdk ports */
>          struct dpdk_tx_queue *tx_q;
>          struct rte_eth_link link;
> +        /* ID of vhost user port given to the PMD driver */
> +        int32_t vhost_pmd_id;
>      );
> 
>      PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline1,
> @@ -472,7 +463,12 @@ static void netdev_dpdk_vhost_destruct(struct netdev
> *netdev);
> 
>  static void netdev_dpdk_clear_xstats(struct netdev_dpdk *dev);
> 
> -int netdev_dpdk_get_vid(const struct netdev_dpdk *dev);

So is the previous virtio_net functionality now part of these functions?
This is where it could make sense to flag what is handled here.

> +static int link_status_changed_callback(dpdk_port_t port_id,
> +        enum rte_eth_event_type type, void *param, void *ret_param);
> +static int vring_state_changed_callback(dpdk_port_t port_id,
> +        enum rte_eth_event_type type, void *param, void *ret_param);
> +static void netdev_dpdk_remap_txqs(struct netdev_dpdk *dev);
> +static void netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev);
> 
>  struct ingress_policer *
>  netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
> @@ -812,11 +808,13 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev,
> int n_rxq, int n_txq)
>              break;
>          }
> 
> -        diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
> -        if (diag) {
> -            VLOG_ERR("Interface %s MTU (%d) setup error: %s",
> -                    dev->up.name, dev->mtu, rte_strerror(-diag));
> -            break;
> +        if (dev->type == DPDK_DEV_ETH) {
> +            diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
> +            if (diag) {
> +                VLOG_ERR("Interface %s MTU (%d) setup error: %s",
> +                        dev->up.name, dev->mtu, rte_strerror(-diag));
> +                break;
> +            }

I think you're already aware, but there's a bug fix for the case where the device does not support rte_eth_dev_set_mtu; a rebase will be required here once it's introduced.
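
(Illustrative sketch only, not the actual upstream fix: the idea is that the
MTU setup should tolerate drivers, such as the vhost PMD, that return
-ENOTSUP, rather than treating that as a hard failure.)

    diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
    if (diag == -ENOTSUP) {
        /* Driver has no MTU op; keep the default and carry on. */
        VLOG_WARN("Interface %s does not support MTU configuration",
                  dev->up.name);
    } else if (diag) {
        VLOG_ERR("Interface %s MTU (%d) setup error: %s",
                 dev->up.name, dev->mtu, rte_strerror(-diag));
        break;  /* Mirrors the error handling of the surrounding loop. */
    }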

>          }
> 
>          for (i = 0; i < n_txq; i++) {
> @@ -851,8 +849,13 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int
> n_rxq, int n_txq)
>              continue;
>          }
> 
> -        dev->up.n_rxq = n_rxq;
> -        dev->up.n_txq = n_txq;
> +        /* Only set n_*xq for physical devices. vHost User devices will set
> +         * this value correctly using info from the virtio backend.
> +         */
> +        if (dev->type == DPDK_DEV_ETH) {
> +            dev->up.n_rxq = n_rxq;
> +            dev->up.n_txq = n_txq;
> +        }
> 
>          return 0;
>      }
> @@ -893,8 +896,17 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>          dev->hw_ol_features |= NETDEV_RX_CHECKSUM_OFFLOAD;
>      }
> 
> -    n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
> -    n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
> +    if (dev->type != DPDK_DEV_ETH) {
> +        /* We don't know how many queues QEMU will request so we need to
> +         * provision for the maximum, as if we configure less up front than
> +         * what QEMU configures later, those additional queues will never be
> +         * available to us. */
> +        n_rxq = OVS_VHOST_MAX_QUEUE_NUM;
> +        n_txq = OVS_VHOST_MAX_QUEUE_NUM;
> +    } else {
> +        n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
> +        n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
> +    }
> 
>      diag = dpdk_eth_dev_port_config(dev, n_rxq, n_txq);
>      if (diag) {
> @@ -997,9 +1009,8 @@ common_construct(struct netdev *netdev, dpdk_port_t
> port_no,
>      dev->requested_mtu = ETHER_MTU;
>      dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
>      dev->requested_lsc_interrupt_mode = 0;
> -    ovsrcu_index_init(&dev->vid, -1);
> +    dev->vhost_pmd_id = -1;
>      dev->vhost_reconfigured = false;
> -    dev->attached = false;
> 
>      ovsrcu_init(&dev->qos_conf, NULL);
> 
> @@ -1057,19 +1068,62 @@ dpdk_dev_parse_name(const char dev_name[], const
> char prefix[],
>  }
> 
>  static int
> -vhost_common_construct(struct netdev *netdev)
> -    OVS_REQUIRES(dpdk_mutex)
> +dpdk_attach_vhost_pmd(struct netdev_dpdk *dev, int mode)
>  {
> -    int socket_id = rte_lcore_to_socket_id(rte_get_master_lcore());
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    char *devargs;
> +    int err = 0;
> +    dpdk_port_t port_no = 0;
> +    uint32_t driver_id = 0;
> +    int iommu_enabled = 0;
> +    int zc_enabled = 0;
> 
> -    dev->tx_q = netdev_dpdk_alloc_txq(OVS_VHOST_MAX_QUEUE_NUM);
> -    if (!dev->tx_q) {
> -        return ENOMEM;
> +    if (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY) {
> +        zc_enabled = 1;
> +    }
> +
> +    if (dpdk_vhost_iommu_enabled()) {
> +        iommu_enabled = 1;
>      }
> 
> -    return common_construct(netdev, DPDK_ETH_PORT_ID_INVALID,
> -                            DPDK_DEV_VHOST, socket_id);
> +    if (id_pool_alloc_id(dpdk_get_vhost_id_pool(), &driver_id)) {
> +        devargs = xasprintf("net_vhost%u,iface=%s,queues=%i,client=%i,"
> +                            "dequeue-zero-copy=%i,iommu-support=%i",
> +                 driver_id, dev->vhost_id, OVS_VHOST_MAX_QUEUE_NUM, mode,
> +                 zc_enabled, iommu_enabled);
> +        err = rte_eth_dev_attach(devargs, &port_no);
> +        if (!err) {
> +            dev->attached = true;
> +            dev->port_id = port_no;
> +            dev->vhost_pmd_id = driver_id;
> +            err = rte_vhost_driver_disable_features(dev->vhost_id,
> +                                1ULL << VIRTIO_NET_F_HOST_TSO4
> +                                | 1ULL << VIRTIO_NET_F_HOST_TSO6
> +                                | 1ULL << VIRTIO_NET_F_CSUM);
> +            if (err) {
> +                VLOG_ERR("rte_vhost_driver_disable_features failed for
> vhost "
> +                         "user client port: %s\n", dev->up.name);
> +            }
> +
> +            rte_eth_dev_callback_register(dev->port_id,
> +                                          RTE_ETH_EVENT_QUEUE_STATE,
> +                                          vring_state_changed_callback,
> +                                          NULL);
> +            rte_eth_dev_callback_register(dev->port_id,
> +                                          RTE_ETH_EVENT_INTR_LSC,
> +                                          link_status_changed_callback,
> +                                          NULL);
> +        } else {
> +            id_pool_free_id(dpdk_get_vhost_id_pool(), driver_id);
> +            VLOG_ERR("Failed to attach vhost-user device %s to DPDK",
> +                     dev->vhost_id);
> +        }
> +    } else {
> +        VLOG_ERR("Unable to create vhost-user device %s - too many vhost-
> user "
> +                 "devices registered with PMD", dev->vhost_id);
> +        err = ENODEV;
> +    }
> +
> +    return err;
>  }
> 
>  static int
> @@ -1082,7 +1136,7 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>      /* 'name' is appended to 'vhost_sock_dir' and used to create a socket in
>       * the file system. '/' or '\' would traverse directories, so they're not
>       * acceptable in 'name'. */
> -    if (strchr(name, '/') || strchr(name, '\\')) {
> +    if (strchr(name, '/') || strchr(name, '\\') || strchr(name, ',')) {
>          VLOG_ERR("\"%s\" is not a valid name for a vhost-user port. "
>                   "A valid name must not include '/' or '\\'",
>                   name);
> @@ -1097,46 +1151,23 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>               dpdk_get_vhost_sock_dir(), name);
> 
>      dev->vhost_driver_flags &= ~RTE_VHOST_USER_CLIENT;
> -    err = rte_vhost_driver_register(dev->vhost_id, dev->vhost_driver_flags);
> -    if (err) {
> -        VLOG_ERR("vhost-user socket device setup failure for socket
> %s\n",
> -                 dev->vhost_id);
> -        goto out;
> -    } else {
> +    err = dpdk_attach_vhost_pmd(dev, VHOST_SERVER_MODE);
> +    if (!err) {
>          fatal_signal_add_file_to_unlink(dev->vhost_id);
>          VLOG_INFO("Socket %s created for vhost-user port %s\n",
>                    dev->vhost_id, name);
> -    }
> -
> -    err = rte_vhost_driver_callback_register(dev->vhost_id,
> -                                                &virtio_net_device_ops);
> -    if (err) {
> -        VLOG_ERR("rte_vhost_driver_callback_register failed for vhost
> user "
> -                 "port: %s\n", name);
> -        goto out;
> -    }
> -
> -    err = rte_vhost_driver_disable_features(dev->vhost_id,
> -                                1ULL << VIRTIO_NET_F_HOST_TSO4
> -                                | 1ULL << VIRTIO_NET_F_HOST_TSO6
> -                                | 1ULL << VIRTIO_NET_F_CSUM);
> -    if (err) {
> -        VLOG_ERR("rte_vhost_driver_disable_features failed for vhost user
> "
> -                 "port: %s\n", name);
> -        goto out;
> -    }
> -
> -    err = rte_vhost_driver_start(dev->vhost_id);
> -    if (err) {
> -        VLOG_ERR("rte_vhost_driver_start failed for vhost user "
> -                 "port: %s\n", name);
> +    } else {
>          goto out;
>      }
> 
> -    err = vhost_common_construct(netdev);
> +    err = common_construct(&dev->up, dev->port_id, DPDK_DEV_VHOST,
> +                           rte_lcore_to_socket_id(rte_get_master_lcore()));
>      if (err) {
> -        VLOG_ERR("vhost_common_construct failed for vhost user "
> -                 "port: %s\n", name);
> +        VLOG_ERR("common_construct failed for vhost user port: %s\n",
> name);
> +        rte_eth_dev_detach(dev->port_id, dev->vhost_id);
> +        if (dev->vhost_pmd_id >= 0) {
> +            id_pool_free_id(dpdk_get_vhost_id_pool(), dev->vhost_pmd_id);
> +        }
>      }
> 
>  out:
> @@ -1149,12 +1180,14 @@ out:
>  static int
>  netdev_dpdk_vhost_client_construct(struct netdev *netdev)
>  {
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int err;
> 
>      ovs_mutex_lock(&dpdk_mutex);
> -    err = vhost_common_construct(netdev);
> +    err = common_construct(&dev->up, DPDK_ETH_PORT_ID_INVALID, DPDK_DEV_VHOST,
> +                           rte_lcore_to_socket_id(rte_get_master_lcore()));
>      if (err) {
> -        VLOG_ERR("vhost_common_construct failed for vhost user client"
> +        VLOG_ERR("common_construct failed for vhost user client"
>                   "port: %s\n", netdev->name);
>      }
>      ovs_mutex_unlock(&dpdk_mutex);
> @@ -1178,90 +1211,76 @@ common_destruct(struct netdev_dpdk *dev)
>      OVS_REQUIRES(dpdk_mutex)
>      OVS_EXCLUDED(dev->mutex)
>  {
> -    rte_free(dev->tx_q);
> -    dpdk_mp_release(dev->mp);
> -
> -    ovs_list_remove(&dev->list_node);
> -    free(ovsrcu_get_protected(struct ingress_policer *,
> -                              &dev->ingress_policer));
> -    ovs_mutex_destroy(&dev->mutex);
> -}
> -
> -static void
> -netdev_dpdk_destruct(struct netdev *netdev)
> -{
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      char devname[RTE_ETH_NAME_MAX_LEN];
> 
> -    ovs_mutex_lock(&dpdk_mutex);
> -
>      rte_eth_dev_stop(dev->port_id);
>      dev->started = false;
> 
>      if (dev->attached) {
>          rte_eth_dev_close(dev->port_id);
>          if (rte_eth_dev_detach(dev->port_id, devname) < 0) {
> -            VLOG_ERR("Device '%s' can not be detached", dev->devargs);
> +            VLOG_ERR("Device '%s' can not be detached", devname);
>          } else {
>              VLOG_INFO("Device '%s' has been detached", devname);
>          }
>      }
> 
>      netdev_dpdk_clear_xstats(dev);
> -    free(dev->devargs);
> -    common_destruct(dev);
> -
> -    ovs_mutex_unlock(&dpdk_mutex);
> +    rte_free(dev->tx_q);
> +    dpdk_mp_release(dev->mp);
> +    ovs_list_remove(&dev->list_node);
> +    free(ovsrcu_get_protected(struct ingress_policer *,
> +                              &dev->ingress_policer));
> +    ovs_mutex_destroy(&dev->mutex);
>  }
> 
> -/* rte_vhost_driver_unregister() can call back destroy_device(), which
> will
> - * try to acquire 'dpdk_mutex' and possibly 'dev->mutex'.  To avoid a
> - * deadlock, none of the mutexes must be held while calling this
> function. */
> -static int
> -dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
> -                             char *vhost_id)
> -    OVS_EXCLUDED(dpdk_mutex)
> -    OVS_EXCLUDED(dev->mutex)
> +static void
> +netdev_dpdk_destruct(struct netdev *netdev)
>  {
> -    return rte_vhost_driver_unregister(vhost_id);
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +
> +    ovs_mutex_lock(&dpdk_mutex);
> +    common_destruct(dev);
> +    free(dev->devargs);
> +    ovs_mutex_unlock(&dpdk_mutex);
>  }
> 
>  static void
>  netdev_dpdk_vhost_destruct(struct netdev *netdev)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> -    char *vhost_id;
> 
>      ovs_mutex_lock(&dpdk_mutex);
> 
>      /* Guest becomes an orphan if still attached. */
> -    if (netdev_dpdk_get_vid(dev) >= 0
> -        && !(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
> +    check_link_status(dev);
> +    if (dev->link.link_status == ETH_LINK_UP) {
>          VLOG_ERR("Removing port '%s' while vhost device still attached.",
>                   netdev->name);
>          VLOG_ERR("To restore connectivity after re-adding of port, VM on
> "
>                   "socket '%s' must be restarted.", dev->vhost_id);
>      }
> 
> -    vhost_id = xstrdup(dev->vhost_id);
> -
> -    common_destruct(dev);
> -
> -    ovs_mutex_unlock(&dpdk_mutex);
> +    rte_eth_dev_callback_unregister(dev->port_id,
> +                                    RTE_ETH_EVENT_QUEUE_STATE,
> +                                    vring_state_changed_callback, NULL);
> +    rte_eth_dev_callback_unregister(dev->port_id,
> +                                    RTE_ETH_EVENT_INTR_LSC,
> +                                    link_status_changed_callback, NULL);
> 
> -    if (!vhost_id[0]) {
> -        goto out;
> +    if (dev->vhost_pmd_id >= 0) {
> +        id_pool_free_id(dpdk_get_vhost_id_pool(),
> +                dev->vhost_pmd_id);
>      }
> 
> -    if (dpdk_vhost_driver_unregister(dev, vhost_id)) {
> -        VLOG_ERR("%s: Unable to unregister vhost driver for socket
> '%s'.\n",
> -                 netdev->name, vhost_id);
> -    } else if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
> -        /* OVS server mode - remove this socket from list for deletion */
> -        fatal_signal_remove_file_to_unlink(vhost_id);
> +    if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
> +           /* OVS server mode - remove this socket from list for deletion
> */
> +           fatal_signal_remove_file_to_unlink(dev->vhost_id);
>      }
> -out:
> -    free(vhost_id);
> +
> +    common_destruct(dev);
> +
> +    ovs_mutex_unlock(&dpdk_mutex);
>  }
> 
>  static void
> @@ -1846,12 +1865,6 @@ ingress_policer_run(struct ingress_policer
> *policer, struct rte_mbuf **pkts,
>      return cnt;
>  }
> 
> -static bool
> -is_vhost_running(struct netdev_dpdk *dev)
> -{
> -    return (netdev_dpdk_get_vid(dev) >= 0 && dev->vhost_reconfigured);
> -}
> -
>  static inline void
>  netdev_dpdk_vhost_update_rx_size_counters(struct netdev_stats *stats,
>                                            unsigned int packet_size)
> @@ -1913,64 +1926,9 @@ netdev_dpdk_vhost_update_rx_counters(struct
> netdev_stats *stats,
>      }
>  }
> 
> -/*
> - * The receive path for the vhost port is the TX path out from guest.
> - */
> -static int
> -netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
> -                           struct dp_packet_batch *batch, int *qfill)
> -{
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
> -    struct ingress_policer *policer =
> netdev_dpdk_get_ingress_policer(dev);
> -    uint16_t nb_rx = 0;
> -    uint16_t dropped = 0;
> -    int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
> -    int vid = netdev_dpdk_get_vid(dev);
> -
> -    if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured
> -                     || !(dev->flags & NETDEV_UP))) {
> -        return EAGAIN;
> -    }
> -
> -    nb_rx = rte_vhost_dequeue_burst(vid, qid, dev->mp,
> -                                    (struct rte_mbuf **) batch->packets,
> -                                    NETDEV_MAX_BURST);
> -    if (!nb_rx) {
> -        return EAGAIN;
> -    }
> -
> -    if (qfill) {
> -        if (nb_rx == NETDEV_MAX_BURST) {
> -            /* The DPDK API returns a uint32_t which often has invalid
> bits in
> -             * the upper 16-bits. Need to restrict the value to uint16_t.
> */
> -            *qfill = rte_vhost_rx_queue_count(vid, qid) & UINT16_MAX;
> -        } else {
> -            *qfill = 0;
> -        }
> -    }
> -
> -    if (policer) {
> -        dropped = nb_rx;
> -        nb_rx = ingress_policer_run(policer,
> -                                    (struct rte_mbuf **) batch->packets,
> -                                    nb_rx, true);
> -        dropped -= nb_rx;
> -    }
> -
> -    rte_spinlock_lock(&dev->stats_lock);
> -    netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
> -                                         nb_rx, dropped);
> -    rte_spinlock_unlock(&dev->stats_lock);
> -
> -    batch->count = nb_rx;
> -    dp_packet_batch_init_packet_fields(batch);
> -
> -    return 0;
> -}
> -
>  static int
> -netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch
> *batch,
> -                     int *qfill)
> +common_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
> +            int *qfill)
>  {
>      struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq);
>      struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
> @@ -2018,6 +1976,30 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct
> dp_packet_batch *batch,
>      return 0;
>  }
> 
> +/*
> + * The receive path for the vhost port is the TX path out from guest.
> + */
> +static int
> +netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
> +                           struct dp_packet_batch *batch,
> +                           int *qfill)
> +{
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
> +
> +    if (dev->vhost_reconfigured) {
> +        return common_recv(rxq, batch, qfill);
> +    }
> +
> +    return EAGAIN;
> +}
> +
> +static int
> +netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch
> *batch,
> +                     int *qfill)
> +{
> +    return common_recv(rxq, batch, qfill);
> +}
> +
>  static inline int
>  netdev_dpdk_qos_run(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
>                      int cnt, bool may_steal)
> @@ -2059,80 +2041,6 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk
> *dev, struct rte_mbuf **pkts,
>      return cnt;
>  }
> 
> -static inline void
> -netdev_dpdk_vhost_update_tx_counters(struct netdev_stats *stats,
> -                                     struct dp_packet **packets,
> -                                     int attempted,
> -                                     int dropped)
> -{
> -    int i;
> -    int sent = attempted - dropped;
> -
> -    stats->tx_packets += sent;
> -    stats->tx_dropped += dropped;
> -
> -    for (i = 0; i < sent; i++) {
> -        stats->tx_bytes += dp_packet_size(packets[i]);
> -    }
> -}
> -
> -static void
> -__netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
> -                         struct dp_packet **pkts, int cnt)
> -{
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> -    struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
> -    unsigned int total_pkts = cnt;
> -    unsigned int dropped = 0;
> -    int i, retries = 0;
> -    int vid = netdev_dpdk_get_vid(dev);
> -
> -    qid = dev->tx_q[qid % netdev->n_txq].map;
> -
> -    if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured || qid < 0
> -                     || !(dev->flags & NETDEV_UP))) {
> -        rte_spinlock_lock(&dev->stats_lock);
> -        dev->stats.tx_dropped+= cnt;
> -        rte_spinlock_unlock(&dev->stats_lock);
> -        goto out;
> -    }
> -
> -    rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
> -
> -    cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
> -    /* Check has QoS has been configured for the netdev */
> -    cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
> -    dropped = total_pkts - cnt;
> -
> -    do {
> -        int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
> -        unsigned int tx_pkts;
> -
> -        tx_pkts = rte_vhost_enqueue_burst(vid, vhost_qid, cur_pkts, cnt);
> -        if (OVS_LIKELY(tx_pkts)) {
> -            /* Packets have been sent.*/
> -            cnt -= tx_pkts;
> -            /* Prepare for possible retry.*/
> -            cur_pkts = &cur_pkts[tx_pkts];
> -        } else {
> -            /* No packets sent - do not retry.*/
> -            break;
> -        }
> -    } while (cnt && (retries++ <= VHOST_ENQ_RETRY_NUM));
> -
> -    rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
> -
> -    rte_spinlock_lock(&dev->stats_lock);
> -    netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
> -                                         cnt + dropped);
> -    rte_spinlock_unlock(&dev->stats_lock);
> -
> -out:
> -    for (i = 0; i < total_pkts - dropped; i++) {
> -        dp_packet_delete(pkts[i]);
> -    }
> -}
> -
>  /* Tx function. Transmit packets indefinitely */
>  static void
>  dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch
> *batch)
> @@ -2186,12 +2094,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid,
> struct dp_packet_batch *batch)
>      }
> 
>      if (OVS_LIKELY(txcnt)) {
> -        if (dev->type == DPDK_DEV_VHOST) {
> -            __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **)
> pkts,
> -                                     txcnt);
> -        } else {
> -            dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
> -        }
> +        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
>      }
> 
>      if (OVS_UNLIKELY(dropped)) {
> @@ -2201,21 +2104,6 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid,
> struct dp_packet_batch *batch)
>      }
>  }
> 
> -static int
> -netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
> -                       struct dp_packet_batch *batch,
> -                       bool concurrent_txq OVS_UNUSED)
> -{
> -
> -    if (OVS_UNLIKELY(batch->packets[0]->source != DPBUF_DPDK)) {
> -        dpdk_do_tx_copy(netdev, qid, batch);
> -        dp_packet_delete_batch(batch, true);
> -    } else {
> -        __netdev_dpdk_vhost_send(netdev, qid, batch->packets, batch-
> >count);
> -    }
> -    return 0;
> -}
> -
>  static inline void
>  netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>                     struct dp_packet_batch *batch,
> @@ -2226,8 +2114,7 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>          return;
>      }
> 
> -    if (OVS_UNLIKELY(concurrent_txq)) {
> -        qid = qid % dev->up.n_txq;
> +    if (concurrent_txq) {
>          rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>      }
> 
> @@ -2254,7 +2141,7 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>          }
>      }
> 
> -    if (OVS_UNLIKELY(concurrent_txq)) {
> +    if (concurrent_txq) {
>          rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>      }
>  }
> @@ -2265,11 +2152,35 @@ netdev_dpdk_eth_send(struct netdev *netdev, int
> qid,
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> 
> +    if (concurrent_txq) {
> +        qid = qid % dev->up.n_txq;
> +    }
> +
>      netdev_dpdk_send__(dev, qid, batch, concurrent_txq);
>      return 0;
>  }
> 
>  static int
> +netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
> +                       struct dp_packet_batch *batch,
> +                       bool concurrent_txq OVS_UNUSED)
> +{
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +
> +    qid = dev->tx_q[qid % netdev->n_txq].map;
> +    if (qid == -1 || !dev->vhost_reconfigured) {
> +        rte_spinlock_lock(&dev->stats_lock);
> +        dev->stats.tx_dropped+= batch->count;
> +        rte_spinlock_unlock(&dev->stats_lock);
> +        dp_packet_delete_batch(batch, true);
> +    } else {
> +        netdev_dpdk_send__(dev, qid, batch, false);
> +    }
> +
> +    return 0;
> +}
> +
> +static int
>  netdev_dpdk_set_etheraddr(struct netdev *netdev, const struct eth_addr
> mac)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> @@ -2343,41 +2254,6 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
>  static int
>  netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
> 
> -static int
> -netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
> -                            struct netdev_stats *stats)
> -{
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> -
> -    ovs_mutex_lock(&dev->mutex);
> -
> -    rte_spinlock_lock(&dev->stats_lock);
> -    /* Supported Stats */
> -    stats->rx_packets = dev->stats.rx_packets;
> -    stats->tx_packets = dev->stats.tx_packets;
> -    stats->rx_dropped = dev->stats.rx_dropped;
> -    stats->tx_dropped = dev->stats.tx_dropped;
> -    stats->multicast = dev->stats.multicast;
> -    stats->rx_bytes = dev->stats.rx_bytes;
> -    stats->tx_bytes = dev->stats.tx_bytes;
> -    stats->rx_errors = dev->stats.rx_errors;
> -    stats->rx_length_errors = dev->stats.rx_length_errors;
> -
> -    stats->rx_1_to_64_packets = dev->stats.rx_1_to_64_packets;
> -    stats->rx_65_to_127_packets = dev->stats.rx_65_to_127_packets;
> -    stats->rx_128_to_255_packets = dev->stats.rx_128_to_255_packets;
> -    stats->rx_256_to_511_packets = dev->stats.rx_256_to_511_packets;
> -    stats->rx_512_to_1023_packets = dev->stats.rx_512_to_1023_packets;
> -    stats->rx_1024_to_1522_packets = dev->stats.rx_1024_to_1522_packets;
> -    stats->rx_1523_to_max_packets = dev->stats.rx_1523_to_max_packets;
> -
> -    rte_spinlock_unlock(&dev->stats_lock);
> -
> -    ovs_mutex_unlock(&dev->mutex);
> -
> -    return 0;
> -}
> -
>  static void
>  netdev_dpdk_convert_xstats(struct netdev_stats *stats,
>                             const struct rte_eth_xstat *xstats,
> @@ -2423,6 +2299,8 @@ netdev_dpdk_convert_xstats(struct netdev_stats
> *stats,
>              stats->tx_broadcast_packets = xstats[i].value;
>          } else if (strcmp(XSTAT_RX_UNDERSIZED_ERRORS, names[i].name) ==
> 0) {
>              stats->rx_undersized_errors = xstats[i].value;
> +        } else if (strcmp(XSTAT_RX_UNDERSIZE_PACKETS, names[i].name) ==
> 0) {
> +            stats->rx_undersized_errors = xstats[i].value;
>          } else if (strcmp(XSTAT_RX_FRAGMENTED_ERRORS, names[i].name) ==
> 0) {
>              stats->rx_fragmented_errors = xstats[i].value;
>          } else if (strcmp(XSTAT_RX_JABBER_ERRORS, names[i].name) == 0) {
> @@ -2445,6 +2323,11 @@ netdev_dpdk_get_stats(const struct netdev *netdev,
> struct netdev_stats *stats)
>      struct rte_eth_xstat_name *rte_xstats_names = NULL;
>      int rte_xstats_len, rte_xstats_new_len, rte_xstats_ret;
> 
> +    if (!rte_eth_dev_is_valid_port(dev->port_id)) {
> +        ovs_mutex_unlock(&dev->mutex);
> +        return EPROTO;
> +    }
> +
>      if (rte_eth_stats_get(dev->port_id, &rte_stats)) {
>          VLOG_ERR("Can't get ETH statistics for port: "DPDK_PORT_ID_FMT,
>                   dev->port_id);
> @@ -2521,6 +2404,10 @@ netdev_dpdk_get_custom_stats(const struct netdev
> *netdev,
> 
>      ovs_mutex_lock(&dev->mutex);
> 
> +    if (rte_eth_dev_is_valid_port(dev->port_id)) {
> +        goto out;
> +    }
> +
>      if (netdev_dpdk_configure_xstats(dev)) {
>          uint64_t *values = xcalloc(dev->rte_xstats_ids_size,
>                                     sizeof(uint64_t));
> @@ -2557,6 +2444,7 @@ netdev_dpdk_get_custom_stats(const struct netdev
> *netdev,
>          free(values);
>      }
> 
> +out:
>      ovs_mutex_unlock(&dev->mutex);
> 
>      return 0;
> @@ -2713,24 +2601,6 @@ netdev_dpdk_get_carrier(const struct netdev
> *netdev, bool *carrier)
>      return 0;
>  }
> 
> -static int
> -netdev_dpdk_vhost_get_carrier(const struct netdev *netdev, bool *carrier)
> -{
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> -
> -    ovs_mutex_lock(&dev->mutex);
> -
> -    if (is_vhost_running(dev)) {
> -        *carrier = 1;
> -    } else {
> -        *carrier = 0;
> -    }
> -
> -    ovs_mutex_unlock(&dev->mutex);
> -
> -    return 0;
> -}
> -
>  static long long int
>  netdev_dpdk_get_carrier_resets(const struct netdev *netdev)
>  {
> @@ -2780,8 +2650,7 @@ netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
>           * running then change netdev's change_seq to trigger link state
>           * update. */
> 
> -        if ((NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off)))
> -            && is_vhost_running(dev)) {
> +        if ((NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off)))) {
>              netdev_change_seq_changed(&dev->up);
> 
>              /* Clear statistics if device is getting up. */
> @@ -2811,18 +2680,41 @@ netdev_dpdk_update_flags(struct netdev *netdev,
>      return error;
>  }
> 
> +static void
> +common_get_status(struct smap *args, struct netdev_dpdk *dev,
> +                  struct rte_eth_dev_info *dev_info)
> +{
> +    smap_add_format(args, "port_no", DPDK_PORT_ID_FMT, dev->port_id);
> +    smap_add_format(args, "numa_id", "%d",
> +                           rte_eth_dev_socket_id(dev->port_id));
> +    smap_add_format(args, "driver_name", "%s", dev_info->driver_name);
> +    smap_add_format(args, "min_rx_bufsize", "%u", dev_info-
> >min_rx_bufsize);
> +    smap_add_format(args, "max_rx_pktlen", "%u", dev->max_packet_len);
> +    smap_add_format(args, "max_rx_queues", "%u", dev_info-
> >max_rx_queues);
> +    smap_add_format(args, "max_tx_queues", "%u", dev_info-
> >max_tx_queues);
> +    smap_add_format(args, "max_mac_addrs", "%u", dev_info-
> >max_mac_addrs);
> +}
> +
>  static int
>  netdev_dpdk_vhost_user_get_status(const struct netdev *netdev,
>                                    struct smap *args)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    struct rte_eth_dev_info dev_info;
> +
> +    if (!rte_eth_dev_is_valid_port(dev->port_id)) {
> +        return ENODEV;
> +    }
> 
>      ovs_mutex_lock(&dev->mutex);
> +    rte_eth_dev_info_get(dev->port_id, &dev_info);
> +
> +    common_get_status(args, dev, &dev_info);
> 
>      bool client_mode = dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT;
>      smap_add_format(args, "mode", "%s", client_mode ? "client" :
> "server");
> 
> -    int vid = netdev_dpdk_get_vid(dev);
> +    int vid = rte_eth_vhost_get_vid_from_port_id(dev->port_id);;
>      if (vid < 0) {
>          smap_add_format(args, "status", "disconnected");
>          ovs_mutex_unlock(&dev->mutex);
> @@ -2883,15 +2775,8 @@ netdev_dpdk_get_status(const struct netdev *netdev,
> struct smap *args)
>      rte_eth_dev_info_get(dev->port_id, &dev_info);
>      ovs_mutex_unlock(&dev->mutex);
> 
> -    smap_add_format(args, "port_no", DPDK_PORT_ID_FMT, dev->port_id);
> -    smap_add_format(args, "numa_id", "%d",
> -                           rte_eth_dev_socket_id(dev->port_id));
> -    smap_add_format(args, "driver_name", "%s", dev_info.driver_name);
> -    smap_add_format(args, "min_rx_bufsize", "%u",
> dev_info.min_rx_bufsize);
> -    smap_add_format(args, "max_rx_pktlen", "%u", dev->max_packet_len);
> -    smap_add_format(args, "max_rx_queues", "%u", dev_info.max_rx_queues);
> -    smap_add_format(args, "max_tx_queues", "%u", dev_info.max_tx_queues);
> -    smap_add_format(args, "max_mac_addrs", "%u", dev_info.max_mac_addrs);
> +    common_get_status(args, dev, &dev_info);
> +
>      smap_add_format(args, "max_hash_mac_addrs", "%u",
>                             dev_info.max_hash_mac_addrs);
>      smap_add_format(args, "max_vfs", "%u", dev_info.max_vfs);
> @@ -3070,19 +2955,6 @@ out:
>  }
> 
>  /*
> - * Set virtqueue flags so that we do not receive interrupts.
> - */
> -static void
> -set_irq_status(int vid)
> -{
> -    uint32_t i;
> -
> -    for (i = 0; i < rte_vhost_get_vring_num(vid); i++) {
> -        rte_vhost_enable_guest_notification(vid, i, 0);
> -    }
> -}
> -
> -/*
>   * Fixes mapping for vhost-user tx queues. Must be called after each
>   * enabling/disabling of queues and n_txq modifications.
>   */
> @@ -3123,53 +2995,60 @@ netdev_dpdk_remap_txqs(struct netdev_dpdk *dev)
>      free(enabled_queues);
>  }
> 
> -/*
> - * A new virtio-net device is added to a vhost port.
> - */
>  static int
> -new_device(int vid)

It would be nice to see a comment giving an overview of the function now, as it encompasses some aspects of new_device() (and possibly others as well).

> +link_status_changed_callback(dpdk_port_t port_id,
> +                             enum rte_eth_event_type type OVS_UNUSED,
> +                             void *param OVS_UNUSED,
> +                             void *ret_param OVS_UNUSED)
>  {
>      struct netdev_dpdk *dev;
>      bool exists = false;
>      int newnode = 0;
> -    char ifname[IF_NAME_SZ];
> -
> -    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
> 
>      ovs_mutex_lock(&dpdk_mutex);
>      /* Add device to the vhost port with the same name as that passed
> down. */
>      LIST_FOR_EACH(dev, list_node, &dpdk_list) {
>          ovs_mutex_lock(&dev->mutex);
> -        if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
> -            uint32_t qp_num = rte_vhost_get_vring_num(vid)/VIRTIO_QNUM;
> -
> -            /* Get NUMA information */
> -            newnode = rte_vhost_get_numa_node(vid);
> -            if (newnode == -1) {
> +        if (port_id == dev->port_id) {
> +            check_link_status(dev);
> +            if (dev->link.link_status == ETH_LINK_UP) {
> +                /* Device brought up */
> +                /* Get queue information */
> +                int vid = rte_eth_vhost_get_vid_from_port_id(dev-
> >port_id);
> +                uint32_t qp_num = rte_vhost_get_vring_num(vid) /
> VIRTIO_QNUM;
> +                if (qp_num <= 0) {
> +                    qp_num = dev->requested_n_rxq;
> +                }
> +                /* Get NUMA information */
> +                newnode = rte_eth_dev_socket_id(dev->port_id);
> +                if (newnode == -1) {
>  #ifdef VHOST_NUMA
> -                VLOG_INFO("Error getting NUMA info for vHost Device
> '%s'",
> -                          ifname);
> +                    VLOG_INFO("Error getting NUMA info for vHost Device
> '%s'",
> +                              dev->vhost_id);
>  #endif
> -                newnode = dev->socket_id;
> -            }
> +                    newnode = dev->socket_id;
> +                }
> +                if (dev->requested_n_txq != qp_num
> +                                || dev->requested_n_rxq != qp_num
> +                                || dev->requested_socket_id != newnode) {
> +                    dev->requested_socket_id = newnode;
> +                    dev->requested_n_rxq = qp_num;
> +                    dev->requested_n_txq = qp_num;
> +                    netdev_request_reconfigure(&dev->up);
> +                } else {
> +                    /* Reconfiguration not required. */
> +                    dev->vhost_reconfigured = true;
> +                }
> 
> -            if (dev->requested_n_txq != qp_num
> -                || dev->requested_n_rxq != qp_num
> -                || dev->requested_socket_id != newnode) {
> -                dev->requested_socket_id = newnode;
> -                dev->requested_n_rxq = qp_num;
> -                dev->requested_n_txq = qp_num;
> -                netdev_request_reconfigure(&dev->up);
> +                VLOG_INFO("vHost Device '%s' has been added on numa node
> %i",
> +                          dev->vhost_id, newnode);
>              } else {
> -                /* Reconfiguration not required. */
> -                dev->vhost_reconfigured = true;
> +                /* Device brought down */
> +                dev->vhost_reconfigured = false;
> +                netdev_dpdk_txq_map_clear(dev);
> +                VLOG_INFO("vHost Device '%s' has been removed", dev-
> >vhost_id);
>              }
> -
> -            ovsrcu_index_set(&dev->vid, vid);
>              exists = true;
> -
> -            /* Disable notifications. */
> -            set_irq_status(vid);
>              netdev_change_seq_changed(&dev->up);
>              ovs_mutex_unlock(&dev->mutex);
>              break;
> @@ -3179,14 +3058,11 @@ new_device(int vid)
>      ovs_mutex_unlock(&dpdk_mutex);
> 
>      if (!exists) {
> -        VLOG_INFO("vHost Device '%s' can't be added - name not found",
> ifname);
> +        VLOG_INFO("vHost Device with port id %i not found", port_id);
> 
>          return -1;
>      }
> 
> -    VLOG_INFO("vHost Device '%s' has been added on numa node %i",
> -              ifname, newnode);
> -
>      return 0;
>  }
> 
> @@ -3202,78 +3078,32 @@ netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
>      }
>  }
> 
> -/*
> - * Remove a virtio-net device from the specific vhost port.  Use dev-
> >remove
> - * flag to stop any more packets from being sent or received to/from a VM
> and
> - * ensure all currently queued packets have been sent/received before
> removing
> - *  the device.
> - */
> -static void
> -destroy_device(int vid)
> -{
> -    struct netdev_dpdk *dev;
> -    bool exists = false;
> -    char ifname[IF_NAME_SZ];
> -
> -    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
> -
> -    ovs_mutex_lock(&dpdk_mutex);
> -    LIST_FOR_EACH (dev, list_node, &dpdk_list) {
> -        if (netdev_dpdk_get_vid(dev) == vid) {
> -
> -            ovs_mutex_lock(&dev->mutex);
> -            dev->vhost_reconfigured = false;
> -            ovsrcu_index_set(&dev->vid, -1);
> -            netdev_dpdk_txq_map_clear(dev);
> -
> -            netdev_change_seq_changed(&dev->up);
> -            ovs_mutex_unlock(&dev->mutex);
> -            exists = true;
> -            break;
> -        }
> -    }
> -
> -    ovs_mutex_unlock(&dpdk_mutex);
> -
> -    if (exists) {
> -        /*
> -         * Wait for other threads to quiesce after setting the
> 'virtio_dev'
> -         * to NULL, before returning.
> -         */
> -        ovsrcu_synchronize();
> -        /*
> -         * As call to ovsrcu_synchronize() will end the quiescent state,
> -         * put thread back into quiescent state before returning.
> -         */
> -        ovsrcu_quiesce_start();
> -        VLOG_INFO("vHost Device '%s' has been removed", ifname);
> -    } else {
> -        VLOG_INFO("vHost Device '%s' not found", ifname);
> -    }
> -}
> -
>  static int
> -vring_state_changed(int vid, uint16_t queue_id, int enable)
> +vring_state_changed_callback(dpdk_port_t port_id,
> +                             enum rte_eth_event_type type OVS_UNUSED,
> +                             void *param OVS_UNUSED,
> +                             void *ret_param OVS_UNUSED)
>  {
>      struct netdev_dpdk *dev;
>      bool exists = false;
> -    int qid = queue_id / VIRTIO_QNUM;
> +    int vid = -1;
>      char ifname[IF_NAME_SZ];
> +    struct rte_eth_vhost_queue_event event;
> +    int err = 0;
> 
> -    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
> -
> -    if (queue_id % VIRTIO_QNUM == VIRTIO_TXQ) {
> +    err = rte_eth_vhost_get_queue_event(port_id, &event);
> +    if (err || event.rx) {
>          return 0;
>      }
> 
>      ovs_mutex_lock(&dpdk_mutex);
>      LIST_FOR_EACH (dev, list_node, &dpdk_list) {
>          ovs_mutex_lock(&dev->mutex);
> -        if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
> -            if (enable) {
> -                dev->tx_q[qid].map = qid;
> +        if (port_id == dev->port_id) {
> +            if (event.enable) {
> +                dev->tx_q[event.queue_id].map = event.queue_id;
>              } else {
> -                dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
> +                dev->tx_q[event.queue_id].map = OVS_VHOST_QUEUE_DISABLED;
>              }
>              netdev_dpdk_remap_txqs(dev);
>              exists = true;
> @@ -3284,10 +3114,13 @@ vring_state_changed(int vid, uint16_t queue_id,
> int enable)
>      }
>      ovs_mutex_unlock(&dpdk_mutex);
> 
> +    vid = rte_eth_vhost_get_vid_from_port_id(dev->port_id);
> +    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
> +
>      if (exists) {
> -        VLOG_INFO("State of queue %d ( tx_qid %d ) of vhost device '%s'"
> -                  "changed to \'%s\'", queue_id, qid, ifname,
> -                  (enable == 1) ? "enabled" : "disabled");
> +        VLOG_INFO("State of tx_qid %d  of vhost device '%s'"
> +                  "changed to \'%s\'", event.queue_id, ifname,
> +                  (event.enable == 1) ? "enabled" : "disabled");
>      } else {
>          VLOG_INFO("vHost Device '%s' not found", ifname);
>          return -1;
> @@ -3296,25 +3129,6 @@ vring_state_changed(int vid, uint16_t queue_id, int
> enable)
>      return 0;
>  }
> 
> -/*
> - * Retrieve the DPDK virtio device ID (vid) associated with a vhostuser
> - * or vhostuserclient netdev.
> - *
> - * Returns a value greater or equal to zero for a valid vid or '-1' if
> - * there is no valid vid associated. A vid of '-1' must not be used in
> - * rte_vhost_ APi calls.
> - *
> - * Once obtained and validated, a vid can be used by a PMD for multiple
> - * subsequent rte_vhost API calls until the PMD quiesces. A PMD should
> - * not fetch the vid again for each of a series of API calls.
> - */
> -
> -int
> -netdev_dpdk_get_vid(const struct netdev_dpdk *dev)
> -{
> -    return ovsrcu_index_get(&dev->vid);
> -}
> -
>  struct ingress_policer *
>  netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev)
>  {
> @@ -3681,13 +3495,12 @@ static const struct dpdk_qos_ops
> egress_policer_ops = {
>  };
> 
>  static int
> -netdev_dpdk_reconfigure(struct netdev *netdev)
> +common_reconfigure(struct netdev *netdev)
> +    OVS_REQUIRES(dev->mutex)

This can cause compilation errors, as 'dev->mutex' will be treated as an undeclared identifier. You would have to change 'struct netdev *netdev' to 'struct netdev_dpdk *dev' to avoid it.

https://travis-ci.org/istokes/ovs/jobs/383702541
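
A sketch of the kind of change needed (one option only, not the only fix):
pass the already-cast device pointer so that 'dev' is in scope for the
Clang thread-safety annotation, e.g.

    static int common_reconfigure(struct netdev_dpdk *dev)
        OVS_REQUIRES(dev->mutex);

with callers doing the netdev_dpdk_cast() before taking dev->mutex.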

>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int err = 0;
> 
> -    ovs_mutex_lock(&dev->mutex);
> -
>      if (netdev->n_txq == dev->requested_n_txq
>          && netdev->n_rxq == dev->requested_n_rxq
>          && dev->mtu == dev->requested_mtu
> @@ -3727,17 +3540,36 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>      netdev_change_seq_changed(netdev);
> 
>  out:
> +    return err;
> +}
> +
> +static int
> +netdev_dpdk_reconfigure(struct netdev *netdev)
> +{
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    int err = 0;
> +
> +    ovs_mutex_lock(&dev->mutex);
> +    err = common_reconfigure(netdev);
>      ovs_mutex_unlock(&dev->mutex);
> +
>      return err;
>  }
> 
>  static int
> -dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
> +dpdk_vhost_reconfigure_helper(struct netdev *netdev)
>      OVS_REQUIRES(dev->mutex)

Same as above WRT compilation error.

>  {
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    int err;
> +
>      dev->up.n_txq = dev->requested_n_txq;
>      dev->up.n_rxq = dev->requested_n_rxq;
> -    int err;
> +
> +    err = common_reconfigure(netdev);
> +    if (err) {
> +        return err;
> +    }
> 
>      /* Enable TX queue 0 by default if it wasn't disabled. */
>      if (dev->tx_q[0].map == OVS_VHOST_QUEUE_MAP_UNKNOWN) {
> @@ -3746,14 +3578,7 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk
> *dev)
> 
>      netdev_dpdk_remap_txqs(dev);
> 
> -    err = netdev_dpdk_mempool_configure(dev);
> -    if (!err) {
> -        /* A new mempool was created. */
> -        netdev_change_seq_changed(&dev->up);
> -    } else if (err != EEXIST){
> -        return err;
> -    }
> -    if (netdev_dpdk_get_vid(dev) >= 0) {
> +    if (rte_eth_vhost_get_vid_from_port_id(dev->port_id) >= 0) {
>          if (dev->vhost_reconfigured == false) {
>              dev->vhost_reconfigured = true;
>              /* Carrier status may need updating. */
> @@ -3771,7 +3596,7 @@ netdev_dpdk_vhost_reconfigure(struct netdev *netdev)
>      int err;
> 
>      ovs_mutex_lock(&dev->mutex);
> -    err = dpdk_vhost_reconfigure_helper(dev);
> +    err = dpdk_vhost_reconfigure_helper(netdev);
>      ovs_mutex_unlock(&dev->mutex);
> 
>      return err;
> @@ -3781,9 +3606,8 @@ static int
>  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> -    int err;
> -    uint64_t vhost_flags = 0;
> -    bool zc_enabled;
> +    int err = 0;
> +    int sid = -1;
> 
>      ovs_mutex_lock(&dev->mutex);
> 
> @@ -3794,64 +3618,50 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
> *netdev)
>       */
>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>              && strlen(dev->vhost_id)) {
> -        /* Register client-mode device. */
> -        vhost_flags |= RTE_VHOST_USER_CLIENT;
> +        /* First time once-only configuration */
> +        err = dpdk_attach_vhost_pmd(dev, VHOST_CLIENT_MODE);
> +
> +        if (!err) {
> +            sid = rte_eth_dev_socket_id(dev->port_id);
> +            dev->socket_id = sid < 0 ? SOCKET0 : sid;
> +            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
> +
> +            if (dev->requested_socket_id != dev->socket_id
> +                || dev->requested_mtu != dev->mtu) {
> +                err = netdev_dpdk_mempool_configure(dev);
> +                if (err && err != EEXIST) {
> +                    goto unlock;
> +                }
> +            }
> 
> -        /* Enable IOMMU support, if explicitly requested. */
> -        if (dpdk_vhost_iommu_enabled()) {
> -            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
> -        }
> +            netdev->n_txq = dev->requested_n_txq;
> +            netdev->n_rxq = dev->requested_n_rxq;
> +
> +            rte_free(dev->tx_q);
> +            err = dpdk_eth_dev_init(dev);
> +            dev->tx_q = netdev_dpdk_alloc_txq(netdev->n_txq);
> +            if (!dev->tx_q) {
> +                rte_eth_dev_detach(dev->port_id, dev->vhost_id);
> +                if (dev->vhost_pmd_id >= 0) {
> +                    id_pool_free_id(dpdk_get_vhost_id_pool(),
> +                            dev->vhost_pmd_id);
> +                }
> +                err = ENOMEM;
> +                goto unlock;
> +            }
> 
> -        zc_enabled = dev->vhost_driver_flags
> -                     & RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> -        /* Enable zero copy flag, if requested */
> -        if (zc_enabled) {
> -            vhost_flags |= RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> -        }
> +            netdev_change_seq_changed(netdev);
> 
> -        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
> -        if (err) {
> -            VLOG_ERR("vhost-user device setup failure for device %s\n",
> -                     dev->vhost_id);
> -            goto unlock;
> -        } else {
> -            /* Configuration successful */
> -            dev->vhost_driver_flags |= vhost_flags;
>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>                        "using client socket '%s'",
>                        dev->up.name, dev->vhost_id);
> -            if (zc_enabled) {
> -                VLOG_INFO("Zero copy enabled for vHost port %s", dev->up.name);
> -            }
> -        }
> -
> -        err = rte_vhost_driver_callback_register(dev->vhost_id,
> -                                                 &virtio_net_device_ops);
> -        if (err) {
> -            VLOG_ERR("rte_vhost_driver_callback_register failed for "
> -                     "vhost user client port: %s\n", dev->up.name);
> -            goto unlock;
> -        }
> -
> -        err = rte_vhost_driver_disable_features(dev->vhost_id,
> -                                    1ULL << VIRTIO_NET_F_HOST_TSO4
> -                                    | 1ULL << VIRTIO_NET_F_HOST_TSO6
> -                                    | 1ULL << VIRTIO_NET_F_CSUM);
> -        if (err) {
> -            VLOG_ERR("rte_vhost_driver_disable_features failed for vhost user "
> -                     "client port: %s\n", dev->up.name);
> -            goto unlock;
> -        }
> -
> -        err = rte_vhost_driver_start(dev->vhost_id);
> -        if (err) {
> -            VLOG_ERR("rte_vhost_driver_start failed for vhost user "
> -                     "client port: %s\n", dev->up.name);
> -            goto unlock;
>          }
> +        goto unlock;
>      }
> 
> -    err = dpdk_vhost_reconfigure_helper(dev);
> +    if (rte_eth_dev_is_valid_port(dev->port_id)) {
> +        err = dpdk_vhost_reconfigure_helper(netdev);
> +    }
> 
>  unlock:
>      ovs_mutex_unlock(&dev->mutex);
> @@ -3861,9 +3671,7 @@ unlock:
> 
>  #define NETDEV_DPDK_CLASS(NAME, INIT, CONSTRUCT, DESTRUCT,    \
>                            SET_CONFIG, SET_TX_MULTIQ, SEND,    \
> -                          GET_CARRIER, GET_STATS,			  \
> -                          GET_CUSTOM_STATS,					  \
> -                          GET_FEATURES, GET_STATUS,           \
> +                          GET_STATUS,                         \
>                            RECONFIGURE, RXQ_RECV)              \
>  {                                                             \
>      NAME,                                                     \
> @@ -3893,12 +3701,12 @@ unlock:
>      netdev_dpdk_get_mtu,                                      \
>      netdev_dpdk_set_mtu,                                      \
>      netdev_dpdk_get_ifindex,                                  \
> -    GET_CARRIER,                                              \
> +    netdev_dpdk_get_carrier,                                  \
>      netdev_dpdk_get_carrier_resets,                           \
>      netdev_dpdk_set_miimon,                                   \
> -    GET_STATS,                                                \
> -    GET_CUSTOM_STATS,                                         \
> -    GET_FEATURES,                                             \
> +    netdev_dpdk_get_stats,                                    \
> +    netdev_dpdk_get_custom_stats,                             \
> +    netdev_dpdk_get_features,                                 \
>      NULL,                       /* set_advertisements */      \
>      NULL,                       /* get_pt_mode */             \
>                                                                \
> @@ -3945,10 +3753,6 @@ static const struct netdev_class dpdk_class =
>          netdev_dpdk_set_config,
>          netdev_dpdk_set_tx_multiq,
>          netdev_dpdk_eth_send,
> -        netdev_dpdk_get_carrier,
> -        netdev_dpdk_get_stats,
> -        netdev_dpdk_get_custom_stats,
> -        netdev_dpdk_get_features,
>          netdev_dpdk_get_status,
>          netdev_dpdk_reconfigure,
>          netdev_dpdk_rxq_recv);
> @@ -3962,10 +3766,6 @@ static const struct netdev_class dpdk_ring_class =
>          netdev_dpdk_ring_set_config,
>          netdev_dpdk_set_tx_multiq,
>          netdev_dpdk_ring_send,
> -        netdev_dpdk_get_carrier,
> -        netdev_dpdk_get_stats,
> -        netdev_dpdk_get_custom_stats,
> -        netdev_dpdk_get_features,
>          netdev_dpdk_get_status,
>          netdev_dpdk_reconfigure,
>          netdev_dpdk_rxq_recv);
> @@ -3979,10 +3779,6 @@ static const struct netdev_class dpdk_vhost_class =
>          NULL,
>          NULL,
>          netdev_dpdk_vhost_send,
> -        netdev_dpdk_vhost_get_carrier,
> -        netdev_dpdk_vhost_get_stats,
> -        NULL,
> -        NULL,
>          netdev_dpdk_vhost_user_get_status,
>          netdev_dpdk_vhost_reconfigure,
>          netdev_dpdk_vhost_rxq_recv);
> @@ -3995,10 +3791,6 @@ static const struct netdev_class
> dpdk_vhost_client_class =
>          netdev_dpdk_vhost_client_set_config,
>          NULL,
>          netdev_dpdk_vhost_send,
> -        netdev_dpdk_vhost_get_carrier,
> -        netdev_dpdk_vhost_get_stats,
> -        NULL,
> -        NULL,
>          netdev_dpdk_vhost_user_get_status,
>          netdev_dpdk_vhost_client_reconfigure,
>          netdev_dpdk_vhost_rxq_recv);
> diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
> index 3d21b01..baefa2b 100644
> --- a/tests/system-dpdk.at
> +++ b/tests/system-dpdk.at
> @@ -68,6 +68,7 @@ OVS_VSWITCHD_STOP("/does not exist. The Open vSwitch
> kernel module is probably n
>  /failed to connect to \/tmp\/dpdkvhostclient0: No such file or
> directory/d
>  /Global register is changed during/d
>  /EAL: No free hugepages reported in hugepages-1048576kB/d
> +/Rx checksum offload is not supported/d
>  ")
>  AT_CLEANUP
>  dnl ---------------------------------------------------------------------
> -----
> --
> 2.7.5
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Stokes, Ian May 29, 2018, 4:29 p.m. UTC | #3
> On Mon, May 21, 2018 at 04:44:13PM +0100, Ciara Loftus wrote:
> > The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> > 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> > all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> > calls to DPDK's librte_vhost library are removed and replaced with
> > librte_ether API calls, for which most of the infrastructure is
> > already in place.
> >
> > This change has a number of benefits, including:
> > * Reduced codebase (~200LOC removed)
> > * More features automatically enabled for vHost ports eg. custom stats
> >   and additional get_status information.
> > * OVS can be ignorant to changes in the librte_vhost API between DPDK
> >   releases potentially making upgrades easier and the OVS codebase less
> >   susceptible to change.
> >
> > The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS which
> > is set and can be modified in the DPDK configuration. Prior to this
> > patch this only applied to 'dpdk' and 'dpdkr' ports, but now applies
> > to all DPDK port types including vHost User.
> >
> > Performance (pps) of the different topologies p2p, pvp, pvvp and vv
> > has been measured to remain within a +/- 5% margin of existing
> performance.
> 
> Thanks for putting this together.
> 
> I think when this idea was discussed at least in my head we would pretty
> much kill any vhost specific info and use a standard eth API instead.
> However, it doesn't look like to be case, we still have the mtu and queue
> issues, special construct/destruct, send, and etc which IMHO defeats the
> initial goal.

I agree, I think that would be the ideal situation but it seems we're not there yet.
I wonder if that is something that could be changed and fed back to DPDK? If we will always have to have separate implementations, is that reflective of OVS requirements or of a gap in DPDK's implementation of the vhost PMD?

> 
> Leaving that aside for a moment, I wonder about imposed limitations if we
> switch to the eth API too. I mean, things that we can do today because OVS
> is managing vhost that we won't be able after the API switch.

I've been thinking of this situation also. But one concern is: by not using the vhost PMD, will there be features that are unavailable to vhost in OVS?

Nothing comes to mind for now, and as long as we continue to access the DPDK vhost library that should be ok. However, it's something we should keep an eye on in the future (for example, we recently came across a DPDK function that could not be used with DPDK compiled for shared libs).

It would be interesting to see where the DPDK community is trending with vhost development in the future WRT this.

Ian
 
> 
> Thanks,
> fbl
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ciara Loftus June 1, 2018, 1:40 p.m. UTC | #4
> 
> > On Mon, May 21, 2018 at 04:44:13PM +0100, Ciara Loftus wrote:
> > > The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> > > 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> > > all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> > > calls to DPDK's librte_vhost library are removed and replaced with
> > > librte_ether API calls, for which most of the infrastructure is
> > > already in place.
> > >
> > > This change has a number of benefits, including:
> > > * Reduced codebase (~200LOC removed)
> > > * More features automatically enabled for vHost ports eg. custom stats
> > >   and additional get_status information.
> > > * OVS can be ignorant to changes in the librte_vhost API between DPDK
> > >   releases potentially making upgrades easier and the OVS codebase less
> > >   susceptible to change.
> > >
> > > The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS
> which
> > > is set and can be modified in the DPDK configuration. Prior to this
> > > patch this only applied to 'dpdk' and 'dpdkr' ports, but now applies
> > > to all DPDK port types including vHost User.
> > >
> > > Performance (pps) of the different topologies p2p, pvp, pvvp and vv
> > > has been measured to remain within a +/- 5% margin of existing
> > performance.
> >
> > Thanks for putting this together.
> >
> > I think when this idea was discussed at least in my head we would pretty
> > much kill any vhost specific info and use a standard eth API instead.
> > However, it doesn't look like to be case, we still have the mtu and queue
> > issues, special construct/destruct, send, and etc which IMHO defeats the
> > initial goal.
> 
> I agree, I think that would be the ideal situation but it seems where not there
> yet.
> I wonder if that is something that could be changed and fed back to DPDK? If
> we will always have to have the separate implementations is that reflective
> of OVS requirements or a gap in DPDK implementation of vhost PMD?

Hi Ian & Flavio,

Thank you both for your responses. I agree, right now we are not at the ideal scenario with this API, which would probably be closer to having the netdev_dpdk and netdev_dpdk_vhost* classes equivalent. However, 4 functions have become common (get_carrier, stats, custom_stats, features) and many of the remainder have some element of commonality through helper functions (send, receive, status, etc.). The hope would be that going forward we could narrow the gap through both OVS and DPDK changes. I think it would be difficult to narrow that gap if we opt for an "all or nothing" approach now.

> 
> >
> > Leaving that aside for a moment, I wonder about imposed limitations if we
> > switch to the eth API too. I mean, things that we can do today because OVS
> > is managing vhost that we won't be able after the API switch.
> 
> I've been thinking of this situation also. But one concern is by not using the
> vhost PMD will there be features that are unavailable to vhost in ovs?
> 
> Nothing comes to mind for now, and as long as we continue to access DPDK
> vhost library that should be ok. However it's something we should keep an
> eye in the future (For example we recently had an example of a DPDK
> function that could not be used with DPDK compiled for shared libs).
> 
> It would be interesting to see where the DPDK community are trending
> towards with vhost development in the future WRT this.

Feature-wise, it appears the development of any new DPDK vHost feature includes relevant support for that feature in the PMD. Dequeue zero copy and vhost iommu are examples of these. So going forward I don't see any issues there.

In my development and testing of this patch I haven't come across any limitations, other than that out of the box one is limited to a maximum of 32 vHost ports, as defined by RTE_MAX_ETHPORTS. I would be interested to hear if that would be a concern for users.
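
For anyone needing more than that, the cap itself is the standard DPDK build-time option in config/common_base (the value below is the upstream default as far as I'm aware), so raising it means rebuilding DPDK and OVS against it:

    CONFIG_RTE_MAX_ETHPORTS=32

i.e. it can be bumped to whatever total of physical + vHost ports a deployment requires, at the cost of a rebuild.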

On the other hand, there are many new things we can do with the API switch too, eg. more information in get_status and custom statistics, and hopefully more going forward. Although I understand preserving existing functionality is critical.

I understand this is a large patch and might take some time to review, but I would definitely welcome any further high-level feedback, especially around the topics above, from anybody in the community interested in netdev-dpdk/vHost.

Thanks,
Ciara

> 
> Ian
> 
> >
> > Thanks,
> > fbl
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Flavio Leitner June 6, 2018, 5:46 p.m. UTC | #5
On Fri, Jun 01, 2018 at 01:40:31PM +0000, Loftus, Ciara wrote:
> > 
> > > On Mon, May 21, 2018 at 04:44:13PM +0100, Ciara Loftus wrote:
> > > > The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> > > > 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> > > > all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> > > > calls to DPDK's librte_vhost library are removed and replaced with
> > > > librte_ether API calls, for which most of the infrastructure is
> > > > already in place.
> > > >
> > > > This change has a number of benefits, including:
> > > > * Reduced codebase (~200LOC removed)
> > > > * More features automatically enabled for vHost ports eg. custom stats
> > > >   and additional get_status information.
> > > > * OVS can be ignorant to changes in the librte_vhost API between DPDK
> > > >   releases potentially making upgrades easier and the OVS codebase less
> > > >   susceptible to change.
> > > >
> > > > The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS
> > which
> > > > is set and can be modified in the DPDK configuration. Prior to this
> > > > patch this only applied to 'dpdk' and 'dpdkr' ports, but now applies
> > > > to all DPDK port types including vHost User.
> > > >
> > > > Performance (pps) of the different topologies p2p, pvp, pvvp and vv
> > > > has been measured to remain within a +/- 5% margin of existing
> > > performance.
> > >
> > > Thanks for putting this together.
> > >
> > > I think when this idea was discussed at least in my head we would pretty
> > > much kill any vhost specific info and use a standard eth API instead.
> > > However, it doesn't look like to be case, we still have the mtu and queue
> > > issues, special construct/destruct, send, and etc which IMHO defeats the
> > > initial goal.
> > 
> > I agree, I think that would be the ideal situation but it seems where not there
> > yet.
> > I wonder if that is something that could be changed and fed back to DPDK? If
> > we will always have to have the separate implementations is that reflective
> > of OVS requirements or a gap in DPDK implementation of vhost PMD?
> 
> Hi Ian & Flavio,

Hi Ciara,

> Thank you both for your responses. I agree, right now we are not at
> the ideal scenario using this API which would probably be closer to
> having the netdev_dpdk and netdev_dpdk_vhost* classes equivalent.
> However 4 functions have become common (get_ carrier, stats,
> custom_stats, features) and many of the remainder have some element
> of commonality through helper functions (send, receive, status,
> etc.). The hope would be that going forward we could narrow the gap
> through both OVS and DPDK changes. I think it would be difficult to
> narrow that gap if we opt for an "all or nothing" approach now.

It is all or nothing now because if we opt to apply this patch, we are
all in and hoping that the API will evolve.  The problem I see is that
as the API evolves, there will be breakage/bugs and OVS would be exposed.

So, I still see value in OVS moving to a common API, but I don't see a
good trade-off for OVS as a project moving to it in its current state.
DPDK support isn't experimental anymore.

> > > Leaving that aside for a moment, I wonder about imposed limitations if we
> > > switch to the eth API too. I mean, things that we can do today because OVS
> > > is managing vhost that we won't be able after the API switch.
> > 
> > I've been thinking of this situation also. But one concern is by not using the
> > vhost PMD will there be features that are unavailable to vhost in ovs?
> > 
> > Nothing comes to mind for now, and as long as we continue to access DPDK
> > vhost library that should be ok. However it's something we should keep an
> > eye in the future (For example we recently had an example of a DPDK
> > function that could not be used with DPDK compiled for shared libs).
> > 
> > It would be interesting to see where the DPDK community are trending
> > towards with vhost development in the future WRT this.
> 
> Feature-wise, it appears the development of any new DPDK vHost
> feature includes relevant support for that feature in the PMD.
> Dequeue zero copy and vhost iommu are examples of these. So going
> forward I don't see any issues there.
> 
> In my development and testing of this patch I haven't come across
> any limitations other than that out-of-the-box one is limited to a
> maximum of 32 vHost ports as defined by RTE_MAX_ETHPORTS. I would be
> interested to hear if that would be a concern for users.

I am aware of cases where there are multiple vhost-user interfaces
to the same VM, so that limit sounds quite low, but I will double
check.

> On the other hand there are many more new things we can do with the
> API switch too eg. more information in get_status & custom
> statistics and hopefully more going forward. Although understand
> preserving existing functionality is critical.
> 
> Understand this is a large patch and might take some time to review.
> But would definitely welcome any further high level feedback
> especially around the topics above from anybody in the community
> interested in netdev-dpdk/vHost.

I'd like to thank you again for putting this together. It is one thing to
talk about it and another to have this patch showing the reality.

What would be the vHost PMD goal from the DPDK project point of view?
Because if DPDK is planning to drop the vhost library public API, we
will need to plan and switch anyway at some point, and I agree with
you when you say we should close the gap soon.

However, if that's not the case, it sounds like when the vhost PMD API is
compatible enough, current OVS would be able to use it as an ordinary
DPDK device with maybe some special parameters. Assuming that it would
be mature/good enough, the only change for OVS would be to remove pretty
much most of the vhost code.
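
Just to make the "special parameters" concrete for the discussion: as far as I can tell from the patch, dpdk_attach_vhost_pmd() ends up handing rte_eth_dev_attach() a vdev devargs string of roughly this shape (the values here are only illustrative):

    net_vhost0,iface=/tmp/dpdkvhostclient0,queues=<OVS_VHOST_MAX_QUEUE_NUM>,client=1,dequeue-zero-copy=0,iommu-support=0

so from the ethdev side it really is just another devargs-created port, plus the vhost-only handling discussed above.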
Ciara Loftus June 21, 2018, 1:15 p.m. UTC | #6
> On Fri, Jun 01, 2018 at 01:40:31PM +0000, Loftus, Ciara wrote:
> > >
> > > > On Mon, May 21, 2018 at 04:44:13PM +0100, Ciara Loftus wrote:
> > > > > The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> > > > > 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> > > > > all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> > > > > calls to DPDK's librte_vhost library are removed and replaced with
> > > > > librte_ether API calls, for which most of the infrastructure is
> > > > > already in place.
> > > > >
> > > > > This change has a number of benefits, including:
> > > > > * Reduced codebase (~200LOC removed)
> > > > > * More features automatically enabled for vHost ports eg. custom
> stats
> > > > >   and additional get_status information.
> > > > > * OVS can be ignorant to changes in the librte_vhost API between
> DPDK
> > > > >   releases potentially making upgrades easier and the OVS codebase
> less
> > > > >   susceptible to change.
> > > > >
> > > > > The sum of all DPDK port types must not exceed
> RTE_MAX_ETHPORTS
> > > which
> > > > > is set and can be modified in the DPDK configuration. Prior to this
> > > > > patch this only applied to 'dpdk' and 'dpdkr' ports, but now applies
> > > > > to all DPDK port types including vHost User.
> > > > >
> > > > > Performance (pps) of the different topologies p2p, pvp, pvvp and vv
> > > > > has been measured to remain within a +/- 5% margin of existing
> > > > performance.
> > > >
> > > > Thanks for putting this together.
> > > >
> > > > I think when this idea was discussed at least in my head we would
> pretty
> > > > much kill any vhost specific info and use a standard eth API instead.
> > > > However, it doesn't look like to be case, we still have the mtu and
> queue
> > > > issues, special construct/destruct, send, and etc which IMHO defeats
> the
> > > > initial goal.
> > >
> > > I agree, I think that would be the ideal situation but it seems where not
> there
> > > yet.
> > > I wonder if that is something that could be changed and fed back to
> DPDK? If
> > > we will always have to have the separate implementations is that
> reflective
> > > of OVS requirements or a gap in DPDK implementation of vhost PMD?
> >
> > Hi Ian & Flavio,
> 
> Hi Ciara,
> 
> > Thank you both for your responses. I agree, right now we are not at
> > the ideal scenario using this API which would probably be closer to
> > having the netdev_dpdk and netdev_dpdk_vhost* classes equivalent.
> > However 4 functions have become common (get_ carrier, stats,
> > custom_stats, features) and many of the remainder have some element
> > of commonality through helper functions (send, receive, status,
> > etc.). The hope would be that going forward we could narrow the gap
> > through both OVS and DPDK changes. I think it would be difficult to
> > narrow that gap if we opt for an "all or nothing" approach now.
> 
> It is all or nothing now because if we opt to apply this patch, we are
> all in and hoping that the API will evolve.  The problem I see is that
> as the API evolve, there will breakage/bugs and OVS would be exposed.
> 
> So, I still see value in OVS moving to common API, but I don't see a
> good trade off for OVS as a project moving to it in its current state.
> DPDK support isn't experimental anymore.
> 
> > > > Leaving that aside for a moment, I wonder about imposed limitations if
> we
> > > > switch to the eth API too. I mean, things that we can do today because
> OVS
> > > > is managing vhost that we won't be able after the API switch.
> > >
> > > I've been thinking of this situation also. But one concern is by not using
> the
> > > vhost PMD will there be features that are unavailable to vhost in ovs?
> > >
> > > Nothing comes to mind for now, and as long as we continue to access
> DPDK
> > > vhost library that should be ok. However it's something we should keep
> an
> > > eye in the future (For example we recently had an example of a DPDK
> > > function that could not be used with DPDK compiled for shared libs).
> > >
> > > It would be interesting to see where the DPDK community are trending
> > > towards with vhost development in the future WRT this.
> >
> > Feature-wise, it appears the development of any new DPDK vHost
> > feature includes relevant support for that feature in the PMD.
> > Dequeue zero copy and vhost iommu are examples of these. So going
> > forward I don't see any issues there.
> >
> > In my development and testing of this patch I haven't come across
> > any limitations other than that out-of-the-box one is limited to a
> > maximum of 32 vHost ports as defined by RTE_MAX_ETHPORTS. I would be
> > interested to hear if that would be a concern for users.
> 
> I am aware of cases where there are multiple vhost-user interfaces
> to the same VM, then that limit sounds quite low, but I will double
> check.
> 
> > On the other hand there are many more new things we can do with the
> > API switch too eg. more information in get_status & custom
> > statistics and hopefully more going forward. Although understand
> > preserving existing functionality is critical.
> >
> > Understand this is a large patch and might take some time to review.
> > But would definitely welcome any further high level feedback
> > especially around the topics above from anybody in the community
> > interested in netdev-dpdk/vHost.
> 
> I'd like to thank you again for putting this together. One thing is
> talking about it and another is having this patch showing the reality.

Thanks for your reply Flavio, and apologies for the delayed response - I was on vacation.

> 
> What would be the vHost PMD goal from the DPDK project point of view?
> Because if DPDK is planning to drop the vhost library public API, we
> will need to plan and switch anyways at some point and I agree with
> you when you say we should close the gap soon.

+ Maxime, Tiwei & Zhihong DPDK vHost/vHost PMD maintainers

I have posted a patch to the OVS mailing list which switches the codebase from using the DPDK vHost library API to the vHost PMD via the ether API. There is some discussion about the change and I think input from the DPDK side would be valuable.

1. As asked by Flavio, will the vHost library API continue to be public or is there a plan to move away from that?
2. Are there any upcoming planned changes to the vHost PMD?
3. Would changes to the vHost PMD be welcome, eg. changes that would ease the integration with OVS like some of those mentioned below? Some context: we currently use the ether API for physical ports/vdevs and want to reuse this code for vHost with minimal vHost-specific logic; however, as it stands that's not totally possible at the moment.

> 
> However, if that's not the case, it sounds like when vhost PMD API is
> compatible enough, current OVS would be able to use it as an ordinary
> DPDK device with maybe some special parameters. Assuming that it would
> be mature/good enough, the only change for OVS would be to remove
> pretty
> much most of the vhost code.

I reviewed the code in the context of closing the gap between dpdk ethdevs and vhost ethdevs and have put together a list of the 'special cases' for vHost (client) and some considerations.

1. netdev_dpdk_vhost_construct
Here we set dev->type = DPDK_DEV_VHOST.
Maybe we could remove the dev->type variable and rework the code such that each netdev_class ensures the right code is called for that port type, with no need for the 'if dev->type' check (a rough sketch follows this item). The trade-off would be the code inflation this change would cause, but the benefit would be having a common netdev_dpdk_construct function.
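
A rough sketch of what I mean, purely illustrative and not part of this RFC (the common helper would lose its type argument):

    static int
    netdev_dpdk_vhost_construct(struct netdev *netdev)
    {
        struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
        int err;

        /* Type-free common helper; only the vHost-specific steps live in
         * the vHost class itself. */
        err = common_construct(netdev, DPDK_ETH_PORT_ID_INVALID,
                               rte_lcore_to_socket_id(rte_get_master_lcore()));
        if (!err) {
            err = dpdk_attach_vhost_pmd(dev, VHOST_SERVER_MODE);
        }
        return err;
    }

Each class would need a similar wrapper, which is the code inflation mentioned above.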

2. netdev_dpdk_vhost_destruct
Here we:
i. Unregister new/destroy/vsc callbacks.
Could be removed by making this automatic in DPDK eg. via a flag in rte_eth_dev_stop?
ii. Free vhost_pmd_id from pool
DPDK could be changed to accept non-unique vdev device names and generate and manage their names ie. IDs

3. netdev_dpdk_vhost_send
i. Retrieve qid from tx_q map.
Could perhaps be avoided by configuring netdev->n_txq to reflect the number of enabled queues rather than the total number of queues. Either derive this information from vring_state_changed cbs or via a new API (a rough sketch follows this item). This wouldn't work if individual queues can be enabled/disabled in Virtio - not sure if this is the case?
ii. Verify dev->vhost_reconfigured
This flag tells us the device is active and we've configured the mempool & queue info for the device.
We don't check these things before sending to a phy port, so perhaps they could be removed for vHost or introduced for phy for consistency.
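
To illustrate 3.i, a very rough sketch (not the callback in this patch) of how the PMD's queue-state events could be used to track the enabled TX queues, assuming the rte_eth_vhost_get_queue_event() semantics from rte_eth_vhost.h:

    static int
    vring_state_changed_callback(dpdk_port_t port_id,
                                 enum rte_eth_event_type type OVS_UNUSED,
                                 void *param OVS_UNUSED,
                                 void *ret_param OVS_UNUSED)
    {
        struct rte_eth_vhost_queue_event event;

        /* Drain all pending queue-state events for this port. */
        while (!rte_eth_vhost_get_queue_event(port_id, &event)) {
            if (!event.rx) {
                /* A PMD TX queue (guest RX vring) changed state: record
                 * event.queue_id / event.enable and request a reconfigure
                 * so n_txq can follow the enabled-queue count. */
            }
        }
        return 0;
    }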

4. netdev_dpdk_get_status
There will always be extra code here for vHost to report vHost-specific info eg. socket path, client/server, etc.

5. netdev_dpdk_vhost_reconfigure
i. Set queue numbers. Phy ports allow the user to specify n_rxq and set n_txq to equal the number of pmd threads. vHost uses the number of queues configured in QEMU to set those values. This probably has to stay the same.
ii. Remapping of queues - see point 3.i.
iii. Set vhost_reconfigured - see point 3.ii

6. netdev_dpdk_vhost_rxq_recv
Verify dev->vhost_reconfigured - see point 3.ii

Thanks,
Ciara

> 
> --
> Flavio
diff mbox series

Patch

diff --git a/NEWS b/NEWS
index ec548b0..55dc513 100644
--- a/NEWS
+++ b/NEWS
@@ -30,6 +30,9 @@  Post-v2.9.0
      * New 'check-dpdk' Makefile target to run a new system testsuite.
        See Testing topic for the details.
      * Add LSC interrupt support for DPDK physical devices.
+     * Use DPDK's vHost PMD instead of direct library calls. This means the
+       maximum number of vHost ports is equal to RTE_MAX_ETHPORTS as defined
+       in the DPDK configuration.
    - Userspace datapath:
      * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a single PMD
      * Detailed PMD performance metrics available with new command
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 00dd974..6cfc6fc 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -22,6 +22,7 @@ 
 #include <sys/stat.h>
 #include <getopt.h>
 
+#include <rte_ethdev.h>
 #include <rte_log.h>
 #include <rte_memzone.h>
 #include <rte_version.h>
@@ -32,6 +33,7 @@ 
 
 #include "dirs.h"
 #include "fatal-signal.h"
+#include "id-pool.h"
 #include "netdev-dpdk.h"
 #include "openvswitch/dynamic-string.h"
 #include "openvswitch/vlog.h"
@@ -43,6 +45,7 @@  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
 
 static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
 static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
+static struct id_pool *vhost_driver_ids;  /* Pool of IDs for vHost PMDs */
 
 static int
 process_vhost_flags(char *flag, const char *default_val, int size,
@@ -457,6 +460,8 @@  dpdk_init__(const struct smap *ovs_other_config)
     }
 #endif
 
+    vhost_driver_ids = id_pool_create(0, RTE_MAX_ETHPORTS);
+
     /* Finally, register the dpdk classes */
     netdev_dpdk_register();
 }
@@ -498,6 +503,12 @@  dpdk_vhost_iommu_enabled(void)
     return vhost_iommu_enabled;
 }
 
+struct id_pool *
+dpdk_get_vhost_id_pool(void)
+{
+    return vhost_driver_ids;
+}
+
 void
 dpdk_set_lcore_id(unsigned cpu)
 {
diff --git a/lib/dpdk.h b/lib/dpdk.h
index b041535..c7143f7 100644
--- a/lib/dpdk.h
+++ b/lib/dpdk.h
@@ -39,5 +39,6 @@  void dpdk_set_lcore_id(unsigned cpu);
 const char *dpdk_get_vhost_sock_dir(void);
 bool dpdk_vhost_iommu_enabled(void);
 void print_dpdk_version(void);
+struct id_pool *dpdk_get_vhost_id_pool(void);
 
 #endif /* dpdk.h */
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index afddf6d..defc51d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -31,6 +31,7 @@ 
 #include <rte_cycles.h>
 #include <rte_errno.h>
 #include <rte_eth_ring.h>
+#include <rte_eth_vhost.h>
 #include <rte_ethdev.h>
 #include <rte_malloc.h>
 #include <rte_mbuf.h>
@@ -44,6 +45,7 @@ 
 #include "dpdk.h"
 #include "dpif-netdev.h"
 #include "fatal-signal.h"
+#include "id-pool.h"
 #include "netdev-provider.h"
 #include "netdev-vport.h"
 #include "odp-util.h"
@@ -63,6 +65,7 @@ 
 #include "unixctl.h"
 
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
+enum {VHOST_SERVER_MODE, VHOST_CLIENT_MODE};
 
 VLOG_DEFINE_THIS_MODULE(netdev_dpdk);
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
@@ -122,6 +125,7 @@  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
 #define XSTAT_RX_BROADCAST_PACKETS       "rx_broadcast_packets"
 #define XSTAT_TX_BROADCAST_PACKETS       "tx_broadcast_packets"
 #define XSTAT_RX_UNDERSIZED_ERRORS       "rx_undersized_errors"
+#define XSTAT_RX_UNDERSIZE_PACKETS       "rx_undersize_packets"
 #define XSTAT_RX_OVERSIZE_ERRORS         "rx_oversize_errors"
 #define XSTAT_RX_FRAGMENTED_ERRORS       "rx_fragmented_errors"
 #define XSTAT_RX_JABBER_ERRORS           "rx_jabber_errors"
@@ -135,7 +139,7 @@  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
 /* Maximum size of Physical NIC Queues */
 #define NIC_PORT_MAX_Q_SIZE 4096
 
-#define OVS_VHOST_MAX_QUEUE_NUM 1024  /* Maximum number of vHost TX queues. */
+#define OVS_VHOST_MAX_QUEUE_NUM RTE_MAX_QUEUES_PER_PORT /* Max vHost TXQs */
 #define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not initialized. */
 #define OVS_VHOST_QUEUE_DISABLED    (-2) /* Queue was disabled by guest and not
                                           * yet mapped to another queue. */
@@ -170,21 +174,6 @@  static const struct rte_eth_conf port_conf = {
     },
 };
 
-/*
- * These callbacks allow virtio-net devices to be added to vhost ports when
- * configuration has been fully completed.
- */
-static int new_device(int vid);
-static void destroy_device(int vid);
-static int vring_state_changed(int vid, uint16_t queue_id, int enable);
-static const struct vhost_device_ops virtio_net_device_ops =
-{
-    .new_device =  new_device,
-    .destroy_device = destroy_device,
-    .vring_state_changed = vring_state_changed,
-    .features_changed = NULL
-};
-
 enum { DPDK_RING_SIZE = 256 };
 BUILD_ASSERT_DECL(IS_POW2(DPDK_RING_SIZE));
 enum { DRAIN_TSC = 200000ULL };
@@ -379,6 +368,8 @@  struct netdev_dpdk {
         char *devargs;  /* Device arguments for dpdk ports */
         struct dpdk_tx_queue *tx_q;
         struct rte_eth_link link;
+        /* ID of vhost user port given to the PMD driver */
+        int32_t vhost_pmd_id;
     );
 
     PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline1,
@@ -472,7 +463,12 @@  static void netdev_dpdk_vhost_destruct(struct netdev *netdev);
 
 static void netdev_dpdk_clear_xstats(struct netdev_dpdk *dev);
 
-int netdev_dpdk_get_vid(const struct netdev_dpdk *dev);
+static int link_status_changed_callback(dpdk_port_t port_id,
+        enum rte_eth_event_type type, void *param, void *ret_param);
+static int vring_state_changed_callback(dpdk_port_t port_id,
+        enum rte_eth_event_type type, void *param, void *ret_param);
+static void netdev_dpdk_remap_txqs(struct netdev_dpdk *dev);
+static void netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev);
 
 struct ingress_policer *
 netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
@@ -812,11 +808,13 @@  dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
             break;
         }
 
-        diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
-        if (diag) {
-            VLOG_ERR("Interface %s MTU (%d) setup error: %s",
-                    dev->up.name, dev->mtu, rte_strerror(-diag));
-            break;
+        if (dev->type == DPDK_DEV_ETH) {
+            diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
+            if (diag) {
+                VLOG_ERR("Interface %s MTU (%d) setup error: %s",
+                        dev->up.name, dev->mtu, rte_strerror(-diag));
+                break;
+            }
         }
 
         for (i = 0; i < n_txq; i++) {
@@ -851,8 +849,13 @@  dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
             continue;
         }
 
-        dev->up.n_rxq = n_rxq;
-        dev->up.n_txq = n_txq;
+        /* Only set n_*xq for physical devices. vHost User devices will set
+         * this value correctly using info from the virtio backend.
+         */
+        if (dev->type == DPDK_DEV_ETH) {
+            dev->up.n_rxq = n_rxq;
+            dev->up.n_txq = n_txq;
+        }
 
         return 0;
     }
@@ -893,8 +896,17 @@  dpdk_eth_dev_init(struct netdev_dpdk *dev)
         dev->hw_ol_features |= NETDEV_RX_CHECKSUM_OFFLOAD;
     }
 
-    n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
-    n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
+    if (dev->type != DPDK_DEV_ETH) {
+        /* We don't know how many queues QEMU will request so we need to
+         * provision for the maximum, as if we configure less up front than
+         * what QEMU configures later, those additional queues will never be
+         * available to us. */
+        n_rxq = OVS_VHOST_MAX_QUEUE_NUM;
+        n_txq = OVS_VHOST_MAX_QUEUE_NUM;
+    } else {
+        n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
+        n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
+    }
 
     diag = dpdk_eth_dev_port_config(dev, n_rxq, n_txq);
     if (diag) {
@@ -997,9 +1009,8 @@  common_construct(struct netdev *netdev, dpdk_port_t port_no,
     dev->requested_mtu = ETHER_MTU;
     dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
     dev->requested_lsc_interrupt_mode = 0;
-    ovsrcu_index_init(&dev->vid, -1);
+    dev->vhost_pmd_id = -1;
     dev->vhost_reconfigured = false;
-    dev->attached = false;
 
     ovsrcu_init(&dev->qos_conf, NULL);
 
@@ -1057,19 +1068,62 @@  dpdk_dev_parse_name(const char dev_name[], const char prefix[],
 }
 
 static int
-vhost_common_construct(struct netdev *netdev)
-    OVS_REQUIRES(dpdk_mutex)
+dpdk_attach_vhost_pmd(struct netdev_dpdk *dev, int mode)
 {
-    int socket_id = rte_lcore_to_socket_id(rte_get_master_lcore());
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    char *devargs;
+    int err = 0;
+    dpdk_port_t port_no = 0;
+    uint32_t driver_id = 0;
+    int iommu_enabled = 0;
+    int zc_enabled = 0;
 
-    dev->tx_q = netdev_dpdk_alloc_txq(OVS_VHOST_MAX_QUEUE_NUM);
-    if (!dev->tx_q) {
-        return ENOMEM;
+    if (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY) {
+        zc_enabled = 1;
+    }
+
+    if (dpdk_vhost_iommu_enabled()) {
+        iommu_enabled = 1;
     }
 
-    return common_construct(netdev, DPDK_ETH_PORT_ID_INVALID,
-                            DPDK_DEV_VHOST, socket_id);
+    if (id_pool_alloc_id(dpdk_get_vhost_id_pool(), &driver_id)) {
+        devargs = xasprintf("net_vhost%u,iface=%s,queues=%i,client=%i,"
+                            "dequeue-zero-copy=%i,iommu-support=%i",
+                 driver_id, dev->vhost_id, OVS_VHOST_MAX_QUEUE_NUM, mode,
+                 zc_enabled, iommu_enabled);
+        err = rte_eth_dev_attach(devargs, &port_no);
+        if (!err) {
+            dev->attached = true;
+            dev->port_id = port_no;
+            dev->vhost_pmd_id = driver_id;
+            err = rte_vhost_driver_disable_features(dev->vhost_id,
+                                1ULL << VIRTIO_NET_F_HOST_TSO4
+                                | 1ULL << VIRTIO_NET_F_HOST_TSO6
+                                | 1ULL << VIRTIO_NET_F_CSUM);
+            if (err) {
+                VLOG_ERR("rte_vhost_driver_disable_features failed for vhost "
+                         "user client port: %s\n", dev->up.name);
+            }
+
+            rte_eth_dev_callback_register(dev->port_id,
+                                          RTE_ETH_EVENT_QUEUE_STATE,
+                                          vring_state_changed_callback,
+                                          NULL);
+            rte_eth_dev_callback_register(dev->port_id,
+                                          RTE_ETH_EVENT_INTR_LSC,
+                                          link_status_changed_callback,
+                                          NULL);
+        } else {
+            id_pool_free_id(dpdk_get_vhost_id_pool(), driver_id);
+            VLOG_ERR("Failed to attach vhost-user device %s to DPDK",
+                     dev->vhost_id);
+        }
+    } else {
+        VLOG_ERR("Unable to create vhost-user device %s - too many vhost-user "
+                 "devices registered with PMD", dev->vhost_id);
+        err = ENODEV;
+    }
+
+    return err;
 }
 
 static int
@@ -1082,7 +1136,7 @@  netdev_dpdk_vhost_construct(struct netdev *netdev)
     /* 'name' is appended to 'vhost_sock_dir' and used to create a socket in
      * the file system. '/' or '\' would traverse directories, so they're not
      * acceptable in 'name'. */
-    if (strchr(name, '/') || strchr(name, '\\')) {
+    if (strchr(name, '/') || strchr(name, '\\') || strchr(name, ',')) {
         VLOG_ERR("\"%s\" is not a valid name for a vhost-user port. "
                  "A valid name must not include '/' or '\\'",
                  name);
@@ -1097,46 +1151,23 @@  netdev_dpdk_vhost_construct(struct netdev *netdev)
              dpdk_get_vhost_sock_dir(), name);
 
     dev->vhost_driver_flags &= ~RTE_VHOST_USER_CLIENT;
-    err = rte_vhost_driver_register(dev->vhost_id, dev->vhost_driver_flags);
-    if (err) {
-        VLOG_ERR("vhost-user socket device setup failure for socket %s\n",
-                 dev->vhost_id);
-        goto out;
-    } else {
+    err = dpdk_attach_vhost_pmd(dev, VHOST_SERVER_MODE);
+    if (!err) {
         fatal_signal_add_file_to_unlink(dev->vhost_id);
         VLOG_INFO("Socket %s created for vhost-user port %s\n",
                   dev->vhost_id, name);
-    }
-
-    err = rte_vhost_driver_callback_register(dev->vhost_id,
-                                                &virtio_net_device_ops);
-    if (err) {
-        VLOG_ERR("rte_vhost_driver_callback_register failed for vhost user "
-                 "port: %s\n", name);
-        goto out;
-    }
-
-    err = rte_vhost_driver_disable_features(dev->vhost_id,
-                                1ULL << VIRTIO_NET_F_HOST_TSO4
-                                | 1ULL << VIRTIO_NET_F_HOST_TSO6
-                                | 1ULL << VIRTIO_NET_F_CSUM);
-    if (err) {
-        VLOG_ERR("rte_vhost_driver_disable_features failed for vhost user "
-                 "port: %s\n", name);
-        goto out;
-    }
-
-    err = rte_vhost_driver_start(dev->vhost_id);
-    if (err) {
-        VLOG_ERR("rte_vhost_driver_start failed for vhost user "
-                 "port: %s\n", name);
+    } else {
         goto out;
     }
 
-    err = vhost_common_construct(netdev);
+    err = common_construct(&dev->up, dev->port_id, DPDK_DEV_VHOST,
+                           rte_lcore_to_socket_id(rte_get_master_lcore()));
     if (err) {
-        VLOG_ERR("vhost_common_construct failed for vhost user "
-                 "port: %s\n", name);
+        VLOG_ERR("common_construct failed for vhost user port: %s\n", name);
+        rte_eth_dev_detach(dev->port_id, dev->vhost_id);
+        if (dev->vhost_pmd_id >= 0) {
+            id_pool_free_id(dpdk_get_vhost_id_pool(), dev->vhost_pmd_id);
+        }
     }
 
 out:
@@ -1149,12 +1180,14 @@  out:
 static int
 netdev_dpdk_vhost_client_construct(struct netdev *netdev)
 {
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int err;
 
     ovs_mutex_lock(&dpdk_mutex);
-    err = vhost_common_construct(netdev);
+    err = common_construct(&dev->up, DPDK_ETH_PORT_ID_INVALID, DPDK_DEV_VHOST,
+                           rte_lcore_to_socket_id(rte_get_master_lcore()));
     if (err) {
-        VLOG_ERR("vhost_common_construct failed for vhost user client"
+        VLOG_ERR("common_construct failed for vhost user client"
                  "port: %s\n", netdev->name);
     }
     ovs_mutex_unlock(&dpdk_mutex);
@@ -1178,90 +1211,76 @@  common_destruct(struct netdev_dpdk *dev)
     OVS_REQUIRES(dpdk_mutex)
     OVS_EXCLUDED(dev->mutex)
 {
-    rte_free(dev->tx_q);
-    dpdk_mp_release(dev->mp);
-
-    ovs_list_remove(&dev->list_node);
-    free(ovsrcu_get_protected(struct ingress_policer *,
-                              &dev->ingress_policer));
-    ovs_mutex_destroy(&dev->mutex);
-}
-
-static void
-netdev_dpdk_destruct(struct netdev *netdev)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     char devname[RTE_ETH_NAME_MAX_LEN];
 
-    ovs_mutex_lock(&dpdk_mutex);
-
     rte_eth_dev_stop(dev->port_id);
     dev->started = false;
 
     if (dev->attached) {
         rte_eth_dev_close(dev->port_id);
         if (rte_eth_dev_detach(dev->port_id, devname) < 0) {
-            VLOG_ERR("Device '%s' can not be detached", dev->devargs);
+            VLOG_ERR("Device '%s' can not be detached", devname);
         } else {
             VLOG_INFO("Device '%s' has been detached", devname);
         }
     }
 
     netdev_dpdk_clear_xstats(dev);
-    free(dev->devargs);
-    common_destruct(dev);
-
-    ovs_mutex_unlock(&dpdk_mutex);
+    rte_free(dev->tx_q);
+    dpdk_mp_release(dev->mp);
+    ovs_list_remove(&dev->list_node);
+    free(ovsrcu_get_protected(struct ingress_policer *,
+                              &dev->ingress_policer));
+    ovs_mutex_destroy(&dev->mutex);
 }
 
-/* rte_vhost_driver_unregister() can call back destroy_device(), which will
- * try to acquire 'dpdk_mutex' and possibly 'dev->mutex'.  To avoid a
- * deadlock, none of the mutexes must be held while calling this function. */
-static int
-dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
-                             char *vhost_id)
-    OVS_EXCLUDED(dpdk_mutex)
-    OVS_EXCLUDED(dev->mutex)
+static void
+netdev_dpdk_destruct(struct netdev *netdev)
 {
-    return rte_vhost_driver_unregister(vhost_id);
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    ovs_mutex_lock(&dpdk_mutex);
+    common_destruct(dev);
+    free(dev->devargs);
+    ovs_mutex_unlock(&dpdk_mutex);
 }
 
 static void
 netdev_dpdk_vhost_destruct(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-    char *vhost_id;
 
     ovs_mutex_lock(&dpdk_mutex);
 
     /* Guest becomes an orphan if still attached. */
-    if (netdev_dpdk_get_vid(dev) >= 0
-        && !(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
+    check_link_status(dev);
+    if (dev->link.link_status == ETH_LINK_UP) {
         VLOG_ERR("Removing port '%s' while vhost device still attached.",
                  netdev->name);
         VLOG_ERR("To restore connectivity after re-adding of port, VM on "
                  "socket '%s' must be restarted.", dev->vhost_id);
     }
 
-    vhost_id = xstrdup(dev->vhost_id);
-
-    common_destruct(dev);
-
-    ovs_mutex_unlock(&dpdk_mutex);
+    rte_eth_dev_callback_unregister(dev->port_id,
+                                    RTE_ETH_EVENT_QUEUE_STATE,
+                                    vring_state_changed_callback, NULL);
+    rte_eth_dev_callback_unregister(dev->port_id,
+                                    RTE_ETH_EVENT_INTR_LSC,
+                                    link_status_changed_callback, NULL);
 
-    if (!vhost_id[0]) {
-        goto out;
+    if (dev->vhost_pmd_id >= 0) {
+        id_pool_free_id(dpdk_get_vhost_id_pool(),
+                dev->vhost_pmd_id);
     }
 
-    if (dpdk_vhost_driver_unregister(dev, vhost_id)) {
-        VLOG_ERR("%s: Unable to unregister vhost driver for socket '%s'.\n",
-                 netdev->name, vhost_id);
-    } else if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
-        /* OVS server mode - remove this socket from list for deletion */
-        fatal_signal_remove_file_to_unlink(vhost_id);
+    if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)) {
+           /* OVS server mode - remove this socket from list for deletion */
+           fatal_signal_remove_file_to_unlink(dev->vhost_id);
     }
-out:
-    free(vhost_id);
+
+    common_destruct(dev);
+
+    ovs_mutex_unlock(&dpdk_mutex);
 }
 
 static void
@@ -1846,12 +1865,6 @@  ingress_policer_run(struct ingress_policer *policer, struct rte_mbuf **pkts,
     return cnt;
 }
 
-static bool
-is_vhost_running(struct netdev_dpdk *dev)
-{
-    return (netdev_dpdk_get_vid(dev) >= 0 && dev->vhost_reconfigured);
-}
-
 static inline void
 netdev_dpdk_vhost_update_rx_size_counters(struct netdev_stats *stats,
                                           unsigned int packet_size)
@@ -1913,64 +1926,9 @@  netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats,
     }
 }
 
-/*
- * The receive path for the vhost port is the TX path out from guest.
- */
-static int
-netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
-                           struct dp_packet_batch *batch, int *qfill)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
-    struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
-    uint16_t nb_rx = 0;
-    uint16_t dropped = 0;
-    int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
-    int vid = netdev_dpdk_get_vid(dev);
-
-    if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured
-                     || !(dev->flags & NETDEV_UP))) {
-        return EAGAIN;
-    }
-
-    nb_rx = rte_vhost_dequeue_burst(vid, qid, dev->mp,
-                                    (struct rte_mbuf **) batch->packets,
-                                    NETDEV_MAX_BURST);
-    if (!nb_rx) {
-        return EAGAIN;
-    }
-
-    if (qfill) {
-        if (nb_rx == NETDEV_MAX_BURST) {
-            /* The DPDK API returns a uint32_t which often has invalid bits in
-             * the upper 16-bits. Need to restrict the value to uint16_t. */
-            *qfill = rte_vhost_rx_queue_count(vid, qid) & UINT16_MAX;
-        } else {
-            *qfill = 0;
-        }
-    }
-
-    if (policer) {
-        dropped = nb_rx;
-        nb_rx = ingress_policer_run(policer,
-                                    (struct rte_mbuf **) batch->packets,
-                                    nb_rx, true);
-        dropped -= nb_rx;
-    }
-
-    rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
-                                         nb_rx, dropped);
-    rte_spinlock_unlock(&dev->stats_lock);
-
-    batch->count = nb_rx;
-    dp_packet_batch_init_packet_fields(batch);
-
-    return 0;
-}
-
 static int
-netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
-                     int *qfill)
+common_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
+            int *qfill)
 {
     struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq);
     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
@@ -2018,6 +1976,30 @@  netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
     return 0;
 }
 
+/*
+ * The receive path for the vhost port is the TX path out from guest.
+ */
+static int
+netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
+                           struct dp_packet_batch *batch,
+                           int *qfill)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
+
+    if (dev->vhost_reconfigured) {
+        return common_recv(rxq, batch, qfill);
+    }
+
+    return EAGAIN;
+}
+
+static int
+netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
+                     int *qfill)
+{
+    return common_recv(rxq, batch, qfill);
+}
+
 static inline int
 netdev_dpdk_qos_run(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
                     int cnt, bool may_steal)
@@ -2059,80 +2041,6 @@  netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
     return cnt;
 }
 
-static inline void
-netdev_dpdk_vhost_update_tx_counters(struct netdev_stats *stats,
-                                     struct dp_packet **packets,
-                                     int attempted,
-                                     int dropped)
-{
-    int i;
-    int sent = attempted - dropped;
-
-    stats->tx_packets += sent;
-    stats->tx_dropped += dropped;
-
-    for (i = 0; i < sent; i++) {
-        stats->tx_bytes += dp_packet_size(packets[i]);
-    }
-}
-
-static void
-__netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
-                         struct dp_packet **pkts, int cnt)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-    struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
-    unsigned int total_pkts = cnt;
-    unsigned int dropped = 0;
-    int i, retries = 0;
-    int vid = netdev_dpdk_get_vid(dev);
-
-    qid = dev->tx_q[qid % netdev->n_txq].map;
-
-    if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured || qid < 0
-                     || !(dev->flags & NETDEV_UP))) {
-        rte_spinlock_lock(&dev->stats_lock);
-        dev->stats.tx_dropped+= cnt;
-        rte_spinlock_unlock(&dev->stats_lock);
-        goto out;
-    }
-
-    rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
-
-    cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
-    /* Check has QoS has been configured for the netdev */
-    cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
-    dropped = total_pkts - cnt;
-
-    do {
-        int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
-        unsigned int tx_pkts;
-
-        tx_pkts = rte_vhost_enqueue_burst(vid, vhost_qid, cur_pkts, cnt);
-        if (OVS_LIKELY(tx_pkts)) {
-            /* Packets have been sent.*/
-            cnt -= tx_pkts;
-            /* Prepare for possible retry.*/
-            cur_pkts = &cur_pkts[tx_pkts];
-        } else {
-            /* No packets sent - do not retry.*/
-            break;
-        }
-    } while (cnt && (retries++ <= VHOST_ENQ_RETRY_NUM));
-
-    rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
-
-    rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
-                                         cnt + dropped);
-    rte_spinlock_unlock(&dev->stats_lock);
-
-out:
-    for (i = 0; i < total_pkts - dropped; i++) {
-        dp_packet_delete(pkts[i]);
-    }
-}
-
 /* Tx function. Transmit packets indefinitely */
 static void
 dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
@@ -2186,12 +2094,7 @@  dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
     }
 
     if (OVS_LIKELY(txcnt)) {
-        if (dev->type == DPDK_DEV_VHOST) {
-            __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) pkts,
-                                     txcnt);
-        } else {
-            dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
-        }
+        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
     }
 
     if (OVS_UNLIKELY(dropped)) {
@@ -2201,21 +2104,6 @@  dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
     }
 }
 
-static int
-netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
-                       struct dp_packet_batch *batch,
-                       bool concurrent_txq OVS_UNUSED)
-{
-
-    if (OVS_UNLIKELY(batch->packets[0]->source != DPBUF_DPDK)) {
-        dpdk_do_tx_copy(netdev, qid, batch);
-        dp_packet_delete_batch(batch, true);
-    } else {
-        __netdev_dpdk_vhost_send(netdev, qid, batch->packets, batch->count);
-    }
-    return 0;
-}
-
 static inline void
 netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
                    struct dp_packet_batch *batch,
@@ -2226,8 +2114,7 @@  netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
         return;
     }
 
-    if (OVS_UNLIKELY(concurrent_txq)) {
-        qid = qid % dev->up.n_txq;
+    if (concurrent_txq) {
         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
     }
 
@@ -2254,7 +2141,7 @@  netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
         }
     }
 
-    if (OVS_UNLIKELY(concurrent_txq)) {
+    if (concurrent_txq) {
         rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
     }
 }
@@ -2265,11 +2152,35 @@  netdev_dpdk_eth_send(struct netdev *netdev, int qid,
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 
+    if (concurrent_txq) {
+        qid = qid % dev->up.n_txq;
+    }
+
     netdev_dpdk_send__(dev, qid, batch, concurrent_txq);
     return 0;
 }
 
 static int
+netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
+                       struct dp_packet_batch *batch,
+                       bool concurrent_txq OVS_UNUSED)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    qid = dev->tx_q[qid % netdev->n_txq].map;
+    if (qid == -1 || !dev->vhost_reconfigured) {
+        rte_spinlock_lock(&dev->stats_lock);
+        dev->stats.tx_dropped += batch->count;
+        rte_spinlock_unlock(&dev->stats_lock);
+        dp_packet_delete_batch(batch, true);
+    } else {
+        netdev_dpdk_send__(dev, qid, batch, false);
+    }
+
+    return 0;
+}
+
+static int
 netdev_dpdk_set_etheraddr(struct netdev *netdev, const struct eth_addr mac)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
@@ -2343,41 +2254,6 @@  netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
 static int
 netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
 
-static int
-netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
-                            struct netdev_stats *stats)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-
-    ovs_mutex_lock(&dev->mutex);
-
-    rte_spinlock_lock(&dev->stats_lock);
-    /* Supported Stats */
-    stats->rx_packets = dev->stats.rx_packets;
-    stats->tx_packets = dev->stats.tx_packets;
-    stats->rx_dropped = dev->stats.rx_dropped;
-    stats->tx_dropped = dev->stats.tx_dropped;
-    stats->multicast = dev->stats.multicast;
-    stats->rx_bytes = dev->stats.rx_bytes;
-    stats->tx_bytes = dev->stats.tx_bytes;
-    stats->rx_errors = dev->stats.rx_errors;
-    stats->rx_length_errors = dev->stats.rx_length_errors;
-
-    stats->rx_1_to_64_packets = dev->stats.rx_1_to_64_packets;
-    stats->rx_65_to_127_packets = dev->stats.rx_65_to_127_packets;
-    stats->rx_128_to_255_packets = dev->stats.rx_128_to_255_packets;
-    stats->rx_256_to_511_packets = dev->stats.rx_256_to_511_packets;
-    stats->rx_512_to_1023_packets = dev->stats.rx_512_to_1023_packets;
-    stats->rx_1024_to_1522_packets = dev->stats.rx_1024_to_1522_packets;
-    stats->rx_1523_to_max_packets = dev->stats.rx_1523_to_max_packets;
-
-    rte_spinlock_unlock(&dev->stats_lock);
-
-    ovs_mutex_unlock(&dev->mutex);
-
-    return 0;
-}
-
 static void
 netdev_dpdk_convert_xstats(struct netdev_stats *stats,
                            const struct rte_eth_xstat *xstats,
@@ -2423,6 +2299,8 @@  netdev_dpdk_convert_xstats(struct netdev_stats *stats,
             stats->tx_broadcast_packets = xstats[i].value;
         } else if (strcmp(XSTAT_RX_UNDERSIZED_ERRORS, names[i].name) == 0) {
             stats->rx_undersized_errors = xstats[i].value;
+        } else if (strcmp(XSTAT_RX_UNDERSIZE_PACKETS, names[i].name) == 0) {
+            stats->rx_undersized_errors = xstats[i].value;
         } else if (strcmp(XSTAT_RX_FRAGMENTED_ERRORS, names[i].name) == 0) {
             stats->rx_fragmented_errors = xstats[i].value;
         } else if (strcmp(XSTAT_RX_JABBER_ERRORS, names[i].name) == 0) {
@@ -2445,6 +2323,11 @@  netdev_dpdk_get_stats(const struct netdev *netdev, struct netdev_stats *stats)
     struct rte_eth_xstat_name *rte_xstats_names = NULL;
     int rte_xstats_len, rte_xstats_new_len, rte_xstats_ret;
 
+    if (!rte_eth_dev_is_valid_port(dev->port_id)) {
+        ovs_mutex_unlock(&dev->mutex);
+        return EPROTO;
+    }
+
     if (rte_eth_stats_get(dev->port_id, &rte_stats)) {
         VLOG_ERR("Can't get ETH statistics for port: "DPDK_PORT_ID_FMT,
                  dev->port_id);
@@ -2521,6 +2404,10 @@  netdev_dpdk_get_custom_stats(const struct netdev *netdev,
 
     ovs_mutex_lock(&dev->mutex);
 
+    if (!rte_eth_dev_is_valid_port(dev->port_id)) {
+        goto out;
+    }
+
     if (netdev_dpdk_configure_xstats(dev)) {
         uint64_t *values = xcalloc(dev->rte_xstats_ids_size,
                                    sizeof(uint64_t));
@@ -2557,6 +2444,7 @@  netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         free(values);
     }
 
+out:
     ovs_mutex_unlock(&dev->mutex);
 
     return 0;
@@ -2713,24 +2601,6 @@  netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier)
     return 0;
 }
 
-static int
-netdev_dpdk_vhost_get_carrier(const struct netdev *netdev, bool *carrier)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-
-    ovs_mutex_lock(&dev->mutex);
-
-    if (is_vhost_running(dev)) {
-        *carrier = 1;
-    } else {
-        *carrier = 0;
-    }
-
-    ovs_mutex_unlock(&dev->mutex);
-
-    return 0;
-}
-
 static long long int
 netdev_dpdk_get_carrier_resets(const struct netdev *netdev)
 {
@@ -2780,8 +2650,7 @@  netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
          * running then change netdev's change_seq to trigger link state
          * update. */
 
-        if ((NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off)))
-            && is_vhost_running(dev)) {
+        if ((NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off)))) {
             netdev_change_seq_changed(&dev->up);
 
             /* Clear statistics if device is getting up. */
@@ -2811,18 +2680,41 @@  netdev_dpdk_update_flags(struct netdev *netdev,
     return error;
 }
 
+static void
+common_get_status(struct smap *args, struct netdev_dpdk *dev,
+                  struct rte_eth_dev_info *dev_info)
+{
+    smap_add_format(args, "port_no", DPDK_PORT_ID_FMT, dev->port_id);
+    smap_add_format(args, "numa_id", "%d",
+                           rte_eth_dev_socket_id(dev->port_id));
+    smap_add_format(args, "driver_name", "%s", dev_info->driver_name);
+    smap_add_format(args, "min_rx_bufsize", "%u", dev_info->min_rx_bufsize);
+    smap_add_format(args, "max_rx_pktlen", "%u", dev->max_packet_len);
+    smap_add_format(args, "max_rx_queues", "%u", dev_info->max_rx_queues);
+    smap_add_format(args, "max_tx_queues", "%u", dev_info->max_tx_queues);
+    smap_add_format(args, "max_mac_addrs", "%u", dev_info->max_mac_addrs);
+}
+
 static int
 netdev_dpdk_vhost_user_get_status(const struct netdev *netdev,
                                   struct smap *args)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    struct rte_eth_dev_info dev_info;
+
+    if (!rte_eth_dev_is_valid_port(dev->port_id)) {
+        return ENODEV;
+    }
 
     ovs_mutex_lock(&dev->mutex);
+    rte_eth_dev_info_get(dev->port_id, &dev_info);
+
+    common_get_status(args, dev, &dev_info);
 
     bool client_mode = dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT;
     smap_add_format(args, "mode", "%s", client_mode ? "client" : "server");
 
-    int vid = netdev_dpdk_get_vid(dev);
+    int vid = rte_eth_vhost_get_vid_from_port_id(dev->port_id);
     if (vid < 0) {
         smap_add_format(args, "status", "disconnected");
         ovs_mutex_unlock(&dev->mutex);
@@ -2883,15 +2775,8 @@  netdev_dpdk_get_status(const struct netdev *netdev, struct smap *args)
     rte_eth_dev_info_get(dev->port_id, &dev_info);
     ovs_mutex_unlock(&dev->mutex);
 
-    smap_add_format(args, "port_no", DPDK_PORT_ID_FMT, dev->port_id);
-    smap_add_format(args, "numa_id", "%d",
-                           rte_eth_dev_socket_id(dev->port_id));
-    smap_add_format(args, "driver_name", "%s", dev_info.driver_name);
-    smap_add_format(args, "min_rx_bufsize", "%u", dev_info.min_rx_bufsize);
-    smap_add_format(args, "max_rx_pktlen", "%u", dev->max_packet_len);
-    smap_add_format(args, "max_rx_queues", "%u", dev_info.max_rx_queues);
-    smap_add_format(args, "max_tx_queues", "%u", dev_info.max_tx_queues);
-    smap_add_format(args, "max_mac_addrs", "%u", dev_info.max_mac_addrs);
+    common_get_status(args, dev, &dev_info);
+
     smap_add_format(args, "max_hash_mac_addrs", "%u",
                            dev_info.max_hash_mac_addrs);
     smap_add_format(args, "max_vfs", "%u", dev_info.max_vfs);
@@ -3070,19 +2955,6 @@  out:
 }
 
 /*
- * Set virtqueue flags so that we do not receive interrupts.
- */
-static void
-set_irq_status(int vid)
-{
-    uint32_t i;
-
-    for (i = 0; i < rte_vhost_get_vring_num(vid); i++) {
-        rte_vhost_enable_guest_notification(vid, i, 0);
-    }
-}
-
-/*
  * Fixes mapping for vhost-user tx queues. Must be called after each
  * enabling/disabling of queues and n_txq modifications.
  */
@@ -3123,53 +2995,60 @@  netdev_dpdk_remap_txqs(struct netdev_dpdk *dev)
     free(enabled_queues);
 }
 
-/*
- * A new virtio-net device is added to a vhost port.
- */
 static int
-new_device(int vid)
+link_status_changed_callback(dpdk_port_t port_id,
+                             enum rte_eth_event_type type OVS_UNUSED,
+                             void *param OVS_UNUSED,
+                             void *ret_param OVS_UNUSED)
 {
     struct netdev_dpdk *dev;
     bool exists = false;
     int newnode = 0;
-    char ifname[IF_NAME_SZ];
-
-    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
 
     ovs_mutex_lock(&dpdk_mutex);
     /* Add device to the vhost port with the same name as that passed down. */
     LIST_FOR_EACH(dev, list_node, &dpdk_list) {
         ovs_mutex_lock(&dev->mutex);
-        if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
-            uint32_t qp_num = rte_vhost_get_vring_num(vid)/VIRTIO_QNUM;
-
-            /* Get NUMA information */
-            newnode = rte_vhost_get_numa_node(vid);
-            if (newnode == -1) {
+        if (port_id == dev->port_id) {
+            check_link_status(dev);
+            if (dev->link.link_status == ETH_LINK_UP) {
+                /* Device brought up */
+                /* Get queue information */
+                int vid = rte_eth_vhost_get_vid_from_port_id(dev->port_id);
+                uint32_t qp_num = rte_vhost_get_vring_num(vid) / VIRTIO_QNUM;
+                if (qp_num == 0) {
+                    qp_num = dev->requested_n_rxq;
+                }
+                /* Get NUMA information */
+                newnode = rte_eth_dev_socket_id(dev->port_id);
+                if (newnode == -1) {
 #ifdef VHOST_NUMA
-                VLOG_INFO("Error getting NUMA info for vHost Device '%s'",
-                          ifname);
+                    VLOG_INFO("Error getting NUMA info for vHost Device '%s'",
+                              dev->vhost_id);
 #endif
-                newnode = dev->socket_id;
-            }
+                    newnode = dev->socket_id;
+                }
+                if (dev->requested_n_txq != qp_num
+                    || dev->requested_n_rxq != qp_num
+                    || dev->requested_socket_id != newnode) {
+                    dev->requested_socket_id = newnode;
+                    dev->requested_n_rxq = qp_num;
+                    dev->requested_n_txq = qp_num;
+                    netdev_request_reconfigure(&dev->up);
+                } else {
+                    /* Reconfiguration not required. */
+                    dev->vhost_reconfigured = true;
+                }
 
-            if (dev->requested_n_txq != qp_num
-                || dev->requested_n_rxq != qp_num
-                || dev->requested_socket_id != newnode) {
-                dev->requested_socket_id = newnode;
-                dev->requested_n_rxq = qp_num;
-                dev->requested_n_txq = qp_num;
-                netdev_request_reconfigure(&dev->up);
+                VLOG_INFO("vHost Device '%s' has been added on numa node %i",
+                          dev->vhost_id, newnode);
             } else {
-                /* Reconfiguration not required. */
-                dev->vhost_reconfigured = true;
+                /* Device brought down */
+                dev->vhost_reconfigured = false;
+                netdev_dpdk_txq_map_clear(dev);
+                VLOG_INFO("vHost Device '%s' has been removed", dev->vhost_id);
             }
-
-            ovsrcu_index_set(&dev->vid, vid);
             exists = true;
-
-            /* Disable notifications. */
-            set_irq_status(vid);
             netdev_change_seq_changed(&dev->up);
             ovs_mutex_unlock(&dev->mutex);
             break;
@@ -3179,14 +3058,11 @@  new_device(int vid)
     ovs_mutex_unlock(&dpdk_mutex);
 
     if (!exists) {
-        VLOG_INFO("vHost Device '%s' can't be added - name not found", ifname);
+        VLOG_INFO("vHost Device with port id %i not found", port_id);
 
         return -1;
     }
 
-    VLOG_INFO("vHost Device '%s' has been added on numa node %i",
-              ifname, newnode);
-
     return 0;
 }
 
@@ -3202,78 +3078,32 @@  netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
     }
 }
 
-/*
- * Remove a virtio-net device from the specific vhost port.  Use dev->remove
- * flag to stop any more packets from being sent or received to/from a VM and
- * ensure all currently queued packets have been sent/received before removing
- *  the device.
- */
-static void
-destroy_device(int vid)
-{
-    struct netdev_dpdk *dev;
-    bool exists = false;
-    char ifname[IF_NAME_SZ];
-
-    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
-
-    ovs_mutex_lock(&dpdk_mutex);
-    LIST_FOR_EACH (dev, list_node, &dpdk_list) {
-        if (netdev_dpdk_get_vid(dev) == vid) {
-
-            ovs_mutex_lock(&dev->mutex);
-            dev->vhost_reconfigured = false;
-            ovsrcu_index_set(&dev->vid, -1);
-            netdev_dpdk_txq_map_clear(dev);
-
-            netdev_change_seq_changed(&dev->up);
-            ovs_mutex_unlock(&dev->mutex);
-            exists = true;
-            break;
-        }
-    }
-
-    ovs_mutex_unlock(&dpdk_mutex);
-
-    if (exists) {
-        /*
-         * Wait for other threads to quiesce after setting the 'virtio_dev'
-         * to NULL, before returning.
-         */
-        ovsrcu_synchronize();
-        /*
-         * As call to ovsrcu_synchronize() will end the quiescent state,
-         * put thread back into quiescent state before returning.
-         */
-        ovsrcu_quiesce_start();
-        VLOG_INFO("vHost Device '%s' has been removed", ifname);
-    } else {
-        VLOG_INFO("vHost Device '%s' not found", ifname);
-    }
-}
-
 static int
-vring_state_changed(int vid, uint16_t queue_id, int enable)
+vring_state_changed_callback(dpdk_port_t port_id,
+                             enum rte_eth_event_type type OVS_UNUSED,
+                             void *param OVS_UNUSED,
+                             void *ret_param OVS_UNUSED)
 {
     struct netdev_dpdk *dev;
     bool exists = false;
-    int qid = queue_id / VIRTIO_QNUM;
+    int vid = -1;
     char ifname[IF_NAME_SZ];
+    struct rte_eth_vhost_queue_event event;
+    int err = 0;
 
-    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
-
-    if (queue_id % VIRTIO_QNUM == VIRTIO_TXQ) {
+    err = rte_eth_vhost_get_queue_event(port_id, &event);
+    if (err || event.rx) {
         return 0;
     }
 
     ovs_mutex_lock(&dpdk_mutex);
     LIST_FOR_EACH (dev, list_node, &dpdk_list) {
         ovs_mutex_lock(&dev->mutex);
-        if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
-            if (enable) {
-                dev->tx_q[qid].map = qid;
+        if (port_id == dev->port_id) {
+            if (event.enable) {
+                dev->tx_q[event.queue_id].map = event.queue_id;
             } else {
-                dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
+                dev->tx_q[event.queue_id].map = OVS_VHOST_QUEUE_DISABLED;
             }
             netdev_dpdk_remap_txqs(dev);
             exists = true;
@@ -3284,10 +3114,13 @@  vring_state_changed(int vid, uint16_t queue_id, int enable)
     }
     ovs_mutex_unlock(&dpdk_mutex);
 
+    vid = rte_eth_vhost_get_vid_from_port_id(port_id);
+    rte_vhost_get_ifname(vid, ifname, sizeof ifname);
+
     if (exists) {
-        VLOG_INFO("State of queue %d ( tx_qid %d ) of vhost device '%s'"
-                  "changed to \'%s\'", queue_id, qid, ifname,
-                  (enable == 1) ? "enabled" : "disabled");
+        VLOG_INFO("State of tx_qid %d  of vhost device '%s'"
+                  "changed to \'%s\'", event.queue_id, ifname,
+                  (event.enable == 1) ? "enabled" : "disabled");
     } else {
         VLOG_INFO("vHost Device '%s' not found", ifname);
         return -1;
@@ -3296,25 +3129,6 @@  vring_state_changed(int vid, uint16_t queue_id, int enable)
     return 0;
 }
 
-/*
- * Retrieve the DPDK virtio device ID (vid) associated with a vhostuser
- * or vhostuserclient netdev.
- *
- * Returns a value greater or equal to zero for a valid vid or '-1' if
- * there is no valid vid associated. A vid of '-1' must not be used in
- * rte_vhost_ APi calls.
- *
- * Once obtained and validated, a vid can be used by a PMD for multiple
- * subsequent rte_vhost API calls until the PMD quiesces. A PMD should
- * not fetch the vid again for each of a series of API calls.
- */
-
-int
-netdev_dpdk_get_vid(const struct netdev_dpdk *dev)
-{
-    return ovsrcu_index_get(&dev->vid);
-}
-
 struct ingress_policer *
 netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev)
 {
@@ -3681,13 +3495,12 @@  static const struct dpdk_qos_ops egress_policer_ops = {
 };
 
 static int
-netdev_dpdk_reconfigure(struct netdev *netdev)
+common_reconfigure(struct netdev *netdev)
+    OVS_REQUIRES(dev->mutex)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int err = 0;
 
-    ovs_mutex_lock(&dev->mutex);
-
     if (netdev->n_txq == dev->requested_n_txq
         && netdev->n_rxq == dev->requested_n_rxq
         && dev->mtu == dev->requested_mtu
@@ -3727,17 +3540,36 @@  netdev_dpdk_reconfigure(struct netdev *netdev)
     netdev_change_seq_changed(netdev);
 
 out:
+    return err;
+}
+
+static int
+netdev_dpdk_reconfigure(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    int err = 0;
+
+    ovs_mutex_lock(&dev->mutex);
+    err = common_reconfigure(netdev);
     ovs_mutex_unlock(&dev->mutex);
+
     return err;
 }
 
 static int
-dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
+dpdk_vhost_reconfigure_helper(struct netdev *netdev)
     OVS_REQUIRES(dev->mutex)
 {
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    int err;
+
     dev->up.n_txq = dev->requested_n_txq;
     dev->up.n_rxq = dev->requested_n_rxq;
-    int err;
+
+    err = common_reconfigure(netdev);
+    if (err) {
+        return err;
+    }
 
     /* Enable TX queue 0 by default if it wasn't disabled. */
     if (dev->tx_q[0].map == OVS_VHOST_QUEUE_MAP_UNKNOWN) {
@@ -3746,14 +3578,7 @@  dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
 
     netdev_dpdk_remap_txqs(dev);
 
-    err = netdev_dpdk_mempool_configure(dev);
-    if (!err) {
-        /* A new mempool was created. */
-        netdev_change_seq_changed(&dev->up);
-    } else if (err != EEXIST){
-        return err;
-    }
-    if (netdev_dpdk_get_vid(dev) >= 0) {
+    if (rte_eth_vhost_get_vid_from_port_id(dev->port_id) >= 0) {
         if (dev->vhost_reconfigured == false) {
             dev->vhost_reconfigured = true;
             /* Carrier status may need updating. */
@@ -3771,7 +3596,7 @@  netdev_dpdk_vhost_reconfigure(struct netdev *netdev)
     int err;
 
     ovs_mutex_lock(&dev->mutex);
-    err = dpdk_vhost_reconfigure_helper(dev);
+    err = dpdk_vhost_reconfigure_helper(netdev);
     ovs_mutex_unlock(&dev->mutex);
 
     return err;
@@ -3781,9 +3606,8 @@  static int
 netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-    int err;
-    uint64_t vhost_flags = 0;
-    bool zc_enabled;
+    int err = 0;
+    int sid = -1;
 
     ovs_mutex_lock(&dev->mutex);
 
@@ -3794,64 +3618,50 @@  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
      */
     if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
             && strlen(dev->vhost_id)) {
-        /* Register client-mode device. */
-        vhost_flags |= RTE_VHOST_USER_CLIENT;
+        /* First-time, once-only configuration. */
+        err = dpdk_attach_vhost_pmd(dev, VHOST_CLIENT_MODE);
+
+        if (!err) {
+            sid = rte_eth_dev_socket_id(dev->port_id);
+            dev->socket_id = sid < 0 ? SOCKET0 : sid;
+            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
+
+            if (dev->requested_socket_id != dev->socket_id
+                || dev->requested_mtu != dev->mtu) {
+                err = netdev_dpdk_mempool_configure(dev);
+                if (err && err != EEXIST) {
+                    goto unlock;
+                }
+            }
 
-        /* Enable IOMMU support, if explicitly requested. */
-        if (dpdk_vhost_iommu_enabled()) {
-            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
-        }
+            netdev->n_txq = dev->requested_n_txq;
+            netdev->n_rxq = dev->requested_n_rxq;
+
+            rte_free(dev->tx_q);
+            err = dpdk_eth_dev_init(dev);
+            dev->tx_q = netdev_dpdk_alloc_txq(netdev->n_txq);
+            if (!dev->tx_q) {
+                rte_eth_dev_detach(dev->port_id, dev->vhost_id);
+                if (dev->vhost_pmd_id >= 0) {
+                    id_pool_free_id(dpdk_get_vhost_id_pool(),
+                            dev->vhost_pmd_id);
+                }
+                err = ENOMEM;
+                goto unlock;
+            }
 
-        zc_enabled = dev->vhost_driver_flags
-                     & RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
-        /* Enable zero copy flag, if requested */
-        if (zc_enabled) {
-            vhost_flags |= RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
-        }
+            netdev_change_seq_changed(netdev);
 
-        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
-        if (err) {
-            VLOG_ERR("vhost-user device setup failure for device %s\n",
-                     dev->vhost_id);
-            goto unlock;
-        } else {
-            /* Configuration successful */
-            dev->vhost_driver_flags |= vhost_flags;
             VLOG_INFO("vHost User device '%s' created in 'client' mode, "
                       "using client socket '%s'",
                       dev->up.name, dev->vhost_id);
-            if (zc_enabled) {
-                VLOG_INFO("Zero copy enabled for vHost port %s", dev->up.name);
-            }
-        }
-
-        err = rte_vhost_driver_callback_register(dev->vhost_id,
-                                                 &virtio_net_device_ops);
-        if (err) {
-            VLOG_ERR("rte_vhost_driver_callback_register failed for "
-                     "vhost user client port: %s\n", dev->up.name);
-            goto unlock;
-        }
-
-        err = rte_vhost_driver_disable_features(dev->vhost_id,
-                                    1ULL << VIRTIO_NET_F_HOST_TSO4
-                                    | 1ULL << VIRTIO_NET_F_HOST_TSO6
-                                    | 1ULL << VIRTIO_NET_F_CSUM);
-        if (err) {
-            VLOG_ERR("rte_vhost_driver_disable_features failed for vhost user "
-                     "client port: %s\n", dev->up.name);
-            goto unlock;
-        }
-
-        err = rte_vhost_driver_start(dev->vhost_id);
-        if (err) {
-            VLOG_ERR("rte_vhost_driver_start failed for vhost user "
-                     "client port: %s\n", dev->up.name);
-            goto unlock;
         }
+        goto unlock;
     }
 
-    err = dpdk_vhost_reconfigure_helper(dev);
+    if (rte_eth_dev_is_valid_port(dev->port_id)) {
+        err = dpdk_vhost_reconfigure_helper(netdev);
+    }
 
 unlock:
     ovs_mutex_unlock(&dev->mutex);
@@ -3861,9 +3671,7 @@  unlock:
 
 #define NETDEV_DPDK_CLASS(NAME, INIT, CONSTRUCT, DESTRUCT,    \
                           SET_CONFIG, SET_TX_MULTIQ, SEND,    \
-                          GET_CARRIER, GET_STATS,			  \
-                          GET_CUSTOM_STATS,					  \
-                          GET_FEATURES, GET_STATUS,           \
+                          GET_STATUS,                         \
                           RECONFIGURE, RXQ_RECV)              \
 {                                                             \
     NAME,                                                     \
@@ -3893,12 +3701,12 @@  unlock:
     netdev_dpdk_get_mtu,                                      \
     netdev_dpdk_set_mtu,                                      \
     netdev_dpdk_get_ifindex,                                  \
-    GET_CARRIER,                                              \
+    netdev_dpdk_get_carrier,                                  \
     netdev_dpdk_get_carrier_resets,                           \
     netdev_dpdk_set_miimon,                                   \
-    GET_STATS,                                                \
-    GET_CUSTOM_STATS,										  \
-    GET_FEATURES,                                             \
+    netdev_dpdk_get_stats,                                    \
+    netdev_dpdk_get_custom_stats,                             \
+    netdev_dpdk_get_features,                                 \
     NULL,                       /* set_advertisements */      \
     NULL,                       /* get_pt_mode */             \
                                                               \
@@ -3945,10 +3753,6 @@  static const struct netdev_class dpdk_class =
         netdev_dpdk_set_config,
         netdev_dpdk_set_tx_multiq,
         netdev_dpdk_eth_send,
-        netdev_dpdk_get_carrier,
-        netdev_dpdk_get_stats,
-        netdev_dpdk_get_custom_stats,
-        netdev_dpdk_get_features,
         netdev_dpdk_get_status,
         netdev_dpdk_reconfigure,
         netdev_dpdk_rxq_recv);
@@ -3962,10 +3766,6 @@  static const struct netdev_class dpdk_ring_class =
         netdev_dpdk_ring_set_config,
         netdev_dpdk_set_tx_multiq,
         netdev_dpdk_ring_send,
-        netdev_dpdk_get_carrier,
-        netdev_dpdk_get_stats,
-        netdev_dpdk_get_custom_stats,
-        netdev_dpdk_get_features,
         netdev_dpdk_get_status,
         netdev_dpdk_reconfigure,
         netdev_dpdk_rxq_recv);
@@ -3979,10 +3779,6 @@  static const struct netdev_class dpdk_vhost_class =
         NULL,
         NULL,
         netdev_dpdk_vhost_send,
-        netdev_dpdk_vhost_get_carrier,
-        netdev_dpdk_vhost_get_stats,
-        NULL,
-        NULL,
         netdev_dpdk_vhost_user_get_status,
         netdev_dpdk_vhost_reconfigure,
         netdev_dpdk_vhost_rxq_recv);
@@ -3995,10 +3791,6 @@  static const struct netdev_class dpdk_vhost_client_class =
         netdev_dpdk_vhost_client_set_config,
         NULL,
         netdev_dpdk_vhost_send,
-        netdev_dpdk_vhost_get_carrier,
-        netdev_dpdk_vhost_get_stats,
-        NULL,
-        NULL,
         netdev_dpdk_vhost_user_get_status,
         netdev_dpdk_vhost_client_reconfigure,
         netdev_dpdk_vhost_rxq_recv);
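
For readers looking at the netdev-dpdk.c hunks in isolation: the
link_status_changed_callback() and vring_state_changed_callback() functions
introduced above replace the old librte_vhost new_device(), destroy_device()
and vring_state_changed() callbacks. Their registration against the ethdev
event API is not part of the chunks shown here; the sketch below only
illustrates the expected shape, presumably run once per port from the attach
path, and the helper name is illustrative rather than taken from the patch.

/* Sketch only: register the two ethdev event callbacks for a vhost PMD
 * port.  Assumes it is called once, after the port has been attached. */
static int
dpdk_register_vhost_callbacks(dpdk_port_t port_id)
{
    int err;

    /* Link up/down events stand in for the old new_device() /
     * destroy_device() librte_vhost callbacks. */
    err = rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC,
                                        link_status_changed_callback, NULL);
    if (err) {
        return err;
    }

    /* Per-virtqueue enable/disable events stand in for the old
     * vring_state_changed() callback. */
    return rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_QUEUE_STATE,
                                         vring_state_changed_callback, NULL);
}
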
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 3d21b01..baefa2b 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -68,6 +68,7 @@  OVS_VSWITCHD_STOP("/does not exist. The Open vSwitch kernel module is probably n
 /failed to connect to \/tmp\/dpdkvhostclient0: No such file or directory/d
 /Global register is changed during/d
 /EAL: No free hugepages reported in hugepages-1048576kB/d
+/Rx checksum offload is not supported/d
 ")
 AT_CLEANUP
 dnl --------------------------------------------------------------------------
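
The attach step itself, dpdk_attach_vhost_pmd() as called from
netdev_dpdk_vhost_client_reconfigure() above, is likewise outside the hunks
shown. A rough sketch of the devargs string such a helper would hand to
librte_ether follows; the iface/queues/client argument names come from DPDK's
net_vhost driver, while the helper name and the fixed id are placeholders (in
the patch the id comes from dpdk_get_vhost_id_pool()).

/* Sketch only: attach a vhost PMD port for a given socket path.
 * 'client' corresponds to dpdkvhostuserclient ports. */
static int
attach_vhost_pmd_sketch(const char *sock_path, int n_queues, bool client,
                        dpdk_port_t *port_id)
{
    char devargs[PATH_MAX];
    uint32_t id = 0;  /* Placeholder for an id-pool allocation. */

    snprintf(devargs, sizeof devargs,
             "net_vhost%"PRIu32",iface=%s,queues=%d,client=%d",
             id, sock_path, n_queues, client ? 1 : 0);

    return rte_eth_dev_attach(devargs, port_id);
}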