
[v10,6/7] vhost-user: add multiple queue support

Message ID 1442588324-11365-7-git-send-email-yuanhan.liu@linux.intel.com
State New

Commit Message

Yuanhan Liu Sept. 18, 2015, 2:58 p.m. UTC
From: Changchun Ouyang <changchun.ouyang@intel.com>

This patch is initially based on a patch from Nikolay Nikolaev.

This patch adds vhost-user multiple queue support by creating an nc and
vhost_net pair for each queue.

QEMU exits if it finds that the backend can't support the number of requested
queues (given by the queues=# option). The max number is queried by a new
message, VHOST_USER_GET_QUEUE_NUM, which is sent only when the protocol
feature VHOST_USER_PROTOCOL_F_MQ is present.

The max queue check is done at the vhost-user initiation stage. We initiate
one queue first, which, in the meantime, also gets the max_queues the
backend supports.
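
In code terms, the check performed for the first queue boils down to roughly
the following sketch. It is illustrative only: the real logic is split between
the vhost-user backend code and vhost_user_start() in this patch, and the
helper name and parameters below are made up.

    #define VHOST_USER_PROTOCOL_F_MQ    0   /* from docs/specs/vhost-user.txt */

    /* Sketch of the first-queue check, not the patch's exact code. */
    static int vhost_user_check_queues(uint64_t protocol_features,
                                       uint64_t queue_num_from_slave,
                                       int requested_queues)
    {
        uint64_t max_queues = 1;    /* default when F_MQ is not negotiated */

        if (protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) {
            /* VHOST_USER_GET_QUEUE_NUM is sent only in this case */
            max_queues = queue_num_from_slave;
        }

        /* a failure here makes QEMU exit, as described above */
        return (uint64_t)requested_queues <= max_queues ? 0 : -1;
    }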

In older versions, it was reported that some messages were sent more times
than necessary. We came to an agreement with Michael that we could categorize
vhost-user messages into 2 types: non-vring specific messages, which should
be sent only once, and vring specific messages, which should be sent per
queue.

Here I introduced a helper function, vhost_user_one_time_request(), which
lists the following messages as non-vring specific ones:

        VHOST_USER_SET_OWNER
        VHOST_USER_RESET_DEVICE
        VHOST_USER_SET_MEM_TABLE
        VHOST_USER_GET_QUEUE_NUM

For the above messages, we simply ignore any attempt to send them after the
first time (that is, for any queue but the first).
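
Concretely, the filter is a single early-return check on the device's
vq_index, condensed here from the vhost_user_call() hunk below:

    /* Only the vhost_dev that owns vring index 0 (the first queue pair)
     * actually sends a one-time request; all other queue pairs skip it. */
    if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0) {
        return 0;
    }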

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>

---
v10: don't treat VHOST_USER_SET/GET_[PROTOCOL]_FEATURES as one time
     requests, as the two feature bits need to be stored per device.

v9: as suggested by Jason Wang, invoke qemu_chr_add_handlers() only once,
    and invoke qemu_find_net_clients_except() in the handler to gather all
    related ncs for initiating all queues. This addresses a hidden bug in
    the older version in a more QEMU-like way.

v8: set net->dev.vq_index correctly inside vhost_net_init() based on the
    value from net->nc->queue_index.
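
Regarding the v8 note: with two vrings per queue pair, the index arithmetic
done in the hw/net/vhost_net.c hunk below is simply:

    /* Each queue pair owns two vrings (RX and TX), so the vhost device that
     * backs queue pair k starts at vring index 2 * k: */
    net->dev.nvqs     = 2;
    net->dev.vq_index = net->nc->queue_index * net->dev.nvqs;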
---
 docs/specs/vhost-user.txt |  15 +++++
 hw/net/vhost_net.c        |  10 ++--
 hw/virtio/vhost-user.c    |  22 +++++++
 hw/virtio/vhost.c         |   5 +-
 net/vhost-user.c          | 146 +++++++++++++++++++++++++++++++++-------------
 qapi-schema.json          |   6 +-
 qemu-options.hx           |   5 +-
 7 files changed, 159 insertions(+), 50 deletions(-)

Comments

Jason Wang Sept. 22, 2015, 10:14 a.m. UTC | #1
On 09/18/2015 10:58 PM, Yuanhan Liu wrote:
> [...]
> -static void net_vhost_link_down(VhostUserState *s, bool link_down)
> +static void net_vhost_link_down(int queues, NetClientState *ncs[],
> +                                bool link_down)
>  {
> -    s->nc.link_down = link_down;
> +    NetClientState *nc;
> +    int i;
>  
> -    if (s->nc.peer) {
> -        s->nc.peer->link_down = link_down;
> -    }
> +    for (i = 0; i < queues; i++) {
> +        nc = ncs[i];
>  
> -    if (s->nc.info->link_status_changed) {
> -        s->nc.info->link_status_changed(&s->nc);
> -    }
> +        nc->link_down = link_down;
> +
> +        if (nc->peer) {
> +            nc->peer->link_down = link_down;
> +        }
> +
> +        if (nc->info->link_status_changed) {
> +            nc->info->link_status_changed(nc);
> +        }
>  
> -    if (s->nc.peer && s->nc.peer->info->link_status_changed) {
> -        s->nc.peer->info->link_status_changed(s->nc.peer);
> +        if (nc->peer && nc->peer->info->link_status_changed) {
> +            nc->peer->info->link_status_changed(nc->peer);
> +        }
>      }
>  }

The semantics are a little bit different from qmp_set_link()'s. What's the
reason for this? If there's no specific reason, we could probably just reuse
qmp_set_link() (or part of it) here.

Otherwise this looks good to me.

Michael S. Tsirkin Sept. 22, 2015, 11:53 a.m. UTC | #2
On Fri, Sep 18, 2015 at 10:58:43PM +0800, Yuanhan Liu wrote:
> [...]

Eric, could you review schema changes please?

Eric Blake Sept. 22, 2015, 2:44 p.m. UTC | #3
On 09/18/2015 08:58 AM, Yuanhan Liu wrote:
> From: Changchun Ouyang <changchun.ouyang@intel.com>
> 
> This patch is initially based a patch from Nikolay Nikolaev.
> 
> This patch adds vhost-user multiple queue support, by creating a nc
> and vhost_net pair for each queue.
> 
> Qemu exits if find that the backend can't support number of requested

s/support/support the/

> [...]

Just focusing on the interface portions:

> +++ b/qapi-schema.json
> @@ -2480,12 +2480,16 @@
>  #
>  # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
>  #
> +# @queues: #optional number of queues to be created for multiqueue vhost-user
> +#          (default: 1) (Since 2.5)
> +#
>  # Since 2.1
>  ##
>  { 'struct': 'NetdevVhostUserOptions',
>    'data': {
>      'chardev':        'str',
> -    '*vhostforce':    'bool' } }
> +    '*vhostforce':    'bool',
> +    '*queues':        'int' } }
>  

Discoverable via introspection, and should work with the patches from
Zoltan and me that get QMP netdev_add converted over to qapi.

Interface is good to go.
Eduardo Habkost Sept. 22, 2015, 6:52 p.m. UTC | #4
On Fri, Sep 18, 2015 at 10:58:43PM +0800, Yuanhan Liu wrote:
[...]
> diff --git a/net/vhost-user.c b/net/vhost-user.c
> index 93dcecd..4fa3d64 100644
> --- a/net/vhost-user.c
> +++ b/net/vhost-user.c
[...]
> +        if (i == 0) {
> +            max_queues = vhost_net_get_max_queues(s->vhost_net);

This breaks arm-softmmu (and any target that doesn't have CONFIG_VHOST_NET):

  $ make subdir-arm-softmmu
    CC    arm-softmmu/hw/net/vhost_net.o
    LINK  arm-softmmu/qemu-system-arm
  ../net/vhost-user.o: In function `vhost_user_start':
  /home/ehabkost/rh/proj/virt/qemu/net/vhost-user.c:88: undefined reference to `vhost_net_get_max_queues'
  collect2: error: ld returned 1 exit status
  Makefile:193: recipe for target 'qemu-system-arm' failed
  make[1]: *** [qemu-system-arm] Error 1
  Makefile:184: recipe for target 'subdir-arm-softmmu' failed
  make: *** [subdir-arm-softmmu] Error 2
Yuanhan Liu Sept. 23, 2015, 1:48 a.m. UTC | #5
On Tue, Sep 22, 2015 at 03:52:49PM -0300, Eduardo Habkost wrote:
> On Fri, Sep 18, 2015 at 10:58:43PM +0800, Yuanhan Liu wrote:
> [...]
> > diff --git a/net/vhost-user.c b/net/vhost-user.c
> > index 93dcecd..4fa3d64 100644
> > --- a/net/vhost-user.c
> > +++ b/net/vhost-user.c
> [...]
> > +        if (i == 0) {
> > +            max_queues = vhost_net_get_max_queues(s->vhost_net);
> 
> This breaks arm-softmmu (and any target that doesn't have CONFIG_VHOST_NET):

Thanks for testing. Will fix it in the next version.

	--yliu
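
A likely shape for such a fix, assuming the usual pattern of stub
implementations in hw/net/vhost_net.c for builds without CONFIG_VHOST_NET
(a sketch only, with the return type assumed from the earlier patch in this
series, and not necessarily what was eventually applied):

    /* hw/net/vhost_net.c, in the stub section built when CONFIG_VHOST_NET
     * is not set: */
    uint64_t vhost_net_get_max_queues(VHostNetState *net)
    {
        return 1;   /* without vhost support, only a single queue pair exists */
    }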
Yuanhan Liu Sept. 23, 2015, 1:57 a.m. UTC | #6
On Tue, Sep 22, 2015 at 06:14:14PM +0800, Jason Wang wrote:
> 
> 
[...]
> > -static void net_vhost_link_down(VhostUserState *s, bool link_down)
> > +static void net_vhost_link_down(int queues, NetClientState *ncs[],
> > +                                bool link_down)
> >  {
> > -    s->nc.link_down = link_down;
> > +    NetClientState *nc;
> > +    int i;
> >  
> > -    if (s->nc.peer) {
> > -        s->nc.peer->link_down = link_down;
> > -    }
> > +    for (i = 0; i < queues; i++) {
> > +        nc = ncs[i];
> >  
> > -    if (s->nc.info->link_status_changed) {
> > -        s->nc.info->link_status_changed(&s->nc);
> > -    }
> > +        nc->link_down = link_down;
> > +
> > +        if (nc->peer) {
> > +            nc->peer->link_down = link_down;
> > +        }
> > +
> > +        if (nc->info->link_status_changed) {
> > +            nc->info->link_status_changed(nc);
> > +        }
> >  
> > -    if (s->nc.peer && s->nc.peer->info->link_status_changed) {
> > -        s->nc.peer->info->link_status_changed(s->nc.peer);
> > +        if (nc->peer && nc->peer->info->link_status_changed) {
> > +            nc->peer->info->link_status_changed(nc->peer);
> > +        }
> >      }
> >  }
> 
> The semantics is a little bit difference with qmp_set_link. What's the
> reason for this? If there's no specific reason, probably, we could just
> reuse qmp_set_link() (or part of) here?

I did this just because I refactored the vhost-user init code, making the
char dev handler initiate all ncs. Hence I turned net_vhost_link_down() into
a function that handles all ncs, just like what I did for vhost_user_start().

And TBH, I wasn't aware of qmp_set_link() before; after reading the code, I
do think we can simply invoke it instead. Thanks for the info.

> 
> Other looks good to me.

Thanks for the review. May I add your Reviewed-by if I fix the above issue?

	--yliu
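
For reference, reusing qmp_set_link() in the chardev event handler could look
roughly like the sketch below. Only the switch body of net_vhost_user_event()
changes; it assumes the qmp_set_link(name, up, errp) prototype from net/net.c
and the error_abort convention, and is just an illustration of the idea
discussed above, not necessarily the code of a later revision:

    /* net/vhost-user.c sketch: let qmp_set_link() look the client up by name
     * and update the link state of every queue and its peer, instead of the
     * open-coded net_vhost_link_down() loop. */
    case CHR_EVENT_OPENED:
        if (vhost_user_start(queues, ncs) < 0) {
            exit(1);
        }
        qmp_set_link(name, true, &error_abort);   /* name is known to exist */
        error_report("chardev \"%s\" went up", s->chr->label);
        break;
    case CHR_EVENT_CLOSED:
        qmp_set_link(name, false, &error_abort);
        vhost_user_stop(queues, ncs);
        error_report("chardev \"%s\" went down", s->chr->label);
        break;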
Jason Wang Sept. 23, 2015, 2:12 a.m. UTC | #7
On 09/23/2015 09:57 AM, Yuanhan Liu wrote:
> On Tue, Sep 22, 2015 at 06:14:14PM +0800, Jason Wang wrote:
>>
> [...]
>>> [...]
>> The semantics is a little bit difference with qmp_set_link. What's the
>> reason for this? If there's no specific reason, probably, we could just
>> reuse qmp_set_link() (or part of) here?
> I did this just because I refactored the vhost_user init code, making
> the char dev handler to initiate all ncs. Hence I turned net_vhost_link_down()
> to handle all ncs, just like what I did for vhost_user_start().
>
> And TBH, I'm not aware of qmp_set_link() before, and after reading the
> code, I do think we can simply invoke it instead. And thanks for the
> info.
>
>> Other looks good to me.
> Thanks for the review. May I add your reviewed-by if I fix above issue?
>
> 	--yliu

I prefer to add it myself :)

Thanks.
Yuanhan Liu Sept. 23, 2015, 2:16 a.m. UTC | #8
On Wed, Sep 23, 2015 at 10:12:10AM +0800, Jason Wang wrote:
> 
> 
> On 09/23/2015 09:57 AM, Yuanhan Liu wrote:
> > On Tue, Sep 22, 2015 at 06:14:14PM +0800, Jason Wang wrote:
> >>
> > [...]
> >>> [...]
> >> The semantics is a little bit difference with qmp_set_link. What's the
> >> reason for this? If there's no specific reason, probably, we could just
> >> reuse qmp_set_link() (or part of) here?
> > I did this just because I refactored the vhost_user init code, making
> > the char dev handler to initiate all ncs. Hence I turned net_vhost_link_down()
> > to handle all ncs, just like what I did for vhost_user_start().
> >
> > And TBH, I'm not aware of qmp_set_link() before, and after reading the
> > code, I do think we can simply invoke it instead. And thanks for the
> > info.
> >
> >> Other looks good to me.
> > Thanks for the review. May I add your reviewed-by if I fix above issue?
> >
> > 	--yliu
> 
> I prefer to add it myself :)

Fair enough; I meant to save your effort :-)

	--yliu

Patch

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 43db9b4..cfc9d41 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -135,6 +135,21 @@  As older slaves don't support negotiating protocol features,
 a feature bit was dedicated for this purpose:
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+Multiple queue support
+----------------------
+
+Multiple queue is treated as a protocol extension, hence the slave has to
+implement protocol features first. The multiple queues feature is supported
+only when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set:
+#define VHOST_USER_PROTOCOL_F_MQ    0
+
+The max number of queues the slave supports can be queried with message
+VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the number of
+requested queues is bigger than that.
+
+As all queues share one connection, the master uses a unique index for each
+queue in the sent message to identify a specified queue.
+
 Message types
 -------------
 
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index bc88006..786191a 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -148,8 +148,11 @@  struct vhost_net *vhost_net_init(VhostNetOptions *options)
         fprintf(stderr, "vhost-net requires net backend to be setup\n");
         goto fail;
     }
+    net->nc = options->net_backend;
 
     net->dev.max_queues = 1;
+    net->dev.nvqs = 2;
+    net->dev.vqs = net->vqs;
 
     if (backend_kernel) {
         r = vhost_net_get_fd(options->net_backend);
@@ -164,11 +167,10 @@  struct vhost_net *vhost_net_init(VhostNetOptions *options)
         net->dev.backend_features = 0;
         net->dev.protocol_features = 0;
         net->backend = -1;
-    }
-    net->nc = options->net_backend;
 
-    net->dev.nvqs = 2;
-    net->dev.vqs = net->vqs;
+        /* vhost-user needs vq_index to initiate a specific queue pair */
+        net->dev.vq_index = net->nc->queue_index * net->dev.nvqs;
+    }
 
     r = vhost_dev_init(&net->dev, options->opaque,
                        options->backend_type);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 5018fd6..e42fde6 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -187,6 +187,19 @@  static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
             0 : -1;
 }
 
+static bool vhost_user_one_time_request(VhostUserRequest request)
+{
+    switch (request) {
+    case VHOST_USER_SET_OWNER:
+    case VHOST_USER_RESET_DEVICE:
+    case VHOST_USER_SET_MEM_TABLE:
+    case VHOST_USER_GET_QUEUE_NUM:
+        return true;
+    default:
+        return false;
+    }
+}
+
 static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
         void *arg)
 {
@@ -207,6 +220,15 @@  static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
         msg_request = request;
     }
 
+    /*
+     * For non-vring specific requests, like VHOST_USER_SET_MEM_TABLE,
+     * we just need send it once in the first time. For later such
+     * request, we just ignore it.
+     */
+    if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0) {
+        return 0;
+    }
+
     msg.request = msg_request;
     msg.flags = VHOST_USER_VERSION;
     msg.size = 0;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7a7812d..c0ed5b2 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -874,8 +874,9 @@  static void vhost_eventfd_del(MemoryListener *listener,
 static int vhost_virtqueue_init(struct vhost_dev *dev,
                                 struct vhost_virtqueue *vq, int n)
 {
+    int vhost_vq_index = dev->vhost_ops->vhost_backend_get_vq_index(dev, n);
     struct vhost_vring_file file = {
-        .index = n,
+        .index = vhost_vq_index,
     };
     int r = event_notifier_init(&vq->masked_notifier, 0);
     if (r < 0) {
@@ -926,7 +927,7 @@  int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     }
 
     for (i = 0; i < hdev->nvqs; ++i) {
-        r = vhost_virtqueue_init(hdev, hdev->vqs + i, i);
+        r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i);
         if (r < 0) {
             goto fail_vq;
         }
diff --git a/net/vhost-user.c b/net/vhost-user.c
index 93dcecd..4fa3d64 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -39,37 +39,77 @@  static int vhost_user_running(VhostUserState *s)
     return (s->vhost_net) ? 1 : 0;
 }
 
-static int vhost_user_start(VhostUserState *s)
+static void vhost_user_stop(int queues, NetClientState *ncs[])
 {
-    VhostNetOptions options;
-
-    if (vhost_user_running(s)) {
-        return 0;
-    }
+    VhostUserState *s;
+    int i;
 
-    options.backend_type = VHOST_BACKEND_TYPE_USER;
-    options.net_backend = &s->nc;
-    options.opaque = s->chr;
+    for (i = 0; i < queues; i++) {
+        assert (ncs[i]->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
 
-    s->vhost_net = vhost_net_init(&options);
+        s = DO_UPCAST(VhostUserState, nc, ncs[i]);
+        if (!vhost_user_running(s)) {
+            continue;
+        }
 
-    return vhost_user_running(s) ? 0 : -1;
+        if (s->vhost_net) {
+            vhost_net_cleanup(s->vhost_net);
+            s->vhost_net = NULL;
+        }
+    }
 }
 
-static void vhost_user_stop(VhostUserState *s)
+static int vhost_user_start(int queues, NetClientState *ncs[])
 {
-    if (vhost_user_running(s)) {
-        vhost_net_cleanup(s->vhost_net);
+    VhostNetOptions options;
+    VhostUserState *s;
+    int max_queues;
+    int i;
+
+    options.backend_type = VHOST_BACKEND_TYPE_USER;
+
+    for (i = 0; i < queues; i++) {
+        assert (ncs[i]->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
+
+        s = DO_UPCAST(VhostUserState, nc, ncs[i]);
+        if (vhost_user_running(s)) {
+            continue;
+        }
+
+        options.net_backend = ncs[i];
+        options.opaque      = s->chr;
+        s->vhost_net = vhost_net_init(&options);
+        if (!s->vhost_net) {
+            error_report("failed to init vhost_net for queue %d\n", i);
+            goto err;
+        }
+
+        if (i == 0) {
+            max_queues = vhost_net_get_max_queues(s->vhost_net);
+            if (queues > max_queues) {
+                error_report("you are asking more queues than "
+                             "supported: %d\n", max_queues);
+                goto err;
+            }
+        }
     }
 
-    s->vhost_net = 0;
+    return 0;
+
+err:
+    vhost_user_stop(i + 1, ncs);
+    return -1;
 }
 
 static void vhost_user_cleanup(NetClientState *nc)
 {
     VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
 
-    vhost_user_stop(s);
+    if (s->vhost_net) {
+        vhost_net_cleanup(s->vhost_net);
+        s->vhost_net = NULL;
+    }
+
     qemu_purge_queued_packets(nc);
 }
 
@@ -95,59 +135,81 @@  static NetClientInfo net_vhost_user_info = {
         .has_ufo = vhost_user_has_ufo,
 };
 
-static void net_vhost_link_down(VhostUserState *s, bool link_down)
+static void net_vhost_link_down(int queues, NetClientState *ncs[],
+                                bool link_down)
 {
-    s->nc.link_down = link_down;
+    NetClientState *nc;
+    int i;
 
-    if (s->nc.peer) {
-        s->nc.peer->link_down = link_down;
-    }
+    for (i = 0; i < queues; i++) {
+        nc = ncs[i];
 
-    if (s->nc.info->link_status_changed) {
-        s->nc.info->link_status_changed(&s->nc);
-    }
+        nc->link_down = link_down;
+
+        if (nc->peer) {
+            nc->peer->link_down = link_down;
+        }
+
+        if (nc->info->link_status_changed) {
+            nc->info->link_status_changed(nc);
+        }
 
-    if (s->nc.peer && s->nc.peer->info->link_status_changed) {
-        s->nc.peer->info->link_status_changed(s->nc.peer);
+        if (nc->peer && nc->peer->info->link_status_changed) {
+            nc->peer->info->link_status_changed(nc->peer);
+        }
     }
 }
 
 static void net_vhost_user_event(void *opaque, int event)
 {
-    VhostUserState *s = opaque;
+    const char *name = opaque;
+    NetClientState *ncs[MAX_QUEUE_NUM];
+    VhostUserState *s;
+    int queues;
 
+    queues = qemu_find_net_clients_except(name, ncs,
+                                          NET_CLIENT_OPTIONS_KIND_NIC,
+                                          MAX_QUEUE_NUM);
+    s = DO_UPCAST(VhostUserState, nc, ncs[0]);
     switch (event) {
     case CHR_EVENT_OPENED:
-        vhost_user_start(s);
-        net_vhost_link_down(s, false);
+        if (vhost_user_start(queues, ncs) < 0) {
+            exit(1);
+        }
+        net_vhost_link_down(queues, ncs, false);
         error_report("chardev \"%s\" went up", s->chr->label);
         break;
     case CHR_EVENT_CLOSED:
-        net_vhost_link_down(s, true);
-        vhost_user_stop(s);
+        net_vhost_link_down(queues, ncs, true);
+        vhost_user_stop(queues, ncs);
         error_report("chardev \"%s\" went down", s->chr->label);
         break;
     }
 }
 
 static int net_vhost_user_init(NetClientState *peer, const char *device,
-                               const char *name, CharDriverState *chr)
+                               const char *name, CharDriverState *chr,
+                               int queues)
 {
     NetClientState *nc;
     VhostUserState *s;
+    int i;
 
-    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
+    for (i = 0; i < queues; i++) {
+        nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
 
-    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
-             chr->label);
+        snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
+                 i, chr->label);
 
-    s = DO_UPCAST(VhostUserState, nc, nc);
+        /* We don't provide a receive callback */
+        nc->receive_disabled = 1;
+        nc->queue_index = i;
 
-    /* We don't provide a receive callback */
-    s->nc.receive_disabled = 1;
-    s->chr = chr;
+        s = DO_UPCAST(VhostUserState, nc, nc);
+        s->chr = chr;
+    }
 
-    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
+    qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, (void*)name);
 
     return 0;
 }
@@ -226,6 +288,7 @@  static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
 int net_init_vhost_user(const NetClientOptions *opts, const char *name,
                         NetClientState *peer, Error **errp)
 {
+    int queues;
     const NetdevVhostUserOptions *vhost_user_opts;
     CharDriverState *chr;
 
@@ -243,6 +306,7 @@  int net_init_vhost_user(const NetClientOptions *opts, const char *name,
         return -1;
     }
 
+    queues = vhost_user_opts->has_queues ? vhost_user_opts->queues : 1;
 
-    return net_vhost_user_init(peer, "vhost_user", name, chr);
+    return net_vhost_user_init(peer, "vhost_user", name, chr, queues);
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 2bada60..a372f49 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2480,12 +2480,16 @@ 
 #
 # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
 #
+# @queues: #optional number of queues to be created for multiqueue vhost-user
+#          (default: 1) (Since 2.5)
+#
 # Since 2.1
 ##
 { 'struct': 'NetdevVhostUserOptions',
   'data': {
     'chardev':        'str',
-    '*vhostforce':    'bool' } }
+    '*vhostforce':    'bool',
+    '*queues':        'int' } }
 
 ##
 # @NetClientOptions
diff --git a/qemu-options.hx b/qemu-options.hx
index 7e147b8..328404c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1990,13 +1990,14 @@  The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
 netdev.  @code{-net} and @code{-device} with parameter @option{vlan} create the
 required hub automatically.
 
-@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off]
+@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n]
 
 Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
 be a unix domain socket backed one. The vhost-user uses a specifically defined
 protocol to pass vhost ioctl replacement messages to an application on the other
 end of the socket. On non-MSIX guests, the feature can be forced with
-@var{vhostforce}.
+@var{vhostforce}. Use 'queues=@var{n}' to specify the number of queues to
+be created for multiqueue vhost-user.
 
 Example:
 @example