From patchwork Wed Feb 12 09:51:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coiby Xu X-Patchwork-Id: 1236740 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=TMD4+3ko; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48HZjD3klVz9s1x for ; Wed, 12 Feb 2020 20:53:52 +1100 (AEDT) Received: from localhost ([::1]:34546 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1oiM-0005Zy-Cn for incoming@patchwork.ozlabs.org; Wed, 12 Feb 2020 04:53:50 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:51118) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1ogc-0003hi-Vd for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j1ogb-0006Es-F5 for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:02 -0500 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:38955) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1j1ogb-0006Dj-8W for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:01 -0500 Received: by mail-pj1-x1041.google.com with SMTP id e9so663994pjr.4 for ; Wed, 12 
Feb 2020 01:52:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RRDnThdt4ySg1vAIm76nnPSu+KXuwzNmopRaMhLWftw=; b=TMD4+3koi/gg26HdSKjffO7xpdwKWEDeCDtAgL090+DxjX6Wcw6qwBJrAwP/+tCAFo WwlQZy7QYE23tn9xqgY3knTxB9Aazy9sySNIfM+uGHLTMMPGqwV5nx9hdOwT9poacgDy 0pPFxoeK/pDwhcUpIzjMgXEseej3IiqVHtmDdX3B5sUn3tczOiOvZBhoIsHH/lBCCydb 6cmPSQsOLQE4fm46q2qj1qrATvrrWvjQwDfoGhGnk0rL5urmLlPROeSvCl5caHbs/uDW X8nsMeWT/1Wnt4BDZfEdXr/Fv8FBWpuhI3K++XhxnTVDifqldR0wC1aau4f9pw35t4kl +KHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RRDnThdt4ySg1vAIm76nnPSu+KXuwzNmopRaMhLWftw=; b=ugvCkNdBGRaTT3fXWGsTjnugVGxmo1ln0xBh35au1tGJ35uunXMc2S2FDBpkgHz2tg WTPtnf8NVDegyBUx1JbTmHCAXklJGWLWTYpslZCzbzhD1abx8QNL1qcF3Kp9skEWUVO4 VySZQiFF5FUfVWX4YOub4D7BFr+GPTNX9BxFWUy+b08h07nVzP+8N0k4iUFA9tfEFQYs wZxXoiLjT4SjwFUwSEZZpeoqgXQ0leCoNukyvFB+CKaJGAkNj31bXUVz8LqtqmYNek6b nnBPMr6bPkFGgmkEfmSreQ/CKBspNQRkEzEnGUprqij4Or+QHaflflqYFt5cADzqLrWM wx9Q== X-Gm-Message-State: APjAAAVW5NmM/hnAQ/hpXydihziu1lIMZWPfRiWdHCl1iGFYLmuCspp2 bw0AmOrZGsUfOSbQgKJBf9IkMCZH X-Google-Smtp-Source: APXvYqzZpAnmpXmQjOVX6QpjpCvb0pFtZzZAyztuVBxl7lPuEvW4xVcma0ko6NpiT+vfD4kWzXYSlA== X-Received: by 2002:a17:902:6a84:: with SMTP id n4mr7507775plk.294.1581501119627; Wed, 12 Feb 2020 01:51:59 -0800 (PST) Received: from localhost.localdomain ([2402:9e80:0:1000::1:4ad5]) by smtp.googlemail.com with ESMTPSA id o19sm6298595pjr.2.2020.02.12.01.51.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Feb 2020 01:51:59 -0800 (PST) From: Coiby Xu To: qemu-devel@nongnu.org Subject: [PATCH v3 1/5] extend libvhost to support IOThread and coroutine Date: Wed, 12 Feb 2020 17:51:33 +0800 Message-Id: <20200212095137.7977-2-coiby.xu@gmail.com> 
X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200212095137.7977-1-coiby.xu@gmail.com> References: <20200212095137.7977-1-coiby.xu@gmail.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1041 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, bharatlkmlkvm@gmail.com, Coiby Xu , stefanha@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Previously, libvhost dispatched events in its own GMainContext. Now a vhost-user client's kick event can be dispatched in the block device drive's AioContext, so IOThread is supported. Also allow vu_message_read and vu_kick_cb to be replaced so that QEMU can run them as coroutines. Signed-off-by: Coiby Xu --- contrib/libvhost-user/libvhost-user.c | 54 ++++++++++++++++++++++++--- contrib/libvhost-user/libvhost-user.h | 38 ++++++++++++++++++- 2 files changed, 85 insertions(+), 7 deletions(-) diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c index b89bf18501..f95664bb22 100644 --- a/contrib/libvhost-user/libvhost-user.c +++ b/contrib/libvhost-user/libvhost-user.c @@ -67,8 +67,6 @@ /* The version of inflight buffer */ #define INFLIGHT_VERSION 1 -#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64) - /* The version of the protocol we support */ #define VHOST_USER_VERSION 1 #define LIBVHOST_USER_DEBUG 0 @@ -260,7 +258,7 @@ have_userfault(void) } static bool -vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg) +vu_message_read_(VuDev *dev, int conn_fd, VhostUserMsg *vmsg) { char control[CMSG_SPACE(VHOST_MEMORY_MAX_NREGIONS * sizeof(int))] = { }; struct iovec iov = { @@ -328,6 +326,17 @@ fail: return false; } +static bool vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg) +{ + vu_read_msg_cb read_msg; + if 
(dev->co_iface) { + read_msg = dev->co_iface->read_msg; + } else { + read_msg = vu_message_read_; + } + return read_msg(dev, conn_fd, vmsg); +} + static bool vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg) { @@ -1075,9 +1084,14 @@ vu_set_vring_kick_exec(VuDev *dev, VhostUserMsg *vmsg) } if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) { + if (dev->set_watch_packed_data) { + dev->set_watch_packed_data(dev, dev->vq[index].kick_fd, VU_WATCH_IN, + dev->co_iface->kick_callback, + (void *)(long)index); + } else { dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN, vu_kick_cb, (void *)(long)index); - + } DPRINT("Waiting for kicks on fd: %d for vq: %d\n", dev->vq[index].kick_fd, index); } @@ -1097,8 +1111,14 @@ void vu_set_queue_handler(VuDev *dev, VuVirtq *vq, vq->handler = handler; if (vq->kick_fd >= 0) { if (handler) { + if (dev->set_watch_packed_data) { + dev->set_watch_packed_data(dev, vq->kick_fd, VU_WATCH_IN, + dev->co_iface->kick_callback, + (void *)(long)qidx); + } else { dev->set_watch(dev, vq->kick_fd, VU_WATCH_IN, vu_kick_cb, (void *)(long)qidx); + } } else { dev->remove_watch(dev, vq->kick_fd); } @@ -1627,6 +1647,12 @@ vu_deinit(VuDev *dev) } if (vq->kick_fd != -1) { + /* Remove the watch for kick_fd. + * When the client process is running in gdb and + * the quit command is run in gdb, QEMU will still dispatch the event, + * which will cause a segmentation fault in the callback function. + */ + dev->remove_watch(dev, vq->kick_fd); close(vq->kick_fd); vq->kick_fd = -1; } @@ -1682,7 +1708,7 @@ vu_init(VuDev *dev, assert(max_queues > 0); assert(socket >= 0); - assert(set_watch); + /* assert(set_watch); */ assert(remove_watch); assert(iface); assert(panic); @@ -1715,6 +1741,24 @@ vu_init(VuDev *dev, return true; } +bool +vu_init_packed_data(VuDev *dev, + uint16_t max_queues, + int socket, + vu_panic_cb panic, + vu_set_watch_cb_packed_data set_watch_packed_data, + vu_remove_watch_cb remove_watch, + const VuDevIface *iface, + const CoIface *co_iface) +{ + if 
(vu_init(dev, max_queues, socket, panic, NULL, remove_watch, iface)) { + dev->set_watch_packed_data = set_watch_packed_data; + dev->co_iface = co_iface; + return true; + } + return false; +} + VuVirtq * vu_get_queue(VuDev *dev, int qidx) { diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h index 5cb7708559..6aadeaa0f2 100644 --- a/contrib/libvhost-user/libvhost-user.h +++ b/contrib/libvhost-user/libvhost-user.h @@ -30,6 +30,8 @@ #define VHOST_MEMORY_MAX_NREGIONS 8 +#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64) + typedef enum VhostSetConfigType { VHOST_SET_CONFIG_TYPE_MASTER = 0, VHOST_SET_CONFIG_TYPE_MIGRATION = 1, @@ -201,6 +203,7 @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev); typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features); typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg, int *do_reply); +typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg); typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started); typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx); typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len); @@ -208,6 +211,20 @@ typedef int (*vu_set_config_cb) (VuDev *dev, const uint8_t *data, uint32_t offset, uint32_t size, uint32_t flags); +typedef void (*vu_watch_cb_packed_data) (void *packed_data); + +typedef void (*vu_set_watch_cb_packed_data) (VuDev *dev, int fd, int condition, + vu_watch_cb_packed_data cb, + void *data); +/* + * allowing vu_read_msg_cb and kick_callback to be replaced so QEMU + * can run them as coroutines + */ +typedef struct CoIface { + vu_read_msg_cb read_msg; + vu_watch_cb_packed_data kick_callback; +} CoIface; + typedef struct VuDevIface { /* called by VHOST_USER_GET_FEATURES to get the features bitmask */ vu_get_features_cb get_features; @@ -372,7 +389,8 @@ struct VuDev { /* @set_watch: add or update the given fd to the watch set, * call cb when 
condition is met */ vu_set_watch_cb set_watch; - + /* AIO dispatch passes only one data pointer to the callback function */ + vu_set_watch_cb_packed_data set_watch_packed_data; /* @remove_watch: remove the given fd from the watch set */ vu_remove_watch_cb remove_watch; @@ -380,7 +398,7 @@ struct VuDev { * re-initialize */ vu_panic_cb panic; const VuDevIface *iface; - + const CoIface *co_iface; /* Postcopy data */ int postcopy_ufd; bool postcopy_listening; @@ -417,6 +435,22 @@ bool vu_init(VuDev *dev, const VuDevIface *iface); +/** + * vu_init_packed_data: + * Same as vu_init, except that set_watch_packed_data packs + * two parameters into a struct so that QEMU's aio_dispatch can pass the + * required data to the callback function. + * + * Returns: true on success, false on failure. + **/ +bool vu_init_packed_data(VuDev *dev, + uint16_t max_queues, + int socket, + vu_panic_cb panic, + vu_set_watch_cb_packed_data set_watch_packed_data, + vu_remove_watch_cb remove_watch, + const VuDevIface *iface, + const CoIface *co_iface); /** * vu_deinit: * @dev: a VuDev context From patchwork Wed Feb 12 09:51:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coiby Xu X-Patchwork-Id: 1236739 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=gPVP0zf6; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48HZhB6VxGz9s1x for ; Wed, 12 Feb 2020 20:52:58 +1100 (AEDT) Received: from localhost ([::1]:34534 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1ohU-0004Pv-RE for incoming@patchwork.ozlabs.org; Wed, 12 Feb 2020 04:52:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:51162) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1ogj-0003qA-0I for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:11 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j1ogf-0006KY-I9 for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:08 -0500 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:37618) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1j1ogf-0006Jk-9r for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:05 -0500 Received: by mail-pf1-x444.google.com with SMTP id p14so987418pfn.4 for ; Wed, 12 Feb 2020 01:52:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=X/aNgQOIY+xT9ydWwbqYnH3nfQvjNDbWUmOQgjYTpbs=; b=gPVP0zf69750E2iYmaTjrpNr5cPAIWPZ0mF/zKOroM9sX9LZuE7nfyxqh6mAiltIwU +CGaHYUTAtjmbBuThIe7snbr1zlUFKMGfcUpUDFVKXyUEVD2iZHiVP2e3/7IesbEtn30 R24KKHa5yjf7Uuv9BZ1KVqRn55nW+PwkqpMagLyPIdY0Ci7ViTOUuiYktkh07ARguHw8 wWFcvuC16kVfpBmJxqk8ngVp7WqarIG4Y0m08vtR2N21fUJ+0sZS8D2xXh7yFtdJW9XK 39j/gwhgp+zyCrfnfpfh//5++eHF7C5BvxqsxK22fu7gwfHQz0irrVTaY2l0WzpuVhc8 +opQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=X/aNgQOIY+xT9ydWwbqYnH3nfQvjNDbWUmOQgjYTpbs=; 
b=JDKSopGrFQ6sjxWz0QkdFq3uM1uZVFXZVL2hGPIFoYhs6pFdcxB4K9a6fGkLteLZXH IXnGzciA0NnAs+zRDjYn1ro3LFxLbqGVZ/bJwkhflGNcBgjdgmp3tQW/GDBpEuRnJ0DU +mN74j36xLXcsSUh7o70oFbwhQ/hpE02Im1FbdrZVAKML3IiQTh/oo7RsccwDHbq7oA0 6v+K/jBNiMlM3qVOij7QbRncpZ9ytgat+OGABpAWVZfq2hMMVyrU9xcly7c3EqIoWqO2 Sn5RulNdxrfli8sWJAh2e/sPjOtfxZCSmcjudxLQf8uYfZ1Bp47HmVZc9QnxB7KIIvBN itkQ== X-Gm-Message-State: APjAAAV4J4WoAPmDKiDgYwW9eSvbsOev8Q+elPKueENGAsFQP8P5oldO jWI/JUui6kZQZoyfmegenSd6f7cq X-Google-Smtp-Source: APXvYqzkQM3FsmA5NwnqeLXmlf0ReYsfgz9MPNyZq4/9MwvEHw99/LrKfJEW9Xyz45hS+kbX2X2/kA== X-Received: by 2002:a63:4a47:: with SMTP id j7mr11545987pgl.196.1581501123617; Wed, 12 Feb 2020 01:52:03 -0800 (PST) Received: from localhost.localdomain ([2402:9e80:0:1000::1:4ad5]) by smtp.googlemail.com with ESMTPSA id o19sm6298595pjr.2.2020.02.12.01.52.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Feb 2020 01:52:03 -0800 (PST) From: Coiby Xu To: qemu-devel@nongnu.org Subject: [PATCH v3 2/5] generic vhost user server Date: Wed, 12 Feb 2020 17:51:34 +0800 Message-Id: <20200212095137.7977-3-coiby.xu@gmail.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200212095137.7977-1-coiby.xu@gmail.com> References: <20200212095137.7977-1-coiby.xu@gmail.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::444 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, bharatlkmlkvm@gmail.com, Coiby Xu , stefanha@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Sharing QEMU devices via vhost-user protocol Signed-off-by: Coiby Xu --- util/Makefile.objs | 3 + util/vhost-user-server.c | 429 +++++++++++++++++++++++++++++++++++++++ util/vhost-user-server.h | 56 +++++ 3 files changed, 489 insertions(+) create mode 100644 util/vhost-user-server.c create mode 100644 util/vhost-user-server.h -- 2.25.0 diff --git a/util/Makefile.objs b/util/Makefile.objs index 11262aafaf..5e450e501c 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -36,6 +36,9 @@ util-obj-y += readline.o util-obj-y += rcu.o util-obj-$(CONFIG_MEMBARRIER) += sys_membarrier.o util-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o +ifdef CONFIG_LINUX +util-obj-y += vhost-user-server.o +endif util-obj-y += qemu-coroutine-sleep.o util-obj-y += qemu-co-shared-resource.o util-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c new file mode 100644 index 0000000000..0766b414c3 --- /dev/null +++ b/util/vhost-user-server.c @@ -0,0 +1,429 @@ +/* + * Sharing QEMU devices via vhost-user protocol + * + * Author: Coiby Xu + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. 
+ */ +#include "qemu/osdep.h" +#include +#include "qemu/main-loop.h" +#include "vhost-user-server.h" + +static void vmsg_close_fds(VhostUserMsg *vmsg) +{ + int i; + for (i = 0; i < vmsg->fd_num; i++) { + close(vmsg->fds[i]); + } +} + +static void vmsg_unblock_fds(VhostUserMsg *vmsg) +{ + int i; + for (i = 0; i < vmsg->fd_num; i++) { + qemu_set_nonblock(vmsg->fds[i]); + } +} + + +static void close_client(VuClient *client) +{ + vu_deinit(&client->parent); + client->sioc = NULL; + object_unref(OBJECT(client->ioc)); + client->closed = true; + +} + +static void panic_cb(VuDev *vu_dev, const char *buf) +{ + if (buf) { + error_report("vu_panic: %s", buf); + } + + VuClient *client = container_of(vu_dev, VuClient, parent); + VuServer *server = client->server; + if (!client->closed) { + close_client(client); + QTAILQ_REMOVE(&server->clients, client, next); + } + + if (server->device_panic_notifier) { + server->device_panic_notifier(client); + } +} + + + +static bool coroutine_fn +vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg) +{ + struct iovec iov = { + .iov_base = (char *)vmsg, + .iov_len = VHOST_USER_HDR_SIZE, + }; + int rc, read_bytes = 0; + /* + * VhostUserMsg is a packed structure, gcc will complain about passing + * pointer to a packed structure member if we pass &VhostUserMsg.fd_num + * and &VhostUserMsg.fds directly when calling qio_channel_readv_full, + * thus two temporary variables nfds and fds are used here. 
+ */ + size_t nfds = 0, nfds_t = 0; + int *fds = NULL, *fds_t = NULL; + VuClient *client = container_of(vu_dev, VuClient, parent); + QIOChannel *ioc = client->ioc; + + Error *erp = NULL; + assert(qemu_in_coroutine()); + do { + /* + * qio_channel_readv_full may return short reads, so keep calling it + * until getting VHOST_USER_HDR_SIZE or 0 bytes in total + */ + rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &erp); + if (rc < 0) { + if (rc == QIO_CHANNEL_ERR_BLOCK) { + qio_channel_yield(ioc, G_IO_IN); + continue; + } else { + error_report("Error while recvmsg: %s", strerror(errno)); + return false; + } + } + read_bytes += rc; + fds = g_renew(int, fds, nfds + nfds_t); + memcpy(fds + nfds, fds_t, nfds_t * sizeof(int)); + nfds += nfds_t; g_free(fds_t); + if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) { + break; + } + } while (true); + + vmsg->fd_num = nfds; + memcpy(vmsg->fds, fds, nfds * sizeof(int)); + g_free(fds); + /* qio_channel_readv_full will make socket fds blocking, unblock them */ + vmsg_unblock_fds(vmsg); + if (vmsg->size > sizeof(vmsg->payload)) { + error_report("Error: too big message request: %d, " + "size: vmsg->size: %u, " + "while sizeof(vmsg->payload) = %zu", + vmsg->request, vmsg->size, sizeof(vmsg->payload)); + goto fail; + } + + struct iovec iov_payload = { + .iov_base = (char *)&vmsg->payload, + .iov_len = vmsg->size, + }; + if (vmsg->size) { + rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &erp); + if (rc == -1) { + error_report("Error while reading: %s", strerror(errno)); + goto fail; + } + } + + return true; + +fail: + vmsg_close_fds(vmsg); + + return false; +} + + +static coroutine_fn void vu_client_next_trip(VuClient *client); + +static coroutine_fn void vu_client_trip(void *opaque) +{ + VuClient *client = opaque; + + vu_dispatch(&client->parent); + client->co_trip = NULL; + if (!client->closed) { + vu_client_next_trip(client); + } +} + +static coroutine_fn void vu_client_next_trip(VuClient *client) +{ + if (!client->co_trip) { + client->co_trip = 
qemu_coroutine_create(vu_client_trip, client); + aio_co_schedule(client->ioc->ctx, client->co_trip); + } +} + +static void vu_client_start(VuClient *client) +{ + client->co_trip = qemu_coroutine_create(vu_client_trip, client); + aio_co_enter(client->ioc->ctx, client->co_trip); +} + +static void coroutine_fn vu_kick_cb_next(VuClient *client, + kick_info *data); + +static void coroutine_fn vu_kick_cb(void *opaque) +{ + kick_info *data = (kick_info *) opaque; + int index = data->index; + VuDev *dev = data->vu_dev; + VuClient *client; + client = container_of(dev, VuClient, parent); + VuVirtq *vq = &dev->vq[index]; + int sock = vq->kick_fd; + if (sock == -1) { + return; + } + assert(sock == data->fd); + eventfd_t kick_data; + ssize_t rc; + /* + * When eventfd is closed, the revent is POLLNVAL (=G_IO_NVAL) and + * reading eventfd will return errno=EBADF (Bad file number). + * Calling qio_channel_yield(ioc, G_IO_IN) will set a read handler + * for QIOChannel, but aio_dispatch_handlers will only dispatch + * G_IO_IN | G_IO_HUP | G_IO_ERR revents while ignoring + * G_IO_NVAL (POLLNVAL) revents. + * + * Thus when eventfd is closed by the vhost-user client, QEMU will ignore + * G_IO_NVAL and keep polling by repeatedly calling qemu_poll_ns, which + * will lead to 100% CPU usage. + * + * To avoid this issue, make sure set_watch and remove_watch use the same + * AioContext for QIOChannel. Thus remove_watch will eventually successfully + * remove eventfd from the set of file descriptors polled for + * corresponding GSource. 
+ */ + rc = read(sock, &kick_data, sizeof(eventfd_t)); + if (rc != sizeof(eventfd_t)) { + if (errno == EAGAIN) { + qio_channel_yield(data->ioc, G_IO_IN); + } else if (errno != EINTR) { + data->co = NULL; + return; + } + } else { + vq->handler(dev, index); + } + data->co = NULL; + vu_kick_cb_next(client, data); + +} + +static void coroutine_fn vu_kick_cb_next(VuClient *client, + kick_info *cb_data) +{ + if (!cb_data->co) { + cb_data->co = qemu_coroutine_create(vu_kick_cb, cb_data); + aio_co_schedule(client->ioc->ctx, cb_data->co); + } +} +static const CoIface co_iface = { + .read_msg = vu_message_read, + .kick_callback = vu_kick_cb, +}; + + +static void +set_watch(VuDev *vu_dev, int fd, int vu_evt, + vu_watch_cb_packed_data cb, void *pvt) +{ + /* + * since aio_dispatch can only pass one user data pointer to the + * callback function, pack VuDev, pvt into a struct + */ + + VuClient *client; + + client = container_of(vu_dev, VuClient, parent); + g_assert(vu_dev); + g_assert(fd >= 0); + long index = (intptr_t) pvt; + g_assert(cb); + kick_info *kick_info = &client->kick_info[index]; + if (!kick_info->co) { + kick_info->fd = fd; + QIOChannelFile *fioc = qio_channel_file_new_fd(fd); + QIOChannel *ioc = QIO_CHANNEL(fioc); + ioc->ctx = client->ioc->ctx; + qio_channel_set_blocking(QIO_CHANNEL(ioc), false, NULL); + kick_info->fioc = fioc; + kick_info->ioc = ioc; + kick_info->vu_dev = vu_dev; + kick_info->co = qemu_coroutine_create(cb, kick_info); + aio_co_enter(client->ioc->ctx, kick_info->co); + } +} + + +static void remove_watch(VuDev *vu_dev, int fd) +{ + VuClient *client; + int i; + int index = -1; + g_assert(vu_dev); + g_assert(fd >= 0); + + client = container_of(vu_dev, VuClient, parent); + for (i = 0; i < vu_dev->max_queues; i++) { + if (client->kick_info[i].fd == fd) { + index = i; + break; + } + } + + if (index == -1) { + return; + } + + kick_info *kick_info = &client->kick_info[index]; + if (kick_info->ioc) { + aio_set_fd_handler(client->ioc->ctx, fd, false, NULL, + 
NULL, NULL, NULL); + kick_info->ioc = NULL; + g_free(kick_info->fioc); + kick_info->co = NULL; + kick_info->fioc = NULL; + } +} + + +static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc, + gpointer opaque) +{ + VuClient *client; + VuServer *server = opaque; + client = g_new0(VuClient, 1); + + if (!vu_init_packed_data(&client->parent, server->max_queues, + sioc->fd, panic_cb, + set_watch, remove_watch, + server->vu_iface, &co_iface)) { + error_report("Failed to initialize libvhost-user"); + g_free(client); + return; + } + + client->server = server; + client->sioc = sioc; + client->kick_info = g_new0(struct kick_info, server->max_queues); + /* + * increase the object reference, so sioc will not be freed by + * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc)) + */ + object_ref(OBJECT(client->sioc)); + qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client"); + client->ioc = QIO_CHANNEL(sioc); + object_ref(OBJECT(client->ioc)); + object_ref(OBJECT(sioc)); + qio_channel_attach_aio_context(client->ioc, server->ctx); + qio_channel_set_blocking(QIO_CHANNEL(client->sioc), false, NULL); + client->closed = false; + QTAILQ_INSERT_TAIL(&server->clients, client, next); + vu_client_start(client); +} + + +void vhost_user_server_stop(VuServer *server) +{ + if (!server) { + return; + } + + VuClient *client, *next; + QTAILQ_FOREACH_SAFE(client, &server->clients, next, next) { + if (!client->closed) { + close_client(client); + QTAILQ_REMOVE(&server->clients, client, next); + } + } + + if (server->listener) { + qio_net_listener_disconnect(server->listener); + object_unref(OBJECT(server->listener)); + } +} + +static void detach_context(VuServer *server) +{ + VuClient *client; + int i; + QTAILQ_FOREACH(client, &server->clients, next) { + qio_channel_detach_aio_context(client->ioc); + for (i = 0; i < client->parent.max_queues; i++) { + if (client->kick_info[i].ioc) { + qio_channel_detach_aio_context(client->kick_info[i].ioc); + } + } + } +} + 
+static void attach_context(VuServer *server, AioContext *ctx) +{ + VuClient *client; + int i; + QTAILQ_FOREACH(client, &server->clients, next) { + qio_channel_attach_aio_context(client->ioc, ctx); + if (client->co_trip) { + aio_co_schedule(ctx, client->co_trip); + } + for (i = 0; i < client->parent.max_queues; i++) { + if (client->kick_info[i].co) { + qio_channel_attach_aio_context(client->kick_info[i].ioc, ctx); + aio_co_schedule(ctx, client->kick_info[i].co); + } + } + } +} +void change_vu_context(AioContext *ctx, VuServer *server) +{ + AioContext *acquire_ctx = ctx ? ctx : server->ctx; + aio_context_acquire(acquire_ctx); + server->ctx = ctx ? ctx : qemu_get_aio_context(); + if (ctx) { + attach_context(server, ctx); + } else { + detach_context(server); + } + aio_context_release(acquire_ctx); +} + + +VuServer *vhost_user_server_start(uint16_t max_queues, + char *unix_socket, + AioContext *ctx, + void *server_ptr, + void *device_panic_notifier, + const VuDevIface *vu_iface, + Error **errp) +{ + VuServer *server = g_new0(VuServer, 1); + server->ptr_in_device = server_ptr; + server->listener = qio_net_listener_new(); + SocketAddress addr = {}; + addr.u.q_unix.path = (char *) unix_socket; + addr.type = SOCKET_ADDRESS_TYPE_UNIX; + if (qio_net_listener_open_sync(server->listener, &addr, 1, errp) < 0) { + goto error; + } + + qio_net_listener_set_name(server->listener, "vhost-user-backend-listener"); + + server->vu_iface = vu_iface; + server->max_queues = max_queues; + server->ctx = ctx; + qio_net_listener_set_client_func(server->listener, + vu_accept, + server, + NULL); + + QTAILQ_INIT(&server->clients); + return server; +error: + g_free(server); + return NULL; +} diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h new file mode 100644 index 0000000000..2172d67937 --- /dev/null +++ b/util/vhost-user-server.h @@ -0,0 +1,56 @@ +#include "io/channel-socket.h" +#include "io/channel-file.h" +#include "io/net-listener.h" +#include 
"contrib/libvhost-user/libvhost-user.h" +#include "standard-headers/linux/virtio_blk.h" +#include "qemu/error-report.h" + +typedef struct VuClient VuClient; + +typedef struct VuServer { + QIONetListener *listener; + AioContext *ctx; + QTAILQ_HEAD(, VuClient) clients; + void (*device_panic_notifier)(struct VuClient *client) ; + int max_queues; + const VuDevIface *vu_iface; + /* + * @ptr_in_device: VuServer pointer memory location in vhost-user device + * struct, so later container_of can be used to get device destruct + */ + void *ptr_in_device; + bool close; +} VuServer; + +typedef struct kick_info { + VuDev *vu_dev; + int fd; /*kick fd*/ + long index; /*queue index*/ + QIOChannel *ioc; /*I/O channel for kick fd*/ + QIOChannelFile *fioc; /*underlying data channel for kick fd*/ + Coroutine *co; +} kick_info; + +struct VuClient { + VuDev parent; + VuServer *server; + QIOChannel *ioc; /* The current I/O channel */ + QIOChannelSocket *sioc; /* The underlying data channel */ + Coroutine *co_trip; + struct kick_info *kick_info; + QTAILQ_ENTRY(VuClient) next; + bool closed; +}; + + +VuServer *vhost_user_server_start(uint16_t max_queues, + char *unix_socket, + AioContext *ctx, + void *server_ptr, + void *device_panic_notifier, + const VuDevIface *vu_iface, + Error **errp); + +void vhost_user_server_stop(VuServer *server); + +void change_vu_context(AioContext *ctx, VuServer *server); From patchwork Wed Feb 12 09:51:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coiby Xu X-Patchwork-Id: 1236741 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) 
header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=genqD3ps; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48HZjw634Gz9s1x for ; Wed, 12 Feb 2020 20:54:28 +1100 (AEDT) Received: from localhost ([::1]:34550 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1oiw-0006IB-OL for incoming@patchwork.ozlabs.org; Wed, 12 Feb 2020 04:54:26 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:51198) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1ogm-0003uu-55 for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j1ogj-0006OO-CY for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:12 -0500 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:40329) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1j1ogj-0006NA-2n for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:09 -0500 Received: by mail-pl1-x644.google.com with SMTP id y1so765669plp.7 for ; Wed, 12 Feb 2020 01:52:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=A7P1toJC9+7FjqzkLd4TC50F+zVWyY9EEQQLVrVA31U=; b=genqD3pszVIo6++mH3TSs331foPTvFjYWAyDmQDX3wUlsUwe9uKXAXWy98bPArJ3Ij 6DT4E7v6k8BHuaSnMz18fs5zwdAoWrrTQahsSJNHl89XfhZ4ifwsbUqYOisT5lN06/zf HlxiggG0FusABUdSFH8QH0IrTvSiqQydXqc+slP2gxSGkaqcjOoE4qc1XIBw/GLA12Pj VA09Y1MwsJxec36B7fzBi9QonCHTSEDuAv+uy26qcRfcxEqZeBVAjiVi8XOf5fqtliTo 
tXP+aZPC5OBDkqURv4/UsMAAtnroEgZmWtA15NlVjmfpMGfrKuJTEFYcLpiHJTVRTc3M oTZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=A7P1toJC9+7FjqzkLd4TC50F+zVWyY9EEQQLVrVA31U=; b=nAXYY6RKjq14PH+qUVFfP+xVEptIVBtBf69sRZ1vBjpmhDppkyGKw9yjScURCtpa0R NS1mpDBSnMBAMny+vuVJi3eGNLrK9KROL94e3u5W9XhoKVLpElVVw6spXduI+i7XK8se N0ldVRqk34GkfpwAxcMegwIy6hKTvXh4F2oUSQggeb5GpHON6JlvehnK4DDYVFk5fDhF 9jsOz351Q7bvJ/WoI1cqKdGD3AZPZmN+xLUUkDSNxGCtjTklg1iP9HjiY8cg9ub4kKC6 65loEaou3FHTYK95uAynmsPc0r3vVvEOloRaHkyMpJL6zx9j0phNFQr1wiSSMDiDPbKh lTRQ== X-Gm-Message-State: APjAAAUnxtc1rJesvrhs3qml6sy/jDya1b/UDvXObKAwKKGf7xJ/9P52 G5Wvs6MFnGnBa+dBRrXpslHGHIZM X-Google-Smtp-Source: APXvYqy1McGz0OQgKKGhMb9sW6bLXtQNxaAyNq0JNWQvIMsjqTUvXMbMF76Pn8tCfYz6zPewZBOA/Q== X-Received: by 2002:a17:902:6bc3:: with SMTP id m3mr7711455plt.27.1581501127433; Wed, 12 Feb 2020 01:52:07 -0800 (PST) Received: from localhost.localdomain ([2402:9e80:0:1000::1:4ad5]) by smtp.googlemail.com with ESMTPSA id o19sm6298595pjr.2.2020.02.12.01.52.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Feb 2020 01:52:06 -0800 (PST) From: Coiby Xu To: qemu-devel@nongnu.org Subject: [PATCH v3 3/5] vhost-user block device backend server Date: Wed, 12 Feb 2020 17:51:35 +0800 Message-Id: <20200212095137.7977-4-coiby.xu@gmail.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200212095137.7977-1-coiby.xu@gmail.com> References: <20200212095137.7977-1-coiby.xu@gmail.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::644 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, bharatlkmlkvm@gmail.com, Coiby Xu , stefanha@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" By making use of libvhost, multiple block device drives can be exported and each drive can serve multiple clients simultaneously. Since vhost-user-server needs a block drive to be created first, delay the creation of this object. Signed-off-by: Coiby Xu --- Makefile.target | 1 + backends/Makefile.objs | 2 + backends/vhost-user-blk-server.c | 716 +++++++++++++++++++++++++++++++ backends/vhost-user-blk-server.h | 21 + vl.c | 4 + 5 files changed, 744 insertions(+) create mode 100644 backends/vhost-user-blk-server.c create mode 100644 backends/vhost-user-blk-server.h diff --git a/Makefile.target b/Makefile.target index 6e61f607b1..8c6c01eb3a 100644 --- a/Makefile.target +++ b/Makefile.target @@ -159,6 +159,7 @@ obj-y += monitor/ obj-y += qapi/ obj-y += memory.o obj-y += memory_mapping.o +obj-$(CONFIG_LINUX) += ../contrib/libvhost-user/libvhost-user.o obj-y += migration/ram.o LIBS := $(libs_softmmu) $(LIBS) diff --git a/backends/Makefile.objs b/backends/Makefile.objs index 28a847cd57..4e7be731e0 100644 --- a/backends/Makefile.objs +++ b/backends/Makefile.objs @@ -14,6 +14,8 @@ common-obj-y += cryptodev-vhost.o common-obj-$(CONFIG_VHOST_CRYPTO) += cryptodev-vhost-user.o endif +common-obj-$(CONFIG_LINUX) += vhost-user-blk-server.o + common-obj-$(call land,$(CONFIG_VHOST_USER),$(CONFIG_VIRTIO)) += vhost-user.o common-obj-$(CONFIG_LINUX) += hostmem-memfd.o diff --git a/backends/vhost-user-blk-server.c b/backends/vhost-user-blk-server.c new file mode 100644 index 0000000000..7293ad87be --- /dev/null +++ b/backends/vhost-user-blk-server.c @@ -0,0 +1,716 @@ +/* + * Sharing QEMU block devices via vhost-user 
protocol + * + * Author: Coiby Xu + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. + */ +#include "qemu/osdep.h" +#include "block/block.h" +#include "vhost-user-blk-server.h" +#include "qapi/error.h" +#include "qom/object_interfaces.h" +#include "sysemu/block-backend.h" + +enum { + VHOST_USER_BLK_MAX_QUEUES = 1, +}; +struct virtio_blk_inhdr { + unsigned char status; +}; + +static QTAILQ_HEAD(, VuBlockDev) vu_block_devs = + QTAILQ_HEAD_INITIALIZER(vu_block_devs); + + +typedef struct VuBlockReq { + VuVirtqElement *elem; + int64_t sector_num; + size_t size; + struct virtio_blk_inhdr *in; + struct virtio_blk_outhdr out; + VuClient *client; + struct VuVirtq *vq; +} VuBlockReq; + + +static void vu_block_req_complete(VuBlockReq *req) +{ + VuDev *vu_dev = &req->client->parent; + + /* IO size with 1 extra status byte */ + vu_queue_push(vu_dev, req->vq, req->elem, + req->size + 1); + vu_queue_notify(vu_dev, req->vq); + + if (req->elem) { + free(req->elem); + } + + g_free(req); +} + +static VuBlockDev *get_vu_block_device_by_client(VuClient *client) +{ + return container_of(client->server->ptr_in_device, VuBlockDev, vu_server); +} + +static int coroutine_fn +vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov, + uint32_t iovcnt, uint32_t type) +{ + struct virtio_blk_discard_write_zeroes desc; + ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc)); + if (unlikely(size != sizeof(desc))) { + error_report("Invalid size %zd, expect %zu", size, sizeof(desc)); + return -1; + } + + VuBlockDev *vdev_blk = get_vu_block_device_by_client(req->client); + uint64_t range[2] = { le64toh(desc.sector) << 9, + le32toh(desc.num_sectors) << 9 }; + if (type == VIRTIO_BLK_T_DISCARD) { + if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) { + return 0; + } + } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) { + if (blk_co_pwrite_zeroes(vdev_blk->backend, + range[0], range[1], 0) ==
0) { + return 0; + } + } + + return -1; +} + + +static void coroutine_fn vu_block_flush(VuBlockReq *req) +{ + VuBlockDev *vdev_blk = get_vu_block_device_by_client(req->client); + BlockBackend *backend = vdev_blk->backend; + blk_co_flush(backend); +} + + +static int coroutine_fn vu_block_virtio_process_req(VuClient *client, + VuVirtq *vq) +{ + VuDev *vu_dev = &client->parent; + VuVirtqElement *elem; + uint32_t type; + VuBlockReq *req; + + VuBlockDev *vdev_blk = get_vu_block_device_by_client(client); + BlockBackend *backend = vdev_blk->backend; + elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) + + sizeof(VuBlockReq)); + if (!elem) { + return -1; + } + + struct iovec *in_iov = elem->in_sg; + struct iovec *out_iov = elem->out_sg; + unsigned in_num = elem->in_num; + unsigned out_num = elem->out_num; + /* refer to hw/block/virtio_blk.c */ + if (elem->out_num < 1 || elem->in_num < 1) { + error_report("virtio-blk request missing headers"); + free(elem); + return -1; + } + + req = g_new0(VuBlockReq, 1); + req->client = client; + req->vq = vq; + req->elem = elem; + + if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out, + sizeof(req->out)) != sizeof(req->out))) { + error_report("virtio-blk request outhdr too short"); + goto err; + } + + iov_discard_front(&out_iov, &out_num, sizeof(req->out)); + + if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) { + error_report("virtio-blk request inhdr too short"); + goto err; + } + + /* We always touch the last byte, so just see how big in_iov is. 
*/ + req->in = (void *)in_iov[in_num - 1].iov_base + + in_iov[in_num - 1].iov_len + - sizeof(struct virtio_blk_inhdr); + iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr)); + + + type = le32toh(req->out.type); + switch (type & ~VIRTIO_BLK_T_BARRIER) { + case VIRTIO_BLK_T_IN: + case VIRTIO_BLK_T_OUT: { + ssize_t ret = 0; + bool is_write = type & VIRTIO_BLK_T_OUT; + req->sector_num = le64toh(req->out.sector); + + int64_t offset = req->sector_num * vdev_blk->blk_size; + QEMUIOVector *qiov = g_new0(QEMUIOVector, 1); + if (is_write) { + qemu_iovec_init_external(qiov, out_iov, out_num); + ret = blk_co_pwritev(backend, offset, qiov->size, + qiov, 0); + } else { + qemu_iovec_init_external(qiov, in_iov, in_num); + ret = blk_co_preadv(backend, offset, qiov->size, + qiov, 0); + } + aio_wait_kick(); + if (ret >= 0) { + req->in->status = VIRTIO_BLK_S_OK; + } else { + req->in->status = VIRTIO_BLK_S_IOERR; + } + vu_block_req_complete(req); + break; + } + case VIRTIO_BLK_T_FLUSH: + vu_block_flush(req); + req->in->status = VIRTIO_BLK_S_OK; + vu_block_req_complete(req); + break; + case VIRTIO_BLK_T_GET_ID: { + size_t size = MIN(iov_size(&elem->in_sg[0], in_num), + VIRTIO_BLK_ID_BYTES); + snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk_server"); + req->in->status = VIRTIO_BLK_S_OK; + req->size = elem->in_sg[0].iov_len; + vu_block_req_complete(req); + break; + } + case VIRTIO_BLK_T_DISCARD: + case VIRTIO_BLK_T_WRITE_ZEROES: { + int rc; + rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1], + out_num, type); + if (rc == 0) { + req->in->status = VIRTIO_BLK_S_OK; + } else { + req->in->status = VIRTIO_BLK_S_IOERR; + } + vu_block_req_complete(req); + break; + } + default: + req->in->status = VIRTIO_BLK_S_UNSUPP; + vu_block_req_complete(req); + break; + } + + return 0; + +err: + free(elem); + g_free(req); + return -1; +} + + +static void vu_block_process_vq(VuDev *vu_dev, int idx) +{ + VuClient *client; + VuVirtq *vq; + int ret; + + client = 
container_of(vu_dev, VuClient, parent); + assert(client); + + vq = vu_get_queue(vu_dev, idx); + assert(vq); + + while (1) { + ret = vu_block_virtio_process_req(client, vq); + if (ret) { + break; + } + } +} + +static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started) +{ + VuVirtq *vq; + + assert(vu_dev); + + vq = vu_get_queue(vu_dev, idx); + vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL); +} + +static uint64_t vu_block_get_features(VuDev *dev) +{ + uint64_t features; + VuClient *client = container_of(dev, VuClient, parent); + VuBlockDev *vdev_blk = get_vu_block_device_by_client(client); + features = 1ull << VIRTIO_BLK_F_SIZE_MAX | + 1ull << VIRTIO_BLK_F_SEG_MAX | + 1ull << VIRTIO_BLK_F_TOPOLOGY | + 1ull << VIRTIO_BLK_F_BLK_SIZE | + 1ull << VIRTIO_BLK_F_FLUSH | + 1ull << VIRTIO_BLK_F_DISCARD | + 1ull << VIRTIO_BLK_F_WRITE_ZEROES | + 1ull << VIRTIO_BLK_F_CONFIG_WCE | + 1ull << VIRTIO_F_VERSION_1 | + 1ull << VIRTIO_RING_F_INDIRECT_DESC | + 1ull << VIRTIO_RING_F_EVENT_IDX | + 1ull << VHOST_USER_F_PROTOCOL_FEATURES; + + if (!vdev_blk->writable) { + features |= 1ull << VIRTIO_BLK_F_RO; + } + + return features; +} + +static uint64_t vu_block_get_protocol_features(VuDev *dev) +{ + return 1ull << VHOST_USER_PROTOCOL_F_CONFIG | + 1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD; +} + +static int +vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len) +{ + VuClient *client = container_of(vu_dev, VuClient, parent); + VuBlockDev *vdev_blk = get_vu_block_device_by_client(client); + memcpy(config, &vdev_blk->blkcfg, len); + + return 0; +} + +static int +vu_block_set_config(VuDev *vu_dev, const uint8_t *data, + uint32_t offset, uint32_t size, uint32_t flags) +{ + VuClient *client = container_of(vu_dev, VuClient, parent); + VuBlockDev *vdev_blk = get_vu_block_device_by_client(client); + uint8_t wce; + + /* don't support live migration */ + if (flags != VHOST_SET_CONFIG_TYPE_MASTER) { + return -1; + } + + + if (offset != 
offsetof(struct virtio_blk_config, wce) || + size != 1) { + return -1; + } + + wce = *data; + if (wce == vdev_blk->blkcfg.wce) { + /* Nothing to do: same as the old configuration */ + return 0; + } + + vdev_blk->blkcfg.wce = wce; + blk_set_enable_write_cache(vdev_blk->backend, wce); + return 0; +} + + +/* + * When the client disconnects, it sends a VHOST_USER_NONE request + * and vu_process_message will simply call exit, which causes the VM + * to exit abruptly. + * To avoid this issue, process the VHOST_USER_NONE request ahead + * of vu_process_message. + * + */ +static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply) +{ + if (vmsg->request == VHOST_USER_NONE) { + dev->panic(dev, "disconnect"); + return true; + } + return false; +} + + +static const VuDevIface vu_block_iface = { + .get_features = vu_block_get_features, + .queue_set_started = vu_block_queue_set_started, + .get_protocol_features = vu_block_get_protocol_features, + .get_config = vu_block_get_config, + .set_config = vu_block_set_config, + .process_msg = vu_block_process_msg, +}; + + +static void vu_block_free(VuBlockDev *vu_block_dev) +{ + if (!vu_block_dev) { + return; + } + + blk_unref(vu_block_dev->backend); + + if (vu_block_dev->next.tqe_circ.tql_prev) { + /* + * if vu_block_dev->next.tqe_circ.tql_prev = null, + * vu_block_dev hasn't been inserted into the queue and + * vu_block_free is called by obj->instance_finalize.
+ */ + QTAILQ_REMOVE(&vu_block_devs, vu_block_dev, next); + } +} + +static void blk_aio_attached(AioContext *ctx, void *opaque) +{ + VuBlockDev *vub_dev = opaque; + aio_context_acquire(ctx); + change_vu_context(ctx, vub_dev->vu_server); + aio_context_release(ctx); +} + + +static void blk_aio_detach(void *opaque) +{ + VuBlockDev *vub_dev = opaque; + AioContext *ctx = vub_dev->vu_server->ctx; + aio_context_acquire(ctx); + change_vu_context(NULL, vub_dev->vu_server); + aio_context_release(ctx); +} + + + +static void +vu_block_initialize_config(BlockDriverState *bs, + struct virtio_blk_config *config, uint32_t blk_size) +{ + config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; + config->blk_size = blk_size; + config->size_max = 0; + config->seg_max = 128 - 2; + config->min_io_size = 1; + config->opt_io_size = 1; + config->num_queues = VHOST_USER_BLK_MAX_QUEUES; + config->max_discard_sectors = 32768; + config->max_discard_seg = 1; + config->discard_sector_alignment = config->blk_size >> 9; + config->max_write_zeroes_sectors = 32768; + config->max_write_zeroes_seg = 1; +} + + +static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp) +{ + + BlockBackend *blk; + Error *local_error = NULL; + const char *node_name = vu_block_device->node_name; + bool writable = vu_block_device->writable; + /* + * Don't allow resize while the vhost user server is running, + * otherwise we don't care what happens with the node. 
+ */ + uint64_t perm = BLK_PERM_CONSISTENT_READ; + int ret; + + AioContext *ctx; + + BlockDriverState *bs = bdrv_lookup_bs(node_name, + node_name, + &local_error); + + if (!bs) { + error_propagate(errp, local_error); + return NULL; + } + + if (bdrv_is_read_only(bs)) { + writable = false; + } + + if (writable) { + perm |= BLK_PERM_WRITE; + } + + ctx = bdrv_get_aio_context(bs); + aio_context_acquire(ctx); + bdrv_invalidate_cache(bs, NULL); + aio_context_release(ctx); + + blk = blk_new(bdrv_get_aio_context(bs), perm, + BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED | + BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD); + ret = blk_insert_bs(blk, bs, errp); + + if (ret < 0) { + goto fail; + } + + blk_set_enable_write_cache(blk, false); + + blk_set_allow_aio_context_change(blk, true); + + vu_block_device->blkcfg.wce = 0; + vu_block_device->backend = blk; + if (!vu_block_device->blk_size) { + vu_block_device->blk_size = BDRV_SECTOR_SIZE; + } + vu_block_device->blkcfg.blk_size = vu_block_device->blk_size; + blk_set_guest_block_size(blk, vu_block_device->blk_size); + vu_block_initialize_config(bs, &vu_block_device->blkcfg, + vu_block_device->blk_size); + return vu_block_device; + +fail: + blk_unref(blk); + return NULL; +} + +static void vhost_user_blk_server_free(VuBlockDev *vu_block_device) +{ + if (!vu_block_device) { + return; + } + vhost_user_server_stop(vu_block_device->vu_server); + vu_block_free(vu_block_device); + +} + +/* + * An exported drive can serve multiple clients simultaneously, + * thus there is no need to export the same drive twice.
+ * + */ +static VuBlockDev *vu_block_dev_find(const char *node_name) +{ + VuBlockDev *vu_block_device; + QTAILQ_FOREACH(vu_block_device, &vu_block_devs, next) { + if (strcmp(node_name, vu_block_device->node_name) == 0) { + return vu_block_device; + } + } + + return NULL; +} + + +static VuBlockDev *vu_block_dev_find_by_unix_socket(const char *unix_socket) +{ + VuBlockDev *vu_block_device; + QTAILQ_FOREACH(vu_block_device, &vu_block_devs, next) { + if (strcmp(unix_socket, vu_block_device->unix_socket) == 0) { + return vu_block_device; + } + } + + return NULL; +} + + +static void device_panic_notifier(VuClient *client) +{ + VuBlockDev *vdev_blk = get_vu_block_device_by_client(client); + if (vdev_blk->exit_when_panic) { + vdev_blk->vu_server->close = true; + } +} + +static void vhost_user_blk_server_start(VuBlockDev *vu_block_device, + Error **errp) +{ + + const char *name = vu_block_device->node_name; + char *unix_socket = vu_block_device->unix_socket; + if (vu_block_dev_find(name) || + vu_block_dev_find_by_unix_socket(unix_socket)) { + error_setg(errp, "Vhost user server with name '%s' or " + "with socket_path '%s' has already been started", + name, unix_socket); + return; + } + + if (!vu_block_init(vu_block_device, errp)) { + return; + } + + + AioContext *ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend)); + VuServer *vu_server = vhost_user_server_start(VHOST_USER_BLK_MAX_QUEUES, + unix_socket, + ctx, + &vu_block_device->vu_server, + device_panic_notifier, + &vu_block_iface, + errp); + + if (!vu_server) { + goto error; + } + vu_block_device->vu_server = vu_server; + QTAILQ_INSERT_TAIL(&vu_block_devs, vu_block_device, next); + blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached, + blk_aio_detach, vu_block_device); + return; + + error: + vu_block_free(vu_block_device); +} + +static void vu_set_node_name(Object *obj, const char *value, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + + if (vus->node_name) { + 
error_setg(errp, "node_name property already set"); + return; + } + + vus->node_name = g_strdup(value); +} + +static char *vu_get_node_name(Object *obj, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + return g_strdup(vus->node_name); +} + + +static void vu_set_unix_socket(Object *obj, const char *value, + Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + + if (vus->unix_socket) { + error_setg(errp, "unix_socket property already set"); + return; + } + + vus->unix_socket = g_strdup(value); +} + +static char *vu_get_unix_socket(Object *obj, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + return g_strdup(vus->unix_socket); +} + +static bool vu_get_block_writable(Object *obj, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + return vus->writable; +} + +static void vu_set_block_writable(Object *obj, bool value, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + + vus->writable = value; +} + +static void vu_get_blk_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + uint32_t value = vus->blk_size; + + visit_type_uint32(v, name, &value, errp); +} + +static void vu_set_blk_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj); + + Error *local_err = NULL; + uint32_t value; + + visit_type_uint32(v, name, &value, &local_err); + if (local_err) { + goto out; + } + if (value != BDRV_SECTOR_SIZE && value != 4096) { + error_setg(&local_err, + "Property '%s.%s' can only take value 512 or 4096", + object_get_typename(obj), name); + goto out; + } + + vus->blk_size = value; + return; + +out: + error_propagate(errp, local_err); +} + +static void vhost_user_blk_server_instance_init(Object *obj) +{ + + object_property_add_bool(obj, "writable", + vu_get_block_writable, + vu_set_block_writable, NULL); + + object_property_add_str(obj,
"node-name", + vu_get_node_name, + vu_set_node_name, NULL); + + object_property_add_str(obj, "unix-socket", + vu_get_unix_socket, + vu_set_unix_socket, NULL); + + object_property_add(obj, "blk-size", "uint32", + vu_get_blk_size, vu_set_blk_size, + NULL, NULL, NULL); + +} + +static void vhost_user_blk_server_instance_finalize(Object *obj) +{ + VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj); + + blk_remove_aio_context_notifier(vub->backend, blk_aio_attached, + blk_aio_detach, vub); + vhost_user_blk_server_free(vub); +} + +static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp) +{ + Error *local_error = NULL; + VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj); + + vhost_user_blk_server_start(vub, &local_error); + + if (local_error) { + error_propagate(errp, local_error); + return; + } +} + +static void vhost_user_blk_server_class_init(ObjectClass *klass, + void *class_data) +{ + UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass); + ucc->complete = vhost_user_blk_server_complete; +} + +static const TypeInfo vhost_user_blk_server_info = { + .name = TYPE_VHOST_USER_BLK_SERVER, + .parent = TYPE_OBJECT, + .instance_size = sizeof(VuBlockDev), + .instance_init = vhost_user_blk_server_instance_init, + .instance_finalize = vhost_user_blk_server_instance_finalize, + .class_init = vhost_user_blk_server_class_init, + .interfaces = (InterfaceInfo[]) { + {TYPE_USER_CREATABLE}, + {} + }, +}; + +static void vhost_user_blk_server_register_types(void) +{ + type_register_static(&vhost_user_blk_server_info); +} + +type_init(vhost_user_blk_server_register_types) diff --git a/backends/vhost-user-blk-server.h b/backends/vhost-user-blk-server.h new file mode 100644 index 0000000000..e572ae3801 --- /dev/null +++ b/backends/vhost-user-blk-server.h @@ -0,0 +1,21 @@ +#include "util/vhost-user-server.h" +typedef struct VuBlockDev VuBlockDev; +#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server" +#define VHOST_USER_BLK_SERVER(obj) \ + OBJECT_CHECK(VuBlockDev, obj, 
TYPE_VHOST_USER_BLK_SERVER) + +/* vhost user block device */ +struct VuBlockDev { + Object parent_obj; + char *node_name; + char *unix_socket; + bool exit_when_panic; + AioContext *ctx; + VuServer *vu_server; + uint32_t blk_size; + BlockBackend *backend; + QIOChannelSocket *sioc; + QTAILQ_ENTRY(VuBlockDev) next; + struct virtio_blk_config blkcfg; + bool writable; +}; diff --git a/vl.c b/vl.c index 7dcb0879c4..d0d593c0d3 100644 --- a/vl.c +++ b/vl.c @@ -2572,6 +2572,10 @@ static bool object_create_initial(const char *type, QemuOpts *opts) } #endif + /* Reason: vhost-user-server property "name" */ + if (g_str_equal(type, "vhost-user-blk-server")) { + return false; + } /* * Reason: filter-* property "netdev" etc. */ From patchwork Wed Feb 12 09:51:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coiby Xu X-Patchwork-Id: 1236742 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=FwSxI2uV; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48HZkj5xh7z9s3x for ; Wed, 12 Feb 2020 20:55:09 +1100 (AEDT) Received: from localhost ([::1]:34558 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1ojb-0007FI-Ou for incoming@patchwork.ozlabs.org; 
Wed, 12 Feb 2020 04:55:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:51206) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1ogo-0003yA-7z for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:16 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j1ogm-0006RK-EE for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:14 -0500 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:38381) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1j1ogm-0006Qc-65 for qemu-devel@nongnu.org; Wed, 12 Feb 2020 04:52:12 -0500 Received: by mail-pl1-x641.google.com with SMTP id t6so767377plj.5 for ; Wed, 12 Feb 2020 01:52:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WNV4/ZVw8Apk2UrD+URUi38RDdE8LaZy5lWOVQracgk=; b=FwSxI2uVBkoTFc5Zx5BvG9zjs1dJnyMIM+m8deHmQtyJ5cGgxbl0Gr1AWz1ExstIYn imFjNuvI9HJXZZlGU3mM7NYrl3Dzk1mPWcJ5OJbO9HPKmgFAEJkj8Z2d4lsYOob1wW8v +gmHe0esWXhgyDn9rn1aGsGuCqB+2p76MIrSUQjkRxyeHtvwGtgO0U7+BPSdgzWFX3pH byGeIJAGgv0riHGnhzWbbDKeQLoTvY6UHXCMQ+6T57Yd8aW1w6cpOB8Rk8g0xKPvrpUn 2rnjX+5hGB/YRoVETAXv92zdp00mNoctzP9Szggwc+n5TE0Kn1TFV2lPrh9pyaVN30bJ dT8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WNV4/ZVw8Apk2UrD+URUi38RDdE8LaZy5lWOVQracgk=; b=jVOX+LWzAOxa9ZzbF9r8g3uz1qMgz8mDnDHSvI0fti8UnOnfBTutfC2QJJrxuMpDNl 87Nj726T/QFWPTxtb9aw1ypnD1ZhfXrpEdxVBB5mfDcyLSjJTPvNtoAJeV+jHRTegE07 DZo5H/MIAALi7Ojmo0K66pJSKlJTuKaIwV5st0+cAWthrZWFWavvU3xA2JG9vq6BB/LJ xOQozVTjPhlWCfqxdldjwYLPCU5TAsvBJth6XK514Yfl6UR85W43u2x2GCSAk5iFbC/i Kb2WJtNKzmHOk3jgS7ILUuypU9cwU0EwJ3yUDUsChVX2FkukcCuCldX39C5aptgvn1wD NAVQ== X-Gm-Message-State: 
APjAAAVYCZdG6GDPwAWVaB5lfjk4pRXU7LSvbAR0++YFheT7EmH9wFpi 3O26xU196zNPYx25w6I+xKZWeDbK X-Google-Smtp-Source: APXvYqxPmDbNW70rsOiVqPUrFzcEzrz/at67WAmHSfK12c7EYOr3CykeKDqjE+1YUhHDaYT5IR2bFQ== X-Received: by 2002:a17:902:ba8a:: with SMTP id k10mr7772211pls.333.1581501130800; Wed, 12 Feb 2020 01:52:10 -0800 (PST) Received: from localhost.localdomain ([2402:9e80:0:1000::1:4ad5]) by smtp.googlemail.com with ESMTPSA id o19sm6298595pjr.2.2020.02.12.01.52.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Feb 2020 01:52:10 -0800 (PST) From: Coiby Xu To: qemu-devel@nongnu.org Subject: [PATCH v3 4/5] a standone-alone tool to directly share disk image file via vhost-user protocol Date: Wed, 12 Feb 2020 17:51:36 +0800 Message-Id: <20200212095137.7977-5-coiby.xu@gmail.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200212095137.7977-1-coiby.xu@gmail.com> References: <20200212095137.7977-1-coiby.xu@gmail.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::641 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, bharatlkmlkvm@gmail.com, Coiby Xu , stefanha@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" vhost-user-blk could have served as the vhost-user backend, but it only supports raw files and does not support the VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES operations on them (ioctl(fd, BLKDISCARD) is only valid for real block devices). In the future, Kevin's qemu-storage-daemon will replace this tool.
Signed-off-by: Coiby Xu --- Makefile | 4 + configure | 3 + qemu-vu.c | 252 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 259 insertions(+) create mode 100644 qemu-vu.c -- 2.25.0 diff --git a/Makefile b/Makefile index f0e1a2fc1d..0bfd2f1ddd 100644 --- a/Makefile +++ b/Makefile @@ -572,6 +572,10 @@ qemu-img.o: qemu-img-cmds.h qemu-img$(EXESUF): qemu-img.o $(authz-obj-y) $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) qemu-nbd$(EXESUF): qemu-nbd.o $(authz-obj-y) $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) + +ifdef CONFIG_LINUX +qemu-vu$(EXESUF): qemu-vu.o $(authz-obj-y) $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) libvhost-user.a +endif qemu-io$(EXESUF): qemu-io.o $(authz-obj-y) $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o $(COMMON_LDADDS) diff --git a/configure b/configure index 115dc38085..e87c9a5587 100755 --- a/configure +++ b/configure @@ -6217,6 +6217,9 @@ if test "$want_tools" = "yes" ; then if [ "$linux" = "yes" -o "$bsd" = "yes" -o "$solaris" = "yes" ] ; then tools="qemu-nbd\$(EXESUF) $tools" fi + if [ "$linux" = "yes" ] ; then + tools="qemu-vu\$(EXESUF) $tools" + fi if [ "$ivshmem" = "yes" ]; then tools="ivshmem-client\$(EXESUF) ivshmem-server\$(EXESUF) $tools" fi diff --git a/qemu-vu.c b/qemu-vu.c new file mode 100644 index 0000000000..dd1032b205 --- /dev/null +++ b/qemu-vu.c @@ -0,0 +1,252 @@ +/* + * Copyright (C) 2020 Coiby Xu + * + * standone-alone vhost-user-blk device server backend + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include +#include +#include "backends/vhost-user-blk-server.h" +#include "block/block_int.h" +#include "io/net-listener.h" +#include "qapi/error.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qstring.h" +#include "qemu/config-file.h" +#include "qemu/cutils.h" +#include "qemu/main-loop.h" +#include "qemu/module.h" +#include "qemu/option.h" +#include "qemu-common.h" +#include "qemu-version.h" +#include "qom/object_interfaces.h" +#include "sysemu/block-backend.h" +#define QEMU_VU_OPT_CACHE 256 +#define QEMU_VU_OPT_AIO 257 +#define QEMU_VU_OBJ_ID "vu_disk" +static QemuOptsList qemu_object_opts = { + .name = "object", + .implied_opt_name = "qom-type", + .head = QTAILQ_HEAD_INITIALIZER(qemu_object_opts.head), + .desc = { + { } + }, +}; +static char *srcpath; + +static void usage(const char *name) +{ + (printf) ( +"Usage: %s [OPTIONS] FILE\n" +" or: %s -L [OPTIONS]\n" +"QEMU Vhost-user Server Utility\n" +"\n" +" -h, --help display this help and exit\n" +" -V, --version output version information and exit\n" +"\n" +"Connection properties:\n" +" -k, --socket=PATH path to the unix socket\n" +"\n" +"General purpose options:\n" +" -e, --exit-panic When the panic callback is called, the program\n" +" will exit.
Useful for make check-qtest.\n" +"\n" +"Block device options:\n" +" -f, --format=FORMAT set image format (raw, qcow2, ...)\n" +" -r, --read-only export read-only\n" +" -n, --nocache disable host cache\n" +" --cache=MODE set cache mode (none, writeback, ...)\n" +" --aio=MODE set AIO mode (native or threads)\n" +"\n" +QEMU_HELP_BOTTOM "\n" + , name, name); +} + +static void version(const char *name) +{ + printf( +"%s " QEMU_FULL_VERSION "\n" +"Written by Coiby Xu, based on qemu-nbd by Anthony Liguori\n" +"\n" +QEMU_COPYRIGHT "\n" +"This is free software; see the source for copying conditions. There is NO\n" +"warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n" + , name); +} + +static VuBlockDev *vu_block_device; + +static void vus_shutdown(void) +{ + + Error *local_err = NULL; + job_cancel_sync_all(); + bdrv_close_all(); + user_creatable_del(QEMU_VU_OBJ_ID, &local_err); +} + +int main(int argc, char **argv) +{ + BlockBackend *blk; + BlockDriverState *bs; + bool readonly = false; + char *sockpath = NULL; + const char *sopt = "hVrnvek:f:"; + struct option lopt[] = { + { "help", no_argument, NULL, 'h' }, + { "version", no_argument, NULL, 'V' }, + { "exit-panic", no_argument, NULL, 'e' }, + { "socket", required_argument, NULL, 'k' }, + { "read-only", no_argument, NULL, 'r' }, + { "nocache", no_argument, NULL, 'n' }, + { "cache", required_argument, NULL, QEMU_VU_OPT_CACHE }, + { "aio", required_argument, NULL, QEMU_VU_OPT_AIO }, + { "format", required_argument, NULL, 'f' }, + { NULL, 0, NULL, 0 } + }; + int ch; + int opt_ind = 0; + int flags = BDRV_O_RDWR; + bool seen_cache = false; + bool seen_aio = false; + const char *fmt = NULL; + Error *local_err = NULL; + QDict *options = NULL; + bool writethrough = true; + bool exit_when_panic = false; + + error_init(argv[0]); + + module_call_init(MODULE_INIT_QOM); + qemu_init_exec_dir(argv[0]); + + while ((ch = getopt_long(argc, argv, sopt, lopt, &opt_ind)) != -1) { + switch (ch) { + case 'e': + 
exit_when_panic = true; + break; + case 'n': + optarg = (char *) "none"; + /* fallthrough */ + case QEMU_VU_OPT_CACHE: + if (seen_cache) { + error_report("-n and --cache can only be specified once"); + exit(EXIT_FAILURE); + } + seen_cache = true; + if (bdrv_parse_cache_mode(optarg, &flags, &writethrough) == -1) { + error_report("Invalid cache mode `%s'", optarg); + exit(EXIT_FAILURE); + } + break; + case QEMU_VU_OPT_AIO: + if (seen_aio) { + error_report("--aio can only be specified once"); + exit(EXIT_FAILURE); + } + seen_aio = true; + if (!strcmp(optarg, "native")) { + flags |= BDRV_O_NATIVE_AIO; + } else if (!strcmp(optarg, "threads")) { + /* this is the default */ + } else { + error_report("invalid aio mode `%s'", optarg); + exit(EXIT_FAILURE); + } + break; + case 'r': + readonly = true; + flags &= ~BDRV_O_RDWR; + break; + case 'k': + sockpath = optarg; + if (sockpath[0] != '/') { + error_report("socket path must be absolute"); + exit(EXIT_FAILURE); + } + break; + case 'f': + fmt = optarg; + break; + case 'V': + version(argv[0]); + exit(0); + break; + case 'h': + usage(argv[0]); + exit(0); + break; + case '?': + error_report("Try `%s --help' for more information.", argv[0]); + exit(EXIT_FAILURE); + } + } + + if ((argc - optind) != 1) { + error_report("Invalid number of arguments"); + error_printf("Try `%s --help' for more information.\n", argv[0]); + exit(EXIT_FAILURE); + } + if (qemu_init_main_loop(&local_err)) { + error_report_err(local_err); + exit(EXIT_FAILURE); + } + bdrv_init(); + + srcpath = argv[optind]; + if (fmt) { + options = qdict_new(); + qdict_put_str(options, "driver", fmt); + } + blk = blk_new_open(srcpath, NULL, options, flags, &local_err); + + if (!blk) { + error_reportf_err(local_err, "Failed to blk_new_open '%s': ", + argv[optind]); + exit(EXIT_FAILURE); + } + bs = blk_bs(blk); + + char buf[300]; + snprintf(buf, 300, "%s,id=%s,node-name=%s,unix-socket=%s,writable=%s", + TYPE_VHOST_USER_BLK_SERVER, QEMU_VU_OBJ_ID, bdrv_get_node_name(bs), + 
sockpath, !readonly ? "on" : "off"); + /* While calling user_creatable_del, 'object' group is required */ + qemu_add_opts(&qemu_object_opts); + QemuOpts *opts = qemu_opts_parse(&qemu_object_opts, buf, true, &local_err); + if (local_err) { + error_report_err(local_err); + goto error; + } + + Object *obj = user_creatable_add_opts(opts, &local_err); + + if (local_err) { + error_report_err(local_err); + goto error; + } + + vu_block_device = VHOST_USER_BLK_SERVER(obj); + vu_block_device->exit_when_panic = exit_when_panic; + + do { + main_loop_wait(false); + } while (!vu_block_device->exit_when_panic || !vu_block_device->vu_server->close); + + error: + vus_shutdown(); + exit(EXIT_SUCCESS); +}
From patchwork Wed Feb 12 09:51:37 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Coiby Xu
X-Patchwork-Id: 1236744
From: Coiby Xu
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, bharatlkmlkvm@gmail.com, Coiby Xu, stefanha@redhat.com
Subject: [PATCH v3 5/5] new qTest case to test the vhost-user-blk-server
Date: Wed, 12 Feb 2020 17:51:37 +0800
Message-Id: <20200212095137.7977-6-coiby.xu@gmail.com>
X-Mailer: git-send-email 2.25.0
In-Reply-To: <20200212095137.7977-1-coiby.xu@gmail.com>
References: <20200212095137.7977-1-coiby.xu@gmail.com>

This test case contains the same tests as tests/virtio-blk-test.c, except for those that use block_resize.
Signed-off-by: Coiby Xu --- tests/libqos/vhost-user-blk.c | 126 ++++++ tests/libqos/vhost-user-blk.h | 44 +++ tests/vhost-user-blk-test.c | 694 ++++++++++++++++++++++++++++++++++ 3 files changed, 864 insertions(+) create mode 100644 tests/libqos/vhost-user-blk.c create mode 100644 tests/libqos/vhost-user-blk.h create mode 100644 tests/vhost-user-blk-test.c -- 2.25.0 diff --git a/tests/libqos/vhost-user-blk.c b/tests/libqos/vhost-user-blk.c new file mode 100644 index 0000000000..ec46b7ddb4 --- /dev/null +++ b/tests/libqos/vhost-user-blk.c @@ -0,0 +1,126 @@ +/* + * libqos driver framework + * + * Copyright (c) 2018 Emanuele Giuseppe Esposito + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License version 2 as published by the Free Software Foundation. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see + */ + +#include "qemu/osdep.h" +#include "libqtest.h" +#include "qemu/module.h" +#include "standard-headers/linux/virtio_blk.h" +#include "libqos/qgraph.h" +#include "libqos/vhost-user-blk.h" + +#define PCI_SLOT 0x04 +#define PCI_FN 0x00 + +/* virtio-blk-device */ +static void *qvhost_user_blk_get_driver(QVhostUserBlk *v_blk, + const char *interface) +{ + if (!g_strcmp0(interface, "vhost-user-blk")) { + return v_blk; + } + if (!g_strcmp0(interface, "virtio")) { + return v_blk->vdev; + } + + fprintf(stderr, "%s not present in vhost-user-blk-device\n", interface); + g_assert_not_reached(); +} + +static void *qvhost_user_blk_device_get_driver(void *object, + const char *interface) +{ + QVhostUserBlkDevice *v_blk = object; + return qvhost_user_blk_get_driver(&v_blk->blk, interface); +} + +static void *vhost_user_blk_device_create(void *virtio_dev, + QGuestAllocator *t_alloc, + void *addr) +{ + QVhostUserBlkDevice *vhost_user_blk = g_new0(QVhostUserBlkDevice, 1); + QVhostUserBlk *interface = &vhost_user_blk->blk; + + interface->vdev = virtio_dev; + + vhost_user_blk->obj.get_driver = qvhost_user_blk_device_get_driver; + + return &vhost_user_blk->obj; +} + +/* virtio-blk-pci */ +static void *qvhost_user_blk_pci_get_driver(void *object, const char *interface) +{ + QVhostUserBlkPCI *v_blk = object; + if (!g_strcmp0(interface, "pci-device")) { + return v_blk->pci_vdev.pdev; + } + return qvhost_user_blk_get_driver(&v_blk->blk, interface); +} + +static void *vhost_user_blk_pci_create(void *pci_bus, QGuestAllocator *t_alloc, + void *addr) +{ + QVhostUserBlkPCI *vhost_user_blk = g_new0(QVhostUserBlkPCI, 1); + QVhostUserBlk *interface = &vhost_user_blk->blk; + QOSGraphObject *obj = &vhost_user_blk->pci_vdev.obj; + + virtio_pci_init(&vhost_user_blk->pci_vdev, pci_bus, addr); + interface->vdev = &vhost_user_blk->pci_vdev.vdev; + + 
g_assert_cmphex(interface->vdev->device_type, ==, VIRTIO_ID_BLOCK); + + obj->get_driver = qvhost_user_blk_pci_get_driver; + + return obj; +} + +static void vhost_user_blk_register_nodes(void) +{ + /* + * FIXME: every test using these two nodes needs to setup a + * -drive,id=drive0 otherwise QEMU is not going to start. + * Therefore, we do not include "produces" edge for virtio + * and pci-device yet. + */ + + char *arg = g_strdup_printf("id=drv0,chardev=char1,addr=%x.%x", + PCI_SLOT, PCI_FN); + + QPCIAddress addr = { + .devfn = QPCI_DEVFN(PCI_SLOT, PCI_FN), + }; + + QOSGraphEdgeOptions opts = { }; + + /* virtio-blk-device */ + /** opts.extra_device_opts = "drive=drive0"; */ + qos_node_create_driver("vhost-user-blk-device", vhost_user_blk_device_create); + qos_node_consumes("vhost-user-blk-device", "virtio-bus", &opts); + qos_node_produces("vhost-user-blk-device", "vhost-user-blk"); + + /* virtio-blk-pci */ + opts.extra_device_opts = arg; + add_qpci_address(&opts, &addr); + qos_node_create_driver("vhost-user-blk-pci", vhost_user_blk_pci_create); + qos_node_consumes("vhost-user-blk-pci", "pci-bus", &opts); + qos_node_produces("vhost-user-blk-pci", "vhost-user-blk"); + + g_free(arg); +} + +libqos_init(vhost_user_blk_register_nodes); diff --git a/tests/libqos/vhost-user-blk.h b/tests/libqos/vhost-user-blk.h new file mode 100644 index 0000000000..ef4ef09cca --- /dev/null +++ b/tests/libqos/vhost-user-blk.h @@ -0,0 +1,44 @@ +/* + * libqos driver framework + * + * Copyright (c) 2018 Emanuele Giuseppe Esposito + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License version 2 as published by the Free Software Foundation. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see + */ + +#ifndef TESTS_LIBQOS_VHOST_USER_BLK_H +#define TESTS_LIBQOS_VHOST_USER_BLK_H + +#include "libqos/qgraph.h" +#include "libqos/virtio.h" +#include "libqos/virtio-pci.h" + +typedef struct QVhostUserBlk QVhostUserBlk; +typedef struct QVhostUserBlkPCI QVhostUserBlkPCI; +typedef struct QVhostUserBlkDevice QVhostUserBlkDevice; + +struct QVhostUserBlk { + QVirtioDevice *vdev; +}; + +struct QVhostUserBlkPCI { + QVirtioPCIDevice pci_vdev; + QVhostUserBlk blk; +}; + +struct QVhostUserBlkDevice { + QOSGraphObject obj; + QVhostUserBlk blk; +}; + +#endif diff --git a/tests/vhost-user-blk-test.c b/tests/vhost-user-blk-test.c new file mode 100644 index 0000000000..528f034b55 --- /dev/null +++ b/tests/vhost-user-blk-test.c @@ -0,0 +1,694 @@ +/* + * QTest testcase for VirtIO Block Device + * + * Copyright (c) 2014 SUSE LINUX Products GmbH + * Copyright (c) 2014 Marc Marí + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "libqtest-single.h" +#include "qemu/bswap.h" +#include "qemu/module.h" +#include "standard-headers/linux/virtio_blk.h" +#include "standard-headers/linux/virtio_pci.h" +#include "libqos/qgraph.h" +#include "libqos/vhost-user-blk.h" +#include "libqos/libqos-pc.h" + +/* TODO actually test the results and get rid of this */ +#define qmp_discard_response(...)
qobject_unref(qmp(__VA_ARGS__)) + +#define TEST_IMAGE_SIZE (64 * 1024 * 1024) +#define QVIRTIO_BLK_TIMEOUT_US (30 * 1000 * 1000) +#define PCI_SLOT_HP 0x06 + +typedef struct QVirtioBlkReq { + uint32_t type; + uint32_t ioprio; + uint64_t sector; + char *data; + uint8_t status; +} QVirtioBlkReq; + + +#ifdef HOST_WORDS_BIGENDIAN +static const bool host_is_big_endian = true; +#else +static const bool host_is_big_endian; /* false */ +#endif + +static inline void virtio_blk_fix_request(QVirtioDevice *d, QVirtioBlkReq *req) +{ + if (qvirtio_is_big_endian(d) != host_is_big_endian) { + req->type = bswap32(req->type); + req->ioprio = bswap32(req->ioprio); + req->sector = bswap64(req->sector); + } +} + + +static inline void virtio_blk_fix_dwz_hdr(QVirtioDevice *d, + struct virtio_blk_discard_write_zeroes *dwz_hdr) +{ + if (qvirtio_is_big_endian(d) != host_is_big_endian) { + dwz_hdr->sector = bswap64(dwz_hdr->sector); + dwz_hdr->num_sectors = bswap32(dwz_hdr->num_sectors); + dwz_hdr->flags = bswap32(dwz_hdr->flags); + } +} + +static uint64_t virtio_blk_request(QGuestAllocator *alloc, QVirtioDevice *d, + QVirtioBlkReq *req, uint64_t data_size) +{ + uint64_t addr; + uint8_t status = 0xFF; + + switch (req->type) { + case VIRTIO_BLK_T_IN: + case VIRTIO_BLK_T_OUT: + g_assert_cmpuint(data_size % 512, ==, 0); + break; + case VIRTIO_BLK_T_DISCARD: + case VIRTIO_BLK_T_WRITE_ZEROES: + g_assert_cmpuint(data_size % + sizeof(struct virtio_blk_discard_write_zeroes), ==, 0); + break; + default: + g_assert_cmpuint(data_size, ==, 0); + } + + addr = guest_alloc(alloc, sizeof(*req) + data_size); + + virtio_blk_fix_request(d, req); + + memwrite(addr, req, 16); + memwrite(addr + 16, req->data, data_size); + memwrite(addr + 16 + data_size, &status, sizeof(status)); + + return addr; +} + +/* Returns the request virtqueue so the caller can perform further tests */ +static QVirtQueue *test_basic(QVirtioDevice *dev, QGuestAllocator *alloc) +{ + QVirtioBlkReq req; + uint64_t req_addr; + uint64_t 
capacity; + uint64_t features; + uint32_t free_head; + uint8_t status; + char *data; + QTestState *qts = global_qtest; + QVirtQueue *vq; + + features = qvirtio_get_features(dev); + features = features & ~(QVIRTIO_F_BAD_FEATURE | + (1u << VIRTIO_RING_F_INDIRECT_DESC) | + (1u << VIRTIO_RING_F_EVENT_IDX) | + (1u << VIRTIO_BLK_F_SCSI)); + qvirtio_set_features(dev, features); + + capacity = qvirtio_config_readq(dev, 0); + g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512); + + vq = qvirtqueue_setup(dev, alloc, 0); + + qvirtio_set_driver_ok(dev); + + /* Write and read with 3 descriptor layout */ + /* Write request */ + req.type = VIRTIO_BLK_T_OUT; + req.ioprio = 1; + req.sector = 0; + req.data = g_malloc0(512); + strcpy(req.data, "TEST"); + + req_addr = virtio_blk_request(alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + guest_free(alloc, req_addr); + + /* Read request */ + req.type = VIRTIO_BLK_T_IN; + req.ioprio = 1; + req.sector = 0; + req.data = g_malloc0(512); + + req_addr = virtio_blk_request(alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + data = g_malloc0(512); + memread(req_addr + 16, data, 512); + g_assert_cmpstr(data, ==, "TEST"); + g_free(data); + + guest_free(alloc, req_addr); + + if (features & (1u 
<< VIRTIO_BLK_F_WRITE_ZEROES)) { + struct virtio_blk_discard_write_zeroes dwz_hdr; + void *expected; + + /* + * WRITE_ZEROES request on the same sector of previous test where + * we wrote "TEST". + */ + req.type = VIRTIO_BLK_T_WRITE_ZEROES; + req.data = (char *) &dwz_hdr; + dwz_hdr.sector = 0; + dwz_hdr.num_sectors = 1; + dwz_hdr.flags = 0; + + virtio_blk_fix_dwz_hdr(dev, &dwz_hdr); + + req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr)); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true); + qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), 1, true, + false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 16 + sizeof(dwz_hdr)); + g_assert_cmpint(status, ==, 0); + + guest_free(alloc, req_addr); + + /* Read request to check if the sector contains all zeroes */ + req.type = VIRTIO_BLK_T_IN; + req.ioprio = 1; + req.sector = 0; + req.data = g_malloc0(512); + + req_addr = virtio_blk_request(alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + data = g_malloc(512); + expected = g_malloc0(512); + memread(req_addr + 16, data, 512); + g_assert_cmpmem(data, 512, expected, 512); + g_free(expected); + g_free(data); + + guest_free(alloc, req_addr); + } + + if (features & (1u << VIRTIO_BLK_F_DISCARD)) { + struct virtio_blk_discard_write_zeroes dwz_hdr; + + req.type = VIRTIO_BLK_T_DISCARD; + req.data = (char *) &dwz_hdr; + dwz_hdr.sector = 0; + dwz_hdr.num_sectors = 1; + dwz_hdr.flags = 0; 
+ + virtio_blk_fix_dwz_hdr(dev, &dwz_hdr); + + req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr)); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true); + qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), + 1, true, false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 16 + sizeof(dwz_hdr)); + g_assert_cmpint(status, ==, 0); + + guest_free(alloc, req_addr); + } + + if (features & (1u << VIRTIO_F_ANY_LAYOUT)) { + /* Write and read with 2 descriptor layout */ + /* Write request */ + req.type = VIRTIO_BLK_T_OUT; + req.ioprio = 1; + req.sector = 1; + req.data = g_malloc0(512); + strcpy(req.data, "TEST"); + + req_addr = virtio_blk_request(alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 528, false, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + guest_free(alloc, req_addr); + + /* Read request */ + req.type = VIRTIO_BLK_T_IN; + req.ioprio = 1; + req.sector = 1; + req.data = g_malloc0(512); + + req_addr = virtio_blk_request(alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 513, true, false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + data = g_malloc0(512); + memread(req_addr + 16, data, 512); + g_assert_cmpstr(data, ==, "TEST"); + g_free(data); + + guest_free(alloc, req_addr); + } + + return vq; +} + +static void basic(void *obj, void 
*data, QGuestAllocator *t_alloc) +{ + QVhostUserBlk *blk_if = obj; + QVirtQueue *vq; + + vq = test_basic(blk_if->vdev, t_alloc); + qvirtqueue_cleanup(blk_if->vdev->bus, vq, t_alloc); + +} + +static void indirect(void *obj, void *u_data, QGuestAllocator *t_alloc) +{ + QVirtQueue *vq; + QVhostUserBlk *blk_if = obj; + QVirtioDevice *dev = blk_if->vdev; + QVirtioBlkReq req; + QVRingIndirectDesc *indirect; + uint64_t req_addr; + uint64_t capacity; + uint64_t features; + uint32_t free_head; + uint8_t status; + char *data; + QTestState *qts = global_qtest; + + features = qvirtio_get_features(dev); + g_assert_cmphex(features & (1u << VIRTIO_RING_F_INDIRECT_DESC), !=, 0); + features = features & ~(QVIRTIO_F_BAD_FEATURE | + (1u << VIRTIO_RING_F_EVENT_IDX) | + (1u << VIRTIO_BLK_F_SCSI)); + qvirtio_set_features(dev, features); + + capacity = qvirtio_config_readq(dev, 0); + g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512); + + vq = qvirtqueue_setup(dev, t_alloc, 0); + qvirtio_set_driver_ok(dev); + + /* Write request */ + req.type = VIRTIO_BLK_T_OUT; + req.ioprio = 1; + req.sector = 0; + req.data = g_malloc0(512); + strcpy(req.data, "TEST"); + + req_addr = virtio_blk_request(t_alloc, dev, &req, 512); + + g_free(req.data); + + indirect = qvring_indirect_desc_setup(qts, dev, t_alloc, 2); + qvring_indirect_desc_add(dev, qts, indirect, req_addr, 528, false); + qvring_indirect_desc_add(dev, qts, indirect, req_addr + 528, 1, true); + free_head = qvirtqueue_add_indirect(qts, vq, indirect); + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + g_free(indirect); + guest_free(t_alloc, req_addr); + + /* Read request */ + req.type = VIRTIO_BLK_T_IN; + req.ioprio = 1; + req.sector = 0; + req.data = g_malloc0(512); + strcpy(req.data, "TEST"); + + req_addr = virtio_blk_request(t_alloc, dev, &req, 512); + + g_free(req.data); + + indirect = 
qvring_indirect_desc_setup(qts, dev, t_alloc, 2); + qvring_indirect_desc_add(dev, qts, indirect, req_addr, 16, false); + qvring_indirect_desc_add(dev, qts, indirect, req_addr + 16, 513, true); + free_head = qvirtqueue_add_indirect(qts, vq, indirect); + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + data = g_malloc0(512); + memread(req_addr + 16, data, 512); + g_assert_cmpstr(data, ==, "TEST"); + g_free(data); + + g_free(indirect); + guest_free(t_alloc, req_addr); + qvirtqueue_cleanup(dev->bus, vq, t_alloc); +} + + +static void idx(void *obj, void *u_data, QGuestAllocator *t_alloc) +{ + QVirtQueue *vq; + QVhostUserBlkPCI *blk = obj; + QVirtioPCIDevice *pdev = &blk->pci_vdev; + QVirtioDevice *dev = &pdev->vdev; + QVirtioBlkReq req; + uint64_t req_addr; + uint64_t capacity; + uint64_t features; + uint32_t free_head; + uint32_t write_head; + uint32_t desc_idx; + uint8_t status; + char *data; + QOSGraphObject *blk_object = obj; + QPCIDevice *pci_dev = blk_object->get_driver(blk_object, "pci-device"); + QTestState *qts = global_qtest; + + if (qpci_check_buggy_msi(pci_dev)) { + return; + } + + qpci_msix_enable(pdev->pdev); + qvirtio_pci_set_msix_configuration_vector(pdev, t_alloc, 0); + + features = qvirtio_get_features(dev); + features = features & ~(QVIRTIO_F_BAD_FEATURE | + (1u << VIRTIO_RING_F_INDIRECT_DESC) | + (1u << VIRTIO_F_NOTIFY_ON_EMPTY) | + (1u << VIRTIO_BLK_F_SCSI)); + qvirtio_set_features(dev, features); + + capacity = qvirtio_config_readq(dev, 0); + g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512); + + vq = qvirtqueue_setup(dev, t_alloc, 0); + qvirtqueue_pci_msix_setup(pdev, (QVirtQueuePCI *)vq, t_alloc, 1); + + qvirtio_set_driver_ok(dev); + + /* Write request */ + req.type = VIRTIO_BLK_T_OUT; + req.ioprio = 1; + req.sector = 0; + req.data = g_malloc0(512); + strcpy(req.data, "TEST"); + + req_addr = 
virtio_blk_request(t_alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + qvirtqueue_kick(qts, dev, vq, free_head); + + qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + + /* Write request */ + req.type = VIRTIO_BLK_T_OUT; + req.ioprio = 1; + req.sector = 1; + req.data = g_malloc0(512); + strcpy(req.data, "TEST"); + + req_addr = virtio_blk_request(t_alloc, dev, &req, 512); + + g_free(req.data); + + /* Notify after processing the third request */ + qvirtqueue_set_used_event(qts, vq, 2); + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + qvirtqueue_kick(qts, dev, vq, free_head); + write_head = free_head; + + /* No notification expected */ + status = qvirtio_wait_status_byte_no_isr(qts, dev, + vq, req_addr + 528, + QVIRTIO_BLK_TIMEOUT_US); + g_assert_cmpint(status, ==, 0); + + guest_free(t_alloc, req_addr); + + /* Read request */ + req.type = VIRTIO_BLK_T_IN; + req.ioprio = 1; + req.sector = 1; + req.data = g_malloc0(512); + + req_addr = virtio_blk_request(t_alloc, dev, &req, 512); + + g_free(req.data); + + free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true); + qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true); + qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false); + + qvirtqueue_kick(qts, dev, vq, free_head); + + /* We get just one notification for both requests */ + qvirtio_wait_used_elem(qts, dev, vq, write_head, NULL, + QVIRTIO_BLK_TIMEOUT_US); + g_assert(qvirtqueue_get_buf(qts, vq, &desc_idx, NULL)); + g_assert_cmpint(desc_idx, ==, free_head); + + status = readb(req_addr + 528); + g_assert_cmpint(status, ==, 0); + + data = g_malloc0(512); + memread(req_addr + 16, data, 512); + g_assert_cmpstr(data, 
==, "TEST"); + g_free(data); + + guest_free(t_alloc, req_addr); + + /* End test */ + qpci_msix_disable(pdev->pdev); + + qvirtqueue_cleanup(dev->bus, vq, t_alloc); +} + +static void pci_hotplug(void *obj, void *data, QGuestAllocator *t_alloc) +{ + QVirtioPCIDevice *dev1 = obj; + QVirtioPCIDevice *dev; + QTestState *qts = dev1->pdev->bus->qts; + + /* plug secondary disk */ + qtest_qmp_device_add(qts, "vhost-user-blk-pci", "drv1", + "{'addr': %s, 'chardev': 'char2'}", + stringify(PCI_SLOT_HP) ".0"); + + dev = virtio_pci_new(dev1->pdev->bus, + &(QPCIAddress) { .devfn = QPCI_DEVFN(PCI_SLOT_HP, 0) + }); + g_assert_nonnull(dev); + g_assert_cmpint(dev->vdev.device_type, ==, VIRTIO_ID_BLOCK); + qvirtio_pci_device_disable(dev); + qos_object_destroy((QOSGraphObject *)dev); + + /* unplug secondary disk */ + qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP); +} + +/* + * Check that setting the vring addr on a non-existent virtqueue does + * not crash. + */ +static void test_nonexistent_virtqueue(void *obj, void *data, + QGuestAllocator *t_alloc) +{ + QVhostUserBlkPCI *blk = obj; + QVirtioPCIDevice *pdev = &blk->pci_vdev; + QPCIBar bar0; + QPCIDevice *dev; + + dev = qpci_device_find(pdev->pdev->bus, QPCI_DEVFN(4, 0)); + g_assert(dev != NULL); + qpci_device_enable(dev); + + bar0 = qpci_iomap(dev, 0, NULL); + + qpci_io_writeb(dev, bar0, VIRTIO_PCI_QUEUE_SEL, 2); + qpci_io_writel(dev, bar0, VIRTIO_PCI_QUEUE_PFN, 1); + + g_free(dev); +} + +static const char *qtest_qemu_vu_binary(void) +{ + const char *qemu_vu_bin; + + qemu_vu_bin = getenv("QTEST_QEMU_VU_BINARY"); + if (!qemu_vu_bin) { + fprintf(stderr, "Environment variable QTEST_QEMU_VU_BINARY required\n"); + exit(0); + } + + return qemu_vu_bin; +} + +static void drive_destroy(void *path) +{ + unlink(path); + g_free(path); + qos_invalidate_command_line(); +} + + +static char *drive_create(void) +{ + int fd, ret; + /** vhost-user-blk won't recognize drive located in /tmp */ + char *t_path = g_strdup("qtest.XXXXXX"); + + /** 
Create a temporary raw image */ + fd = mkstemp(t_path); + g_assert_cmpint(fd, >=, 0); + ret = ftruncate(fd, TEST_IMAGE_SIZE); + g_assert_cmpint(ret, ==, 0); + close(fd); + + g_test_queue_destroy(drive_destroy, t_path); + return t_path; +} + + + +static void start_vhost_user_blk(const char *img_path, const char *sock_path) +{ + const char *vhost_user_blk_bin = qtest_qemu_vu_binary(); + /* + * "qemu-vu -e" exits when the client disconnects, so the launched + * qemu-vu process will not block scripts/tap-driver.pl + */ + gchar *command = g_strdup_printf("exec %s " + "-e " + "-k %s " + "-f raw " + "%s", + vhost_user_blk_bin, + sock_path, img_path); + g_test_message("starting vhost-user backend: %s", command); + pid_t pid = fork(); + if (pid == 0) { + execlp("/bin/sh", "sh", "-c", command, NULL); + exit(1); + } + /* + * Make sure qemu-vu, i.e. the socket server, is started before the tests; + * otherwise QEMU will complain: + * "Failed to connect socket ... Connection refused" + */ + g_usleep(G_USEC_PER_SEC); +} + +static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg) +{ + /* create image file */ + const char *img_path = drive_create(); + const char *sock_path = "/tmp/vhost-user-blk_vhost.socket"; + start_vhost_user_blk(img_path, sock_path); + /* "-chardev socket,id=char2" is used for pci_hotplug */ + g_string_append_printf(cmd_line, + " -object memory-backend-memfd,id=mem,size=128M,share=on -numa node,memdev=mem " + "-chardev socket,id=char1,path=%s " + "-chardev socket,id=char2,path=%s", + sock_path, sock_path); + return arg; +} + +static void register_vhost_user_blk_test(void) +{ + QOSGraphTestOptions opts = { + .before = vhost_user_blk_test_setup, + }; + + /* + * Tests for vhost-user-blk and vhost-user-blk-pci. + * The tests are borrowed from tests/virtio-blk-test.c, but the tests + * that rely on block_resize don't work for vhost-user-blk: the + * vhost-user-blk device has no -drive, so the following tests are + * dropped: + * - config + * - resize + */ + qos_add_test("basic", "vhost-user-blk", basic, &opts); + qos_add_test("indirect", "vhost-user-blk", indirect, &opts); + qos_add_test("idx", "vhost-user-blk-pci", idx, &opts); + qos_add_test("nxvirtq", "vhost-user-blk-pci", + test_nonexistent_virtqueue, &opts); + qos_add_test("hotplug", "vhost-user-blk-pci", pci_hotplug, &opts); +} + +libqos_init(register_vhost_user_blk_test);
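
A note on the offsets used throughout tests/vhost-user-blk-test.c: virtio_blk_request() writes a 16-byte header (type, ioprio, sector), then the payload, then a one-byte status. For a 512-byte payload the status byte therefore sits at offset 16 + 512 = 528, and a single 513-byte descriptor covers the payload plus the status. The standalone C sketch below reproduces that layout outside of QEMU. It is only an illustration under stated assumptions: the blk_req_* and put_le* names are hypothetical and belong neither to this patch nor to libqos, and the header is always serialized as little-endian here, whereas the test byte-swaps according to qvirtio_is_big_endian().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only: every name below is hypothetical and is not
 * part of the patch or of libqos.
 */
enum {
    BLK_REQ_HDR_SIZE = 16,   /* type (4) + ioprio (4) + sector (8) */
    BLK_SECTOR_SIZE = 512,   /* payload size used by the tests */
    BLK_STATUS_UNSET = 0xFF  /* placeholder that the device overwrites */
};

/* Offset of the one-byte status field that follows the payload. */
static size_t blk_req_status_offset(size_t data_size)
{
    return BLK_REQ_HDR_SIZE + data_size;
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Store a 64-bit value in little-endian byte order. */
static void put_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Lay out header, payload and status in buf and return the bytes used.
 * Assumption: little-endian device view. buf must hold at least
 * BLK_REQ_HDR_SIZE + data_size + 1 bytes.
 */
static size_t blk_req_pack(uint8_t *buf, uint32_t type, uint32_t ioprio,
                           uint64_t sector, const void *data,
                           size_t data_size)
{
    assert(buf != NULL && data != NULL);

    put_le32(buf, type);
    put_le32(buf + 4, ioprio);
    put_le64(buf + 8, sector);
    memcpy(buf + BLK_REQ_HDR_SIZE, data, data_size);
    buf[blk_req_status_offset(data_size)] = BLK_STATUS_UNSET;

    return blk_req_status_offset(data_size) + 1;
}
```

With a 512-byte payload, blk_req_pack() returns 529 and blk_req_status_offset(512) is 528, which matches the readb(req_addr + 528) status checks in the tests above.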