From patchwork Mon Sep 17 15:26:20 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 184456
Date: Mon, 17 Sep 2012 20:56:20 +0530
From: Bharata B Rao
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Anthony Liguori, Anand Avati, Vijay Bellur, Stefan Hajnoczi,
 Amar Tumballi, Markus Armbruster, Blue Swirl, Avi Kivity, Paolo Bonzini
Subject: [Qemu-devel] [PATCH v7 5/5] block: Support GlusterFS as a QEMU block backend.
Message-ID: <20120917152620.GG6879@in.ibm.com>
References: <20120917152149.GB6879@in.ibm.com>
In-Reply-To: <20120917152149.GB6879@in.ibm.com>
Reply-To: bharata@linux.vnet.ibm.com

block: Support GlusterFS as a QEMU block backend.

From: Bharata B Rao

This patch adds gluster as the new block backend in QEMU. This gives QEMU
the ability to boot VM images from gluster volumes.
It's already possible to boot from VM images on gluster volumes using a FUSE
mount, but this patchset provides the ability to boot VM images from gluster
volumes by bypassing the FUSE layer in gluster. This is made possible by
using libgfapi routines to perform IO on gluster volumes directly.

A VM image on a gluster volume is specified like this:

file=gluster[+transport]://[server[:port]]/volname/image[?socket=...]

'gluster' is the protocol.

'transport' specifies the transport type used to connect to the gluster
management daemon (glusterd). Valid transport types are tcp, unix and rdma.
If the transport type isn't specified, then tcp is assumed.

'server' specifies the server where the volume file specification for
the given volume resides. This can be a hostname, an IPv4 address or an
IPv6 address. An IPv6 address must be enclosed in square brackets [ ].
If the transport type is 'unix', then the server field is ignored, but the
'socket' field needs to be populated with the path to the unix domain socket.

'port' is the port number on which glusterd is listening. This is optional;
if it is not specified, QEMU sends 0, which makes gluster use the default
port. port is ignored for the unix transport.

'volname' is the name of the gluster volume which contains the VM image.

'image' is the path to the actual VM image that resides on the gluster volume.

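The specification format described above can be sketched as a small standalone
parser. This is only an illustration in plain C with no QEMU dependencies, and
it is not the parser added by this patch (which relies on strtok_r() and
inet_parse()). The names struct gluster_spec, copy_field() and
gluster_spec_parse() are hypothetical, and the sketch assumes that a non-unix
transport requires a server and takes no '?' option.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not the patch's GlusterURI. */
struct gluster_spec {
    char transport[8];   /* "tcp", "unix" or "rdma" */
    char server[256];    /* host, or socket path for the unix transport */
    int port;            /* 0 = let gluster use its default port */
    char volname[256];
    char image[1024];
};

/* Copy n bytes into a zeroed buffer of size cap; fail if empty or too long. */
static int copy_field(char *dst, size_t cap, const char *src, size_t n)
{
    if (n == 0 || n >= cap) {
        return -1;
    }
    memcpy(dst, src, n);
    return 0;
}

/* Returns 0 on success, -1 if the specification is malformed. */
static int gluster_spec_parse(const char *s, struct gluster_spec *out)
{
    const char *sep, *slash, *query, *host, *host_end, *port = NULL;
    const char *vol, *img, *path_end;

    assert(s != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /* "gluster" or "gluster+<transport>", followed by "://" */
    if (strncmp(s, "gluster", 7) != 0 || !(sep = strstr(s, "://"))) {
        return -1;
    }
    s += 7;
    if (s == sep) {
        strcpy(out->transport, "tcp");            /* default transport */
    } else if (*s != '+' || copy_field(out->transport, sizeof(out->transport),
                                       s + 1, (size_t)(sep - s - 1))) {
        return -1;
    }
    if (strcmp(out->transport, "tcp") && strcmp(out->transport, "unix") &&
        strcmp(out->transport, "rdma")) {
        return -1;
    }
    s = sep + 3;

    slash = strchr(s, '/');                       /* end of [server[:port]] */
    if (!slash) {
        return -1;
    }
    query = strchr(slash, '?');

    if (!strcmp(out->transport, "unix")) {
        /* Server part is ignored; the socket path is mandatory. */
        if (!query || strncmp(query, "?socket=", 8) != 0 ||
            copy_field(out->server, sizeof(out->server), query + 8,
                       strlen(query + 8))) {
            return -1;
        }
    } else {
        if (query) {
            return -1;
        }
        if (*s == '[') {                          /* [ipv6] or [ipv6]:port */
            host = s + 1;
            host_end = memchr(host, ']', (size_t)(slash - host));
            if (!host_end || (host_end + 1 != slash && host_end[1] != ':')) {
                return -1;
            }
            port = (host_end + 1 != slash) ? host_end + 2 : NULL;
        } else {                                  /* host or host:port */
            host = s;
            host_end = memchr(s, ':', (size_t)(slash - s));
            port = host_end ? host_end + 1 : NULL;
            host_end = host_end ? host_end : slash;
        }
        if (copy_field(out->server, sizeof(out->server), host,
                       (size_t)(host_end - host))) {
            return -1;
        }
        if (port) {
            char *end;
            long v = strtol(port, &end, 10);
            if (end == port || end != slash || v < 0 || v > 65535) {
                return -1;
            }
            out->port = (int)v;
        }
    }

    /* /volname/image, ending at the '?' if there is one */
    vol = slash + 1;
    path_end = query ? query : vol + strlen(vol);
    img = memchr(vol, '/', (size_t)(path_end - vol));
    if (!img ||
        copy_field(out->volname, sizeof(out->volname), vol,
                   (size_t)(img - vol)) ||
        copy_field(out->image, sizeof(out->image), img + 1,
                   (size_t)(path_end - (img + 1)))) {
        return -1;
    }
    return 0;
}
```

Unlike the strtok_r() approach, this sketch never modifies its input, so it can
work directly on a const string without a g_strdup() copy.
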
Examples:

file=gluster://1.2.3.4/testvol/a.img
file=gluster+tcp://1.2.3.4/testvol/a.img
file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img
file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img
file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img
file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
file=gluster+rdma://1.2.3.4:24007/testvol/a.img

Signed-off-by: Bharata B Rao
---

 block/Makefile.objs |    1
 block/gluster.c     |  694 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 695 insertions(+), 0 deletions(-)
 create mode 100644 block/gluster.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index b5754d3..a1ae67f 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -9,3 +9,4 @@ block-obj-$(CONFIG_POSIX) += raw-posix.o
 block-obj-$(CONFIG_LIBISCSI) += iscsi.o
 block-obj-$(CONFIG_CURL) += curl.o
 block-obj-$(CONFIG_RBD) += rbd.o
+block-obj-$(CONFIG_GLUSTERFS) += gluster.o
diff --git a/block/gluster.c b/block/gluster.c
new file mode 100644
index 0000000..0de3286
--- /dev/null
+++ b/block/gluster.c
@@ -0,0 +1,694 @@
+/*
+ * GlusterFS backend for QEMU
+ *
+ * Copyright (C) 2012 Bharata B Rao
+ *
+ * Pipe handling mechanism in AIO implementation is derived from
+ * block/rbd.c. Hence,
+ *
+ * Copyright (C) 2010-2011 Christian Brunner ,
+ *                         Josh Durgin
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */
+#include 
+#include "block_int.h"
+#include "qemu_socket.h"
+
+typedef struct GlusterAIOCB {
+    BlockDriverAIOCB common;
+    int64_t size;
+    int ret;
+    bool *finished;
+    QEMUBH *bh;
+} GlusterAIOCB;
+
+typedef struct BDRVGlusterState {
+    struct glfs *glfs;
+    int fds[2];
+    struct glfs_fd *fd;
+    int qemu_aio_count;
+    int event_reader_pos;
+    GlusterAIOCB *event_acb;
+} BDRVGlusterState;
+
+#define GLUSTER_FD_READ 0
+#define GLUSTER_FD_WRITE 1
+
+#define GLUSTER_TRANSPORT_DEFAULT "gluster://"
+#define GLUSTER_TRANSPORT_DEFAULT_SZ strlen(GLUSTER_TRANSPORT_DEFAULT)
+#define GLUSTER_TRANSPORT_TCP "gluster+tcp://"
+#define GLUSTER_TRANSPORT_TCP_SZ strlen(GLUSTER_TRANSPORT_TCP)
+#define GLUSTER_TRANSPORT_UNIX "gluster+unix://"
+#define GLUSTER_TRANSPORT_UNIX_SZ strlen(GLUSTER_TRANSPORT_UNIX)
+#define GLUSTER_TRANSPORT_RDMA "gluster+rdma://"
+#define GLUSTER_TRANSPORT_RDMA_SZ strlen(GLUSTER_TRANSPORT_RDMA)
+
+typedef struct GlusterURI {
+    char *server;
+    int port;
+    char *volname;
+    char *image;
+    char *transport;
+    bool is_unix;
+} GlusterURI;
+
+static void qemu_gluster_uri_free(GlusterURI *uri)
+{
+    g_free(uri->server);
+    g_free(uri->volname);
+    g_free(uri->image);
+    g_free(uri->transport);
+    g_free(uri);
+}
+
+static int parse_socket(GlusterURI *uri, char *socket)
+{
+    char *token, *saveptr;
+
+    if (!socket) {
+        return 0;
+    }
+    token = strtok_r(socket, "=", &saveptr);
+    if (!token || strcmp(token, "socket")) {
+        return -EINVAL;
+    }
+    token = strtok_r(NULL, "=", &saveptr);
+    if (!token) {
+        return -EINVAL;
+    }
+    uri->server = g_strdup(token);
+    uri->is_unix = true;
+    return 0;
+}
+
+static int parse_gluster_spec(GlusterURI *uri, char *spec)
+{
+    char *token, *saveptr;
+    int ret;
+    QemuOpts *opts;
+    char *p, *q;
+
+    /* transport */
+    p = spec;
+    if (!strncmp(p, GLUSTER_TRANSPORT_DEFAULT, GLUSTER_TRANSPORT_DEFAULT_SZ)) {
+        uri->transport = g_strdup("tcp");
+        p += GLUSTER_TRANSPORT_DEFAULT_SZ;
+    } else if (!strncmp(p, GLUSTER_TRANSPORT_TCP, GLUSTER_TRANSPORT_TCP_SZ)) {
+        uri->transport = g_strdup("tcp");
+        p += GLUSTER_TRANSPORT_TCP_SZ;
+    } else if (!strncmp(p, GLUSTER_TRANSPORT_UNIX, GLUSTER_TRANSPORT_UNIX_SZ)) {
+        uri->transport = g_strdup("unix");
+        p += GLUSTER_TRANSPORT_UNIX_SZ;
+    } else if (!strncmp(p, GLUSTER_TRANSPORT_RDMA, GLUSTER_TRANSPORT_RDMA_SZ)) {
+        uri->transport = g_strdup("rdma");
+        p += GLUSTER_TRANSPORT_RDMA_SZ;
+    } else {
+        return -EINVAL;
+    }
+    q = p;
+
+    /* server */
+    if (!strcmp(uri->transport, "unix")) {
+        if (!uri->is_unix) {
+            return -EINVAL;
+        }
+    } else {
+        if (uri->is_unix) {
+            return -EINVAL;
+        }
+        p = strchr(p, '/');
+        if (!p) {
+            return -EINVAL;
+        }
+        *p++ = '\0';
+        opts = qemu_opts_create(qemu_find_opts("inet"), NULL, 0, NULL);
+        ret = inet_parse(opts, q);
+        if (!ret) {
+            uri->server = g_strdup(qemu_opt_get(opts, "host"));
+            uri->port = strtoul(qemu_opt_get(opts, "port"), NULL, 0);
+            if (uri->port < 0) {
+                ret = -EINVAL;
+            }
+        }
+        qemu_opts_del(opts);
+        if (ret < 0) {
+            return -EINVAL;
+        }
+    }
+
+    /* volname */
+    token = strtok_r(p, "/", &saveptr);
+    if (!token) {
+        return -EINVAL;
+    }
+    uri->volname = g_strdup(token);
+
+    /* image */
+    token = strtok_r(NULL, "?", &saveptr);
+    if (!token) {
+        return -EINVAL;
+    }
+    uri->image = g_strdup(token);
+    return 0;
+}
+
+/*
+ * file=gluster[+transport]://[server[:port]]/volname/image[?socket=...]
+ *
+ * 'gluster' is the protocol.
+ *
+ * 'transport' specifies the transport type used to connect to gluster
+ * management daemon (glusterd). Valid transport types are
+ * tcp, unix and rdma. If the transport type isn't specified, then tcp
+ * type is assumed.
+ *
+ * 'server' specifies the server where the volume file specification for
+ * the given volume resides. This can be either hostname or ipv4 address
+ * or ipv6 address. ipv6 address needs to be within square brackets [ ].
+ * If transport type is 'unix', then server field is ignored, but the
+ * 'socket' field needs to be populated with the path to unix domain
+ * socket.
+ *
+ * 'port' is the port number on which glusterd is listening. This is optional
+ * and if not specified, QEMU will send 0 which will make gluster use the
+ * default port. port is ignored for unix type of transport.
+ *
+ * 'volname' is the name of the gluster volume which contains the VM image.
+ *
+ * 'image' is the path to the actual VM image that resides on gluster volume.
+ *
+ * Examples:
+ *
+ * file=gluster://1.2.3.4/testvol/a.img
+ * file=gluster+tcp://1.2.3.4/testvol/a.img
+ * file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
+ * file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img
+ * file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img
+ * file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img
+ * file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
+ * file=gluster+rdma://1.2.3.4:24007/testvol/a.img
+ */
+static int qemu_gluster_parseuri(GlusterURI *uri, const char *filename)
+{
+    char *token, *saveptr;
+    char *p, *q, *gluster_spec = NULL;
+    int ret = -EINVAL;
+
+    p = q = g_strdup(filename);
+
+    /* Extract server, volname and image */
+    token = strtok_r(p, "?", &saveptr);
+    if (!token) {
+        goto out;
+    }
+    gluster_spec = g_strdup(token);
+
+    /* socket */
+    token = strtok_r(NULL, "?", &saveptr);
+    ret = parse_socket(uri, token);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* Flag error for extra options */
+    token = strtok_r(NULL, "?", &saveptr);
+    if (token) {
+        ret = -EINVAL;
+        goto out;
+    }
+
+    ret = parse_gluster_spec(uri, gluster_spec);
+    if (ret < 0) {
+        goto out;
+    }
+    ret = 0;
+out:
+    g_free(q);
+    g_free(gluster_spec);
+    return ret;
+}
+
+static struct glfs *qemu_gluster_init(GlusterURI *uri, const char *filename)
+{
+    struct glfs *glfs = NULL;
+    int ret;
+
+    ret = qemu_gluster_parseuri(uri, filename);
+    if (ret < 0) {
+        error_report("Usage: file=gluster[+transport]://[server[:port]]/"
+            "volname/image[?socket=...]");
+        errno = -ret;
+        goto out;
+    }
+
+    glfs = glfs_new(uri->volname);
+    if (!glfs) {
+        goto out;
+    }
+
+    ret = glfs_set_volfile_server(glfs, uri->transport, uri->server, uri->port);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /*
+     * TODO: Use GF_LOG_ERROR instead of hard code value of 4 here when
+     * GlusterFS exports it in a header.
+     */
+    ret = glfs_set_logging(glfs, "-", 4);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = glfs_init(glfs);
+    if (ret) {
+        error_report("Gluster connection failed for server=%s port=%d "
+             "volume=%s image=%s transport=%s\n", uri->server, uri->port,
+             uri->volname, uri->image, uri->transport);
+        goto out;
+    }
+    return glfs;
+
+out:
+    if (glfs) {
+        glfs_fini(glfs);
+    }
+    return NULL;
+}
+
+static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s)
+{
+    int ret;
+    bool *finished = acb->finished;
+    BlockDriverCompletionFunc *cb = acb->common.cb;
+    void *opaque = acb->common.opaque;
+
+    if (!acb->ret || acb->ret == acb->size) {
+        ret = 0; /* Success */
+    } else if (acb->ret < 0) {
+        ret = acb->ret; /* Read/Write failed */
+    } else {
+        ret = -EIO; /* Partial read/write - fail it */
+    }
+
+    s->qemu_aio_count--;
+    qemu_aio_release(acb);
+    cb(opaque, ret);
+    if (finished) {
+        *finished = true;
+    }
+}
+
+static void qemu_gluster_aio_event_reader(void *opaque)
+{
+    BDRVGlusterState *s = opaque;
+    ssize_t ret;
+
+    do {
+        char *p = (char *)&s->event_acb;
+
+        ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos,
+                   sizeof(s->event_acb) - s->event_reader_pos);
+        if (ret > 0) {
+            s->event_reader_pos += ret;
+            if (s->event_reader_pos == sizeof(s->event_acb)) {
+                s->event_reader_pos = 0;
+                qemu_gluster_complete_aio(s->event_acb, s);
+            }
+        }
+    } while (ret < 0 && errno == EINTR);
+}
+
+static int qemu_gluster_aio_flush_cb(void *opaque)
+{
+    BDRVGlusterState *s = opaque;
+
+    return (s->qemu_aio_count > 0);
+}
+
+static int qemu_gluster_open(BlockDriverState *bs, const char *filename,
+    int bdrv_flags)
+{
+    BDRVGlusterState *s = bs->opaque;
+    int open_flags = 0;
+    int ret = 0;
+    GlusterURI *uri = g_malloc0(sizeof(GlusterURI));
+
+    s->glfs = qemu_gluster_init(uri, filename);
+    if (!s->glfs) {
+        ret = -errno;
+        goto out;
+    }
+
+    open_flags |= O_BINARY;
+    open_flags &= ~O_ACCMODE;
+    if (bdrv_flags & BDRV_O_RDWR) {
+        open_flags |= O_RDWR;
+    } else {
+        open_flags |= O_RDONLY;
+    }
+
+    if ((bdrv_flags & BDRV_O_NOCACHE)) {
+        open_flags |= O_DIRECT;
+    }
+
+    s->fd = glfs_open(s->glfs, uri->image, open_flags);
+    if (!s->fd) {
+        ret = -errno;
+        goto out;
+    }
+
+    ret = qemu_pipe(s->fds);
+    if (ret < 0) {
+        ret = -errno;
+        goto out;
+    }
+    fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK);
+    qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ],
+        qemu_gluster_aio_event_reader, NULL, qemu_gluster_aio_flush_cb, s);
+
+out:
+    qemu_gluster_uri_free(uri);
+    if (!ret) {
+        return ret;
+    }
+    if (s->fd) {
+        glfs_close(s->fd);
+    }
+    if (s->glfs) {
+        glfs_fini(s->glfs);
+    }
+    return ret;
+}
+
+static int qemu_gluster_create(const char *filename,
+        QEMUOptionParameter *options)
+{
+    struct glfs *glfs;
+    struct glfs_fd *fd;
+    int ret = 0;
+    int64_t total_size = 0;
+    GlusterURI *uri = g_malloc0(sizeof(GlusterURI));
+
+    glfs = qemu_gluster_init(uri, filename);
+    if (!glfs) {
+        ret = -errno;
+        goto out;
+    }
+
+    while (options && options->name) {
+        if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
+            total_size = options->value.n / BDRV_SECTOR_SIZE;
+        }
+        options++;
+    }
+
+    fd = glfs_creat(glfs, uri->image,
+        O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
+    if (!fd) {
+        ret = -errno;
+    } else {
+        if (glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
+            ret = -errno;
+        }
+        if (glfs_close(fd) != 0) {
+            ret = -errno;
+        }
+    }
+out:
+    qemu_gluster_uri_free(uri);
+    if (glfs) {
+        glfs_fini(glfs);
+    }
+    return ret;
+}
+
+static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
+{
+    GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
+    bool finished = false;
+
+    acb->finished = &finished;
+    while (!finished) {
+        qemu_aio_wait();
+    }
+}
+
+static AIOPool gluster_aio_pool = {
+    .aiocb_size = sizeof(GlusterAIOCB),
+    .cancel = qemu_gluster_aio_cancel,
+};
+
+static int qemu_gluster_send_pipe(BDRVGlusterState *s, GlusterAIOCB *acb)
+{
+    int ret = 0;
+
+    while (1) {
+        int fd = s->fds[GLUSTER_FD_WRITE];
+
+        ret = write(fd, (void *)&acb, sizeof(acb));
+        if (ret >= 0) {
+            break;
+        }
+        if (errno == EINTR) {
+            continue;
+        }
+        if (errno != EAGAIN) {
+            break;
+        }
+    }
+    return ret;
+}
+
+static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
+{
+    GlusterAIOCB *acb = (GlusterAIOCB *)arg;
+    BlockDriverState *bs = acb->common.bs;
+    BDRVGlusterState *s = bs->opaque;
+
+    acb->ret = ret;
+    if (qemu_gluster_send_pipe(s, acb) < 0) {
+        /*
+         * Gluster AIO callback thread failed to notify the waiting
+         * QEMU thread about IO completion.
+         *
+         * Complete this IO request and make the disk inaccessible for
+         * subsequent reads and writes.
+         */
+        error_report("Gluster failed to notify QEMU about IO completion");
+
+        qemu_mutex_lock_iothread(); /* We are in gluster thread context */
+        acb->common.cb(acb->common.opaque, -EIO);
+        qemu_aio_release(acb);
+        s->qemu_aio_count--;
+        close(s->fds[GLUSTER_FD_READ]);
+        close(s->fds[GLUSTER_FD_WRITE]);
+        qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL,
+            NULL);
+        bs->drv = NULL; /* Make the disk inaccessible */
+        qemu_mutex_unlock_iothread();
+    }
+}
+
+static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
+        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        BlockDriverCompletionFunc *cb, void *opaque, int write)
+{
+    int ret;
+    GlusterAIOCB *acb;
+    BDRVGlusterState *s = bs->opaque;
+    size_t size;
+    off_t offset;
+
+    offset = sector_num * BDRV_SECTOR_SIZE;
+    size = nb_sectors * BDRV_SECTOR_SIZE;
+    s->qemu_aio_count++;
+
+    acb = qemu_aio_get(&gluster_aio_pool, bs, cb, opaque);
+    acb->size = size;
+    acb->ret = 0;
+    acb->finished = NULL;
+
+    if (write) {
+        ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
+            &gluster_finish_aiocb, acb);
+    } else {
+        ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
+            &gluster_finish_aiocb, acb);
+    }
+
+    if (ret < 0) {
+        goto out;
+    }
+    return &acb->common;
+
+out:
+    s->qemu_aio_count--;
+    qemu_aio_release(acb);
+    return NULL;
+}
+
+static BlockDriverAIOCB *qemu_gluster_aio_readv(BlockDriverState *bs,
+        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+    return qemu_gluster_aio_rw(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
+}
+
+static BlockDriverAIOCB *qemu_gluster_aio_writev(BlockDriverState *bs,
+        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+    return qemu_gluster_aio_rw(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
+}
+
+static BlockDriverAIOCB *qemu_gluster_aio_flush(BlockDriverState *bs,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+    int ret;
+    GlusterAIOCB *acb;
+    BDRVGlusterState *s = bs->opaque;
+
+    acb = qemu_aio_get(&gluster_aio_pool, bs, cb, opaque);
+    acb->size = 0;
+    acb->ret = 0;
+    acb->finished = NULL;
+    s->qemu_aio_count++;
+
+    ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
+    if (ret < 0) {
+        goto out;
+    }
+    return &acb->common;
+
+out:
+    s->qemu_aio_count--;
+    qemu_aio_release(acb);
+    return NULL;
+}
+
+static int64_t qemu_gluster_getlength(BlockDriverState *bs)
+{
+    BDRVGlusterState *s = bs->opaque;
+    int64_t ret;
+
+    ret = glfs_lseek(s->fd, 0, SEEK_END);
+    if (ret < 0) {
+        return -errno;
+    } else {
+        return ret;
+    }
+}
+
+static int64_t qemu_gluster_allocated_file_size(BlockDriverState *bs)
+{
+    BDRVGlusterState *s = bs->opaque;
+    struct stat st;
+    int ret;
+
+    ret = glfs_fstat(s->fd, &st);
+    if (ret < 0) {
+        return -errno;
+    } else {
+        return st.st_blocks * 512;
+    }
+}
+
+static void qemu_gluster_close(BlockDriverState *bs)
+{
+    BDRVGlusterState *s = bs->opaque;
+
+    close(s->fds[GLUSTER_FD_READ]);
+    close(s->fds[GLUSTER_FD_WRITE]);
+    qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL, NULL);
+
+    if (s->fd) {
+        glfs_close(s->fd);
+        s->fd = NULL;
+    }
+    glfs_fini(s->glfs);
+}
+
+static QEMUOptionParameter qemu_gluster_create_options[] = {
+    {
+        .name = BLOCK_OPT_SIZE,
+        .type = OPT_SIZE,
+        .help = "Virtual disk size"
+    },
+    { NULL }
+};
+
+static BlockDriver bdrv_gluster = {
+    .format_name = "gluster",
+    .protocol_name = "gluster",
+    .instance_size = sizeof(BDRVGlusterState),
+    .bdrv_file_open = qemu_gluster_open,
+    .bdrv_close = qemu_gluster_close,
+    .bdrv_create = qemu_gluster_create,
+    .bdrv_getlength = qemu_gluster_getlength,
+    .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
+    .bdrv_aio_readv = qemu_gluster_aio_readv,
+    .bdrv_aio_writev = qemu_gluster_aio_writev,
+    .bdrv_aio_flush = qemu_gluster_aio_flush,
+    .create_options = qemu_gluster_create_options,
+};
+
+static BlockDriver bdrv_gluster_tcp = {
+    .format_name = "gluster",
+    .protocol_name = "gluster+tcp",
+    .instance_size = sizeof(BDRVGlusterState),
+    .bdrv_file_open = qemu_gluster_open,
+    .bdrv_close = qemu_gluster_close,
+    .bdrv_create = qemu_gluster_create,
+    .bdrv_getlength = qemu_gluster_getlength,
+    .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
+    .bdrv_aio_readv = qemu_gluster_aio_readv,
+    .bdrv_aio_writev = qemu_gluster_aio_writev,
+    .bdrv_aio_flush = qemu_gluster_aio_flush,
+    .create_options = qemu_gluster_create_options,
+};
+
+static BlockDriver bdrv_gluster_unix = {
+    .format_name = "gluster",
+    .protocol_name = "gluster+unix",
+    .instance_size = sizeof(BDRVGlusterState),
+    .bdrv_file_open = qemu_gluster_open,
+    .bdrv_close = qemu_gluster_close,
+    .bdrv_create = qemu_gluster_create,
+    .bdrv_getlength = qemu_gluster_getlength,
+    .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
+    .bdrv_aio_readv = qemu_gluster_aio_readv,
+    .bdrv_aio_writev = qemu_gluster_aio_writev,
+    .bdrv_aio_flush = qemu_gluster_aio_flush,
+    .create_options = qemu_gluster_create_options,
+};
+
+static BlockDriver bdrv_gluster_rdma = {
+    .format_name = "gluster",
+    .protocol_name = "gluster+rdma",
+    .instance_size = sizeof(BDRVGlusterState),
+    .bdrv_file_open = qemu_gluster_open,
+    .bdrv_close = qemu_gluster_close,
+    .bdrv_create = qemu_gluster_create,
+    .bdrv_getlength = qemu_gluster_getlength,
+    .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
+    .bdrv_aio_readv = qemu_gluster_aio_readv,
+    .bdrv_aio_writev = qemu_gluster_aio_writev,
+    .bdrv_aio_flush = qemu_gluster_aio_flush,
+    .create_options = qemu_gluster_create_options,
+};
+
+static void bdrv_gluster_init(void)
+{
+    bdrv_register(&bdrv_gluster_rdma);
+    bdrv_register(&bdrv_gluster_unix);
+    bdrv_register(&bdrv_gluster_tcp);
+    bdrv_register(&bdrv_gluster);
+}
+
+block_init(bdrv_gluster_init);
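
The pipe-based completion notification used in the patch above
(qemu_gluster_send_pipe() writing a GlusterAIOCB pointer into the pipe and
qemu_gluster_aio_event_reader() reassembling it on the other end) can be
sketched in isolation. The following is a minimal, single-threaded
illustration in plain C, not QEMU code: struct request, struct
completion_reader, completion_send() and completion_recv() are hypothetical
names chosen for this sketch, and a real driver would call the sender from the
gluster callback thread and the receiver from the main loop's fd handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical request; stands in for an AIO control block. */
struct request {
    int id;
    long ret;
};

/* Reassembly state: a pointer value may arrive in more than one read(). */
struct completion_reader {
    struct request *pending;   /* bytes of the pointer received so far */
    size_t pos;                /* how many of those bytes are valid */
};

/*
 * Producer side: send the address of a finished request through the pipe.
 * The pointer itself is the message; the request must stay alive until the
 * consumer has picked it up. Retries on EINTR and EAGAIN (the latter spins
 * if the write end is non-blocking and the pipe is full).
 * Returns 0 on success, -1 on error.
 */
static int completion_send(int wfd, struct request *req)
{
    for (;;) {
        ssize_t n = write(wfd, &req, sizeof(req));

        if (n == (ssize_t)sizeof(req)) {
            return 0;
        }
        if (n < 0 && (errno == EINTR || errno == EAGAIN)) {
            continue;
        }
        return -1;
    }
}

/*
 * Consumer side: read as many bytes of the next pointer as are available.
 * Returns the request once all sizeof(pointer) bytes have arrived, or NULL
 * if the pointer is still incomplete, the pipe is empty, or read() failed.
 */
static struct request *completion_recv(int rfd, struct completion_reader *rd)
{
    char *p = (char *)&rd->pending;
    ssize_t n;

    assert(rd->pos < sizeof(rd->pending));
    do {
        n = read(rfd, p + rd->pos, sizeof(rd->pending) - rd->pos);
    } while (n < 0 && errno == EINTR);

    if (n <= 0) {
        return NULL;
    }
    rd->pos += (size_t)n;
    if (rd->pos < sizeof(rd->pending)) {
        return NULL;           /* partial pointer; wait for the rest */
    }
    rd->pos = 0;
    return rd->pending;
}
```

Passing the pointer rather than the request contents keeps each message small
enough that a pipe write is atomic (it is far below PIPE_BUF), which is why
the sender only has to distinguish "fully written" from "error".
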