Message ID | 1465395011-26088-6-git-send-email-kwolf@redhat.com |
---|---|
State | New |
On 06/08/2016 08:10 AM, Kevin Wolf wrote:
> The raw-posix block driver actually supports byte-aligned requests now
> on non-O_DIRECT images, like it already (and previously incorrectly)
> claimed in bs->request_alignment.
>
> For some block drivers this means that a RMW cycle can be avoided when
> they write sub-sector metadata e.g. for cluster allocation.

[well, there's still probably a RMW going on, but it's being done by the
kernel, rather than qemu - and choice of caching may let the kernel
optimize things... not worth cluttering the commit message with this,
though]

>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---

> +++ b/block/linux-aio.c
> @@ -272,14 +272,12 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
>  }
>
>  int laio_submit_co(BlockDriverState *bs, LinuxAioState *s, int fd,
> -        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, int type)
> +        uint64_t offset, QEMUIOVector *qiov, int type)
>  {
> -    off_t offset = sector_num * 512;
>      int ret;
> -
>      struct qemu_laiocb laiocb = {
>          .co = qemu_coroutine_self(),
> -        .nbytes = nb_sectors * 512,
> +        .nbytes = qiov->size,

So for this interface, we require non-NULL qiov and no duplicated
length; I guess it isn't used for write_zeroes. We may still want to do
some consistency sweep to decide what level of NULL-ness we want for
representing write_zeroes, rather than ad hoc decisions at each layer of
the call stack, but that's a task for another day.

> @@ -1344,26 +1344,27 @@ static int coroutine_fn raw_co_rw(BlockDriverState *bs, int64_t sector_num,
>              type |= QEMU_AIO_MISALIGNED;
>  #ifdef CONFIG_LINUX_AIO
>          } else if (s->use_aio) {
> -            return laio_submit_co(bs, s->aio_ctx, s->fd, sector_num, qiov,
> -                    nb_sectors, type);
> +            assert(qiov->size == bytes);

Worth hoisting the assertion outside of the #ifdef?...

> +            return laio_submit_co(bs, s->aio_ctx, s->fd, offset, qiov, type);
>  #endif
>          }
>      }
>
> -    return paio_submit_co(bs, s->fd, sector_num * BDRV_SECTOR_SIZE, qiov,
> -            nb_sectors * BDRV_SECTOR_SIZE, type);
> +    return paio_submit_co(bs, s->fd, offset, qiov, bytes, type);

...then again, paio_submit_co() also does the assert - and this is more
evidence of our inconsistency on whether we duplicate a separate bytes
parameter or reuse qiov->size.

>
> -static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
> -        int nb_sectors, QEMUIOVector *qiov)
> +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
> +        uint64_t bytes, QEMUIOVector *qiov,
> +        int flags)
>  {
> -    return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_READ);
> +    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);

We ignore flags, but that's not a change in semantics. (Maybe someday we
need .supported_read_flags)

>  }
>
> -static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
> -        int nb_sectors, QEMUIOVector *qiov)
> +static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
> +        uint64_t bytes, QEMUIOVector *qiov,
> +        int flags)
>  {
> -    return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_WRITE);
> +    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);

And here, we could assert(!flags) (since we intentionally don't set
.supported_write_flags) - but I won't insist.

None of my comments require a code change, other than a possible added
assertion, so:

Reviewed-by: Eric Blake <eblake@redhat.com>
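The review above turns on two points: the old sector-based arguments map to byte values by multiplying by 512, and a separately passed byte count has to agree with the total size of the I/O vector. The following is a minimal standalone sketch of both, not QEMU code. It uses the POSIX `struct iovec` in place of `QEMUIOVector`, and `struct byte_request`, `iov_total_size()`, `request_from_sectors()`, `request_is_consistent()` and `request_len()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define SECTOR_SIZE 512 /* assumed: the literal 512 used by the old code */

/* Hypothetical byte-based request: offset and length plus an I/O vector. */
struct byte_request {
    uint64_t offset;
    uint64_t bytes;
    const struct iovec *iov;
    int niov;
};

/* Sum of all segment lengths; plays the role of qiov->size in the patch. */
static uint64_t iov_total_size(const struct iovec *iov, int niov)
{
    uint64_t total = 0;
    for (int i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/* Convert old-style sector arguments into a byte-based request. */
static struct byte_request request_from_sectors(int64_t sector_num,
                                                int nb_sectors,
                                                const struct iovec *iov,
                                                int niov)
{
    struct byte_request req = {
        .offset = (uint64_t)sector_num * SECTOR_SIZE,
        .bytes = (uint64_t)nb_sectors * SECTOR_SIZE,
        .iov = iov,
        .niov = niov,
    };
    return req;
}

/* True when the vector is present and its total size matches req->bytes. */
static int request_is_consistent(const struct byte_request *req)
{
    return req->iov != NULL &&
           iov_total_size(req->iov, req->niov) == req->bytes;
}

/* Same idea as assert(qiov->size == bytes), checked once at the boundary. */
static uint64_t request_len(const struct byte_request *req)
{
    assert(request_is_consistent(req));
    return req->bytes;
}
```

Checking the invariant in one helper is one way to avoid repeating the assertion in each submission path; whether that fits QEMU's layering is a separate question that this sketch does not answer.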
On Wed, Jun 08, 2016 at 04:10:10PM +0200, Kevin Wolf wrote:
> The raw-posix block driver actually supports byte-aligned requests now
> on non-O_DIRECT images, like it already (and previously incorrectly)
> claimed in bs->request_alignment.
>
> For some block drivers this means that a RMW cycle can be avoided when
> they write sub-sector metadata e.g. for cluster allocation.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/linux-aio.c |  6 ++----
>  block/raw-aio.h   |  2 +-
>  block/raw-posix.c | 42 ++++++++++++++++++++++--------------------
>  3 files changed, 25 insertions(+), 25 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 1a56543..8dc34db 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -272,14 +272,12 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
 }

 int laio_submit_co(BlockDriverState *bs, LinuxAioState *s, int fd,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, int type)
+        uint64_t offset, QEMUIOVector *qiov, int type)
 {
-    off_t offset = sector_num * 512;
     int ret;
-
     struct qemu_laiocb laiocb = {
         .co = qemu_coroutine_self(),
-        .nbytes = nb_sectors * 512,
+        .nbytes = qiov->size,
         .ctx = s,
         .is_read = (type == QEMU_AIO_READ),
         .qiov = qiov,
diff --git a/block/raw-aio.h b/block/raw-aio.h
index 1037502..3f5b8bb 100644
--- a/block/raw-aio.h
+++ b/block/raw-aio.h
@@ -39,7 +39,7 @@ typedef struct LinuxAioState LinuxAioState;
 LinuxAioState *laio_init(void);
 void laio_cleanup(LinuxAioState *s);
 int laio_submit_co(BlockDriverState *bs, LinuxAioState *s, int fd,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, int type);
+        uint64_t offset, QEMUIOVector *qiov, int type);
 BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque, int type);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index af7f69f..0db7876 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1325,8 +1325,8 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }

-static int coroutine_fn raw_co_rw(BlockDriverState *bs, int64_t sector_num,
-        int nb_sectors, QEMUIOVector *qiov, int type)
+static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
+        uint64_t bytes, QEMUIOVector *qiov, int type)
 {
     BDRVRawState *s = bs->opaque;

@@ -1344,26 +1344,27 @@ static int coroutine_fn raw_co_rw(BlockDriverState *bs, int64_t sector_num,
             type |= QEMU_AIO_MISALIGNED;
 #ifdef CONFIG_LINUX_AIO
         } else if (s->use_aio) {
-            return laio_submit_co(bs, s->aio_ctx, s->fd, sector_num, qiov,
-                    nb_sectors, type);
+            assert(qiov->size == bytes);
+            return laio_submit_co(bs, s->aio_ctx, s->fd, offset, qiov, type);
 #endif
         }
     }

-    return paio_submit_co(bs, s->fd, sector_num * BDRV_SECTOR_SIZE, qiov,
-            nb_sectors * BDRV_SECTOR_SIZE, type);
+    return paio_submit_co(bs, s->fd, offset, qiov, bytes, type);
 }

-static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
-        int nb_sectors, QEMUIOVector *qiov)
+static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
+        uint64_t bytes, QEMUIOVector *qiov,
+        int flags)
 {
-    return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_READ);
+    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);
 }

-static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
-        int nb_sectors, QEMUIOVector *qiov)
+static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
+        uint64_t bytes, QEMUIOVector *qiov,
+        int flags)
 {
-    return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_WRITE);
+    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);
 }

 static void raw_aio_plug(BlockDriverState *bs)
@@ -1952,8 +1953,8 @@ BlockDriver bdrv_file = {
     .bdrv_co_get_block_status = raw_co_get_block_status,
     .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
-    .bdrv_co_readv = raw_co_readv,
-    .bdrv_co_writev = raw_co_writev,
+    .bdrv_co_preadv = raw_co_preadv,
+    .bdrv_co_pwritev = raw_co_pwritev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_aio_discard = raw_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
@@ -2400,8 +2401,8 @@ static BlockDriver bdrv_host_device = {
     .create_opts = &raw_create_opts,
     .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
-    .bdrv_co_readv = raw_co_readv,
-    .bdrv_co_writev = raw_co_writev,
+    .bdrv_co_preadv = raw_co_preadv,
+    .bdrv_co_pwritev = raw_co_pwritev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_aio_discard = hdev_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
@@ -2530,8 +2531,9 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_create = hdev_create,
     .create_opts = &raw_create_opts,
-    .bdrv_co_readv = raw_co_readv,
-    .bdrv_co_writev = raw_co_writev,
+
+    .bdrv_co_preadv = raw_co_preadv,
+    .bdrv_co_pwritev = raw_co_pwritev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
     .bdrv_io_plug = raw_aio_plug,
@@ -2665,8 +2667,8 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_create = hdev_create,
     .create_opts = &raw_create_opts,
-    .bdrv_co_readv = raw_co_readv,
-    .bdrv_co_writev = raw_co_writev,
+    .bdrv_co_preadv = raw_co_preadv,
+    .bdrv_co_pwritev = raw_co_pwritev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
     .bdrv_io_plug = raw_aio_plug,
The raw-posix block driver actually supports byte-aligned requests now
on non-O_DIRECT images, like it already (and previously incorrectly)
claimed in bs->request_alignment.

For some block drivers this means that a RMW cycle can be avoided when
they write sub-sector metadata e.g. for cluster allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/linux-aio.c |  6 ++----
 block/raw-aio.h   |  2 +-
 block/raw-posix.c | 42 ++++++++++++++++++++++--------------------
 3 files changed, 25 insertions(+), 25 deletions(-)
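To make the read-modify-write (RMW) point concrete, here is a small in-memory sketch of the two strategies for a sub-sector write. It is an illustration under stated assumptions, not QEMU code: the image is a plain byte array, the sector size is assumed to be 512, the image size is assumed to be a multiple of the sector size, and `struct io_stats`, `write_sector_granular()` and `write_byte_granular()` are hypothetical names. As noted in the review, the kernel may still perform its own RMW below this layer; the sketch only models what the caller has to do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512u /* assumed sector size */

/* Counts the bytes each strategy moves, to make the extra traffic visible. */
struct io_stats {
    uint64_t bytes_read;
    uint64_t bytes_written;
};

/* Returns nonzero when [offset, offset + bytes) lies inside the image. */
static int range_ok(uint64_t image_size, uint64_t offset, uint64_t bytes)
{
    return bytes <= image_size && offset <= image_size - bytes;
}

/*
 * Sector-granular path: the caller may only transfer whole sectors, so a
 * small update reads the covering sectors into a bounce buffer, patches
 * them, and writes them all back (the RMW cycle).
 */
static int write_sector_granular(uint8_t *image, uint64_t image_size,
                                 uint64_t offset, const void *buf,
                                 uint64_t bytes, struct io_stats *st)
{
    if (!range_ok(image_size, offset, bytes)) {
        return -1;
    }
    if (bytes == 0) {
        return 0;
    }

    uint64_t start = offset - offset % SECTOR_SIZE;            /* round down */
    uint64_t end = offset + bytes;
    end += (SECTOR_SIZE - end % SECTOR_SIZE) % SECTOR_SIZE;    /* round up */
    if (end > image_size) {
        return -1; /* image size is assumed to be sector-aligned */
    }

    uint8_t *bounce = malloc(end - start);
    if (!bounce) {
        return -1;
    }
    memcpy(bounce, image + start, end - start);                /* read */
    st->bytes_read += end - start;
    memcpy(bounce + (offset - start), buf, bytes);             /* modify */
    memcpy(image + start, bounce, end - start);                /* write */
    st->bytes_written += end - start;
    free(bounce);
    return 0;
}

/* Byte-granular path: the request is passed through unchanged. */
static int write_byte_granular(uint8_t *image, uint64_t image_size,
                               uint64_t offset, const void *buf,
                               uint64_t bytes, struct io_stats *st)
{
    if (!range_ok(image_size, offset, bytes)) {
        return -1;
    }
    memcpy(image + offset, buf, bytes);
    st->bytes_written += bytes;
    return 0;
}
```

For an 8-byte metadata entry written at offset 520, the sector-granular path reads and rewrites one full 512-byte sector, while the byte-granular path touches only the 8 bytes it was asked to write.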