Message ID | 1464686130-12265-4-git-send-email-den@openvz.org |
---|---|
State | New |
On Tue, May 31, 2016 at 12:15:22PM +0300, Denis V. Lunev wrote:
> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>
> Added implementation of the qcow2_co_write_compressed function that
> will allow us to safely use compressed writes for the qcow2 from running VMs.
>
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Jeff Cody <jcody@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: John Snow <jsnow@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 89 ++++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 52 insertions(+), 37 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On 05/31/2016 03:15 AM, Denis V. Lunev wrote:
> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>
> Added implementation of the qcow2_co_write_compressed function that
> will allow us to safely use compressed writes for the qcow2 from running VMs.
>
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Jeff Cody <jcody@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: John Snow <jsnow@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 89 ++++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 52 insertions(+), 37 deletions(-)
>
> diff --git a/block/qcow2.c b/block/qcow2.c
> index c9306a7..38caa66 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2535,13 +2535,16 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
>
>  /* XXX: put compressed sectors first, then all the cluster aligned
>     tables to avoid losing bytes in alignment */
> -static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
> -                                  const uint8_t *buf, int nb_sectors)
> +static coroutine_fn int
> +qcow2_co_write_compressed(BlockDriverState *bs, int64_t sector_num,
> +                          int nb_sectors, QEMUIOVector *qiov)

Is it worth converting to a byte-based qcow2_co_pwrite_compressed()
while at it?
On 13.06.2016 23:14, Eric Blake wrote:
> On 05/31/2016 03:15 AM, Denis V. Lunev wrote:
>> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>>
>> Added implementation of the qcow2_co_write_compressed function that
>> will allow us to safely use compressed writes for the qcow2 from running VMs.
>>
>> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Jeff Cody <jcody@redhat.com>
>> CC: Markus Armbruster <armbru@redhat.com>
>> CC: Eric Blake <eblake@redhat.com>
>> CC: John Snow <jsnow@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> ---
>>  block/qcow2.c | 89 ++++++++++++++++++++++++++++++++++-------------------------
>>  1 file changed, 52 insertions(+), 37 deletions(-)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index c9306a7..38caa66 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -2535,13 +2535,16 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
>>
>>  /* XXX: put compressed sectors first, then all the cluster aligned
>>     tables to avoid losing bytes in alignment */
>> -static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
>> -                                  const uint8_t *buf, int nb_sectors)
>> +static coroutine_fn int
>> +qcow2_co_write_compressed(BlockDriverState *bs, int64_t sector_num,
>> +                          int nb_sectors, QEMUIOVector *qiov)
>
> Is it worth converting to a byte-based qcow2_co_pwrite_compressed()
> while at it?
>

Yes, I'll do it for the next version.
On 22.06.2016 at 14:27, Pavel Butsykin wrote:
> On 13.06.2016 23:14, Eric Blake wrote:
> >On 05/31/2016 03:15 AM, Denis V. Lunev wrote:
> >>From: Pavel Butsykin <pbutsykin@virtuozzo.com>
> >>
> >>Added implementation of the qcow2_co_write_compressed function that
> >>will allow us to safely use compressed writes for the qcow2 from running VMs.
> >>
> >>Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> >>Signed-off-by: Denis V. Lunev <den@openvz.org>
> >>CC: Jeff Cody <jcody@redhat.com>
> >>CC: Markus Armbruster <armbru@redhat.com>
> >>CC: Eric Blake <eblake@redhat.com>
> >>CC: John Snow <jsnow@redhat.com>
> >>CC: Stefan Hajnoczi <stefanha@redhat.com>
> >>CC: Kevin Wolf <kwolf@redhat.com>
> >>---
> >> block/qcow2.c | 89 ++++++++++++++++++++++++++++++++++-------------------------
> >> 1 file changed, 52 insertions(+), 37 deletions(-)
> >>
> >>diff --git a/block/qcow2.c b/block/qcow2.c
> >>index c9306a7..38caa66 100644
> >>--- a/block/qcow2.c
> >>+++ b/block/qcow2.c
> >>@@ -2535,13 +2535,16 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
> >>
> >>  /* XXX: put compressed sectors first, then all the cluster aligned
> >>     tables to avoid losing bytes in alignment */
> >>-static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
> >>-                                  const uint8_t *buf, int nb_sectors)
> >>+static coroutine_fn int
> >>+qcow2_co_write_compressed(BlockDriverState *bs, int64_t sector_num,
> >>+                          int nb_sectors, QEMUIOVector *qiov)
> >
> >Is it worth converting to a byte-based qcow2_co_pwrite_compressed()
> >while at it?
>
> Yes, I'll do it for the next version.

I think it makes sense to do this in two steps. That is, one patch for
making the function vectored and a second one for making it byte-based.

Of course, you can take bytes as a unit even in the first patch, but
then just convert it to sector numbers immediately, so that the actual
conversion comes separately.

Kevin
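
For readers following the thread, the intermediate step described above (accept bytes at the interface, convert to sectors right away) might look roughly like the sketch below. This is not QEMU code: `pwrite_compressed()`, `write_compressed_sectors()`, and the `last_*` variables are hypothetical stand-ins, and the 512-byte sector size is assumed from the `sector_num << 9` in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

#define SECTOR_BITS 9                    /* 512-byte sectors, cf. "sector_num << 9" */
#define SECTOR_SIZE (1u << SECTOR_BITS)

/* Record of the last sector-based call, only so the sketch can be checked. */
static int64_t last_sector_num = -1;
static int last_nb_sectors = -1;

/* Hypothetical stand-in for the existing sector-based implementation. */
static int write_compressed_sectors(int64_t sector_num, int nb_sectors)
{
    assert(sector_num >= 0 && nb_sectors >= 0);
    last_sector_num = sector_num;
    last_nb_sectors = nb_sectors;
    return 0;
}

/*
 * Byte-based entry point: reject requests that are not sector-aligned,
 * then convert to sector units immediately, leaving the body unchanged.
 */
static int pwrite_compressed(uint64_t offset, uint64_t bytes)
{
    if ((offset | bytes) & (SECTOR_SIZE - 1)) {
        return -EINVAL;
    }
    if ((bytes >> SECTOR_BITS) > INT_MAX ||
        (offset >> SECTOR_BITS) > INT64_MAX) {
        return -EINVAL;
    }
    return write_compressed_sectors((int64_t)(offset >> SECTOR_BITS),
                                    (int)(bytes >> SECTOR_BITS));
}
```

With 64 KiB clusters, `pwrite_compressed(65536, 65536)` would reach the sector-based body as sector 128, length 128. The point of the split is that a later patch can drop the conversion and work in bytes throughout without touching the interface again.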
On 31.05.2016 at 11:15, Denis V. Lunev wrote:
> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>
> Added implementation of the qcow2_co_write_compressed function that
> will allow us to safely use compressed writes for the qcow2 from running VMs.
>
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Jeff Cody <jcody@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: John Snow <jsnow@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 89 ++++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 52 insertions(+), 37 deletions(-)
>
> diff --git a/block/qcow2.c b/block/qcow2.c
> index c9306a7..38caa66 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2535,13 +2535,16 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
>
>  /* XXX: put compressed sectors first, then all the cluster aligned
>     tables to avoid losing bytes in alignment */
> -static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
> -                                  const uint8_t *buf, int nb_sectors)
> +static coroutine_fn int
> +qcow2_co_write_compressed(BlockDriverState *bs, int64_t sector_num,
> +                          int nb_sectors, QEMUIOVector *qiov)
>  {
>      BDRVQcow2State *s = bs->opaque;
> +    QEMUIOVector hd_qiov;
> +    struct iovec iov;
>      z_stream strm;
>      int ret, out_len;
> -    uint8_t *out_buf;
> +    uint8_t *buf, *out_buf;
>      uint64_t cluster_offset;
>
>      if (nb_sectors == 0) {
> @@ -2551,29 +2554,25 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
>          return bdrv_truncate(bs->file->bs, cluster_offset);
>      }
>
> +    buf = qemu_blockalign(bs, s->cluster_size);
>      if (nb_sectors != s->cluster_sectors) {
> -        ret = -EINVAL;
> -
> -        /* Zero-pad last write if image size is not cluster aligned */
> -        if (sector_num + nb_sectors == bs->total_sectors &&
> -            nb_sectors < s->cluster_sectors) {
> -            uint8_t *pad_buf = qemu_blockalign(bs, s->cluster_size);
> -            memset(pad_buf, 0, s->cluster_size);
> -            memcpy(pad_buf, buf, nb_sectors * BDRV_SECTOR_SIZE);
> -            ret = qcow2_write_compressed(bs, sector_num,
> -                                         pad_buf, s->cluster_sectors);
> -            qemu_vfree(pad_buf);
> +        if (nb_sectors > s->cluster_sectors ||
> +            sector_num + nb_sectors != bs->total_sectors)
> +        {
> +            qemu_vfree(buf);
> +            return -EINVAL;
>          }
> -        return ret;
> +        /* Zero-pad last write if image size is not cluster aligned */
> +        memset(buf, 0, s->cluster_size);
>      }
> +    qemu_iovec_to_buf(qiov, 0, buf, qiov->size);

This looks less related to the new interface, but more like an unrelated
(but still worthwhile) cleanup to avoid the recursion.

Can we separate this out as a cleanup patch before this one?

Also, the last parameter of qemu_iovec_to_buf() should be
s->cluster_size, it's the buffer size and not the qiov size.
Additionally, we may want to assert(qiov->size == s->cluster_size).

>      out_buf = g_malloc(s->cluster_size + (s->cluster_size / 1000) + 128);
>
>      /* best compression, small window, no zlib header */
>      memset(&strm, 0, sizeof(strm));
> -    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION,
> -                       Z_DEFLATED, -12,
> -                       9, Z_DEFAULT_STRATEGY);
> +    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
> +                       -12, 9, Z_DEFAULT_STRATEGY);

Unrelated reformatting? Let's drop this, so the semantic changes in the
patch become more visible.

>      if (ret != 0) {
>          ret = -EINVAL;
>          goto fail;
> @@ -2595,34 +2594,50 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
>      deflateEnd(&strm);
>
>      if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
> +        iov = (struct iovec) {
> +            .iov_base = buf,
> +            .iov_len = out_len,
> +        };
> +        qemu_iovec_init_external(&hd_qiov, &iov, 1);
>          /* could not compress: write normal cluster */
> -        ret = bdrv_write(bs, sector_num, buf, s->cluster_sectors);
> +        ret = qcow2_co_writev(bs, sector_num, s->cluster_sectors, &hd_qiov);

Now that it's qcow2_co_pwritev(), you can probably just use the existing
qiov.

>          if (ret < 0) {
>              goto fail;
>          }
> -    } else {
> -        cluster_offset = qcow2_alloc_compressed_cluster_offset(bs,
> -            sector_num << 9, out_len);
> -        if (!cluster_offset) {
> -            ret = -EIO;
> -            goto fail;
> -        }
> -        cluster_offset &= s->cluster_offset_mask;
> +        goto success;
> +    }
>
> -        ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
> -        if (ret < 0) {
> -            goto fail;
> -        }
> +    qemu_co_mutex_lock(&s->lock);
> +    cluster_offset = \

That backslash isn't necessary.

> +        qcow2_alloc_compressed_cluster_offset(bs, sector_num << 9, out_len);
> +    if (!cluster_offset) {
> +        qemu_co_mutex_unlock(&s->lock);
> +        ret = -EIO;
> +        goto fail;
> +    }
> +    cluster_offset &= s->cluster_offset_mask;

Kevin
On 28.06.2016 14:30, Kevin Wolf wrote:
> On 31.05.2016 at 11:15, Denis V. Lunev wrote:
>> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>>
>> Added implementation of the qcow2_co_write_compressed function that
>> will allow us to safely use compressed writes for the qcow2 from running VMs.
>>
>> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Jeff Cody <jcody@redhat.com>
>> CC: Markus Armbruster <armbru@redhat.com>
>> CC: Eric Blake <eblake@redhat.com>
>> CC: John Snow <jsnow@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> ---
>>  block/qcow2.c | 89 ++++++++++++++++++++++++++++++++++-------------------------
>>  1 file changed, 52 insertions(+), 37 deletions(-)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index c9306a7..38caa66 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -2535,13 +2535,16 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
>>
>>  /* XXX: put compressed sectors first, then all the cluster aligned
>>     tables to avoid losing bytes in alignment */
>> -static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
>> -                                  const uint8_t *buf, int nb_sectors)
>> +static coroutine_fn int
>> +qcow2_co_write_compressed(BlockDriverState *bs, int64_t sector_num,
>> +                          int nb_sectors, QEMUIOVector *qiov)
>>  {
>>      BDRVQcow2State *s = bs->opaque;
>> +    QEMUIOVector hd_qiov;
>> +    struct iovec iov;
>>      z_stream strm;
>>      int ret, out_len;
>> -    uint8_t *out_buf;
>> +    uint8_t *buf, *out_buf;
>>      uint64_t cluster_offset;
>>
>>      if (nb_sectors == 0) {
>> @@ -2551,29 +2554,25 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
>>          return bdrv_truncate(bs->file->bs, cluster_offset);
>>      }
>>
>> +    buf = qemu_blockalign(bs, s->cluster_size);
>>      if (nb_sectors != s->cluster_sectors) {
>> -        ret = -EINVAL;
>> -
>> -        /* Zero-pad last write if image size is not cluster aligned */
>> -        if (sector_num + nb_sectors == bs->total_sectors &&
>> -            nb_sectors < s->cluster_sectors) {
>> -            uint8_t *pad_buf = qemu_blockalign(bs, s->cluster_size);
>> -            memset(pad_buf, 0, s->cluster_size);
>> -            memcpy(pad_buf, buf, nb_sectors * BDRV_SECTOR_SIZE);
>> -            ret = qcow2_write_compressed(bs, sector_num,
>> -                                         pad_buf, s->cluster_sectors);
>> -            qemu_vfree(pad_buf);
>> +        if (nb_sectors > s->cluster_sectors ||
>> +            sector_num + nb_sectors != bs->total_sectors)
>> +        {
>> +            qemu_vfree(buf);
>> +            return -EINVAL;
>>          }
>> -        return ret;
>> +        /* Zero-pad last write if image size is not cluster aligned */
>> +        memset(buf, 0, s->cluster_size);
>>      }
>> +    qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
>
> This looks less related to the new interface, but more like an unrelated
> (but still worthwhile) cleanup to avoid the recursion.
>
> Can we separate this out as a cleanup patch before this one?
>

We can :)

> Also, the last parameter of qemu_iovec_to_buf() should be
> s->cluster_size, it's the buffer size and not the qiov size.
> Additionally, we may want to assert(qiov->size == s->cluster_size).

It is not necessary, the qiov size can be less than s->cluster_size. In
this case, the remaining part of the cluster is filled with zeros.

>
>>      out_buf = g_malloc(s->cluster_size + (s->cluster_size / 1000) + 128);
>>
>>      /* best compression, small window, no zlib header */
>>      memset(&strm, 0, sizeof(strm));
>> -    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION,
>> -                       Z_DEFLATED, -12,
>> -                       9, Z_DEFAULT_STRATEGY);
>> +    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
>> +                       -12, 9, Z_DEFAULT_STRATEGY);
>
> Unrelated reformatting? Let's drop this, so the semantic changes in the
> patch become more visible.
>

ok

>>      if (ret != 0) {
>>          ret = -EINVAL;
>>          goto fail;
>> @@ -2595,34 +2594,50 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
>>      deflateEnd(&strm);
>>
>>      if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
>> +        iov = (struct iovec) {
>> +            .iov_base = buf,
>> +            .iov_len = out_len,
>> +        };
>> +        qemu_iovec_init_external(&hd_qiov, &iov, 1);
>>          /* could not compress: write normal cluster */
>> -        ret = bdrv_write(bs, sector_num, buf, s->cluster_sectors);
>> +        ret = qcow2_co_writev(bs, sector_num, s->cluster_sectors, &hd_qiov);
>
> Now that it's qcow2_co_pwritev(), you can probably just use the existing
> qiov.
>
>>          if (ret < 0) {
>>              goto fail;
>>          }
>> -    } else {
>> -        cluster_offset = qcow2_alloc_compressed_cluster_offset(bs,
>> -            sector_num << 9, out_len);
>> -        if (!cluster_offset) {
>> -            ret = -EIO;
>> -            goto fail;
>> -        }
>> -        cluster_offset &= s->cluster_offset_mask;
>> +        goto success;
>> +    }
>>
>> -        ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
>> -        if (ret < 0) {
>> -            goto fail;
>> -        }
>> +    qemu_co_mutex_lock(&s->lock);
>> +    cluster_offset = \
>
> That backslash isn't necessary.
>

I know it's just a marker.

>> +        qcow2_alloc_compressed_cluster_offset(bs, sector_num << 9, out_len);
>> +    if (!cluster_offset) {
>> +        qemu_co_mutex_unlock(&s->lock);
>> +        ret = -EIO;
>> +        goto fail;
>> +    }
>> +    cluster_offset &= s->cluster_offset_mask;
>
> Kevin
>
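
The two positions in the exchange above (bound the copy by the buffer size; allow a short final request and zero-fill the rest) are compatible, and a small standalone sketch may make that concrete. It deliberately does not use QEMU's `qemu_iovec_to_buf()`; `gather_zero_padded()` is a hypothetical helper over plain POSIX `struct iovec`, shown only to illustrate the intended behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Flatten an iovec array into a fixed-size cluster buffer.
 * The copy is bounded by buf_size (the destination), never by the total
 * iovec length, and any tail not covered by the input stays zeroed.
 * Returns the number of bytes actually copied from the iovec.
 */
static size_t gather_zero_padded(const struct iovec *iov, int iovcnt,
                                 uint8_t *buf, size_t buf_size)
{
    size_t done = 0;

    assert(buf != NULL);
    memset(buf, 0, buf_size);           /* zero-pad a short final cluster */

    for (int i = 0; i < iovcnt && done < buf_size; i++) {
        size_t n = iov[i].iov_len;
        if (n > buf_size - done) {
            n = buf_size - done;        /* never write past the buffer */
        }
        memcpy(buf + done, iov[i].iov_base, n);
        done += n;
    }
    return done;
}
```

A request shorter than the cluster leaves trailing zeros, which matches the "zero-pad last write" comment in the patch; a request longer than the cluster is truncated at the buffer boundary instead of overflowing it.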
diff --git a/block/qcow2.c b/block/qcow2.c
index c9306a7..38caa66 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2535,13 +2535,16 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
 
 /* XXX: put compressed sectors first, then all the cluster aligned
    tables to avoid losing bytes in alignment */
-static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
-                                  const uint8_t *buf, int nb_sectors)
+static coroutine_fn int
+qcow2_co_write_compressed(BlockDriverState *bs, int64_t sector_num,
+                          int nb_sectors, QEMUIOVector *qiov)
 {
     BDRVQcow2State *s = bs->opaque;
+    QEMUIOVector hd_qiov;
+    struct iovec iov;
     z_stream strm;
     int ret, out_len;
-    uint8_t *out_buf;
+    uint8_t *buf, *out_buf;
     uint64_t cluster_offset;
 
     if (nb_sectors == 0) {
@@ -2551,29 +2554,25 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
         return bdrv_truncate(bs->file->bs, cluster_offset);
     }
 
+    buf = qemu_blockalign(bs, s->cluster_size);
     if (nb_sectors != s->cluster_sectors) {
-        ret = -EINVAL;
-
-        /* Zero-pad last write if image size is not cluster aligned */
-        if (sector_num + nb_sectors == bs->total_sectors &&
-            nb_sectors < s->cluster_sectors) {
-            uint8_t *pad_buf = qemu_blockalign(bs, s->cluster_size);
-            memset(pad_buf, 0, s->cluster_size);
-            memcpy(pad_buf, buf, nb_sectors * BDRV_SECTOR_SIZE);
-            ret = qcow2_write_compressed(bs, sector_num,
-                                         pad_buf, s->cluster_sectors);
-            qemu_vfree(pad_buf);
+        if (nb_sectors > s->cluster_sectors ||
+            sector_num + nb_sectors != bs->total_sectors)
+        {
+            qemu_vfree(buf);
+            return -EINVAL;
         }
-        return ret;
+        /* Zero-pad last write if image size is not cluster aligned */
+        memset(buf, 0, s->cluster_size);
     }
+    qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
 
     out_buf = g_malloc(s->cluster_size + (s->cluster_size / 1000) + 128);
 
     /* best compression, small window, no zlib header */
     memset(&strm, 0, sizeof(strm));
-    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION,
-                       Z_DEFLATED, -12,
-                       9, Z_DEFAULT_STRATEGY);
+    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
+                       -12, 9, Z_DEFAULT_STRATEGY);
     if (ret != 0) {
         ret = -EINVAL;
         goto fail;
@@ -2595,34 +2594,50 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
     deflateEnd(&strm);
 
     if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
+        iov = (struct iovec) {
+            .iov_base = buf,
+            .iov_len = out_len,
+        };
+        qemu_iovec_init_external(&hd_qiov, &iov, 1);
         /* could not compress: write normal cluster */
-        ret = bdrv_write(bs, sector_num, buf, s->cluster_sectors);
+        ret = qcow2_co_writev(bs, sector_num, s->cluster_sectors, &hd_qiov);
         if (ret < 0) {
             goto fail;
         }
-    } else {
-        cluster_offset = qcow2_alloc_compressed_cluster_offset(bs,
-            sector_num << 9, out_len);
-        if (!cluster_offset) {
-            ret = -EIO;
-            goto fail;
-        }
-        cluster_offset &= s->cluster_offset_mask;
+        goto success;
+    }
 
-        ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
-        if (ret < 0) {
-            goto fail;
-        }
+    qemu_co_mutex_lock(&s->lock);
+    cluster_offset = \
+        qcow2_alloc_compressed_cluster_offset(bs, sector_num << 9, out_len);
+    if (!cluster_offset) {
+        qemu_co_mutex_unlock(&s->lock);
+        ret = -EIO;
+        goto fail;
+    }
+    cluster_offset &= s->cluster_offset_mask;
 
-        BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
-        ret = bdrv_pwrite(bs->file->bs, cluster_offset, out_buf, out_len);
-        if (ret < 0) {
-            goto fail;
-        }
+    ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
+    qemu_co_mutex_unlock(&s->lock);
+    if (ret < 0) {
+        goto fail;
     }
 
+    iov = (struct iovec) {
+        .iov_base = out_buf,
+        .iov_len = out_len,
+    };
+    qemu_iovec_init_external(&hd_qiov, &iov, 1);
+
+    BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
+    ret = bdrv_co_pwritev(bs->file->bs, cluster_offset, out_len, &hd_qiov, 0);
+    if (ret < 0) {
+        goto fail;
+    }
+success:
     ret = 0;
 fail:
+    qemu_vfree(buf);
     g_free(out_buf);
     return ret;
 }
@@ -3384,7 +3399,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_co_write_zeroes = qcow2_co_write_zeroes,
     .bdrv_co_discard = qcow2_co_discard,
     .bdrv_truncate = qcow2_truncate,
-    .bdrv_write_compressed = qcow2_write_compressed,
+    .bdrv_co_write_compressed = qcow2_co_write_compressed,
     .bdrv_make_empty = qcow2_make_empty,
 
     .bdrv_snapshot_create = qcow2_snapshot_create,