Message ID | 1429871600-10180-3-git-send-email-famz@redhat.com |
---|---|
State | New |
On 24/04/2015 12:33, Fam Zheng wrote:
> For zero write, qiov passed by callers (qemu-io "write -z" and
> scsi-disk "write same") is NULL.
>
> Commit fc3959e466 fixed bdrv_co_write_zeroes which is the common case
> for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
> fix would be in bdrv_co_do_pwritev which is the NULL dereference point
> and covers both cases.
>
> So don't access it in bdrv_co_do_pwritev in this case, use three aligned
> writes.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 61 insertions(+), 17 deletions(-)
>
> diff --git a/block.c b/block.c
> index 0fe97de..cbd0708 100644
> --- a/block.c
> +++ b/block.c
> @@ -3403,6 +3403,8 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>       */
>      tracked_request_begin(&req, bs, offset, bytes, true);
>
> +    assert(qiov || flags & BDRV_REQ_ZERO_WRITE);

Perhaps as a follow-up you can add

    if ((flags & (BDRV_REQ_ZERO_WRITE|BDRV_REQ_MAY_UNMAP)) ==
        (BDRV_REQ_ZERO_WRITE|BDRV_REQ_MAY_UNMAP)) {
        qiov = NULL;
    }

so that the central area is always unmapped. You can have non-NULL qiov
if the flags were added because of detect-zeroes=unmap. But in any case
that would be a separate change.

>      if (offset & (align - 1)) {
>          QEMUIOVector head_qiov;
>          struct iovec head_iov;
> @@ -3425,13 +3427,37 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>          }
>          BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_HEAD);
>
> -        qemu_iovec_init(&local_qiov, qiov->niov + 2);
> -        qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
> -        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> -        use_local_qiov = true;
> +        if (qiov) {
> +            qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 2 : 1);
> +            qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
> +            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> +            use_local_qiov = true;
> +            bytes += offset & (align - 1);
> +            offset = offset & ~(align - 1);
> +        } else {
> +            memset(head_buf + (offset & (align - 1)), 0,
> +                   align - (offset & (align - 1)));
> +            ret = bdrv_aligned_pwritev(bs, &req, offset & ~(align - 1), align,
> +                                       &head_qiov, 0);
> +            if (ret < 0) {
> +                goto fail;
> +            }
> +            bytes -= align - (offset & (align - 1));
> +            offset = ROUND_UP(offset, align);
> +        }
> +    }
>
> -        bytes += offset & (align - 1);
> -        offset = offset & ~(align - 1);
> +    if (!qiov) {
> +        uint64_t aligned_bytes = bytes & ~(align - 1);
> +
> +        assert((offset & (align - 1)) == 0);
> +        ret = bdrv_aligned_pwritev(bs, &req, offset, aligned_bytes,
> +                                   NULL, flags);
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +        bytes -= aligned_bytes;
> +        offset += aligned_bytes;
>      }
>
>      if ((offset + bytes) & (align - 1)) {
> @@ -3459,21 +3485,39 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>          }
>          BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);
>
> -        if (!use_local_qiov) {
> -            qemu_iovec_init(&local_qiov, qiov->niov + 1);
> -            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> -            use_local_qiov = true;
> +        if (qiov) {
> +            if (!use_local_qiov) {
> +                qemu_iovec_init(&local_qiov, qiov->niov + 1);
> +                qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> +                use_local_qiov = true;
> +            }
> +
> +            tail_bytes = (offset + bytes) & (align - 1);
> +            qemu_iovec_add(&local_qiov, tail_buf + tail_bytes,
> +                           align - tail_bytes);
> +
> +            bytes = ROUND_UP(bytes, align);
> +        } else {
> +            assert((offset & (align - 1)) == 0);
> +            assert(bytes < align);
> +
> +            memset(tail_buf, 0, bytes & (align - 1));
> +            ret = bdrv_aligned_pwritev(bs, &req, offset, align,
> +                                       &tail_qiov, 0);
> +            if (ret < 0) {
> +                goto fail;
> +            }
> +            offset += align;
> +            bytes = 0;
>          }
>
> -        tail_bytes = (offset + bytes) & (align - 1);
> -        qemu_iovec_add(&local_qiov, tail_buf + tail_bytes, align - tail_bytes);
> -
> -        bytes = ROUND_UP(bytes, align);
>      }
>
> -    ret = bdrv_aligned_pwritev(bs, &req, offset, bytes,
> -                               use_local_qiov ? &local_qiov : qiov,
> -                               flags);
> +    if (bytes) {
> +        ret = bdrv_aligned_pwritev(bs, &req, offset, bytes,
> +                                   use_local_qiov ? &local_qiov : qiov,
> +                                   flags);
> +    }
>
>  fail:
>      tracked_request_end(&req);
>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
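
A note on the follow-up check suggested above: it compares a masked value against a combined mask, and in C `==` binds tighter than `|`, so the right-hand side needs its own parentheses. The sketch below is only an illustration; the flag names and values are hypothetical stand-ins, not QEMU's real BDRV_REQ_* definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration; QEMU defines its own values. */
enum {
    REQ_ZERO_WRITE = 1 << 0,
    REQ_MAY_UNMAP  = 1 << 1,
    REQ_OTHER      = 1 << 2,
};

/* True only when BOTH bits are set in flags. */
static bool zero_write_may_unmap(int flags)
{
    const int mask = REQ_ZERO_WRITE | REQ_MAY_UNMAP;

    return (flags & mask) == mask;
}

/*
 * How "(flags & (A|B)) == A|B" parses without the extra parentheses:
 * the comparison happens first and its 0/1 result is then OR-ed with B,
 * so the expression is nonzero for every value of flags.
 */
static bool unparenthesized(int flags)
{
    return ((flags & (REQ_ZERO_WRITE | REQ_MAY_UNMAP)) == REQ_ZERO_WRITE)
           | REQ_MAY_UNMAP;
}
```

With these stand-in values, zero_write_may_unmap() rejects a plain zero write, while the unparenthesized form accepts any flags at all.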
On 24/04/2015 13:00, Paolo Bonzini wrote:
>> -        qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
>> -        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
>> -        use_local_qiov = true;
>> +        if (qiov) {
>> +            qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 2 : 1);
>> +            qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
>> +            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
>> +            use_local_qiov = true;
>> +            bytes += offset & (align - 1);
>> +            offset = offset & ~(align - 1);
>> +        } else {
>> +            memset(head_buf + (offset & (align - 1)), 0,
>> +                   align - (offset & (align - 1)));

Actually, is the byte count correct if bytes < align? In the case of
your testcase, you'd destroy bytes 1536..4095.

Same for the computation of bytes, below. It could underflow.

Perhaps a qemu-iotests testcase, using qemu-io, is also necessary.

Paolo

>> +            ret = bdrv_aligned_pwritev(bs, &req, offset & ~(align - 1), align,
>> +                                       &head_qiov, 0);
>> +            if (ret < 0) {
>> +                goto fail;
>> +            }
>> +            bytes -= align - (offset & (align - 1));
>> +            offset = ROUND_UP(offset, align);
>> +        }
>> +    }
On Fri, 04/24 13:51, Paolo Bonzini wrote:
>
>
> On 24/04/2015 13:00, Paolo Bonzini wrote:
> >> -        qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
> >> -        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> >> -        use_local_qiov = true;
> >> +        if (qiov) {
> >> +            qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 2 : 1);
> >> +            qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
> >> +            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> >> +            use_local_qiov = true;
> >> +            bytes += offset & (align - 1);
> >> +            offset = offset & ~(align - 1);
> >> +        } else {
> >> +            memset(head_buf + (offset & (align - 1)), 0,
> >> +                   align - (offset & (align - 1)));
>
> Actually, is the byte count correct if bytes < align? In the case of
> your testcase, you'd destroy bytes 1536..4095.

Yes, good catch!

Fam

>
> Same for the computation of bytes, below. It could underflow.
>
> Perhaps a qemu-iotests testcase, using qemu-io, is also necessary.
>
> Paolo
>
> >> +            ret = bdrv_aligned_pwritev(bs, &req, offset & ~(align - 1), align,
> >> +                                       &head_qiov, 0);
> >> +            if (ret < 0) {
> >> +                goto fail;
> >> +            }
> >> +            bytes -= align - (offset & (align - 1));
> >> +            offset = ROUND_UP(offset, align);
> >> +        }
> >> +    }
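
To make the review point concrete, here is a small stand-alone sketch of the head length computation. It is not QEMU code: the helper names are hypothetical, and the numbers (offset 512, length 1024, alignment 4096) are an assumption chosen only because they reproduce the "1536..4095" range mentioned in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Length zeroed by the posted memset(): everything from offset to the end
 * of the aligned block, regardless of how long the request is.
 * align must be a power of two.
 */
static uint64_t head_len_unclamped(uint64_t offset, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return align - (offset & (align - 1));
}

/*
 * One possible fix (an assumption, not taken from the thread): never zero
 * more than the request actually covers.
 */
static uint64_t head_len_clamped(uint64_t offset, uint64_t bytes,
                                 uint64_t align)
{
    uint64_t room = head_len_unclamped(offset, align);

    return bytes < room ? bytes : room;
}
```

With offset 512 and alignment 4096 the unclamped length is 3584, so the zeroed span runs from 512 to 4095 even though a 1024-byte request ends at 1535. Subtracting 3584 from an unsigned byte count of 1024 would also wrap around, which is the underflow raised above; the clamped length of 1024 leaves a remainder of zero.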
diff --git a/block.c b/block.c
index 0fe97de..cbd0708 100644
--- a/block.c
+++ b/block.c
@@ -3403,6 +3403,8 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
      */
     tracked_request_begin(&req, bs, offset, bytes, true);
 
+    assert(qiov || flags & BDRV_REQ_ZERO_WRITE);
+
     if (offset & (align - 1)) {
         QEMUIOVector head_qiov;
         struct iovec head_iov;
@@ -3425,13 +3427,37 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
         }
         BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_HEAD);
 
-        qemu_iovec_init(&local_qiov, qiov->niov + 2);
-        qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
-        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
-        use_local_qiov = true;
+        if (qiov) {
+            qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 2 : 1);
+            qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
+            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
+            use_local_qiov = true;
+            bytes += offset & (align - 1);
+            offset = offset & ~(align - 1);
+        } else {
+            memset(head_buf + (offset & (align - 1)), 0,
+                   align - (offset & (align - 1)));
+            ret = bdrv_aligned_pwritev(bs, &req, offset & ~(align - 1), align,
+                                       &head_qiov, 0);
+            if (ret < 0) {
+                goto fail;
+            }
+            bytes -= align - (offset & (align - 1));
+            offset = ROUND_UP(offset, align);
+        }
+    }
 
-        bytes += offset & (align - 1);
-        offset = offset & ~(align - 1);
+    if (!qiov) {
+        uint64_t aligned_bytes = bytes & ~(align - 1);
+
+        assert((offset & (align - 1)) == 0);
+        ret = bdrv_aligned_pwritev(bs, &req, offset, aligned_bytes,
+                                   NULL, flags);
+        if (ret < 0) {
+            goto fail;
+        }
+        bytes -= aligned_bytes;
+        offset += aligned_bytes;
     }
 
     if ((offset + bytes) & (align - 1)) {
@@ -3459,21 +3485,39 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
         }
         BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);
 
-        if (!use_local_qiov) {
-            qemu_iovec_init(&local_qiov, qiov->niov + 1);
-            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
-            use_local_qiov = true;
+        if (qiov) {
+            if (!use_local_qiov) {
+                qemu_iovec_init(&local_qiov, qiov->niov + 1);
+                qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
+                use_local_qiov = true;
+            }
+
+            tail_bytes = (offset + bytes) & (align - 1);
+            qemu_iovec_add(&local_qiov, tail_buf + tail_bytes,
+                           align - tail_bytes);
+
+            bytes = ROUND_UP(bytes, align);
+        } else {
+            assert((offset & (align - 1)) == 0);
+            assert(bytes < align);
+
+            memset(tail_buf, 0, bytes & (align - 1));
+            ret = bdrv_aligned_pwritev(bs, &req, offset, align,
+                                       &tail_qiov, 0);
+            if (ret < 0) {
+                goto fail;
+            }
+            offset += align;
+            bytes = 0;
         }
 
-        tail_bytes = (offset + bytes) & (align - 1);
-        qemu_iovec_add(&local_qiov, tail_buf + tail_bytes, align - tail_bytes);
-
-        bytes = ROUND_UP(bytes, align);
     }
 
-    ret = bdrv_aligned_pwritev(bs, &req, offset, bytes,
-                               use_local_qiov ? &local_qiov : qiov,
-                               flags);
+    if (bytes) {
+        ret = bdrv_aligned_pwritev(bs, &req, offset, bytes,
+                                   use_local_qiov ? &local_qiov : qiov,
+                                   flags);
+    }
 
 fail:
     tracked_request_end(&req);
For zero write, qiov passed by callers (qemu-io "write -z" and
scsi-disk "write same") is NULL.

Commit fc3959e466 fixed bdrv_co_write_zeroes which is the common case
for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
fix would be in bdrv_co_do_pwritev which is the NULL dereference point
and covers both cases.

So don't access it in bdrv_co_do_pwritev in this case, use three aligned
writes.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 61 insertions(+), 17 deletions(-)
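
The "three aligned writes" idea in the commit message can be sketched independently of the block layer as a pure offset/length computation: a partial head block, a run of whole blocks, and a partial tail block. The struct and function names below are hypothetical, and the head length is clamped to the request length, which is an assumption of this sketch rather than what the posted diff does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of how an unaligned zero write is split. */
typedef struct {
    uint64_t head_len;    /* bytes zeroed in the leading partial block */
    uint64_t mid_offset;  /* start of the whole-block region */
    uint64_t mid_len;     /* multiple of align, may be 0 */
    uint64_t tail_len;    /* bytes zeroed in the trailing partial block */
} ZeroSplit;

/* Split [offset, offset + bytes) at align boundaries; align is a power of two. */
static ZeroSplit split_zero_write(uint64_t offset, uint64_t bytes,
                                  uint64_t align)
{
    ZeroSplit s = {0};
    uint64_t in_block = offset & (align - 1);

    assert(align != 0 && (align & (align - 1)) == 0);

    if (in_block) {
        uint64_t room = align - in_block;

        /* Clamp so a short request never spills past its own end. */
        s.head_len = bytes < room ? bytes : room;
        offset += s.head_len;
        bytes -= s.head_len;
    }

    /* If bytes is now 0 the remaining fields describe empty regions. */
    s.mid_offset = offset;
    s.mid_len = bytes & ~(align - 1);
    s.tail_len = bytes - s.mid_len;
    return s;
}
```

In this picture only the middle region can be handed down with a NULL qiov and the zero-write flags; the head and tail go through read-modify-write buffers, as in the diff above.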