[v10,0/9] qcow2: cluster space preallocation

Message ID 20181203101429.88735-1-anton.nefedov@virtuozzo.com

Anton Nefedov Dec. 3, 2018, 10:14 a.m. UTC
new in v10:
    - patches 1-3,6,7: rebase after REQ_WRITE_UNCHANGED
    - patch 3: drop supported_zero_flags. My bad, no write_zeroes in quorum.
    - patch 4: almost trivial rebase. RB-tags not stripped.
               Choose another constant for BDRV_REQ_ALLOCATE
    - patch 5: rebase. Instead of marking REQ_ALLOCATE serialising, accompany
               it with REQ_SERIALISING.
    - patch 7: add symmetric copy-on-read change
    - patch 8: trivial rebase. RB-tags not stripped.

v9: http://lists.nongnu.org/archive/html/qemu-devel/2018-05/msg01667.html
    - fixed commentary wording in patches 4, 8
    - rebased (no conflicts)

v8: http://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg03291.html

----

This series is a first step towards improving a few performance aspects of
the qcow2 format:

  1. non cluster-aligned write requests (to unallocated clusters) explicitly
     pad data with zeroes if there is no backing data.
     The resulting increase in the number of ops and the potential cluster
     fragmentation (on the host file) are already solved by:
       ee22a9d qcow2: Merge the writing of the COW regions with the guest data
     However, when the COW regions are zero, the padding can be avoided
     altogether: the whole clusters are instead preallocated and zeroed in
     a single efficient write_zeroes() operation.

  2. moreover, an efficient write_zeroes() operation can be used to preallocate
     space megabytes (a configurable amount) ahead, which gives a noticeable
     improvement on some storage types (e.g. distributed storage, where the
     space allocation operation might be expensive).
     (Not included in this patchset since v6).

  3. this will also make it possible to allow simultaneous writes to the same
     unallocated cluster after the space has been allocated & zeroed but before
     the first data is written and the cluster is linked to L2.
     (Not included in this patchset).
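
To illustrate point 1, here is a minimal sketch of how the head and tail
COW regions of a non cluster-aligned write can be computed. CowRegions and
cow_regions() are hypothetical names made up for this example, not qcow2
code, and the sketch assumes a power-of-two cluster size:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not a QEMU structure. */
typedef struct CowRegions {
    uint64_t head_offset; /* start of the first cluster touched by the write */
    uint64_t head_bytes;  /* padding between cluster start and guest data */
    uint64_t tail_offset; /* first byte after the guest data */
    uint64_t tail_bytes;  /* padding between guest data and cluster end */
} CowRegions;

/*
 * Compute the regions around a guest write that have to be filled
 * (with backing data, or with zeroes if there is none) so that only
 * whole clusters get allocated.
 */
static CowRegions cow_regions(uint64_t offset, uint64_t bytes,
                              uint64_t cluster_size)
{
    uint64_t mask = cluster_size - 1;
    uint64_t end = offset + bytes;
    CowRegions r;

    /* Assumption of this sketch: cluster size is a power of two. */
    assert(cluster_size > 0 && (cluster_size & mask) == 0);
    assert(bytes > 0);

    r.head_offset = offset & ~mask;
    r.head_bytes = offset & mask;
    r.tail_offset = end;
    /* The outer mask turns a full cluster of padding into zero padding. */
    r.tail_bytes = (cluster_size - (end & mask)) & mask;
    return r;
}
```

With 64k clusters, a 4k write at offset 4k gives a 4k head region and a 56k
tail region. When both are known to read as zeroes, those are the bytes that
would no longer need to be written explicitly.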

Efficient write_zeroes usually implies that the blocks are not actually
written to but only reserved and marked as zeroed by the storage.
In this patchset, the file-posix driver is marked as supporting this operation
if it supports (or is configured to support) the fallocate() operation.

The existing bdrv_write_zeroes() falls back to writing zero buffers if
write_zeroes is not supported by the driver.
A new flag (BDRV_REQ_ALLOCATE) is introduced to suppress that fallback and
return ENOTSUP instead.
Such allocate requests may also overlap with other requests. In that case
no wait is performed and an error is returned as well.
So the operation should be considered advisory, and a fallback scenario must
still be handled by the caller (in this case, the qcow2 driver).
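
The advisory semantics can be sketched with a small stand-alone model. All
names below (SketchDevice, SKETCH_REQ_ALLOCATE, sketch_write_zeroes(),
sketch_prepare_cluster()) are hypothetical and are not the QEMU API. The
errno used for an overlapping request (EAGAIN) is also an assumption of the
sketch, since the text above only says that an error is returned:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag: fail instead of falling back to zero buffers. */
enum { SKETCH_REQ_ALLOCATE = 1 << 0 };

/* Hypothetical in-memory stand-in for a block driver. */
typedef struct SketchDevice {
    uint8_t *data;
    size_t size;
    bool has_efficient_zeroes; /* e.g. fallocate() is usable */
    bool range_busy;           /* another request overlaps the range */
    unsigned slow_zero_writes; /* explicit zero-buffer writes performed */
} SketchDevice;

static int sketch_write_zeroes(SketchDevice *dev, size_t offset,
                               size_t bytes, int flags)
{
    assert(dev != NULL);
    if (offset > dev->size || bytes > dev->size - offset) {
        return -EINVAL;
    }
    if (flags & SKETCH_REQ_ALLOCATE) {
        if (!dev->has_efficient_zeroes) {
            return -ENOTSUP;   /* no fallback to zero buffers */
        }
        if (dev->range_busy) {
            return -EAGAIN;    /* assumed errno: do not wait, just fail */
        }
    } else if (!dev->has_efficient_zeroes) {
        dev->slow_zero_writes++; /* models the zero-buffer fallback */
    }
    memset(dev->data + offset, 0, bytes);
    return 0;
}

/*
 * Caller side: try the advisory request first and report whether the
 * caller still has to write the zero padding itself.
 */
static int sketch_prepare_cluster(SketchDevice *dev, size_t offset,
                                  size_t bytes, bool *need_zero_padding)
{
    int ret;

    assert(need_zero_padding != NULL);
    ret = sketch_write_zeroes(dev, offset, bytes, SKETCH_REQ_ALLOCATE);
    if (ret == 0) {
        *need_zero_padding = false;
        return 0;
    }
    if (ret == -ENOTSUP || ret == -EAGAIN) {
        *need_zero_padding = true; /* advisory request failed: fall back */
        return 0;
    }
    return ret;
}
```

The point of the model is that a failed allocate request is not an I/O
error for the caller. It only means that the old path (explicit zero
padding) has to be taken.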

simple perf test:

  qemu-img create -f qcow2 test.img 4G && \
  qemu-img bench -c $((1024*1024)) -f qcow2 -n -s 4k -t none -w test.img

test results (seconds):

    +-----------+--------------+--------------+------+
    |   file    |    before    |     after    | gain |
    +-----------+--------------+--------------+------+
    |    ssd    |      61.153  |      36.313  |  41% |
    |    hdd    |     112.676  |     122.056  |  -8% |
    +-----------+--------------+--------------+------+
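
For reference, the gain column appears to be the relative reduction in
run time, (before - after) / before. A minimal sketch of that arithmetic,
with gain_percent() being a hypothetical helper name:

```c
#include <assert.h>

/*
 * Relative run time reduction in percent. Positive means the run
 * got faster, negative means it got slower.
 */
static double gain_percent(double before_s, double after_s)
{
    assert(before_s > 0.0);
    return (before_s - after_s) / before_s * 100.0;
}
```

Applied to the timings above, this gives about 40.6 for ssd and about -8.3
for hdd, which matches the rounded values in the table.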

Anton Nefedov (9):
  mirror: inherit supported write/zero flags
  blkverify: set supported write/zero flags
  quorum: set supported write flags
  block: introduce BDRV_REQ_ALLOCATE flag
  block: treat BDRV_REQ_ALLOCATE as serialising
  file-posix: support BDRV_REQ_ALLOCATE
  block: support BDRV_REQ_ALLOCATE in passthrough drivers
  qcow2: skip writing zero buffers to empty COW areas
  iotest 134: test cluster-misaligned encrypted write

 qapi/block-core.json       |  4 +-
 block/qcow2.h              |  6 +++
 include/block/block.h      |  9 ++++-
 include/block/block_int.h  |  2 +-
 block/blkdebug.c           |  2 +-
 block/blkverify.c          | 10 ++++-
 block/copy-on-read.c       |  4 +-
 block/file-posix.c         |  8 +++-
 block/io.c                 | 49 ++++++++++++++++++-----
 block/mirror.c             |  8 +++-
 block/qcow2-cluster.c      |  2 +-
 block/qcow2.c              | 80 +++++++++++++++++++++++++++++++++++++-
 block/quorum.c             | 19 ++++++++-
 block/raw-format.c         |  2 +-
 block/trace-events         |  1 +
 tests/qemu-iotests/060     | 26 ++++++++-----
 tests/qemu-iotests/060.out |  5 ++-
 tests/qemu-iotests/134     |  9 +++++
 tests/qemu-iotests/134.out | 10 +++++
 19 files changed, 220 insertions(+), 36 deletions(-)