diff mbox series

[v5,07/23] block: Convert bdrv_get_block_status() to bytes

Message ID 20171004020048.26379-8-eblake@redhat.com
State New
Headers show
Series make bdrv_get_block_status byte-based | expand

Commit Message

Eric Blake Oct. 4, 2017, 2 a.m. UTC
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the name of the function from bdrv_get_block_status() to
bdrv_block_status() ensures that the compiler enforces that all
callers are updated.  For now, the io.c layer still assert()s that
all callers are sector-aligned, but that can be relaxed when a later
patch implements byte-based block status in the drivers.

Note that we have an inherent limitation in the BDRV_BLOCK_* return
values: BDRV_BLOCK_OFFSET_VALID can only return the start of a
sector, even if we later relax the interface to query for the status
starting at an intermediate byte; document the obvious interpretation
that valid offsets are always sector-relative.

Therefore, for the most part this patch is just the addition of scaling
at the callers followed by inverse scaling at bdrv_block_status().  But
some code, particularly bdrv_is_allocated(), gets a lot simpler because
it no longer has to mess with sectors.

For ease of review, bdrv_get_block_status_above() will be tackled
separately.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v5: drop pointless 'if (pnum)' [John], add comment
v4: no change
v3: clamp bytes to 32-bits, rather than asserting
v2: rebase to earlier changes
---
 include/block/block.h | 12 +++++++-----
 block/io.c            | 35 +++++++++++++++++++++++------------
 block/qcow2-cluster.c |  2 +-
 qemu-img.c            | 20 +++++++++++---------
 4 files changed, 42 insertions(+), 27 deletions(-)

Comments

Kevin Wolf Oct. 10, 2017, 2:46 p.m. UTC | #1
Am 04.10.2017 um 04:00 hat Eric Blake geschrieben:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  In the common case, allocation is unlikely to ever use
> values that are not naturally sector-aligned, but it is possible
> that byte-based values will let us be more precise about allocation
> at the end of an unaligned file that can do byte-based access.
> 
> Changing the name of the function from bdrv_get_block_status() to
> bdrv_block_status() ensures that the compiler enforces that all
> callers are updated.  For now, the io.c layer still assert()s that
> all callers are sector-aligned, but that can be relaxed when a later
> patch implements byte-based block status in the drivers.
> 
> Note that we have an inherent limitation in the BDRV_BLOCK_* return
> values: BDRV_BLOCK_OFFSET_VALID can only return the start of a
> sector, even if we later relax the interface to query for the status
> starting at an intermediate byte; document the obvious interpretation
> that valid offsets are always sector-relative.
> 
> Therefore, for the most part this patch is just the addition of scaling
> at the callers followed by inverse scaling at bdrv_block_status().  But
> some code, particularly bdrv_is_allocated(), gets a lot simpler because
> it no longer has to mess with sectors.
> 
> For ease of review, bdrv_get_block_status_above() will be tackled
> separately.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v5: drop pointless 'if (pnum)' [John], add comment
> v4: no change
> v3: clamp bytes to 32-bits, rather than asserting
> v2: rebase to earlier changes
> ---
>  include/block/block.h | 12 +++++++-----
>  block/io.c            | 35 +++++++++++++++++++++++------------
>  block/qcow2-cluster.c |  2 +-
>  qemu-img.c            | 20 +++++++++++---------
>  4 files changed, 42 insertions(+), 27 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index be49c4ae9d..4ecd2c4a65 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h

A bit above the first hunk:

 * Allocation status flags for bdrv_get_block_status() and friends.

This should be bdrv_block_status() now.

> @@ -138,8 +138,10 @@ typedef struct HDGeometry {
>   *
>   * If BDRV_BLOCK_OFFSET_VALID is set, bits 9-62 (BDRV_BLOCK_OFFSET_MASK)
>   * represent the offset in the returned BDS that is allocated for the
> - * corresponding raw data; however, whether that offset actually contains
> - * data also depends on BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO, as follows:
> + * corresponding raw data.  Individual bytes are at the same sector-relative
> + * locations (and thus, this bit cannot be set for mappings which are
> + * not equivalent modulo 512).  However, whether that offset actually
> + * contains data also depends on BDRV_BLOCK_DATA, as follows:

This suggests to me that it was a bad idea to embed the offset in the
bitmask. In the long run, we should probably make flags and offset two
separate things (e.g. making offset a new by-reference parameter).

Kevin
Eric Blake Oct. 10, 2017, 3:38 p.m. UTC | #2
On 10/10/2017 09:46 AM, Kevin Wolf wrote:
> Am 04.10.2017 um 04:00 hat Eric Blake geschrieben:
>> We are gradually moving away from sector-based interfaces, towards
>> byte-based.  In the common case, allocation is unlikely to ever use
>> values that are not naturally sector-aligned, but it is possible
>> that byte-based values will let us be more precise about allocation
>> at the end of an unaligned file that can do byte-based access.
>>
>> Changing the name of the function from bdrv_get_block_status() to
>> bdrv_block_status() ensures that the compiler enforces that all
>> callers are updated.  For now, the io.c layer still assert()s that
>> all callers are sector-aligned, but that can be relaxed when a later
>> patch implements byte-based block status in the drivers.
>>
>> Note that we have an inherent limitation in the BDRV_BLOCK_* return
>> values: BDRV_BLOCK_OFFSET_VALID can only return the start of a
>> sector, even if we later relax the interface to query for the status
>> starting at an intermediate byte; document the obvious interpretation
>> that valid offsets are always sector-relative.
>>
>> Therefore, for the most part this patch is just the addition of scaling
>> at the callers followed by inverse scaling at bdrv_block_status().  But
>> some code, particularly bdrv_is_allocated(), gets a lot simpler because
>> it no longer has to mess with sectors.
>>
>> For ease of review, bdrv_get_block_status_above() will be tackled
>> separately.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>

>> +++ b/include/block/block.h
> 
> A bit above the first hunk:
> 
>  * Allocation status flags for bdrv_get_block_status() and friends.
> 
> This should be bdrv_block_status() now.

Good catch. It wasn't there on v1, but I forgot to re-grep when rebasing.

>> @@ -138,8 +138,10 @@ typedef struct HDGeometry {
>>   *
>>   * If BDRV_BLOCK_OFFSET_VALID is set, bits 9-62 (BDRV_BLOCK_OFFSET_MASK)
>>   * represent the offset in the returned BDS that is allocated for the
>> - * corresponding raw data; however, whether that offset actually contains
>> - * data also depends on BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO, as follows:
>> + * corresponding raw data.  Individual bytes are at the same sector-relative
>> + * locations (and thus, this bit cannot be set for mappings which are
>> + * not equivalent modulo 512).  However, whether that offset actually
>> + * contains data also depends on BDRV_BLOCK_DATA, as follows:
> 
> This suggests to me that it was a bad idea to embed the offset in the
> bitmask. In the long run, we should probably make flags and offset two
> separate things (e.g. making offset a new by-reference parameter).

As this (plus the next series) is all about changing the driver
interface, NOW is as good a time as any to split those into two separate
parameters.  In fact, returning an offset only makes sense when the
caller also is asking for what file is associated with the offset, and
patch 1 of this series already changed several callers to pass NULL when
they don't care about the file.

Unless anyone has a reason why I should NOT split the offset from the
bitmask, I can proceed with making that change for v6 (it will be a bit
more rebase churn as a result - but thinking about the long-term
interface is worth the short-term pain).
diff mbox series

Patch

diff --git a/include/block/block.h b/include/block/block.h
index be49c4ae9d..4ecd2c4a65 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -138,8 +138,10 @@  typedef struct HDGeometry {
  *
  * If BDRV_BLOCK_OFFSET_VALID is set, bits 9-62 (BDRV_BLOCK_OFFSET_MASK)
  * represent the offset in the returned BDS that is allocated for the
- * corresponding raw data; however, whether that offset actually contains
- * data also depends on BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO, as follows:
+ * corresponding raw data.  Individual bytes are at the same sector-relative
+ * locations (and thus, this bit cannot be set for mappings which are
+ * not equivalent modulo 512).  However, whether that offset actually
+ * contains data also depends on BDRV_BLOCK_DATA, as follows:
  *
  * DATA ZERO OFFSET_VALID
  *  t    t        t       sectors read as zero, returned file is zero at offset
@@ -422,9 +424,9 @@  int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
-int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
-                              int nb_sectors, int *pnum,
-                              BlockDriverState **file);
+int64_t bdrv_block_status(BlockDriverState *bs, int64_t offset,
+                          int64_t bytes, int64_t *pnum,
+                          BlockDriverState **file);
 int64_t bdrv_get_block_status_above(BlockDriverState *bs,
                                     BlockDriverState *base,
                                     int64_t sector_num,
diff --git a/block/io.c b/block/io.c
index afba2da1c4..ab1853dc2d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -698,7 +698,6 @@  int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 {
     int64_t target_size, ret, bytes, offset = 0;
     BlockDriverState *bs = child->bs;
-    int n; /* sectors */

     target_size = bdrv_getlength(bs);
     if (target_size < 0) {
@@ -710,24 +709,23 @@  int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
         if (bytes <= 0) {
             return 0;
         }
-        ret = bdrv_get_block_status(bs, offset >> BDRV_SECTOR_BITS,
-                                    bytes >> BDRV_SECTOR_BITS, &n, NULL);
+        ret = bdrv_block_status(bs, offset, bytes, &bytes, NULL);
         if (ret < 0) {
             error_report("error getting block status at offset %" PRId64 ": %s",
                          offset, strerror(-ret));
             return ret;
         }
         if (ret & BDRV_BLOCK_ZERO) {
-            offset += n * BDRV_SECTOR_BITS;
+            offset += bytes;
             continue;
         }
-        ret = bdrv_pwrite_zeroes(child, offset, n * BDRV_SECTOR_SIZE, flags);
+        ret = bdrv_pwrite_zeroes(child, offset, bytes, flags);
         if (ret < 0) {
             error_report("error writing zeroes at offset %" PRId64 ": %s",
                          offset, strerror(-ret));
             return ret;
         }
-        offset += n * BDRV_SECTOR_SIZE;
+        offset += bytes;
     }
 }

@@ -2021,13 +2019,26 @@  int64_t bdrv_get_block_status_above(BlockDriverState *bs,
                                           nb_sectors, pnum, file);
 }

-int64_t bdrv_get_block_status(BlockDriverState *bs,
-                              int64_t sector_num,
-                              int nb_sectors, int *pnum,
-                              BlockDriverState **file)
+int64_t bdrv_block_status(BlockDriverState *bs,
+                          int64_t offset, int64_t bytes, int64_t *pnum,
+                          BlockDriverState **file)
 {
-    return bdrv_get_block_status_above(bs, backing_bs(bs),
-                                       sector_num, nb_sectors, pnum, file);
+    int64_t ret;
+    int n;
+
+    assert(QEMU_IS_ALIGNED(offset | bytes, BDRV_SECTOR_SIZE));
+    assert(pnum);
+    /*
+     * The contract allows us to return pnum smaller than bytes, even
+     * if the next query would see the same status; we truncate the
+     * request to avoid overflowing the driver's 32-bit interface.
+     */
+    bytes = MIN(bytes, BDRV_REQUEST_MAX_BYTES);
+    ret = bdrv_get_block_status_above(bs, backing_bs(bs),
+                                      offset >> BDRV_SECTOR_BITS,
+                                      bytes >> BDRV_SECTOR_BITS, &n, file);
+    *pnum = n * BDRV_SECTOR_SIZE;
+    return ret;
 }

 int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index d2518d1893..395763b612 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1634,7 +1634,7 @@  static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
          * cluster is already marked as zero, or if it's unallocated and we
          * don't have a backing file.
          *
-         * TODO We might want to use bdrv_get_block_status(bs) here, but we're
+         * TODO We might want to use bdrv_block_status(bs) here, but we're
          * holding s->lock, so that doesn't work today.
          *
          * If full_discard is true, the sector should not read back as zeroes,
diff --git a/qemu-img.c b/qemu-img.c
index af3effdec5..d226dcc166 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1599,9 +1599,14 @@  static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)

     if (s->sector_next_status <= sector_num) {
         if (s->target_has_backing) {
-            ret = bdrv_get_block_status(blk_bs(s->src[src_cur]),
-                                        sector_num - src_cur_offset,
-                                        n, &n, NULL);
+            int64_t count = n * BDRV_SECTOR_SIZE;
+
+            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
+                                    (sector_num - src_cur_offset) *
+                                    BDRV_SECTOR_SIZE,
+                                    count, &count, NULL);
+            assert(ret < 0 || QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
+            n = count >> BDRV_SECTOR_BITS;
         } else {
             ret = bdrv_get_block_status_above(blk_bs(s->src[src_cur]), NULL,
                                               sector_num - src_cur_offset,
@@ -2678,9 +2683,7 @@  static int get_block_status(BlockDriverState *bs, int64_t offset,
     int depth;
     BlockDriverState *file;
     bool has_offset;
-    int nb_sectors = bytes >> BDRV_SECTOR_BITS;

-    assert(bytes < INT_MAX);
     /* As an optimization, we could cache the current range of unallocated
      * clusters in each file of the chain, and avoid querying the same
      * range repeatedly.
@@ -2688,12 +2691,11 @@  static int get_block_status(BlockDriverState *bs, int64_t offset,

     depth = 0;
     for (;;) {
-        ret = bdrv_get_block_status(bs, offset >> BDRV_SECTOR_BITS, nb_sectors,
-                                    &nb_sectors, &file);
+        ret = bdrv_block_status(bs, offset, bytes, &bytes, &file);
         if (ret < 0) {
             return ret;
         }
-        assert(nb_sectors);
+        assert(bytes);
         if (ret & (BDRV_BLOCK_ZERO|BDRV_BLOCK_DATA)) {
             break;
         }
@@ -2710,7 +2712,7 @@  static int get_block_status(BlockDriverState *bs, int64_t offset,

     *e = (MapEntry) {
         .start = offset,
-        .length = nb_sectors * BDRV_SECTOR_SIZE,
+        .length = bytes,
         .data = !!(ret & BDRV_BLOCK_DATA),
         .zero = !!(ret & BDRV_BLOCK_ZERO),
         .offset = ret & BDRV_BLOCK_OFFSET_MASK,