Patchwork [PATCHv3,1/2] qemu-img: find the highest offset in use during check

login
register
mail settings
Submitter Federico Simoncelli
Date Jan. 21, 2013, 2:25 p.m.
Message ID <1358778313-14738-1-git-send-email-fsimonce@redhat.com>
Download mbox | patch
Permalink /patch/214181/
State New
Headers show

Comments

Federico Simoncelli - Jan. 21, 2013, 2:25 p.m.
This patch adds the support for reporting the highest offset in use by
an image. This is particularly useful after a conversion (or a rebase)
where the destination is a block device in order to find the actual
amount of space in use.

Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
---
 block/qcow2-refcount.c       |   10 ++++++++--
 include/block/block.h        |    1 +
 qemu-img.c                   |    4 ++++
 tests/qemu-iotests/026       |    6 +++---
 tests/qemu-iotests/036       |    3 ++-
 tests/qemu-iotests/039       |    2 +-
 tests/qemu-iotests/044.out   |    1 +
 tests/qemu-iotests/common.rc |    5 +++--
 8 files changed, 23 insertions(+), 9 deletions(-)
Eric Blake - Jan. 21, 2013, 4:27 p.m.
On 01/21/2013 07:25 AM, Federico Simoncelli wrote:
> This patch adds the support for reporting the highest offset in use by
> an image. This is particularly useful after a conversion (or a rebase)
> where the destination is a block device in order to find the actual
> amount of space in use.
> 
> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
> ---
>  block/qcow2-refcount.c       |   10 ++++++++--
>  include/block/block.h        |    1 +
>  qemu-img.c                   |    4 ++++
>  tests/qemu-iotests/026       |    6 +++---
>  tests/qemu-iotests/036       |    3 ++-
>  tests/qemu-iotests/039       |    2 +-
>  tests/qemu-iotests/044.out   |    1 +
>  tests/qemu-iotests/common.rc |    5 +++--
>  8 files changed, 23 insertions(+), 9 deletions(-)
> 

> @@ -1231,6 +1236,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
>          }
>      }
>  
> +    res->highest_offset = (highest_cluster + 1) * s->cluster_size;

That's actually one byte larger than the actual highest offset in use.
That is, if I have a file with exactly 4096 bytes, this computation
would return 4096 (the file size) rather than 4095 (the last offset in use).

> +++ b/qemu-img.c
> @@ -475,6 +475,10 @@ static int img_check(int argc, char **argv)
>          result.bfi.fragmented_clusters * 100.0 / result.bfi.allocated_clusters);
>      }
>  
> +    if (result.highest_offset > 0) {
> +        printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);

This output message feels off by one.  Either you need to subtract 1
from res->highest_offset to get the address of the last used byte, or
you need to document it as the first unused byte, or instead of calling
it 'highest offset', you should call it 'used bytes' (except that with
sparse files, it's hard to argue that all earlier bytes were 'used').

Overall, though, I like the idea of this patch.
Kevin Wolf - Jan. 21, 2013, 5:03 p.m.
Am 21.01.2013 17:27, schrieb Eric Blake:
> On 01/21/2013 07:25 AM, Federico Simoncelli wrote:
>> This patch adds the support for reporting the highest offset in use by
>> an image. This is particularly useful after a conversion (or a rebase)
>> where the destination is a block device in order to find the actual
>> amount of space in use.
>>
>> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
>> ---
>>  block/qcow2-refcount.c       |   10 ++++++++--
>>  include/block/block.h        |    1 +
>>  qemu-img.c                   |    4 ++++
>>  tests/qemu-iotests/026       |    6 +++---
>>  tests/qemu-iotests/036       |    3 ++-
>>  tests/qemu-iotests/039       |    2 +-
>>  tests/qemu-iotests/044.out   |    1 +
>>  tests/qemu-iotests/common.rc |    5 +++--
>>  8 files changed, 23 insertions(+), 9 deletions(-)
>>
> 
>> @@ -1231,6 +1236,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
>>          }
>>      }
>>  
>> +    res->highest_offset = (highest_cluster + 1) * s->cluster_size;
> 
> That's actually one byte larger than the actual highest offset in use.
> That is, if I have a file with exactly 4096 bytes, this computation
> would return 4096 (the file size) rather than 4095 (the last offset in use).
> 
>> +++ b/qemu-img.c
>> @@ -475,6 +475,10 @@ static int img_check(int argc, char **argv)
>>          result.bfi.fragmented_clusters * 100.0 / result.bfi.allocated_clusters);
>>      }
>>  
>> +    if (result.highest_offset > 0) {
>> +        printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);
> 
> This output message feels off by one.  Either you need to subtract 1
> from res->highest_offset to get the address of the last used byte, or
> you need to document it as the first unused byte, or instead of calling
> it 'highest offset', you should call it 'used bytes' (except that with
> sparse files, it's hard to argue that all earlier bytes were 'used').

Good point. I think the number is what we wanted, and what users are
interested in is probably "used bytes" rather than "first unused byte".
Maybe we can find a better word for "used" (it has the same problem in
all three contexts), but I can't think of one off the top of my head.

Kevin
Eric Blake - Jan. 21, 2013, 5:52 p.m.
On 01/21/2013 10:03 AM, Kevin Wolf wrote:
> Am 21.01.2013 17:27, schrieb Eric Blake:
>> On 01/21/2013 07:25 AM, Federico Simoncelli wrote:
>>> This patch adds the support for reporting the highest offset in use by
>>> an image. This is particularly useful after a conversion (or a rebase)
>>> where the destination is a block device in order to find the actual
>>> amount of space in use.

>>> +    if (result.highest_offset > 0) {
>>> +        printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);
>>
>> This output message feels off by one.  Either you need to subtract 1
>> from res->highest_offset to get the address of the last used byte, or
>> you need to document it as the first unused byte, or instead of calling
>> it 'highest offset', you should call it 'used bytes' (except that with
>> sparse files, it's hard to argue that all earlier bytes were 'used').
> 
> Good point. I think the number is what we wanted, and what users are
> interested in is probably "used bytes" rather than "first unused byte".
> Maybe we can find a better word for "used" (it has the same problem in
> all three contexts), but I can't think of one off the top of my head.

"Allocation" alone isn't good (because of sparse files, not all earlier
bytes are allocated).  But maybe "Offset of unallocated tail: ..."
works, to make it clear that there may be other unallocated portions
earlier, but that all bytes at the listed offset and beyond are
unallocated.  For a file that is completely allocated, or even for a
sparse file but with no unallocated tail, the value would be the same as
the file capacity.
Stefan Hajnoczi - Jan. 23, 2013, 3:24 p.m.
On Mon, Jan 21, 2013 at 09:25:12AM -0500, Federico Simoncelli wrote:
> @@ -1154,7 +1154,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
>          s->refcount_table_offset,
>          s->refcount_table_size * sizeof(uint64_t));
>  
> -    for(i = 0; i < s->refcount_table_size; i++) {
> +    for(i = 0, highest_cluster = 0; i < s->refcount_table_size; i++) {
>          uint64_t offset, cluster;
>          offset = s->refcount_table[i];
>          cluster = offset >> s->cluster_bits;

This is the wrong for loop?  It should be the next one down.

Stefan
Kevin Wolf - Jan. 23, 2013, 3:32 p.m.
Am 23.01.2013 16:24, schrieb Stefan Hajnoczi:
> On Mon, Jan 21, 2013 at 09:25:12AM -0500, Federico Simoncelli wrote:
>> @@ -1154,7 +1154,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
>>          s->refcount_table_offset,
>>          s->refcount_table_size * sizeof(uint64_t));
>>  
>> -    for(i = 0; i < s->refcount_table_size; i++) {
>> +    for(i = 0, highest_cluster = 0; i < s->refcount_table_size; i++) {
>>          uint64_t offset, cluster;
>>          offset = s->refcount_table[i];
>>          cluster = offset >> s->cluster_bits;
> 
> This is the wrong for loop?  It should be the next one down.

Oops, how did this happen? I'm pretty sure that one of the earlier
versions had it in the right place.

Anyway, it doesn't change the semantics, could just need a cleanup.

Kevin
Eric Blake - Jan. 25, 2013, 4:09 p.m.
On 01/21/2013 10:52 AM, Eric Blake wrote:
> On 01/21/2013 10:03 AM, Kevin Wolf wrote:
>> Am 21.01.2013 17:27, schrieb Eric Blake:
>>> On 01/21/2013 07:25 AM, Federico Simoncelli wrote:
>>>> This patch adds the support for reporting the highest offset in use by
>>>> an image. This is particularly useful after a conversion (or a rebase)
>>>> where the destination is a block device in order to find the actual
>>>> amount of space in use.
> 
>>>> +    if (result.highest_offset > 0) {
>>>> +        printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);
>>>
>>> This output message feels off by one.  Either you need to subtract 1
>>> from res->highest_offset to get the address of the last used byte, or
>>> you need to document it as the first unused byte, or instead of calling
>>> it 'highest offset', you should call it 'used bytes' (except that with
>>> sparse files, it's hard to argue that all earlier bytes were 'used').
>>
>> Good point. I think the number is what we wanted, and what users are
>> interested in is probably "used bytes" rather than "first unused byte".
>> Maybe we can find a better word for "used" (it has the same problem in
>> all three contexts), but I can't think of one off the top of my head.
> 
> "Allocation" alone isn't good (because of sparse files, not all earlier
> bytes are allocated).  But maybe "Offset of unallocated tail: ..."
> works, to make it clear that there may be other unallocated portions
> earlier, but that all bytes at the listed offset and beyond are
> unallocated.  For a file that is completely allocated, or even for a
> sparse file but with no unallocated tail, the value would be the same as
> the file capacity.

On IRC, we decided the name result.image_end_offset feels better.

Patch

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 6a95aa6..3444f85 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1116,7 +1116,7 @@  int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                           BdrvCheckMode fix)
 {
     BDRVQcowState *s = bs->opaque;
-    int64_t size, i;
+    int64_t size, i, highest_cluster;
     int nb_clusters, refcount1, refcount2;
     QCowSnapshot *sn;
     uint16_t *refcount_table;
@@ -1154,7 +1154,7 @@  int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
         s->refcount_table_offset,
         s->refcount_table_size * sizeof(uint64_t));
 
-    for(i = 0; i < s->refcount_table_size; i++) {
+    for(i = 0, highest_cluster = 0; i < s->refcount_table_size; i++) {
         uint64_t offset, cluster;
         offset = s->refcount_table[i];
         cluster = offset >> s->cluster_bits;
@@ -1197,6 +1197,11 @@  int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
         }
 
         refcount2 = refcount_table[i];
+
+        if (refcount1 > 0 || refcount2 > 0) {
+            highest_cluster = i;
+        }
+
         if (refcount1 != refcount2) {
 
             /* Check if we're allowed to fix the mismatch */
@@ -1231,6 +1236,7 @@  int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
         }
     }
 
+    res->highest_offset = (highest_cluster + 1) * s->cluster_size;
     ret = 0;
 
 fail:
diff --git a/include/block/block.h b/include/block/block.h
index ffd1936..b5cf6ac 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -213,6 +213,7 @@  typedef struct BdrvCheckResult {
     int check_errors;
     int corruptions_fixed;
     int leaks_fixed;
+    int64_t highest_offset;
     BlockFragInfo bfi;
 } BdrvCheckResult;
 
diff --git a/qemu-img.c b/qemu-img.c
index 85d3740..7b977cd 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -475,6 +475,10 @@  static int img_check(int argc, char **argv)
         result.bfi.fragmented_clusters * 100.0 / result.bfi.allocated_clusters);
     }
 
+    if (result.highest_offset > 0) {
+        printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);
+    }
+
     bdrv_delete(bs);
 
     if (ret < 0 || result.check_errors) {
diff --git a/tests/qemu-iotests/026 b/tests/qemu-iotests/026
index 1602ccd..107a3ff 100755
--- a/tests/qemu-iotests/026
+++ b/tests/qemu-iotests/026
@@ -102,7 +102,7 @@  if [ "$event" == "l2_load" ]; then
     $QEMU_IO -c "read $vmstate 0 128k " $BLKDBG_TEST_IMG | _filter_qemu_io
 fi
 
-$QEMU_IMG check $TEST_IMG 2>&1 | grep -v "refcount=1 reference=0"
+_check_test_img 2>&1 | grep -v "refcount=1 reference=0"
 
 done
 done
@@ -147,7 +147,7 @@  echo
 echo "Event: $event; errno: $errno; imm: $imm; once: $once; write $vmstate"
 $QEMU_IO -c "write $vmstate 0 64M" $BLKDBG_TEST_IMG | _filter_qemu_io
 
-$QEMU_IMG check $TEST_IMG 2>&1 | grep -v "refcount=1 reference=0"
+_check_test_img 2>&1 | grep -v "refcount=1 reference=0"
 
 done
 done
@@ -186,7 +186,7 @@  echo
 echo "Event: $event; errno: $errno; imm: $imm; once: $once"
 $QEMU_IO -c "write -b 0 64k" $BLKDBG_TEST_IMG | _filter_qemu_io
 
-$QEMU_IMG check $TEST_IMG 2>&1 | grep -v "refcount=1 reference=0"
+_check_test_img 2>&1 | grep -v "refcount=1 reference=0"
 
 done
 done
diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
index 329533e..4dbfc57 100755
--- a/tests/qemu-iotests/036
+++ b/tests/qemu-iotests/036
@@ -59,7 +59,8 @@  _make_test_img 64M
 echo
 echo === Repair image ===
 echo
-$QEMU_IMG check -r all $TEST_IMG
+_check_test_img -r all
+
 ./qcow2.py $TEST_IMG dump-header
 
 # success, all done
diff --git a/tests/qemu-iotests/039 b/tests/qemu-iotests/039
index c5ae806..ae35175 100755
--- a/tests/qemu-iotests/039
+++ b/tests/qemu-iotests/039
@@ -86,7 +86,7 @@  $QEMU_IO -r -c "read -P 0x5a 0 512" $TEST_IMG | _filter_qemu_io
 echo
 echo "== Repairing the image file must succeed =="
 
-$QEMU_IMG check -r all $TEST_IMG
+_check_test_img -r all
 
 # The dirty bit must not be set
 ./qcow2.py $TEST_IMG dump-header | grep incompatible_features
diff --git a/tests/qemu-iotests/044.out b/tests/qemu-iotests/044.out
index 7a40071..6851878 100644
--- a/tests/qemu-iotests/044.out
+++ b/tests/qemu-iotests/044.out
@@ -1,4 +1,5 @@ 
 No errors were found on the image.
+Highest offset in use: 4296447488
 .
 ----------------------------------------------------------------------
 Ran 1 tests
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index aef5f52..22c0186 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -161,9 +161,10 @@  _cleanup_test_img()
 
 _check_test_img()
 {
-    $QEMU_IMG check -f $IMGFMT $TEST_IMG 2>&1 | \
+    $QEMU_IMG check "$@" -f $IMGFMT $TEST_IMG 2>&1 | \
         grep -v "fragmented$" | \
-    	sed -e 's/qemu-img\: This image format does not support checks/No errors were found on the image./'
+    	sed -e 's/qemu-img\: This image format does not support checks/No errors were found on the image./' | \
+    	sed -e '/Highest offset in use: [0-9]\+/d'
 }
 
 _img_info()