diff mbox

[03/21] qcow2: Use 64 bits for refcount values

Message ID 1415627159-15941-4-git-send-email-mreitz@redhat.com
State New
Headers show

Commit Message

Max Reitz Nov. 10, 2014, 1:45 p.m. UTC
Refcounts may have a width of up to 64 bit, so qemu should use the same
width to represent refcount values internally.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c  |  9 ++++++---
 block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
 block/qcow2.h          |  7 ++++---
 3 files changed, 30 insertions(+), 23 deletions(-)

Comments

Eric Blake Nov. 10, 2014, 8:59 p.m. UTC | #1
On 11/10/2014 06:45 AM, Max Reitz wrote:
> Refcounts may have a width of up to 64 bit, so qemu should use the same

s/bit/bits/

> width to represent refcount values internally.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cluster.c  |  9 ++++++---
>  block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
>  block/qcow2.h          |  7 ++++---
>  3 files changed, 30 insertions(+), 23 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index df0b2c9..ab43902 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1640,7 +1640,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>      for (i = 0; i < l1_size; i++) {
>          uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>          bool l2_dirty = false;
> -        int l2_refcount;
> +        int64_t l2_refcount;

You may want to mention in the commit message that you choose a signed
type to allow negative for errors, and therefore we really allow only up
to 63 useful bits.  Or even mention that this is  okay because no one
can feasibly generate an image with more than 2^63 refs to the same
cluster (there isn't that much storage or time to do such a task in our
lifetime...)

Reviewed-by: Eric Blake <eblake@redhat.com>
Max Reitz Nov. 11, 2014, 8:12 a.m. UTC | #2
On 2014-11-10 at 21:59, Eric Blake wrote:
> On 11/10/2014 06:45 AM, Max Reitz wrote:
>> Refcounts may have a width of up to 64 bit, so qemu should use the same
> s/bit/bits/

See my reply to your review on patch 2. I keep swapping knowing which to 
use - now I know, thanks!

>> width to represent refcount values internally.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qcow2-cluster.c  |  9 ++++++---
>>   block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
>>   block/qcow2.h          |  7 ++++---
>>   3 files changed, 30 insertions(+), 23 deletions(-)
>>
>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>> index df0b2c9..ab43902 100644
>> --- a/block/qcow2-cluster.c
>> +++ b/block/qcow2-cluster.c
>> @@ -1640,7 +1640,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>>       for (i = 0; i < l1_size; i++) {
>>           uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>>           bool l2_dirty = false;
>> -        int l2_refcount;
>> +        int64_t l2_refcount;
> You may want to mention in the commit message that you choose a signed
> type to allow negative for errors, and therefore we really allow only up
> to 63 useful bits.  Or even mention that this is  okay because no one
> can feasibly generate an image with more than 2^63 refs to the same
> cluster (there isn't that much storage or time to do such a task in our
> lifetime...)

Will do.

Max

> Reviewed-by: Eric Blake <eblake@redhat.com>
Kevin Wolf Nov. 11, 2014, 9:22 a.m. UTC | #3
Am 10.11.2014 um 21:59 hat Eric Blake geschrieben:
> On 11/10/2014 06:45 AM, Max Reitz wrote:
> > Refcounts may have a width of up to 64 bit, so qemu should use the same
> 
> s/bit/bits/
> 
> > width to represent refcount values internally.
> > 
> > Signed-off-by: Max Reitz <mreitz@redhat.com>
> > ---
> >  block/qcow2-cluster.c  |  9 ++++++---
> >  block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
> >  block/qcow2.h          |  7 ++++---
> >  3 files changed, 30 insertions(+), 23 deletions(-)
> > 
> > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > index df0b2c9..ab43902 100644
> > --- a/block/qcow2-cluster.c
> > +++ b/block/qcow2-cluster.c
> > @@ -1640,7 +1640,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
> >      for (i = 0; i < l1_size; i++) {
> >          uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
> >          bool l2_dirty = false;
> > -        int l2_refcount;
> > +        int64_t l2_refcount;
> 
> You may want to mention in the commit message that you choose a signed
> type to allow negative for errors, and therefore we really allow only up
> to 63 useful bits.  Or even mention that this is  okay because no one
> can feasibly generate an image with more than 2^63 refs to the same
> cluster (there isn't that much storage or time to do such a task in our
> lifetime...)

Should patch 1 then set refcount_max = 2^63 for refcount order 6?

Also note that while it might not be feasible to create a cluster with
2^63 references, this doesn't mean that it's impossible to create a
cluster with a stored refcount of (more than) 2^63. We'll have to have
checks there.

Kevin
Max Reitz Nov. 11, 2014, 9:25 a.m. UTC | #4
On 2014-11-11 at 10:22, Kevin Wolf wrote:
> Am 10.11.2014 um 21:59 hat Eric Blake geschrieben:
>> On 11/10/2014 06:45 AM, Max Reitz wrote:
>>> Refcounts may have a width of up to 64 bit, so qemu should use the same
>> s/bit/bits/
>>
>>> width to represent refcount values internally.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>   block/qcow2-cluster.c  |  9 ++++++---
>>>   block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
>>>   block/qcow2.h          |  7 ++++---
>>>   3 files changed, 30 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>>> index df0b2c9..ab43902 100644
>>> --- a/block/qcow2-cluster.c
>>> +++ b/block/qcow2-cluster.c
>>> @@ -1640,7 +1640,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>>>       for (i = 0; i < l1_size; i++) {
>>>           uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>>>           bool l2_dirty = false;
>>> -        int l2_refcount;
>>> +        int64_t l2_refcount;
>> You may want to mention in the commit message that you choose a signed
>> type to allow negative for errors, and therefore we really allow only up
>> to 63 useful bits.  Or even mention that this is  okay because no one
>> can feasibly generate an image with more than 2^63 refs to the same
>> cluster (there isn't that much storage or time to do such a task in our
>> lifetime...)
> Should patch 1 then set refcount_max = 2^63 for refcount order 6?

It does set refcount_max to INT64_MAX (instead of UINT64_MAX, and there 
is a comment above that line why it's the signed maximum).

> Also note that while it might not be feasible to create a cluster with
> 2^63 references, this doesn't mean that it's impossible to create a
> cluster with a stored refcount of (more than) 2^63. We'll have to have
> checks there.

Yes, the check is done in qcow2_get_refcount() (and needs to be done in 
update_refcount_discard() as well, which I forgot in this version) and 
consists of returning -ERANGE on error.

Max
Max Reitz Nov. 11, 2014, 9:26 a.m. UTC | #5
On 2014-11-11 at 10:25, Max Reitz wrote:
> On 2014-11-11 at 10:22, Kevin Wolf wrote:
>> Am 10.11.2014 um 21:59 hat Eric Blake geschrieben:
>>> On 11/10/2014 06:45 AM, Max Reitz wrote:
>>>> Refcounts may have a width of up to 64 bit, so qemu should use the 
>>>> same
>>> s/bit/bits/
>>>
>>>> width to represent refcount values internally.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>   block/qcow2-cluster.c  |  9 ++++++---
>>>>   block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
>>>>   block/qcow2.h          |  7 ++++---
>>>>   3 files changed, 30 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>>>> index df0b2c9..ab43902 100644
>>>> --- a/block/qcow2-cluster.c
>>>> +++ b/block/qcow2-cluster.c
>>>> @@ -1640,7 +1640,7 @@ static int 
>>>> expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>>>>       for (i = 0; i < l1_size; i++) {
>>>>           uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>>>>           bool l2_dirty = false;
>>>> -        int l2_refcount;
>>>> +        int64_t l2_refcount;
>>> You may want to mention in the commit message that you choose a signed
>>> type to allow negative for errors, and therefore we really allow 
>>> only up
>>> to 63 useful bits.  Or even mention that this is  okay because no one
>>> can feasibly generate an image with more than 2^63 refs to the same
>>> cluster (there isn't that much storage or time to do such a task in our
>>> lifetime...)
>> Should patch 1 then set refcount_max = 2^63 for refcount order 6?
>
> It does set refcount_max to INT64_MAX (instead of UINT64_MAX, and 
> there is a comment above that line why it's the signed maximum).
>
>> Also note that while it might not be feasible to create a cluster with
>> 2^63 references, this doesn't mean that it's impossible to create a
>> cluster with a stored refcount of (more than) 2^63. We'll have to have
>> checks there.
>
> Yes, the check is done in qcow2_get_refcount() (and needs to be done 
> in update_refcount_discard() as well, which I forgot in this version) 
> and consists of returning -ERANGE on error.

s/error/overflow/
Kevin Wolf Nov. 11, 2014, 9:36 a.m. UTC | #6
Am 11.11.2014 um 10:25 hat Max Reitz geschrieben:
> On 2014-11-11 at 10:22, Kevin Wolf wrote:
> >Am 10.11.2014 um 21:59 hat Eric Blake geschrieben:
> >>On 11/10/2014 06:45 AM, Max Reitz wrote:
> >>>Refcounts may have a width of up to 64 bit, so qemu should use the same
> >>s/bit/bits/
> >>
> >>>width to represent refcount values internally.
> >>>
> >>>Signed-off-by: Max Reitz <mreitz@redhat.com>
> >>>---
> >>>  block/qcow2-cluster.c  |  9 ++++++---
> >>>  block/qcow2-refcount.c | 37 ++++++++++++++++++++-----------------
> >>>  block/qcow2.h          |  7 ++++---
> >>>  3 files changed, 30 insertions(+), 23 deletions(-)
> >>>
> >>>diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> >>>index df0b2c9..ab43902 100644
> >>>--- a/block/qcow2-cluster.c
> >>>+++ b/block/qcow2-cluster.c
> >>>@@ -1640,7 +1640,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
> >>>      for (i = 0; i < l1_size; i++) {
> >>>          uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
> >>>          bool l2_dirty = false;
> >>>-        int l2_refcount;
> >>>+        int64_t l2_refcount;
> >>You may want to mention in the commit message that you choose a signed
> >>type to allow negative for errors, and therefore we really allow only up
> >>to 63 useful bits.  Or even mention that this is  okay because no one
> >>can feasibly generate an image with more than 2^63 refs to the same
> >>cluster (there isn't that much storage or time to do such a task in our
> >>lifetime...)
> >Should patch 1 then set refcount_max = 2^63 for refcount order 6?
> 
> It does set refcount_max to INT64_MAX (instead of UINT64_MAX, and
> there is a comment above that line why it's the signed maximum).

Right, I should read patches instead of just Eric's reply before I reply
something myself...

Kevin
diff mbox

Patch

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index df0b2c9..ab43902 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1640,7 +1640,7 @@  static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
     for (i = 0; i < l1_size; i++) {
         uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
         bool l2_dirty = false;
-        int l2_refcount;
+        int64_t l2_refcount;
 
         if (!l2_offset) {
             /* unallocated */
@@ -1696,14 +1696,17 @@  static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                 }
 
                 if (l2_refcount > 1) {
+                    int64_t ret64;
+
                     /* For shared L2 tables, set the refcount accordingly (it is
                      * already 1 and needs to be l2_refcount) */
-                    ret = qcow2_update_cluster_refcount(bs,
+                    ret64 = qcow2_update_cluster_refcount(bs,
                             offset >> s->cluster_bits, l2_refcount - 1,
                             QCOW2_DISCARD_OTHER);
-                    if (ret < 0) {
+                    if (ret64 < 0) {
                         qcow2_free_clusters(bs, offset, s->cluster_size,
                                             QCOW2_DISCARD_OTHER);
+                        ret = ret64;
                         goto fail;
                     }
                 }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 6016211..6e06531 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -91,14 +91,14 @@  static int load_refcount_block(BlockDriverState *bs,
  * return value is the refcount of the cluster, negative values are -errno
  * and indicate an error.
  */
-int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index)
+int64_t qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index)
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t refcount_table_index, block_index;
     int64_t refcount_block_offset;
     int ret;
     uint16_t *refcount_block;
-    uint16_t refcount;
+    int64_t refcount;
 
     refcount_table_index = cluster_index >> s->refcount_block_bits;
     if (refcount_table_index >= s->refcount_table_size)
@@ -556,9 +556,10 @@  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
     for(cluster_offset = start; cluster_offset <= last;
         cluster_offset += s->cluster_size)
     {
-        int block_index, refcount;
+        int block_index;
         int64_t cluster_index = cluster_offset >> s->cluster_bits;
         int64_t table_index = cluster_index >> s->refcount_block_bits;
+        int64_t refcount;
 
         /* Load the refcount block and allocate it if needed */
         if (table_index != old_table_index) {
@@ -634,10 +635,10 @@  fail:
  * If the return value is non-negative, it is the new refcount of the cluster.
  * If it is negative, it is -errno and indicates an error.
  */
-int qcow2_update_cluster_refcount(BlockDriverState *bs,
-                                  int64_t cluster_index,
-                                  int addend,
-                                  enum qcow2_discard_type type)
+int64_t qcow2_update_cluster_refcount(BlockDriverState *bs,
+                                      int64_t cluster_index,
+                                      int addend,
+                                      enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
     int ret;
@@ -663,7 +664,7 @@  static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t i, nb_clusters;
-    int refcount;
+    int64_t refcount;
 
     nb_clusters = size_to_clusters(s, size);
 retry:
@@ -722,7 +723,8 @@  int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     BDRVQcowState *s = bs->opaque;
     uint64_t cluster_index;
     uint64_t i;
-    int refcount, ret;
+    int64_t refcount;
+    int ret;
 
     assert(nb_clusters >= 0);
     if (nb_clusters == 0) {
@@ -878,8 +880,8 @@  int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     BDRVQcowState *s = bs->opaque;
     uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2;
     bool l1_allocated = false;
-    int64_t old_offset, old_l2_offset;
-    int i, j, l1_modified = 0, nb_csectors, refcount;
+    int64_t old_offset, old_l2_offset, refcount;
+    int i, j, l1_modified = 0, nb_csectors;
     int ret;
 
     l2_table = NULL;
@@ -1341,7 +1343,7 @@  static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
     BDRVQcowState *s = bs->opaque;
     uint64_t *l2_table = qemu_blockalign(bs, s->cluster_size);
     int ret;
-    int refcount;
+    int64_t refcount;
     int i, j;
 
     for (i = 0; i < s->l1_size; i++) {
@@ -1360,7 +1362,7 @@  static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
         }
         if ((refcount == 1) != ((l1_entry & QCOW_OFLAG_COPIED) != 0)) {
             fprintf(stderr, "%s OFLAG_COPIED L2 cluster: l1_index=%d "
-                    "l1_entry=%" PRIx64 " refcount=%d\n",
+                    "l1_entry=%" PRIx64 " refcount=%" PRId64 "\n",
                     fix & BDRV_FIX_ERRORS ? "Repairing" :
                                             "ERROR",
                     i, l1_entry, refcount);
@@ -1403,7 +1405,7 @@  static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
                 }
                 if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
                     fprintf(stderr, "%s OFLAG_COPIED data cluster: "
-                            "l2_entry=%" PRIx64 " refcount=%d\n",
+                            "l2_entry=%" PRIx64 " refcount=%" PRId64 "\n",
                             fix & BDRV_FIX_ERRORS ? "Repairing" :
                                                     "ERROR",
                             l2_entry, refcount);
@@ -1628,8 +1630,8 @@  static void compare_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                               uint16_t *refcount_table, int64_t nb_clusters)
 {
     BDRVQcowState *s = bs->opaque;
-    int64_t i;
-    int refcount1, refcount2, ret;
+    int64_t i, refcount1, refcount2;
+    int ret;
 
     for (i = 0, *highest_cluster = 0; i < nb_clusters; i++) {
         refcount1 = qcow2_get_refcount(bs, i);
@@ -1657,7 +1659,8 @@  static void compare_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                 num_fixed = &res->corruptions_fixed;
             }
 
-            fprintf(stderr, "%s cluster %" PRId64 " refcount=%d reference=%d\n",
+            fprintf(stderr, "%s cluster %" PRId64 " refcount=%" PRId64
+                    " reference=%" PRId64 "\n",
                    num_fixed != NULL     ? "Repairing" :
                    refcount1 < refcount2 ? "ERROR" :
                                            "Leaked",
diff --git a/block/qcow2.h b/block/qcow2.h
index 4d8c902..0f8eb15 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -489,10 +489,11 @@  void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
 int qcow2_refcount_init(BlockDriverState *bs);
 void qcow2_refcount_close(BlockDriverState *bs);
 
-int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index);
+int64_t qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index);
 
-int qcow2_update_cluster_refcount(BlockDriverState *bs, int64_t cluster_index,
-                                  int addend, enum qcow2_discard_type type);
+int64_t qcow2_update_cluster_refcount(BlockDriverState *bs,
+                                      int64_t cluster_index, int addend,
+                                      enum qcow2_discard_type type);
 
 int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size);
 int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,