Message ID: 1423079623-4253-1-git-send-email-mreitz@redhat.com
State: New
On 02/04/2015 12:53 PM, Max Reitz wrote:

My review jumps around a bit; try to follow the numbers to learn my stream of consciousness on reviewing it.

tl;dr: It looks like this is functionally correct (you won't break behavior), but you have a pessimization bug (see [7]), so you ought to spin v2.

> qcow2_alloc_bytes() is a function with insufficient error handling and
> an unnecessary goto. This patch rewrites it.

Not clear what the added error handling is from just the commit message. [1]

>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-refcount.c | 77 +++++++++++++++++++++++++++++---------------------
>  1 file changed, 45 insertions(+), 32 deletions(-)
>
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 9afdb40..cafca0a 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -759,46 +759,51 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
>  int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
>  {
>      BDRVQcowState *s = bs->opaque;
> -    int64_t offset, cluster_offset;
> -    int free_in_cluster;
> +    int64_t offset, new_cluster = 0, cluster_end;
> +    size_t free_in_cluster;
>
>      BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_BYTES);
>      assert(size > 0 && size <= s->cluster_size);

[0] Not listed here, but I finally figured it out and it made the rest of my review easier: on entry, s->free_byte_offset is either 0 (there is no existing compressed cluster with an available tail; does that mean s->free_cluster_index is also 0? [10]) or an actual byte offset of a tail (not cluster-aligned, with s->free_cluster_index non-zero).
> -    if (s->free_byte_offset == 0) {
> -        offset = qcow2_alloc_clusters(bs, s->cluster_size);

The old code had one cluster alloc here; [2]

> -        if (offset < 0) {
> -            return offset;
> +
> +    if (s->free_byte_offset) {
> +        int refcount = qcow2_get_refcount(bs,
> +                                          s->free_byte_offset >> s->cluster_bits);
> +        if (refcount < 0) {
> +            return refcount;
> +        }
> +
> +        if (refcount == 0xffff) {
> +            s->free_byte_offset = 0;

[1] This is some of the added error checking (the old code silently failed if it overflowed a refcount - perhaps possible with 2M clusters and LOTS of compressed data fitting into a single cluster).

>          }
> -        s->free_byte_offset = offset;
>      }
> -redo:
> +
>      free_in_cluster = s->cluster_size -
>          offset_into_cluster(s, s->free_byte_offset);

Based on [0], free_in_cluster is either s->cluster_size (s->free_byte_offset was 0 and we have no tail) or smaller than s->cluster_size (s->free_byte_offset was non-zero).

> -    if (size <= free_in_cluster) {
> -        /* enough space in current cluster */
> -        offset = s->free_byte_offset;

The old code determined a successful return value here [9]

> -        s->free_byte_offset += size;
> -        free_in_cluster -= size;
> -        if (free_in_cluster == 0)
> -            s->free_byte_offset = 0;
> -        if (offset_into_cluster(s, offset) != 0)
> -            qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
> -                                          QCOW2_DISCARD_NEVER);

[1] Another instance of missing error checking - the old code fails to notice a botched refcount update, while the new code correctly undoes things if you encounter failure [8].

> -    } else {
> -        offset = qcow2_alloc_clusters(bs, s->cluster_size);

[2] and a second here. If my logic is right, you could never trigger both in a single run (the first is triggered only if s->free_byte_offset was 0 [meaning we have no existing sector to add more compressed data to]; the function requires that size <= s->cluster_size, so free_in_cluster is s->cluster_size if the first alloc triggered, and we skip the second.
Oh, that ignores the 'redo:' label, so I have to re-evaluate whether we can ever allocate a cluster, then goto redo, then allocate again; but looking at [3], the only time we jump back to redo is right after setting 'offset' to the start of a new cluster, so again only one allocation). So, I'd expect that your rewrite cleans things up to have just a single cluster allocation. [4]

> -        if (offset < 0) {
> -            return offset;
> +
> +    if (!s->free_byte_offset || free_in_cluster < size) {
> +        new_cluster = qcow2_alloc_clusters(bs, s->cluster_size);

[4] and here's the one allocation. A freshly-allocated cluster starts life with a refcount of one; it is only if we reuse the tail of an existing cluster that we need to bump the refcount higher, so there is more logic needed later on deciding refcount changes [5]. If we made this allocation, then s->free_byte_offset is either 0 or the start of a tail that is too small but that we might be able to use if the allocation is contiguous [6].

> +        if (new_cluster < 0) {
> +            return new_cluster;
>          }
> -        cluster_offset = start_of_cluster(s, s->free_byte_offset);
> -        if ((cluster_offset + s->cluster_size) == offset) {
> -            /* we are lucky: contiguous data */
> -            offset = s->free_byte_offset;

[9] or here

> -            qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
> -                                          QCOW2_DISCARD_NEVER);
> -            s->free_byte_offset += size;
> -        } else {
> -            s->free_byte_offset = offset;
> -            goto redo;

[3] this is the backwards jump that makes the old code HARD to understand. Thanks for killing it!

> +
> +        cluster_end = start_of_cluster(s, s->free_byte_offset) +
> +                      s->cluster_size;

[6] cluster_end is now either s->cluster_size (s->free_byte_offset was 0) or the first cluster-aligned address after s->free_byte_offset (will tell us if new_cluster is contiguous).
If I'm not mistaken, you could also write this as:

  cluster_end = ROUND_UP(s->free_byte_offset, s->cluster_size);

which would give the same value for a non-zero s->free_byte_offset, but the value 0 instead of s->cluster_size if s->free_byte_offset is 0. I'll analyze at point [7] what semantic difference that would make.

> +
> +        if (!s->free_cluster_index || cluster_end != new_cluster) {
> +            s->free_byte_offset = new_cluster;
> +        }

[7] This was the only mention of s->free_cluster_index in the patch. I had to hunt for its use in the code base; there are only two places in the code base that set it to non-zero: alloc_clusters_noref() and update_refcount(). But alloc_clusters_noref() is called by qcow2_alloc_clusters(), which we just called. Therefore, the left half of the || is always true, and you always end up changing s->free_byte_offset (that is, when you allocate a new cluster, you never reuse the tail of an existing cluster, even if the two were contiguous). [11]

At one point in my review, I wondered if point [0] should assert that !s->free_byte_offset should imply !s->free_cluster_index (more on that at [10]), but I convinced myself that would be wrong.

Let's ignore the left half of the ||, and focus on the right half. If s->free_byte_offset was non-zero, you now know whether the new cluster is contiguous (use the existing tail; the overflow into the new cluster is safe) or not (use only the new cluster); either way, s->free_byte_offset is now the point where the compressed data will be written. If s->free_byte_offset was 0, you want to unconditionally change it to the start of the just-allocated cluster.
Note that if you used my suggestion above of making cluster_end == 0 rather than s->cluster_size for the !s->free_byte_offset case, then you are guaranteed that cluster_end != new_cluster is a sufficient test for when to correctly rewrite s->free_byte_offset (since new_cluster will be non-zero, because qcow2_alloc_clusters() will never return the header cluster); whereas with your code, there is a very minute chance that the return of qcow2_alloc_clusters() could be equal to s->cluster_size (that is, the very first cluster after the header, perhaps possible if earlier actions on the file moved the L1 and refcount table to later in the file and freed up cluster index 1). Using just the right half of the ||, if that rare chance under your code happened, then we would NOT rewrite s->free_byte_offset, and actually have a bug of returning 0 to the caller.

> +    }
> +
> +    if (offset_into_cluster(s, s->free_byte_offset)) {

I'd feel a bit better if you added

  assert(s->free_byte_offset);

just before this if (which happens to be the case with your pessimization on the left half of the || [7], and would also be the case with my proposed simplification of [6]).

> +        int ret = qcow2_update_cluster_refcount(bs,
> +            s->free_byte_offset >> s->cluster_bits, 1,
> +            QCOW2_DISCARD_NEVER);

[5] Good - you are only incrementing the refcount of an existing cluster if you use the tail.

> +        if (ret < 0) {
> +            if (new_cluster > 0) {
> +                qcow2_free_clusters(bs, new_cluster, s->cluster_size,
> +                                    QCOW2_DISCARD_OTHER);
> +            }

[8] Nice that you avoid leaking the new allocation on a failed refcount update.

> +            return ret;
>          }
>      }
>
> @@ -807,6 +812,14 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
>       * be flushed before the caller's L2 table updates.
>       */
>      qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
> +
> +    offset = s->free_byte_offset;

[9] while you don't compute it until the very end.
> +
> +    s->free_byte_offset += size;
> +    if (!offset_into_cluster(s, s->free_byte_offset)) {
> +        s->free_byte_offset = 0;

[10] Hmm. This condition means that the compressed data fits EXACTLY into the tail of the existing active compressed cluster, so the next byte allocation will need to start a new cluster. However, you are NOT setting s->free_cluster_index back to 0, so we are back to my analysis of the test at point [7] being suspect. As I argued above, it is possible that new_cluster could be == s->cluster_size, and so you needed something other than just 'cluster_end != new_cluster' for deciding whether to rewrite s->free_byte_offset when there was no existing tail to even consider. So my naive thought was that you could set s->free_cluster_index to 0 at this point, and then assert at point [0] that s->free_byte_offset == 0 implies !s->free_cluster_index. But thinking about it more, that pessimizes more than it helps (s->free_cluster_index is useful to keep non-zero as much as possible), and I think my alternative of being careful to avoid cluster_end == s->cluster_size is more elegant.

> +    }
> +
>      return offset;
>  }
>

[11] In summary, everything you have looks like it works, except for the fact that your rewrite has a pessimization: EVERY compressed write that spills past a cluster tail will start at the beginning of a new cluster, rather than taking advantage of contiguous cluster allocations to reuse the tail. Can we design a test that would exercise this, and make sure we don't pessimize in any future changes? Off the top of my head, it might be possible to design the test to use minimum-size 512-byte clusters, then do a series of 3 compressed writes of 512 bytes of data that each occupy 300 bytes when compressed.
With your pessimization, you are guaranteed that this would consume three clusters (each with refcount 1, and with a tail of 212 bytes each, regardless of whether they are contiguous). With the original code, or with my proposed fix, if we can guarantee contiguous allocation, it would only consume two clusters (both with refcount 2 because they contain 1.5 compressed data clusters each: the first cluster holding the 300 bytes of write 1 and 212 bytes of write 2, and the second cluster holding the remaining 88 bytes of write 2, the 300 bytes of write 3, and 124 bytes of tail).
On 02/04/2015 02:59 PM, Eric Blake wrote:

>> qcow2_alloc_bytes() is a function with insufficient error handling and
>> an unnecessary goto. This patch rewrites it.

>> -redo:
>> +
>>      free_in_cluster = s->cluster_size -
>>          offset_into_cluster(s, s->free_byte_offset);
>
> Based on [0], free_in_cluster is either s->cluster_size
> (s->free_byte_offset was 0 and we have no tail) or smaller than
> s->cluster_size (s->free_byte_offset was non-zero).

Maybe even worth an assert?

  assert(free_in_cluster < s->cluster_size + !s->free_byte_offset);
On 2015-02-04 at 16:59, Eric Blake wrote:
> On 02/04/2015 12:53 PM, Max Reitz wrote:
>
> My review jumps around a bit; try to follow the numbers to learn my
> stream of consciousness on reviewing it.
>
> tl;dr:
> It looks like this is functionally correct (you won't break behavior),
> but that you have a pessimization bug (see [7]), so you ought to spin v2.
>
>> qcow2_alloc_bytes() is a function with insufficient error handling and
>> an unnecessary goto. This patch rewrites it.
>
> Not clear what the added error handling is from just the commit message. [1]

Well, there are two cases, and I didn't really mind writing it into the commit message since this patch completely rewrites the function anyway.

[snip]

>> +
>> +        cluster_end = start_of_cluster(s, s->free_byte_offset) +
>> +                      s->cluster_size;
>
> [6] cluster_end is now either s->cluster_size (s->free_byte_offset was
> 0) or the first cluster-aligned address after s->free_byte_offset (will
> tell us if new_cluster is contiguous). If I'm not mistaken, you could
> also write this as:
>
>   cluster_end = ROUND_UP(s->free_byte_offset, s->cluster_size);
>
> which would give the same value for a non-zero s->free_byte_offset, but
> the value 0 instead of s->cluster_size if s->free_byte_offset is 0.
> I'll analyze at point [7] what semantic difference that would make.

Indeed. I was wondering about s->free_byte_offset maybe being aligned to a cluster boundary and thus breaking this, but that can never happen (if it is aligned, it will be set to 0 at the end of this function). If it did happen, the free_in_cluster calculation would break, too. I'll add an

  assert(!s->free_byte_offset || offset_into_cluster(s, s->free_byte_offset));

at the start of this function (which might have avoided "[0] ... I finally figured it out").

>> +
>> +        if (!s->free_cluster_index || cluster_end != new_cluster) {
>> +            s->free_byte_offset = new_cluster;
>> +        }
>
> [7] This was the only mention of s->free_cluster_index in the patch.
Oops, that's because I meant to use s->free_byte_offset.

> I had to hunt for its use in the code base;

Sorry. :-/

> there are only two places in
> the code base that set it to non-zero: alloc_clusters_noref() and
> update_refcount(). But alloc_clusters_noref() is called by
> qcow2_alloc_clusters(), which we just called. Therefore, the left half
> of the || is always true, and you always end up changing
> s->free_byte_offset (that is, when you allocate a new cluster, you never
> reuse the tail of an existing cluster, even if the two were contiguous).
> [11]
>
> At one point in my review, I wondered if point [0] should assert that
> !s->free_byte_offset should imply !s->free_cluster_index (more on that
> at [10]), but I convinced myself that would be wrong.
>
> Let's ignore the left half of the ||, and focus on the right half. If
> s->free_byte_offset was non-zero, you now know whether the new cluster
> is contiguous (use the existing tail, and the overflow into the new
> cluster is safe) or not (use only the new cluster); either way,
> s->free_byte_offset is now the point where the compressed data will be
> written. If s->free_byte_offset was 0, you want to unconditionally
> change it to the start of the just-allocated cluster.
>
> Note that if you used my suggestion above of making cluster_end == 0
> rather than s->cluster_size for the !s->free_byte_offset case, then you
> are guaranteed that cluster_end != new_cluster is a sufficient test for
> when to correctly rewrite s->free_byte_offset (since new_cluster will be
> non-zero, because qcow2_alloc_clusters() will never return the header
> cluster); whereas with your code, there is a very minute chance that
> qcow2_alloc_clusters() could be equal to s->cluster_size (that is, the
> very first cluster after the header, perhaps possible if earlier actions
> on the file moved the L1 and refcount table to later in the file and
> freed up cluster index 1).
> Using just the right half of the ||, if that
> rare chance under your code happened, then we would NOT rewrite
> s->free_byte_offset, and actually have a bug of returning 0 to the caller.

That silly small mistake made your whole review so much more difficult... I'm sorry, again.

>> +    }
>> +
>> +    if (offset_into_cluster(s, s->free_byte_offset)) {
>
> I'd feel a bit better if you added
>
>   assert(s->free_byte_offset);
>
> just before this if (which happens to be the case with your
> pessimization on the left half of the || [7], and would also be the case
> with my proposed simplification of [6]).

Yes, why not.

Thanks for your review!

Max
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 9afdb40..cafca0a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -759,46 +759,51 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
 int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
 {
     BDRVQcowState *s = bs->opaque;
-    int64_t offset, cluster_offset;
-    int free_in_cluster;
+    int64_t offset, new_cluster = 0, cluster_end;
+    size_t free_in_cluster;
 
     BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_BYTES);
     assert(size > 0 && size <= s->cluster_size);
-    if (s->free_byte_offset == 0) {
-        offset = qcow2_alloc_clusters(bs, s->cluster_size);
-        if (offset < 0) {
-            return offset;
+
+    if (s->free_byte_offset) {
+        int refcount = qcow2_get_refcount(bs,
+                                          s->free_byte_offset >> s->cluster_bits);
+        if (refcount < 0) {
+            return refcount;
+        }
+
+        if (refcount == 0xffff) {
+            s->free_byte_offset = 0;
         }
-        s->free_byte_offset = offset;
     }
-redo:
+
     free_in_cluster = s->cluster_size -
         offset_into_cluster(s, s->free_byte_offset);
-    if (size <= free_in_cluster) {
-        /* enough space in current cluster */
-        offset = s->free_byte_offset;
-        s->free_byte_offset += size;
-        free_in_cluster -= size;
-        if (free_in_cluster == 0)
-            s->free_byte_offset = 0;
-        if (offset_into_cluster(s, offset) != 0)
-            qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
-                                          QCOW2_DISCARD_NEVER);
-    } else {
-        offset = qcow2_alloc_clusters(bs, s->cluster_size);
-        if (offset < 0) {
-            return offset;
+
+    if (!s->free_byte_offset || free_in_cluster < size) {
+        new_cluster = qcow2_alloc_clusters(bs, s->cluster_size);
+        if (new_cluster < 0) {
+            return new_cluster;
         }
-        cluster_offset = start_of_cluster(s, s->free_byte_offset);
-        if ((cluster_offset + s->cluster_size) == offset) {
-            /* we are lucky: contiguous data */
-            offset = s->free_byte_offset;
-            qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
-                                          QCOW2_DISCARD_NEVER);
-            s->free_byte_offset += size;
-        } else {
-            s->free_byte_offset = offset;
-            goto redo;
+
+        cluster_end = start_of_cluster(s, s->free_byte_offset) +
+                      s->cluster_size;
+
+        if (!s->free_cluster_index || cluster_end != new_cluster) {
+            s->free_byte_offset = new_cluster;
+        }
+    }
+
+    if (offset_into_cluster(s, s->free_byte_offset)) {
+        int ret = qcow2_update_cluster_refcount(bs,
+            s->free_byte_offset >> s->cluster_bits, 1,
+            QCOW2_DISCARD_NEVER);
+        if (ret < 0) {
+            if (new_cluster > 0) {
+                qcow2_free_clusters(bs, new_cluster, s->cluster_size,
+                                    QCOW2_DISCARD_OTHER);
+            }
+            return ret;
         }
     }
 
@@ -807,6 +812,14 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
      * be flushed before the caller's L2 table updates.
      */
     qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
+
+    offset = s->free_byte_offset;
+
+    s->free_byte_offset += size;
+    if (!offset_into_cluster(s, s->free_byte_offset)) {
+        s->free_byte_offset = 0;
+    }
+
     return offset;
 }
qcow2_alloc_bytes() is a function with insufficient error handling and an unnecessary goto. This patch rewrites it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-refcount.c | 77 +++++++++++++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 32 deletions(-)