[2/3] block/qcow2: fix the corruption when rebasing luks encrypted files

Message ID 20190906173201.7926-3-mlevitsk@redhat.com
State New
Series
  • Fix qcow2+luks corruption introduced by commit 8ac0f15f335

Commit Message

Maxim Levitsky Sept. 6, 2019, 5:32 p.m. UTC
This fixes subltle corruption introduced by luks threaded encryption
in commit 8ac0f15f335

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922

The corruption happens when we do
   * write to two or more unallocated clusters at once
   * write doesn't fully cover nether first nor last cluster

In this case, when allocating the new clusters we COW both area
prior to the write and after the write, and we encrypt them.

The above mentioned commit accidently made it so, we encrypt the
second COW are using the physical cluster offset of the first area.

Fix this by:
 * remove the offset_in_cluster parameter of do_perform_cow_encrypt
   since it is misleading. That offset can be larger that cluster size.
   instead just add the start and end COW are offsets to both host and guest offsets
   that do_perform_cow_encrypt receives.

*  in do_perform_cow_encrypt, remove the cluster offset from the host_offset
   And thus pass correctly to the qcow2_co_encrypt, the host cluster offset and full guest offset


Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 block/qcow2-cluster.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)
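To make the offset mix-up concrete, here is a minimal sketch of the arithmetic, not the actual QEMU code. It assumes a 64 KiB cluster size and assumes that, for LUKS, the encryption is keyed on the host cluster offset plus the guest offset's position within its cluster. The helpers start_of_cluster() and offset_into_cluster() are local stand-ins modelled on the qcow2 helpers of the same name; crypt_physical_offset(), old_crypt_offset() and new_crypt_offset() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Assumed cluster size, for illustration only (64 KiB). */
#define CLUSTER_SIZE UINT64_C(0x10000)

/* Local stand-in: round an offset down to the start of its cluster. */
static uint64_t start_of_cluster(uint64_t offset)
{
    return offset & ~(CLUSTER_SIZE - 1);
}

/* Local stand-in: position of an offset within its cluster. */
static uint64_t offset_into_cluster(uint64_t offset)
{
    return offset & (CLUSTER_SIZE - 1);
}

/*
 * Assumed model of the physical offset the encryption is keyed on:
 * host cluster offset plus the in-cluster part of the guest offset.
 */
static uint64_t crypt_physical_offset(uint64_t host_cluster_offset,
                                      uint64_t guest_offset)
{
    return host_cluster_offset + offset_into_cluster(guest_offset);
}

/*
 * Before the patch: the host cluster offset of the FIRST allocated
 * cluster (m_alloc_offset) is used for every COW area.
 */
static uint64_t old_crypt_offset(uint64_t m_offset, uint64_t m_alloc_offset,
                                 uint64_t cow_offset)
{
    return crypt_physical_offset(m_alloc_offset, m_offset + cow_offset);
}

/*
 * After the patch: the COW area offset is added to both the guest and
 * the host offset, and the host cluster is derived from the result.
 */
static uint64_t new_crypt_offset(uint64_t m_offset, uint64_t m_alloc_offset,
                                 uint64_t cow_offset)
{
    uint64_t guest_offset = m_offset + cow_offset;
    uint64_t host_offset = m_alloc_offset + cow_offset;

    return crypt_physical_offset(start_of_cluster(host_offset), guest_offset);
}
```

With m_offset = 0x100000, m_alloc_offset = 0x50000 and an end COW area starting 0x11000 bytes into the allocation (that is, inside the second cluster), the old computation yields 0x51000 while the new one yields 0x61000. For a start COW area at offset 0 both agree, which matches the observation that only writes spanning two or more clusters are affected.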

Comments

Eric Blake Sept. 6, 2019, 7:17 p.m. UTC | #1
On 9/6/19 12:32 PM, Maxim Levitsky wrote:
> This fixes subltle corruption introduced by luks threaded encryption

subtle

> in commit 8ac0f15f335
> 
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922
> 
> The corruption happens when we do
>    * write to two or more unallocated clusters at once
>    * write doesn't fully cover nether first nor last cluster

s/nether/neither/

or even:

write doesn't fully cover either the first or the last cluster

> 
> In this case, when allocating the new clusters we COW both area

areas

> prior to the write and after the write, and we encrypt them.
> 
> The above mentioned commit accidently made it so, we encrypt the

accidentally

s/made it so, we encrypt/changed the encryption of/

> second COW are using the physical cluster offset of the first area.

s/are using/to use/

> 
> Fix this by:
>  * remove the offset_in_cluster parameter of do_perform_cow_encrypt
>    since it is misleading. That offset can be larger that cluster size.
>    instead just add the start and end COW are offsets to both host and guest offsets
>    that do_perform_cow_encrypt receives.
> 
> *  in do_perform_cow_encrypt, remove the cluster offset from the host_offset
>    And thus pass correctly to the qcow2_co_encrypt, the host cluster offset and full guest offset
> 
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  block/qcow2-cluster.c | 26 +++++++++++++++-----------
>  1 file changed, 15 insertions(+), 11 deletions(-)
> 

> +++ b/block/qcow2-cluster.c
> @@ -463,20 +463,20 @@ static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
>  }
>  
>  static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
> -                                                uint64_t guest_cluster_offset,
> -                                                uint64_t host_cluster_offset,
> -                                                unsigned offset_in_cluster,
> +                                                uint64_t guest_offset,
> +                                                uint64_t host_offset,
>                                                  uint8_t *buffer,
>                                                  unsigned bytes)
>  {
>      if (bytes && bs->encrypted) {
>          BDRVQcow2State *s = bs->opaque;
> -        assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
> +        assert((guest_offset & ~BDRV_SECTOR_MASK) == 0);
> +        assert((host_offset & ~BDRV_SECTOR_MASK) == 0);
>          assert((bytes & ~BDRV_SECTOR_MASK) == 0);

Pre-existing, but we could use QEMU_IS_ALIGNED(x, BDRV_SECTOR_SIZE) for
slightly more legibility than open-coding the bit operation.

Neat trick about power-of-2 alignment checks:

assert(QEMU_IS_ALIGNED(offset_in_cluster | guest_offset |
                       host_offset | bytes, BDRV_SECTOR_SIZE));

gives the same result in one assertion.  (I've used it elsewhere in the
code base, but I'm not opposed to one assert per variable if you think
batching is too dense.)
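
The batching trick works because, for a power-of-2 alignment, a value is aligned exactly when its low bits are all zero, and OR-ing several values keeps any low bit that is set in at least one of them. A small sketch of that equivalence follows; is_aligned() is a hypothetical local helper standing in for QEMU_IS_ALIGNED, SECTOR_SIZE stands in for BDRV_SECTOR_SIZE (512), and only the three values that remain after the patch are checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for BDRV_SECTOR_SIZE. */
#define SECTOR_SIZE UINT64_C(512)

/* Hypothetical helper; only valid when alignment is a power of 2. */
static bool is_aligned(uint64_t value, uint64_t alignment)
{
    return (value & (alignment - 1)) == 0;
}

/* One check per variable. */
static bool all_aligned_separately(uint64_t guest_offset,
                                   uint64_t host_offset,
                                   uint64_t bytes)
{
    return is_aligned(guest_offset, SECTOR_SIZE) &&
           is_aligned(host_offset, SECTOR_SIZE) &&
           is_aligned(bytes, SECTOR_SIZE);
}

/* One check on the bitwise OR of all variables. */
static bool all_aligned_batched(uint64_t guest_offset,
                                uint64_t host_offset,
                                uint64_t bytes)
{
    return is_aligned(guest_offset | host_offset | bytes, SECTOR_SIZE);
}
```

Both functions return the same result for any inputs; the batched form just does not say which input was misaligned.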

I'll let Dan review the actual code change, but offhand it makes sense
to me.

Maxim Levitsky Sept. 6, 2019, 7:46 p.m. UTC | #2
On Fri, 2019-09-06 at 14:17 -0500, Eric Blake wrote:
> On 9/6/19 12:32 PM, Maxim Levitsky wrote:
> > This fixes subltle corruption introduced by luks threaded encryption
> 
> subtle

I usually run commit messages through a spellchecker, but this time
I forgot to. I will try not to forget in the future.

> 
> > in commit 8ac0f15f335
> > 
> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922
> > 
> > The corruption happens when we do
> >    * write to two or more unallocated clusters at once
> >    * write doesn't fully cover nether first nor last cluster
> 
> s/nether/neither/
> 
> or even:
> 
> write doesn't fully cover either the first or the last cluster
I think I didn't write the double negative correctly here.
I meant a write that doesn't fully cover the first cluster and doesn't fully cover the last cluster.
I'll just write it like that, I guess.

> 
> > 
> > In this case, when allocating the new clusters we COW both area
> 
> areas
> 
> > prior to the write and after the write, and we encrypt them.
> > 
> > The above mentioned commit accidently made it so, we encrypt the
> 
> accidentally
> 
> s/made it so, we encrypt/changed the encryption of/
> 
> > second COW are using the physical cluster offset of the first area.
> 
> s/are using/to use/
I actually meant to write 'area' here. I confess I didn't proofread the commit
message at all. I'll do better next time.

> 
> > 
> > Fix this by:
> >  * remove the offset_in_cluster parameter of do_perform_cow_encrypt
> >    since it is misleading. That offset can be larger that cluster size.
> >    instead just add the start and end COW are offsets to both host and guest offsets
> >    that do_perform_cow_encrypt receives.
> > 
> > *  in do_perform_cow_encrypt, remove the cluster offset from the host_offset
> >    And thus pass correctly to the qcow2_co_encrypt, the host cluster offset and full guest offset
> > 
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  block/qcow2-cluster.c | 26 +++++++++++++++-----------
> >  1 file changed, 15 insertions(+), 11 deletions(-)
> > 
> > +++ b/block/qcow2-cluster.c
> > @@ -463,20 +463,20 @@ static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
> >  }
> >  
> >  static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
> > -                                                uint64_t guest_cluster_offset,
> > -                                                uint64_t host_cluster_offset,
> > -                                                unsigned offset_in_cluster,
> > +                                                uint64_t guest_offset,
> > +                                                uint64_t host_offset,
> >                                                  uint8_t *buffer,
> >                                                  unsigned bytes)
> >  {
> >      if (bytes && bs->encrypted) {
> >          BDRVQcow2State *s = bs->opaque;
> > -        assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
> > +        assert((guest_offset & ~BDRV_SECTOR_MASK) == 0);
> > +        assert((host_offset & ~BDRV_SECTOR_MASK) == 0);
> >          assert((bytes & ~BDRV_SECTOR_MASK) == 0);
> 
> Pre-existing, but we could use QEMU_IS_ALIGNED(x, BDRV_SECTOR_SIZE) for
> slightly more legibility than open-coding the bit operation.
> 
> Neat trick about power-of-2 alignment checks:
> 
> assert(QEMU_IS_ALIGNED(offset_in_cluster | guest_offset |
>                        host_offset | bytes, BDRV_SECTOR_SIZE));

In my book, shorter code is almost always better, so why not.
> 
> gives the same result in one assertion.  (I've used it elsewhere in the
> code base, but I'm not opposed to one assert per variable if you think
> batching is too dense.)
> 
> I'll let Dan review the actual code change, but offhand it makes sense
> to me.
> 

Best regards,
	Thanks for the review,
		Maxim Levitsky

Kevin Wolf Sept. 9, 2019, 10:56 a.m. UTC | #3
Am 06.09.2019 um 21:17 hat Eric Blake geschrieben:
> > -        assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
> > +        assert((guest_offset & ~BDRV_SECTOR_MASK) == 0);
> > +        assert((host_offset & ~BDRV_SECTOR_MASK) == 0);
> >          assert((bytes & ~BDRV_SECTOR_MASK) == 0);
> 
> Pre-existing, but we could use QEMU_IS_ALIGNED(x, BDRV_SECTOR_SIZE) for
> slightly more legibility than open-coding the bit operation.
> 
> Neat trick about power-of-2 alignment checks:
> 
> assert(QEMU_IS_ALIGNED(offset_in_cluster | guest_offset |
>                        host_offset | bytes, BDRV_SECTOR_SIZE));
> 
> gives the same result in one assertion.  (I've used it elsewhere in the
> code base, but I'm not opposed to one assert per variable if you think
> batching is too dense.)

A possible downside of this is that if a user reports an assertion
failure, you can't tell any more which of the variables ended up in a
bad state.

If you're lucky, you can still tell in gdb at least if the bug is
reproducible, but I wouldn't be surprised if in release builds, half of
the variables were actually optimised away, so that even this wouldn't
work.
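
The diagnostic difference can be sketched as follows. With one check per variable, the failing expression itself names the culprit; a batched check only reports that some value was misaligned. The function first_misaligned() is a hypothetical illustration, not part of QEMU, and SECTOR_SIZE stands in for BDRV_SECTOR_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for BDRV_SECTOR_SIZE. */
#define SECTOR_SIZE UINT64_C(512)

/*
 * Models what separate asserts give for free: the name of the first
 * variable that is not sector-aligned, or NULL if all are aligned.
 * A single batched check can only answer "aligned or not".
 */
static const char *first_misaligned(uint64_t guest_offset,
                                    uint64_t host_offset,
                                    uint64_t bytes)
{
    if (guest_offset & (SECTOR_SIZE - 1)) {
        return "guest_offset";
    }
    if (host_offset & (SECTOR_SIZE - 1)) {
        return "host_offset";
    }
    if (bytes & (SECTOR_SIZE - 1)) {
        return "bytes";
    }
    return NULL;
}
```

For example, first_misaligned(0x10000, 0x50201, 512) points at "host_offset", which is the information a user's assertion-failure report would carry with separate asserts and lose with the batched one.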

Kevin

Maxim Levitsky Sept. 10, 2019, 11:12 a.m. UTC | #4
On Mon, 2019-09-09 at 12:56 +0200, Kevin Wolf wrote:
> Am 06.09.2019 um 21:17 hat Eric Blake geschrieben:
> > > -        assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
> > > +        assert((guest_offset & ~BDRV_SECTOR_MASK) == 0);
> > > +        assert((host_offset & ~BDRV_SECTOR_MASK) == 0);
> > >          assert((bytes & ~BDRV_SECTOR_MASK) == 0);
> > 
> > Pre-existing, but we could use QEMU_IS_ALIGNED(x, BDRV_SECTOR_SIZE) for
> > slightly more legibility than open-coding the bit operation.
> > 
> > Neat trick about power-of-2 alignment checks:
> > 
> > assert(QEMU_IS_ALIGNED(offset_in_cluster | guest_offset |
> >                        host_offset | bytes, BDRV_SECTOR_SIZE));
> > 
> > gives the same result in one assertion.  (I've used it elsewhere in the
> > code base, but I'm not opposed to one assert per variable if you think
> > batching is too dense.)
> 
> A possible downside of this is that if a user reports an assertion
> failure, you can't tell any more which of the variables ended up in a
> bad state.
> 
> If you're lucky, you can still tell in gdb at least if the bug is
> reproducible, but I wouldn't be surprised if in release builds, half of
> the variables were actually optimised away, so that even this wouldn't
> work.
Agreed. I guess I'll keep the separate asserts after all, even though
I prefer shorter code.


Best regards,
	Maxim Levitsky

Patch

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b95e64c237..32477f0156 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -463,20 +463,20 @@  static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
 }
 
 static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
-                                                uint64_t guest_cluster_offset,
-                                                uint64_t host_cluster_offset,
-                                                unsigned offset_in_cluster,
+                                                uint64_t guest_offset,
+                                                uint64_t host_offset,
                                                 uint8_t *buffer,
                                                 unsigned bytes)
 {
     if (bytes && bs->encrypted) {
         BDRVQcow2State *s = bs->opaque;
-        assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
+        assert((guest_offset & ~BDRV_SECTOR_MASK) == 0);
+        assert((host_offset & ~BDRV_SECTOR_MASK) == 0);
         assert((bytes & ~BDRV_SECTOR_MASK) == 0);
         assert(s->crypto);
-        if (qcow2_co_encrypt(bs, host_cluster_offset,
-                             guest_cluster_offset + offset_in_cluster,
-                             buffer, bytes) < 0) {
+
+        if (qcow2_co_encrypt(bs, start_of_cluster(s, host_offset),
+                             guest_offset, buffer, bytes) < 0) {
             return false;
         }
     }
@@ -890,11 +890,15 @@  static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
 
     /* Encrypt the data if necessary before writing it */
     if (bs->encrypted) {
-        if (!do_perform_cow_encrypt(bs, m->offset, m->alloc_offset,
-                                    start->offset, start_buffer,
+        if (!do_perform_cow_encrypt(bs,
+                                    m->offset + start->offset,
+                                    m->alloc_offset + start->offset,
+                                    start_buffer,
                                     start->nb_bytes) ||
-            !do_perform_cow_encrypt(bs, m->offset, m->alloc_offset,
-                                    end->offset, end_buffer, end->nb_bytes)) {
+            !do_perform_cow_encrypt(bs,
+                                    m->offset + end->offset,
+                                    m->alloc_offset + end->offset,
+                                    end_buffer, end->nb_bytes)) {
             ret = -EIO;
             goto fail;
         }