[v7,1/3] block: introduce compress filter driver

Message ID 1573670589-229357-2-git-send-email-andrey.shinkevich@virtuozzo.com
State New
Series
  • qcow2: advanced compression options

Commit Message

Andrey Shinkevich Nov. 13, 2019, 6:43 p.m. UTC
Allow writing all the data compressed through the filter driver.
The written data will be aligned to the cluster size.
With the current QEMU implementation, compressed data can be written to
unallocated clusters only. The filter may be used for a backup job.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/Makefile.objs     |   1 +
 block/filter-compress.c | 201 ++++++++++++++++++++++++++++++++++++++++++++++++
 qapi/block-core.json    |  10 ++-
 3 files changed, 208 insertions(+), 4 deletions(-)
 create mode 100644 block/filter-compress.c
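
The cluster-size alignment mentioned in the commit message can be illustrated
with a little arithmetic. The sketch below is not taken from the patch: the
helper names are hypothetical, the cluster size is assumed to be a power of
two, and 64 KiB appears in the example only because it is the qcow2 default.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; cluster must be a power of two. */

/* Round offset down to the start of its cluster. */
static uint64_t toy_align_down(uint64_t offset, uint64_t cluster)
{
    return offset & ~(cluster - 1);
}

/* Round offset up to the next cluster boundary (no-op if already aligned). */
static uint64_t toy_align_up(uint64_t offset, uint64_t cluster)
{
    return (offset + cluster - 1) & ~(cluster - 1);
}

/* Half-open byte range [start, end). */
typedef struct ToyRange {
    uint64_t start;
    uint64_t end;
} ToyRange;

/* Smallest run of whole clusters that covers [offset, offset + bytes). */
static ToyRange toy_cover_clusters(uint64_t offset, uint64_t bytes,
                                   uint64_t cluster)
{
    ToyRange r = {
        toy_align_down(offset, cluster),
        toy_align_up(offset + bytes, cluster),
    };
    return r;
}

/* Small self-check of the arithmetic; returns 0 on success. */
int toy_align_selfcheck(void)
{
    ToyRange r = toy_cover_clusters(70000, 1000, 65536);

    assert(r.start == 65536);
    assert(r.end == 131072);
    return 0;
}
```

With a 65536-byte cluster, a 1000-byte request at offset 70000 is covered by
the single cluster [65536, 131072).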

Comments

Max Reitz Nov. 14, 2019, 11:27 a.m. UTC | #1
On 13.11.19 19:43, Andrey Shinkevich wrote:
> Allow writing all the data compressed through the filter driver.
> The written data will be aligned to the cluster size.
> With the current QEMU implementation, compressed data can be written to
> unallocated clusters only. The filter may be used for a backup job.
> 
> Suggested-by: Max Reitz <mreitz@redhat.com>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/Makefile.objs     |   1 +
>  block/filter-compress.c | 201 ++++++++++++++++++++++++++++++++++++++++++++++++
>  qapi/block-core.json    |  10 ++-
>  3 files changed, 208 insertions(+), 4 deletions(-)
>  create mode 100644 block/filter-compress.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index e394fe0..330529b 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -43,6 +43,7 @@ block-obj-y += crypto.o
>  
>  block-obj-y += aio_task.o
>  block-obj-y += backup-top.o
> +block-obj-y += filter-compress.o
>  
>  common-obj-y += stream.o
>  
> diff --git a/block/filter-compress.c b/block/filter-compress.c
> new file mode 100644
> index 0000000..64b1ee5
> --- /dev/null
> +++ b/block/filter-compress.c
> @@ -0,0 +1,201 @@
> +/*
> + * Compress filter block driver
> + *
> + * Copyright (c) 2019 Virtuozzo International GmbH
> + *
> + * Author:
> + *   Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> + *   (based on block/copy-on-read.c by Max Reitz)
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 or
> + * (at your option) any later version of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "block/block_int.h"
> +#include "qemu/module.h"
> +
> +
> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
> +                         Error **errp)
> +{
> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
> +                                  errp);

Please don’t attach something that the QAPI schema calls “file” as
bs->backing.

Yes, attaching it as bs->file would break backing chains.  That’s a bug
in the block layer.  I’ve been working on a fix for a long time.

Please don’t introduce more weirdness just because we have a bug in the
block layer.

(Note that I’d strongly oppose calling the child “backing” in the QAPI
schema, as this would go against what all other user-creatable filters do.)

Max
Vladimir Sementsov-Ogievskiy Nov. 14, 2019, 11:59 a.m. UTC | #2
14.11.2019 14:27, Max Reitz wrote:
> On 13.11.19 19:43, Andrey Shinkevich wrote:
>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>> +                         Error **errp)
>> +{
>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>> +                                  errp);
> 
> Please don’t attach something that the QAPI schema calls “file” as
> bs->backing.


Agreed, it's a mistake. If we want a backing child and the user sets backing in the options, I think it's opened automatically.

> 
> Yes, attaching it as bs->file would break backing chains.  That’s a bug
> in the block layer.  I’ve been working on a fix for a long time.
> 
> Please don’t introduce more weirdness just because we have a bug in the
> block layer.
> 
> (Note that I’d strongly oppose calling the child “backing” in the QAPI
> schema, as this would go against what all other user-creatable filters do.)
> 

So, are you opposed to a correct backing-based user-creatable filter (with backing both
in QAPI and in the code)?

Do you think that, if we make backup-top user-creatable, we should switch it to being
file-child-based, or support both a backing and a file child?
Max Reitz Nov. 15, 2019, 9:32 a.m. UTC | #3
On 14.11.19 12:59, Vladimir Sementsov-Ogievskiy wrote:
> 14.11.2019 14:27, Max Reitz wrote:
>> On 13.11.19 19:43, Andrey Shinkevich wrote:
>>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>>> +                         Error **errp)
>>> +{
>>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>>> +                                  errp);
>>
>> Please don’t attach something that the QAPI schema calls “file” as
>> bs->backing.
> 
> 
> Agreed, it's a mistake. If we want a backing child and the user sets backing in the options, I think it's opened automatically.
> 
>>
>> Yes, attaching it as bs->file would break backing chains.  That’s a bug
>> in the block layer.  I’ve been working on a fix for a long time.
>>
>> Please don’t introduce more weirdness just because we have a bug in the
>> block layer.
>>
>> (Note that I’d strongly oppose calling the child “backing” in the QAPI
>> schema, as this would go against what all other user-creatable filters do.)
>>
> 
> So, are you opposed to a correct backing-based user-creatable filter (with backing both
> in QAPI and in the code)?

I’m not opposed to fixing it, but I don’t think the fix is to make all
filters use bs->backing.

> Do you think that, if we make backup-top user-creatable, we should switch it to being
> file-child-based, or support both a backing and a file child?

I definitely don’t think it would be wrong.

It depends on how difficult it is.  I’m currently working on (more
groundwork for the filter series v7) a series to rework BdrvChildRole so
we can see from it what a child is used for (data, metadata, filter,
COW).  I can already see that it won’t work out perfectly because
whenever we attach "backing", the question is whether that’s a COW child
now or whether it’s a filtered child.  I suppose I’m going to guess COW
when there’s no way to get the information, and maybe sometimes be wrong.
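
The ambiguity described in the paragraph above can be sketched in a few lines
of standalone C. This is a toy model, not the series being discussed: the
flag names are hypothetical, and "unknown" is modelled as a NULL pointer.
It only shows the fallback rule of guessing COW for a "backing" child when
nothing says whether the parent is a filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical role flags, named after the uses listed in the message. */
enum ToyChildRole {
    TOY_ROLE_DATA     = 1 << 0,
    TOY_ROLE_METADATA = 1 << 1,
    TOY_ROLE_FILTERED = 1 << 2,
    TOY_ROLE_COW      = 1 << 3,
};

/*
 * Decide what a child attached as "backing" is used for.
 * parent_is_filter points to the answer if it is known,
 * or is NULL when there is no way to get the information.
 */
static enum ToyChildRole toy_backing_role(const bool *parent_is_filter)
{
    if (parent_is_filter && *parent_is_filter) {
        return TOY_ROLE_FILTERED;
    }
    /* Known non-filter parent, or unknown: fall back to COW. */
    return TOY_ROLE_COW;
}

/* Small self-check of the fallback rule; returns 0 on success. */
int toy_role_selfcheck(void)
{
    bool yes = true;
    bool no = false;

    assert(toy_backing_role(&yes) == TOY_ROLE_FILTERED);
    assert(toy_backing_role(&no) == TOY_ROLE_COW);
    assert(toy_backing_role(NULL) == TOY_ROLE_COW);
    return 0;
}
```

The NULL case is the one that may sometimes be wrong: a filter that reuses
the "backing" link would be classified as COW here.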

In my honest opinion, reusing bs->backing for filters was wrong.  I’m
not saying that bs->file was any better.  But I have a bit of a gripe
with filters using bs->backing, because it’s acknowledging a bug but not
fixing it at the same time.  Had we fixed the bug when we first noticed
it with the introduction of the mirror filter, maybe we wouldn’t be in
this position now.  Or maybe we should have just added a bs->filtered link.

But maybes aside, it still means that using bs->backing instead of
bs->file is not really better.  Right now it’s both wrong, and we need
to fix the block layer so it isn’t.

So what to do for new filters?  Sure, bs->backing works around a bug
now.  But it’ll be weird once the bug is fixed.  Then we’ll have filters
that use @file and others will use @backing.  I don’t think we want
that, I think we want a uniform interface for all filters.

And yes, that implies we probably should change backup-top to use file
instead of backing once it gets an external interface.

(Compare
https://lists.nongnu.org/archive/html/qemu-block/2017-09/msg00380.html
)

Max
Andrey Shinkevich Nov. 15, 2019, 10:12 a.m. UTC | #4
On 15/11/2019 12:32, Max Reitz wrote:
> On 14.11.19 12:59, Vladimir Sementsov-Ogievskiy wrote:
>> 14.11.2019 14:27, Max Reitz wrote:
>>> On 13.11.19 19:43, Andrey Shinkevich wrote:
>>>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>>>> +                         Error **errp)
>>>> +{
>>>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>>>> +                                  errp);
>>>
>>> Please don’t attach something that the QAPI schema calls “file” as
>>> bs->backing.
>>
>>
>> Agreed, it's a mistake. If we want a backing child and the user sets backing in the options, I think it's opened automatically.
>>
>>>
>>> Yes, attaching it as bs->file would break backing chains.  That’s a bug
>>> in the block layer.  I’ve been working on a fix for a long time.
>>>
>>> Please don’t introduce more weirdness just because we have a bug in the
>>> block layer.
>>>
>>> (Note that I’d strongly oppose calling the child “backing” in the QAPI
>>> schema, as this would go against what all other user-creatable filters do.)
>>>
>>
>> So, are you opposed to a correct backing-based user-creatable filter (with backing both
>> in QAPI and in the code)?
> 
> I’m not opposed to fixing it, but I don’t think the fix is to make all
> filters use bs->backing.
> 
>> Do you think that, if we make backup-top user-creatable, we should switch it to being
>> file-child-based, or support both a backing and a file child?
> 
> I definitely don’t think it would be wrong.
> 
> It depends on how difficult it is.  I’m currently working on (more
> groundwork for the filter series v7) a series to rework BdrvChildRole so
> we can see from it what a child is used for (data, metadata, filter,
> COW).  I can already see that it won’t work out perfectly because
> whenever we attach "backing", the question is whether that’s a COW child
> now or whether it’s a filtered child.  I suppose I’m going to guess COW
> when there’s no way to get the information, and maybe sometimes be wrong.
> 
> In my honest opinion, reusing bs->backing for filters was wrong.  I’m
> not saying that bs->file was any better.  But I have a bit of a gripe
> with filters using bs->backing, because it’s acknowledging a bug but not
> fixing it at the same time.  Had we fixed the bug when we first noticed
> it with the introduction of the mirror filter, maybe we wouldn’t be in
> this position now.  Or maybe we should have just added a bs->filtered link.
> 
> But maybes aside, it still means that using bs->backing instead of
> bs->file is not really better.  Right now it’s both wrong, and we need
> to fix the block layer so it isn’t.
> 
> So what to do for new filters?  Sure, bs->backing works around a bug
> now.  But it’ll be weird once the bug is fixed.  Then we’ll have filters
> that use @file and others will use @backing.  I don’t think we want
> that, I think we want a uniform interface for all filters.
> 
> And yes, that implies we probably should change backup-top to use file
> instead of backing once it gets an external interface.
> 
> (Compare
> https://lists.nongnu.org/archive/html/qemu-block/2017-09/msg00380.html
> )
> 
> Max
> 

What if we modify backing_bs() to something like this
(to work around breaking the backing chain):

/* Return the node attached as bs->file, if any. */
static BlockDriverState *child_file_bs(BlockDriverState *bs)
{
    return bs->file ? bs->file->bs : NULL;
}

/* Follow file children until reaching a node that is not a filter. */
static BlockDriverState *skip_filter(BlockDriverState *bs)
{
    BlockDriverState *ret_bs = bs;

    while (ret_bs && ret_bs->drv && ret_bs->drv->is_filter) {
        ret_bs = child_file_bs(ret_bs);
    }

    return ret_bs;
}

/* Return the backing node of the first non-filter node at or below bs. */
static BlockDriverState *backing_bs(BlockDriverState *bs)
{
    BlockDriverState *ret_bs = skip_filter(bs);

    if (!ret_bs) {
        return NULL;
    }

    return ret_bs->backing ? ret_bs->backing->bs : NULL;
}
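
The effect of this proposal can be tried outside QEMU with a toy model. The
sketch below is only an illustration under stated assumptions: ToyBDS and all
toy_* names are hypothetical stand-ins, the BdrvChild indirection and the drv
check are flattened into plain pointers and a bool, and the chain is a
compress-like filter whose file child is "top", which in turn has "base" as
its backing node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState; only the fields used below. */
typedef struct ToyBDS ToyBDS;
struct ToyBDS {
    const char *name;
    bool is_filter;   /* models bs->drv->is_filter */
    ToyBDS *file;     /* models bs->file->bs */
    ToyBDS *backing;  /* models bs->backing->bs */
};

/* Plain lookup: only the node's own backing link is considered. */
static ToyBDS *toy_backing_plain(ToyBDS *bs)
{
    return bs ? bs->backing : NULL;
}

/* Follow file children until reaching a node that is not a filter. */
static ToyBDS *toy_skip_filter(ToyBDS *bs)
{
    while (bs && bs->is_filter) {
        bs = bs->file;
    }
    return bs;
}

/* Proposed lookup: skip filters first, then take the backing link. */
static ToyBDS *toy_backing_skipping(ToyBDS *bs)
{
    ToyBDS *node = toy_skip_filter(bs);

    return node ? node->backing : NULL;
}

/* Count the nodes visited when walking from bs with the given lookup. */
static int toy_chain_length(ToyBDS *bs, ToyBDS *(*next)(ToyBDS *))
{
    int n = 0;

    for (; bs; bs = next(bs)) {
        n++;
    }
    return n;
}

/* Small self-check of the model; returns 0 on success. */
int toy_chain_selfcheck(void)
{
    ToyBDS base = { "base", false, NULL, NULL };
    ToyBDS top = { "top", false, NULL, &base };
    ToyBDS filter = { "compress", true, &top, NULL };

    /* The filter has no backing link of its own: the chain ends here. */
    assert(toy_backing_plain(&filter) == NULL);
    /* Skipping the filter first reaches top and returns its backing. */
    assert(toy_backing_skipping(&filter) == &base);
    assert(toy_chain_length(&filter, toy_backing_plain) == 1);
    assert(toy_chain_length(&filter, toy_backing_skipping) == 2);
    return 0;
}
```

In this model the walk with the modified lookup visits the filter and then
base. The filtered node "top" is stepped over rather than returned, which is
worth keeping in mind for callers that expect to see every node in the chain.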
Vladimir Sementsov-Ogievskiy Nov. 15, 2019, 10:52 a.m. UTC | #5
15.11.2019 12:32, Max Reitz wrote:
> On 14.11.19 12:59, Vladimir Sementsov-Ogievskiy wrote:
>> 14.11.2019 14:27, Max Reitz wrote:
>>> On 13.11.19 19:43, Andrey Shinkevich wrote:
>>>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>>>> +                         Error **errp)
>>>> +{
>>>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>>>> +                                  errp);
>>>
>>> Please don’t attach something that the QAPI schema calls “file” as
>>> bs->backing.
>>
>>
>> Agreed, it's a mistake. If we want a backing child and the user sets backing in the options, I think it's opened automatically.
>>
>>>
>>> Yes, attaching it as bs->file would break backing chains.  That’s a bug
>>> in the block layer.  I’ve been working on a fix for a long time.
>>>
>>> Please don’t introduce more weirdness just because we have a bug in the
>>> block layer.
>>>
>>> (Note that I’d strongly oppose calling the child “backing” in the QAPI
>>> schema, as this would go against what all other user-creatable filters do.)
>>>
>>
>> So, are you opposed to a correct backing-based user-creatable filter (with backing both
>> in QAPI and in the code)?
> 
> I’m not opposed to fixing it, but I don’t think the fix is to make all
> filters use bs->backing.
> 
>> Do you think that, if we make backup-top user-creatable, we should switch it to being
>> file-child-based, or support both a backing and a file child?
> 
> I definitely don’t think it would be wrong.
> 
> It depends on how difficult it is.  I’m currently working on (more
> groundwork for the filter series v7) a series to rework BdrvChildRole so
> we can see from it what a child is used for (data, metadata, filter,
> COW).  I can already see that it won’t work out perfectly because
> whenever we attach "backing", the question is whether that’s a COW child
> now or whether it’s a filtered child.  I suppose I’m going to guess COW
> when there’s no way to get the information, and maybe sometimes be wrong.
> 
> In my honest opinion, reusing bs->backing for filters was wrong.  I’m
> not saying that bs->file was any better.  But I have a bit of a gripe
> with filters using bs->backing, because it’s acknowledging a bug but not
> fixing it at the same time.  Had we fixed the bug when we first noticed
> it with the introduction of the mirror filter, maybe we wouldn’t be in
> this position now.  Or maybe we should have just added a bs->filtered link.
> 
> But maybes aside, it still means that using bs->backing instead of
> bs->file is not really better.  Right now it’s both wrong, and we need
> to fix the block layer so it isn’t.
> 
> So what to do for new filters?  Sure, bs->backing works around a bug
> now.  But it’ll be weird once the bug is fixed.  Then we’ll have filters
> that use @file and others will use @backing.  I don’t think we want
> that, I think we want a uniform interface for all filters.
> 
> And yes, that implies we probably should change backup-top to use file
> instead of backing once it gets an external interface.
> 
> (Compare
> https://lists.nongnu.org/archive/html/qemu-block/2017-09/msg00380.html
> )
> 
> Max
> 

OK, got your point. Let's use the file child in the compress filter. Looking forward to your series!
Vladimir Sementsov-Ogievskiy Nov. 15, 2019, 10:55 a.m. UTC | #6
15.11.2019 13:52, Vladimir Sementsov-Ogievskiy wrote:
> 15.11.2019 12:32, Max Reitz wrote:
>> On 14.11.19 12:59, Vladimir Sementsov-Ogievskiy wrote:
>>> 14.11.2019 14:27, Max Reitz wrote:
>>>> On 13.11.19 19:43, Andrey Shinkevich wrote:
>>>>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>>>>> +                         Error **errp)
>>>>> +{
>>>>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>>>>> +                                  errp);
>>>>
>>>> Please don’t attach something that the QAPI schema calls “file” as
>>>> bs->backing.
>>>
>>>
>>> Agreed, it's a mistake. If we want a backing child and the user sets backing in the options, I think it's opened automatically.
>>>
>>>>
>>>> Yes, attaching it as bs->file would break backing chains.  That’s a bug
>>>> in the block layer.  I’ve been working on a fix for a long time.
>>>>
>>>> Please don’t introduce more weirdness just because we have a bug in the
>>>> block layer.
>>>>
>>>> (Note that I’d strongly oppose calling the child “backing” in the QAPI
>>>> schema, as this would go against what all other user-creatable filters do.)
>>>>
>>>
>>> So, are you opposed to a correct backing-based user-creatable filter (with backing both
>>> in QAPI and in the code)?
>>
>> I’m not opposed to fixing it, but I don’t think the fix is to make all
>> filters use bs->backing.
>>
>>> Do you think that, if we make backup-top user-creatable, we should switch it to being
>>> file-child-based, or support both a backing and a file child?
>>
>> I definitely don’t think it would be wrong.
>>
>> It depends on how difficult it is.  I’m currently working on (more
>> groundwork for the filter series v7) a series to rework BdrvChildRole so
>> we can see from it what a child is used for (data, metadata, filter,
>> COW).  I can already see that it won’t work out perfectly because
>> whenever we attach "backing", the question is whether that’s a COW child
>> now or whether it’s a filtered child.  I suppose I’m going to guess COW
>> when there’s no way to get the information, and maybe sometimes be wrong.
>>
>> In my honest opinion, reusing bs->backing for filters was wrong.  I’m
>> not saying that bs->file was any better.  But I have a bit of a gripe
>> with filters using bs->backing, because it’s acknowledging a bug but not
>> fixing it at the same time.  Had we fixed the bug when we first noticed
>> it with the introduction of the mirror filter, maybe we wouldn’t be in
>> this position now.  Or maybe we should have just added a bs->filtered link.
>>
>> But maybes aside, it still means that using bs->backing instead of
>> bs->file is not really better.  Right now it’s both wrong, and we need
>> to fix the block layer so it isn’t.
>>
>> So what to do for new filters?  Sure, bs->backing works around a bug
>> now.  But it’ll be weird once the bug is fixed.  Then we’ll have filters
>> that use @file and others will use @backing.  I don’t think we want
>> that, I think we want a uniform interface for all filters.
>>
>> And yes, that implies we probably should change backup-top to use file
>> instead of backing once it gets an external interface.
>>
>> (Compare
>> https://lists.nongnu.org/archive/html/qemu-block/2017-09/msg00380.html
>> )
>>
>> Max
>>
> 
> OK, got your point. Let's use the file child in the compress filter. Looking forward to your series!
> 

I wonder how much of your series is needed to make it possible to use the compress filter
in stream. Enough to make it work in 5.0?
Max Reitz Nov. 15, 2019, 12:03 p.m. UTC | #7
On 15.11.19 11:12, Andrey Shinkevich wrote:
> 
> 
> On 15/11/2019 12:32, Max Reitz wrote:
>> On 14.11.19 12:59, Vladimir Sementsov-Ogievskiy wrote:
>>> 14.11.2019 14:27, Max Reitz wrote:
>>>> On 13.11.19 19:43, Andrey Shinkevich wrote:
>>>>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>>>>> +                         Error **errp)
>>>>> +{
>>>>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>>>>> +                                  errp);
>>>>
>>>> Please don’t attach something that the QAPI schema calls “file” as
>>>> bs->backing.
>>>
>>>
>>> Agreed, it's a mistake. If we want backing and the user sets backing in the options, it's opened automatically, I think.
>>>
>>>>
>>>> Yes, attaching it as bs->file would break backing chains.  That’s a bug
>>>> in the block layer.  I’ve been working on a fix for a long time.
>>>>
>>>> Please don’t introduce more weirdness just because we have a bug in the
>>>> block layer.
>>>>
>>>> (Note that I’d strongly oppose calling the child “backing” in the QAPI
>>>> schema, as this would go against what all other user-creatable filters do.)
>>>>
>>>
>>> So, are you opposed to a correct backing-based user-creatable filter (with backing both
>>> in QAPI and code)?
>>
>> I’m not opposed to fixing it, but I don’t think the fix is to make all
>> filters use bs->backing.
>>
>>> Do you think, that if we make backup-top to be user-creatable, we should move it to be
>>> file-child-based, or support both backing and file child?
>>
>> I definitely don’t think it would be wrong.
>>
>> It depends on how difficult it is.  I’m currently working on (more
>> groundwork for the filter series v7) a series to rework BdrvChildRole so
>> we can see from it what a child is used for (data, metadata, filter,
>> COW).  I can already see that it won’t work out perfectly because
>> whenever we attach "backing", the question is whether that’s a COW child
>> now or whether it’s a filtered child.  I suppose I’m going to guess COW
>> when there’s no way to get the information, and maybe sometimes be wrong.
>>
>> In my honest opinion, reusing bs->backing for filters was wrong.  I’m
>> not saying that bs->file was any better.  But I have a bit of a gripe
>> with filters using bs->backing, because it’s acknowledging a bug but not
>> fixing it at the same time.  Had we fixed the bug when we first noticed
>> it with the introduction of the mirror filter, maybe we wouldn’t be in
>> this position now.  Or maybe we should have just added a bs->filtered link.
>>
>> But maybes aside, it still means that using bs->backing instead of
>> bs->file is not really better.  Right now it’s both wrong, and we need
>> to fix the block layer so it isn’t.
>>
>> So what to do for new filters?  Sure, bs->backing works around a bug
>> now.  But it’ll be weird once the bug is fixed.  Then we’ll have filters
>> that use @file and others will use @backing.  I don’t think we want
>> that, I think we want a uniform interface for all filters.
>>
>> And yes, that implies we probably should change backup-top to use file
>> instead of backing once it gets an external interface.
>>
>> (Compare
>> https://lists.nongnu.org/archive/html/qemu-block/2017-09/msg00380.html
>> )
>>
>> Max
>>
> 
> What if we modify backing_bs() to something like this
> (to work around breaking a backing chain):

I hope it isn’t that simple or I’ve written a 42-patch series for
nothing. :-/
(https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00350.html)

One of the problems is that we actually need to think about whether the
functions that currently use backing_bs() really want a COW backing BDS,
or are content with a filtered BDS, too, or maybe want to skip implicit
filters, and so on.
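
To illustrate the distinction, here is a toy model in plain C. It is only a
sketch under stated assumptions: ToyNode and every toy_* helper are
hypothetical names made up for this example, not QEMU structures or
functions, and this is not how the series implements it. It shows why one
backing_bs()-style accessor cannot serve all callers: "the COW backing
node", "the node a filter passes through to", and "the next layer after
skipping filters" are three different questions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; NOT the QEMU BlockDriverState. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    const char *name;
    bool is_filter;   /* node passes data through unchanged */
    bool implicit;    /* filter inserted by a job, not by the user */
    ToyNode *file;    /* child attached as "file" */
    ToyNode *backing; /* child attached as "backing" */
};

/* The child a filter passes through to, whichever link it was attached on. */
ToyNode *toy_filtered_child(ToyNode *n)
{
    if (!n || !n->is_filter) {
        return NULL;
    }
    return n->backing ? n->backing : n->file;
}

/* The COW backing child; by definition only non-filter nodes have one. */
ToyNode *toy_cow_child(ToyNode *n)
{
    if (!n || n->is_filter) {
        return NULL;
    }
    return n->backing;
}

/* Skip every filter on top of n, explicit or implicit. */
ToyNode *toy_skip_filters(ToyNode *n)
{
    while (n && n->is_filter) {
        n = toy_filtered_child(n);
    }
    return n;
}

/* Skip only filters that the user did not create. */
ToyNode *toy_skip_implicit_filters(ToyNode *n)
{
    while (n && n->is_filter && n->implicit) {
        n = toy_filtered_child(n);
    }
    return n;
}

/* Next COW layer below n, looking through filters on both sides. */
ToyNode *toy_backing_chain_next(ToyNode *n)
{
    return toy_skip_filters(toy_cow_child(toy_skip_filters(n)));
}
```

With a chain top -> compress (filter, child attached as "file") -> base, a
caller asking for the raw backing link of top gets the filter node, while a
caller walking the backing chain wants base. Each existing user of
backing_bs() has to be checked for which of these it really means.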

Max
Max Reitz Nov. 15, 2019, 12:05 p.m. UTC | #8
On 15.11.19 11:55, Vladimir Sementsov-Ogievskiy wrote:
> 15.11.2019 13:52, Vladimir Sementsov-Ogievskiy wrote:
>> 15.11.2019 12:32, Max Reitz wrote:
>>> On 14.11.19 12:59, Vladimir Sementsov-Ogievskiy wrote:
>>>> 14.11.2019 14:27, Max Reitz wrote:
>>>>> On 13.11.19 19:43, Andrey Shinkevich wrote:
>>>>>> Allow writing all the data compressed through the filter driver.
>>>>>> The written data will be aligned by the cluster size.
>>>>>> Based on the QEMU current implementation, that data can be written to
>>>>>> unallocated clusters only. May be used for a backup job.
>>>>>>
>>>>>> Suggested-by: Max Reitz <mreitz@redhat.com>
>>>>>> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>>>> ---
>>>>>>    block/Makefile.objs     |   1 +
>>>>>>    block/filter-compress.c | 201 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>    qapi/block-core.json    |  10 ++-
>>>>>>    3 files changed, 208 insertions(+), 4 deletions(-)
>>>>>>    create mode 100644 block/filter-compress.c
>>>>>>
>>>>>> diff --git a/block/Makefile.objs b/block/Makefile.objs
>>>>>> index e394fe0..330529b 100644
>>>>>> --- a/block/Makefile.objs
>>>>>> +++ b/block/Makefile.objs
>>>>>> @@ -43,6 +43,7 @@ block-obj-y += crypto.o
>>>>>>    block-obj-y += aio_task.o
>>>>>>    block-obj-y += backup-top.o
>>>>>> +block-obj-y += filter-compress.o
>>>>>>    common-obj-y += stream.o
>>>>>> diff --git a/block/filter-compress.c b/block/filter-compress.c
>>>>>> new file mode 100644
>>>>>> index 0000000..64b1ee5
>>>>>> --- /dev/null
>>>>>> +++ b/block/filter-compress.c
>>>>>> @@ -0,0 +1,201 @@
>>>>>> +/*
>>>>>> + * Compress filter block driver
>>>>>> + *
>>>>>> + * Copyright (c) 2019 Virtuozzo International GmbH
>>>>>> + *
>>>>>> + * Author:
>>>>>> + *   Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>>>> + *   (based on block/copy-on-read.c by Max Reitz)
>>>>>> + *
>>>>>> + * This program is free software; you can redistribute it and/or
>>>>>> + * modify it under the terms of the GNU General Public License as
>>>>>> + * published by the Free Software Foundation; either version 2 or
>>>>>> + * (at your option) any later version of the License.
>>>>>> + *
>>>>>> + * This program is distributed in the hope that it will be useful,
>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>> + * GNU General Public License for more details.
>>>>>> + *
>>>>>> + * You should have received a copy of the GNU General Public License
>>>>>> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
>>>>>> + */
>>>>>> +
>>>>>> +#include "qemu/osdep.h"
>>>>>> +#include "block/block_int.h"
>>>>>> +#include "qemu/module.h"
>>>>>> +
>>>>>> +
>>>>>> +static int compress_open(BlockDriverState *bs, QDict *options, int flags,
>>>>>> +                         Error **errp)
>>>>>> +{
>>>>>> +    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
>>>>>> +                                  errp);
>>>>>
>>>>> Please don’t attach something that the QAPI schema calls “file” as
>>>>> bs->backing.
>>>>
>>>>
>>>> Agreed, it's a mistake. If we want backing and the user sets backing in the options, it's opened automatically, I think.
>>>>
>>>>>
>>>>> Yes, attaching it as bs->file would break backing chains.  That’s a bug
>>>>> in the block layer.  I’ve been working on a fix for a long time.
>>>>>
>>>>> Please don’t introduce more weirdness just because we have a bug in the
>>>>> block layer.
>>>>>
>>>>> (Note that I’d strongly oppose calling the child “backing” in the QAPI
>>>>> schema, as this would go against what all other user-creatable filters do.)
>>>>>
>>>>
>>>> So, are you opposed to a correct backing-based user-creatable filter (with backing both
>>>> in QAPI and code)?
>>>
>>> I’m not opposed to fixing it, but I don’t think the fix is to make all
>>> filters use bs->backing.
>>>
>>>> Do you think, that if we make backup-top to be user-creatable, we should move it to be
>>>> file-child-based, or support both backing and file child?
>>>
>>> I definitely don’t think it would be wrong.
>>>
>>> It depends on how difficult it is.  I’m currently working on (more
>>> groundwork for the filter series v7) a series to rework BdrvChildRole so
>>> we can see from it what a child is used for (data, metadata, filter,
>>> COW).  I can already see that it won’t work out perfectly because
>>> whenever we attach "backing", the question is whether that’s a COW child
>>> now or whether it’s a filtered child.  I suppose I’m going to guess COW
>>> when there’s no way to get the information, and maybe sometimes be wrong.
>>>
>>> In my honest opinion, reusing bs->backing for filters was wrong.  I’m
>>> not saying that bs->file was any better.  But I have a bit of a gripe
>>> with filters using bs->backing, because it’s acknowledging a bug but not
>>> fixing it at the same time.  Had we fixed the bug when we first noticed
>>> it with the introduction of the mirror filter, maybe we wouldn’t be in
>>> this position now.  Or maybe we should have just added a bs->filtered link.
>>>
>>> But maybes aside, it still means that using bs->backing instead of
>>> bs->file is not really better.  Right now it’s both wrong, and we need
>>> to fix the block layer so it isn’t.
>>>
>>> So what to do for new filters?  Sure, bs->backing works around a bug
>>> now.  But it’ll be weird once the bug is fixed.  Then we’ll have filters
>>> that use @file and others will use @backing.  I don’t think we want
>>> that, I think we want a uniform interface for all filters.
>>>
>>> And yes, that implies we probably should change backup-top to use file
>>> instead of backing once it gets an external interface.
>>>
>>> (Compare
>>> https://lists.nongnu.org/archive/html/qemu-block/2017-09/msg00380.html
>>> )
>>>
>>> Max
>>>
>>
>> OK, got your point. Let's use file child in compress filter. Hope for your series!
>>
> 
> Interesting, how much of your series needed to make it possible to use compress filter
> in stream? To make it work in 5.0?

I’ve never really considered the series as easily splittable, or I would
have done that so I wouldn’t have to rebase that monster every time.
But maybe I just didn’t take enough time to consider...

Max

Patch

diff --git a/block/Makefile.objs b/block/Makefile.objs
index e394fe0..330529b 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -43,6 +43,7 @@  block-obj-y += crypto.o
 
 block-obj-y += aio_task.o
 block-obj-y += backup-top.o
+block-obj-y += filter-compress.o
 
 common-obj-y += stream.o
 
diff --git a/block/filter-compress.c b/block/filter-compress.c
new file mode 100644
index 0000000..64b1ee5
--- /dev/null
+++ b/block/filter-compress.c
@@ -0,0 +1,201 @@ 
+/*
+ * Compress filter block driver
+ *
+ * Copyright (c) 2019 Virtuozzo International GmbH
+ *
+ * Author:
+ *   Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
+ *   (based on block/copy-on-read.c by Max Reitz)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) any later version of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+#include "qemu/module.h"
+
+
+static int compress_open(BlockDriverState *bs, QDict *options, int flags,
+                         Error **errp)
+{
+    bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
+                                  errp);
+    if (!bs->backing) {
+        return -EINVAL;
+    }
+
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        BDRV_REQ_WRITE_COMPRESSED |
+        (BDRV_REQ_FUA & bs->backing->bs->supported_write_flags);
+
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
+            bs->backing->bs->supported_zero_flags);
+
+    return 0;
+}
+
+
+#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
+                          | BLK_PERM_WRITE \
+                          | BLK_PERM_RESIZE)
+#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
+
+static void compress_child_perm(BlockDriverState *bs, BdrvChild *c,
+                                const BdrvChildRole *role,
+                                BlockReopenQueue *reopen_queue,
+                                uint64_t perm, uint64_t shared,
+                                uint64_t *nperm, uint64_t *nshared)
+{
+    *nperm = perm & PERM_PASSTHROUGH;
+    *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
+
+    /*
+     * We must not request write permissions for an inactive node, the child
+     * cannot provide it.
+     */
+    if (!(bs->open_flags & BDRV_O_INACTIVE)) {
+        *nperm |= BLK_PERM_WRITE_UNCHANGED;
+    }
+}
+
+
+static int64_t compress_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->backing->bs);
+}
+
+
+static int coroutine_fn compress_co_truncate(BlockDriverState *bs,
+                                             int64_t offset, bool exact,
+                                             PreallocMode prealloc,
+                                             Error **errp)
+{
+    return bdrv_co_truncate(bs->backing, offset, exact, prealloc, errp);
+}
+
+
+static int coroutine_fn compress_co_preadv_part(BlockDriverState *bs,
+                                                uint64_t offset, uint64_t bytes,
+                                                QEMUIOVector *qiov,
+                                                size_t qiov_offset,
+                                                int flags)
+{
+    return bdrv_co_preadv_part(bs->backing, offset, bytes, qiov, qiov_offset,
+                               flags);
+}
+
+
+static int coroutine_fn compress_co_pwritev_part(BlockDriverState *bs,
+                                                 uint64_t offset,
+                                                 uint64_t bytes,
+                                                 QEMUIOVector *qiov,
+                                                 size_t qiov_offset, int flags)
+{
+    return bdrv_co_pwritev_part(bs->backing, offset, bytes, qiov, qiov_offset,
+                                flags | BDRV_REQ_WRITE_COMPRESSED);
+}
+
+
+static int coroutine_fn compress_co_pwrite_zeroes(BlockDriverState *bs,
+                                                  int64_t offset, int bytes,
+                                                  BdrvRequestFlags flags)
+{
+    return bdrv_co_pwrite_zeroes(bs->backing, offset, bytes, flags);
+}
+
+
+static int coroutine_fn compress_co_pdiscard(BlockDriverState *bs,
+                                             int64_t offset, int bytes)
+{
+    return bdrv_co_pdiscard(bs->backing, offset, bytes);
+}
+
+
+static int compress_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    return bdrv_get_info(bs->backing->bs, bdi);
+}
+
+
+static void compress_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    BlockDriverInfo bdi;
+    int ret;
+
+    if (!bs->backing) {
+        return;
+    }
+
+    ret = bdrv_get_info(bs->backing->bs, &bdi);
+    if (ret < 0 || bdi.cluster_size == 0) {
+        return;
+    }
+
+    bs->bl.request_alignment = bdi.cluster_size;
+}
+
+
+static void compress_eject(BlockDriverState *bs, bool eject_flag)
+{
+    bdrv_eject(bs->backing->bs, eject_flag);
+}
+
+
+static void compress_lock_medium(BlockDriverState *bs, bool locked)
+{
+    bdrv_lock_medium(bs->backing->bs, locked);
+}
+
+
+static bool compress_recurse_is_first_non_filter(BlockDriverState *bs,
+                                                 BlockDriverState *candidate)
+{
+    return bdrv_recurse_is_first_non_filter(bs->backing->bs, candidate);
+}
+
+
+static BlockDriver bdrv_compress = {
+    .format_name                        = "compress",
+
+    .bdrv_open                          = compress_open,
+    .bdrv_child_perm                    = compress_child_perm,
+
+    .bdrv_getlength                     = compress_getlength,
+    .bdrv_co_truncate                   = compress_co_truncate,
+
+    .bdrv_co_preadv_part                = compress_co_preadv_part,
+    .bdrv_co_pwritev_part               = compress_co_pwritev_part,
+    .bdrv_co_pwrite_zeroes              = compress_co_pwrite_zeroes,
+    .bdrv_co_pdiscard                   = compress_co_pdiscard,
+    .bdrv_get_info                      = compress_get_info,
+    .bdrv_refresh_limits                = compress_refresh_limits,
+
+    .bdrv_eject                         = compress_eject,
+    .bdrv_lock_medium                   = compress_lock_medium,
+
+    .bdrv_co_block_status               = bdrv_co_block_status_from_backing,
+
+    .bdrv_recurse_is_first_non_filter   = compress_recurse_is_first_non_filter,
+
+    .has_variable_length                = true,
+    .is_filter                          = true,
+};
+
+static void bdrv_compress_init(void)
+{
+    bdrv_register(&bdrv_compress);
+}
+
+block_init(bdrv_compress_init);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index aa97ee2..2f34703 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2884,15 +2884,16 @@ 
 # @copy-on-read: Since 3.0
 # @blklogwrites: Since 3.0
 # @blkreplay: Since 4.2
+# @compress: Since 5.0
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevDriver',
   'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
-            'cloop', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps', 'gluster',
-            'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks',
-            'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow',
-            'qcow2', 'qed', 'quorum', 'raw', 'rbd',
+            'cloop', 'copy-on-read', 'compress', 'dmg', 'file', 'ftp', 'ftps',
+            'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi',
+            'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
+            'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
             { 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
             'sheepdog',
             'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
@@ -4045,6 +4046,7 @@ 
       'bochs':      'BlockdevOptionsGenericFormat',
       'cloop':      'BlockdevOptionsGenericFormat',
       'copy-on-read':'BlockdevOptionsGenericFormat',
+      'compress':   'BlockdevOptionsGenericFormat',
       'dmg':        'BlockdevOptionsGenericFormat',
       'file':       'BlockdevOptionsFile',
       'ftp':        'BlockdevOptionsCurlFtp',