diff mbox series

[v7,2/4] qcow2: add configurations for zoned format extension

Message ID 20240122184830.40094-3-faithilikerun@gmail.com
State New
Headers show
Series Add full zoned storage emulation to qcow2 driver | expand

Commit Message

Sam Li Jan. 22, 2024, 6:48 p.m. UTC
To configure the zoned format feature on the qcow2 driver, it
requires settings as: the device size, zone model, zone size,
zone capacity, number of conventional zones, limits on zone
resources (max append bytes, max open zones, and max_active_zones).

To create a qcow2 image with zoned format feature, use command like
this:
qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
-o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
-o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
-o zone.max_active_zones=8 -o zone.mode=host-managed

Signed-off-by: Sam Li <faithilikerun@gmail.com>
---
 block/qcow2.c                    | 252 ++++++++++++++++++++++++++++++-
 block/qcow2.h                    |  36 ++++-
 docs/interop/qcow2.txt           | 107 ++++++++++++-
 include/block/block_int-common.h |  13 ++
 qapi/block-core.json             |  67 +++++++-
 5 files changed, 469 insertions(+), 6 deletions(-)

Comments

Markus Armbruster Feb. 19, 2024, 11:57 a.m. UTC | #1
I apologize for the delayed review.

Sam Li <faithilikerun@gmail.com> writes:

> To configure the zoned format feature on the qcow2 driver, it
> requires settings as: the device size, zone model, zone size,
> zone capacity, number of conventional zones, limits on zone
> resources (max append bytes, max open zones, and max_active_zones).
>
> To create a qcow2 image with zoned format feature, use command like
> this:
> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
> -o zone.max_active_zones=8 -o zone.mode=host-managed
>
> Signed-off-by: Sam Li <faithilikerun@gmail.com>

[...]

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index ca390c5700..e2e0ec21a5 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -5038,6 +5038,67 @@
>  { 'enum': 'Qcow2CompressionType',
>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>  
> +##
> +# @Qcow2ZoneModel:
> +#
> +# Zoned device model used in qcow2 image file
> +#
> +# @host-managed: The host-managed model only allows sequential write over the
> +#     device zones.
> +#
> +# Since 8.2
> +##
> +{ 'enum': 'Qcow2ZoneModel',
> +  'data': [ 'host-managed'] }
> +
> +##
> +# @Qcow2ZoneHostManaged:
> +#
> +# The host-managed zone model.  It only allows sequential writes.
> +#
> +# @size: Total number of bytes within zones.

Default?

> +#
> +# @capacity: The number of usable logical blocks within zones
> +#     in bytes.  A zone capacity is always smaller or equal to the
> +#     zone size.

Default?

> +#
> +# @conventional-zones: The number of conventional zones of the
> +#     zoned device (default 0).
> +#
> +# @max-open-zones: The maximal number of open zones.  It is less than
> +#     or equal to the number of sequential write required zones of
> +#     the device (default 0).
> +#
> +# @max-active-zones: The maximal number of zones in the implicit
> +#     open, explicit open or closed state.  It is less than or equal
> +#     to the max open zones (default 0).
> +#
> +# @max-append-bytes: The maximal number of bytes of a zone
> +#     append request that can be issued to the device.  It must be
> +#     512-byte aligned and less than the zone capacity.

Default?

> +#
> +# Since 8.2
> +##
> +{ 'struct': 'Qcow2ZoneHostManaged',
> +  'data': { '*size':          'size',
> +            '*capacity':      'size',
> +            '*conventional-zones': 'uint32',
> +            '*max-open-zones':     'uint32',
> +            '*max-active-zones':   'uint32',
> +            '*max-append-bytes':   'size' } }
> +
> +##
> +# @Qcow2ZoneCreateOptions:
> +#
> +# The zone device model for the qcow2 image.
> +#
> +# Since 8.2
> +##
> +{ 'union': 'Qcow2ZoneCreateOptions',
> +  'base': { 'mode': 'Qcow2ZoneModel' },
> +  'discriminator': 'mode',
> +  'data': { 'host-managed': 'Qcow2ZoneHostManaged' } }
> +
>  ##
>  # @BlockdevCreateOptionsQcow2:
>  #
> @@ -5080,6 +5141,9 @@
>  # @compression-type: The image cluster compression method
>  #     (default: zlib, since 5.1)
>  #
> +# @zone: The zone device model modes.  The default is that the device is
> +# not zoned.  (since 8.2)
> +#
>  # Since: 2.12
>  ##
>  { 'struct': 'BlockdevCreateOptionsQcow2',
> @@ -5096,7 +5160,8 @@
>              '*preallocation':   'PreallocMode',
>              '*lazy-refcounts':  'bool',
>              '*refcount-bits':   'int',
> -            '*compression-type':'Qcow2CompressionType' } }
> +            '*compression-type':'Qcow2CompressionType',
> +            '*zone':            'Qcow2ZoneCreateOptions' } }
>  
>  ##
>  # @BlockdevCreateOptionsQed:
Markus Armbruster Feb. 19, 2024, 12:05 p.m. UTC | #2
One more thing...

Markus Armbruster <armbru@redhat.com> writes:

> I apologize for the delayed review.
>
> Sam Li <faithilikerun@gmail.com> writes:
>
>> To configure the zoned format feature on the qcow2 driver, it
>> requires settings as: the device size, zone model, zone size,
>> zone capacity, number of conventional zones, limits on zone
>> resources (max append bytes, max open zones, and max_active_zones).
>>
>> To create a qcow2 image with zoned format feature, use command like
>> this:
>> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
>> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
>> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
>> -o zone.max_active_zones=8 -o zone.mode=host-managed
>>
>> Signed-off-by: Sam Li <faithilikerun@gmail.com>
>
> [...]
>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index ca390c5700..e2e0ec21a5 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -5038,6 +5038,67 @@
>>  { 'enum': 'Qcow2CompressionType',
>>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>>  
>> +##
>> +# @Qcow2ZoneModel:
>> +#
>> +# Zoned device model used in qcow2 image file
>> +#
>> +# @host-managed: The host-managed model only allows sequential write over the
>> +#     device zones.
>> +#
>> +# Since 8.2
>> +##
>> +{ 'enum': 'Qcow2ZoneModel',
>> +  'data': [ 'host-managed'] }
>> +
>> +##
>> +# @Qcow2ZoneHostManaged:
>> +#
>> +# The host-managed zone model.  It only allows sequential writes.
>> +#
>> +# @size: Total number of bytes within zones.
>
> Default?
>
>> +#
>> +# @capacity: The number of usable logical blocks within zones
>> +#     in bytes.  A zone capacity is always smaller or equal to the
>> +#     zone size.
>
> Default?
>
>> +#
>> +# @conventional-zones: The number of conventional zones of the
>> +#     zoned device (default 0).
>> +#
>> +# @max-open-zones: The maximal number of open zones.  It is less than
>> +#     or equal to the number of sequential write required zones of
>> +#     the device (default 0).
>> +#
>> +# @max-active-zones: The maximal number of zones in the implicit
>> +#     open, explicit open or closed state.  It is less than or equal
>> +#     to the max open zones (default 0).
>> +#
>> +# @max-append-bytes: The maximal number of bytes of a zone
>> +#     append request that can be issued to the device.  It must be
>> +#     512-byte aligned and less than the zone capacity.
>
> Default?
>
>> +#
>> +# Since 8.2
>> +##
>> +{ 'struct': 'Qcow2ZoneHostManaged',
>> +  'data': { '*size':          'size',
>> +            '*capacity':      'size',
>> +            '*conventional-zones': 'uint32',
>> +            '*max-open-zones':     'uint32',
>> +            '*max-active-zones':   'uint32',
>> +            '*max-append-bytes':   'size' } }
>> +
>> +##
>> +# @Qcow2ZoneCreateOptions:
>> +#
>> +# The zone device model for the qcow2 image.

Please document member @mode.

Fails to build since merge commit 61e7a0d27c1:

    qapi/block-core.json: In union 'Qcow2ZoneCreateOptions':
    qapi/block-core.json:5135: member 'mode' lacks documentation

>> +#
>> +# Since 8.2
>> +##
>> +{ 'union': 'Qcow2ZoneCreateOptions',
>> +  'base': { 'mode': 'Qcow2ZoneModel' },
>> +  'discriminator': 'mode',
>> +  'data': { 'host-managed': 'Qcow2ZoneHostManaged' } }
>> +
>>  ##
>>  # @BlockdevCreateOptionsQcow2:
>>  #
>> @@ -5080,6 +5141,9 @@
>>  # @compression-type: The image cluster compression method
>>  #     (default: zlib, since 5.1)
>>  #
>> +# @zone: The zone device model modes.  The default is that the device is
>> +# not zoned.  (since 8.2)
>> +#
>>  # Since: 2.12
>>  ##
>>  { 'struct': 'BlockdevCreateOptionsQcow2',
>> @@ -5096,7 +5160,8 @@
>>              '*preallocation':   'PreallocMode',
>>              '*lazy-refcounts':  'bool',
>>              '*refcount-bits':   'int',
>> -            '*compression-type':'Qcow2CompressionType' } }
>> +            '*compression-type':'Qcow2CompressionType',
>> +            '*zone':            'Qcow2ZoneCreateOptions' } }
>>  
>>  ##
>>  # @BlockdevCreateOptionsQed:
Sam Li Feb. 19, 2024, 12:23 p.m. UTC | #3
Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
>
> One more thing...
>
> Markus Armbruster <armbru@redhat.com> writes:
>
> > I apologize for the delayed review.

No problems. Thanks for reviewing!

> >
> > Sam Li <faithilikerun@gmail.com> writes:
> >
> >> To configure the zoned format feature on the qcow2 driver, it
> >> requires settings as: the device size, zone model, zone size,
> >> zone capacity, number of conventional zones, limits on zone
> >> resources (max append bytes, max open zones, and max_active_zones).
> >>
> >> To create a qcow2 image with zoned format feature, use command like
> >> this:
> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
> >>
> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
> >
> > [...]
> >
> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >> index ca390c5700..e2e0ec21a5 100644
> >> --- a/qapi/block-core.json
> >> +++ b/qapi/block-core.json
> >> @@ -5038,6 +5038,67 @@
> >>  { 'enum': 'Qcow2CompressionType',
> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> >>
> >> +##
> >> +# @Qcow2ZoneModel:
> >> +#
> >> +# Zoned device model used in qcow2 image file
> >> +#
> >> +# @host-managed: The host-managed model only allows sequential write over the
> >> +#     device zones.
> >> +#
> >> +# Since 8.2
> >> +##
> >> +{ 'enum': 'Qcow2ZoneModel',
> >> +  'data': [ 'host-managed'] }
> >> +
> >> +##
> >> +# @Qcow2ZoneHostManaged:
> >> +#
> >> +# The host-managed zone model.  It only allows sequential writes.
> >> +#
> >> +# @size: Total number of bytes within zones.
> >
> > Default?

It should be set by users. No default value provided. If it's unset
then it is zero and an error will be returned.

> >
> >> +#
> >> +# @capacity: The number of usable logical blocks within zones
> >> +#     in bytes.  A zone capacity is always smaller or equal to the
> >> +#     zone size.
> >
> > Default?

Same.

> >
> >> +# @max-append-bytes: The maximal number of bytes of a zone
> >> +#     append request that can be issued to the device.  It must be
> >> +#     512-byte aligned and less than the zone capacity.
> >
> > Default?

Same.

For those values, I guess it could be set when users provide no
information and still want a workable emulated zoned block device.

> >
> >> +#
> >> +# Since 8.2
> >> +##
> >> +{ 'struct': 'Qcow2ZoneHostManaged',
> >> +  'data': { '*size':          'size',
> >> +            '*capacity':      'size',
> >> +            '*conventional-zones': 'uint32',
> >> +            '*max-open-zones':     'uint32',
> >> +            '*max-active-zones':   'uint32',
> >> +            '*max-append-bytes':   'size' } }
> >> +
> >> +##
> >> +# @Qcow2ZoneCreateOptions:
> >> +#
> >> +# The zone device model for the qcow2 image.
>
> Please document member @mode.
>
> Fails to build since merge commit 61e7a0d27c1:
>
>     qapi/block-core.json: In union 'Qcow2ZoneCreateOptions':
>     qapi/block-core.json:5135: member 'mode' lacks documentation
>

I see. Will update to the latest commit.

> >> +#
> >> +# Since 8.2
> >> +##
> >> +{ 'union': 'Qcow2ZoneCreateOptions',
> >> +  'base': { 'mode': 'Qcow2ZoneModel' },
> >> +  'discriminator': 'mode',
> >> +  'data': { 'host-managed': 'Qcow2ZoneHostManaged' } }
> >> +
> >>  ##
> >>  # @BlockdevCreateOptionsQcow2:
> >>  #
> >> @@ -5080,6 +5141,9 @@
> >>  # @compression-type: The image cluster compression method
> >>  #     (default: zlib, since 5.1)
> >>  #
> >> +# @zone: The zone device model modes.  The default is that the device is
> >> +# not zoned.  (since 8.2)
> >> +#
> >>  # Since: 2.12
> >>  ##
> >>  { 'struct': 'BlockdevCreateOptionsQcow2',
> >> @@ -5096,7 +5160,8 @@
> >>              '*preallocation':   'PreallocMode',
> >>              '*lazy-refcounts':  'bool',
> >>              '*refcount-bits':   'int',
> >> -            '*compression-type':'Qcow2CompressionType' } }
> >> +            '*compression-type':'Qcow2CompressionType',
> >> +            '*zone':            'Qcow2ZoneCreateOptions' } }
> >>
> >>  ##
> >>  # @BlockdevCreateOptionsQed:
>
Markus Armbruster Feb. 19, 2024, 2:39 p.m. UTC | #4
Sam Li <faithilikerun@gmail.com> writes:

> Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
>>
>> One more thing...
>>
>> Markus Armbruster <armbru@redhat.com> writes:
>>
>> > I apologize for the delayed review.
>
> No problems. Thanks for reviewing!
>
>> >
>> > Sam Li <faithilikerun@gmail.com> writes:
>> >
>> >> To configure the zoned format feature on the qcow2 driver, it
>> >> requires settings as: the device size, zone model, zone size,
>> >> zone capacity, number of conventional zones, limits on zone
>> >> resources (max append bytes, max open zones, and max_active_zones).
>> >>
>> >> To create a qcow2 image with zoned format feature, use command like
>> >> this:
>> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
>> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
>> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
>> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
>> >>
>> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
>> >
>> > [...]
>> >
>> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> >> index ca390c5700..e2e0ec21a5 100644
>> >> --- a/qapi/block-core.json
>> >> +++ b/qapi/block-core.json
>> >> @@ -5038,6 +5038,67 @@
>> >>  { 'enum': 'Qcow2CompressionType',
>> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>> >>
>> >> +##
>> >> +# @Qcow2ZoneModel:
>> >> +#
>> >> +# Zoned device model used in qcow2 image file
>> >> +#
>> >> +# @host-managed: The host-managed model only allows sequential write over the
>> >> +#     device zones.
>> >> +#
>> >> +# Since 8.2
>> >> +##
>> >> +{ 'enum': 'Qcow2ZoneModel',
>> >> +  'data': [ 'host-managed'] }
>> >> +
>> >> +##
>> >> +# @Qcow2ZoneHostManaged:
>> >> +#
>> >> +# The host-managed zone model.  It only allows sequential writes.
>> >> +#
>> >> +# @size: Total number of bytes within zones.
>> >
>> > Default?
>
> It should be set by users. No default value provided. If it's unset
> then it is zero and an error will be returned.

If the user must provide @size, why is it optional then?

>> >
>> >> +#
>> >> +# @capacity: The number of usable logical blocks within zones
>> >> +#     in bytes.  A zone capacity is always smaller or equal to the
>> >> +#     zone size.
>> >
>> > Default?
>
> Same.
>
>> >
>> >> +# @max-append-bytes: The maximal number of bytes of a zone
>> >> +#     append request that can be issued to the device.  It must be
>> >> +#     512-byte aligned and less than the zone capacity.
>> >
>> > Default?
>
> Same.
>
> For those values, I guess it could be set when users provide no
> information and still want a workable emulated zoned block device.
>
>> >
>> >> +#
>> >> +# Since 8.2
>> >> +##
>> >> +{ 'struct': 'Qcow2ZoneHostManaged',
>> >> +  'data': { '*size':          'size',
>> >> +            '*capacity':      'size',
>> >> +            '*conventional-zones': 'uint32',
>> >> +            '*max-open-zones':     'uint32',
>> >> +            '*max-active-zones':   'uint32',
>> >> +            '*max-append-bytes':   'size' } }

[...]
Sam Li Feb. 19, 2024, 2:48 p.m. UTC | #5
Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 15:40写道:
>
> Sam Li <faithilikerun@gmail.com> writes:
>
> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
> >>
> >> One more thing...
> >>
> >> Markus Armbruster <armbru@redhat.com> writes:
> >>
> >> > I apologize for the delayed review.
> >
> > No problems. Thanks for reviewing!
> >
> >> >
> >> > Sam Li <faithilikerun@gmail.com> writes:
> >> >
> >> >> To configure the zoned format feature on the qcow2 driver, it
> >> >> requires settings as: the device size, zone model, zone size,
> >> >> zone capacity, number of conventional zones, limits on zone
> >> >> resources (max append bytes, max open zones, and max_active_zones).
> >> >>
> >> >> To create a qcow2 image with zoned format feature, use command like
> >> >> this:
> >> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
> >> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
> >> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
> >> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
> >> >>
> >> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
> >> >
> >> > [...]
> >> >
> >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >> >> index ca390c5700..e2e0ec21a5 100644
> >> >> --- a/qapi/block-core.json
> >> >> +++ b/qapi/block-core.json
> >> >> @@ -5038,6 +5038,67 @@
> >> >>  { 'enum': 'Qcow2CompressionType',
> >> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> >> >>
> >> >> +##
> >> >> +# @Qcow2ZoneModel:
> >> >> +#
> >> >> +# Zoned device model used in qcow2 image file
> >> >> +#
> >> >> +# @host-managed: The host-managed model only allows sequential write over the
> >> >> +#     device zones.
> >> >> +#
> >> >> +# Since 8.2
> >> >> +##
> >> >> +{ 'enum': 'Qcow2ZoneModel',
> >> >> +  'data': [ 'host-managed'] }
> >> >> +
> >> >> +##
> >> >> +# @Qcow2ZoneHostManaged:
> >> >> +#
> >> >> +# The host-managed zone model.  It only allows sequential writes.
> >> >> +#
> >> >> +# @size: Total number of bytes within zones.
> >> >
> >> > Default?
> >
> > It should be set by users. No default value provided. If it's unset
> > then it is zero and an error will be returned.
>
> If the user must provide @size, why is it optional then?

It is not optional when the zone model is host-managed. If it's
non-zoned, then we don't care about zone info. I am not sure how to
make it unoptional.

>
> >> >
> >> >> +#
> >> >> +# @capacity: The number of usable logical blocks within zones
> >> >> +#     in bytes.  A zone capacity is always smaller or equal to the
> >> >> +#     zone size.
> >> >
> >> > Default?
> >
> > Same.
> >
> >> >
> >> >> +# @max-append-bytes: The maximal number of bytes of a zone
> >> >> +#     append request that can be issued to the device.  It must be
> >> >> +#     512-byte aligned and less than the zone capacity.
> >> >
> >> > Default?
> >
> > Same.
> >
> > For those values, I guess it could be set when users provide no
> > information and still want a workable emulated zoned block device.
> >
> >> >
> >> >> +#
> >> >> +# Since 8.2
> >> >> +##
> >> >> +{ 'struct': 'Qcow2ZoneHostManaged',
> >> >> +  'data': { '*size':          'size',
> >> >> +            '*capacity':      'size',
> >> >> +            '*conventional-zones': 'uint32',
> >> >> +            '*max-open-zones':     'uint32',
> >> >> +            '*max-active-zones':   'uint32',
> >> >> +            '*max-append-bytes':   'size' } }
>
> [...]
>
Markus Armbruster Feb. 19, 2024, 3:56 p.m. UTC | #6
Sam Li <faithilikerun@gmail.com> writes:

> Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 15:40写道:
>>
>> Sam Li <faithilikerun@gmail.com> writes:
>>
>> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
>> >>
>> >> One more thing...
>> >>
>> >> Markus Armbruster <armbru@redhat.com> writes:
>> >>
>> >> > I apologize for the delayed review.
>> >
>> > No problems. Thanks for reviewing!
>> >
>> >> >
>> >> > Sam Li <faithilikerun@gmail.com> writes:
>> >> >
>> >> >> To configure the zoned format feature on the qcow2 driver, it
>> >> >> requires settings as: the device size, zone model, zone size,
>> >> >> zone capacity, number of conventional zones, limits on zone
>> >> >> resources (max append bytes, max open zones, and max_active_zones).
>> >> >>
>> >> >> To create a qcow2 image with zoned format feature, use command like
>> >> >> this:
>> >> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
>> >> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
>> >> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
>> >> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
>> >> >>
>> >> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
>> >> >
>> >> > [...]
>> >> >
>> >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> >> >> index ca390c5700..e2e0ec21a5 100644
>> >> >> --- a/qapi/block-core.json
>> >> >> +++ b/qapi/block-core.json
>> >> >> @@ -5038,6 +5038,67 @@
>> >> >>  { 'enum': 'Qcow2CompressionType',
>> >> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>> >> >>
>> >> >> +##
>> >> >> +# @Qcow2ZoneModel:
>> >> >> +#
>> >> >> +# Zoned device model used in qcow2 image file
>> >> >> +#
>> >> >> +# @host-managed: The host-managed model only allows sequential write over the
>> >> >> +#     device zones.
>> >> >> +#
>> >> >> +# Since 8.2
>> >> >> +##
>> >> >> +{ 'enum': 'Qcow2ZoneModel',
>> >> >> +  'data': [ 'host-managed'] }
>> >> >> +
>> >> >> +##
>> >> >> +# @Qcow2ZoneHostManaged:
>> >> >> +#
>> >> >> +# The host-managed zone model.  It only allows sequential writes.
>> >> >> +#
>> >> >> +# @size: Total number of bytes within zones.
>> >> >
>> >> > Default?
>> >
>> > It should be set by users. No default value provided. If it's unset
>> > then it is zero and an error will be returned.
>>
>> If the user must provide @size, why is it optional then?
>
> It is not optional when the zone model is host-managed. If it's
> non-zoned, then we don't care about zone info. I am not sure how to
> make it unoptional.

We have:

   blockdev-create argument @options of type BlockdevCreateOptions

   BlockdevCreateOptions union branch @qcow2 of type
   BlockdevCreateOptionsQcow2, union tag member is @driver

   BlockdevCreateOptionsQcow2 optional member @zone of type
   Qcow2ZoneCreateOptions, default not zoned

   Qcow2ZoneCreateOptions union branch @host-managed of type
   Qcow2ZoneHostManaged, union tag member is @mode

   Qcow2ZoneHostManaged optional member @size of type size.

Making this member @size mandatory means we must specify it when
BlockdevCreateOptionsQcow2 member @zone is present and @zone's member
@mode is "host-managed".  Feels right to me.  Am I missing anything?

>>
>> >> >
>> >> >> +#
>> >> >> +# @capacity: The number of usable logical blocks within zones
>> >> >> +#     in bytes.  A zone capacity is always smaller or equal to the
>> >> >> +#     zone size.
>> >> >
>> >> > Default?
>> >
>> > Same.
>> >
>> >> >
>> >> >> +# @max-append-bytes: The maximal number of bytes of a zone
>> >> >> +#     append request that can be issued to the device.  It must be
>> >> >> +#     512-byte aligned and less than the zone capacity.
>> >> >
>> >> > Default?
>> >
>> > Same.
>> >
>> > For those values, I guess it could be set when users provide no
>> > information and still want a workable emulated zoned block device.
>> >
>> >> >
>> >> >> +#
>> >> >> +# Since 8.2
>> >> >> +##
>> >> >> +{ 'struct': 'Qcow2ZoneHostManaged',
>> >> >> +  'data': { '*size':          'size',
>> >> >> +            '*capacity':      'size',
>> >> >> +            '*conventional-zones': 'uint32',
>> >> >> +            '*max-open-zones':     'uint32',
>> >> >> +            '*max-active-zones':   'uint32',
>> >> >> +            '*max-append-bytes':   'size' } }
>>
>> [...]
>>
Sam Li Feb. 19, 2024, 4:09 p.m. UTC | #7
Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 16:56写道:
>
> Sam Li <faithilikerun@gmail.com> writes:
>
> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 15:40写道:
> >>
> >> Sam Li <faithilikerun@gmail.com> writes:
> >>
> >> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
> >> >>
> >> >> One more thing...
> >> >>
> >> >> Markus Armbruster <armbru@redhat.com> writes:
> >> >>
> >> >> > I apologize for the delayed review.
> >> >
> >> > No problems. Thanks for reviewing!
> >> >
> >> >> >
> >> >> > Sam Li <faithilikerun@gmail.com> writes:
> >> >> >
> >> >> >> To configure the zoned format feature on the qcow2 driver, it
> >> >> >> requires settings as: the device size, zone model, zone size,
> >> >> >> zone capacity, number of conventional zones, limits on zone
> >> >> >> resources (max append bytes, max open zones, and max_active_zones).
> >> >> >>
> >> >> >> To create a qcow2 image with zoned format feature, use command like
> >> >> >> this:
> >> >> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
> >> >> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
> >> >> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
> >> >> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
> >> >> >>
> >> >> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
> >> >> >
> >> >> > [...]
> >> >> >
> >> >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >> >> >> index ca390c5700..e2e0ec21a5 100644
> >> >> >> --- a/qapi/block-core.json
> >> >> >> +++ b/qapi/block-core.json
> >> >> >> @@ -5038,6 +5038,67 @@
> >> >> >>  { 'enum': 'Qcow2CompressionType',
> >> >> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> >> >> >>
> >> >> >> +##
> >> >> >> +# @Qcow2ZoneModel:
> >> >> >> +#
> >> >> >> +# Zoned device model used in qcow2 image file
> >> >> >> +#
> >> >> >> +# @host-managed: The host-managed model only allows sequential write over the
> >> >> >> +#     device zones.
> >> >> >> +#
> >> >> >> +# Since 8.2
> >> >> >> +##
> >> >> >> +{ 'enum': 'Qcow2ZoneModel',
> >> >> >> +  'data': [ 'host-managed'] }
> >> >> >> +
> >> >> >> +##
> >> >> >> +# @Qcow2ZoneHostManaged:
> >> >> >> +#
> >> >> >> +# The host-managed zone model.  It only allows sequential writes.
> >> >> >> +#
> >> >> >> +# @size: Total number of bytes within zones.
> >> >> >
> >> >> > Default?
> >> >
> >> > It should be set by users. No default value provided. If it's unset
> >> > then it is zero and an error will be returned.
> >>
> >> If the user must provide @size, why is it optional then?
> >
> > It is not optional when the zone model is host-managed. If it's
> > non-zoned, then we don't care about zone info. I am not sure how to
> > make it unoptional.
>
> We have:
>
>    blockdev-create argument @options of type BlockdevCreateOptions
>
>    BlockdevCreateOptions union branch @qcow2 of type
>    BlockdevCreateOptionsQcow2, union tag member is @driver
>
>    BlockdevCreateOptionsQcow2 optional member @zone of type
>    Qcow2ZoneCreateOptions, default not zoned
>
>    Qcow2ZoneCreateOptions union branch @host-managed of type
>    Qcow2ZoneHostManaged, union tag member is @mode
>
>    Qcow2ZoneHostManaged optional member @size of type size.
>
> Making this member @size mandatory means we must specify it when
> BlockdevCreateOptionsQcow2 member @zone is present and @zone's member
> @mode is "host-managed".  Feels right to me.  Am I missing anything?

That's right. And the checks when creating such an img can help do
that. It's not specified in the .json file directly.

>
> >>
> >> >> >
> >> >> >> +#
> >> >> >> +# @capacity: The number of usable logical blocks within zones
> >> >> >> +#     in bytes.  A zone capacity is always smaller or equal to the
> >> >> >> +#     zone size.
> >> >> >
> >> >> > Default?
> >> >
> >> > Same.
> >> >
> >> >> >
> >> >> >> +# @max-append-bytes: The maximal number of bytes of a zone
> >> >> >> +#     append request that can be issued to the device.  It must be
> >> >> >> +#     512-byte aligned and less than the zone capacity.
> >> >> >
> >> >> > Default?
> >> >
> >> > Same.
> >> >
> >> > For those values, I guess it could be set when users provide no
> >> > information and still want a workable emulated zoned block device.
> >> >
> >> >> >
> >> >> >> +#
> >> >> >> +# Since 8.2
> >> >> >> +##
> >> >> >> +{ 'struct': 'Qcow2ZoneHostManaged',
> >> >> >> +  'data': { '*size':          'size',
> >> >> >> +            '*capacity':      'size',
> >> >> >> +            '*conventional-zones': 'uint32',
> >> >> >> +            '*max-open-zones':     'uint32',
> >> >> >> +            '*max-active-zones':   'uint32',
> >> >> >> +            '*max-append-bytes':   'size' } }
> >>
> >> [...]
> >>
>
Markus Armbruster Feb. 19, 2024, 8:42 p.m. UTC | #8
Sam Li <faithilikerun@gmail.com> writes:

> Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 16:56写道:
>>
>> Sam Li <faithilikerun@gmail.com> writes:
>>
>> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 15:40写道:
>> >>
>> >> Sam Li <faithilikerun@gmail.com> writes:
>> >>
>> >> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
>> >> >>
>> >> >> One more thing...
>> >> >>
>> >> >> Markus Armbruster <armbru@redhat.com> writes:
>> >> >>
>> >> >> > I apologize for the delayed review.
>> >> >
>> >> > No problems. Thanks for reviewing!
>> >> >
>> >> >> >
>> >> >> > Sam Li <faithilikerun@gmail.com> writes:
>> >> >> >
>> >> >> >> To configure the zoned format feature on the qcow2 driver, it
>> >> >> >> requires settings as: the device size, zone model, zone size,
>> >> >> >> zone capacity, number of conventional zones, limits on zone
>> >> >> >> resources (max append bytes, max open zones, and max_active_zones).
>> >> >> >>
>> >> >> >> To create a qcow2 image with zoned format feature, use command like
>> >> >> >> this:
>> >> >> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
>> >> >> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
>> >> >> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
>> >> >> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
>> >> >> >>
>> >> >> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
>> >> >> >
>> >> >> > [...]
>> >> >> >
>> >> >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> >> >> >> index ca390c5700..e2e0ec21a5 100644
>> >> >> >> --- a/qapi/block-core.json
>> >> >> >> +++ b/qapi/block-core.json
>> >> >> >> @@ -5038,6 +5038,67 @@
>> >> >> >>  { 'enum': 'Qcow2CompressionType',
>> >> >> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>> >> >> >>
>> >> >> >> +##
>> >> >> >> +# @Qcow2ZoneModel:
>> >> >> >> +#
>> >> >> >> +# Zoned device model used in qcow2 image file
>> >> >> >> +#
>> >> >> >> +# @host-managed: The host-managed model only allows sequential write over the
>> >> >> >> +#     device zones.
>> >> >> >> +#
>> >> >> >> +# Since 8.2
>> >> >> >> +##
>> >> >> >> +{ 'enum': 'Qcow2ZoneModel',
>> >> >> >> +  'data': [ 'host-managed'] }
>> >> >> >> +
>> >> >> >> +##
>> >> >> >> +# @Qcow2ZoneHostManaged:
>> >> >> >> +#
>> >> >> >> +# The host-managed zone model.  It only allows sequential writes.
>> >> >> >> +#
>> >> >> >> +# @size: Total number of bytes within zones.
>> >> >> >
>> >> >> > Default?
>> >> >
>> >> > It should be set by users. No default value provided. If it's unset
>> >> > then it is zero and an error will be returned.
>> >>
>> >> If the user must provide @size, why is it optional then?
>> >
>> > It is not optional when the zone model is host-managed. If it's
>> > non-zoned, then we don't care about zone info. I am not sure how to
>> > make it unoptional.
>>
>> We have:
>>
>>    blockdev-create argument @options of type BlockdevCreateOptions
>>
>>    BlockdevCreateOptions union branch @qcow2 of type
>>    BlockdevCreateOptionsQcow2, union tag member is @driver
>>
>>    BlockdevCreateOptionsQcow2 optional member @zone of type
>>    Qcow2ZoneCreateOptions, default not zoned
>>
>>    Qcow2ZoneCreateOptions union branch @host-managed of type
>>    Qcow2ZoneHostManaged, union tag member is @mode
>>
>>    Qcow2ZoneHostManaged optional member @size of type size.
>>
>> Making this member @size mandatory means we must specify it when
>> BlockdevCreateOptionsQcow2 member @zone is present and @zone's member
>> @mode is "host-managed".  Feels right to me.  Am I missing anything?
>
> That's right. And the checks when creating such an img can help do
> that. It's not specified in the .json file directly.

What would break if we did specify it in the QAPI schema directly?

[...]
Sam Li Feb. 19, 2024, 8:46 p.m. UTC | #9
Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 21:42写道:
>
> Sam Li <faithilikerun@gmail.com> writes:
>
> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 16:56写道:
> >>
> >> Sam Li <faithilikerun@gmail.com> writes:
> >>
> >> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 15:40写道:
> >> >>
> >> >> Sam Li <faithilikerun@gmail.com> writes:
> >> >>
> >> >> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
> >> >> >>
> >> >> >> One more thing...
> >> >> >>
> >> >> >> Markus Armbruster <armbru@redhat.com> writes:
> >> >> >>
> >> >> >> > I apologize for the delayed review.
> >> >> >
> >> >> > No problems. Thanks for reviewing!
> >> >> >
> >> >> >> >
> >> >> >> > Sam Li <faithilikerun@gmail.com> writes:
> >> >> >> >
> >> >> >> >> To configure the zoned format feature on the qcow2 driver, it
> >> >> >> >> requires settings as: the device size, zone model, zone size,
> >> >> >> >> zone capacity, number of conventional zones, limits on zone
> >> >> >> >> resources (max append bytes, max open zones, and max_active_zones).
> >> >> >> >>
> >> >> >> >> To create a qcow2 image with zoned format feature, use command like
> >> >> >> >> this:
> >> >> >> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
> >> >> >> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
> >> >> >> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
> >> >> >> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
> >> >> >> >>
> >> >> >> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
> >> >> >> >
> >> >> >> > [...]
> >> >> >> >
> >> >> >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >> >> >> >> index ca390c5700..e2e0ec21a5 100644
> >> >> >> >> --- a/qapi/block-core.json
> >> >> >> >> +++ b/qapi/block-core.json
> >> >> >> >> @@ -5038,6 +5038,67 @@
> >> >> >> >>  { 'enum': 'Qcow2CompressionType',
> >> >> >> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> >> >> >> >>
> >> >> >> >> +##
> >> >> >> >> +# @Qcow2ZoneModel:
> >> >> >> >> +#
> >> >> >> >> +# Zoned device model used in qcow2 image file
> >> >> >> >> +#
> >> >> >> >> +# @host-managed: The host-managed model only allows sequential write over the
> >> >> >> >> +#     device zones.
> >> >> >> >> +#
> >> >> >> >> +# Since 8.2
> >> >> >> >> +##
> >> >> >> >> +{ 'enum': 'Qcow2ZoneModel',
> >> >> >> >> +  'data': [ 'host-managed'] }
> >> >> >> >> +
> >> >> >> >> +##
> >> >> >> >> +# @Qcow2ZoneHostManaged:
> >> >> >> >> +#
> >> >> >> >> +# The host-managed zone model.  It only allows sequential writes.
> >> >> >> >> +#
> >> >> >> >> +# @size: Total number of bytes within zones.
> >> >> >> >
> >> >> >> > Default?
> >> >> >
> >> >> > It should be set by users. No default value provided. If it's unset
> >> >> > then it is zero and an error will be returned.
> >> >>
> >> >> If the user must provide @size, why is it optional then?
> >> >
> >> > It is not optional when the zone model is host-managed. If it's
> >> > non-zoned, then we don't care about zone info. I am not sure how to
> >> > make it unoptional.
> >>
> >> We have:
> >>
> >>    blockdev-create argument @options of type BlockdevCreateOptions
> >>
> >>    BlockdevCreateOptions union branch @qcow2 of type
> >>    BlockdevCreateOptionsQcow2, union tag member is @driver
> >>
> >>    BlockdevCreateOptionsQcow2 optional member @zone of type
> >>    Qcow2ZoneCreateOptions, default not zoned
> >>
> >>    Qcow2ZoneCreateOptions union branch @host-managed of type
> >>    Qcow2ZoneHostManaged, union tag member is @mode
> >>
> >>    Qcow2ZoneHostManaged optional member @size of type size.
> >>
> >> Making this member @size mandatory means we must specify it when
> >> BlockdevCreateOptionsQcow2 member @zone is present and @zone's member
> >> @mode is "host-managed".  Feels right to me.  Am I missing anything?
> >
> > That's right. And the checks when creating such an img can help do
> > that. It's not specified in the .json file directly.
>
> What would break if we did specify it in the QAPI schema directly?

Nothing I think. We can keep the current schema and add a default zone
size like 131072.

>
> [...]
>
Markus Armbruster Feb. 19, 2024, 9:15 p.m. UTC | #10
Sam Li <faithilikerun@gmail.com> writes:

> Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 21:42写道:
>>
>> Sam Li <faithilikerun@gmail.com> writes:
>>
>> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 16:56写道:
>> >>
>> >> Sam Li <faithilikerun@gmail.com> writes:
>> >>
>> >> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 15:40写道:
>> >> >>
>> >> >> Sam Li <faithilikerun@gmail.com> writes:
>> >> >>
>> >> >> > Markus Armbruster <armbru@redhat.com> 于2024年2月19日周一 13:05写道:
>> >> >> >>
>> >> >> >> One more thing...
>> >> >> >>
>> >> >> >> Markus Armbruster <armbru@redhat.com> writes:
>> >> >> >>
>> >> >> >> > I apologize for the delayed review.
>> >> >> >
>> >> >> > No problems. Thanks for reviewing!
>> >> >> >
>> >> >> >> >
>> >> >> >> > Sam Li <faithilikerun@gmail.com> writes:
>> >> >> >> >
>> >> >> >> >> To configure the zoned format feature on the qcow2 driver, it
>> >> >> >> >> requires settings as: the device size, zone model, zone size,
>> >> >> >> >> zone capacity, number of conventional zones, limits on zone
>> >> >> >> >> resources (max append bytes, max open zones, and max_active_zones).
>> >> >> >> >>
>> >> >> >> >> To create a qcow2 image with zoned format feature, use command like
>> >> >> >> >> this:
>> >> >> >> >> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
>> >> >> >> >> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
>> >> >> >> >> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
>> >> >> >> >> -o zone.max_active_zones=8 -o zone.mode=host-managed
>> >> >> >> >>
>> >> >> >> >> Signed-off-by: Sam Li <faithilikerun@gmail.com>
>> >> >> >> >
>> >> >> >> > [...]
>> >> >> >> >
>> >> >> >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> >> >> >> >> index ca390c5700..e2e0ec21a5 100644
>> >> >> >> >> --- a/qapi/block-core.json
>> >> >> >> >> +++ b/qapi/block-core.json
>> >> >> >> >> @@ -5038,6 +5038,67 @@
>> >> >> >> >>  { 'enum': 'Qcow2CompressionType',
>> >> >> >> >>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>> >> >> >> >>
>> >> >> >> >> +##
>> >> >> >> >> +# @Qcow2ZoneModel:
>> >> >> >> >> +#
>> >> >> >> >> +# Zoned device model used in qcow2 image file
>> >> >> >> >> +#
>> >> >> >> >> +# @host-managed: The host-managed model only allows sequential write over the
>> >> >> >> >> +#     device zones.
>> >> >> >> >> +#
>> >> >> >> >> +# Since 8.2
>> >> >> >> >> +##
>> >> >> >> >> +{ 'enum': 'Qcow2ZoneModel',
>> >> >> >> >> +  'data': [ 'host-managed'] }
>> >> >> >> >> +
>> >> >> >> >> +##
>> >> >> >> >> +# @Qcow2ZoneHostManaged:
>> >> >> >> >> +#
>> >> >> >> >> +# The host-managed zone model.  It only allows sequential writes.
>> >> >> >> >> +#
>> >> >> >> >> +# @size: Total number of bytes within zones.
>> >> >> >> >
>> >> >> >> > Default?
>> >> >> >
>> >> >> > It should be set by users. No default value provided. If it's unset
>> >> >> > then it is zero and an error will be returned.
>> >> >>
>> >> >> If the user must provide @size, why is it optional then?
>> >> >
>> >> > It is not optional when the zone model is host-managed. If it's
>> >> > non-zoned, then we don't care about zone info. I am not sure how to
>> >> > make it unoptional.
>> >>
>> >> We have:
>> >>
>> >>    blockdev-create argument @options of type BlockdevCreateOptions
>> >>
>> >>    BlockdevCreateOptions union branch @qcow2 of type
>> >>    BlockdevCreateOptionsQcow2, union tag member is @driver
>> >>
>> >>    BlockdevCreateOptionsQcow2 optional member @zone of type
>> >>    Qcow2ZoneCreateOptions, default not zoned
>> >>
>> >>    Qcow2ZoneCreateOptions union branch @host-managed of type
>> >>    Qcow2ZoneHostManaged, union tag member is @mode
>> >>
>> >>    Qcow2ZoneHostManaged optional member @size of type size.
>> >>
>> >> Making this member @size mandatory means we must specify it when
>> >> BlockdevCreateOptionsQcow2 member @zone is present and @zone's member
>> >> @mode is "host-managed".  Feels right to me.  Am I missing anything?
>> >
>> > That's right. And the checks when creating such an img can help do
>> > that. It's not specified in the .json file directly.
>>
>> What would break if we did specify it in the QAPI schema directly?
>
> Nothing I think. We can keep the current schema and add a default zone
> size like 131072.

I believe making the member mandatory makes a lot more sense.

I guess we can keep @capacity and @max-append-bytes keep optional *if*
we can come up with sensible defaults.

>> [...]
Damien Le Moal Feb. 20, 2024, 2:25 a.m. UTC | #11
On 2/20/24 06:15, Markus Armbruster wrote:
>>>>> Making this member @size mandatory means we must specify it when
>>>>> BlockdevCreateOptionsQcow2 member @zone is present and @zone's member
>>>>> @mode is "host-managed".  Feels right to me.  Am I missing anything?
>>>>
>>>> That's right. And the checks when creating such an img can help do
>>>> that. It's not specified in the .json file directly.
>>>
>>> What would break if we did specify it in the QAPI schema directly?
>>
>> Nothing I think. We can keep the current schema and add a default zone
>> size like 131072.
> 
> I believe making the member mandatory makes a lot more sense.
> 
> I guess we can keep @capacity and @max-append-bytes keep optional *if*
> we can come up with sensible defaults.

Yes, @capacity can be optional and default to the zone size.  @max-append-bytes
can also be optional and default to the regular read/write max size.

However, defining a "sensible" default for the zone size is rather tricky. The
reason is that zones are generally sized according to the device speed. Zones
are 256MB on HDDs (which takes about 1s to fully write sequentially) while NVMe
ZNS devices will more likely have a zone size of 1-2 GB (because the device is
much faster). If we stick with such approach, a sensible zone size would depend
on how fast the qcow2 image backing is. I.e. 256 MB would be more appropriate
for a qcow2 image on a file stored on HDD while larger sizes may be better for
SSD backed images. But that is also not accounting for the host page caching
which can "show" the qcow2 image as being much faster than its backing storage
really is.

So I would be tempted to say that defaulting to 0 to force the user to specify a
zone size would be safer. But if you really want a non-0 default, then maybe
256MB or so may be OK as that is the most commonly used value out there for
zoned storage (there are literally millions of SMR drives running in production
systems out there and NVMe ZNS devices are not widely used yet).
Stefan Hajnoczi March 12, 2024, 3:04 p.m. UTC | #12
On Mon, Jan 22, 2024 at 07:48:28PM +0100, Sam Li wrote:
> To configure the zoned format feature on the qcow2 driver, it
> requires settings as: the device size, zone model, zone size,
> zone capacity, number of conventional zones, limits on zone
> resources (max append bytes, max open zones, and max_active_zones).
> 
> To create a qcow2 image with zoned format feature, use command like
> this:
> qemu-img create -f qcow2 zbc.qcow2 -o size=768M \
> -o zone.size=64M -o zone.capacity=64M -o zone.conventional_zones=0 \
> -o zone.max_append_bytes=4096 -o zone.max_open_zones=6 \
> -o zone.max_active_zones=8 -o zone.mode=host-managed
> 
> Signed-off-by: Sam Li <faithilikerun@gmail.com>
> ---
>  block/qcow2.c                    | 252 ++++++++++++++++++++++++++++++-
>  block/qcow2.h                    |  36 ++++-
>  docs/interop/qcow2.txt           | 107 ++++++++++++-
>  include/block/block_int-common.h |  13 ++
>  qapi/block-core.json             |  67 +++++++-
>  5 files changed, 469 insertions(+), 6 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 9bee66fff5..b987f1e751 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -73,6 +73,7 @@ typedef struct {
>  #define  QCOW2_EXT_MAGIC_CRYPTO_HEADER 0x0537be77
>  #define  QCOW2_EXT_MAGIC_BITMAPS 0x23852875
>  #define  QCOW2_EXT_MAGIC_DATA_FILE 0x44415441
> +#define  QCOW2_EXT_MAGIC_ZONED_FORMAT 0x007a6264
>  
>  static int coroutine_fn
>  qcow2_co_preadv_compressed(BlockDriverState *bs,
> @@ -194,6 +195,68 @@ qcow2_extract_crypto_opts(QemuOpts *opts, const char *fmt, Error **errp)
>      return cryptoopts_qdict;
>  }
>  
> +/*
> + * Passing by the zoned device configurations by a zoned_header struct, check
> + * if the zone device options are under constraints. Return false when some
> + * option is invalid
> + */
> +static inline bool
> +qcow2_check_zone_options(Qcow2ZonedHeaderExtension *zone_opt)
> +{
> +    if (zone_opt) {
> +        uint32_t sequential_zones;
> +
> +        if (zone_opt->zone_size == 0) {
> +            error_report("Zoned extension header zone_size field "
> +                         "can not be 0");

Please add an Error **errp argument to qcow2_check_zone_options() and
use error_setg() instead of calling error_report(). That way the caller
can decide how to print or propagate error messages.

> +            return false;
> +        }
> +
> +        if (zone_opt->zone_capacity > zone_opt->zone_size) {
> +            error_report("zone capacity %" PRIu32 "B exceeds zone size "
> +                         "%" PRIu32 "B", zone_opt->zone_capacity,
> +                         zone_opt->zone_size);
> +            return false;
> +        }
> +
> +        if (zone_opt->max_append_bytes + BDRV_SECTOR_SIZE >=
> +            zone_opt->zone_capacity) {
> +            error_report("max append bytes %" PRIu32 "B exceeds zone "
> +                         "capacity %" PRIu32 "B by more than block size",
> +                         zone_opt->zone_capacity,
> +                         zone_opt->max_append_bytes);
> +            return false;
> +        }
> +
> +        if (zone_opt->max_active_zones > zone_opt->nr_zones) {
> +            error_report("Max_active_zones %" PRIu32 " exceeds "
> +                         "nr_zones %" PRIu32 ". Set it to nr_zones.",
> +                         zone_opt->max_active_zones, zone_opt->nr_zones);
> +            zone_opt->max_active_zones = zone_opt->nr_zones;
> +        }
> +
> +        if (zone_opt->max_open_zones > zone_opt->max_active_zones) {
> +            error_report("Max_open_zones %" PRIu32 " exceeds "
> +                         "max_active_zones %" PRIu32 ". Set it to "
> +                         "max_active_zones.",
> +                         zone_opt->max_open_zones,
> +                         zone_opt->max_active_zones);
> +            zone_opt->max_open_zones = zone_opt->max_active_zones;
> +        }
> +
> +        sequential_zones = zone_opt->nr_zones - zone_opt->conventional_zones;
> +        if (zone_opt->max_open_zones > sequential_zones) {

Please check that conventional_zones < zone_opt->nr_zones first to avoid
integer underflow.

> +            error_report("Max_open_zones field can not be larger "
> +                         "than the number of SWR zones. Set it to number of "
> +                         "SWR zones %" PRIu32 ".", sequential_zones);
> +            zone_opt->max_open_zones = sequential_zones;
> +        }
> +
> +        return true;
> +    }
> +    return false;
> +}
> +
>  /*
>   * read qcow2 extension and fill bs
>   * start reading from start_offset
> @@ -211,6 +274,7 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
>      uint64_t offset;
>      int ret;
>      Qcow2BitmapHeaderExt bitmaps_ext;
> +    Qcow2ZonedHeaderExtension zoned_ext;
>  
>      if (need_update_header != NULL) {
>          *need_update_header = false;
> @@ -432,6 +496,51 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
>              break;
>          }
>  
> +        case QCOW2_EXT_MAGIC_ZONED_FORMAT:
> +        {
> +            if (ext.len < sizeof(zoned_ext)) {
> +                /* Missing fields */
> +                error_setg(errp, "zoned_ext: len=%" PRIu32 " too small "
> +                           "(<%zu)", ext.len, sizeof(zoned_ext));
> +                return -EINVAL;
> +            }
> +            ret = bdrv_pread(bs->file, offset, ext.len, &zoned_ext, 0);
> +            if (ret < 0) {
> +                error_setg_errno(errp, -ret, "zoned_ext: "
> +                                 "Could not read ext header");
> +                return ret;
> +            }
> +
> +            zoned_ext.zone_size = be32_to_cpu(zoned_ext.zone_size);
> +            zoned_ext.zone_capacity = be32_to_cpu(zoned_ext.zone_capacity);
> +            zoned_ext.conventional_zones =
> +                be32_to_cpu(zoned_ext.conventional_zones);
> +            zoned_ext.nr_zones = be32_to_cpu(zoned_ext.nr_zones);
> +            zoned_ext.max_open_zones = be32_to_cpu(zoned_ext.max_open_zones);
> +            zoned_ext.max_active_zones =
> +                be32_to_cpu(zoned_ext.max_active_zones);
> +            zoned_ext.max_append_bytes =
> +                be32_to_cpu(zoned_ext.max_append_bytes);
> +            s->zoned_header = zoned_ext;
> +
> +            /* refuse to open broken images */
> +            if (zoned_ext.nr_zones != DIV_ROUND_UP(bs->total_sectors *
> +                BDRV_SECTOR_SIZE, zoned_ext.zone_size)) {
> +                error_setg(errp, "Zoned extension header nr_zones field "
> +                           "is wrong");
> +                return -EINVAL;
> +            }
> +            if (!qcow2_check_zone_options(&zoned_ext)) {
> +                return -EINVAL;
> +            }
> +
> +#ifdef DEBUG_EXT
> +            printf("Qcow2: Got zoned format extension: "
> +                   "offset=%" PRIu32 "\n", offset);
> +#endif
> +            break;
> +        }
> +
>          default:
>              /* unknown magic - save it in case we need to rewrite the header */
>              /* If you add a new feature, make sure to also update the fast
> @@ -1968,6 +2077,15 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
>      }
>      bs->bl.pwrite_zeroes_alignment = s->subcluster_size;
>      bs->bl.pdiscard_alignment = s->cluster_size;
> +    bs->bl.zoned = s->zoned_header.zoned;
> +    bs->bl.nr_zones = s->zoned_header.nr_zones;
> +    bs->bl.max_append_sectors = s->zoned_header.max_append_bytes
> +        >> BDRV_SECTOR_BITS;
> +    bs->bl.max_active_zones = s->zoned_header.max_active_zones;
> +    bs->bl.max_open_zones = s->zoned_header.max_open_zones;
> +    bs->bl.zone_size = s->zoned_header.zone_size;
> +    bs->bl.zone_capacity = s->zoned_header.zone_capacity;
> +    bs->bl.write_granularity = BDRV_SECTOR_SIZE;
>  }
>  
>  static int GRAPH_UNLOCKED
> @@ -3067,6 +3185,11 @@ int qcow2_update_header(BlockDriverState *bs)
>                  .bit  = QCOW2_INCOMPAT_EXTL2_BITNR,
>                  .name = "extended L2 entries",
>              },
> +            {
> +                .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
> +                .bit  = QCOW2_INCOMPAT_ZONED_FORMAT_BITNR,
> +                .name = "zoned format",
> +            },
>              {
>                  .type = QCOW2_FEAT_TYPE_COMPATIBLE,
>                  .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
> @@ -3112,6 +3235,31 @@ int qcow2_update_header(BlockDriverState *bs)
>          buflen -= ret;
>      }
>  
> +    /* Zoned devices header extension */
> +    if (s->zoned_header.zoned == BLK_Z_HM) {
> +        Qcow2ZonedHeaderExtension zoned_header = {
> +            .zoned              = s->zoned_header.zoned,
> +            .zone_size          = cpu_to_be32(s->zoned_header.zone_size),
> +            .zone_capacity      = cpu_to_be32(s->zoned_header.zone_capacity),
> +            .conventional_zones =
> +                cpu_to_be32(s->zoned_header.conventional_zones),
> +            .nr_zones           = cpu_to_be32(s->zoned_header.nr_zones),
> +            .max_open_zones     = cpu_to_be32(s->zoned_header.max_open_zones),
> +            .max_active_zones   =
> +                cpu_to_be32(s->zoned_header.max_active_zones),
> +            .max_append_bytes =
> +                cpu_to_be32(s->zoned_header.max_append_bytes)
> +        };
> +        ret = header_ext_add(buf, QCOW2_EXT_MAGIC_ZONED_FORMAT,
> +                             &zoned_header, sizeof(zoned_header),
> +                             buflen);
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +        buf += ret;
> +        buflen -= ret;
> +    }
> +
>      /* Keep unknown header extensions */
>      QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
>          ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
> @@ -3485,6 +3633,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>  {
>      BlockdevCreateOptionsQcow2 *qcow2_opts;
>      QDict *options;
> +    Qcow2ZoneCreateOptions *zone_struct;
> +    Qcow2ZoneHostManaged *zone_host_managed;
>  
>      /*
>       * Open the image file and write a minimal qcow2 header.
> @@ -3511,6 +3661,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>  
>      assert(create_options->driver == BLOCKDEV_DRIVER_QCOW2);
>      qcow2_opts = &create_options->u.qcow2;
> +    zone_struct = create_options->u.qcow2.zone;
> +    zone_host_managed = &create_options->u.qcow2.zone->u.host_managed;
>  
>      bs = bdrv_co_open_blockdev_ref(qcow2_opts->file, errp);
>      if (bs == NULL) {
> @@ -3724,6 +3876,14 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>          header->incompatible_features |=
>              cpu_to_be64(QCOW2_INCOMPAT_DATA_FILE);
>      }
> +    if (zone_struct->mode == QCOW2_ZONE_MODEL_HOST_MANAGED) {
> +        /*
> +         * The incompatible bit must be set when the zone model is
> +         * host-managed
> +         */
> +        header->incompatible_features |=
> +            cpu_to_be64(QCOW2_INCOMPAT_ZONED_FORMAT);
> +    }
>      if (qcow2_opts->data_file_raw) {
>          header->autoclear_features |=
>              cpu_to_be64(QCOW2_AUTOCLEAR_DATA_FILE_RAW);
> @@ -3781,10 +3941,9 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>      bdrv_graph_co_rdlock();
>      ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
>      if (ret < 0) {
> -        bdrv_graph_co_rdunlock();
>          error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
>                           "header and refcount table");
> -        goto out;
> +        goto unlock;
>  
>      } else if (ret != 0) {
>          error_report("Huh, first cluster in empty image is already in use?");
> @@ -3792,11 +3951,56 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>      }
>  
>      /* Set the external data file if necessary */
> +    BDRVQcow2State *s = blk_bs(blk)->opaque;
>      if (data_bs) {
> -        BDRVQcow2State *s = blk_bs(blk)->opaque;
>          s->image_data_file = g_strdup(data_bs->filename);
>      }
>  
> +    if (zone_struct->mode == QCOW2_ZONE_MODEL_HOST_MANAGED) {
> +        if (!zone_host_managed->has_size) {
> +            error_setg(errp, "Missing zone_size parameter");
> +            ret = -EINVAL;
> +            goto unlock;
> +        }
> +
> +        s->zoned_header.zoned = BLK_Z_HM;
> +        s->zoned_header.zone_size = zone_host_managed->size;
> +        s->zoned_header.nr_zones = DIV_ROUND_UP(qcow2_opts->size,
> +                                                zone_host_managed->size);
> +        if (zone_host_managed->has_capacity) {
> +            s->zoned_header.zone_capacity = zone_host_managed->capacity;
> +        } else {
> +            s->zoned_header.zone_capacity = zone_host_managed->size;
> +        }
> +
> +        if (zone_host_managed->has_conventional_zones) {
> +            s->zoned_header.conventional_zones =
> +                zone_host_managed->conventional_zones;
> +        }
> +
> +        if (zone_host_managed->has_max_active_zones) {
> +            s->zoned_header.max_active_zones =
> +                zone_host_managed->max_active_zones;
> +
> +            if (zone_host_managed->has_max_open_zones) {
> +                s->zoned_header.max_open_zones =
> +                    zone_host_managed->max_open_zones;
> +            } else {
> +                s->zoned_header.max_open_zones =
> +                    zone_host_managed->max_active_zones;
> +            }
> +        }
> +        s->zoned_header.max_append_bytes = zone_host_managed->max_append_bytes;
> +
> +        if (!qcow2_check_zone_options(&s->zoned_header)) {
> +            s->zoned_header.zoned = BLK_Z_NONE;
> +            ret = -EINVAL;
> +            goto unlock;
> +        }
> +    } else {
> +        s->zoned_header.zoned = BLK_Z_NONE;
> +    }
> +
>      /* Create a full header (including things like feature table) */
>      ret = qcow2_update_header(blk_bs(blk));
>      bdrv_graph_co_rdunlock();
> @@ -3870,6 +4074,9 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>      }
>  
>      ret = 0;
> +    goto out;
> +unlock:
> +    bdrv_graph_co_rdunlock();
>  out:
>      blk_co_unref(blk);
>      bdrv_co_unref(bs);
> @@ -3930,6 +4137,10 @@ qcow2_co_create_opts(BlockDriver *drv, const char *filename, QemuOpts *opts,
>          { BLOCK_OPT_COMPAT_LEVEL,       "version" },
>          { BLOCK_OPT_DATA_FILE_RAW,      "data-file-raw" },
>          { BLOCK_OPT_COMPRESSION_TYPE,   "compression-type" },
> +        { BLOCK_OPT_CONVENTIONAL_ZONES, "zone.conventional-zones" },
> +        { BLOCK_OPT_MAX_OPEN_ZONES,     "zone.max-open-zones" },
> +        { BLOCK_OPT_MAX_ACTIVE_ZONES,   "zone.max-active-zones" },
> +        { BLOCK_OPT_MAX_APPEND_BYTES,   "zone.max-append-bytes" },
>          { NULL, NULL },
>      };
>  
> @@ -6097,6 +6308,41 @@ static QemuOptsList qcow2_create_opts = {
>              .help = "Compression method used for image cluster "        \
>                      "compression",                                      \
>              .def_value_str = "zlib"                                     \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_ZONE_MODEL,                               \
> +            .type = QEMU_OPT_STRING,                                    \
> +            .help = "zone model modes, mode choice: host-managed",      \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_ZONE_SIZE,                                \
> +            .type = QEMU_OPT_SIZE,                                      \
> +            .help = "zone size",                                        \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_ZONE_CAPACITY,                            \
> +            .type = QEMU_OPT_SIZE,                                      \
> +            .help = "zone capacity",                                    \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_CONVENTIONAL_ZONES,                       \
> +            .type = QEMU_OPT_NUMBER,                                    \
> +            .help = "numbers of conventional zones",                    \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_MAX_APPEND_BYTES,                         \
> +            .type = QEMU_OPT_SIZE,                                      \
> +            .help = "max append bytes",                                 \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_MAX_ACTIVE_ZONES,                         \
> +            .type = QEMU_OPT_NUMBER,                                    \
> +            .help = "max active zones",                                 \
> +        },                                                              \
> +        {                                                               \
> +            .name = BLOCK_OPT_MAX_OPEN_ZONES,                           \
> +            .type = QEMU_OPT_NUMBER,                                    \
> +            .help = "max open zones",                                   \
>          },
>          QCOW_COMMON_OPTIONS,
>          { /* end of list */ }
> diff --git a/block/qcow2.h b/block/qcow2.h
> index a9e3481c6e..7f37bb4034 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -236,6 +236,27 @@ typedef struct Qcow2CryptoHeaderExtension {
>      uint64_t length;
>  } QEMU_PACKED Qcow2CryptoHeaderExtension;
>  
> +typedef struct Qcow2ZonedHeaderExtension {
> +    /* Zoned device attributes */
> +    uint8_t zoned;
> +    uint8_t reserved[3];
> +    uint32_t zone_size;
> +    uint32_t zone_capacity;
> +    uint32_t conventional_zones;
> +    uint32_t nr_zones;
> +    uint32_t max_active_zones;
> +    uint32_t max_open_zones;
> +    uint32_t max_append_bytes;
> +    uint64_t zonedmeta_size;
> +    uint64_t zonedmeta_offset;
> +} QEMU_PACKED Qcow2ZonedHeaderExtension;
> +
> +typedef struct Qcow2ZoneListEntry {
> +    QLIST_ENTRY(Qcow2ZoneListEntry) exp_open_zone_entry;
> +    QLIST_ENTRY(Qcow2ZoneListEntry) imp_open_zone_entry;
> +    QLIST_ENTRY(Qcow2ZoneListEntry) closed_zone_entry;
> +} Qcow2ZoneListEntry;
> +
>  typedef struct Qcow2UnknownHeaderExtension {
>      uint32_t magic;
>      uint32_t len;
> @@ -256,17 +277,20 @@ enum {
>      QCOW2_INCOMPAT_DATA_FILE_BITNR  = 2,
>      QCOW2_INCOMPAT_COMPRESSION_BITNR = 3,
>      QCOW2_INCOMPAT_EXTL2_BITNR      = 4,
> +    QCOW2_INCOMPAT_ZONED_FORMAT_BITNR = 5,
>      QCOW2_INCOMPAT_DIRTY            = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
>      QCOW2_INCOMPAT_CORRUPT          = 1 << QCOW2_INCOMPAT_CORRUPT_BITNR,
>      QCOW2_INCOMPAT_DATA_FILE        = 1 << QCOW2_INCOMPAT_DATA_FILE_BITNR,
>      QCOW2_INCOMPAT_COMPRESSION      = 1 << QCOW2_INCOMPAT_COMPRESSION_BITNR,
>      QCOW2_INCOMPAT_EXTL2            = 1 << QCOW2_INCOMPAT_EXTL2_BITNR,
> +    QCOW2_INCOMPAT_ZONED_FORMAT     = 1 << QCOW2_INCOMPAT_ZONED_FORMAT_BITNR,
>  
>      QCOW2_INCOMPAT_MASK             = QCOW2_INCOMPAT_DIRTY
>                                      | QCOW2_INCOMPAT_CORRUPT
>                                      | QCOW2_INCOMPAT_DATA_FILE
>                                      | QCOW2_INCOMPAT_COMPRESSION
> -                                    | QCOW2_INCOMPAT_EXTL2,
> +                                    | QCOW2_INCOMPAT_EXTL2
> +                                    | QCOW2_INCOMPAT_ZONED_FORMAT,
>  };
>  
>  /* Compatible feature bits */
> @@ -422,6 +446,16 @@ typedef struct BDRVQcow2State {
>       * is to convert the image with the desired compression type set.
>       */
>      Qcow2CompressionType compression_type;
> +
> +    /* States of zoned device */
> +    Qcow2ZonedHeaderExtension zoned_header;
> +    QLIST_HEAD(, Qcow2ZoneListEntry) exp_open_zones;
> +    QLIST_HEAD(, Qcow2ZoneListEntry) imp_open_zones;
> +    QLIST_HEAD(, Qcow2ZoneListEntry) closed_zones;
> +    Qcow2ZoneListEntry *zone_list_entries;
> +    uint32_t nr_zones_exp_open;
> +    uint32_t nr_zones_imp_open;
> +    uint32_t nr_zones_closed;

This patch does not use these lists and counters. Please move them (and
the Qcow2ZoneListEntry typedef) to the patch that starts using them.
That way the patch series is easier to review and patches are more
self-contained.

>  } BDRVQcow2State;
>  
>  typedef struct Qcow2COWRegion {
> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
> index 2c4618375a..a8dd4c3b15 100644
> --- a/docs/interop/qcow2.txt
> +++ b/docs/interop/qcow2.txt
> @@ -125,7 +125,26 @@ the next fields through header_length.
>                                  allows subcluster-based allocation. See the
>                                  Extended L2 Entries section for more details.
>  
> -                    Bits 5-63:  Reserved (set to 0)
> +                    Bit 5:      Zoned extension bit. If this bit is set then
> +                                the file is an emulated zoned device. The
> +                                zoned extension must be present.
> +                                Implementations that do not support zoned
> +                                emulation cannot open this file because it
> +                                generally only make sense to interpret the
> +                                data along with the zone information and
> +                                write pointers.
> +
> +                                It is unsafe when any qcow2 user without
> +                                knowing the zoned extension reads or edits
> +                                a file with the zoned extension. The write
> +                                pointer tracking can be corrupted when a
> +                                writer edits a file, like overwriting beyond
> +                                the write pointer locations. Or a reader tries
> +                                to access a file without knowing write
> +                                pointers where the software setup will cause
> +                                invalid reads.
> +
> +                    Bits 6-63:  Reserved (set to 0)
>  
>           80 -  87:  compatible_features
>                      Bitmask of compatible features. An implementation can
> @@ -249,6 +268,7 @@ be stored. Each extension has a structure like the following:
>                          0x23852875 - Bitmaps extension
>                          0x0537be77 - Full disk encryption header pointer
>                          0x44415441 - External data file name string
> +                        0x007a6264 - Zoned extension
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -331,6 +351,91 @@ The fields of the bitmaps extension are:
>                     Offset into the image file at which the bitmap directory
>                     starts. Must be aligned to a cluster boundary.
>  
> +== Zoned extension ==
> +
> +The zoned extension must be present if the incompatible bit is set, and
> +omitted when the incompatible bit is clear. It contains fields for
> +emulating the zoned storage model (https://zonedstorage.io/). When the
> +zone model mode is not host-managed, it is regarded as incompatible
> +and reports an error to users.
> +
> +The write pointers for each zone are stored in an area called zonedmeta
> +clusters. It is 8 bytes per zone. The offset and size of the zonedmeta
> +are kept in the zoned extension header.
> +
> +The fields of the zoned extension are:
> +    Byte       0:  zoned
> +                   The bit represents the zoned model of the device. 0 is for

s/bit/byte/

> +                   a non-zoned device (all other information in this header
> +                   is ignored). 1 is for a host-managed device, which only
> +                   allows for sequential writes within each zone. Other
> +                   values may be added later, the implementation must refuse
> +                   to open a device containing an unknown zone model.
> +
> +          1 -  3:  Reserved, must be zero.
> +
> +          4 -  7:  zone_size
> +                   Total size of each zone, in bytes. It is less than 4GB
> +                   in the qcow2 image for simplicity. The maximum zone size
> +                   of real zoned devices are 2TB. If there are usages for
> +                   emulating devices whose zone sizes are larger than 4G,
> +                   the field needs to be sized bigger then.

Can we make the field 64-bit or at least expressed in units of 512-byte
sectors so its range is sufficient for Linux BLKGETZONESZ or virtio-blk
zone sizes?

> +
> +                   The value must be power of 2. Linux currently requires
> +                   the zone size to be a power of 2 number of LBAs. Qcow2
> +                   following this is mainly to allow emulating a real
> +                   ZNS drive configuration. It is not relevant to the cluster
> +                   size.
> +
> +          8 - 11:  zone_capacity
> +                   The number of writable bytes within the zones. The bytes
> +                   between zone capacity and zone size are unusable: reads
> +                   will return 0s and writes will fail.
> +
> +                   A zone capacity is always smaller or equal to the zone
> +                   size. It is for emulating a real ZNS drive configuration,
> +                   which has the constraint of aligning to some hardware erase
> +                   block size.
> +
> +         12 - 15:  conventional_zones
> +                   The number of conventional zones. The conventional zones
> +                   allow sequential writes and random writes. While the
> +                   sequential zones only allow sequential writes.
> +
> +         16 - 19:  nr_zones
> +                   The number of zones. It is the sum of conventional zones
> +                   and sequential zones.
> +
> +         20 - 23:  max_active_zones
> +                   The number of the zones that can be in the implicit open,
> +                   explicit open or closed state. The max active zones cannot
> +                   be larger than the max open zones.
> +
> +         24 - 27:  max_open_zones
> +                   The maximal number of open (implicitly open or explicitly
> +                   open) zones. It cannot be larger than the number of SWR
> +                   zones of the device.
> +
> +                   If the limits of open zones or active zones are equal to
> +                   the total number of SWR zones, then it's the same as having
> +                   no limits therefore max open zones and max active zones are
> +                   set to 0.
> +
> +         28 - 31:  max_append_bytes
> +                   The number of bytes of a zone append request that can be
> +                   issued to the device. It must be 512-byte aligned and less
> +                   than the zone capacity.
> +
> +         32 - 39:  zonedmeta_size
> +                   The size of zoned metadata in bytes. It contains no more
> +                   than 4GB. The zoned metadata structure is the write
> +                   pointers for each zone whose size is the number of zones
> +                   multiplied by the zone size.

Does this mean the maximum value for nr_zones is (2^32 - 1) / 8 =
536,870,911 zones? That seems like plenty, so I don't object to the
limit, but it should be stately explicit because it's not mentioned in
the above nr_zones field.

> +
> +         40 - 47:  zonedmeta_offset
> +                   The offset of zoned metadata structure in the contained
> +                   image, in bytes.

I suggest "Offset into the image file at which the zoned
metadata starts". That phrase is used elsewhere in the spec and is
clearer than than "offset ... in the contained image".

> +
>  == Full disk encryption header pointer ==
>  
>  The full disk encryption header must be present if, and only if, the
> diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
> index 151279d481..d48486f344 100644
> --- a/include/block/block_int-common.h
> +++ b/include/block/block_int-common.h
> @@ -57,6 +57,13 @@
>  #define BLOCK_OPT_DATA_FILE_RAW     "data_file_raw"
>  #define BLOCK_OPT_COMPRESSION_TYPE  "compression_type"
>  #define BLOCK_OPT_EXTL2             "extended_l2"
> +#define BLOCK_OPT_ZONE_MODEL        "zone.mode"
> +#define BLOCK_OPT_ZONE_SIZE         "zone.size"
> +#define BLOCK_OPT_ZONE_CAPACITY     "zone.capacity"
> +#define BLOCK_OPT_CONVENTIONAL_ZONES    "zone.conventional_zones"
> +#define BLOCK_OPT_MAX_APPEND_BYTES      "zone.max_append_bytes"
> +#define BLOCK_OPT_MAX_ACTIVE_ZONES      "zone.max_active_zones"
> +#define BLOCK_OPT_MAX_OPEN_ZONES        "zone.max_open_zones"
>  
>  #define BLOCK_PROBE_BUF_SIZE        512
>  
> @@ -886,6 +893,12 @@ typedef struct BlockLimits {
>      /* zone size expressed in bytes */
>      uint32_t zone_size;
>  
> +    /*
> +     * the number of usable logical blocks within the zone, expressed
> +     * in bytes. A zone capacity is smaller or equal to the zone size.
> +     */
> +    uint32_t zone_capacity;

The Linux BLKGETZONESZ ioctl produces a uint32_t number of 512-byte
sectors. virtio-blk also reports the number of 512-byte sectors. QEMU's
zone_size and zone_capacity are in uint32_t number of _bytes_.

Damien:
I don't remember if we discussed this before, but are we sure that zone
sizes will remain <4 GB? Perhaps uint64_t is more future-proof.

> +
>      /* total number of zones */
>      uint32_t nr_zones;
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index ca390c5700..e2e0ec21a5 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -5038,6 +5038,67 @@
>  { 'enum': 'Qcow2CompressionType',
>    'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>  
> +##
> +# @Qcow2ZoneModel:
> +#
> +# Zoned device model used in qcow2 image file
> +#
> +# @host-managed: The host-managed model only allows sequential write over the
> +#     device zones.
> +#
> +# Since 8.2
> +##
> +{ 'enum': 'Qcow2ZoneModel',
> +  'data': [ 'host-managed'] }
> +
> +##
> +# @Qcow2ZoneHostManaged:
> +#
> +# The host-managed zone model.  It only allows sequential writes.
> +#
> +# @size: Total number of bytes within zones.
> +#
> +# @capacity: The number of usable logical blocks within zones
> +#     in bytes.  A zone capacity is always smaller or equal to the
> +#     zone size.
> +#
> +# @conventional-zones: The number of conventional zones of the
> +#     zoned device (default 0).
> +#
> +# @max-open-zones: The maximal number of open zones.  It is less than
> +#     or equal to the number of sequential write required zones of
> +#     the device (default 0).
> +#
> +# @max-active-zones: The maximal number of zones in the implicit
> +#     open, explicit open or closed state.  It is less than or equal
> +#     to the max open zones (default 0).
> +#
> +# @max-append-bytes: The maximal number of bytes of a zone
> +#     append request that can be issued to the device.  It must be
> +#     512-byte aligned and less than the zone capacity.
> +#
> +# Since 8.2
> +##
> +{ 'struct': 'Qcow2ZoneHostManaged',
> +  'data': { '*size':          'size',
> +            '*capacity':      'size',
> +            '*conventional-zones': 'uint32',
> +            '*max-open-zones':     'uint32',
> +            '*max-active-zones':   'uint32',
> +            '*max-append-bytes':   'size' } }
> +
> +##
> +# @Qcow2ZoneCreateOptions:
> +#
> +# The zone device model for the qcow2 image.
> +#
> +# Since 8.2
> +##
> +{ 'union': 'Qcow2ZoneCreateOptions',
> +  'base': { 'mode': 'Qcow2ZoneModel' },
> +  'discriminator': 'mode',
> +  'data': { 'host-managed': 'Qcow2ZoneHostManaged' } }
> +
>  ##
>  # @BlockdevCreateOptionsQcow2:
>  #
> @@ -5080,6 +5141,9 @@
>  # @compression-type: The image cluster compression method
>  #     (default: zlib, since 5.1)
>  #
> +# @zone: The zone device model modes.  The default is that the device is
> +# not zoned.  (since 8.2)
> +#
>  # Since: 2.12
>  ##
>  { 'struct': 'BlockdevCreateOptionsQcow2',
> @@ -5096,7 +5160,8 @@
>              '*preallocation':   'PreallocMode',
>              '*lazy-refcounts':  'bool',
>              '*refcount-bits':   'int',
> -            '*compression-type':'Qcow2CompressionType' } }
> +            '*compression-type':'Qcow2CompressionType',
> +            '*zone':            'Qcow2ZoneCreateOptions' } }
>  
>  ##
>  # @BlockdevCreateOptionsQed:
> -- 
> 2.40.1
>
diff mbox series

Patch

diff --git a/block/qcow2.c b/block/qcow2.c
index 9bee66fff5..b987f1e751 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -73,6 +73,7 @@  typedef struct {
 #define  QCOW2_EXT_MAGIC_CRYPTO_HEADER 0x0537be77
 #define  QCOW2_EXT_MAGIC_BITMAPS 0x23852875
 #define  QCOW2_EXT_MAGIC_DATA_FILE 0x44415441
+#define  QCOW2_EXT_MAGIC_ZONED_FORMAT 0x007a6264
 
 static int coroutine_fn
 qcow2_co_preadv_compressed(BlockDriverState *bs,
@@ -194,6 +195,68 @@  qcow2_extract_crypto_opts(QemuOpts *opts, const char *fmt, Error **errp)
     return cryptoopts_qdict;
 }
 
+/*
+ * Passing by the zoned device configurations by a zoned_header struct, check
+ * if the zone device options are under constraints. Return false when some
+ * option is invalid
+ */
+static inline bool
+qcow2_check_zone_options(Qcow2ZonedHeaderExtension *zone_opt)
+{
+    if (zone_opt) {
+        uint32_t sequential_zones;
+
+        if (zone_opt->zone_size == 0) {
+            error_report("Zoned extension header zone_size field "
+                         "can not be 0");
+            return false;
+        }
+
+        if (zone_opt->zone_capacity > zone_opt->zone_size) {
+            error_report("zone capacity %" PRIu32 "B exceeds zone size "
+                         "%" PRIu32 "B", zone_opt->zone_capacity,
+                         zone_opt->zone_size);
+            return false;
+        }
+
+        if (zone_opt->max_append_bytes + BDRV_SECTOR_SIZE >=
+            zone_opt->zone_capacity) {
+            error_report("max append bytes %" PRIu32 "B exceeds zone "
+                         "capacity %" PRIu32 "B by more than block size",
+                         zone_opt->zone_capacity,
+                         zone_opt->max_append_bytes);
+            return false;
+        }
+
+        if (zone_opt->max_active_zones > zone_opt->nr_zones) {
+            error_report("Max_active_zones %" PRIu32 " exceeds "
+                         "nr_zones %" PRIu32 ". Set it to nr_zones.",
+                         zone_opt->max_active_zones, zone_opt->nr_zones);
+            zone_opt->max_active_zones = zone_opt->nr_zones;
+        }
+
+        if (zone_opt->max_open_zones > zone_opt->max_active_zones) {
+            error_report("Max_open_zones %" PRIu32 " exceeds "
+                         "max_active_zones %" PRIu32 ". Set it to "
+                         "max_active_zones.",
+                         zone_opt->max_open_zones,
+                         zone_opt->max_active_zones);
+            zone_opt->max_open_zones = zone_opt->max_active_zones;
+        }
+
+        sequential_zones = zone_opt->nr_zones - zone_opt->conventional_zones;
+        if (zone_opt->max_open_zones > sequential_zones) {
+            error_report("Max_open_zones field can not be larger "
+                         "than the number of SWR zones. Set it to number of "
+                         "SWR zones %" PRIu32 ".", sequential_zones);
+            zone_opt->max_open_zones = sequential_zones;
+        }
+
+        return true;
+    }
+    return false;
+}
+
 /*
  * read qcow2 extension and fill bs
  * start reading from start_offset
@@ -211,6 +274,7 @@  qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
     uint64_t offset;
     int ret;
     Qcow2BitmapHeaderExt bitmaps_ext;
+    Qcow2ZonedHeaderExtension zoned_ext;
 
     if (need_update_header != NULL) {
         *need_update_header = false;
@@ -432,6 +496,51 @@  qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             break;
         }
 
+        case QCOW2_EXT_MAGIC_ZONED_FORMAT:
+        {
+            if (ext.len < sizeof(zoned_ext)) {
+                /* Missing fields */
+                error_setg(errp, "zoned_ext: len=%" PRIu32 " too small "
+                           "(<%zu)", ext.len, sizeof(zoned_ext));
+                return -EINVAL;
+            }
+            ret = bdrv_pread(bs->file, offset, ext.len, &zoned_ext, 0);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "zoned_ext: "
+                                 "Could not read ext header");
+                return ret;
+            }
+
+            zoned_ext.zone_size = be32_to_cpu(zoned_ext.zone_size);
+            zoned_ext.zone_capacity = be32_to_cpu(zoned_ext.zone_capacity);
+            zoned_ext.conventional_zones =
+                be32_to_cpu(zoned_ext.conventional_zones);
+            zoned_ext.nr_zones = be32_to_cpu(zoned_ext.nr_zones);
+            zoned_ext.max_open_zones = be32_to_cpu(zoned_ext.max_open_zones);
+            zoned_ext.max_active_zones =
+                be32_to_cpu(zoned_ext.max_active_zones);
+            zoned_ext.max_append_bytes =
+                be32_to_cpu(zoned_ext.max_append_bytes);
+            s->zoned_header = zoned_ext;
+
+            /* refuse to open broken images */
+            if (zoned_ext.nr_zones != DIV_ROUND_UP(bs->total_sectors *
+                BDRV_SECTOR_SIZE, zoned_ext.zone_size)) {
+                error_setg(errp, "Zoned extension header nr_zones field "
+                           "is wrong");
+                return -EINVAL;
+            }
+            if (!qcow2_check_zone_options(&zoned_ext)) {
+                return -EINVAL;
+            }
+
+#ifdef DEBUG_EXT
+            printf("Qcow2: Got zoned format extension: "
+                   "offset=%" PRIu32 "\n", offset);
+#endif
+            break;
+        }
+
         default:
             /* unknown magic - save it in case we need to rewrite the header */
             /* If you add a new feature, make sure to also update the fast
@@ -1968,6 +2077,15 @@  static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
     }
     bs->bl.pwrite_zeroes_alignment = s->subcluster_size;
     bs->bl.pdiscard_alignment = s->cluster_size;
+    bs->bl.zoned = s->zoned_header.zoned;
+    bs->bl.nr_zones = s->zoned_header.nr_zones;
+    bs->bl.max_append_sectors = s->zoned_header.max_append_bytes
+        >> BDRV_SECTOR_BITS;
+    bs->bl.max_active_zones = s->zoned_header.max_active_zones;
+    bs->bl.max_open_zones = s->zoned_header.max_open_zones;
+    bs->bl.zone_size = s->zoned_header.zone_size;
+    bs->bl.zone_capacity = s->zoned_header.zone_capacity;
+    bs->bl.write_granularity = BDRV_SECTOR_SIZE;
 }
 
 static int GRAPH_UNLOCKED
@@ -3067,6 +3185,11 @@  int qcow2_update_header(BlockDriverState *bs)
                 .bit  = QCOW2_INCOMPAT_EXTL2_BITNR,
                 .name = "extended L2 entries",
             },
+            {
+                .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+                .bit  = QCOW2_INCOMPAT_ZONED_FORMAT_BITNR,
+                .name = "zoned format",
+            },
             {
                 .type = QCOW2_FEAT_TYPE_COMPATIBLE,
                 .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
@@ -3112,6 +3235,31 @@  int qcow2_update_header(BlockDriverState *bs)
         buflen -= ret;
     }
 
+    /* Zoned devices header extension */
+    if (s->zoned_header.zoned == BLK_Z_HM) {
+        Qcow2ZonedHeaderExtension zoned_header = {
+            .zoned              = s->zoned_header.zoned,
+            .zone_size          = cpu_to_be32(s->zoned_header.zone_size),
+            .zone_capacity      = cpu_to_be32(s->zoned_header.zone_capacity),
+            .conventional_zones =
+                cpu_to_be32(s->zoned_header.conventional_zones),
+            .nr_zones           = cpu_to_be32(s->zoned_header.nr_zones),
+            .max_open_zones     = cpu_to_be32(s->zoned_header.max_open_zones),
+            .max_active_zones   =
+                cpu_to_be32(s->zoned_header.max_active_zones),
+            .max_append_bytes =
+                cpu_to_be32(s->zoned_header.max_append_bytes)
+        };
+        ret = header_ext_add(buf, QCOW2_EXT_MAGIC_ZONED_FORMAT,
+                             &zoned_header, sizeof(zoned_header),
+                             buflen);
+        if (ret < 0) {
+            goto fail;
+        }
+        buf += ret;
+        buflen -= ret;
+    }
+
     /* Keep unknown header extensions */
     QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
         ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
@@ -3485,6 +3633,8 @@  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 {
     BlockdevCreateOptionsQcow2 *qcow2_opts;
     QDict *options;
+    Qcow2ZoneCreateOptions *zone_struct;
+    Qcow2ZoneHostManaged *zone_host_managed;
 
     /*
      * Open the image file and write a minimal qcow2 header.
@@ -3511,6 +3661,8 @@  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 
     assert(create_options->driver == BLOCKDEV_DRIVER_QCOW2);
     qcow2_opts = &create_options->u.qcow2;
+    zone_struct = create_options->u.qcow2.zone;
+    zone_host_managed = &create_options->u.qcow2.zone->u.host_managed;
 
     bs = bdrv_co_open_blockdev_ref(qcow2_opts->file, errp);
     if (bs == NULL) {
@@ -3724,6 +3876,14 @@  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         header->incompatible_features |=
             cpu_to_be64(QCOW2_INCOMPAT_DATA_FILE);
     }
+    if (zone_struct->mode == QCOW2_ZONE_MODEL_HOST_MANAGED) {
+        /*
+         * The incompatible bit must be set when the zone model is
+         * host-managed
+         */
+        header->incompatible_features |=
+            cpu_to_be64(QCOW2_INCOMPAT_ZONED_FORMAT);
+    }
     if (qcow2_opts->data_file_raw) {
         header->autoclear_features |=
             cpu_to_be64(QCOW2_AUTOCLEAR_DATA_FILE_RAW);
@@ -3781,10 +3941,9 @@  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
     bdrv_graph_co_rdlock();
     ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
     if (ret < 0) {
-        bdrv_graph_co_rdunlock();
         error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
                          "header and refcount table");
-        goto out;
+        goto unlock;
 
     } else if (ret != 0) {
         error_report("Huh, first cluster in empty image is already in use?");
@@ -3792,11 +3951,56 @@  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
     }
 
     /* Set the external data file if necessary */
+    BDRVQcow2State *s = blk_bs(blk)->opaque;
     if (data_bs) {
-        BDRVQcow2State *s = blk_bs(blk)->opaque;
         s->image_data_file = g_strdup(data_bs->filename);
     }
 
+    if (zone_struct->mode == QCOW2_ZONE_MODEL_HOST_MANAGED) {
+        if (!zone_host_managed->has_size) {
+            error_setg(errp, "Missing zone_size parameter");
+            ret = -EINVAL;
+            goto unlock;
+        }
+
+        s->zoned_header.zoned = BLK_Z_HM;
+        s->zoned_header.zone_size = zone_host_managed->size;
+        s->zoned_header.nr_zones = DIV_ROUND_UP(qcow2_opts->size,
+                                                zone_host_managed->size);
+        if (zone_host_managed->has_capacity) {
+            s->zoned_header.zone_capacity = zone_host_managed->capacity;
+        } else {
+            s->zoned_header.zone_capacity = zone_host_managed->size;
+        }
+
+        if (zone_host_managed->has_conventional_zones) {
+            s->zoned_header.conventional_zones =
+                zone_host_managed->conventional_zones;
+        }
+
+        if (zone_host_managed->has_max_active_zones) {
+            s->zoned_header.max_active_zones =
+                zone_host_managed->max_active_zones;
+
+            if (zone_host_managed->has_max_open_zones) {
+                s->zoned_header.max_open_zones =
+                    zone_host_managed->max_open_zones;
+            } else {
+                s->zoned_header.max_open_zones =
+                    zone_host_managed->max_active_zones;
+            }
+        }
+        s->zoned_header.max_append_bytes = zone_host_managed->max_append_bytes;
+
+        if (!qcow2_check_zone_options(&s->zoned_header)) {
+            s->zoned_header.zoned = BLK_Z_NONE;
+            ret = -EINVAL;
+            goto unlock;
+        }
+    } else {
+        s->zoned_header.zoned = BLK_Z_NONE;
+    }
+
     /* Create a full header (including things like feature table) */
     ret = qcow2_update_header(blk_bs(blk));
     bdrv_graph_co_rdunlock();
@@ -3870,6 +4074,9 @@  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
     }
 
     ret = 0;
+    goto out;
+unlock:
+    bdrv_graph_co_rdunlock();
 out:
     blk_co_unref(blk);
     bdrv_co_unref(bs);
@@ -3930,6 +4137,10 @@  qcow2_co_create_opts(BlockDriver *drv, const char *filename, QemuOpts *opts,
         { BLOCK_OPT_COMPAT_LEVEL,       "version" },
         { BLOCK_OPT_DATA_FILE_RAW,      "data-file-raw" },
         { BLOCK_OPT_COMPRESSION_TYPE,   "compression-type" },
+        { BLOCK_OPT_CONVENTIONAL_ZONES, "zone.conventional-zones" },
+        { BLOCK_OPT_MAX_OPEN_ZONES,     "zone.max-open-zones" },
+        { BLOCK_OPT_MAX_ACTIVE_ZONES,   "zone.max-active-zones" },
+        { BLOCK_OPT_MAX_APPEND_BYTES,   "zone.max-append-bytes" },
         { NULL, NULL },
     };
 
@@ -6097,6 +6308,41 @@  static QemuOptsList qcow2_create_opts = {
             .help = "Compression method used for image cluster "        \
                     "compression",                                      \
             .def_value_str = "zlib"                                     \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_ZONE_MODEL,                               \
+            .type = QEMU_OPT_STRING,                                    \
+            .help = "zone model modes, mode choice: host-managed",      \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_ZONE_SIZE,                                \
+            .type = QEMU_OPT_SIZE,                                      \
+            .help = "zone size",                                        \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_ZONE_CAPACITY,                            \
+            .type = QEMU_OPT_SIZE,                                      \
+            .help = "zone capacity",                                    \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_CONVENTIONAL_ZONES,                       \
+            .type = QEMU_OPT_NUMBER,                                    \
+            .help = "numbers of conventional zones",                    \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_MAX_APPEND_BYTES,                         \
+            .type = QEMU_OPT_SIZE,                                      \
+            .help = "max append bytes",                                 \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_MAX_ACTIVE_ZONES,                         \
+            .type = QEMU_OPT_NUMBER,                                    \
+            .help = "max active zones",                                 \
+        },                                                              \
+        {                                                               \
+            .name = BLOCK_OPT_MAX_OPEN_ZONES,                           \
+            .type = QEMU_OPT_NUMBER,                                    \
+            .help = "max open zones",                                   \
         },
         QCOW_COMMON_OPTIONS,
         { /* end of list */ }
diff --git a/block/qcow2.h b/block/qcow2.h
index a9e3481c6e..7f37bb4034 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -236,6 +236,27 @@  typedef struct Qcow2CryptoHeaderExtension {
     uint64_t length;
 } QEMU_PACKED Qcow2CryptoHeaderExtension;
 
+typedef struct Qcow2ZonedHeaderExtension {
+    /* Zoned device attributes */
+    uint8_t zoned;
+    uint8_t reserved[3];
+    uint32_t zone_size;
+    uint32_t zone_capacity;
+    uint32_t conventional_zones;
+    uint32_t nr_zones;
+    uint32_t max_active_zones;
+    uint32_t max_open_zones;
+    uint32_t max_append_bytes;
+    uint64_t zonedmeta_size;
+    uint64_t zonedmeta_offset;
+} QEMU_PACKED Qcow2ZonedHeaderExtension;
+
+typedef struct Qcow2ZoneListEntry {
+    QLIST_ENTRY(Qcow2ZoneListEntry) exp_open_zone_entry;
+    QLIST_ENTRY(Qcow2ZoneListEntry) imp_open_zone_entry;
+    QLIST_ENTRY(Qcow2ZoneListEntry) closed_zone_entry;
+} Qcow2ZoneListEntry;
+
 typedef struct Qcow2UnknownHeaderExtension {
     uint32_t magic;
     uint32_t len;
@@ -256,17 +277,20 @@  enum {
     QCOW2_INCOMPAT_DATA_FILE_BITNR  = 2,
     QCOW2_INCOMPAT_COMPRESSION_BITNR = 3,
     QCOW2_INCOMPAT_EXTL2_BITNR      = 4,
+    QCOW2_INCOMPAT_ZONED_FORMAT_BITNR = 5,
     QCOW2_INCOMPAT_DIRTY            = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
     QCOW2_INCOMPAT_CORRUPT          = 1 << QCOW2_INCOMPAT_CORRUPT_BITNR,
     QCOW2_INCOMPAT_DATA_FILE        = 1 << QCOW2_INCOMPAT_DATA_FILE_BITNR,
     QCOW2_INCOMPAT_COMPRESSION      = 1 << QCOW2_INCOMPAT_COMPRESSION_BITNR,
     QCOW2_INCOMPAT_EXTL2            = 1 << QCOW2_INCOMPAT_EXTL2_BITNR,
+    QCOW2_INCOMPAT_ZONED_FORMAT     = 1 << QCOW2_INCOMPAT_ZONED_FORMAT_BITNR,
 
     QCOW2_INCOMPAT_MASK             = QCOW2_INCOMPAT_DIRTY
                                     | QCOW2_INCOMPAT_CORRUPT
                                     | QCOW2_INCOMPAT_DATA_FILE
                                     | QCOW2_INCOMPAT_COMPRESSION
-                                    | QCOW2_INCOMPAT_EXTL2,
+                                    | QCOW2_INCOMPAT_EXTL2
+                                    | QCOW2_INCOMPAT_ZONED_FORMAT,
 };
 
 /* Compatible feature bits */
@@ -422,6 +446,16 @@  typedef struct BDRVQcow2State {
      * is to convert the image with the desired compression type set.
      */
     Qcow2CompressionType compression_type;
+
+    /* States of zoned device */
+    Qcow2ZonedHeaderExtension zoned_header;
+    QLIST_HEAD(, Qcow2ZoneListEntry) exp_open_zones;
+    QLIST_HEAD(, Qcow2ZoneListEntry) imp_open_zones;
+    QLIST_HEAD(, Qcow2ZoneListEntry) closed_zones;
+    Qcow2ZoneListEntry *zone_list_entries;
+    uint32_t nr_zones_exp_open;
+    uint32_t nr_zones_imp_open;
+    uint32_t nr_zones_closed;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 2c4618375a..a8dd4c3b15 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -125,7 +125,26 @@  the next fields through header_length.
                                 allows subcluster-based allocation. See the
                                 Extended L2 Entries section for more details.
 
-                    Bits 5-63:  Reserved (set to 0)
+                    Bit 5:      Zoned extension bit. If this bit is set then
+                                the file is an emulated zoned device. The
+                                zoned extension must be present.
+                                Implementations that do not support zoned
+                                emulation cannot open this file because it
+                                generally only make sense to interpret the
+                                data along with the zone information and
+                                write pointers.
+
+                                It is unsafe when any qcow2 user without
+                                knowing the zoned extension reads or edits
+                                a file with the zoned extension. The write
+                                pointer tracking can be corrupted when a
+                                writer edits a file, like overwriting beyond
+                                the write pointer locations. Or a reader tries
+                                to access a file without knowing write
+                                pointers where the software setup will cause
+                                invalid reads.
+
+                    Bits 6-63:  Reserved (set to 0)
 
          80 -  87:  compatible_features
                     Bitmask of compatible features. An implementation can
@@ -249,6 +268,7 @@  be stored. Each extension has a structure like the following:
                         0x23852875 - Bitmaps extension
                         0x0537be77 - Full disk encryption header pointer
                         0x44415441 - External data file name string
+                        0x007a6264 - Zoned extension
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -331,6 +351,91 @@  The fields of the bitmaps extension are:
                    Offset into the image file at which the bitmap directory
                    starts. Must be aligned to a cluster boundary.
 
+== Zoned extension ==
+
+The zoned extension must be present if the incompatible bit is set, and
+omitted when the incompatible bit is clear. It contains fields for
+emulating the zoned storage model (https://zonedstorage.io/). When the
+zone model mode is not host-managed, it is regarded as incompatible
+and reports an error to users.
+
+The write pointers for each zone are stored in an area called zonedmeta
+clusters. It is 8 bytes per zone. The offset and size of the zonedmeta
+are kept in the zoned extension header.
+
+The fields of the zoned extension are:
+    Byte       0:  zoned
+                   The bit represents the zoned model of the device. 0 is for
+                   a non-zoned device (all other information in this header
+                   is ignored). 1 is for a host-managed device, which only
+                   allows for sequential writes within each zone. Other
+                   values may be added later, the implementation must refuse
+                   to open a device containing an unknown zone model.
+
+          1 -  3:  Reserved, must be zero.
+
+          4 -  7:  zone_size
+                   Total size of each zone, in bytes. It is less than 4GB
+                   in the qcow2 image for simplicity. The maximum zone size
+                   of real zoned devices are 2TB. If there are usages for
+                   emulating devices whose zone sizes are larger than 4G,
+                   the field needs to be sized bigger then.
+
+                   The value must be power of 2. Linux currently requires
+                   the zone size to be a power of 2 number of LBAs. Qcow2
+                   following this is mainly to allow emulating a real
+                   ZNS drive configuration. It is not relevant to the cluster
+                   size.
+
+          8 - 11:  zone_capacity
+                   The number of writable bytes within the zones. The bytes
+                   between zone capacity and zone size are unusable: reads
+                   will return 0s and writes will fail.
+
+                   A zone capacity is always smaller or equal to the zone
+                   size. It is for emulating a real ZNS drive configuration,
+                   which has the constraint of aligning to some hardware erase
+                   block size.
+
+         12 - 15:  conventional_zones
+                   The number of conventional zones. The conventional zones
+                   allow sequential writes and random writes. While the
+                   sequential zones only allow sequential writes.
+
+         16 - 19:  nr_zones
+                   The number of zones. It is the sum of conventional zones
+                   and sequential zones.
+
+         20 - 23:  max_active_zones
+                   The number of the zones that can be in the implicit open,
+                   explicit open or closed state. The max active zones cannot
+                   be larger than the max open zones.
+
+         24 - 27:  max_open_zones
+                   The maximal number of open (implicitly open or explicitly
+                   open) zones. It cannot be larger than the number of SWR
+                   zones of the device.
+
+                   If the limits of open zones or active zones are equal to
+                   the total number of SWR zones, then it's the same as having
+                   no limits therefore max open zones and max active zones are
+                   set to 0.
+
+         28 - 31:  max_append_bytes
+                   The number of bytes of a zone append request that can be
+                   issued to the device. It must be 512-byte aligned and less
+                   than the zone capacity.
+
+         32 - 39:  zonedmeta_size
+                   The size of zoned metadata in bytes. It contains no more
+                   than 4GB. The zoned metadata structure is the write
+                   pointers for each zone whose size is the number of zones
+                   multiplied by the zone size.
+
+         40 - 47:  zonedmeta_offset
+                   The offset of zoned metadata structure in the contained
+                   image, in bytes.
+
 == Full disk encryption header pointer ==
 
 The full disk encryption header must be present if, and only if, the
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 151279d481..d48486f344 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -57,6 +57,13 @@ 
 #define BLOCK_OPT_DATA_FILE_RAW     "data_file_raw"
 #define BLOCK_OPT_COMPRESSION_TYPE  "compression_type"
 #define BLOCK_OPT_EXTL2             "extended_l2"
+#define BLOCK_OPT_ZONE_MODEL        "zone.mode"
+#define BLOCK_OPT_ZONE_SIZE         "zone.size"
+#define BLOCK_OPT_ZONE_CAPACITY     "zone.capacity"
+#define BLOCK_OPT_CONVENTIONAL_ZONES    "zone.conventional_zones"
+#define BLOCK_OPT_MAX_APPEND_BYTES      "zone.max_append_bytes"
+#define BLOCK_OPT_MAX_ACTIVE_ZONES      "zone.max_active_zones"
+#define BLOCK_OPT_MAX_OPEN_ZONES        "zone.max_open_zones"
 
 #define BLOCK_PROBE_BUF_SIZE        512
 
@@ -886,6 +893,12 @@  typedef struct BlockLimits {
     /* zone size expressed in bytes */
     uint32_t zone_size;
 
+    /*
+     * the number of usable logical blocks within the zone, expressed
+     * in bytes. A zone capacity is smaller or equal to the zone size.
+     */
+    uint32_t zone_capacity;
+
     /* total number of zones */
     uint32_t nr_zones;
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ca390c5700..e2e0ec21a5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5038,6 +5038,67 @@ 
 { 'enum': 'Qcow2CompressionType',
   'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
 
+##
+# @Qcow2ZoneModel:
+#
+# Zoned device model used in qcow2 image file
+#
+# @host-managed: The host-managed model only allows sequential write over the
+#     device zones.
+#
+# Since 8.2
+##
+{ 'enum': 'Qcow2ZoneModel',
+  'data': [ 'host-managed'] }
+
+##
+# @Qcow2ZoneHostManaged:
+#
+# The host-managed zone model.  It only allows sequential writes.
+#
+# @size: Total number of bytes within zones.
+#
+# @capacity: The number of usable logical blocks within zones
+#     in bytes.  A zone capacity is always smaller or equal to the
+#     zone size.
+#
+# @conventional-zones: The number of conventional zones of the
+#     zoned device (default 0).
+#
+# @max-open-zones: The maximal number of open zones.  It is less than
+#     or equal to the number of sequential write required zones of
+#     the device (default 0).
+#
+# @max-active-zones: The maximal number of zones in the implicit
+#     open, explicit open or closed state.  It is less than or equal
+#     to the max open zones (default 0).
+#
+# @max-append-bytes: The maximal number of bytes of a zone
+#     append request that can be issued to the device.  It must be
+#     512-byte aligned and less than the zone capacity.
+#
+# Since 8.2
+##
+{ 'struct': 'Qcow2ZoneHostManaged',
+  'data': { '*size':          'size',
+            '*capacity':      'size',
+            '*conventional-zones': 'uint32',
+            '*max-open-zones':     'uint32',
+            '*max-active-zones':   'uint32',
+            '*max-append-bytes':   'size' } }
+
+##
+# @Qcow2ZoneCreateOptions:
+#
+# The zone device model for the qcow2 image.
+#
+# Since 8.2
+##
+{ 'union': 'Qcow2ZoneCreateOptions',
+  'base': { 'mode': 'Qcow2ZoneModel' },
+  'discriminator': 'mode',
+  'data': { 'host-managed': 'Qcow2ZoneHostManaged' } }
+
 ##
 # @BlockdevCreateOptionsQcow2:
 #
@@ -5080,6 +5141,9 @@ 
 # @compression-type: The image cluster compression method
 #     (default: zlib, since 5.1)
 #
+# @zone: The zone device model modes.  The default is that the device is
+# not zoned.  (since 8.2)
+#
 # Since: 2.12
 ##
 { 'struct': 'BlockdevCreateOptionsQcow2',
@@ -5096,7 +5160,8 @@ 
             '*preallocation':   'PreallocMode',
             '*lazy-refcounts':  'bool',
             '*refcount-bits':   'int',
-            '*compression-type':'Qcow2CompressionType' } }
+            '*compression-type':'Qcow2CompressionType',
+            '*zone':            'Qcow2ZoneCreateOptions' } }
 
 ##
 # @BlockdevCreateOptionsQed: