[3/5] block: add max_pwrite_zeroes_no_fallback to BlockLimits

Message ID 20200302100537.29058-4-vsementsov@virtuozzo.com
State New
Series nbd: reduce max_block restrictions

Commit Message

Vladimir Sementsov-Ogievskiy March 2, 2020, 10:05 a.m. UTC
NBD spec is updated, so that max_block doesn't relate to
NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which mirrors Qemu
flag BDRV_REQ_NO_FALLBACK). To drop the restriction we need new
max_pwrite_zeroes_no_fallback.

Default value of new max_pwrite_zeroes_no_fallback is zero and it means
no-restriction, so we are automatically done by this commit. Note that
nbd and blkdebug are the only drivers which in the same time define
max_pwrite_zeroes limit and support BDRV_REQ_NO_FALLBACK, so we need to
update only blkdebug.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block_int.h | 8 ++++++++
 block/blkdebug.c          | 7 ++++++-
 block/io.c                | 4 +++-
 3 files changed, 17 insertions(+), 2 deletions(-)

Comments

Eric Blake March 13, 2020, 9:07 p.m. UTC | #1
On 3/2/20 4:05 AM, Vladimir Sementsov-Ogievskiy wrote:
> NBD spec is updated, so that max_block doesn't relate to

Maybe: The NBD spec was recently updated to clarify that max_block...

> NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which mirrors Qemu
> flag BDRV_REQ_NO_FALLBACK). To drop the restriction we need new
> max_pwrite_zeroes_no_fallback.

It feels odd to have two different pwrite_zeroes limits in the block 
layer, but I can live with it if other block layer gurus are also okay 
with it.

> 
> Default value of new max_pwrite_zeroes_no_fallback is zero and it means
> no-restriction, so we are automatically done by this commit. Note that

Why not have the default value be set to the existing value of the 
normal pwrite_zeroes limit, rather than 0?

> nbd and blkdebug are the only drivers which in the same time define
> max_pwrite_zeroes limit and support BDRV_REQ_NO_FALLBACK, so we need to
> update only blkdebug.

Grammar:

The default value for the new max_pwrite_zeroes_no_fallback is zero, 
meaning no restriction, which covers all drivers not touched by this 
commit.  Note that nbd and blkdebug are the only drivers which have a 
max_pwrite_zeroes limit while supporting BDRV_REQ_NO_FALLBACK, so we 
only need to update blkdebug.

Except that I think there IS still a limit in current NBD: you can't 
request anything larger than 32 bits (whereas some other drivers may 
allow a full 63-bit request, as well as future NBD usage when we finally 
add 64-bit extensions to the protocol).  So I think this patch is 
incomplete; it should be updating the nbd code to set the proper limit.
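
A rough sketch of what "the proper limit" could look like, assuming the field stays a signed int32_t and must be a multiple of pwrite_zeroes_alignment as the proposed comment says. The helper name no_fallback_limit() and the alignments used to exercise it are illustrative assumptions, not QEMU code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): largest value that fits in a
 * signed 32-bit limit field and is a multiple of 'alignment'.
 * An alignment of 0 is treated as "no alignment requirement".
 */
int32_t no_fallback_limit(uint32_t alignment)
{
    int64_t cap = INT32_MAX;

    if (alignment == 0) {
        return INT32_MAX;
    }
    /* An alignment larger than the cap cannot be satisfied at all. */
    assert((int64_t)alignment <= cap);
    /* Round the cap down to the nearest multiple of the alignment. */
    return (int32_t)(cap - cap % alignment);
}
```

For a 512-byte alignment this gives 2G - 512, i.e. the largest aligned request that still fits the signed 32-bit field.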

(I still need to post v2 of my patches for bdrv_co_make_zero support, 
which is a case where knowing if there is a 32-bit limit when using 
BDRV_REQ_NO_FALLBACK for fast zeroing is important).

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block_int.h | 8 ++++++++
>   block/blkdebug.c          | 7 ++++++-
>   block/io.c                | 4 +++-
>   3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 6f9fd5e20e..c167e887c6 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -618,6 +618,14 @@ typedef struct BlockLimits {
>        * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
>       int32_t max_pwrite_zeroes;
>   
> +    /*
> +     * Maximum number of bytes that can zeroized at once if flag

zeroed

> +     * BDRV_REQ_NO_FALLBACK specified (since it is signed, it must be < 2G, if
> +     * set).

Why must it be a signed 32-bit number?  Why not let it be a 64-bit number?

> Must be multiple of pwrite_zeroes_alignment. May be 0 if no
> +     * inherent 32-bit limit.
> +     */
> +    int32_t max_pwrite_zeroes_no_fallback;
> +
>       /* Optimal alignment for write zeroes requests in bytes. A power
>        * of 2 is best but not mandatory.  Must be a multiple of
>        * bl.request_alignment, and must be less than max_pwrite_zeroes
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index af44aa973f..7627fbcb3b 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -692,7 +692,11 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
>       }
>       assert(QEMU_IS_ALIGNED(offset, align));
>       assert(QEMU_IS_ALIGNED(bytes, align));
> -    if (bs->bl.max_pwrite_zeroes) {
> +    if ((flags & BDRV_REQ_NO_FALLBACK) &&
> +        bs->bl.max_pwrite_zeroes_no_fallback)
> +    {
> +        assert(bytes <= bs->bl.max_pwrite_zeroes_no_fallback);
> +    } else if (bs->bl.max_pwrite_zeroes) {
>           assert(bytes <= bs->bl.max_pwrite_zeroes);
>       }
>   
> @@ -977,6 +981,7 @@ static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
>       }
>       if (s->max_write_zero) {
>           bs->bl.max_pwrite_zeroes = s->max_write_zero;
> +        bs->bl.max_pwrite_zeroes_no_fallback = s->max_write_zero;

Ah, so you DO default it to max_pwrite_zeroes instead of to 0; the 
commit message does not quite match the code.

>       }
>       if (s->opt_discard) {
>           bs->bl.pdiscard_alignment = s->opt_discard;
> diff --git a/block/io.c b/block/io.c
> index 7e4cb74cf4..75fd5600c2 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1752,7 +1752,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>       int head = 0;
>       int tail = 0;
>   
> -    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
> +    int max_write_zeroes = MIN_NON_ZERO((flags & BDRV_REQ_NO_FALLBACK) ?
> +                                        bs->bl.max_pwrite_zeroes_no_fallback :
> +                                        bs->bl.max_pwrite_zeroes, INT_MAX);

I'd still like to get rid of this INT_MAX clamping.  If we can blank the 
entire image in one call, even when it is larger than 4G, then it is 
worth making that exposed to the user.  (Even in NBD, we might decide to 
add an extension that allows NBD_CMD_WRITE_ZEROES with a new flag and 
with offset/length == 0/0, as an official way to make the entire image 
zero, whereas it is now currently unspecified to pass a length of 0).
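
A minimal model of the selection in the hunk quoted above, to show what the INT_MAX clamp does in practice. It is a sketch only: REQ_NO_FALLBACK and zero_request_cap() are stand-in names, and min_non_zero() is assumed to behave like QEMU's MIN_NON_ZERO macro (a zero operand means "no limit"):

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for BDRV_REQ_NO_FALLBACK; the real value lives in QEMU headers. */
#define REQ_NO_FALLBACK 0x1u

/* Assumed to mirror MIN_NON_ZERO: a zero operand means "no limit". */
int min_non_zero(int a, int b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/*
 * Per-request cap, modelled on the patched line in
 * bdrv_co_do_pwrite_zeroes(): pick the limit that matches the flag,
 * then clamp it to INT_MAX.
 */
int zero_request_cap(int32_t max_pwrite_zeroes,
                     int32_t max_pwrite_zeroes_no_fallback,
                     unsigned flags)
{
    int limit = (flags & REQ_NO_FALLBACK) ? max_pwrite_zeroes_no_fallback
                                          : max_pwrite_zeroes;

    assert(limit >= 0);
    return min_non_zero(limit, INT_MAX);
}
```

Even with both fields left at 0, the cap comes out as INT_MAX rather than "unlimited", which is the clamping in question.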

>       int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
>                           bs->bl.request_alignment);
>       int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
>
Vladimir Sementsov-Ogievskiy March 24, 2020, 8:32 a.m. UTC | #2
14.03.2020 0:07, Eric Blake wrote:
> On 3/2/20 4:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>> NBD spec is updated, so that max_block doesn't relate to
> 
> Maybe: The NBD spec was recently updated to clarify that max_block...
> 
>> NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which mirrors Qemu
>> flag BDRV_REQ_NO_FALLBACK). To drop the restriction we need new
>> max_pwrite_zeroes_no_fallback.
> 
> It feels odd to have two different pwrite_zeroes limits in the block layer, but I can live with it if other block layer gurus are also okay with it.
> 
>>
>> Default value of new max_pwrite_zeroes_no_fallback is zero and it means
>> no-restriction, so we are automatically done by this commit. Note that
> 
> Why not have the default value be set to the existing value of the normal pwrite_zeroes limit, rather than 0?
> 
>> nbd and blkdebug are the only drivers which in the same time define
>> max_pwrite_zeroes limit and support BDRV_REQ_NO_FALLBACK, so we need to
>> update only blkdebug.
> 
> Grammar:
> 
> The default value for the new max_pwrite_zeroes_no_fallback is zero, meaning no restriction, which covers all drivers not touched by this commit.  Note that nbd and blkdebug are the only drivers which have a max_pwrite_zeroes limit while supporting BDRV_REQ_NO_FALLBACK, so we only need to update blkdebug.
> 
> Except that I think there IS still a limit in current NBD: you can't request anything larger than 32 bits (whereas some other drivers may allow a full 63-bit request, as well as future NBD usage when we finally add 64-bit extensions to the protocol).  So I think this patch is incomplete; it should be updating the nbd code to set the proper limit.
> 
> (I still need to post v2 of my patches for bdrv_co_make_zero support, which is a case where knowing if there is a 32-bit limit when using BDRV_REQ_NO_FALLBACK for fast zeroing is important).
> 
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block_int.h | 8 ++++++++
>>   block/blkdebug.c          | 7 ++++++-
>>   block/io.c                | 4 +++-
>>   3 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 6f9fd5e20e..c167e887c6 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -618,6 +618,14 @@ typedef struct BlockLimits {
>>        * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
>>       int32_t max_pwrite_zeroes;
>> +    /*
>> +     * Maximum number of bytes that can zeroized at once if flag
> 
> zeroed
> 
>> +     * BDRV_REQ_NO_FALLBACK specified (since it is signed, it must be < 2G, if
>> +     * set).
> 
> Why must it be a signed 32-bit number?  Why not let it be a 64-bit number?
> 
>> Must be multiple of pwrite_zeroes_alignment. May be 0 if no
>> +     * inherent 32-bit limit.
>> +     */
>> +    int32_t max_pwrite_zeroes_no_fallback;
>> +
>>       /* Optimal alignment for write zeroes requests in bytes. A power
>>        * of 2 is best but not mandatory.  Must be a multiple of
>>        * bl.request_alignment, and must be less than max_pwrite_zeroes
>> diff --git a/block/blkdebug.c b/block/blkdebug.c
>> index af44aa973f..7627fbcb3b 100644
>> --- a/block/blkdebug.c
>> +++ b/block/blkdebug.c
>> @@ -692,7 +692,11 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
>>       }
>>       assert(QEMU_IS_ALIGNED(offset, align));
>>       assert(QEMU_IS_ALIGNED(bytes, align));
>> -    if (bs->bl.max_pwrite_zeroes) {
>> +    if ((flags & BDRV_REQ_NO_FALLBACK) &&
>> +        bs->bl.max_pwrite_zeroes_no_fallback)
>> +    {
>> +        assert(bytes <= bs->bl.max_pwrite_zeroes_no_fallback);
>> +    } else if (bs->bl.max_pwrite_zeroes) {
>>           assert(bytes <= bs->bl.max_pwrite_zeroes);
>>       }
>> @@ -977,6 +981,7 @@ static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
>>       }
>>       if (s->max_write_zero) {
>>           bs->bl.max_pwrite_zeroes = s->max_write_zero;
>> +        bs->bl.max_pwrite_zeroes_no_fallback = s->max_write_zero;
> 
> Ah, so you DO default it to max_pwrite_zeroes instead of to 0; the commit message does not quite match the code.
> 
>>       }
>>       if (s->opt_discard) {
>>           bs->bl.pdiscard_alignment = s->opt_discard;
>> diff --git a/block/io.c b/block/io.c
>> index 7e4cb74cf4..75fd5600c2 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1752,7 +1752,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>       int head = 0;
>>       int tail = 0;
>> -    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
>> +    int max_write_zeroes = MIN_NON_ZERO((flags & BDRV_REQ_NO_FALLBACK) ?
>> +                                        bs->bl.max_pwrite_zeroes_no_fallback :
>> +                                        bs->bl.max_pwrite_zeroes, INT_MAX);
> 
> I'd still like to get rid of this INT_MAX clamping.  If we can blank the entire image in one call, even when it is larger than 4G, then it is worth making that exposed to the user.  (Even in NBD, we might decide to add an extension that allows NBD_CMD_WRITE_ZEROES with a new flag and with offset/length == 0/0, as an official way to make the entire image zero, whereas it is now currently unspecified to pass a length of 0).
> 

Hmm. This series is something of a hack. Since it has missed 5.0 anyway, I think I'll prepare something more complete. It would be good to prepare the generic block layer for 64-bit commands.

>>       int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
>>                           bs->bl.request_alignment);
>>       int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
>>
>
Vladimir Sementsov-Ogievskiy March 31, 2020, 6:52 a.m. UTC | #3
24.03.2020 11:32, Vladimir Sementsov-Ogievskiy wrote:
> 14.03.2020 0:07, Eric Blake wrote:
>> On 3/2/20 4:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> NBD spec is updated, so that max_block doesn't relate to
>>
>> Maybe: The NBD spec was recently updated to clarify that max_block...
>>
>>> NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which mirrors Qemu
>>> flag BDRV_REQ_NO_FALLBACK). To drop the restriction we need new
>>> max_pwrite_zeroes_no_fallback.
>>
>> It feels odd to have two different pwrite_zeroes limits in the block layer, but I can live with it if other block layer gurus are also okay with it.
>>
>>>
>>> Default value of new max_pwrite_zeroes_no_fallback is zero and it means
>>> no-restriction, so we are automatically done by this commit. Note that
>>
>> Why not have the default value be set to the existing value of the normal pwrite_zeroes limit, rather than 0?
>>
>>> nbd and blkdebug are the only drivers which in the same time define
>>> max_pwrite_zeroes limit and support BDRV_REQ_NO_FALLBACK, so we need to
>>> update only blkdebug.
>>
>> Grammar:
>>
>> The default value for the new max_pwrite_zeroes_no_fallback is zero, meaning no restriction, which covers all drivers not touched by this commit.  Note that nbd and blkdebug are the only drivers which have a max_pwrite_zeroes limit while supporting BDRV_REQ_NO_FALLBACK, so we only need to update blkdebug.
>>
>> Except that I think there IS still a limit in current NBD: you can't request anything larger than 32 bits (whereas some other drivers may allow a full 63-bit request, as well as future NBD usage when we finally add 64-bit extensions to the protocol).  So I think this patch is incomplete; it should be updating the nbd code to set the proper limit.
>>
>> (I still need to post v2 of my patches for bdrv_co_make_zero support, which is a case where knowing if there is a 32-bit limit when using BDRV_REQ_NO_FALLBACK for fast zeroing is important).
>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   include/block/block_int.h | 8 ++++++++
>>>   block/blkdebug.c          | 7 ++++++-
>>>   block/io.c                | 4 +++-
>>>   3 files changed, 17 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 6f9fd5e20e..c167e887c6 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -618,6 +618,14 @@ typedef struct BlockLimits {
>>>        * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
>>>       int32_t max_pwrite_zeroes;
>>> +    /*
>>> +     * Maximum number of bytes that can zeroized at once if flag
>>
>> zeroed
>>
>>> +     * BDRV_REQ_NO_FALLBACK specified (since it is signed, it must be < 2G, if
>>> +     * set).
>>
>> Why must it be a signed 32-bit number?  Why not let it be a 64-bit number?
>>
>>> Must be multiple of pwrite_zeroes_alignment. May be 0 if no
>>> +     * inherent 32-bit limit.
>>> +     */
>>> +    int32_t max_pwrite_zeroes_no_fallback;
>>> +
>>>       /* Optimal alignment for write zeroes requests in bytes. A power
>>>        * of 2 is best but not mandatory.  Must be a multiple of
>>>        * bl.request_alignment, and must be less than max_pwrite_zeroes
>>> diff --git a/block/blkdebug.c b/block/blkdebug.c
>>> index af44aa973f..7627fbcb3b 100644
>>> --- a/block/blkdebug.c
>>> +++ b/block/blkdebug.c
>>> @@ -692,7 +692,11 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
>>>       }
>>>       assert(QEMU_IS_ALIGNED(offset, align));
>>>       assert(QEMU_IS_ALIGNED(bytes, align));
>>> -    if (bs->bl.max_pwrite_zeroes) {
>>> +    if ((flags & BDRV_REQ_NO_FALLBACK) &&
>>> +        bs->bl.max_pwrite_zeroes_no_fallback)
>>> +    {
>>> +        assert(bytes <= bs->bl.max_pwrite_zeroes_no_fallback);
>>> +    } else if (bs->bl.max_pwrite_zeroes) {
>>>           assert(bytes <= bs->bl.max_pwrite_zeroes);
>>>       }
>>> @@ -977,6 +981,7 @@ static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
>>>       }
>>>       if (s->max_write_zero) {
>>>           bs->bl.max_pwrite_zeroes = s->max_write_zero;
>>> +        bs->bl.max_pwrite_zeroes_no_fallback = s->max_write_zero;
>>
>> Ah, so you DO default it to max_pwrite_zeroes instead of to 0; the commit message does not quite match the code.
>>
>>>       }
>>>       if (s->opt_discard) {
>>>           bs->bl.pdiscard_alignment = s->opt_discard;
>>> diff --git a/block/io.c b/block/io.c
>>> index 7e4cb74cf4..75fd5600c2 100644
>>> --- a/block/io.c
>>> +++ b/block/io.c
>>> @@ -1752,7 +1752,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>>       int head = 0;
>>>       int tail = 0;
>>> -    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
>>> +    int max_write_zeroes = MIN_NON_ZERO((flags & BDRV_REQ_NO_FALLBACK) ?
>>> +                                        bs->bl.max_pwrite_zeroes_no_fallback :
>>> +                                        bs->bl.max_pwrite_zeroes, INT_MAX);
>>
>> I'd still like to get rid of this INT_MAX clamping.  If we can blank the entire image in one call, even when it is larger than 4G, then it is worth making that exposed to the user.  (Even in NBD, we might decide to add an extension that allows NBD_CMD_WRITE_ZEROES with a new flag and with offset/length == 0/0, as an official way to make the entire image zero, whereas it is now currently unspecified to pass a length of 0).
>>
> 
> Hmm. This series is something of a hack. Since it has missed 5.0 anyway, I think I'll prepare something more complete. It would be good to prepare the generic block layer for 64-bit commands.

I've started: "[RFC 0/3] 64bit block-layer part I". But I now see that converting the block layer is not a simple thing, so it probably won't be quick, and it's better not to force a dependency between the two series. So I'll address your comments and resend this series separately, and will try to keep the two moving in parallel.

> 
>>>       int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
>>>                           bs->bl.request_alignment);
>>>       int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
>>>
>>
> 
>
Vladimir Sementsov-Ogievskiy April 1, 2020, 2:09 p.m. UTC | #4
14.03.2020 0:07, Eric Blake wrote:
> On 3/2/20 4:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>> NBD spec is updated, so that max_block doesn't relate to
> 
> Maybe: The NBD spec was recently updated to clarify that max_block...
> 
>> NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which mirrors Qemu
>> flag BDRV_REQ_NO_FALLBACK). To drop the restriction we need new
>> max_pwrite_zeroes_no_fallback.
> 
> It feels odd to have two different pwrite_zeroes limits in the block layer, but I can live with it if other block layer gurus are also okay with it.
> 
>>
>> Default value of new max_pwrite_zeroes_no_fallback is zero and it means
>> no-restriction, so we are automatically done by this commit. Note that
> 
> Why not have the default value be set to the existing value of the normal pwrite_zeroes limit, rather than 0?

Hm, I agree that it's better to keep the safer default.
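
One possible shape for that safer default, purely as an illustration: if a driver leaves the no-fallback limit unset, inherit the ordinary write-zeroes limit. The function name and the place where such a step would run are assumptions, not what any later revision of the series actually does:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical post-processing step (not QEMU code): if the driver left
 * the no-fallback limit unset (0), inherit the ordinary write-zeroes
 * limit; otherwise keep the value the driver chose.
 */
int32_t inherit_no_fallback_limit(int32_t max_pwrite_zeroes,
                                  int32_t max_pwrite_zeroes_no_fallback)
{
    assert(max_pwrite_zeroes >= 0 && max_pwrite_zeroes_no_fallback >= 0);
    return max_pwrite_zeroes_no_fallback ? max_pwrite_zeroes_no_fallback
                                         : max_pwrite_zeroes;
}
```

The trade-off in this variant is that 0 can no longer express "unlimited for NO_FALLBACK while ordinary requests are limited"; a driver in that position would have to set an explicit large value.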

> 
>> nbd and blkdebug are the only drivers which in the same time define
>> max_pwrite_zeroes limit and support BDRV_REQ_NO_FALLBACK, so we need to
>> update only blkdebug.
> 
> Grammar:
> 
> The default value for the new max_pwrite_zeroes_no_fallback is zero, meaning no restriction, which covers all drivers not touched by this commit.  Note that nbd and blkdebug are the only drivers which have a max_pwrite_zeroes limit while supporting BDRV_REQ_NO_FALLBACK, so we only need to update blkdebug.
> 
> Except that I think there IS still a limit in current NBD: you can't request anything larger than 32 bits (whereas some other drivers may allow a full 63-bit request, as well as future NBD usage when we finally add 64-bit extensions to the protocol).  So I think this patch is incomplete; it should be updating the nbd code to set the proper limit.
> 
> (I still need to post v2 of my patches for bdrv_co_make_zero support, which is a case where knowing if there is a 32-bit limit when using BDRV_REQ_NO_FALLBACK for fast zeroing is important).
> 
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block_int.h | 8 ++++++++
>>   block/blkdebug.c          | 7 ++++++-
>>   block/io.c                | 4 +++-
>>   3 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 6f9fd5e20e..c167e887c6 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -618,6 +618,14 @@ typedef struct BlockLimits {
>>        * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
>>       int32_t max_pwrite_zeroes;
>> +    /*
>> +     * Maximum number of bytes that can zeroized at once if flag
> 
> zeroed
> 
>> +     * BDRV_REQ_NO_FALLBACK specified (since it is signed, it must be < 2G, if
>> +     * set).
> 
> Why must it be a signed 32-bit number?  Why not let it be a 64-bit number?
> 
>> Must be multiple of pwrite_zeroes_alignment. May be 0 if no
>> +     * inherent 32-bit limit.
>> +     */
>> +    int32_t max_pwrite_zeroes_no_fallback;
>> +
>>       /* Optimal alignment for write zeroes requests in bytes. A power
>>        * of 2 is best but not mandatory.  Must be a multiple of
>>        * bl.request_alignment, and must be less than max_pwrite_zeroes
>> diff --git a/block/blkdebug.c b/block/blkdebug.c
>> index af44aa973f..7627fbcb3b 100644
>> --- a/block/blkdebug.c
>> +++ b/block/blkdebug.c
>> @@ -692,7 +692,11 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
>>       }
>>       assert(QEMU_IS_ALIGNED(offset, align));
>>       assert(QEMU_IS_ALIGNED(bytes, align));
>> -    if (bs->bl.max_pwrite_zeroes) {
>> +    if ((flags & BDRV_REQ_NO_FALLBACK) &&
>> +        bs->bl.max_pwrite_zeroes_no_fallback)
>> +    {
>> +        assert(bytes <= bs->bl.max_pwrite_zeroes_no_fallback);
>> +    } else if (bs->bl.max_pwrite_zeroes) {
>>           assert(bytes <= bs->bl.max_pwrite_zeroes);
>>       }
>> @@ -977,6 +981,7 @@ static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
>>       }
>>       if (s->max_write_zero) {
>>           bs->bl.max_pwrite_zeroes = s->max_write_zero;
>> +        bs->bl.max_pwrite_zeroes_no_fallback = s->max_write_zero;
> 
> Ah, so you DO default it to max_pwrite_zeroes instead of to 0; the commit message does not quite match the code.

In this patch it's only for blkdebug. But I'm going to change the default as you proposed.

> 
>>       }
>>       if (s->opt_discard) {
>>           bs->bl.pdiscard_alignment = s->opt_discard;
>> diff --git a/block/io.c b/block/io.c
>> index 7e4cb74cf4..75fd5600c2 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1752,7 +1752,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>       int head = 0;
>>       int tail = 0;
>> -    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
>> +    int max_write_zeroes = MIN_NON_ZERO((flags & BDRV_REQ_NO_FALLBACK) ?
>> +                                        bs->bl.max_pwrite_zeroes_no_fallback :
>> +                                        bs->bl.max_pwrite_zeroes, INT_MAX);
> 
> I'd still like to get rid of this INT_MAX clamping.  If we can blank the entire image in one call, even when it is larger than 4G, then it is worth making that exposed to the user.  (Even in NBD, we might decide to add an extension that allows NBD_CMD_WRITE_ZEROES with a new flag and with offset/length == 0/0, as an official way to make the entire image zero, whereas it is now currently unspecified to pass a length of 0).

We can't get rid of it now, simply because the write_zero driver handler takes an int argument. So I'm going to convert the block layer to int64_t everywhere, but in a separate series.
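
To illustrate the consequence of the int argument: a request larger than the cap has to be split into several driver calls. The sketch below only counts those calls; chunks_needed() is a hypothetical name, and it deliberately ignores the alignment and head/tail handling that the real loop in block/io.c performs:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical illustration (not QEMU code): number of driver calls
 * needed to zero 'bytes' bytes when each call takes an int length no
 * larger than 'cap'.
 */
int64_t chunks_needed(int64_t bytes, int cap)
{
    int64_t calls = 0;

    assert(bytes >= 0 && cap > 0);
    while (bytes > 0) {
        /* Each call handles at most 'cap' bytes, which fits in an int. */
        int chunk = bytes < cap ? (int)bytes : cap;

        bytes -= chunk;
        calls++;
    }
    return calls;
}
```

With a cap of INT_MAX, zeroing a 5 GiB image takes three calls instead of one, which is why removing the clamp depends on the int64_t conversion.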

> 
>>       int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
>>                           bs->bl.request_alignment);
>>       int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
>>
>

Patch

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6f9fd5e20e..c167e887c6 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -618,6 +618,14 @@  typedef struct BlockLimits {
      * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
     int32_t max_pwrite_zeroes;
 
+    /*
+     * Maximum number of bytes that can zeroized at once if flag
+     * BDRV_REQ_NO_FALLBACK specified (since it is signed, it must be < 2G, if
+     * set). Must be multiple of pwrite_zeroes_alignment. May be 0 if no
+     * inherent 32-bit limit.
+     */
+    int32_t max_pwrite_zeroes_no_fallback;
+
     /* Optimal alignment for write zeroes requests in bytes. A power
      * of 2 is best but not mandatory.  Must be a multiple of
      * bl.request_alignment, and must be less than max_pwrite_zeroes
diff --git a/block/blkdebug.c b/block/blkdebug.c
index af44aa973f..7627fbcb3b 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -692,7 +692,11 @@  static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
     }
     assert(QEMU_IS_ALIGNED(offset, align));
     assert(QEMU_IS_ALIGNED(bytes, align));
-    if (bs->bl.max_pwrite_zeroes) {
+    if ((flags & BDRV_REQ_NO_FALLBACK) &&
+        bs->bl.max_pwrite_zeroes_no_fallback)
+    {
+        assert(bytes <= bs->bl.max_pwrite_zeroes_no_fallback);
+    } else if (bs->bl.max_pwrite_zeroes) {
         assert(bytes <= bs->bl.max_pwrite_zeroes);
     }
 
@@ -977,6 +981,7 @@  static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
     }
     if (s->max_write_zero) {
         bs->bl.max_pwrite_zeroes = s->max_write_zero;
+        bs->bl.max_pwrite_zeroes_no_fallback = s->max_write_zero;
     }
     if (s->opt_discard) {
         bs->bl.pdiscard_alignment = s->opt_discard;
diff --git a/block/io.c b/block/io.c
index 7e4cb74cf4..75fd5600c2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1752,7 +1752,9 @@  static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int head = 0;
     int tail = 0;
 
-    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
+    int max_write_zeroes = MIN_NON_ZERO((flags & BDRV_REQ_NO_FALLBACK) ?
+                                        bs->bl.max_pwrite_zeroes_no_fallback :
+                                        bs->bl.max_pwrite_zeroes, INT_MAX);
     int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
                         bs->bl.request_alignment);
     int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);