diff mbox

block: limited request size in write zeroes unsupported path

Message ID 1420457389-16332-1-git-send-email-pl@kamp.de
State New
Headers show

Commit Message

Peter Lieven Jan. 5, 2015, 11:29 a.m. UTC
If bs->bl.max_write_zeroes is large and we end up in the unsupported
path we might allocate a lot of memory for the iovector and/or even
generate an oversized requests.

Fix this by limiting the request by the minimum of the reported
maximum transfer size or 16MB (32768 sectors).

Reported-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Denis V. Lunev Jan. 5, 2015, 11:51 a.m. UTC | #1
On 05/01/15 14:29, Peter Lieven wrote:
> If bs->bl.max_write_zeroes is large and we end up in the unsupported
> path we might allocate a lot of memory for the iovector and/or even
> generate an oversized requests.
>
> Fix this by limiting the request by the minimum of the reported
> maximum transfer size or 16MB (32768 sectors).
>
> Reported-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>   block.c |    5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index a612594..8009478 100644
> --- a/block.c
> +++ b/block.c
> @@ -3203,6 +3203,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>
>           if (ret == -ENOTSUP) {
>               /* Fall back to bounce buffer if write zeroes is unsupported */
> +            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
> +                                            MAX_WRITE_ZEROES_DEFAULT);
> +            num = MIN(num, max_xfer_len);
>               iov.iov_len = num * BDRV_SECTOR_SIZE;
>               if (iov.iov_base == NULL) {
>                   iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
> @@ -3219,7 +3222,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>               /* Keep bounce buffer around if it is big enough for all
>                * all future requests.
>                */
> -            if (num < max_write_zeroes) {
> +            if (num < max_xfer_len) {
>                   qemu_vfree(iov.iov_base);
>                   iov.iov_base = NULL;
>               }
>
this is not going to work IMHO. num is the number in sectors.
max_xfer_len is in bytes.

I will send my updated version using your approach in a
couple of minutes. Would like to test it a bit.
Peter Lieven Jan. 5, 2015, 12:14 p.m. UTC | #2
On 05.01.2015 12:51, Denis V. Lunev wrote:
> On 05/01/15 14:29, Peter Lieven wrote:
>> If bs->bl.max_write_zeroes is large and we end up in the unsupported
>> path we might allocate a lot of memory for the iovector and/or even
>> generate an oversized requests.
>>
>> Fix this by limiting the request by the minimum of the reported
>> maximum transfer size or 16MB (32768 sectors).
>>
>> Reported-by: Denis V. Lunev <den@openvz.org>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>   block.c |    5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/block.c b/block.c
>> index a612594..8009478 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3203,6 +3203,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>>
>>           if (ret == -ENOTSUP) {
>>               /* Fall back to bounce buffer if write zeroes is unsupported */
>> +            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
>> + MAX_WRITE_ZEROES_DEFAULT);
>> +            num = MIN(num, max_xfer_len);
>>               iov.iov_len = num * BDRV_SECTOR_SIZE;
>>               if (iov.iov_base == NULL) {
>>                   iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
>> @@ -3219,7 +3222,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>>               /* Keep bounce buffer around if it is big enough for all
>>                * all future requests.
>>                */
>> -            if (num < max_write_zeroes) {
>> +            if (num < max_xfer_len) {
>>                   qemu_vfree(iov.iov_base);
>>                   iov.iov_base = NULL;
>>               }
>>
> this is not going to work IMHO. num is the number in sectors.
> max_xfer_len is in bytes.

bs->bl.max_transfer_length is in sectors.

Peter

>
> I will send my updated version using your approach in a
> couple of minutes. Would like to test it a bit.
Denis V. Lunev Jan. 5, 2015, 12:28 p.m. UTC | #3
On 05/01/15 15:14, Peter Lieven wrote:
> On 05.01.2015 12:51, Denis V. Lunev wrote:
>> On 05/01/15 14:29, Peter Lieven wrote:
>>> If bs->bl.max_write_zeroes is large and we end up in the unsupported
>>> path we might allocate a lot of memory for the iovector and/or even
>>> generate an oversized requests.
>>>
>>> Fix this by limiting the request by the minimum of the reported
>>> maximum transfer size or 16MB (32768 sectors).
>>>
>>> Reported-by: Denis V. Lunev <den@openvz.org>
>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> ---
>>>   block.c |    5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index a612594..8009478 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -3203,6 +3203,9 @@ static int coroutine_fn 
>>> bdrv_co_do_write_zeroes(BlockDriverState *bs,
>>>
>>>           if (ret == -ENOTSUP) {
>>>               /* Fall back to bounce buffer if write zeroes is 
>>> unsupported */
>>> +            int max_xfer_len = 
>>> MIN_NON_ZERO(bs->bl.max_transfer_length,
>>> + MAX_WRITE_ZEROES_DEFAULT);
>>> +            num = MIN(num, max_xfer_len);
>>>               iov.iov_len = num * BDRV_SECTOR_SIZE;
>>>               if (iov.iov_base == NULL) {
>>>                   iov.iov_base = qemu_try_blockalign(bs, num * 
>>> BDRV_SECTOR_SIZE);
>>> @@ -3219,7 +3222,7 @@ static int coroutine_fn 
>>> bdrv_co_do_write_zeroes(BlockDriverState *bs,
>>>               /* Keep bounce buffer around if it is big enough for all
>>>                * all future requests.
>>>                */
>>> -            if (num < max_write_zeroes) {
>>> +            if (num < max_xfer_len) {
>>>                   qemu_vfree(iov.iov_base);
>>>                   iov.iov_base = NULL;
>>>               }
>>>
>> this is not going to work IMHO. num is the number in sectors.
>> max_xfer_len is in bytes.
>
> bs->bl.max_transfer_length is in sectors.
>
> Peter
>

oops. you are right...
Denis V. Lunev Jan. 5, 2015, 12:34 p.m. UTC | #4
On 05/01/15 14:29, Peter Lieven wrote:
> If bs->bl.max_write_zeroes is large and we end up in the unsupported
> path we might allocate a lot of memory for the iovector and/or even
> generate an oversized requests.
>
> Fix this by limiting the request by the minimum of the reported
> maximum transfer size or 16MB (32768 sectors).
>
> Reported-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>   block.c |    5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index a612594..8009478 100644
> --- a/block.c
> +++ b/block.c
> @@ -3203,6 +3203,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>
>           if (ret == -ENOTSUP) {
>               /* Fall back to bounce buffer if write zeroes is unsupported */
> +            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
> +                                            MAX_WRITE_ZEROES_DEFAULT);
> +            num = MIN(num, max_xfer_len);
>               iov.iov_len = num * BDRV_SECTOR_SIZE;
>               if (iov.iov_base == NULL) {
>                   iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
> @@ -3219,7 +3222,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>               /* Keep bounce buffer around if it is big enough for all
>                * all future requests.
>                */
> -            if (num < max_write_zeroes) {
> +            if (num < max_xfer_len) {
>                   qemu_vfree(iov.iov_base);
>                   iov.iov_base = NULL;
>               }
>
Reviewed-by: Denis V. Lunev <den@openvz.org>

Though pls consider my patch v3, it avoids allocation of 16 Mb here and
uses only 1 Mb of memory.
Stefan Hajnoczi Jan. 6, 2015, 3:42 p.m. UTC | #5
On Mon, Jan 05, 2015 at 12:29:49PM +0100, Peter Lieven wrote:
> If bs->bl.max_write_zeroes is large and we end up in the unsupported
> path we might allocate a lot of memory for the iovector and/or even
> generate an oversized requests.
> 
> Fix this by limiting the request by the minimum of the reported
> maximum transfer size or 16MB (32768 sectors).
> 
> Reported-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan
Stefan Hajnoczi Jan. 6, 2015, 3:43 p.m. UTC | #6
On Mon, Jan 05, 2015 at 03:34:07PM +0300, Denis V. Lunev wrote:
> Though pls consider my patch v3, it avoids allocation of 16 Mb here and
> uses only 1 Mb of memory.

Once your patch has Reviewed-by: it will show up on my radar for merge.

If you and Peter need a 2nd opinion in your discussions about the
fallocate series, I can look at the series in more detail myself.  Just
let me know.

Stefan
Denis V. Lunev Jan. 6, 2015, 5:56 p.m. UTC | #7
On 06/01/15 18:43, Stefan Hajnoczi wrote:
> On Mon, Jan 05, 2015 at 03:34:07PM +0300, Denis V. Lunev wrote:
>> Though pls consider my patch v3, it avoids allocation of 16 Mb here and
>> uses only 1 Mb of memory.
>
> Once your patch has Reviewed-by: it will show up on my radar for merge.
>
> If you and Peter need a 2nd opinion in your discussions about the
> fallocate series, I can look at the series in more detail myself.  Just
> let me know.
>
> Stefan
>


Fallocate stuff has been reviewed by Fam and I have enough
feedback at the moment to start rework. He wants some
simplifications at the moment. This is not a big deal.

This patch is technically correct and solves the problem
I have spotted. Thus it could be merged. I'll drop patch 1
in my series for the sake of this one to avoid unnecessary
discussion with it.

On the other hand I believe that my patch is a little bit
better, it allocates only 1 MB instead of 16 here. Though
I could rebase it and send it separately on top of this
to discuss it independently.

By the way, Stefan, do you see Acked-by: tag in your radar
or it should be avoided? We are using it as review signature
thanks to my prior Linux kernel experience.

Regards,
	Den
diff mbox

Patch

diff --git a/block.c b/block.c
index a612594..8009478 100644
--- a/block.c
+++ b/block.c
@@ -3203,6 +3203,9 @@  static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
 
         if (ret == -ENOTSUP) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
+            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
+                                            MAX_WRITE_ZEROES_DEFAULT);
+            num = MIN(num, max_xfer_len);
             iov.iov_len = num * BDRV_SECTOR_SIZE;
             if (iov.iov_base == NULL) {
                 iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
@@ -3219,7 +3222,7 @@  static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
             /* Keep bounce buffer around if it is big enough for all
              * all future requests.
              */
-            if (num < max_write_zeroes) {
+            if (num < max_xfer_len) {
                 qemu_vfree(iov.iov_base);
                 iov.iov_base = NULL;
             }