[1/2] glusterfs: fix max_discard

Message ID 1422907754-21543-2-git-send-email-den@openvz.org
State New

Commit Message

Denis V. Lunev Feb. 2, 2015, 8:09 p.m. UTC
qemu_gluster_co_discard calculates size to discard as follows
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
    ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);

glfs_discard_async is declared as follows:
  int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
                          glfs_io_cbk fn, void *data) __THROW
This is problematic on i686 as sizeof(size_t) == 4.

Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
on i386.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Peter Lieven <pl@kamp.de>
---
 block/gluster.c | 9 +++++++++
 1 file changed, 9 insertions(+)
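
For illustration, a minimal standalone program (not part of the patch; a plain
uint32_t stands in for i686's 32-bit size_t) shows the wraparound described in
the commit message and the sector limit the patch installs:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)

int main(void)
{
    /* A 4 GiB discard request expressed in 512-byte sectors. */
    int nb_sectors = 1 << 23;                        /* 8388608 sectors */

    uint64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;  /* 4294967296 bytes */
    uint32_t on_i686 = (uint32_t)bytes;              /* truncates to 0   */

    printf("intended: %" PRIu64 " bytes\n", bytes);
    printf("on i686:  %" PRIu32 " bytes\n", on_i686);

    /* The clamp from the patch: the largest sector count whose byte size
     * still fits into a 32-bit size_t. */
    printf("max_discard: %" PRIu32 " sectors\n",
           UINT32_MAX >> BDRV_SECTOR_BITS);
    return 0;
}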

Comments

Peter Lieven Feb. 2, 2015, 8:40 p.m. UTC | #1
Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
> qemu_gluster_co_discard calculates size to discard as follows
>     size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>     ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>
> glfs_discard_async is declared as follows:
>   int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>                           glfs_io_cbk fn, void *data) __THROW
> This is problematic on i686 as sizeof(size_t) == 4.
>
> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
> on i386.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Peter Lieven <pl@kamp.de>
> ---
>  block/gluster.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/block/gluster.c b/block/gluster.c
> index 1eb3a8c..8a8c153 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -622,6 +622,11 @@ out:
>      return ret;
>  }
>  
> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
> +}
> +

Looking at the gluster code, bl.max_transfer_length should have the same limit, but that's a different patch.

>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
>          int64_t sector_num, int nb_sectors)
> @@ -735,6 +740,7 @@ static BlockDriver bdrv_gluster = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -762,6 +768,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -789,6 +796,7 @@ static BlockDriver bdrv_gluster_unix = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -816,6 +824,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif

Reviewed-by: Peter Lieven <pl@kamp.de>
Denis V. Lunev Feb. 2, 2015, 8:46 p.m. UTC | #2
On 02/02/15 23:40, Peter Lieven wrote:
> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
>> qemu_gluster_co_discard calculates size to discard as follows
>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>
>> glfs_discard_async is declared as follows:
>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>                            glfs_io_cbk fn, void *data) __THROW
>> This is problematic on i686 as sizeof(size_t) == 4.
>>
>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>> on i386.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Peter Lieven <pl@kamp.de>
>> ---
>>   block/gluster.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/block/gluster.c b/block/gluster.c
>> index 1eb3a8c..8a8c153 100644
>> --- a/block/gluster.c
>> +++ b/block/gluster.c
>> @@ -622,6 +622,11 @@ out:
>>       return ret;
>>   }
>>   
>> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>> +{
>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>> +}
>> +
> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
Ha, the same applies to the NBD code too.

I'll do this tomorrow, and I also think that an audit of the
other drivers could reveal something interesting.

Den
Denis V. Lunev Feb. 3, 2015, 7:31 a.m. UTC | #3
On 02/02/15 23:46, Denis V. Lunev wrote:
> On 02/02/15 23:40, Peter Lieven wrote:
>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
>>> qemu_gluster_co_discard calculates size to discard as follows
>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>      ret = glfs_discard_async(s->fd, offset, size, 
>>> &gluster_finish_aiocb, acb);
>>>
>>> glfs_discard_async is declared as follows:
>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>                            glfs_io_cbk fn, void *data) __THROW
>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>
>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>> on i386.
>>>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Kevin Wolf <kwolf@redhat.com>
>>> CC: Peter Lieven <pl@kamp.de>
>>> ---
>>>   block/gluster.c | 9 +++++++++
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/block/gluster.c b/block/gluster.c
>>> index 1eb3a8c..8a8c153 100644
>>> --- a/block/gluster.c
>>> +++ b/block/gluster.c
>>> @@ -622,6 +622,11 @@ out:
>>>       return ret;
>>>   }
>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, 
>>> Error **errp)
>>> +{
>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>> +}
>>> +
>> Looking at the gluster code bl.max_transfer_length should have the 
>> same limit, but thats a different patch.
> ha, the same applies to nbd code too.
>
> I'll do this stuff tomorrow and also I think that some
> audit in other drivers could reveal something interesting.
>
> Den
OK. The situation is quite rotten here on i686.

The problem comes from the fact that QEMUIOVector
and iovec use size_t for lengths. All API calls use
this abstraction, so any conversion from nb_sectors
to a byte size can overflow at any moment.

Getting our hands dirty here is problematic from my point
of view. Should we really care about this? 32-bit
applications are becoming ancient history...
Den
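
To make the size_t concern above concrete, here is a small sketch using only
the POSIX struct iovec (no QEMU internals; the helper name sectors_to_iov is
made up for illustration). It shows where a 64-bit byte count silently loses
its upper bits on a 32-bit build, and how the truncation can be detected up
front:

#include <stdint.h>
#include <sys/uio.h>    /* struct iovec: iov_len has type size_t */

/* Hypothetical helper: fill an iovec from a sector count.  On i686,
 * size_t is 32 bits wide, so byte counts of 4 GiB and up would be
 * silently truncated by the assignment below; the explicit check turns
 * that into an error the caller can handle, e.g. by splitting. */
static int sectors_to_iov(struct iovec *iov, void *buf, int64_t nb_sectors)
{
    uint64_t bytes = (uint64_t)nb_sectors << 9;   /* BDRV_SECTOR_BITS == 9 */

    if (bytes != (size_t)bytes) {
        return -1;                                /* would not fit in size_t */
    }
    iov->iov_base = buf;
    iov->iov_len = (size_t)bytes;
    return 0;
}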
Peter Lieven Feb. 3, 2015, 11:30 a.m. UTC | #4
Am 03.02.2015 um 08:31 schrieb Denis V. Lunev:
> On 02/02/15 23:46, Denis V. Lunev wrote:
>> On 02/02/15 23:40, Peter Lieven wrote:
>>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>
>>>> glfs_discard_async is declared as follows:
>>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>                            glfs_io_cbk fn, void *data) __THROW
>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>
>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>> on i386.
>>>>
>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>> CC: Peter Lieven <pl@kamp.de>
>>>> ---
>>>>   block/gluster.c | 9 +++++++++
>>>>   1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>> index 1eb3a8c..8a8c153 100644
>>>> --- a/block/gluster.c
>>>> +++ b/block/gluster.c
>>>> @@ -622,6 +622,11 @@ out:
>>>>       return ret;
>>>>   }
>>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>> +{
>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>> +}
>>>> +
>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>> ha, the same applies to nbd code too.
>>
>> I'll do this stuff tomorrow and also I think that some
>> audit in other drivers could reveal something interesting.
>>
>> Den
> ok. The situation is well rotten here on i686.
>
> The problem comes from the fact that QEMUIOVector
> and iovec uses size_t as length. All API calls use
> this abstraction. Thus all conversion operations
> from nr_sectors to size could bang at any moment.
>
> Putting dirty hands here is problematic from my point
> of view. Should we really care about this? 32bit
> applications are becoming old good history of IT...

The host has to be 32-bit to be in trouble, and at least with KVM the host
has to support long mode.

It is on my todo list to add generic code for honouring bl.max_transfer_length
in block.c. We could change the default maximum for bl.max_transfer_length
from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS.
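
A rough sketch of how that could look; the helper bdrv_max_transfer_sectors
below is hypothetical, only the BlockLimits field and the constants come from
the discussion:

/* Hypothetical helper in block.c: the request path would clamp nb_sectors
 * to this value, so individual drivers would not need their own limit. */
static int bdrv_max_transfer_sectors(BlockDriverState *bs)
{
    /* Proposed default: SIZE_MAX >> BDRV_SECTOR_BITS keeps the byte count
     * within a size_t (the i686 concern), INT_MAX keeps nb_sectors sane. */
    uint64_t max = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);

    if (bs->bl.max_transfer_length) {
        max = MIN(max, (uint64_t)bs->bl.max_transfer_length);
    }
    return (int)max;
}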

Peter
Kevin Wolf Feb. 3, 2015, 11:37 a.m. UTC | #5
Am 03.02.2015 um 12:30 hat Peter Lieven geschrieben:
> Am 03.02.2015 um 08:31 schrieb Denis V. Lunev:
> > On 02/02/15 23:46, Denis V. Lunev wrote:
> >> On 02/02/15 23:40, Peter Lieven wrote:
> >>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
> >>>> qemu_gluster_co_discard calculates size to discard as follows
> >>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
> >>>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
> >>>>
> >>>> glfs_discard_async is declared as follows:
> >>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
> >>>>                            glfs_io_cbk fn, void *data) __THROW
> >>>> This is problematic on i686 as sizeof(size_t) == 4.
> >>>>
> >>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
> >>>> on i386.
> >>>>
> >>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
> >>>> CC: Kevin Wolf <kwolf@redhat.com>
> >>>> CC: Peter Lieven <pl@kamp.de>
> >>>> ---
> >>>>   block/gluster.c | 9 +++++++++
> >>>>   1 file changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/block/gluster.c b/block/gluster.c
> >>>> index 1eb3a8c..8a8c153 100644
> >>>> --- a/block/gluster.c
> >>>> +++ b/block/gluster.c
> >>>> @@ -622,6 +622,11 @@ out:
> >>>>       return ret;
> >>>>   }
> >>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>> +{
> >>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
> >>>> +}
> >>>> +
> >>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
> >> ha, the same applies to nbd code too.
> >>
> >> I'll do this stuff tomorrow and also I think that some
> >> audit in other drivers could reveal something interesting.
> >>
> >> Den
> > ok. The situation is well rotten here on i686.
> >
> > The problem comes from the fact that QEMUIOVector
> > and iovec uses size_t as length. All API calls use
> > this abstraction. Thus all conversion operations
> > from nr_sectors to size could bang at any moment.
> >
> > Putting dirty hands here is problematic from my point
> > of view. Should we really care about this? 32bit
> > applications are becoming old good history of IT...
> 
> The host has to be 32bit to be in trouble. And at least if we have KVM the host
> has to support long mode.
> 
> I have on my todo to add generic code for honouring bl.max_transfer_length
> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
> for bl.max_transfer_length.

So the conclusion is that we'll apply this series as it is and you'll
take care of the rest later?

Kevin
Peter Lieven Feb. 3, 2015, 11:47 a.m. UTC | #6
Am 03.02.2015 um 12:37 schrieb Kevin Wolf:
> Am 03.02.2015 um 12:30 hat Peter Lieven geschrieben:
>> Am 03.02.2015 um 08:31 schrieb Denis V. Lunev:
>>> On 02/02/15 23:46, Denis V. Lunev wrote:
>>>> On 02/02/15 23:40, Peter Lieven wrote:
>>>>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
>>>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>>>
>>>>>> glfs_discard_async is declared as follows:
>>>>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>>>                            glfs_io_cbk fn, void *data) __THROW
>>>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>>>
>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>>>> on i386.
>>>>>>
>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>>>> CC: Peter Lieven <pl@kamp.de>
>>>>>> ---
>>>>>>   block/gluster.c | 9 +++++++++
>>>>>>   1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>>>> index 1eb3a8c..8a8c153 100644
>>>>>> --- a/block/gluster.c
>>>>>> +++ b/block/gluster.c
>>>>>> @@ -622,6 +622,11 @@ out:
>>>>>>       return ret;
>>>>>>   }
>>>>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>> +{
>>>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>>>> +}
>>>>>> +
>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>>>> ha, the same applies to nbd code too.
>>>>
>>>> I'll do this stuff tomorrow and also I think that some
>>>> audit in other drivers could reveal something interesting.
>>>>
>>>> Den
>>> ok. The situation is well rotten here on i686.
>>>
>>> The problem comes from the fact that QEMUIOVector
>>> and iovec uses size_t as length. All API calls use
>>> this abstraction. Thus all conversion operations
>>> from nr_sectors to size could bang at any moment.
>>>
>>> Putting dirty hands here is problematic from my point
>>> of view. Should we really care about this? 32bit
>>> applications are becoming old good history of IT...
>> The host has to be 32bit to be in trouble. And at least if we have KVM the host
>> has to support long mode.
>>
>> I have on my todo to add generic code for honouring bl.max_transfer_length
>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
>> for bl.max_transfer_length.
> So the conclusion is that we'll apply this series as it is and you'll
> take care of the rest later?

Yes, and actually we need a macro like

#define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)

as the limit for everything, because bdrv_check_byte_request already takes a size_t argument,
so we could already create an overflow in bdrv_check_request when we convert
nb_sectors to size_t.

I will create a patch to catch at least this overflow shortly.
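
For illustration, roughly how such a macro could be wired into
bdrv_check_request (a sketch only; the check actually in the tree is quoted
later in this thread and bounds nb_sectors by INT_MAX / BDRV_SECTOR_SIZE):

#define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)

/* Sketch: one named limit for every request path instead of ad-hoc
 * per-driver or per-function bounds. */
static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
                              int nb_sectors)
{
    if (nb_sectors < 0 || nb_sectors > BDRV_MAX_REQUEST_SECTORS) {
        return -EIO;
    }
    return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE,
                                   nb_sectors * BDRV_SECTOR_SIZE);
}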

Peter
Denis V. Lunev Feb. 3, 2015, 11:51 a.m. UTC | #7
On 03/02/15 14:47, Peter Lieven wrote:
> Am 03.02.2015 um 12:37 schrieb Kevin Wolf:
>> Am 03.02.2015 um 12:30 hat Peter Lieven geschrieben:
>>> Am 03.02.2015 um 08:31 schrieb Denis V. Lunev:
>>>> On 02/02/15 23:46, Denis V. Lunev wrote:
>>>>> On 02/02/15 23:40, Peter Lieven wrote:
>>>>>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
>>>>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>>>>       size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>>>>       ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>>>>
>>>>>>> glfs_discard_async is declared as follows:
>>>>>>>     int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>>>>                             glfs_io_cbk fn, void *data) __THROW
>>>>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>>>>
>>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>>>>> on i386.
>>>>>>>
>>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>>>>> CC: Peter Lieven <pl@kamp.de>
>>>>>>> ---
>>>>>>>    block/gluster.c | 9 +++++++++
>>>>>>>    1 file changed, 9 insertions(+)
>>>>>>>
>>>>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>>>>> index 1eb3a8c..8a8c153 100644
>>>>>>> --- a/block/gluster.c
>>>>>>> +++ b/block/gluster.c
>>>>>>> @@ -622,6 +622,11 @@ out:
>>>>>>>        return ret;
>>>>>>>    }
>>>>>>>    +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>> +{
>>>>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>>>>> +}
>>>>>>> +
>>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>>>>> ha, the same applies to nbd code too.
>>>>>
>>>>> I'll do this stuff tomorrow and also I think that some
>>>>> audit in other drivers could reveal something interesting.
>>>>>
>>>>> Den
>>>> ok. The situation is well rotten here on i686.
>>>>
>>>> The problem comes from the fact that QEMUIOVector
>>>> and iovec uses size_t as length. All API calls use
>>>> this abstraction. Thus all conversion operations
>>>> from nr_sectors to size could bang at any moment.
>>>>
>>>> Putting dirty hands here is problematic from my point
>>>> of view. Should we really care about this? 32bit
>>>> applications are becoming old good history of IT...
>>> The host has to be 32bit to be in trouble. And at least if we have KVM the host
>>> has to support long mode.
>>>
>>> I have on my todo to add generic code for honouring bl.max_transfer_length
>>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
>>> for bl.max_transfer_length.
>> So the conclusion is that we'll apply this series as it is and you'll
>> take care of the rest later?
> Yes, and actually we need a macro like
>
> #define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)
>
> as limit for everything. Because bdrv_check_byte_request already has a size_t argument.
> So we could already create an overflow in bdrv_check_request when we convert
> nb_sectors to size_t.
>
> I will create a patch to catch at least this overflow shortly.
>
> Peter
>
I like this macro :)

I vote to move MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX) into generic code
on the discard/write_zeroes paths immediately and drop this exact patch.
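
As a sketch of that idea (hypothetical, not actual block.c code; the wrapper
name bdrv_co_discard_split is invented), the generic path would split requests
against the limit itself, so drivers such as gluster never receive an
oversized size_t:

/* Hypothetical generic wrapper: split a discard into chunks that always
 * fit a 32-bit size_t before calling into the driver. */
static int coroutine_fn bdrv_co_discard_split(BlockDriverState *bs,
                                              int64_t sector_num,
                                              int nb_sectors)
{
    const int max = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);

    while (nb_sectors > 0) {
        int num = MIN(nb_sectors, max);
        int ret = bs->drv->bdrv_co_discard(bs, sector_num, num);
        if (ret < 0) {
            return ret;
        }
        sector_num += num;
        nb_sectors -= num;
    }
    return 0;
}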

Patch 2 of this set would be better off with the additional
+bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;

I'll wait for Peter's patch and respin on top of it to avoid unnecessary
commits.

Den
Peter Lieven Feb. 3, 2015, 12:13 p.m. UTC | #8
Am 03.02.2015 um 12:51 schrieb Denis V. Lunev:
> On 03/02/15 14:47, Peter Lieven wrote:
>> Am 03.02.2015 um 12:37 schrieb Kevin Wolf:
>>> Am 03.02.2015 um 12:30 hat Peter Lieven geschrieben:
>>>> Am 03.02.2015 um 08:31 schrieb Denis V. Lunev:
>>>>> On 02/02/15 23:46, Denis V. Lunev wrote:
>>>>>> On 02/02/15 23:40, Peter Lieven wrote:
>>>>>>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev:
>>>>>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>>>>>       size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>>>>>       ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>>>>>
>>>>>>>> glfs_discard_async is declared as follows:
>>>>>>>>     int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>>>>>                             glfs_io_cbk fn, void *data) __THROW
>>>>>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>>>>>
>>>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>>>>>> on i386.
>>>>>>>>
>>>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>>>>>> CC: Peter Lieven <pl@kamp.de>
>>>>>>>> ---
>>>>>>>>    block/gluster.c | 9 +++++++++
>>>>>>>>    1 file changed, 9 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>>>>>> index 1eb3a8c..8a8c153 100644
>>>>>>>> --- a/block/gluster.c
>>>>>>>> +++ b/block/gluster.c
>>>>>>>> @@ -622,6 +622,11 @@ out:
>>>>>>>>        return ret;
>>>>>>>>    }
>>>>>>>>    +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>>> +{
>>>>>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>>>>>> +}
>>>>>>>> +
>>>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>>>>>> ha, the same applies to nbd code too.
>>>>>>
>>>>>> I'll do this stuff tomorrow and also I think that some
>>>>>> audit in other drivers could reveal something interesting.
>>>>>>
>>>>>> Den
>>>>> ok. The situation is well rotten here on i686.
>>>>>
>>>>> The problem comes from the fact that QEMUIOVector
>>>>> and iovec uses size_t as length. All API calls use
>>>>> this abstraction. Thus all conversion operations
>>>>> from nr_sectors to size could bang at any moment.
>>>>>
>>>>> Putting dirty hands here is problematic from my point
>>>>> of view. Should we really care about this? 32bit
>>>>> applications are becoming old good history of IT...
>>>> The host has to be 32bit to be in trouble. And at least if we have KVM the host
>>>> has to support long mode.
>>>>
>>>> I have on my todo to add generic code for honouring bl.max_transfer_length
>>>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
>>>> for bl.max_transfer_length.
>>> So the conclusion is that we'll apply this series as it is and you'll
>>> take care of the rest later?
>> Yes, and actually we need a macro like
>>
>> #define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)
>>
>> as limit for everything. Because bdrv_check_byte_request already has a size_t argument.
>> So we could already create an overflow in bdrv_check_request when we convert
>> nb_sectors to size_t.
>>
>> I will create a patch to catch at least this overflow shortly.
>>
>> Peter
>>
> I like this macro :)

I see I was looking at an old checkout; meanwhile bdrv_check_request already has such a check:

static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
                              int nb_sectors)
{
    if (nb_sectors < 0 || nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
        return -EIO;
    }

    return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE,
                                   nb_sectors * BDRV_SECTOR_SIZE);
}

>
> I vote to move MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX) into generic code
> on discard/write_zero paths immediately and drop this exact patch.
>
> Patch 2 of this set would be better to have additional
> +bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
>
> I'll wait Peter's patch and respin on top of it to avoid unnecessary commits.
>
> Den

Patch

diff --git a/block/gluster.c b/block/gluster.c
index 1eb3a8c..8a8c153 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -622,6 +622,11 @@  out:
     return ret;
 }
 
+static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
+}
+
 #ifdef CONFIG_GLUSTERFS_DISCARD
 static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors)
@@ -735,6 +740,7 @@  static BlockDriver bdrv_gluster = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -762,6 +768,7 @@  static BlockDriver bdrv_gluster_tcp = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -789,6 +796,7 @@  static BlockDriver bdrv_gluster_unix = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -816,6 +824,7 @@  static BlockDriver bdrv_gluster_rdma = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif