Patchwork block: prevent multiwrite_merge from creating too large iovecs

Submitter Christoph Hellwig
Date Jan. 19, 2010, 9:15 p.m.
Message ID <20100119211539.GA4383@lst.de>
Permalink /patch/43229/
State New
Headers show

Comments

Christoph Hellwig - Jan. 19, 2010, 9:15 p.m.
If we go over the maximum number of iovecs supported by the syscall, we get
back EINVAL from the kernel, which translates to I/O errors for the guest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Kevin Wolf - Jan. 20, 2010, 11:37 a.m.
Am 19.01.2010 22:15, schrieb Christoph Hellwig:
> If we go over the maximum number of iovecs supported by the syscall, we get
> back EINVAL from the kernel, which translates to I/O errors for the guest.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Is this really enough? We don't check for IOV_MAX in any other place, so
can't we get a too big request directly from virtio-blk?

What about handling it transparently in qemu_pwritev/preadv and
laio_submit? Logically, it's a limitation of the backend anyway and not
a generic restriction.

To underline that it's a backend/platform dependent thing: Your patch
breaks the mingw build for me.

Kevin
Anthony Liguori - Jan. 20, 2010, 2:56 p.m.
On 01/19/2010 03:15 PM, Christoph Hellwig wrote:
> If we go over the maximum number of iovecs supported by the syscall, we get
> back EINVAL from the kernel, which translates to I/O errors for the guest.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>    

Applied.  Thanks.

Regards,

Anthony Liguori
> Index: qemu/block.c
> ===================================================================
> --- qemu.orig/block.c	2010-01-19 22:10:19.797003226 +0100
> +++ qemu/block.c	2010-01-19 22:11:08.226005767 +0100
> @@ -1711,6 +1711,10 @@ static int multiwrite_merge(BlockDriverS
>               merge = bs->drv->bdrv_merge_requests(bs, &reqs[outidx], &reqs[i]);
>           }
>
> +        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 > IOV_MAX) {
> +            merge = 0;
> +        }
> +
>           if (merge) {
>               size_t size;
>               QEMUIOVector *qiov = qemu_mallocz(sizeof(*qiov));
>
Christoph Hellwig - Jan. 20, 2010, 4:24 p.m.
On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
> Am 19.01.2010 22:15, schrieb Christoph Hellwig:
> > If we go over the maximum number of iovecs supported by the syscall, we get
> > back EINVAL from the kernel, which translates to I/O errors for the guest.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Is this really enough? We don't check for IOV_MAX in any other place, so
> can't we get a too big request directly from virtio-blk?

Currently the virtqueue is limited to 1024 iovecs, but I plan to put in
some better infrastructure to deal with the queue limit.  For now this
patch fixes an issue that we see with real-life setups.
Kevin Wolf - Jan. 20, 2010, 4:29 p.m.
Am 20.01.2010 17:24, schrieb Christoph Hellwig:
> On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
>> Am 19.01.2010 22:15, schrieb Christoph Hellwig:
>>> If we go over the maximum number of iovecs supported by the syscall, we get
>>> back EINVAL from the kernel, which translates to I/O errors for the guest.
>>>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>
>> Is this really enough? We don't check for IOV_MAX in any other place, so
>> can't we get a too big request directly from virtio-blk?
> 
> Currently the virtqueue is limited to 1024 iovecs, but I plan to put in
> some better infrastructure to deal with the queue limit.  For now this
> patch fixes an issue that we see with real life setups.

Ok, if you're planning to replace it with something real, I'm not opposed
to using this as a quick fix in the meantime. However, it needs an
#ifdef for the mingw build breakage at least.

Kevin
Christoph Hellwig - Jan. 26, 2010, 9:21 a.m.
On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
> To underline that it's a backend/platform dependent thing: Your patch
> breaks the mingw build for me.

Actually that's because mingw is the usual piece of crap and doesn't
actually have any of the vector support you can expect from a normal
Unix system.

I can either throw in an #ifdef IOV_MAX around the check or fake one up
for mingw.  Do any of the maintainers have a preference for either
variant?
Anthony Liguori - Jan. 26, 2010, 1:08 p.m.
On 01/26/2010 03:21 AM, Christoph Hellwig wrote:
> On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
>    
>> To underline that it's a backend/platform dependent thing: Your patch
>> breaks the mingw build for me.
>>      
> Actually that's because mingw is the usual piece of crap and doesn't
> actually have any of the vector support you can expect from a normal
> Unix system.
>
> I can either throw in an #ifdef IOV_MAX around the check or fake one up
> for mingw.  Do any of the maintainers have a preference for either
> variant?
>    

grep for CONFIG_IOVEC in qemu-common.h and add a #define IOV_MAX.

mingw doesn't have iovec so we introduce a compat version.

Regards,

Anthony Liguori
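What Anthony describes could look roughly like the following addition near the existing compat definitions (the exact placement in qemu-common.h and the fallback value are assumptions; 1024 matches the Linux limit):

```c
/* Assumed sketch for qemu-common.h: mingw lacks <sys/uio.h>, so QEMU
 * already carries a compat struct iovec when CONFIG_IOVEC is unset.
 * Give it a fallback IOV_MAX too, so the multiwrite_merge check
 * compiles everywhere. */
#ifndef IOV_MAX
#define IOV_MAX 1024    /* assumed value; matches Linux's limit */
#endif
```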
Christoph Hellwig - Jan. 26, 2010, 1:42 p.m.
On Tue, Jan 26, 2010 at 07:08:20AM -0600, Anthony Liguori wrote:
> > I can either throw in an #ifdef IOV_MAX around the check or fake one up
> > for mingw.  Do any of the maintainers have a preference for either
> > variant?
> >   
> 
> grep for CONFIG_IOVEC in qemu-common.h and add a #define IOV_MAX.
> 
> mingw doesn't have iovec so we introduce a compat version.

Yes, that's what I meant by the second alternative above.

Patch

Index: qemu/block.c
===================================================================
--- qemu.orig/block.c	2010-01-19 22:10:19.797003226 +0100
+++ qemu/block.c	2010-01-19 22:11:08.226005767 +0100
@@ -1711,6 +1711,10 @@  static int multiwrite_merge(BlockDriverS
             merge = bs->drv->bdrv_merge_requests(bs, &reqs[outidx], &reqs[i]);
         }
 
+        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 > IOV_MAX) {
+            merge = 0;
+        }
+
         if (merge) {
             size_t size;
             QEMUIOVector *qiov = qemu_mallocz(sizeof(*qiov));