block: prevent multiwrite_merge from creating too large iovecs

Submitted by Christoph Hellwig on Jan. 19, 2010, 9:15 p.m.

Details

Message ID 20100119211539.GA4383@lst.de
State New
Headers show

Commit Message

Christoph Hellwig Jan. 19, 2010, 9:15 p.m.
If we go over the maximum number of iovecs support by syscall we get
back EINVAL from the kernel which translate to I/O errors for the guest.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Comments

Kevin Wolf Jan. 20, 2010, 11:37 a.m.
Am 19.01.2010 22:15, schrieb Christoph Hellwig:
> If we go over the maximum number of iovecs support by syscall we get
> back EINVAL from the kernel which translate to I/O errors for the guest.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Is this really enough? We don't check for IOV_MAX in any other place, so
can't we get a too big request directly from virtio-blk?

What about handling it transparently in qemu_pwritev/preadv and
laio_submit? Logically, it's a limitation of the backend anyway and not
a generic restriction.

To underline that it's a backend/platform dependent thing: Your patch
breaks the mingw build for me.

Kevin
Anthony Liguori Jan. 20, 2010, 2:56 p.m.
On 01/19/2010 03:15 PM, Christoph Hellwig wrote:
> If we go over the maximum number of iovecs support by syscall we get
> back EINVAL from the kernel which translate to I/O errors for the guest.
>
> Signed-off-by: Christoph Hellwig<hch@lst.de>
>    

Applied.  Thanks.

Regards,

Anthony Liguori
> Index: qemu/block.c
> ===================================================================
> --- qemu.orig/block.c	2010-01-19 22:10:19.797003226 +0100
> +++ qemu/block.c	2010-01-19 22:11:08.226005767 +0100
> @@ -1711,6 +1711,10 @@ static int multiwrite_merge(BlockDriverS
>               merge = bs->drv->bdrv_merge_requests(bs,&reqs[outidx],&reqs[i]);
>           }
>
> +        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1>  IOV_MAX) {
> +            merge = 0;
> +        }
> +
>           if (merge) {
>               size_t size;
>               QEMUIOVector *qiov = qemu_mallocz(sizeof(*qiov));
>
>
>
>
Christoph Hellwig Jan. 20, 2010, 4:24 p.m.
On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
> Am 19.01.2010 22:15, schrieb Christoph Hellwig:
> > If we go over the maximum number of iovecs support by syscall we get
> > back EINVAL from the kernel which translate to I/O errors for the guest.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Is this really enough? We don't check for IOV_MAX in any other place, so
> can't we get a too big request directly from virtio-blk?

Currently the virtqueue is limited to 1024 iovecs, but I plan to put in
some better infrastructure to deal with the queue limit.  For now this
patch fixes an issue that we see with real life setups.
Kevin Wolf Jan. 20, 2010, 4:29 p.m.
Am 20.01.2010 17:24, schrieb Christoph Hellwig:
> On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
>> Am 19.01.2010 22:15, schrieb Christoph Hellwig:
>>> If we go over the maximum number of iovecs support by syscall we get
>>> back EINVAL from the kernel which translate to I/O errors for the guest.
>>>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>
>> Is this really enough? We don't check for IOV_MAX in any other place, so
>> can't we get a too big request directly from virtio-blk?
> 
> Currently the virtqueue is limited to 1024 iovecs, but I plan to put in
> some better infrastructure to deal with the queue limit.  For now this
> patch fixes an issue that we see with real life setups.

Ok, if you're planning to replace it by something real, I'm not opposed
to using this as a quick fix for the meantime. However, it needs an
#ifdef for the mingw build breakage at least.

Kevin
Christoph Hellwig Jan. 26, 2010, 9:21 a.m.
On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
> To underline that it's a backend/platform dependent thing: Your patch
> breaks the mingw build for me.

Actually that's because mingw is the usual piece of crap and doesn't
actually have any of the vector support you can expect from a normal
Unix system.

I can either throw in an #ifdef IOV_MAX around the check or fake one up
for mingw.  Does any of the maintainers have a preference for either
variant?
Anthony Liguori Jan. 26, 2010, 1:08 p.m.
On 01/26/2010 03:21 AM, Christoph Hellwig wrote:
> On Wed, Jan 20, 2010 at 12:37:51PM +0100, Kevin Wolf wrote:
>    
>> To underline that it's a backend/platform dependent thing: Your patch
>> breaks the mingw build for me.
>>      
> Actually that's because mingw is the usual piece of crap and doesn't
> actually have any of the vector support you can expect from a normal
> Unix system.
>
> I can either throw in an #ifdef IOV_MAX around the check or fake one up
> for mingw.  Does any of the maintainers have a preference for either
> variant?
>    

grep for CONFIG_IOVEC in qemu-common.h and add a #define IOV_MAX.

mingw doesn't have iovec so we introduce a compat version.

Regards,

Anthony Liguori
Christoph Hellwig Jan. 26, 2010, 1:42 p.m.
On Tue, Jan 26, 2010 at 07:08:20AM -0600, Anthony Liguori wrote:
> >I can either throw in an #ifdef IOV_MAX around the check or fake one up
> >for mingw.  Does any of the maintainers have a preference for either
> >variant?
> >   
> 
> grep for CONFIG_IOVEC in qemu-common.h and add a #define IOV_MAX.
> 
> mingw doesn't have iovec so we introduce a compat version.

Yes, that's what I meant with the second alternative above.

Patch hide | download patch | download mbox

Index: qemu/block.c
===================================================================
--- qemu.orig/block.c	2010-01-19 22:10:19.797003226 +0100
+++ qemu/block.c	2010-01-19 22:11:08.226005767 +0100
@@ -1711,6 +1711,10 @@  static int multiwrite_merge(BlockDriverS
             merge = bs->drv->bdrv_merge_requests(bs, &reqs[outidx], &reqs[i]);
         }
 
+        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 > IOV_MAX) {
+            merge = 0;
+        }
+
         if (merge) {
             size_t size;
             QEMUIOVector *qiov = qemu_mallocz(sizeof(*qiov));