Patchwork [v2,4/7] block: take lock around bdrv_read implementations

login
register
mail settings
Submitter Paolo Bonzini
Date Oct. 20, 2011, 11:16 a.m.
Message ID <1319109385-7927-5-git-send-email-pbonzini@redhat.com>
Download mbox | patch
Permalink /patch/120796/
State New
Headers show

Comments

Paolo Bonzini - Oct. 20, 2011, 11:16 a.m.
This does the first part of the conversion to coroutines, by
wrapping bdrv_read implementations to take the read side of the
rwlock.

Drivers that implement bdrv_read rather than bdrv_co_readv can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.

raw-win32 does not need the lock, because it cannot yield.
nbd also doesn't probably, but better be safe.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/bochs.c     |   13 ++++++++++++-
 block/cloop.c     |   13 ++++++++++++-
 block/cow.c       |   13 ++++++++++++-
 block/dmg.c       |   13 ++++++++++++-
 block/nbd.c       |   13 ++++++++++++-
 block/parallels.c |   13 ++++++++++++-
 block/vmdk.c      |   13 ++++++++++++-
 block/vpc.c       |   13 ++++++++++++-
 block/vvfat.c     |   13 ++++++++++++-
 9 files changed, 108 insertions(+), 9 deletions(-)
Avi Kivity - Nov. 6, 2011, 2:27 p.m.
On 10/20/2011 01:16 PM, Paolo Bonzini wrote:
> This does the first part of the conversion to coroutines, by
> wrapping bdrv_read implementations to take the read side of the
> rwlock.
>
> Drivers that implement bdrv_read rather than bdrv_co_readv can
> then benefit from asynchronous operation (at least if the underlying
> protocol supports it, which is not the case for raw-win32), even
> though they still operate with a bounce buffer.
>
> raw-win32 does not need the lock, because it cannot yield.
> nbd also doesn't probably, but better be safe.

This patch (2914caa088e3fbbd) breaks autotest when a guest reboots after
install; instead of rebooting, the guest is stuck in the bootloader or
kernel.

This was discovered in qemu-kvm, but applies to plain qemu too.  The
commit above is broken, it's parent is good.
Paolo Bonzini - Nov. 6, 2011, 5:25 p.m.
On 11/06/2011 03:27 PM, Avi Kivity wrote:
> On 10/20/2011 01:16 PM, Paolo Bonzini wrote:
>> This does the first part of the conversion to coroutines, by
>> wrapping bdrv_read implementations to take the read side of the
>> rwlock.
>>
>> Drivers that implement bdrv_read rather than bdrv_co_readv can
>> then benefit from asynchronous operation (at least if the underlying
>> protocol supports it, which is not the case for raw-win32), even
>> though they still operate with a bounce buffer.
>>
>> raw-win32 does not need the lock, because it cannot yield.
>> nbd also doesn't probably, but better be safe.
>
> This patch (2914caa088e3fbbd) breaks autotest when a guest reboots after
> install; instead of rebooting, the guest is stuck in the bootloader or
> kernel.

Are any of these formats used by autotest?

  block/bochs.c     |   13 ++++++++++++-
  block/cloop.c     |   13 ++++++++++++-
  block/cow.c       |   13 ++++++++++++-
  block/dmg.c       |   13 ++++++++++++-
  block/nbd.c       |   13 ++++++++++++-
  block/parallels.c |   13 ++++++++++++-
  block/vmdk.c      |   13 ++++++++++++-
  block/vpc.c       |   13 ++++++++++++-
  block/vvfat.c     |   13 ++++++++++++-
  9 files changed, 108 insertions(+), 9 deletions(-)

Perhaps the failure is only reproduced 80-90% of the time and this 
screws up the bisection.

Paolo
Avi Kivity - Nov. 6, 2011, 5:29 p.m.
On 11/06/2011 07:25 PM, Paolo Bonzini wrote:
> On 11/06/2011 03:27 PM, Avi Kivity wrote:
>> On 10/20/2011 01:16 PM, Paolo Bonzini wrote:
>>> This does the first part of the conversion to coroutines, by
>>> wrapping bdrv_read implementations to take the read side of the
>>> rwlock.
>>>
>>> Drivers that implement bdrv_read rather than bdrv_co_readv can
>>> then benefit from asynchronous operation (at least if the underlying
>>> protocol supports it, which is not the case for raw-win32), even
>>> though they still operate with a bounce buffer.
>>>
>>> raw-win32 does not need the lock, because it cannot yield.
>>> nbd also doesn't probably, but better be safe.
>>
>> This patch (2914caa088e3fbbd) breaks autotest when a guest reboots after
>> install; instead of rebooting, the guest is stuck in the bootloader or
>> kernel.
>
> Are any of these formats used by autotest?
>

It's configurable; in my case, qcow2.

>  block/bochs.c     |   13 ++++++++++++-
>  block/cloop.c     |   13 ++++++++++++-
>  block/cow.c       |   13 ++++++++++++-
>  block/dmg.c       |   13 ++++++++++++-
>  block/nbd.c       |   13 ++++++++++++-
>  block/parallels.c |   13 ++++++++++++-
>  block/vmdk.c      |   13 ++++++++++++-
>  block/vpc.c       |   13 ++++++++++++-
>  block/vvfat.c     |   13 ++++++++++++-
>  9 files changed, 108 insertions(+), 9 deletions(-)
>

So no.

> Perhaps the failure is only reproduced 80-90% of the time and this
> screws up the bisection.

I thought I checked the before/after commit, but looking at the
diffstat, that's doesn't make sense.

On a related note, booting with -cdrom http://blah seems broken.
Kevin Wolf - Nov. 7, 2011, 9:12 a.m.
Am 06.11.2011 15:27, schrieb Avi Kivity:
> On 10/20/2011 01:16 PM, Paolo Bonzini wrote:
>> This does the first part of the conversion to coroutines, by
>> wrapping bdrv_read implementations to take the read side of the
>> rwlock.
>>
>> Drivers that implement bdrv_read rather than bdrv_co_readv can
>> then benefit from asynchronous operation (at least if the underlying
>> protocol supports it, which is not the case for raw-win32), even
>> though they still operate with a bounce buffer.
>>
>> raw-win32 does not need the lock, because it cannot yield.
>> nbd also doesn't probably, but better be safe.
> 
> This patch (2914caa088e3fbbd) breaks autotest when a guest reboots after
> install; instead of rebooting, the guest is stuck in the bootloader or
> kernel.
> 
> This was discovered in qemu-kvm, but applies to plain qemu too.  The
> commit above is broken, it's parent is good.

Does the autotest case use any of the block drivers that are changed by
this patch? I would be surprised to learn that, but otherwise it doesn't
make sense to me.

 block/bochs.c     |   13 ++++++++++++-
 block/cloop.c     |   13 ++++++++++++-
 block/cow.c       |   13 ++++++++++++-
 block/dmg.c       |   13 ++++++++++++-
 block/nbd.c       |   13 ++++++++++++-
 block/parallels.c |   13 ++++++++++++-
 block/vmdk.c      |   13 ++++++++++++-
 block/vpc.c       |   13 ++++++++++++-
 block/vvfat.c     |   13 ++++++++++++-

Kevin
Avi Kivity - Nov. 7, 2011, 10:26 p.m.
On 11/06/2011 07:25 PM, Paolo Bonzini wrote:
>
> Perhaps the failure is only reproduced 80-90% of the time and this
> screws up the bisection.

Correct, bad bisect.  It actually reproduces reliably, when the bisecter
is reliable.

The new candidate (disclaimer: unreliable bisecter) is:

commit a5c57d64aa61b700db444c4864a1da11f1165db6
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Mon Sep 12 14:40:36 2011 +0200

    qemu-timer: do not refer to runstate_is_running()
   
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avi Kivity - Nov. 7, 2011, 11:12 p.m.
On 11/08/2011 12:26 AM, Avi Kivity wrote:
> On 11/06/2011 07:25 PM, Paolo Bonzini wrote:
> >
> > Perhaps the failure is only reproduced 80-90% of the time and this
> > screws up the bisection.
>
> Correct, bad bisect.  It actually reproduces reliably, when the bisecter
> is reliable.
>
> The new candidate (disclaimer: unreliable bisecter) is:
>
> commit a5c57d64aa61b700db444c4864a1da11f1165db6
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Mon Sep 12 14:40:36 2011 +0200
>
>     qemu-timer: do not refer to runstate_is_running()
>    
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
>

Fixed by 47113ab6b8c5659a, as Anthony pointed out on IRC.

Patch

diff --git a/block/bochs.c b/block/bochs.c
index b0f8072..f49b821 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -209,6 +209,17 @@  static int bochs_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int bochs_co_read(BlockDriverState *bs, int64_t sector_num,
+                                      uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVBochsState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = bochs_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static void bochs_close(BlockDriverState *bs)
 {
     BDRVBochsState *s = bs->opaque;
@@ -220,7 +231,7 @@  static BlockDriver bdrv_bochs = {
     .instance_size	= sizeof(BDRVBochsState),
     .bdrv_probe		= bochs_probe,
     .bdrv_open		= bochs_open,
-    .bdrv_read		= bochs_read,
+    .bdrv_read          = bochs_co_read,
     .bdrv_close		= bochs_close,
 };
 
diff --git a/block/cloop.c b/block/cloop.c
index a91f372..68d45b0 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -146,6 +146,17 @@  static int cloop_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
+                                      uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVCloopState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = cloop_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static void cloop_close(BlockDriverState *bs)
 {
     BDRVCloopState *s = bs->opaque;
@@ -161,7 +172,7 @@  static BlockDriver bdrv_cloop = {
     .instance_size	= sizeof(BDRVCloopState),
     .bdrv_probe		= cloop_probe,
     .bdrv_open		= cloop_open,
-    .bdrv_read		= cloop_read,
+    .bdrv_read          = cloop_co_read,
     .bdrv_close		= cloop_close,
 };
 
diff --git a/block/cow.c b/block/cow.c
index 2f426e7..e57378d 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -201,6 +201,17 @@  static int cow_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
+                                    uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVCowState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = cow_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static int cow_write(BlockDriverState *bs, int64_t sector_num,
                      const uint8_t *buf, int nb_sectors)
 {
@@ -308,7 +319,7 @@  static BlockDriver bdrv_cow = {
     .instance_size	= sizeof(BDRVCowState),
     .bdrv_probe		= cow_probe,
     .bdrv_open		= cow_open,
-    .bdrv_read		= cow_read,
+    .bdrv_read          = cow_co_read,
     .bdrv_write		= cow_write,
     .bdrv_close		= cow_close,
     .bdrv_create	= cow_create,
diff --git a/block/dmg.c b/block/dmg.c
index 111aeae..f46306f 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -282,6 +282,17 @@  static int dmg_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
+                                    uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVDMGState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = dmg_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static void dmg_close(BlockDriverState *bs)
 {
     BDRVDMGState *s = bs->opaque;
@@ -302,7 +313,7 @@  static BlockDriver bdrv_dmg = {
     .instance_size	= sizeof(BDRVDMGState),
     .bdrv_probe		= dmg_probe,
     .bdrv_open		= dmg_open,
-    .bdrv_read		= dmg_read,
+    .bdrv_read          = dmg_co_read,
     .bdrv_close		= dmg_close,
 };
 
diff --git a/block/nbd.c b/block/nbd.c
index 14ab225..d5ba95d 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -240,6 +240,17 @@  static int nbd_write(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int nbd_co_read(BlockDriverState *bs, int64_t sector_num,
+                                    uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVNBDState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = nbd_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static void nbd_close(BlockDriverState *bs)
 {
     BDRVNBDState *s = bs->opaque;
@@ -260,7 +271,7 @@  static BlockDriver bdrv_nbd = {
     .format_name	= "nbd",
     .instance_size	= sizeof(BDRVNBDState),
     .bdrv_file_open	= nbd_open,
-    .bdrv_read		= nbd_read,
+    .bdrv_read          = nbd_co_read,
     .bdrv_write		= nbd_write,
     .bdrv_close		= nbd_close,
     .bdrv_getlength	= nbd_getlength,
diff --git a/block/parallels.c b/block/parallels.c
index b86e87e..3baf617 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -136,6 +136,17 @@  static int parallels_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int parallels_co_read(BlockDriverState *bs, int64_t sector_num,
+                                          uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVParallelsState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = parallels_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static void parallels_close(BlockDriverState *bs)
 {
     BDRVParallelsState *s = bs->opaque;
@@ -147,7 +158,7 @@  static BlockDriver bdrv_parallels = {
     .instance_size	= sizeof(BDRVParallelsState),
     .bdrv_probe		= parallels_probe,
     .bdrv_open		= parallels_open,
-    .bdrv_read		= parallels_read,
+    .bdrv_read          = parallels_co_read,
     .bdrv_close		= parallels_close,
 };
 
diff --git a/block/vmdk.c b/block/vmdk.c
index 1ce220d..0e791f2 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1024,6 +1024,17 @@  static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int vmdk_co_read(BlockDriverState *bs, int64_t sector_num,
+                                     uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVVmdkState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = vmdk_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
                      const uint8_t *buf, int nb_sectors)
 {
@@ -1542,7 +1553,7 @@  static BlockDriver bdrv_vmdk = {
     .instance_size  = sizeof(BDRVVmdkState),
     .bdrv_probe     = vmdk_probe,
     .bdrv_open      = vmdk_open,
-    .bdrv_read      = vmdk_read,
+    .bdrv_read      = vmdk_co_read,
     .bdrv_write     = vmdk_write,
     .bdrv_close     = vmdk_close,
     .bdrv_create    = vmdk_create,
diff --git a/block/vpc.c b/block/vpc.c
index 5414042..e91e397 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -409,6 +409,17 @@  static int vpc_read(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
+static coroutine_fn int vpc_co_read(BlockDriverState *bs, int64_t sector_num,
+                                    uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVVPCState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = vpc_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 static int vpc_write(BlockDriverState *bs, int64_t sector_num,
     const uint8_t *buf, int nb_sectors)
 {
@@ -649,7 +660,7 @@  static BlockDriver bdrv_vpc = {
     .instance_size  = sizeof(BDRVVPCState),
     .bdrv_probe     = vpc_probe,
     .bdrv_open      = vpc_open,
-    .bdrv_read      = vpc_read,
+    .bdrv_read      = vpc_co_read,
     .bdrv_write     = vpc_write,
     .bdrv_flush     = vpc_flush,
     .bdrv_close     = vpc_close,
diff --git a/block/vvfat.c b/block/vvfat.c
index 6f666d6..4501ccc 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1279,6 +1279,17 @@  DLOG(fprintf(stderr, "sector %d not allocated\n", (int)sector_num));
     return 0;
 }
 
+static coroutine_fn int vvfat_co_read(BlockDriverState *bs, int64_t sector_num,
+                                      uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVVVFATState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = vvfat_read(bs, sector_num, buf, nb_sectors);
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
 /* LATER TODO: statify all functions */
 
 /*
@@ -2803,7 +2814,7 @@  static BlockDriver bdrv_vvfat = {
     .format_name	= "vvfat",
     .instance_size	= sizeof(BDRVVVFATState),
     .bdrv_file_open	= vvfat_open,
-    .bdrv_read		= vvfat_read,
+    .bdrv_read          = vvfat_co_read,
     .bdrv_write		= vvfat_write,
     .bdrv_close		= vvfat_close,
     .bdrv_is_allocated	= vvfat_is_allocated,