block: fix initialization in bdrv_io_limits_enable()

Submitted by Peter Lieven on Jan. 11, 2013, 12:29 p.m.

Details

Message ID 50F005C3.5010305@dlhnet.de
State New
Headers show

Commit Message

Peter Lieven Jan. 11, 2013, 12:29 p.m.
bdrv_io_limits_enable() starts a new slice, but does not set io_base
correctly for that slice.

Here is how io_base is used:

    bytes_base  = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
    bytes_res   = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;

    if (bytes_base + bytes_res <= bytes_limit) {
        /* no wait */
    } else {
        /* operation needs to be throttled */
    }

As a result, any I/O operations that are triggered between now and
bs->slice_end are incorrectly limited.  If 10 MB of data has been
written since the VM was started, QEMU thinks that 10 MB of data has
been written in this slice. This leads to a I/O lockup in the guest.

We fix this by delaying the start of a new slice to the next
call of bdrv_exceed_io_limits().

Signed-off-by: Peter Lieven <pl@kamp.de>
---
  block.c |    4 ----
  1 file changed, 4 deletions(-)

Comments

Paolo Bonzini Jan. 11, 2013, 12:57 p.m.
Il 11/01/2013 13:29, Peter Lieven ha scritto:
> bdrv_io_limits_enable() starts a new slice, but does not set io_base
> correctly for that slice.
> 
> Here is how io_base is used:
> 
>    bytes_base  = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
>    bytes_res   = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
> 
>    if (bytes_base + bytes_res <= bytes_limit) {
>        /* no wait */
>    } else {
>        /* operation needs to be throttled */
>    }
> 
> As a result, any I/O operations that are triggered between now and
> bs->slice_end are incorrectly limited.  If 10 MB of data has been
> written since the VM was started, QEMU thinks that 10 MB of data has
> been written in this slice. This leads to a I/O lockup in the guest.
> 
> We fix this by delaying the start of a new slice to the next
> call of bdrv_exceed_io_limits().
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 4e28c55..a40a389 100644
> --- a/block.c
> +++ b/block.c
> @@ -155,10 +155,6 @@ void bdrv_io_limits_enable(BlockDriverState *bs)
>  {
>      qemu_co_queue_init(&bs->throttled_reqs);
>      bs->block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
> -    bs->slice_time  = 5 * BLOCK_IO_SLICE_TIME;
> -    bs->slice_start = qemu_get_clock_ns(vm_clock);
> -    bs->slice_end   = bs->slice_start + bs->slice_time;
> -    memset(&bs->io_base, 0, sizeof(bs->io_base));
>      bs->io_limits_enabled = true;
>  }
> 

Thanks!

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo
Stefan Hajnoczi Jan. 15, 2013, 8:26 a.m.
On Fri, Jan 11, 2013 at 01:29:55PM +0100, Peter Lieven wrote:
> bdrv_io_limits_enable() starts a new slice, but does not set io_base
> correctly for that slice.
> 
> Here is how io_base is used:
> 
>    bytes_base  = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
>    bytes_res   = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
> 
>    if (bytes_base + bytes_res <= bytes_limit) {
>        /* no wait */
>    } else {
>        /* operation needs to be throttled */
>    }
> 
> As a result, any I/O operations that are triggered between now and
> bs->slice_end are incorrectly limited.  If 10 MB of data has been
> written since the VM was started, QEMU thinks that 10 MB of data has
> been written in this slice. This leads to a I/O lockup in the guest.
> 
> We fix this by delaying the start of a new slice to the next
> call of bdrv_exceed_io_limits().
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    4 ----
>  1 file changed, 4 deletions(-)

I resolved a merge conflict due to whitespace issues with your patch.
Perhaps you copy-pasted from a terminal into an email client.  Please
try git-send-email(1) or double-check that spaces haven't changed before
sending.

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

Patch hide | download patch | download mbox

diff --git a/block.c b/block.c
index 4e28c55..a40a389 100644
--- a/block.c
+++ b/block.c
@@ -155,10 +155,6 @@  void bdrv_io_limits_enable(BlockDriverState *bs)
  {
      qemu_co_queue_init(&bs->throttled_reqs);
      bs->block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
-    bs->slice_time  = 5 * BLOCK_IO_SLICE_TIME;
-    bs->slice_start = qemu_get_clock_ns(vm_clock);
-    bs->slice_end   = bs->slice_start + bs->slice_time;
-    memset(&bs->io_base, 0, sizeof(bs->io_base));
      bs->io_limits_enabled = true;
  }