diff mbox

qed: replace vm_clock with rt_clock for qemu-tool compatibility

Message ID 1330268124-10725-1-git-send-email-stefanha@linux.vnet.ibm.com
State New
Headers show

Commit Message

Stefan Hajnoczi Feb. 26, 2012, 2:55 p.m. UTC
The QED dirty bit timer marks the file clean after allocating writes
have drained.  This is cheaper than clearing/setting the dirty bit on
each allocating write because the timer introduces a grace period which
can be extended if more allocating writes arrive.

The vm_clock was used in an attempt to prevent modifying the image file
when live migration has stopped the VM.  Unfortunately vm_clock is
unavailable in the qemu-tool environment and will abort(3)!

Since QED currently does not support live migration, just replace
vm_clock with rt_clock and add comments explaining the migration
blocker.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
Zhi Yong: This patch is needed in addition to the qemu_init_main_loop() patches
you sent recently.  Without this patch QED may read the vm_clock, which calls
abort(3) in qemu-tool.c.  Together, our patches make QED work again in qemu-img
and qemu-io.

 block/qed.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)

Comments

Zhiyong Wu Feb. 27, 2012, 7:35 a.m. UTC | #1
On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> The QED dirty bit timer marks the file clean after allocating writes
> have drained.  This is cheaper than clearing/setting the dirty bit on
> each allocating write because the timer introduces a grace period which
> can be extended if more allocating writes arrive.
>
> The vm_clock was used in an attempt to prevent modifying the image file
> when live migration has stopped the VM.  Unfortunately vm_clock is
> unavailable in the qemu-tool environment and will abort(3)!
>
> Since QED currently does not support live migration, just replace
> vm_clock with rt_clock and add comments explaining the migration
> blocker.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Zhi Yong: This patch is needed in addition to the qemu_init_main_loop() patches
> you sent recently.  Without this patch QED may read the vm_clock, which calls
> abort(3) in qemu-tool.c.  Together, our patches make QED work again in qemu-img
Since vm_clock is created via qemu_init_main_loop(), when QED read
vm_clock, why will this call abort()?
Can you elaborate this? what is its call path?

> and qemu-io.
>
>  block/qed.c |   16 +++++++++++-----
>  1 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/block/qed.c b/block/qed.c
> index a041d31..fdb90e3 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -353,10 +353,7 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
>  {
>     trace_qed_start_need_check_timer(s);
>
> -    /* Use vm_clock so we don't alter the image file while suspended for
> -     * migration.
> -     */
> -    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(vm_clock) +
> +    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(rt_clock) +
>                    get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
>  }
>
> @@ -494,9 +491,18 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
>         }
>     }
>
> -    s->need_check_timer = qemu_new_timer_ns(vm_clock,
> +    s->need_check_timer = qemu_new_timer_ns(rt_clock,
>                                             qed_need_check_timer_cb, s);
>
> +    /* There are two issues with live migration:
> +     *
> +     * 1. The destination will open the image file and see the dirty bit is
> +     *    set, causing it to "repair" the image while the source still has it
> +     *    open for writing.
> +     *
> +     * 2. The timer used for clearing the dirty bit uses rt_clock and can in
> +     *    theory fire when the VM is not running during migration.
> +     */
>     error_set(&s->migration_blocker,
>               QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
>               "qed", bs->device_name, "live migration");
> --
> 1.7.9
>
>
Paolo Bonzini Feb. 27, 2012, 8:42 a.m. UTC | #2
On 02/27/2012 08:35 AM, Zhi Yong Wu wrote:
> Since vm_clock is created via qemu_init_main_loop(), when QED read
> vm_clock, why will this call abort()?
> Can you elaborate this? what is its call path?
> 

It will crash in cpu_get_clock() (in qemu-tool.c).

Paolo
Zhiyong Wu Feb. 27, 2012, 8:48 a.m. UTC | #3
On Mon, Feb 27, 2012 at 4:42 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/27/2012 08:35 AM, Zhi Yong Wu wrote:
>> Since vm_clock is created via qemu_init_main_loop(), when QED read
>> vm_clock, why will this call abort()?
>> Can you elaborate this? what is its call path?
>>
>
> It will crash in cpu_get_clock() (in qemu-tool.c).
nice, got it. thanks.
>
> Paolo
Zhiyong Wu Feb. 27, 2012, 2:20 p.m. UTC | #4
On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> The QED dirty bit timer marks the file clean after allocating writes
> have drained.  This is cheaper than clearing/setting the dirty bit on
> each allocating write because the timer introduces a grace period which
> can be extended if more allocating writes arrive.
>
> The vm_clock was used in an attempt to prevent modifying the image file
Why can vm_clock do this work but other clock can not? If you're
available, can you elaborate what the difference among the three
clocks is? such as vm_clock, rt_clock, and host_clock.

> when live migration has stopped the VM.  Unfortunately vm_clock is
> unavailable in the qemu-tool environment and will abort(3)!
>
> Since QED currently does not support live migration, just replace
> vm_clock with rt_clock and add comments explaining the migration
> blocker.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Zhi Yong: This patch is needed in addition to the qemu_init_main_loop() patches
> you sent recently.  Without this patch QED may read the vm_clock, which calls
> abort(3) in qemu-tool.c.  Together, our patches make QED work again in qemu-img
> and qemu-io.
>
>  block/qed.c |   16 +++++++++++-----
>  1 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/block/qed.c b/block/qed.c
> index a041d31..fdb90e3 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -353,10 +353,7 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
>  {
>     trace_qed_start_need_check_timer(s);
>
> -    /* Use vm_clock so we don't alter the image file while suspended for
> -     * migration.
> -     */
> -    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(vm_clock) +
> +    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(rt_clock) +
>                    get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
>  }
>
> @@ -494,9 +491,18 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
>         }
>     }
>
> -    s->need_check_timer = qemu_new_timer_ns(vm_clock,
> +    s->need_check_timer = qemu_new_timer_ns(rt_clock,
>                                             qed_need_check_timer_cb, s);
>
> +    /* There are two issues with live migration:
> +     *
> +     * 1. The destination will open the image file and see the dirty bit is
> +     *    set, causing it to "repair" the image while the source still has it
> +     *    open for writing.
> +     *
> +     * 2. The timer used for clearing the dirty bit uses rt_clock and can in
> +     *    theory fire when the VM is not running during migration.
> +     */
>     error_set(&s->migration_blocker,
>               QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
>               "qed", bs->device_name, "live migration");
> --
> 1.7.9
>
>
Stefan Hajnoczi Feb. 27, 2012, 2:41 p.m. UTC | #5
On Mon, Feb 27, 2012 at 2:20 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
> On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi
> <stefanha@linux.vnet.ibm.com> wrote:
>> The QED dirty bit timer marks the file clean after allocating writes
>> have drained.  This is cheaper than clearing/setting the dirty bit on
>> each allocating write because the timer introduces a grace period which
>> can be extended if more allocating writes arrive.
>>
>> The vm_clock was used in an attempt to prevent modifying the image file
> Why can vm_clock do this work but other clock can not? If you're
> available, can you elaborate what the difference among the three
> clocks is? such as vm_clock, rt_clock, and host_clock.

See qemu-timer.h.

Stefan
Zhiyong Wu Feb. 27, 2012, 3:17 p.m. UTC | #6
On Mon, Feb 27, 2012 at 10:41 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, Feb 27, 2012 at 2:20 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
>> On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi
>> <stefanha@linux.vnet.ibm.com> wrote:
>>> The QED dirty bit timer marks the file clean after allocating writes
>>> have drained.  This is cheaper than clearing/setting the dirty bit on
>>> each allocating write because the timer introduces a grace period which
>>> can be extended if more allocating writes arrive.
>>>
>>> The vm_clock was used in an attempt to prevent modifying the image file
>> Why can vm_clock do this work but other clock can not? If you're
>> available, can you elaborate what the difference among the three
>> clocks is? such as vm_clock, rt_clock, and host_clock.
>
> See qemu-timer.h.
thanks
>
> Stefan
diff mbox

Patch

diff --git a/block/qed.c b/block/qed.c
index a041d31..fdb90e3 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -353,10 +353,7 @@  static void qed_start_need_check_timer(BDRVQEDState *s)
 {
     trace_qed_start_need_check_timer(s);
 
-    /* Use vm_clock so we don't alter the image file while suspended for
-     * migration.
-     */
-    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(vm_clock) +
+    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(rt_clock) +
                    get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
 }
 
@@ -494,9 +491,18 @@  static int bdrv_qed_open(BlockDriverState *bs, int flags)
         }
     }
 
-    s->need_check_timer = qemu_new_timer_ns(vm_clock,
+    s->need_check_timer = qemu_new_timer_ns(rt_clock,
                                             qed_need_check_timer_cb, s);
 
+    /* There are two issues with live migration:
+     *
+     * 1. The destination will open the image file and see the dirty bit is
+     *    set, causing it to "repair" the image while the source still has it
+     *    open for writing.
+     *
+     * 2. The timer used for clearing the dirty bit uses rt_clock and can in
+     *    theory fire when the VM is not running during migration.
+     */
     error_set(&s->migration_blocker,
               QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
               "qed", bs->device_name, "live migration");