Message ID: 1330268124-10725-1-git-send-email-stefanha@linux.vnet.ibm.com
State: New
On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> wrote:
> The QED dirty bit timer marks the file clean after allocating writes
> have drained. This is cheaper than clearing/setting the dirty bit on
> each allocating write because the timer introduces a grace period which
> can be extended if more allocating writes arrive.
>
> The vm_clock was used in an attempt to prevent modifying the image file
> when live migration has stopped the VM. Unfortunately vm_clock is
> unavailable in the qemu-tool environment and will abort(3)!
>
> Since QED currently does not support live migration, just replace
> vm_clock with rt_clock and add comments explaining the migration
> blocker.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Zhi Yong: This patch is needed in addition to the qemu_init_main_loop() patches
> you sent recently. Without this patch QED may read the vm_clock, which calls
> abort(3) in qemu-tool.c. Together, our patches make QED work again in qemu-img

Since vm_clock is created via qemu_init_main_loop(), when QED read vm_clock, why will this call abort()? Can you elaborate this? what is its call path?

> and qemu-io.
>
>  block/qed.c |   16 +++++++++++-----
>  1 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/block/qed.c b/block/qed.c
> index a041d31..fdb90e3 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -353,10 +353,7 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
>  {
>      trace_qed_start_need_check_timer(s);
>
> -    /* Use vm_clock so we don't alter the image file while suspended for
> -     * migration.
> -     */
> -    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(vm_clock) +
> +    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(rt_clock) +
>                     get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
>  }
>
> @@ -494,9 +491,18 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
>          }
>      }
>
> -    s->need_check_timer = qemu_new_timer_ns(vm_clock,
> +    s->need_check_timer = qemu_new_timer_ns(rt_clock,
>                                              qed_need_check_timer_cb, s);
>
> +    /* There are two issues with live migration:
> +     *
> +     * 1. The destination will open the image file and see the dirty bit is
> +     *    set, causing it to "repair" the image while the source still has it
> +     *    open for writing.
> +     *
> +     * 2. The timer used for clearing the dirty bit uses rt_clock and can in
> +     *    theory fire when the VM is not running during migration.
> +     */
>      error_set(&s->migration_blocker,
>                QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
>                "qed", bs->device_name, "live migration");
> --
> 1.7.9
On 02/27/2012 08:35 AM, Zhi Yong Wu wrote:
> Since vm_clock is created via qemu_init_main_loop(), when QED read
> vm_clock, why will this call abort()?
> Can you elaborate this? what is its call path?

It will crash in cpu_get_clock() (in qemu-tool.c).

Paolo
On Mon, Feb 27, 2012 at 4:42 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/27/2012 08:35 AM, Zhi Yong Wu wrote:
>> Since vm_clock is created via qemu_init_main_loop(), when QED read
>> vm_clock, why will this call abort()?
>> Can you elaborate this? what is its call path?
>
> It will crash in cpu_get_clock() (in qemu-tool.c).

Nice, got it. Thanks.

> Paolo
On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> wrote:
> The QED dirty bit timer marks the file clean after allocating writes
> have drained. This is cheaper than clearing/setting the dirty bit on
> each allocating write because the timer introduces a grace period which
> can be extended if more allocating writes arrive.
>
> The vm_clock was used in an attempt to prevent modifying the image file

Why can vm_clock do this work but other clock can not? If you're available, can you elaborate what the difference among the three clocks is? such as vm_clock, rt_clock, and host_clock.

> when live migration has stopped the VM. Unfortunately vm_clock is
> unavailable in the qemu-tool environment and will abort(3)!
>
> [...]
On Mon, Feb 27, 2012 at 2:20 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
> On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi
> <stefanha@linux.vnet.ibm.com> wrote:
>> The QED dirty bit timer marks the file clean after allocating writes
>> have drained. This is cheaper than clearing/setting the dirty bit on
>> each allocating write because the timer introduces a grace period which
>> can be extended if more allocating writes arrive.
>>
>> The vm_clock was used in an attempt to prevent modifying the image file
> Why can vm_clock do this work but other clock can not? If you're
> available, can you elaborate what the difference among the three
> clocks is? such as vm_clock, rt_clock, and host_clock.

See qemu-timer.h.

Stefan
On Mon, Feb 27, 2012 at 10:41 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, Feb 27, 2012 at 2:20 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
>> On Sun, Feb 26, 2012 at 10:55 PM, Stefan Hajnoczi
>> <stefanha@linux.vnet.ibm.com> wrote:
>>> The QED dirty bit timer marks the file clean after allocating writes
>>> have drained. This is cheaper than clearing/setting the dirty bit on
>>> each allocating write because the timer introduces a grace period which
>>> can be extended if more allocating writes arrive.
>>>
>>> The vm_clock was used in an attempt to prevent modifying the image file
>> Why can vm_clock do this work but other clock can not? If you're
>> available, can you elaborate what the difference among the three
>> clocks is? such as vm_clock, rt_clock, and host_clock.
>
> See qemu-timer.h.

Thanks.

> Stefan
diff --git a/block/qed.c b/block/qed.c
index a041d31..fdb90e3 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -353,10 +353,7 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
 {
     trace_qed_start_need_check_timer(s);
 
-    /* Use vm_clock so we don't alter the image file while suspended for
-     * migration.
-     */
-    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(vm_clock) +
+    qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(rt_clock) +
                    get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
 }
 
@@ -494,9 +491,18 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
         }
     }
 
-    s->need_check_timer = qemu_new_timer_ns(vm_clock,
+    s->need_check_timer = qemu_new_timer_ns(rt_clock,
                                             qed_need_check_timer_cb, s);
 
+    /* There are two issues with live migration:
+     *
+     * 1. The destination will open the image file and see the dirty bit is
+     *    set, causing it to "repair" the image while the source still has it
+     *    open for writing.
+     *
+     * 2. The timer used for clearing the dirty bit uses rt_clock and can in
+     *    theory fire when the VM is not running during migration.
+     */
     error_set(&s->migration_blocker,
               QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
               "qed", bs->device_name, "live migration");
The QED dirty bit timer marks the file clean after allocating writes have drained. This is cheaper than clearing/setting the dirty bit on each allocating write because the timer introduces a grace period which can be extended if more allocating writes arrive.

The vm_clock was used in an attempt to prevent modifying the image file when live migration has stopped the VM. Unfortunately vm_clock is unavailable in the qemu-tool environment and will abort(3)!

Since QED currently does not support live migration, just replace vm_clock with rt_clock and add comments explaining the migration blocker.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
Zhi Yong: This patch is needed in addition to the qemu_init_main_loop() patches you sent recently. Without this patch QED may read the vm_clock, which calls abort(3) in qemu-tool.c. Together, our patches make QED work again in qemu-img and qemu-io.

 block/qed.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)