
[v2] throttle-groups: update tg->any_timer_armed[] on detach

Message ID 20170920101740.12490-1-stefanha@redhat.com
State New

Commit Message

Stefan Hajnoczi Sept. 20, 2017, 10:17 a.m. UTC
Clear tg->any_timer_armed[] when throttling timers are destroyed during
AioContext attach/detach.  Failure to do so causes throttling to hang
because we believe the timer is already scheduled!

The following was broken at least since QEMU 2.10.0 with -drive
iops=100:

  $ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
  (qemu) stop
  (qemu) cont
  ...I/O is stuck...

Reported-by: Yongxue Hong <yhong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v2:
 * Use timer_pending(tt->timers[i]) instead of token [Berto]
 * Fix s/destroy/destroyed/ typo [Eric]

 block/throttle-groups.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Alberto Garcia Sept. 20, 2017, 11:08 a.m. UTC | #1
On Wed 20 Sep 2017 12:17:40 PM CEST, Stefan Hajnoczi wrote:
> @@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>  {
>      ThrottleTimers *tt = &tgm->throttle_timers;
> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
> +
> +    qemu_mutex_lock(&tg->lock);
> +    if (timer_pending(tt->timers[0])) {
> +        tg->any_timer_armed[0] = false;
> +    }
> +    if (timer_pending(tt->timers[1])) {
> +        tg->any_timer_armed[1] = false;
> +    }
> +    qemu_mutex_unlock(&tg->lock);
> +
>      throttle_timers_detach_aio_context(tt);
>      tgm->aio_context = NULL;
>  }

I'm sorry that I didn't notice this in my previous e-mail, but after
this call you might have destroyed the timer that was set for that
throttling group, so if there are pending requests waiting, it can
happen that no one wakes them up.

I think that the queue needs to be restarted after this, probably after
having reattached the context (or actually after detaching it already,
but then what happens if you try to restart the queue while aio_context
is NULL?).

Berto
Alberto Garcia Sept. 20, 2017, 12:17 p.m. UTC | #2
On Wed 20 Sep 2017 01:39:02 PM CEST, Manos Pitsidianakis wrote:
>>>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>>>  {
>>>      ThrottleTimers *tt = &tgm->throttle_timers;
>>> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>>> +
>>> +    qemu_mutex_lock(&tg->lock);
>>> +    if (timer_pending(tt->timers[0])) {
>>> +        tg->any_timer_armed[0] = false;
>>> +    }
>>> +    if (timer_pending(tt->timers[1])) {
>>> +        tg->any_timer_armed[1] = false;
>>> +    }
>>> +    qemu_mutex_unlock(&tg->lock);
>>> +
>>>      throttle_timers_detach_aio_context(tt);
>>>      tgm->aio_context = NULL;
>>>  }
>>
>>I'm sorry that I didn't notice this in my previous e-mail, but after
>>this call you might have destroyed the timer that was set for that
>>throttling group, so if there are pending requests waiting it can
>>happen that no one wakes them up.
>>
>>I think that the queue needs to be restarted after this, probably
>>after having reattached the context (or actually after detaching it
>>already, but then what happens if you try to restart the queue while
>>aio_context is NULL?).
>
> aio_co_enter in the restart queue function requires that aio_context
> is non-NULL. Perhaps calling throttle_group_restart_tgm in
> throttle_group_attach_aio_context will suffice.

But can we guarantee that everything is safe between the _detach() and
_attach() calls?

Berto
Stefan Hajnoczi Sept. 20, 2017, 2:16 p.m. UTC | #3
On Wed, Sep 20, 2017 at 02:17:51PM +0200, Alberto Garcia wrote:
> On Wed 20 Sep 2017 01:39:02 PM CEST, Manos Pitsidianakis wrote:
> >>>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
> >>>  {
> >>>      ThrottleTimers *tt = &tgm->throttle_timers;
> >>> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
> >>> +
> >>> +    qemu_mutex_lock(&tg->lock);
> >>> +    if (timer_pending(tt->timers[0])) {
> >>> +        tg->any_timer_armed[0] = false;
> >>> +    }
> >>> +    if (timer_pending(tt->timers[1])) {
> >>> +        tg->any_timer_armed[1] = false;
> >>> +    }
> >>> +    qemu_mutex_unlock(&tg->lock);
> >>> +
> >>>      throttle_timers_detach_aio_context(tt);
> >>>      tgm->aio_context = NULL;
> >>>  }
> >>
> >>I'm sorry that I didn't notice this in my previous e-mail, but after
> >>this call you might have destroyed the timer that was set for that
> >>throttling group, so if there are pending requests waiting it can
> >>happen that no one wakes them up.
> >>
> >>I think that the queue needs to be restarted after this, probably
> >>after having reattached the context (or actually after detaching it
> >>already, but then what happens if you try to restart the queue while
> >>aio_context is NULL?).
> >
> > aio_co_enter in the restart queue function requires that aio_context
> > is non-NULL. Perhaps calling throttle_group_restart_tgm in
> > throttle_group_attach_aio_context will suffice.
> 
> But can we guarantee that everything is safe between the _detach() and
> _attach() calls?

Here is a second patch that I didn't send because I was unsure whether it's
needed.

The idea is to move the throttle_group_detach/attach_aio_context() into
bdrv_set_aio_context() so it becomes part of the bdrv_drained_begin/end()
region.

The guarantees we have inside the drained region:

1. Throttling has been temporarily disabled.
2. throttle_group_restart_tgm() has been called to kick throttled reqs.
3. All requests are completed.

This is a nice, controlled environment to do the detach/attach.  That said,
Berto's point still stands: what about other ThrottleGroupMembers who have
throttled requests queued?

The main reason I didn't publish this patch is because Manos' "block: remove
legacy I/O throttling" series ought to remove this code anyway soon.

diff --git a/block/block-backend.c b/block/block-backend.c
index 45d9101be3..624eb3dc15 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1747,10 +1747,6 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
     ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
 
     if (bs) {
-        if (tgm->throttle_state) {
-            throttle_group_detach_aio_context(tgm);
-            throttle_group_attach_aio_context(tgm, new_context);
-        }
         bdrv_set_aio_context(bs, new_context);
     }
 }
@@ -2029,6 +2025,13 @@ static void blk_root_drained_end(BdrvChild *child)
     BlockBackend *blk = child->opaque;
     assert(blk->quiesce_counter);
 
+    if (tgm->throttle_state) {
+        AioContext *new_context = bdrv_get_aio_context(blk_bs(blk));
+
+        throttle_group_detach_aio_context(tgm);
+        throttle_group_attach_aio_context(tgm, new_context);
+    }
+
     assert(blk->public.throttle_group_member.io_limits_disabled);
     atomic_dec(&blk->public.throttle_group_member.io_limits_disabled);

Patch

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 6ba992c8d7..0a49f3c409 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -592,6 +592,17 @@  void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
 void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
 {
     ThrottleTimers *tt = &tgm->throttle_timers;
+    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
+
+    qemu_mutex_lock(&tg->lock);
+    if (timer_pending(tt->timers[0])) {
+        tg->any_timer_armed[0] = false;
+    }
+    if (timer_pending(tt->timers[1])) {
+        tg->any_timer_armed[1] = false;
+    }
+    qemu_mutex_unlock(&tg->lock);
+
     throttle_timers_detach_aio_context(tt);
     tgm->aio_context = NULL;
 }