Message ID | 1437250916-18905-2-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
Headers | show |
On Sat, 07/18 22:21, Paolo Bonzini wrote: > event_notifier_test_and_clear must be called before processing events. > Otherwise, an aio_poll could "eat" the notification before the main > I/O thread invokes ppoll(). The main I/O thread then never wakes up. > This is an example of what could happen: > > i/o thread vcpu thread worker thread > --------------------------------------------------------------------- > lock_iothread > notify_me = 1 > ... > unlock_iothread > lock_iothread > notify_me = 3 > ppoll > notify_me = 1 > bh->scheduled = 1 > event_notifier_set > event_notifier_test_and_clear > ppoll > *** hang *** It still seems possible for event_notifier_test_and_clear to happen before main thread ppoll, will that be a problem? How does main thread get waken up in that case? Fam > > In contrast with the previous bug, there wasn't really much debugging > to do here. "Tracing" with qemu_clock_get_ns shows pretty much the > same behavior as in the previous patch, so the only wait to find the > bug is staring at the code. > > One could also use a formal model, of course. The included one shows > this with three processes: notifier corresponds to a QEMU thread pool > worker, temporary_waiter to a VCPU thread that invokes aio_poll(), > waiter to the main I/O thread. I would be happy to say that the > formal model found the bug for me, but actually I wrote it after the > fact. > > This patch is a bit of a big hammer. The next one optimizes it, > with help (this time for real rather than a posteriori :)) from > another, similar formal model. > > Reported-by: Richard W. M. Jones <rjones@redhat.com> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > aio-posix.c | 2 + > aio-win32.c | 2 + > async.c | 8 ++- > docs/aio_notify_bug.promela | 140 ++++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 151 insertions(+), 1 deletion(-) > create mode 100644 docs/aio_notify_bug.promela > > diff --git a/aio-posix.c b/aio-posix.c > index 249889f..5c8b266 100644 > --- a/aio-posix.c > +++ b/aio-posix.c > @@ -276,6 +276,8 @@ bool aio_poll(AioContext *ctx, bool blocking) > aio_context_acquire(ctx); > } > > + event_notifier_test_and_clear(&ctx->notifier); > + > /* if we have any readable fds, dispatch event */ > if (ret > 0) { > for (i = 0; i < npfd; i++) { > diff --git a/aio-win32.c b/aio-win32.c > index ea655b0..3e0db20 100644 > --- a/aio-win32.c > +++ b/aio-win32.c > @@ -337,6 +337,8 @@ bool aio_poll(AioContext *ctx, bool blocking) > aio_context_acquire(ctx); > } > > + event_notifier_test_and_clear(&ctx->notifier); > + > if (first && aio_bh_poll(ctx)) { > progress = true; > } > diff --git a/async.c b/async.c > index a232192..d625e8a 100644 > --- a/async.c > +++ b/async.c > @@ -203,6 +203,8 @@ aio_ctx_check(GSource *source) > QEMUBH *bh; > > atomic_and(&ctx->notify_me, ~1); > + event_notifier_test_and_clear(&ctx->notifier); > + > for (bh = ctx->first_bh; bh; bh = bh->next) { > if (!bh->deleted && bh->scheduled) { > return true; > @@ -279,6 +281,10 @@ static void aio_rfifolock_cb(void *opaque) > aio_notify(opaque); > } > > +static void event_notifier_dummy_cb(EventNotifier *e) > +{ > +} > + > AioContext *aio_context_new(Error **errp) > { > int ret; > @@ -293,7 +299,7 @@ AioContext *aio_context_new(Error **errp) > g_source_set_can_recurse(&ctx->source, true); > aio_set_event_notifier(ctx, &ctx->notifier, > (EventNotifierHandler *) > - event_notifier_test_and_clear); > + event_notifier_dummy_cb); > ctx->thread_pool = NULL; > qemu_mutex_init(&ctx->bh_lock); > rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx); > diff --git a/docs/aio_notify_bug.promela b/docs/aio_notify_bug.promela > new file mode 100644 > index 0000000..b3bfca1 > --- /dev/null > +++ b/docs/aio_notify_bug.promela > @@ -0,0 +1,140 @@ > +/* > + * This model describes a bug in aio_notify. If ctx->notifier is > + * cleared too late, a wakeup could be lost. > + * > + * Author: Paolo Bonzini <pbonzini@redhat.com> > + * > + * This file is in the public domain. If you really want a license, > + * the WTFPL will do. > + * > + * To verify the buggy version: > + * spin -a -DBUG docs/aio_notify_bug.promela > + * gcc -O2 pan.c > + * ./a.out -a -f > + * > + * To verify the working version: > + * spin -a docs/aio_notify_bug.promela > + * gcc -O2 pan.c > + * ./a.out -a -f > + * > + * Add -DCHECK_REQ to test an alternative invariant and the > + * "notify_me" optimization. > + */ > + > +int notify_me; > +bool event; > +bool req; > +bool notifier_done; > + > +#ifdef CHECK_REQ > +#define USE_NOTIFY_ME 1 > +#else > +#define USE_NOTIFY_ME 0 > +#endif > + > +active proctype notifier() > +{ > + do > + :: true -> { > + req = 1; > + if > + :: !USE_NOTIFY_ME || notify_me -> event = 1; > + :: else -> skip; > + fi > + } > + :: true -> break; > + od; > + notifier_done = 1; > +} > + > +#ifdef BUG > +#define AIO_POLL \ > + notify_me++; \ > + if \ > + :: !req -> { \ > + if \ > + :: event -> skip; \ > + fi; \ > + } \ > + :: else -> skip; \ > + fi; \ > + notify_me--; \ > + \ > + req = 0; \ > + event = 0; > +#else > +#define AIO_POLL \ > + notify_me++; \ > + if \ > + :: !req -> { \ > + if \ > + :: event -> skip; \ > + fi; \ > + } \ > + :: else -> skip; \ > + fi; \ > + notify_me--; \ > + \ > + event = 0; \ > + req = 0; > +#endif > + > +active proctype waiter() > +{ > + do > + :: true -> AIO_POLL; > + od; > +} > + > +/* Same as waiter(), but disappears after a while. */ > +active proctype temporary_waiter() > +{ > + do > + :: true -> AIO_POLL; > + :: true -> break; > + od; > +} > + > +#ifdef CHECK_REQ > +never { > + do > + :: req -> goto accept_if_req_not_eventually_false; > + :: true -> skip; > + od; > + > +accept_if_req_not_eventually_false: > + if > + :: req -> goto accept_if_req_not_eventually_false; > + fi; > + assert(0); > +} > + > +#else > +/* There must be infinitely many transitions of event as long > + * as the notifier does not exit. > + * > + * If event stayed always true, the waiters would be busy looping. > + * If event stayed always false, the waiters would be sleeping > + * forever. > + */ > +never { > + do > + :: !event -> goto accept_if_event_not_eventually_true; > + :: event -> goto accept_if_event_not_eventually_false; > + :: true -> skip; > + od; > + > +accept_if_event_not_eventually_true: > + if > + :: !event && notifier_done -> do :: true -> skip; od; > + :: !event && !notifier_done -> goto accept_if_event_not_eventually_true; > + fi; > + assert(0); > + > +accept_if_event_not_eventually_false: > + if > + :: event -> goto accept_if_event_not_eventually_false; > + fi; > + assert(0); > +} > +#endif > -- > 2.4.3 > >
On 20/07/2015 05:55, Fam Zheng wrote: > On Sat, 07/18 22:21, Paolo Bonzini wrote: >> event_notifier_test_and_clear must be called before processing events. >> Otherwise, an aio_poll could "eat" the notification before the main >> I/O thread invokes ppoll(). The main I/O thread then never wakes up. >> This is an example of what could happen: >> >> i/o thread vcpu thread worker thread >> --------------------------------------------------------------------- >> lock_iothread >> notify_me = 1 >> ... >> unlock_iothread >> lock_iothread >> notify_me = 3 >> ppoll >> notify_me = 1 >> bh->scheduled = 1 >> event_notifier_set >> event_notifier_test_and_clear >> ppoll >> *** hang *** > > It still seems possible for event_notifier_test_and_clear to happen before main > thread ppoll, will that be a problem? How does main thread get waken up in that > case? You could have this: i/o thread vcpu thread worker thread --------------------------------------------------------------------- lock_iothread notify_me = 1 ... unlock_iothread lock_iothread notify_me = 3 ppoll notify_me = 1 bh->scheduled = 1 event_notifier_set event_notifier_test_and_clear ppoll But then the bottom half *will* be processed by the VCPU thread. It's a similar "rule" to the one that you use with lock-free algorithms: the consumer, 99% of the time, uses the variables in the opposite order of the producer. Here the producer calls event_notifier_set after bh->scheduled (of course...); the consumer *must* clear the EventNotifier before reading bh->scheduled. Paolo > >> >> In contrast with the previous bug, there wasn't really much debugging >> to do here. "Tracing" with qemu_clock_get_ns shows pretty much the >> same behavior as in the previous patch, so the only wait to find the >> bug is staring at the code. >> >> One could also use a formal model, of course. The included one shows >> this with three processes: notifier corresponds to a QEMU thread pool >> worker, temporary_waiter to a VCPU thread that invokes aio_poll(), >> waiter to the main I/O thread. I would be happy to say that the >> formal model found the bug for me, but actually I wrote it after the >> fact. >> >> This patch is a bit of a big hammer. The next one optimizes it, >> with help (this time for real rather than a posteriori :)) from >> another, similar formal model. >> >> Reported-by: Richard W. M. Jones <rjones@redhat.com> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> >> --- >> aio-posix.c | 2 + >> aio-win32.c | 2 + >> async.c | 8 ++- >> docs/aio_notify_bug.promela | 140 ++++++++++++++++++++++++++++++++++++++++++++ >> 4 files changed, 151 insertions(+), 1 deletion(-) >> create mode 100644 docs/aio_notify_bug.promela >> >> diff --git a/aio-posix.c b/aio-posix.c >> index 249889f..5c8b266 100644 >> --- a/aio-posix.c >> +++ b/aio-posix.c >> @@ -276,6 +276,8 @@ bool aio_poll(AioContext *ctx, bool blocking) >> aio_context_acquire(ctx); >> } >> >> + event_notifier_test_and_clear(&ctx->notifier); >> + >> /* if we have any readable fds, dispatch event */ >> if (ret > 0) { >> for (i = 0; i < npfd; i++) { >> diff --git a/aio-win32.c b/aio-win32.c >> index ea655b0..3e0db20 100644 >> --- a/aio-win32.c >> +++ b/aio-win32.c >> @@ -337,6 +337,8 @@ bool aio_poll(AioContext *ctx, bool blocking) >> aio_context_acquire(ctx); >> } >> >> + event_notifier_test_and_clear(&ctx->notifier); >> + >> if (first && aio_bh_poll(ctx)) { >> progress = true; >> } >> diff --git a/async.c b/async.c >> index a232192..d625e8a 100644 >> --- a/async.c >> +++ b/async.c >> @@ -203,6 +203,8 @@ aio_ctx_check(GSource *source) >> QEMUBH *bh; >> >> atomic_and(&ctx->notify_me, ~1); >> + event_notifier_test_and_clear(&ctx->notifier); >> + >> for (bh = ctx->first_bh; bh; bh = bh->next) { >> if (!bh->deleted && bh->scheduled) { >> return true; >> @@ -279,6 +281,10 @@ static void aio_rfifolock_cb(void *opaque) >> aio_notify(opaque); >> } >> >> +static void event_notifier_dummy_cb(EventNotifier *e) >> +{ >> +} >> + >> AioContext *aio_context_new(Error **errp) >> { >> int ret; >> @@ -293,7 +299,7 @@ AioContext *aio_context_new(Error **errp) >> g_source_set_can_recurse(&ctx->source, true); >> aio_set_event_notifier(ctx, &ctx->notifier, >> (EventNotifierHandler *) >> - event_notifier_test_and_clear); >> + event_notifier_dummy_cb); >> ctx->thread_pool = NULL; >> qemu_mutex_init(&ctx->bh_lock); >> rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx); >> diff --git a/docs/aio_notify_bug.promela b/docs/aio_notify_bug.promela >> new file mode 100644 >> index 0000000..b3bfca1 >> --- /dev/null >> +++ b/docs/aio_notify_bug.promela >> @@ -0,0 +1,140 @@ >> +/* >> + * This model describes a bug in aio_notify. If ctx->notifier is >> + * cleared too late, a wakeup could be lost. >> + * >> + * Author: Paolo Bonzini <pbonzini@redhat.com> >> + * >> + * This file is in the public domain. If you really want a license, >> + * the WTFPL will do. >> + * >> + * To verify the buggy version: >> + * spin -a -DBUG docs/aio_notify_bug.promela >> + * gcc -O2 pan.c >> + * ./a.out -a -f >> + * >> + * To verify the working version: >> + * spin -a docs/aio_notify_bug.promela >> + * gcc -O2 pan.c >> + * ./a.out -a -f >> + * >> + * Add -DCHECK_REQ to test an alternative invariant and the >> + * "notify_me" optimization. >> + */ >> + >> +int notify_me; >> +bool event; >> +bool req; >> +bool notifier_done; >> + >> +#ifdef CHECK_REQ >> +#define USE_NOTIFY_ME 1 >> +#else >> +#define USE_NOTIFY_ME 0 >> +#endif >> + >> +active proctype notifier() >> +{ >> + do >> + :: true -> { >> + req = 1; >> + if >> + :: !USE_NOTIFY_ME || notify_me -> event = 1; >> + :: else -> skip; >> + fi >> + } >> + :: true -> break; >> + od; >> + notifier_done = 1; >> +} >> + >> +#ifdef BUG >> +#define AIO_POLL \ >> + notify_me++; \ >> + if \ >> + :: !req -> { \ >> + if \ >> + :: event -> skip; \ >> + fi; \ >> + } \ >> + :: else -> skip; \ >> + fi; \ >> + notify_me--; \ >> + \ >> + req = 0; \ >> + event = 0; >> +#else >> +#define AIO_POLL \ >> + notify_me++; \ >> + if \ >> + :: !req -> { \ >> + if \ >> + :: event -> skip; \ >> + fi; \ >> + } \ >> + :: else -> skip; \ >> + fi; \ >> + notify_me--; \ >> + \ >> + event = 0; \ >> + req = 0; >> +#endif >> + >> +active proctype waiter() >> +{ >> + do >> + :: true -> AIO_POLL; >> + od; >> +} >> + >> +/* Same as waiter(), but disappears after a while. */ >> +active proctype temporary_waiter() >> +{ >> + do >> + :: true -> AIO_POLL; >> + :: true -> break; >> + od; >> +} >> + >> +#ifdef CHECK_REQ >> +never { >> + do >> + :: req -> goto accept_if_req_not_eventually_false; >> + :: true -> skip; >> + od; >> + >> +accept_if_req_not_eventually_false: >> + if >> + :: req -> goto accept_if_req_not_eventually_false; >> + fi; >> + assert(0); >> +} >> + >> +#else >> +/* There must be infinitely many transitions of event as long >> + * as the notifier does not exit. >> + * >> + * If event stayed always true, the waiters would be busy looping. >> + * If event stayed always false, the waiters would be sleeping >> + * forever. >> + */ >> +never { >> + do >> + :: !event -> goto accept_if_event_not_eventually_true; >> + :: event -> goto accept_if_event_not_eventually_false; >> + :: true -> skip; >> + od; >> + >> +accept_if_event_not_eventually_true: >> + if >> + :: !event && notifier_done -> do :: true -> skip; od; >> + :: !event && !notifier_done -> goto accept_if_event_not_eventually_true; >> + fi; >> + assert(0); >> + >> +accept_if_event_not_eventually_false: >> + if >> + :: event -> goto accept_if_event_not_eventually_false; >> + fi; >> + assert(0); >> +} >> +#endif >> -- >> 2.4.3 >> >> > >
diff --git a/aio-posix.c b/aio-posix.c index 249889f..5c8b266 100644 --- a/aio-posix.c +++ b/aio-posix.c @@ -276,6 +276,8 @@ bool aio_poll(AioContext *ctx, bool blocking) aio_context_acquire(ctx); } + event_notifier_test_and_clear(&ctx->notifier); + /* if we have any readable fds, dispatch event */ if (ret > 0) { for (i = 0; i < npfd; i++) { diff --git a/aio-win32.c b/aio-win32.c index ea655b0..3e0db20 100644 --- a/aio-win32.c +++ b/aio-win32.c @@ -337,6 +337,8 @@ bool aio_poll(AioContext *ctx, bool blocking) aio_context_acquire(ctx); } + event_notifier_test_and_clear(&ctx->notifier); + if (first && aio_bh_poll(ctx)) { progress = true; } diff --git a/async.c b/async.c index a232192..d625e8a 100644 --- a/async.c +++ b/async.c @@ -203,6 +203,8 @@ aio_ctx_check(GSource *source) QEMUBH *bh; atomic_and(&ctx->notify_me, ~1); + event_notifier_test_and_clear(&ctx->notifier); + for (bh = ctx->first_bh; bh; bh = bh->next) { if (!bh->deleted && bh->scheduled) { return true; @@ -279,6 +281,10 @@ static void aio_rfifolock_cb(void *opaque) aio_notify(opaque); } +static void event_notifier_dummy_cb(EventNotifier *e) +{ +} + AioContext *aio_context_new(Error **errp) { int ret; @@ -293,7 +299,7 @@ AioContext *aio_context_new(Error **errp) g_source_set_can_recurse(&ctx->source, true); aio_set_event_notifier(ctx, &ctx->notifier, (EventNotifierHandler *) - event_notifier_test_and_clear); + event_notifier_dummy_cb); ctx->thread_pool = NULL; qemu_mutex_init(&ctx->bh_lock); rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx); diff --git a/docs/aio_notify_bug.promela b/docs/aio_notify_bug.promela new file mode 100644 index 0000000..b3bfca1 --- /dev/null +++ b/docs/aio_notify_bug.promela @@ -0,0 +1,140 @@ +/* + * This model describes a bug in aio_notify. If ctx->notifier is + * cleared too late, a wakeup could be lost. + * + * Author: Paolo Bonzini <pbonzini@redhat.com> + * + * This file is in the public domain. If you really want a license, + * the WTFPL will do. + * + * To verify the buggy version: + * spin -a -DBUG docs/aio_notify_bug.promela + * gcc -O2 pan.c + * ./a.out -a -f + * + * To verify the working version: + * spin -a docs/aio_notify_bug.promela + * gcc -O2 pan.c + * ./a.out -a -f + * + * Add -DCHECK_REQ to test an alternative invariant and the + * "notify_me" optimization. + */ + +int notify_me; +bool event; +bool req; +bool notifier_done; + +#ifdef CHECK_REQ +#define USE_NOTIFY_ME 1 +#else +#define USE_NOTIFY_ME 0 +#endif + +active proctype notifier() +{ + do + :: true -> { + req = 1; + if + :: !USE_NOTIFY_ME || notify_me -> event = 1; + :: else -> skip; + fi + } + :: true -> break; + od; + notifier_done = 1; +} + +#ifdef BUG +#define AIO_POLL \ + notify_me++; \ + if \ + :: !req -> { \ + if \ + :: event -> skip; \ + fi; \ + } \ + :: else -> skip; \ + fi; \ + notify_me--; \ + \ + req = 0; \ + event = 0; +#else +#define AIO_POLL \ + notify_me++; \ + if \ + :: !req -> { \ + if \ + :: event -> skip; \ + fi; \ + } \ + :: else -> skip; \ + fi; \ + notify_me--; \ + \ + event = 0; \ + req = 0; +#endif + +active proctype waiter() +{ + do + :: true -> AIO_POLL; + od; +} + +/* Same as waiter(), but disappears after a while. */ +active proctype temporary_waiter() +{ + do + :: true -> AIO_POLL; + :: true -> break; + od; +} + +#ifdef CHECK_REQ +never { + do + :: req -> goto accept_if_req_not_eventually_false; + :: true -> skip; + od; + +accept_if_req_not_eventually_false: + if + :: req -> goto accept_if_req_not_eventually_false; + fi; + assert(0); +} + +#else +/* There must be infinitely many transitions of event as long + * as the notifier does not exit. + * + * If event stayed always true, the waiters would be busy looping. + * If event stayed always false, the waiters would be sleeping + * forever. + */ +never { + do + :: !event -> goto accept_if_event_not_eventually_true; + :: event -> goto accept_if_event_not_eventually_false; + :: true -> skip; + od; + +accept_if_event_not_eventually_true: + if + :: !event && notifier_done -> do :: true -> skip; od; + :: !event && !notifier_done -> goto accept_if_event_not_eventually_true; + fi; + assert(0); + +accept_if_event_not_eventually_false: + if + :: event -> goto accept_if_event_not_eventually_false; + fi; + assert(0); +} +#endif
event_notifier_test_and_clear must be called before processing events. Otherwise, an aio_poll could "eat" the notification before the main I/O thread invokes ppoll(). The main I/O thread then never wakes up. This is an example of what could happen: i/o thread vcpu thread worker thread --------------------------------------------------------------------- lock_iothread notify_me = 1 ... unlock_iothread lock_iothread notify_me = 3 ppoll notify_me = 1 bh->scheduled = 1 event_notifier_set event_notifier_test_and_clear ppoll *** hang *** In contrast with the previous bug, there wasn't really much debugging to do here. "Tracing" with qemu_clock_get_ns shows pretty much the same behavior as in the previous patch, so the only wait to find the bug is staring at the code. One could also use a formal model, of course. The included one shows this with three processes: notifier corresponds to a QEMU thread pool worker, temporary_waiter to a VCPU thread that invokes aio_poll(), waiter to the main I/O thread. I would be happy to say that the formal model found the bug for me, but actually I wrote it after the fact. This patch is a bit of a big hammer. The next one optimizes it, with help (this time for real rather than a posteriori :)) from another, similar formal model. Reported-by: Richard W. M. Jones <rjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- aio-posix.c | 2 + aio-win32.c | 2 + async.c | 8 ++- docs/aio_notify_bug.promela | 140 ++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 151 insertions(+), 1 deletion(-) create mode 100644 docs/aio_notify_bug.promela