Message ID: 1467045468-25709-1-git-send-email-stefanha@redhat.com
State: New
On 27/06/2016 18:37, Stefan Hajnoczi wrote:
> Commit ccb9dc10129954d0bcd7814298ed445e684d5a2a ("linux-aio: Cancel BH
> if not needed") exposed a side-effect of scheduling the BH for nested
> event loops. When io_getevents(2) fetches MAX_EVENTS events then it's
> necessary to call it again. Failure to do so can lead to hung I/O
> because the eventfd has already been cleared and the completion BH will
> not run again.
>
> This patch fixes the hang by calling io_getevents(2) again if the events
> array was totally full.
>
> Reported-by: Roman Penyaev <roman.penyaev@profitbricks.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/linux-aio.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index e468960..af15f85 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -117,6 +117,7 @@ static void qemu_laio_completion_bh(void *opaque)
>      LinuxAioState *s = opaque;
>
>      /* Fetch more completion events when empty */
> +more_events:
>      if (s->event_idx == s->event_max) {
>          do {
>              struct timespec ts = { 0 };
> @@ -146,6 +147,10 @@ static void qemu_laio_completion_bh(void *opaque)
>          qemu_laio_process_completion(laiocb);
>      }
>
> +    if (s->event_idx == MAX_EVENTS) {
> +        goto more_events; /* there might still be events waiting for us */
> +    }
> +
>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>          ioq_submit(s);
>      }

Do we want to do it after submitting, in order to reduce latency and
perhaps even get some results immediately?

Paolo
On Jun 27, 2016 6:37 PM, "Stefan Hajnoczi" <stefanha@redhat.com> wrote:
>
> Commit ccb9dc10129954d0bcd7814298ed445e684d5a2a ("linux-aio: Cancel BH
> if not needed") exposed a side-effect of scheduling the BH for nested
> event loops. When io_getevents(2) fetches MAX_EVENTS events then it's
> necessary to call it again. Failure to do so can lead to hung I/O
> because the eventfd has already been cleared and the completion BH will
> not run again.
>
> This patch fixes the hang by calling io_getevents(2) again if the events
> array was totally full.
>
> Reported-by: Roman Penyaev <roman.penyaev@profitbricks.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/linux-aio.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index e468960..af15f85 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -117,6 +117,7 @@ static void qemu_laio_completion_bh(void *opaque)
>      LinuxAioState *s = opaque;
>
>      /* Fetch more completion events when empty */
> +more_events:
>      if (s->event_idx == s->event_max) {
>          do {
>              struct timespec ts = { 0 };
> @@ -146,6 +147,10 @@ static void qemu_laio_completion_bh(void *opaque)
>          qemu_laio_process_completion(laiocb);
>      }
>
> +    if (s->event_idx == MAX_EVENTS) {
> +        goto more_events; /* there might still be events waiting for us */
> +    }
> +

I am on vacation and can't check anything, but it seems you could also
reproduce the issue, and it seems you are right: what I've also noticed
is that I/O hangs only if the total number of requests in the guest is
greater than 128 (the limit defined in linux-aio.c).

But I do not understand why you have to repeat io_getevents in case the
return value == MAX_EVENTS. According to my understanding you can't
submit more than 128, so the first io_getevents call returns exactly
what you've submitted.

--
Roman

>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>          ioq_submit(s);
>      }
> --
> 2.7.4
>
On Mon, Jun 27, 2016 at 06:42:33PM +0200, Paolo Bonzini wrote:
> On 27/06/2016 18:37, Stefan Hajnoczi wrote:
> > Commit ccb9dc10129954d0bcd7814298ed445e684d5a2a ("linux-aio: Cancel BH
> > if not needed") exposed a side-effect of scheduling the BH for nested
> > event loops. When io_getevents(2) fetches MAX_EVENTS events then it's
> > necessary to call it again. Failure to do so can lead to hung I/O
> > because the eventfd has already been cleared and the completion BH will
> > not run again.
> >
> > This patch fixes the hang by calling io_getevents(2) again if the events
> > array was totally full.
> >
> > [...]
> >
> > +    if (s->event_idx == MAX_EVENTS) {
> > +        goto more_events; /* there might still be events waiting for us */
> > +    }
> > +
> >      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
> >          ioq_submit(s);
> >      }
>
> Do we want to do it after submitting, in order to reduce latency and
> perhaps even get some results immediately?

No, if more I/O requests are submitted, let the event loop spin to
avoid starving the event loop when there is a high-IOPS device.

Stefan
On Mon, Jun 27, 2016 at 08:36:19PM +0200, Roman Penyaev wrote:
> On Jun 27, 2016 6:37 PM, "Stefan Hajnoczi" <stefanha@redhat.com> wrote:
> >
> > Commit ccb9dc10129954d0bcd7814298ed445e684d5a2a ("linux-aio: Cancel BH
> > if not needed") exposed a side-effect of scheduling the BH for nested
> > event loops. When io_getevents(2) fetches MAX_EVENTS events then it's
> > necessary to call it again. Failure to do so can lead to hung I/O
> > because the eventfd has already been cleared and the completion BH will
> > not run again.
> >
> > This patch fixes the hang by calling io_getevents(2) again if the events
> > array was totally full.
> >
> > [...]
> >
> > +    if (s->event_idx == MAX_EVENTS) {
> > +        goto more_events; /* there might still be events waiting for us */
> > +    }
> > +
>
> I am on vacation and can't check anything, but it seems you could also
> reproduce the issue, and it seems you are right: what I've also noticed
> is that I/O hangs only if the total number of requests in the guest is
> greater than 128 (the limit defined in linux-aio.c).
>
> But I do not understand why you have to repeat io_getevents in case the
> return value == MAX_EVENTS. According to my understanding you can't
> submit more than 128, so the first io_getevents call returns exactly
> what you've submitted.
Hi Roman,
There is nothing like discussions on qemu-devel from the beach. True
vacation!

The issue is:

1. s->events[] is only MAX_EVENTS large so each io_getevents() call can
   only fetch that maximum number of events.

2. qemu_laio_completion_cb() clears the eventfd so the event loop will
   not call us again (unless additional I/O requests complete).

Therefore, returning from qemu_laio_completion_bh() without having
processed all events causes a hang.

Stefan
diff --git a/block/linux-aio.c b/block/linux-aio.c
index e468960..af15f85 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -117,6 +117,7 @@ static void qemu_laio_completion_bh(void *opaque)
     LinuxAioState *s = opaque;
 
     /* Fetch more completion events when empty */
+more_events:
     if (s->event_idx == s->event_max) {
         do {
             struct timespec ts = { 0 };
@@ -146,6 +147,10 @@ static void qemu_laio_completion_bh(void *opaque)
         qemu_laio_process_completion(laiocb);
     }
 
+    if (s->event_idx == MAX_EVENTS) {
+        goto more_events; /* there might still be events waiting for us */
+    }
+
     if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
         ioq_submit(s);
     }
Commit ccb9dc10129954d0bcd7814298ed445e684d5a2a ("linux-aio: Cancel BH
if not needed") exposed a side-effect of scheduling the BH for nested
event loops. When io_getevents(2) fetches MAX_EVENTS events then it's
necessary to call it again. Failure to do so can lead to hung I/O
because the eventfd has already been cleared and the completion BH will
not run again.

This patch fixes the hang by calling io_getevents(2) again if the events
array was totally full.

Reported-by: Roman Penyaev <roman.penyaev@profitbricks.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/linux-aio.c | 5 +++++
 1 file changed, 5 insertions(+)