| Message ID | 20190912181924.48539-1-slp@redhat.com |
|---|---|
| State | New |
| Series | [RFC] virtio-blk: schedule virtio_notify_config to run on main context |
On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
> Another AioContext-related issue, and this is a tricky one.
>
> Executing a QMP block_resize request for a virtio-blk device running
> on an iothread may cause a deadlock involving the following mutexes:
>
> - main thread
>   * Has acquired: qemu_mutex_global.
>   * Is trying to acquire: the iothread AioContext lock via
>     AIO_WAIT_WHILE (after aio_poll).
>
> - iothread
>   * Has acquired: AioContext lock.
>   * Is trying to acquire: qemu_mutex_global (via
>     virtio_notify_config->prepare_mmio_access).

Hmm is this really the only case iothread takes qemu mutex?
If any such access can deadlock, don't we need a generic
solution? Maybe main thread can drop qemu mutex
before taking io thread AioContext lock?

> With this change, virtio_blk_resize checks if it's being called from a
> coroutine context running on a non-main thread, and if that's the
> case, creates a new coroutine and schedules it to be run on the main
> thread.
>
> This works, but means the actual operation is done
> asynchronously, perhaps opening a window in which a "device_del"
> operation may fit and remove the VirtIODevice before
> virtio_notify_config() is executed.
>
> I *think* it shouldn't be possible, as BHs will be processed before
> any new QMP/monitor command, but I'm open to a different approach.
>
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
> Signed-off-by: Sergio Lopez <slp@redhat.com>
[...]
Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
>> Another AioContext-related issue, and this is a tricky one.
>>
>> Executing a QMP block_resize request for a virtio-blk device running
>> on an iothread may cause a deadlock involving the following mutexes:
[...]
>
> Hmm is this really the only case iothread takes qemu mutex?

Not the only one that takes the mutex, but the only one so far we found
doing so upon request from a job running on the main thread (it should
be quite noticeable, due to the deadlock).

> If any such access can deadlock, don't we need a generic
> solution? Maybe main thread can drop qemu mutex
> before taking io thread AioContext lock?

The mutex is acquired very early, in os_host_main_loop_wait(), so I
assume there may be many assumptions in multiple code paths that it has
been acquired.

[...]
Am 12.09.2019 um 21:51 hat Michael S. Tsirkin geschrieben:
> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
>> Another AioContext-related issue, and this is a tricky one.
>>
>> Executing a QMP block_resize request for a virtio-blk device running
>> on an iothread may cause a deadlock involving the following mutexes:
>>
>> - main thread
>>   * Has acquired: qemu_mutex_global.
>>   * Is trying to acquire: the iothread AioContext lock via
>>     AIO_WAIT_WHILE (after aio_poll).
>>
>> - iothread
>>   * Has acquired: AioContext lock.
>>   * Is trying to acquire: qemu_mutex_global (via
>>     virtio_notify_config->prepare_mmio_access).
>
> Hmm is this really the only case iothread takes qemu mutex?
> If any such access can deadlock, don't we need a generic
> solution? Maybe main thread can drop qemu mutex
> before taking io thread AioContext lock?

The rule is that iothreads must not take the qemu mutex. If they do
(like in this case), it's a bug.

Maybe we could actually assert this in qemu_mutex_lock_iothread()?

[...]
>> +static void coroutine_fn virtio_resize_co_entry(void *opaque)
>> +{
>> +    VirtIODevice *vdev = opaque;
>> +
>> +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>> +    virtio_notify_config(vdev);
>> +    aio_wait_kick();
>> +}
>> +
>>  static void virtio_blk_resize(void *opaque)
>>  {
>>      VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
>> +    Coroutine *co;
>>
>> -    virtio_notify_config(vdev);
>> +    if (qemu_in_coroutine() &&
>> +        qemu_get_current_aio_context() != qemu_get_aio_context()) {
>> +        /*
>> +         * virtio_notify_config() needs to acquire the global mutex,
>> +         * so calling it from a coroutine running on a non-main context
>> +         * may cause a deadlock. Instead, create a new coroutine and
>> +         * schedule it to be run on the main thread.
>> +         */
>> +        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
>> +        aio_co_schedule(qemu_get_aio_context(), co);
>> +    } else {
>> +        virtio_notify_config(vdev);
>> +    }
>>  }

Wouldn't a simple BH suffice (aio_bh_schedule_oneshot)? I don't see why
you need a coroutine when you never yield.

The reason why it deadlocks also has nothing to do with whether we are
called from a coroutine or not. The important part is that we're running
in an iothread.

Kevin
Kevin Wolf <kwolf@redhat.com> writes:

> Am 12.09.2019 um 21:51 hat Michael S. Tsirkin geschrieben:
>> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
>>> Another AioContext-related issue, and this is a tricky one.
[...]
>
> Wouldn't a simple BH suffice (aio_bh_schedule_oneshot)? I don't see why
> you need a coroutine when you never yield.

You're right, that's actually simpler, hadn't thought of it.

Do you see any drawbacks, or should I send a non-RFC fixed version of
this patch?

> The reason why it deadlocks also has nothing to do with whether we are
> called from a coroutine or not. The important part is that we're running
> in an iothread.
>
> Kevin
Am 13.09.2019 um 11:28 hat Sergio Lopez geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
>
[...]
>> Wouldn't a simple BH suffice (aio_bh_schedule_oneshot)? I don't see why
>> you need a coroutine when you never yield.
>
> You're right, that's actually simpler, hadn't thought of it.
>
> Do you see any drawbacks, or should I send a non-RFC fixed version of
> this patch?

Sending a fixed non-RFC version sounds good to me.

Kevin
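For reference, the BH-based variant Kevin suggests would look roughly like the sketch below. This is a hypothetical reconstruction, not the actual follow-up patch; `virtio_resize_bh` is an illustrative name, and the fragment assumes the surrounding hw/block/virtio-blk.c context from the RFC:

/* Runs in the main loop, where taking the global mutex is safe. */
static void virtio_resize_bh(void *opaque)
{
    VirtIODevice *vdev = opaque;

    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
    virtio_notify_config(vdev);
    aio_wait_kick();
}

static void virtio_blk_resize(void *opaque)
{
    VirtIODevice *vdev = VIRTIO_DEVICE(opaque);

    /*
     * Defer the notification to the main context; a plain BH is
     * enough because nothing here ever yields.
     */
    aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_bh, vdev);
}

Compared with the RFC, this drops the coroutine allocation and the in-coroutine/in-iothread check: scheduling the BH unconditionally is correct from either context, which also addresses Kevin's point that the deadlock is about running in an iothread, not about coroutines.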
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 18851601cb..c763d071f6 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -16,6 +16,7 @@
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "qemu/error-report.h"
+#include "qemu/main-loop.h"
 #include "trace.h"
 #include "hw/block/block.h"
 #include "hw/qdev-properties.h"
@@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
     return 0;
 }
 
+static void coroutine_fn virtio_resize_co_entry(void *opaque)
+{
+    VirtIODevice *vdev = opaque;
+
+    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+    virtio_notify_config(vdev);
+    aio_wait_kick();
+}
+
 static void virtio_blk_resize(void *opaque)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+    Coroutine *co;
 
-    virtio_notify_config(vdev);
+    if (qemu_in_coroutine() &&
+        qemu_get_current_aio_context() != qemu_get_aio_context()) {
+        /*
+         * virtio_notify_config() needs to acquire the global mutex,
+         * so calling it from a coroutine running on a non-main context
+         * may cause a deadlock. Instead, create a new coroutine and
+         * schedule it to be run on the main thread.
+         */
+        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
+        aio_co_schedule(qemu_get_aio_context(), co);
+    } else {
+        virtio_notify_config(vdev);
+    }
 }
 
 static const BlockDevOps virtio_block_ops = {
Another AioContext-related issue, and this is a tricky one.

Executing a QMP block_resize request for a virtio-blk device running
on an iothread may cause a deadlock involving the following mutexes:

- main thread
  * Has acquired: qemu_mutex_global.
  * Is trying to acquire: the iothread AioContext lock via
    AIO_WAIT_WHILE (after aio_poll).

- iothread
  * Has acquired: AioContext lock.
  * Is trying to acquire: qemu_mutex_global (via
    virtio_notify_config->prepare_mmio_access).

With this change, virtio_blk_resize checks whether it's being called
from a coroutine context running on a non-main thread, and if that's
the case, creates a new coroutine and schedules it to be run on the
main thread.

This works, but means the actual operation is done asynchronously,
perhaps opening a window in which a "device_del" operation may fit and
remove the VirtIODevice before virtio_notify_config() is executed.

I *think* it shouldn't be possible, as BHs will be processed before
any new QMP/monitor command, but I'm open to a different approach.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)