
nbd/server: attach client channel to the export's AioContext

Message ID 20190911161521.59261-1-slp@redhat.com
State New
Series nbd/server: attach client channel to the export's AioContext

Commit Message

Sergio Lopez Sept. 11, 2019, 4:15 p.m. UTC
On creation, the export's AioContext is set to the same one as the
BlockBackend, while the AioContext in the client QIOChannel is left
untouched.

As a result, when using data-plane, nbd_client_receive_next_request()
schedules coroutines in the IOThread AioContext, while the client's
QIOChannel is serviced from the main_loop, potentially triggering the
assertion at qio_channel_restart_[read|write].

To fix this, as soon as we have the export corresponding to the
client, we call qio_channel_attach_aio_context() to attach the
QIOChannel context to the export's AioContext. This matches the logic
in blk_aio_attached().

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 nbd/server.c | 2 ++
 1 file changed, 2 insertions(+)
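
For reference, the assertion in question lives in io/channel.c; a
paraphrased sketch of its shape (not a verbatim copy of the upstream
code) is:

    static void qio_channel_restart_read(void *opaque)
    {
        QIOChannel *ioc = opaque;
        Coroutine *co = ioc->read_coroutine;

        /* Fires when the channel watch runs in a different AioContext
         * than the one the coroutine was scheduled in. */
        assert(qemu_get_current_aio_context() ==
               qemu_coroutine_get_aio_context(co));
        aio_co_wake(co);
    }

With data-plane, the request coroutine belongs to the IOThread's
AioContext while the channel watch still fires from the main loop, so
the two contexts disagree and the assertion triggers.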

Comments

Eric Blake Sept. 11, 2019, 5:21 p.m. UTC | #1
On 9/11/19 11:15 AM, Sergio Lopez wrote:
> On creation, the export's AioContext is set to the same one as the
> BlockBackend, while the AioContext in the client QIOChannel is left
> untouched.
> 
> As a result, when using data-plane, nbd_client_receive_next_request()
> schedules coroutines in the IOThread AioContext, while the client's
> QIOChannel is serviced from the main_loop, potentially triggering the
> assertion at qio_channel_restart_[read|write].
> 
> To fix this, as soon we have the export corresponding to the client,
> we call qio_channel_attach_aio_context() to attach the QIOChannel
> context to the export's AioContext. This matches with the logic in
> blk_aio_attached().
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  nbd/server.c | 2 ++
>  1 file changed, 2 insertions(+)

I'd like a second opinion from Kevin, but the description makes sense to
me.  I'm happy to queue this through my NBD tree.

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 10faedcfc5..51322e2343 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -471,6 +471,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client,
>      QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>      nbd_export_get(client->exp);
>      nbd_check_meta_export(client);
> +    qio_channel_attach_aio_context(client->ioc, client->exp->ctx);
>  
>      return 0;
>  }
> @@ -673,6 +674,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
>          QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>          nbd_export_get(client->exp);
>          nbd_check_meta_export(client);
> +        qio_channel_attach_aio_context(client->ioc, exp->ctx);
>          rc = 1;
>      }
>      return rc;
>
Eric Blake Sept. 11, 2019, 9:33 p.m. UTC | #2
On 9/11/19 12:21 PM, Eric Blake wrote:
> On 9/11/19 11:15 AM, Sergio Lopez wrote:
>> On creation, the export's AioContext is set to the same one as the
>> BlockBackend, while the AioContext in the client QIOChannel is left
>> untouched.
>>
>> As a result, when using data-plane, nbd_client_receive_next_request()
>> schedules coroutines in the IOThread AioContext, while the client's
>> QIOChannel is serviced from the main_loop, potentially triggering the
>> assertion at qio_channel_restart_[read|write].
>>
>> To fix this, as soon we have the export corresponding to the client,
>> we call qio_channel_attach_aio_context() to attach the QIOChannel
>> context to the export's AioContext. This matches with the logic in
>> blk_aio_attached().
>>
>> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>> ---
>>  nbd/server.c | 2 ++
>>  1 file changed, 2 insertions(+)
> 
> I'd like a second opinion from Kevin, but the description makes sense to
> me.  I'm happy to queue this through my NBD tree.
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>

I tried to test this patch, but even with it applied, I still got an
aio-context crasher by attempting an nbd-server-start, nbd-server-add,
nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
domain using iothreads, with a backtrace of:

#0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
#1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
#2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
<__func__.15769> "qemu_mutex_unlock_impl")
    at util/qemu-thread-posix.c:36
#3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
file=0x55dd03d59041 "util/async.c",
    line=523) at util/qemu-thread-posix.c:96
#4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
util/async.c:523
#5  0x000055dd03ac421b in bdrv_do_drained_begin (bs=0x55dd0673a2d0,
recursive=false, parent=0x0,
    ignore_bds_parents=false, poll=true) at block/io.c:428
#6  0x000055dd03ac4299 in bdrv_drained_begin (bs=0x55dd0673a2d0) at
block/io.c:434
#7  0x000055dd03aafb54 in blk_drain (blk=0x55dd06a3ec40) at
block/block-backend.c:1605
#8  0x000055dd03aae054 in blk_remove_bs (blk=0x55dd06a3ec40) at
block/block-backend.c:800
#9  0x000055dd03aad54a in blk_delete (blk=0x55dd06a3ec40) at
block/block-backend.c:420
#10 0x000055dd03aad7d6 in blk_unref (blk=0x55dd06a3ec40) at
block/block-backend.c:475
#11 0x000055dd03aecb68 in nbd_export_put (exp=0x55dd0726f920) at
nbd/server.c:1666
#12 0x000055dd03aec8fe in nbd_export_close (exp=0x55dd0726f920) at
nbd/server.c:1616
#13 0x000055dd03aecbf1 in nbd_export_close_all () at nbd/server.c:1689
#14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
blockdev-nbd.c:233
...

Does that sound familiar to what you were seeing?  Does it mean we
missed another spot where the context is set incorrectly?

I'm happy to work with you on IRC for more real-time debugging of this
(I'm woefully behind on understanding how aio contexts are supposed to
work).
Eric Blake Sept. 11, 2019, 10:03 p.m. UTC | #3
On 9/11/19 4:33 PM, Eric Blake wrote:

> I tried to test this patch, but even with it applied, I still got an
> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
> domain using iothreads, with a backtrace of:

The crash also reproduces with nbd-server-remove instead of
nbd-server-stop.  Both QMP commands eventually call into the
blk_remove_bs() that calls the fatal blk_drain().  So libvirt can't work
around the bug merely by adding an nbd-server-remove step.

> 
> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
> <__func__.15769> "qemu_mutex_unlock_impl")
>     at util/qemu-thread-posix.c:36
> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
> file=0x55dd03d59041 "util/async.c",
>     line=523) at util/qemu-thread-posix.c:96
> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
> util/async.c:523
> #5  0x000055dd03ac421b in bdrv_do_drained_begin (bs=0x55dd0673a2d0,
> recursive=false, parent=0x0,
>     ignore_bds_parents=false, poll=true) at block/io.c:428
> #6  0x000055dd03ac4299 in bdrv_drained_begin (bs=0x55dd0673a2d0) at
> block/io.c:434
> #7  0x000055dd03aafb54 in blk_drain (blk=0x55dd06a3ec40) at
> block/block-backend.c:1605
> #8  0x000055dd03aae054 in blk_remove_bs (blk=0x55dd06a3ec40) at
> block/block-backend.c:800
Sergio Lopez Sept. 12, 2019, 6:37 a.m. UTC | #4
Eric Blake <eblake@redhat.com> writes:

> On 9/11/19 12:21 PM, Eric Blake wrote:
>> On 9/11/19 11:15 AM, Sergio Lopez wrote:
>>> On creation, the export's AioContext is set to the same one as the
>>> BlockBackend, while the AioContext in the client QIOChannel is left
>>> untouched.
>>>
>>> As a result, when using data-plane, nbd_client_receive_next_request()
>>> schedules coroutines in the IOThread AioContext, while the client's
>>> QIOChannel is serviced from the main_loop, potentially triggering the
>>> assertion at qio_channel_restart_[read|write].
>>>
>>> To fix this, as soon we have the export corresponding to the client,
>>> we call qio_channel_attach_aio_context() to attach the QIOChannel
>>> context to the export's AioContext. This matches with the logic in
>>> blk_aio_attached().
>>>
>>> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
>>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>>> ---
>>>  nbd/server.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>> 
>> I'd like a second opinion from Kevin, but the description makes sense to
>> me.  I'm happy to queue this through my NBD tree.
>> 
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>
> I tried to test this patch, but even with it applied, I still got an
> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
> domain using iothreads, with a backtrace of:
>
> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
> <__func__.15769> "qemu_mutex_unlock_impl")
>     at util/qemu-thread-posix.c:36
> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
> file=0x55dd03d59041 "util/async.c",
>     line=523) at util/qemu-thread-posix.c:96
> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
> util/async.c:523
> #5  0x000055dd03ac421b in bdrv_do_drained_begin (bs=0x55dd0673a2d0,
> recursive=false, parent=0x0,
>     ignore_bds_parents=false, poll=true) at block/io.c:428
> #6  0x000055dd03ac4299 in bdrv_drained_begin (bs=0x55dd0673a2d0) at
> block/io.c:434
> #7  0x000055dd03aafb54 in blk_drain (blk=0x55dd06a3ec40) at
> block/block-backend.c:1605
> #8  0x000055dd03aae054 in blk_remove_bs (blk=0x55dd06a3ec40) at
> block/block-backend.c:800
> #9  0x000055dd03aad54a in blk_delete (blk=0x55dd06a3ec40) at
> block/block-backend.c:420
> #10 0x000055dd03aad7d6 in blk_unref (blk=0x55dd06a3ec40) at
> block/block-backend.c:475
> #11 0x000055dd03aecb68 in nbd_export_put (exp=0x55dd0726f920) at
> nbd/server.c:1666
> #12 0x000055dd03aec8fe in nbd_export_close (exp=0x55dd0726f920) at
> nbd/server.c:1616
> #13 0x000055dd03aecbf1 in nbd_export_close_all () at nbd/server.c:1689
> #14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
> blockdev-nbd.c:233
> ...
>
> Does that sound familiar to what you were seeing?  Does it mean we
> missed another spot where the context is set incorrectly?

It looks like it was trying to release the AioContext while it was still
held by some other thread. Is this stacktrace from the main thread or an
iothread? What was the other one doing?

> I'm happy to work with you on IRC for more real-time debugging of this
> (I'm woefully behind on understanding how aio contexts are supposed to
> work).

I must be missing some step, because I can't reproduce this one
here. I've tried both with an idle NBD server and one with a client
generating I/O. Is it reproducible 100% of the time?

Sergio.
Kevin Wolf Sept. 12, 2019, 8:14 a.m. UTC | #5
Am 11.09.2019 um 23:33 hat Eric Blake geschrieben:
> On 9/11/19 12:21 PM, Eric Blake wrote:
> > On 9/11/19 11:15 AM, Sergio Lopez wrote:
> >> On creation, the export's AioContext is set to the same one as the
> >> BlockBackend, while the AioContext in the client QIOChannel is left
> >> untouched.
> >>
> >> As a result, when using data-plane, nbd_client_receive_next_request()
> >> schedules coroutines in the IOThread AioContext, while the client's
> >> QIOChannel is serviced from the main_loop, potentially triggering the
> >> assertion at qio_channel_restart_[read|write].
> >>
> >> To fix this, as soon we have the export corresponding to the client,
> >> we call qio_channel_attach_aio_context() to attach the QIOChannel
> >> context to the export's AioContext. This matches with the logic in
> >> blk_aio_attached().
> >>
> >> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
> >> Signed-off-by: Sergio Lopez <slp@redhat.com>
> >> ---
> >>  nbd/server.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> > 
> > I'd like a second opinion from Kevin, but the description makes sense to
> > me.  I'm happy to queue this through my NBD tree.
> > 
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> I tried to test this patch, but even with it applied, I still got an
> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
> domain using iothreads, with a backtrace of:
> 
> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
> <__func__.15769> "qemu_mutex_unlock_impl")
>     at util/qemu-thread-posix.c:36
> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
> file=0x55dd03d59041 "util/async.c",
>     line=523) at util/qemu-thread-posix.c:96
> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
> util/async.c:523
> #5  0x000055dd03ac421b in bdrv_do_drained_begin (bs=0x55dd0673a2d0,
> recursive=false, parent=0x0,
>     ignore_bds_parents=false, poll=true) at block/io.c:428
> #6  0x000055dd03ac4299 in bdrv_drained_begin (bs=0x55dd0673a2d0) at
> block/io.c:434
> #7  0x000055dd03aafb54 in blk_drain (blk=0x55dd06a3ec40) at
> block/block-backend.c:1605
> #8  0x000055dd03aae054 in blk_remove_bs (blk=0x55dd06a3ec40) at
> block/block-backend.c:800
> #9  0x000055dd03aad54a in blk_delete (blk=0x55dd06a3ec40) at
> block/block-backend.c:420
> #10 0x000055dd03aad7d6 in blk_unref (blk=0x55dd06a3ec40) at
> block/block-backend.c:475
> #11 0x000055dd03aecb68 in nbd_export_put (exp=0x55dd0726f920) at
> nbd/server.c:1666
> #12 0x000055dd03aec8fe in nbd_export_close (exp=0x55dd0726f920) at
> nbd/server.c:1616
> #13 0x000055dd03aecbf1 in nbd_export_close_all () at nbd/server.c:1689
> #14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
> blockdev-nbd.c:233
> ...
> 
> Does that sound familiar to what you were seeing?  Does it mean we
> missed another spot where the context is set incorrectly?

I think nbd_export_close_all() or one of the NBD functions called by it
needs to take the AioContext lock of the associated BlockBackend.

The crash is because AIO_POLL_WHILE() wants to temporarily drop the lock
that we're not even holding.
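
Something along these lines in nbd_export_close_all(), for example
(just a sketch: the exports list and the exp->ctx field are assumed
from the current server code, and the acquire may have to move further
out so we don't end up locking twice):

    void nbd_export_close_all(void)
    {
        NBDExport *exp, *next;

        QTAILQ_FOREACH_SAFE(exp, &exports, next, next) {
            AioContext *aio_context = exp->ctx;

            /* The drain triggered by blk_remove_bs() polls and
             * temporarily drops this lock, so it must be held here. */
            aio_context_acquire(aio_context);
            nbd_export_close(exp);
            aio_context_release(aio_context);
        }
    }

The QMP handlers and the eject notifier would need the same treatment.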

Kevin
Kevin Wolf Sept. 12, 2019, 8:23 a.m. UTC | #6
Am 11.09.2019 um 18:15 hat Sergio Lopez geschrieben:
> On creation, the export's AioContext is set to the same one as the
> BlockBackend, while the AioContext in the client QIOChannel is left
> untouched.
> 
> As a result, when using data-plane, nbd_client_receive_next_request()
> schedules coroutines in the IOThread AioContext, while the client's
> QIOChannel is serviced from the main_loop, potentially triggering the
> assertion at qio_channel_restart_[read|write].
> 
> To fix this, as soon we have the export corresponding to the client,
> we call qio_channel_attach_aio_context() to attach the QIOChannel
> context to the export's AioContext. This matches with the logic in
> blk_aio_attached().
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
> Signed-off-by: Sergio Lopez <slp@redhat.com>

Oh, looks like I only fixed switching the AioContext after the fact, but
not starting the NBD server for a node that is already in a different
AioContext... :-/

> diff --git a/nbd/server.c b/nbd/server.c
> index 10faedcfc5..51322e2343 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -471,6 +471,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client,
>      QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>      nbd_export_get(client->exp);
>      nbd_check_meta_export(client);
> +    qio_channel_attach_aio_context(client->ioc, client->exp->ctx);
>  
>      return 0;
>  }
> @@ -673,6 +674,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
>          QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>          nbd_export_get(client->exp);
>          nbd_check_meta_export(client);
> +        qio_channel_attach_aio_context(client->ioc, exp->ctx);
>          rc = 1;
>      }
>      return rc;

I think I would rather do this once at the end of nbd_negotiate()
instead of duplicating it across the different ways to open an export.
During the negotiation phase, we don't start requests yet, so doing
everything from the main thread should be fine.
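
Something like this at the end of nbd_negotiate(), roughly (untested
sketch; the exact spot depends on the error paths there):

    /* Negotiation has either failed or settled on an export by now. */
    if (client->exp) {
        qio_channel_attach_aio_context(client->ioc, client->exp->ctx);
    }

The matching detach should already be covered by the existing client
teardown / blk_aio_detach() logic.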

Actually, not doing everything from the main thread sounds nasty because
I think the next QIOChannel callback could then already be executed in
the iothread while this one hasn't completed yet. Or do we have any
locking in place for the negotiation?

Kevin
Sergio Lopez Sept. 12, 2019, 10:13 a.m. UTC | #7
Kevin Wolf <kwolf@redhat.com> writes:

> Am 11.09.2019 um 18:15 hat Sergio Lopez geschrieben:
>> On creation, the export's AioContext is set to the same one as the
>> BlockBackend, while the AioContext in the client QIOChannel is left
>> untouched.
>> 
>> As a result, when using data-plane, nbd_client_receive_next_request()
>> schedules coroutines in the IOThread AioContext, while the client's
>> QIOChannel is serviced from the main_loop, potentially triggering the
>> assertion at qio_channel_restart_[read|write].
>> 
>> To fix this, as soon we have the export corresponding to the client,
>> we call qio_channel_attach_aio_context() to attach the QIOChannel
>> context to the export's AioContext. This matches with the logic in
>> blk_aio_attached().
>> 
>> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>
> Oh, looks like I only fixed switching the AioContext after the fact, but
> not starting the NBD server for a node that is already in a different
> AioContext... :-/
>
>> diff --git a/nbd/server.c b/nbd/server.c
>> index 10faedcfc5..51322e2343 100644
>> --- a/nbd/server.c
>> +++ b/nbd/server.c
>> @@ -471,6 +471,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client,
>>      QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>>      nbd_export_get(client->exp);
>>      nbd_check_meta_export(client);
>> +    qio_channel_attach_aio_context(client->ioc, client->exp->ctx);
>>  
>>      return 0;
>>  }
>> @@ -673,6 +674,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
>>          QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>>          nbd_export_get(client->exp);
>>          nbd_check_meta_export(client);
>> +        qio_channel_attach_aio_context(client->ioc, exp->ctx);
>>          rc = 1;
>>      }
>>      return rc;
>
> I think I would rather do this once at the end of nbd_negotiate()
> instead of duplicating it across the different way to open an export.
> During the negotiation phase, we don't start requests yet, so doing
> everything from the main thread should be fine.

OK.

> Actually, not doing everything from the main thread sounds nasty because
> I think the next QIOChannel callback could then already be executed in
> the iothread while this one hasn't completed yet. Or do we have any
> locking in place for the negotiation?

This is the first time I've looked at the NBD code, but IIUC all the
negotiation is done with synchronous nbd_[read|write]() calls, so even
if the coroutine yields due to EWOULDBLOCK, nothing else should be
making progress.

Sergio.
Kevin Wolf Sept. 12, 2019, 10:25 a.m. UTC | #8
Am 12.09.2019 um 12:13 hat Sergio Lopez geschrieben:
> 
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 11.09.2019 um 18:15 hat Sergio Lopez geschrieben:
> >> On creation, the export's AioContext is set to the same one as the
> >> BlockBackend, while the AioContext in the client QIOChannel is left
> >> untouched.
> >> 
> >> As a result, when using data-plane, nbd_client_receive_next_request()
> >> schedules coroutines in the IOThread AioContext, while the client's
> >> QIOChannel is serviced from the main_loop, potentially triggering the
> >> assertion at qio_channel_restart_[read|write].
> >> 
> >> To fix this, as soon we have the export corresponding to the client,
> >> we call qio_channel_attach_aio_context() to attach the QIOChannel
> >> context to the export's AioContext. This matches with the logic in
> >> blk_aio_attached().
> >> 
> >> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
> >> Signed-off-by: Sergio Lopez <slp@redhat.com>
> >
> > Oh, looks like I only fixed switching the AioContext after the fact, but
> > not starting the NBD server for a node that is already in a different
> > AioContext... :-/
> >
> >> diff --git a/nbd/server.c b/nbd/server.c
> >> index 10faedcfc5..51322e2343 100644
> >> --- a/nbd/server.c
> >> +++ b/nbd/server.c
> >> @@ -471,6 +471,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client,
> >>      QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
> >>      nbd_export_get(client->exp);
> >>      nbd_check_meta_export(client);
> >> +    qio_channel_attach_aio_context(client->ioc, client->exp->ctx);
> >>  
> >>      return 0;
> >>  }
> >> @@ -673,6 +674,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
> >>          QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
> >>          nbd_export_get(client->exp);
> >>          nbd_check_meta_export(client);
> >> +        qio_channel_attach_aio_context(client->ioc, exp->ctx);
> >>          rc = 1;
> >>      }
> >>      return rc;
> >
> > I think I would rather do this once at the end of nbd_negotiate()
> > instead of duplicating it across the different way to open an export.
> > During the negotiation phase, we don't start requests yet, so doing
> > everything from the main thread should be fine.
> 
> OK.
> 
> > Actually, not doing everything from the main thread sounds nasty because
> > I think the next QIOChannel callback could then already be executed in
> > the iothread while this one hasn't completed yet. Or do we have any
> > locking in place for the negotiation?
> 
> This is the first time I look at NBD code, but IIUC all the negotiation
> is done with synchronous nbd_[read|write]() calls, so even if the
> coroutine yields due to EWOULDBLOCK, nothing else should be making
> progress.

Ah, yes, you're right. We don't even have fd handlers installed if we
aren't currently waiting for the coroutine to be re-entered. So as
everything is tied to the one coroutine, this should not be a problem.

Let's avoid the duplication anyway.

Kevin
Sergio Lopez Sept. 12, 2019, 10:30 a.m. UTC | #9
Kevin Wolf <kwolf@redhat.com> writes:

> Am 11.09.2019 um 23:33 hat Eric Blake geschrieben:
>> On 9/11/19 12:21 PM, Eric Blake wrote:
>> > On 9/11/19 11:15 AM, Sergio Lopez wrote:
>> >> On creation, the export's AioContext is set to the same one as the
>> >> BlockBackend, while the AioContext in the client QIOChannel is left
>> >> untouched.
>> >>
>> >> As a result, when using data-plane, nbd_client_receive_next_request()
>> >> schedules coroutines in the IOThread AioContext, while the client's
>> >> QIOChannel is serviced from the main_loop, potentially triggering the
>> >> assertion at qio_channel_restart_[read|write].
>> >>
>> >> To fix this, as soon we have the export corresponding to the client,
>> >> we call qio_channel_attach_aio_context() to attach the QIOChannel
>> >> context to the export's AioContext. This matches with the logic in
>> >> blk_aio_attached().
>> >>
>> >> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
>> >> Signed-off-by: Sergio Lopez <slp@redhat.com>
>> >> ---
>> >>  nbd/server.c | 2 ++
>> >>  1 file changed, 2 insertions(+)
>> > 
>> > I'd like a second opinion from Kevin, but the description makes sense to
>> > me.  I'm happy to queue this through my NBD tree.
>> > 
>> > Reviewed-by: Eric Blake <eblake@redhat.com>
>> 
>> I tried to test this patch, but even with it applied, I still got an
>> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
>> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
>> domain using iothreads, with a backtrace of:
>> 
>> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
>> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
>> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
>> <__func__.15769> "qemu_mutex_unlock_impl")
>>     at util/qemu-thread-posix.c:36
>> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
>> file=0x55dd03d59041 "util/async.c",
>>     line=523) at util/qemu-thread-posix.c:96
>> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
>> util/async.c:523
>> #5  0x000055dd03ac421b in bdrv_do_drained_begin (bs=0x55dd0673a2d0,
>> recursive=false, parent=0x0,
>>     ignore_bds_parents=false, poll=true) at block/io.c:428
>> #6  0x000055dd03ac4299 in bdrv_drained_begin (bs=0x55dd0673a2d0) at
>> block/io.c:434
>> #7  0x000055dd03aafb54 in blk_drain (blk=0x55dd06a3ec40) at
>> block/block-backend.c:1605
>> #8  0x000055dd03aae054 in blk_remove_bs (blk=0x55dd06a3ec40) at
>> block/block-backend.c:800
>> #9  0x000055dd03aad54a in blk_delete (blk=0x55dd06a3ec40) at
>> block/block-backend.c:420
>> #10 0x000055dd03aad7d6 in blk_unref (blk=0x55dd06a3ec40) at
>> block/block-backend.c:475
>> #11 0x000055dd03aecb68 in nbd_export_put (exp=0x55dd0726f920) at
>> nbd/server.c:1666
>> #12 0x000055dd03aec8fe in nbd_export_close (exp=0x55dd0726f920) at
>> nbd/server.c:1616
>> #13 0x000055dd03aecbf1 in nbd_export_close_all () at nbd/server.c:1689
>> #14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
>> blockdev-nbd.c:233
>> ...
>> 
>> Does that sound familiar to what you were seeing?  Does it mean we
>> missed another spot where the context is set incorrectly?
>
> I think nbd_export_close_all() or one of the NBD functions called by it
> needs to take the AioContext lock of the associated BlockBackend.
>
> The crash is because AIO_POLL_WHILE() wants to temporarily drop the lock
> that we're not even holding.

Yes, I think locking the context during the "if (exp->blk) {" block at
nbd/server.c:1646 should do the trick.

On the other hand, I wonder if there is any situation in which calling
blk_unref() without locking the context could be safe. If there isn't
any, perhaps we should assert that the lock is held if blk->ctx != NULL
to catch this kind of bug earlier?

Sergio.
Kevin Wolf Sept. 12, 2019, 11:31 a.m. UTC | #10
Am 12.09.2019 um 12:30 hat Sergio Lopez geschrieben:
> 
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 11.09.2019 um 23:33 hat Eric Blake geschrieben:
> >> On 9/11/19 12:21 PM, Eric Blake wrote:
> >> > On 9/11/19 11:15 AM, Sergio Lopez wrote:
> >> >> On creation, the export's AioContext is set to the same one as the
> >> >> BlockBackend, while the AioContext in the client QIOChannel is left
> >> >> untouched.
> >> >>
> >> >> As a result, when using data-plane, nbd_client_receive_next_request()
> >> >> schedules coroutines in the IOThread AioContext, while the client's
> >> >> QIOChannel is serviced from the main_loop, potentially triggering the
> >> >> assertion at qio_channel_restart_[read|write].
> >> >>
> >> >> To fix this, as soon we have the export corresponding to the client,
> >> >> we call qio_channel_attach_aio_context() to attach the QIOChannel
> >> >> context to the export's AioContext. This matches with the logic in
> >> >> blk_aio_attached().
> >> >>
> >> >> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1748253
> >> >> Signed-off-by: Sergio Lopez <slp@redhat.com>
> >> >> ---
> >> >>  nbd/server.c | 2 ++
> >> >>  1 file changed, 2 insertions(+)
> >> > 
> >> > I'd like a second opinion from Kevin, but the description makes sense to
> >> > me.  I'm happy to queue this through my NBD tree.
> >> > 
> >> > Reviewed-by: Eric Blake <eblake@redhat.com>
> >> 
> >> I tried to test this patch, but even with it applied, I still got an
> >> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
> >> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
> >> domain using iothreads, with a backtrace of:
> >> 
> >> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
> >> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
> >> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
> >> <__func__.15769> "qemu_mutex_unlock_impl")
> >>     at util/qemu-thread-posix.c:36
> >> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
> >> file=0x55dd03d59041 "util/async.c",
> >>     line=523) at util/qemu-thread-posix.c:96
> >> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
> >> util/async.c:523
> >> #5  0x000055dd03ac421b in bdrv_do_drained_begin (bs=0x55dd0673a2d0,
> >> recursive=false, parent=0x0,
> >>     ignore_bds_parents=false, poll=true) at block/io.c:428
> >> #6  0x000055dd03ac4299 in bdrv_drained_begin (bs=0x55dd0673a2d0) at
> >> block/io.c:434
> >> #7  0x000055dd03aafb54 in blk_drain (blk=0x55dd06a3ec40) at
> >> block/block-backend.c:1605
> >> #8  0x000055dd03aae054 in blk_remove_bs (blk=0x55dd06a3ec40) at
> >> block/block-backend.c:800
> >> #9  0x000055dd03aad54a in blk_delete (blk=0x55dd06a3ec40) at
> >> block/block-backend.c:420
> >> #10 0x000055dd03aad7d6 in blk_unref (blk=0x55dd06a3ec40) at
> >> block/block-backend.c:475
> >> #11 0x000055dd03aecb68 in nbd_export_put (exp=0x55dd0726f920) at
> >> nbd/server.c:1666
> >> #12 0x000055dd03aec8fe in nbd_export_close (exp=0x55dd0726f920) at
> >> nbd/server.c:1616
> >> #13 0x000055dd03aecbf1 in nbd_export_close_all () at nbd/server.c:1689
> >> #14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
> >> blockdev-nbd.c:233
> >> ...
> >> 
> >> Does that sound familiar to what you were seeing?  Does it mean we
> >> missed another spot where the context is set incorrectly?
> >
> > I think nbd_export_close_all() or one of the NBD functions called by it
> > needs to take the AioContext lock of the associated BlockBackend.
> >
> > The crash is because AIO_POLL_WHILE() wants to temporarily drop the lock
> > that we're not even holding.
> 
> Yes, I think locking the context during the "if (exp->blk) {" block at
> nbd/server.c:1646 should do the trick.

We need to be careful to avoid locking things twice, so maybe
nbd_export_put() is already too deep inside the NBD server.

Its callers are:

* qmp_nbd_server_add(). Like all other QMP handlers in blockdev-nbd.c it
  neglects to lock the AioContext, but it should do so. The lock is not
  only needed for the nbd_export_put() call, but even before.

* nbd_export_close(), which in turn is called from:
    * nbd_eject_notifier(): run in the main thread, not locked
    * nbd_export_remove():
        * qmp_nbd_server_remove(): see above
    * nbd_export_close_all():
        * bdrv_close_all()
        * qmp_nbd_server_stop()

There are also calls from qemu-nbd, but these can be ignored because we
don't have iothreads there.

I think the cleanest would be to take the lock in the outermost callers,
i.e. all QMP handlers that deal with a specific export, in the eject
notifier and in nbd_export_close_all().
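
In a QMP handler that would look roughly like this (sketch only;
NBDExport is opaque outside nbd/server.c, so the nbd_export_aio_context()
accessor is hypothetical and would have to be added; name, mode and
errp are the handler's own parameters):

    NBDExport *exp = nbd_export_find(name);
    AioContext *aio_context;

    if (!exp) {
        error_setg(errp, "Export '%s' is not found", name);
        return;
    }

    /* hypothetical accessor: exp->ctx is private to nbd/server.c */
    aio_context = nbd_export_aio_context(exp);
    aio_context_acquire(aio_context);
    nbd_export_remove(exp, mode, errp);
    aio_context_release(aio_context);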

> On the other hand, I wonder if there is any situation in which calling
> to blk_unref() without locking the context could be safe. If there isn't
> any, perhaps we should assert that the lock is held if blk->ctx != NULL
> to catch this kind of bugs earlier?

blk_unref() must be called from the main thread, and if the BlockBackend
to be unreferenced is not in the main AioContext, the lock must be held.

I'm not sure how to assert that locks are held, though. I once looked
for a way to do this, but it involved either looking at the internal
state of pthreads mutexes or hacking up QemuMutex with debug state.

Kevin
Eric Blake Sept. 16, 2019, 9:11 p.m. UTC | #11
On 9/12/19 1:37 AM, Sergio Lopez wrote:

>> I tried to test this patch, but even with it applied, I still got an
>> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
>> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
>> domain using iothreads, with a backtrace of:
>>
>> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
>> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
>> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
>> <__func__.15769> "qemu_mutex_unlock_impl")
>>     at util/qemu-thread-posix.c:36
>> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
>> file=0x55dd03d59041 "util/async.c",
>>     line=523) at util/qemu-thread-posix.c:96
>> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
>> util/async.c:523

>> #14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
>> blockdev-nbd.c:233
>> ...

Sorry for truncating the initial stackdump report. The rest of the trace
(it is definitely in the main loop):

#15 0x0000560be491c910 in qmp_marshal_nbd_server_stop
(args=0x560be54c4d00, ret=0x7ffdd832de38,
    errp=0x7ffdd832de30) at qapi/qapi-commands-block.c:318
#16 0x0000560be4a7a306 in do_qmp_dispatch (cmds=0x560be50dc1f0
<qmp_commands>, request=0x7fbcac009af0,
    allow_oob=false, errp=0x7ffdd832ded8) at qapi/qmp-dispatch.c:131
#17 0x0000560be4a7a507 in qmp_dispatch (cmds=0x560be50dc1f0
<qmp_commands>, request=0x7fbcac009af0,
    allow_oob=false) at qapi/qmp-dispatch.c:174
#18 0x0000560be48edd81 in monitor_qmp_dispatch (mon=0x560be55d6670,
req=0x7fbcac009af0) at monitor/qmp.c:120
#19 0x0000560be48ee116 in monitor_qmp_bh_dispatcher (data=0x0) at
monitor/qmp.c:209
#20 0x0000560be4ad16a2 in aio_bh_call (bh=0x560be53dbe90) at util/async.c:89
#21 0x0000560be4ad173a in aio_bh_poll (ctx=0x560be53daba0) at
util/async.c:117
#22 0x0000560be4ad6514 in aio_dispatch (ctx=0x560be53daba0) at
util/aio-posix.c:459
#23 0x0000560be4ad1ad3 in aio_ctx_dispatch (source=0x560be53daba0,
callback=0x0, user_data=0x0) at util/async.c:260
#24 0x00007fbcd7083ecd in g_main_context_dispatch () from
target:/lib64/libglib-2.0.so.0
#25 0x0000560be4ad4e47 in glib_pollfds_poll () at util/main-loop.c:218
#26 0x0000560be4ad4ec1 in os_host_main_loop_wait (timeout=1000000000) at
util/main-loop.c:241
#27 0x0000560be4ad4fc6 in main_loop_wait (nonblocking=0) at
util/main-loop.c:517
#28 0x0000560be4691266 in main_loop () at vl.c:1806
#29 0x0000560be46988a9 in main (argc=112, argv=0x7ffdd832e4e8,
envp=0x7ffdd832e870) at vl.c:4488


>>
>> Does that sound familiar to what you were seeing?  Does it mean we
>> missed another spot where the context is set incorrectly?
> 
> It looks like it was trying to release the AioContext while it was still
> held by some other thread. Is this stacktrace from the main thread or an
> iothread? What was the other one doing?

Kevin had some ideas on what it might be; I'm playing with obtaining the
context in the spots he pointed out.

> 
>> I'm happy to work with you on IRC for more real-time debugging of this
>> (I'm woefully behind on understanding how aio contexts are supposed to
>> work).
> 
> I must be missing some step, because I can't reproduce this one
> here. I've tried both with an idle NDB server and one with a client
> generating I/O. Is it reproducible 100% of them time?

Yes, with iothreads.  I took some time today to boil it down to
something that does not require libvirt:

$ file myfile
myfile: QEMU QCOW2 Image (v3), 104857600 bytes
$ qemu-img create -f qcow2 -o backing_file=myfile,backing_fmt=qcow2  \
 myfile.wrap
Formatting 'myfile.wrap', fmt=qcow2 size=104857600 backing_file=myfile
backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults \
  -name tmp,debug-threads=on -machine pc-q35-3.1,accel=kvm \
  -object iothread,id=iothread1 \
  -drive
file=myfile,format=qcow2,if=none,id=drive-virtio-disk0,node-name=n \
  -device
virtio-blk-pci,iothread=iothread1,drive=drive-virtio-disk0,id=virtio-disk0 \
  -qmp stdio -nographic
{'execute':'qmp_capabilities'}
{'execute':'nbd-server-start','arguments':{'addr':{'type':'inet',
  'data':{'host':'localhost','port':'10809'}}}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2',
 'node-name':'t','file':{'driver':'file',
  'filename':'myfile.wrap'},'backing':'n'}}
{'execute':'blockdev-backup','arguments':{'device':'n',
 'target':'t','sync':'none','job-id':'b'}}
{'execute':'nbd-server-add','arguments':{'device':'t','name':'t'}}
{'execute':'nbd-server-remove','arguments':{'name':'t'}}
Aborted (core dumped)

I'm now playing with Kevin's idea of grabbing the AioContext around the
NBD unref.
Eric Blake Sept. 16, 2019, 10:30 p.m. UTC | #12
On 9/12/19 6:31 AM, Kevin Wolf wrote:

>>
>> Yes, I think locking the context during the "if (exp->blk) {" block at
>> nbd/server.c:1646 should do the trick.

That line number has moved over time; which function are you referring to?

> 
> We need to be careful to avoid locking things twice, so maybe
> nbd_export_put() is already too deep inside the NBD server.
> 
> Its callers are:
> 
> * qmp_nbd_server_add(). Like all other QMP handlers in blockdev-nbd.c it
>   neglects to lock the AioContext, but it should do so. The lock is not
>   only needed for the nbd_export_put() call, but even before.
> 
> * nbd_export_close(), which in turn is called from:
>     * nbd_eject_notifier(): run in the main thread, not locked
>     * nbd_export_remove():
>         * qmp_nbd_server_remove(): see above
>     * nbd_export_close_all():
>         * bdrv_close_all()
>         * qmp_nbd_server_stop()

Even weirder: nbd_export_put() calls nbd_export_close(), and
nbd_export_close() calls nbd_export_put().  The mutual recursion is
mind-numbing, and the fact that we use get/put instead of ref/unref like
most other qemu code is not making it any easier to reason about.

> 
> There are also calls from qemu-nbd, but these can be ignored because we
> don't have iothreads there.
> 
> I think the cleanest would be to take the lock in the outermost callers,
> i.e. all QMP handlers that deal with a specific export, in the eject
> notifier and in nbd_export_close_all().

Okay, I'm trying that (I already tried grabbing the aio_context in
nbd_export_close(), but as you predicted, that deadlocked when a nested
call already encountered the lock taken from an outer call).

> 
>> On the other hand, I wonder if there is any situation in which calling
>> to blk_unref() without locking the context could be safe. If there isn't
>> any, perhaps we should assert that the lock is held if blk->ctx != NULL
>> to catch this kind of bugs earlier?
> 
> blk_unref() must be called from the main thread, and if the BlockBackend
> to be unreferenced is not in the main AioContext, the lock must be held.
> 
> I'm not sure how to assert that locks are held, though. I once looked
> for a way to do this, but it involved either looking at the internal
> state of pthreads mutexes or hacking up QemuMutex with debug state.

Even if we can only test that in a debug build but not during normal
builds, could any of our CI builds set up that configuration?
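
If we went the route of hacking debug state into the mutex, I imagine
it would look something like this (purely hypothetical sketch: neither
the owner tracking nor owned_mutex_assert_locked() exist today):

    /* Wrap QemuMutex with ownership tracking for debug builds. */
    typedef struct {
        QemuMutex lock;
        QemuThread owner;
        bool owned;
    } QemuOwnedMutex;

    static void owned_mutex_lock(QemuOwnedMutex *m)
    {
        qemu_mutex_lock(&m->lock);
        qemu_thread_get_self(&m->owner);
        m->owned = true;
    }

    static void owned_mutex_unlock(QemuOwnedMutex *m)
    {
        m->owned = false;
        qemu_mutex_unlock(&m->lock);
    }

    static void owned_mutex_assert_locked(QemuOwnedMutex *m)
    {
        assert(m->owned && qemu_thread_is_self(&m->owner));
    }

Whether that is worth wiring into AioContext (which sits on a recursive
mutex underneath) is another question.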

Patch

diff --git a/nbd/server.c b/nbd/server.c
index 10faedcfc5..51322e2343 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -471,6 +471,7 @@  static int nbd_negotiate_handle_export_name(NBDClient *client,
     QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
     nbd_export_get(client->exp);
     nbd_check_meta_export(client);
+    qio_channel_attach_aio_context(client->ioc, client->exp->ctx);
 
     return 0;
 }
@@ -673,6 +674,7 @@  static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
         QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
         nbd_export_get(client->exp);
         nbd_check_meta_export(client);
+        qio_channel_attach_aio_context(client->ioc, exp->ctx);
         rc = 1;
     }
     return rc;