[for,2.6,1/1] nbd: fix assert() on qemu-nbd stop

Message ID 1460629215-11567-1-git-send-email-den@openvz.org
State New

Commit Message

Denis V. Lunev April 14, 2016, 10:20 a.m. UTC
From: Pavel Butsykin <pbutsykin@virtuozzo.com>

From time to time qemu-nbd crashes on the following assert:
    assert(state == TERMINATING);
    nbd_export_closed
    nbd_export_put
    main
and the state at the moment of the crash is TERMINATE.

During client shutdown, the nbd_client_thread thread sends SIGTERM and
the main thread calls the nbd_client_closed callback. If the SIGTERM
handler runs after the state has already been changed to TERMINATING,
it sets the state back to TERMINATE.

To solve the issue, change the state to TERMINATE only if the state is
RUNNING. Otherwise we are already shutting down.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu-nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paolo Bonzini April 14, 2016, 1:23 p.m. UTC | #1
On 14/04/2016 12:20, Denis V. Lunev wrote:
> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
> 
> From time to time qemu-nbd is crashing on the following assert:
>     assert(state == TERMINATING);
>     nbd_export_closed
>     nbd_export_put
>     main
> and the state at the moment of the crash is evaluated to TERMINATE.
> 
> During shutdown process of the client the nbd_client_thread thread sends
> SIGTERM signal and the main thread calls the nbd_client_closed callback.
> If the SIGTERM callback will be executed after change the state to
> TERMINATING, then the state will once again be TERMINATE.
> 
> To solve the issue, we must change the state to TERMINATE only if the state
> is RUNNING. In the other case we are shutting down already.
> 
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qemu-nbd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index ca4a724..956013f 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -213,7 +213,7 @@ static int find_partition(BlockBackend *blk, int partition,
>  
>  static void termsig_handler(int signum)
>  {
> -    state = TERMINATE;
> +    atomic_cmpxchg(&state, RUNNING, TERMINATE);

Just a simple "if" is enough (the rest of the file is not using any of
atomic{mb_,}{read,set}).

I'm not able to send further pull requests for 2.6, can the block
maintainers help?

Paolo

>      qemu_notify_event();
>  }
>  
>
Denis V. Lunev April 14, 2016, 1:40 p.m. UTC | #2
On 04/14/2016 04:23 PM, Paolo Bonzini wrote:
>
> On 14/04/2016 12:20, Denis V. Lunev wrote:
>> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>>
>>  From time to time qemu-nbd is crashing on the following assert:
>>      assert(state == TERMINATING);
>>      nbd_export_closed
>>      nbd_export_put
>>      main
>> and the state at the moment of the crash is evaluated to TERMINATE.
>>
>> During shutdown process of the client the nbd_client_thread thread sends
>> SIGTERM signal and the main thread calls the nbd_client_closed callback.
>> If the SIGTERM callback will be executed after change the state to
>> TERMINATING, then the state will once again be TERMINATE.
>>
>> To solve the issue, we must change the state to TERMINATE only if the state
>> is RUNNING. In the other case we are shutting down already.
>>
>> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   qemu-nbd.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index ca4a724..956013f 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -213,7 +213,7 @@ static int find_partition(BlockBackend *blk, int partition,
>>   
>>   static void termsig_handler(int signum)
>>   {
>> -    state = TERMINATE;
>> +    atomic_cmpxchg(&state, RUNNING, TERMINATE);
> Just a simple "if" is enough (the rest of the file is not using any of
> atomic{mb_,}{read,set}.
>
> I'm not able to send further pull requests for 2.6, can the block
> maintainers help?
>
> Paolo

Unfortunately, a plain if () would not be enough. The race is with a
different thread, which can run on another CPU. Thus this operation must
be atomic, or some locking is required.

Den

P.S. Added more block guys...
Paolo Bonzini April 14, 2016, 2:16 p.m. UTC | #3
On 14/04/2016 15:40, Denis V. Lunev wrote:
> On 04/14/2016 04:23 PM, Paolo Bonzini wrote:
>>
>> On 14/04/2016 12:20, Denis V. Lunev wrote:
>>> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
>>>
>>>  From time to time qemu-nbd is crashing on the following assert:
>>>      assert(state == TERMINATING);
>>>      nbd_export_closed
>>>      nbd_export_put
>>>      main
>>> and the state at the moment of the crash is evaluated to TERMINATE.
>>>
>>> During shutdown process of the client the nbd_client_thread thread sends
>>> SIGTERM signal and the main thread calls the nbd_client_closed callback.
>>> If the SIGTERM callback will be executed after change the state to
>>> TERMINATING, then the state will once again be TERMINATE.
>>>
>>> To solve the issue, we must change the state to TERMINATE only if the
>>> state
>>> is RUNNING. In the other case we are shutting down already.
>>>
>>> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>   qemu-nbd.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>>> index ca4a724..956013f 100644
>>> --- a/qemu-nbd.c
>>> +++ b/qemu-nbd.c
>>> @@ -213,7 +213,7 @@ static int find_partition(BlockBackend *blk, int
>>> partition,
>>>     static void termsig_handler(int signum)
>>>   {
>>> -    state = TERMINATE;
>>> +    atomic_cmpxchg(&state, RUNNING, TERMINATE);
>> Just a simple "if" is enough (the rest of the file is not using any of
>> atomic{mb_,}{read,set}.
>>
>> I'm not able to send further pull requests for 2.6, can the block
>> maintainers help?
>>
>> Paolo
> 
> unfortunately, if () would be not enough. The race is with different
> thread which can run on the another CPU. Thus this OP should be
> atomic or some locking is required.

Oh, qemu-nbd doesn't use qemu_thread_create (which ensures that all
signals are blocked in every thread except the main one)!

Patch is okay then (though we certainly want to revisit it in 2.7).

Paolo

> 
> Den
> 
> P.S. Added more block guys...
> 
>
Max Reitz April 14, 2016, 9:41 p.m. UTC | #4
On 14.04.2016 12:20, Denis V. Lunev wrote:
> From: Pavel Butsykin <pbutsykin@virtuozzo.com>
> 
> From time to time qemu-nbd is crashing on the following assert:
>     assert(state == TERMINATING);
>     nbd_export_closed
>     nbd_export_put
>     main
> and the state at the moment of the crash is evaluated to TERMINATE.
> 
> During shutdown process of the client the nbd_client_thread thread sends
> SIGTERM signal and the main thread calls the nbd_client_closed callback.
> If the SIGTERM callback will be executed after change the state to
> TERMINATING, then the state will once again be TERMINATE.
> 
> To solve the issue, we must change the state to TERMINATE only if the state
> is RUNNING. In the other case we are shutting down already.
> 
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qemu-nbd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks Pavel and Denis, I have applied the patch to my block tree:

https://github.com/XanClic/qemu/commits/block

Max
Patch

diff --git a/qemu-nbd.c b/qemu-nbd.c
index ca4a724..956013f 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -213,7 +213,7 @@ static int find_partition(BlockBackend *blk, int partition,
 
 static void termsig_handler(int signum)
 {
-    state = TERMINATE;
+    atomic_cmpxchg(&state, RUNNING, TERMINATE);
     qemu_notify_event();
 }
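
For readers without the QEMU tree at hand, the guarded transition discussed
in this thread can be sketched in standalone C11. This is only a sketch under
stated assumptions, not qemu-nbd's code: it uses <stdatomic.h> in place of
QEMU's atomic_cmpxchg macro, the TERMINATED value is assumed, and the helper
names request_terminate, begin_shutdown and export_closed are hypothetical
stand-ins for the signal handler, the main loop and nbd_export_closed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed state set; RUNNING, TERMINATE and TERMINATING come from the thread,
 * TERMINATED is an assumption made for this sketch. */
enum nbd_state { RUNNING, TERMINATE, TERMINATING, TERMINATED };

static _Atomic int state = RUNNING;

/* Hypothetical stand-in for termsig_handler(): request termination only if
 * the state is still RUNNING. A late SIGTERM leaves any other state alone. */
static void request_terminate(void)
{
    int expected = RUNNING;
    atomic_compare_exchange_strong(&state, &expected, TERMINATE);
}

/* Hypothetical stand-in for the main loop noticing the request and starting
 * shutdown. Returns nonzero if the TERMINATE -> TERMINATING step was taken. */
static int begin_shutdown(void)
{
    int expected = TERMINATE;
    return atomic_compare_exchange_strong(&state, &expected, TERMINATING);
}

/* Hypothetical stand-in for nbd_export_closed(): carries the assertion from
 * the crash report, then marks shutdown as finished. */
static void export_closed(void)
{
    assert(atomic_load(&state) == TERMINATING);
    atomic_store(&state, TERMINATED);
}
```

With an unconditional "state = TERMINATE" in request_terminate(), a call that
arrives between begin_shutdown() and export_closed() would reset the state and
trip the assertion. With the compare-and-swap, that late call is a no-op.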