diff mbox series

[RFC,v3,4/7] migration: fix the compression code

Message ID 20180919133523.13351-5-fli@suse.com
State New
Headers show
Series qemu_thread_create: propagate errors to callers to check | expand

Commit Message

Fei Li Sept. 19, 2018, 1:35 p.m. UTC
Add judgement in compress_threads_save_cleanup() to check whether the
static CompressParam *comp_param has been allocated. If not, just
return; or else Segmentation fault will occur when using the
NULL comp_param's parameters in terminate_compression_threads().
One test case can reproduce this error is: set the compression on
and migrate to a wrong nonexistent host IP address.

Add judgement before handling comp_param[idx]'s quit and cond in
terminate_compression_threads(), in case they are not initialized.
Or else "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed."
will occur. One test case can reproduce this error is: set the
compression on and fail to fully setup the eight compression thread
in compress_threads_save_setup().

Signed-off-by: Fei Li <fli@suse.com>
---
 migration/ram.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Peter Xu Sept. 20, 2018, 4:31 a.m. UTC | #1
On Wed, Sep 19, 2018 at 09:35:20PM +0800, Fei Li wrote:
> Add judgement in compress_threads_save_cleanup() to check whether the
> static CompressParam *comp_param has been allocated. If not, just
> return; or else Segmentation fault will occur when using the
> NULL comp_param's parameters in terminate_compression_threads().
> One test case can reproduce this error is: set the compression on
> and migrate to a wrong nonexistent host IP address.
> 
> Add judgement before handling comp_param[idx]'s quit and cond in
> terminate_compression_threads(), in case they are not initialized.
> Or else "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed."
> will occur. One test case can reproduce this error is: set the
> compression on and fail to fully setup the eight compression thread
> in compress_threads_save_setup().
> 
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  migration/ram.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 79c89425a3..522a5550b4 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -427,6 +427,9 @@ static inline void terminate_compression_threads(void)
>      thread_count = migrate_compress_threads();
>  
>      for (idx = 0; idx < thread_count; idx++) {
> +        if (!comp_param[idx].mutex.initialized) {
> +            break;
> +        }

Instead of accessing the mutex internal variable "initialized", how
about we just squash the terminate_compression_threads() into existing
compress_threads_save_cleanup()?  Note that we have this in
compress_threads_save_cleanup() already:

    for (i = 0; i < thread_count; i++) {
        /*
         * we use it as a indicator which shows if the thread is
         * properly init'd or not
         */
        if (!comp_param[i].file) {
            break;
        }
        qemu_thread_join(compress_threads + i);
        qemu_mutex_destroy(&comp_param[i].mutex);
        qemu_cond_destroy(&comp_param[i].cond);
        deflateEnd(&comp_param[i].stream);
        g_free(comp_param[i].originbuf);
        qemu_fclose(comp_param[i].file);
        comp_param[i].file = NULL;
    }

So we already try to detect this using the comp_param[i].file, which
IMO is better (the exposure of the mutex.init var is just "an
accident" - logically we could hide that from mutex impl).

>          qemu_mutex_lock(&comp_param[idx].mutex);
>          comp_param[idx].quit = true;
>          qemu_cond_signal(&comp_param[idx].cond);
> @@ -438,7 +441,7 @@ static void compress_threads_save_cleanup(void)
>  {
>      int i, thread_count;
>  
> -    if (!migrate_use_compression()) {
> +    if (!migrate_use_compression() || !comp_param) {

This should be a valid fix for a crash (since we might call the
cleanup code even without setup when connect() never worked, so we'd
better handle it well).

Considering that this seems to fix a migration crash on source, I'm
CCing Dave and Juan in case this (or a new version) could even be a
good candidate as part of a migration pull.

Thanks,

>          return;
>      }
>      terminate_compression_threads();
> -- 
> 2.13.7
> 
>
Fei Li Sept. 20, 2018, 5:06 a.m. UTC | #2
On 09/20/2018 12:31 PM, Peter Xu wrote:
> On Wed, Sep 19, 2018 at 09:35:20PM +0800, Fei Li wrote:
>> Add judgement in compress_threads_save_cleanup() to check whether the
>> static CompressParam *comp_param has been allocated. If not, just
>> return; or else Segmentation fault will occur when using the
>> NULL comp_param's parameters in terminate_compression_threads().
>> One test case can reproduce this error is: set the compression on
>> and migrate to a wrong nonexistent host IP address.
>>
>> Add judgement before handling comp_param[idx]'s quit and cond in
>> terminate_compression_threads(), in case they are not initialized.
>> Or else "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed."
>> will occur. One test case can reproduce this error is: set the
>> compression on and fail to fully setup the eight compression thread
>> in compress_threads_save_setup().
>>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>>   migration/ram.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 79c89425a3..522a5550b4 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -427,6 +427,9 @@ static inline void terminate_compression_threads(void)
>>       thread_count = migrate_compress_threads();
>>   
>>       for (idx = 0; idx < thread_count; idx++) {
>> +        if (!comp_param[idx].mutex.initialized) {
>> +            break;
>> +        }
> Instead of accessing the mutex internal variable "initialized", how
> about we just squash the terminate_compression_threads() into existing
> compress_threads_save_cleanup()?  Note that we have this in
> compress_threads_save_cleanup() already:
>
>      for (i = 0; i < thread_count; i++) {
>          /*
>           * we use it as a indicator which shows if the thread is
>           * properly init'd or not
>           */
>          if (!comp_param[i].file) {
>              break;
>          }
>          qemu_thread_join(compress_threads + i);
>          qemu_mutex_destroy(&comp_param[i].mutex);
>          qemu_cond_destroy(&comp_param[i].cond);
>          deflateEnd(&comp_param[i].stream);
>          g_free(comp_param[i].originbuf);
>          qemu_fclose(comp_param[i].file);
>          comp_param[i].file = NULL;
>      }
>
> So we already try to detect this using the comp_param[i].file, which
> IMO is better (the exposure of the mutex.init var is just "an
> accident" - logically we could hide that from mutex impl).
Actually, I firstly use the comp_param[i].file as the judging condition, but
I am not sure whether the comp_param[i].file is also needed to be protected
by the mutex lock..
And I have no objection to the squash.
>
>>           qemu_mutex_lock(&comp_param[idx].mutex);
>>           comp_param[idx].quit = true;
>>           qemu_cond_signal(&comp_param[idx].cond);
>> @@ -438,7 +441,7 @@ static void compress_threads_save_cleanup(void)
>>   {
>>       int i, thread_count;
>>   
>> -    if (!migrate_use_compression()) {
>> +    if (!migrate_use_compression() || !comp_param) {
> This should be a valid fix for a crash (since we might call the
> cleanup code even without setup when connect() never worked, so we'd
> better handle it well).
>
> Considering that this seems to fix a migration crash on source, I'm
> CCing Dave and Juan in case this (or a new version) could even be a
> good candidate as part of a migration pull.
>
> Thanks,
Thanks for helping cc. ;)

Have a nice day
Fei
>
>>           return;
>>       }
>>       terminate_compression_threads();
>> -- 
>> 2.13.7
>>
>>
Peter Xu Sept. 20, 2018, 5:33 a.m. UTC | #3
On Thu, Sep 20, 2018 at 01:06:21PM +0800, Fei Li wrote:
> 
> 
> On 09/20/2018 12:31 PM, Peter Xu wrote:
> > On Wed, Sep 19, 2018 at 09:35:20PM +0800, Fei Li wrote:
> > > Add judgement in compress_threads_save_cleanup() to check whether the
> > > static CompressParam *comp_param has been allocated. If not, just
> > > return; or else Segmentation fault will occur when using the
> > > NULL comp_param's parameters in terminate_compression_threads().
> > > One test case can reproduce this error is: set the compression on
> > > and migrate to a wrong nonexistent host IP address.
> > > 
> > > Add judgement before handling comp_param[idx]'s quit and cond in
> > > terminate_compression_threads(), in case they are not initialized.
> > > Or else "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed."
> > > will occur. One test case can reproduce this error is: set the
> > > compression on and fail to fully setup the eight compression thread
> > > in compress_threads_save_setup().
> > > 
> > > Signed-off-by: Fei Li <fli@suse.com>
> > > ---
> > >   migration/ram.c | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 79c89425a3..522a5550b4 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -427,6 +427,9 @@ static inline void terminate_compression_threads(void)
> > >       thread_count = migrate_compress_threads();
> > >       for (idx = 0; idx < thread_count; idx++) {
> > > +        if (!comp_param[idx].mutex.initialized) {
> > > +            break;
> > > +        }
> > Instead of accessing the mutex internal variable "initialized", how
> > about we just squash the terminate_compression_threads() into existing
> > compress_threads_save_cleanup()?  Note that we have this in
> > compress_threads_save_cleanup() already:
> > 
> >      for (i = 0; i < thread_count; i++) {
> >          /*
> >           * we use it as a indicator which shows if the thread is
> >           * properly init'd or not
> >           */
> >          if (!comp_param[i].file) {
> >              break;
> >          }
> >          qemu_thread_join(compress_threads + i);
> >          qemu_mutex_destroy(&comp_param[i].mutex);
> >          qemu_cond_destroy(&comp_param[i].cond);
> >          deflateEnd(&comp_param[i].stream);
> >          g_free(comp_param[i].originbuf);
> >          qemu_fclose(comp_param[i].file);
> >          comp_param[i].file = NULL;
> >      }
> > 
> > So we already try to detect this using the comp_param[i].file, which
> > IMO is better (the exposure of the mutex.init var is just "an
> > accident" - logically we could hide that from mutex impl).
> Actually, I firstly use the comp_param[i].file as the judging condition, but
> I am not sure whether the comp_param[i].file is also needed to be protected
> by the mutex lock..
> And I have no objection to the squash.

It should be fine to only check non-zero.  This is slightly tricky so
we have had some comment there which clarifies it a bit.

> > 
> > >           qemu_mutex_lock(&comp_param[idx].mutex);
> > >           comp_param[idx].quit = true;
> > >           qemu_cond_signal(&comp_param[idx].cond);
> > > @@ -438,7 +441,7 @@ static void compress_threads_save_cleanup(void)
> > >   {
> > >       int i, thread_count;
> > > -    if (!migrate_use_compression()) {
> > > +    if (!migrate_use_compression() || !comp_param) {
> > This should be a valid fix for a crash (since we might call the
> > cleanup code even without setup when connect() never worked, so we'd
> > better handle it well).
> > 
> > Considering that this seems to fix a migration crash on source, I'm
> > CCing Dave and Juan in case this (or a new version) could even be a
> > good candidate as part of a migration pull.
> > 
> > Thanks,
> Thanks for helping cc. ;)

No problem.  If you're gonna post another version, feel free to CC
Dave and Juan on this patch, and you can post this as a standalone one
since it's not directly related with the series and it fixes an even
more severe issue on migration side (source VM will data-loss if this
is triggerred; and it triggers easily).

Regards,
Fei Li Sept. 22, 2018, 8:28 a.m. UTC | #4
Sorry for the late reply.


On 09/20/2018 01:33 PM, Peter Xu wrote:
> On Thu, Sep 20, 2018 at 01:06:21PM +0800, Fei Li wrote:
>>
>> On 09/20/2018 12:31 PM, Peter Xu wrote:
>>> On Wed, Sep 19, 2018 at 09:35:20PM +0800, Fei Li wrote:
>>>> Add judgement in compress_threads_save_cleanup() to check whether the
>>>> static CompressParam *comp_param has been allocated. If not, just
>>>> return; or else Segmentation fault will occur when using the
>>>> NULL comp_param's parameters in terminate_compression_threads().
>>>> One test case can reproduce this error is: set the compression on
>>>> and migrate to a wrong nonexistent host IP address.
>>>>
>>>> Add judgement before handling comp_param[idx]'s quit and cond in
>>>> terminate_compression_threads(), in case they are not initialized.
>>>> Or else "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed."
>>>> will occur. One test case can reproduce this error is: set the
>>>> compression on and fail to fully setup the eight compression thread
>>>> in compress_threads_save_setup().
>>>>
>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>> ---
>>>>    migration/ram.c | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>> index 79c89425a3..522a5550b4 100644
>>>> --- a/migration/ram.c
>>>> +++ b/migration/ram.c
>>>> @@ -427,6 +427,9 @@ static inline void terminate_compression_threads(void)
>>>>        thread_count = migrate_compress_threads();
>>>>        for (idx = 0; idx < thread_count; idx++) {
>>>> +        if (!comp_param[idx].mutex.initialized) {
>>>> +            break;
>>>> +        }
>>> Instead of accessing the mutex internal variable "initialized", how
>>> about we just squash the terminate_compression_threads() into existing
>>> compress_threads_save_cleanup()?  Note that we have this in
>>> compress_threads_save_cleanup() already:
>>>
>>>       for (i = 0; i < thread_count; i++) {
>>>           /*
>>>            * we use it as a indicator which shows if the thread is
>>>            * properly init'd or not
>>>            */
>>>           if (!comp_param[i].file) {
>>>               break;
>>>           }
>>>           qemu_thread_join(compress_threads + i);
>>>           qemu_mutex_destroy(&comp_param[i].mutex);
>>>           qemu_cond_destroy(&comp_param[i].cond);
>>>           deflateEnd(&comp_param[i].stream);
>>>           g_free(comp_param[i].originbuf);
>>>           qemu_fclose(comp_param[i].file);
>>>           comp_param[i].file = NULL;
>>>       }
>>>
>>> So we already try to detect this using the comp_param[i].file, which
>>> IMO is better (the exposure of the mutex.init var is just "an
>>> accident" - logically we could hide that from mutex impl).
>> Actually, I firstly use the comp_param[i].file as the judging condition, but
>> I am not sure whether the comp_param[i].file is also needed to be protected
>> by the mutex lock..
>> And I have no objection to the squash.
> It should be fine to only check non-zero.  This is slightly tricky so
> we have had some comment there which clarifies it a bit.
Ok, I will do the squash.
>
>>>>            qemu_mutex_lock(&comp_param[idx].mutex);
>>>>            comp_param[idx].quit = true;
>>>>            qemu_cond_signal(&comp_param[idx].cond);
>>>> @@ -438,7 +441,7 @@ static void compress_threads_save_cleanup(void)
>>>>    {
>>>>        int i, thread_count;
>>>> -    if (!migrate_use_compression()) {
>>>> +    if (!migrate_use_compression() || !comp_param) {
>>> This should be a valid fix for a crash (since we might call the
>>> cleanup code even without setup when connect() never worked, so we'd
>>> better handle it well).
>>>
>>> Considering that this seems to fix a migration crash on source, I'm
>>> CCing Dave and Juan in case this (or a new version) could even be a
>>> good candidate as part of a migration pull.
>>>
>>> Thanks,
>> Thanks for helping cc. ;)
> No problem.  If you're gonna post another version, feel free to CC
> Dave and Juan on this patch, and you can post this as a standalone one
> since it's not directly related with the series and it fixes an even
> more severe issue on migration side (source VM will data-loss if this
> is triggerred; and it triggers easily).
>
> Regards,
>
Thanks for the suggestion! I'd like to post a standalone patch then. :)

Have a nice day
Fei
diff mbox series

Patch

diff --git a/migration/ram.c b/migration/ram.c
index 79c89425a3..522a5550b4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -427,6 +427,9 @@  static inline void terminate_compression_threads(void)
     thread_count = migrate_compress_threads();
 
     for (idx = 0; idx < thread_count; idx++) {
+        if (!comp_param[idx].mutex.initialized) {
+            break;
+        }
         qemu_mutex_lock(&comp_param[idx].mutex);
         comp_param[idx].quit = true;
         qemu_cond_signal(&comp_param[idx].cond);
@@ -438,7 +441,7 @@  static void compress_threads_save_cleanup(void)
 {
     int i, thread_count;
 
-    if (!migrate_use_compression()) {
+    if (!migrate_use_compression() || !comp_param) {
         return;
     }
     terminate_compression_threads();