diff mbox

cpus: make qemu_mutex_iothread_locked() understand co-routines

Message ID 20161021115442.18420-1-alex.bennee@linaro.org
State New
Headers show

Commit Message

Alex Bennée Oct. 21, 2016, 11:54 a.m. UTC
There is a slight wart when checking for the state of the BQL when using
GThread base co-routines (which we keep for ThreadSanitizer runs). While
the main-loop holds the BQL it is suspended until the co-routine
completes however the co-routines run in a separate thread so checking
the TLS variable could be wrong.

We fix this by expanding the check to include qemu_in_coroutine() for
GThread based builds. As it is not used for production builds I'm not
overly worried about any performance impact which should be negligible
anyway.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
---
 configure |  3 +++
 cpus.c    | 13 +++++++++++++
 2 files changed, 16 insertions(+)

Comments

Paolo Bonzini Nov. 1, 2016, 3:04 p.m. UTC | #1
On 21/10/2016 13:54, Alex Bennée wrote:
> There is a slight wart when checking for the state of the BQL when using
> GThread base co-routines (which we keep for ThreadSanitizer runs). While
> the main-loop holds the BQL it is suspended until the co-routine
> completes however the co-routines run in a separate thread so checking
> the TLS variable could be wrong.
> 
> We fix this by expanding the check to include qemu_in_coroutine() for
> GThread based builds. As it is not used for production builds I'm not
> overly worried about any performance impact which should be negligible
> anyway.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>

This is wrong unfortunately.  It is possible to run coroutines outside
the BQL (e.g. with -device virtio-blk,iothread=foo).

Do you know exactly why TSAN has no love for coroutines?

Paolo

> ---
>  configure |  3 +++
>  cpus.c    | 13 +++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/configure b/configure
> index 91a14c1..97b89fb 100755
> --- a/configure
> +++ b/configure
> @@ -5461,6 +5461,9 @@ if test "$rbd" = "yes" ; then
>  fi
>  
>  echo "CONFIG_COROUTINE_BACKEND=$coroutine" >> $config_host_mak
> +if test "$coroutine" = "gthread" ; then
> +  echo "CONFIG_COROUTINE_GTHREAD=1" >> $config_host_mak
> +fi
>  if test "$coroutine_pool" = "yes" ; then
>    echo "CONFIG_COROUTINE_POOL=1" >> $config_host_mak
>  else
> diff --git a/cpus.c b/cpus.c
> index 0c18a9f..a3e189a 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -49,6 +49,10 @@
>  #include "hw/nmi.h"
>  #include "sysemu/replay.h"
>  
> +#ifdef CONFIG_COROUTINE_GTHREAD
> +#include "qemu/coroutine.h"
> +#endif
> +
>  #ifndef _WIN32
>  #include "qemu/compatfd.h"
>  #endif
> @@ -1422,9 +1426,18 @@ bool qemu_in_vcpu_thread(void)
>  
>  static __thread bool iothread_locked = false;
>  
> +/*
> + * There is a slight wart when using gthread based co-routines. Here
> + * the BQL is held by the main-thread which is suspended until the
> + * co-routines complete.
> + */
>  bool qemu_mutex_iothread_locked(void)
>  {
> +#ifdef CONFIG_COROUTINE_GTHREAD
> +    return iothread_locked || qemu_in_coroutine();
> +#else
>      return iothread_locked;
> +#endif
>  }
>  
>  void qemu_mutex_lock_iothread(void)
>
Alex Bennée Nov. 1, 2016, 4:21 p.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/10/2016 13:54, Alex Bennée wrote:
>> There is a slight wart when checking for the state of the BQL when using
>> GThread base co-routines (which we keep for ThreadSanitizer runs). While
>> the main-loop holds the BQL it is suspended until the co-routine
>> completes however the co-routines run in a separate thread so checking
>> the TLS variable could be wrong.
>>
>> We fix this by expanding the check to include qemu_in_coroutine() for
>> GThread based builds. As it is not used for production builds I'm not
>> overly worried about any performance impact which should be negligible
>> anyway.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>
> This is wrong unfortunately.  It is possible to run coroutines outside
> the BQL (e.g. with -device virtio-blk,iothread=foo).
>
> Do you know exactly why TSAN has no love for coroutines?

The current production stuff is due to missing support for new stacks
with setcontext. However I have built the latest tsan support library
and that seems happy without the gthread co-routines.

Currently I'm dealing with glib's racy gthread support however.

>
> Paolo
>
>> ---
>>  configure |  3 +++
>>  cpus.c    | 13 +++++++++++++
>>  2 files changed, 16 insertions(+)
>>
>> diff --git a/configure b/configure
>> index 91a14c1..97b89fb 100755
>> --- a/configure
>> +++ b/configure
>> @@ -5461,6 +5461,9 @@ if test "$rbd" = "yes" ; then
>>  fi
>>
>>  echo "CONFIG_COROUTINE_BACKEND=$coroutine" >> $config_host_mak
>> +if test "$coroutine" = "gthread" ; then
>> +  echo "CONFIG_COROUTINE_GTHREAD=1" >> $config_host_mak
>> +fi
>>  if test "$coroutine_pool" = "yes" ; then
>>    echo "CONFIG_COROUTINE_POOL=1" >> $config_host_mak
>>  else
>> diff --git a/cpus.c b/cpus.c
>> index 0c18a9f..a3e189a 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -49,6 +49,10 @@
>>  #include "hw/nmi.h"
>>  #include "sysemu/replay.h"
>>
>> +#ifdef CONFIG_COROUTINE_GTHREAD
>> +#include "qemu/coroutine.h"
>> +#endif
>> +
>>  #ifndef _WIN32
>>  #include "qemu/compatfd.h"
>>  #endif
>> @@ -1422,9 +1426,18 @@ bool qemu_in_vcpu_thread(void)
>>
>>  static __thread bool iothread_locked = false;
>>
>> +/*
>> + * There is a slight wart when using gthread based co-routines. Here
>> + * the BQL is held by the main-thread which is suspended until the
>> + * co-routines complete.
>> + */
>>  bool qemu_mutex_iothread_locked(void)
>>  {
>> +#ifdef CONFIG_COROUTINE_GTHREAD
>> +    return iothread_locked || qemu_in_coroutine();
>> +#else
>>      return iothread_locked;
>> +#endif
>>  }
>>
>>  void qemu_mutex_lock_iothread(void)
>>


--
Alex Bennée
Stefan Hajnoczi Nov. 2, 2016, 2:06 p.m. UTC | #3
On Tue, Nov 01, 2016 at 04:21:36PM +0000, Alex Bennée wrote:
> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 21/10/2016 13:54, Alex Bennée wrote:
> >> There is a slight wart when checking for the state of the BQL when using
> >> GThread base co-routines (which we keep for ThreadSanitizer runs). While
> >> the main-loop holds the BQL it is suspended until the co-routine
> >> completes however the co-routines run in a separate thread so checking
> >> the TLS variable could be wrong.
> >>
> >> We fix this by expanding the check to include qemu_in_coroutine() for
> >> GThread based builds. As it is not used for production builds I'm not
> >> overly worried about any performance impact which should be negligible
> >> anyway.
> >>
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >
> > This is wrong unfortunately.  It is possible to run coroutines outside
> > the BQL (e.g. with -device virtio-blk,iothread=foo).
> >
> > Do you know exactly why TSAN has no love for coroutines?
> 
> The current production stuff is due to missing support for new stacks
> with setcontext. However I have built the latest tsan support library
> and that seems happy without the gthread co-routines.
> 
> Currently I'm dealing with glib's racy gthread support however.

I think Paolo suggested we drop the GThread backend on IRC.  I agree
that we should do that since GThread co-routines break code that uses
thread-local variables and have never truly worked.

Stefan
Alex Bennée Nov. 2, 2016, 2:55 p.m. UTC | #4
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Tue, Nov 01, 2016 at 04:21:36PM +0000, Alex Bennée wrote:
>>
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>> > On 21/10/2016 13:54, Alex Bennée wrote:
>> >> There is a slight wart when checking for the state of the BQL when using
>> >> GThread base co-routines (which we keep for ThreadSanitizer runs). While
>> >> the main-loop holds the BQL it is suspended until the co-routine
>> >> completes however the co-routines run in a separate thread so checking
>> >> the TLS variable could be wrong.
>> >>
>> >> We fix this by expanding the check to include qemu_in_coroutine() for
>> >> GThread based builds. As it is not used for production builds I'm not
>> >> overly worried about any performance impact which should be negligible
>> >> anyway.
>> >>
>> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> >
>> > This is wrong unfortunately.  It is possible to run coroutines outside
>> > the BQL (e.g. with -device virtio-blk,iothread=foo).
>> >
>> > Do you know exactly why TSAN has no love for coroutines?
>>
>> The current production stuff is due to missing support for new stacks
>> with setcontext. However I have built the latest tsan support library
>> and that seems happy without the gthread co-routines.
>>
>> Currently I'm dealing with glib's racy gthread support however.
>
> I think Paolo suggested we drop the GThread backend on IRC.  I agree
> that we should do that since GThread co-routines break code that uses
> thread-local variables and have never truly worked.

Indeed I have pulled your patch into my current series of sanitizer
fixes. Once I can fix the setcontext/sigjmp confusion for a
__SANITIZER__ builds I'll post the series.

--
Alex Bennée
diff mbox

Patch

diff --git a/configure b/configure
index 91a14c1..97b89fb 100755
--- a/configure
+++ b/configure
@@ -5461,6 +5461,9 @@  if test "$rbd" = "yes" ; then
 fi
 
 echo "CONFIG_COROUTINE_BACKEND=$coroutine" >> $config_host_mak
+if test "$coroutine" = "gthread" ; then
+  echo "CONFIG_COROUTINE_GTHREAD=1" >> $config_host_mak
+fi
 if test "$coroutine_pool" = "yes" ; then
   echo "CONFIG_COROUTINE_POOL=1" >> $config_host_mak
 else
diff --git a/cpus.c b/cpus.c
index 0c18a9f..a3e189a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -49,6 +49,10 @@ 
 #include "hw/nmi.h"
 #include "sysemu/replay.h"
 
+#ifdef CONFIG_COROUTINE_GTHREAD
+#include "qemu/coroutine.h"
+#endif
+
 #ifndef _WIN32
 #include "qemu/compatfd.h"
 #endif
@@ -1422,9 +1426,18 @@  bool qemu_in_vcpu_thread(void)
 
 static __thread bool iothread_locked = false;
 
+/*
+ * There is a slight wart when using gthread based co-routines. Here
+ * the BQL is held by the main-thread which is suspended until the
+ * co-routines complete.
+ */
 bool qemu_mutex_iothread_locked(void)
 {
+#ifdef CONFIG_COROUTINE_GTHREAD
+    return iothread_locked || qemu_in_coroutine();
+#else
     return iothread_locked;
+#endif
 }
 
 void qemu_mutex_lock_iothread(void)