Patchwork [2/2] qemu-thread: add TLS wrappers

Submitter Stefan Hajnoczi
Date July 1, 2013, 9:35 a.m.
Message ID <1372671341-19855-3-git-send-email-stefanha@redhat.com>
Permalink /patch/256039/
State New

Comments

Stefan Hajnoczi - July 1, 2013, 9:35 a.m.
From: Paolo Bonzini <pbonzini@redhat.com>

Fast TLS is not available on some platforms, but it is always nice to
use it.  This wrapper implementation falls back to pthread_get/setspecific
on POSIX systems that lack __thread, but uses the dynamic linker's TLS
support on Linux and Windows.

The user shall call tls_alloc_foo() in every thread that needs to access
the variable---exactly once and before any access.  foo is the name of
the variable as passed to DECLARE_TLS and DEFINE_TLS.  Then,
tls_get_foo() will return the address of the variable.  It is guaranteed
to remain the same across the lifetime of a thread, so you can cache it.

[Renamed alloc_foo()/get_foo() to tls_alloc_foo()/tls_get_foo()
-- Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 configure                |  21 +++++++
 include/qemu/tls.h       | 139 +++++++++++++++++++++++++++++++++++++++++++++++
 tests/Makefile           |   3 +
 tests/test-tls.c         |  87 +++++++++++++++++++++++++++++
 util/qemu-thread-win32.c |  17 ++++++
 5 files changed, 267 insertions(+)
 create mode 100644 include/qemu/tls.h
 create mode 100644 tests/test-tls.c
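
The calling convention described in the commit message can be sketched as follows. This is a minimal, self-contained illustration that assumes a compiler with __thread and re-creates only the __thread branch of the macros locally. The names worker, struct tls_result, tls_usage_demo and the tls_decl_dummy_ prefix are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Local stand-in for the __thread branch of include/qemu/tls.h. */
#define DECLARE_TLS(type, x)                                            \
    extern __thread __typeof__(type) x;                                 \
    static inline __typeof__(type) *tls_get_##x(void) { return &x; }    \
    static inline __typeof__(type) *tls_alloc_##x(void) { return &x; }  \
    extern int tls_decl_dummy_##x   /* swallows the trailing semicolon */

#define DEFINE_TLS(type, x) __thread __typeof__(type) x

DECLARE_TLS(long, counter);
DEFINE_TLS(long, counter);

struct tls_result {
    long increments;   /* input: how many times to bump the counter */
    long final_value;  /* output: value read through tls_get_counter() */
};

static void *worker(void *arg)
{
    struct tls_result *r = arg;
    long *p = tls_alloc_counter();  /* once per thread, before any access */
    long i;

    for (i = 0; i < r->increments; i++) {
        (*p)++;                     /* the cached pointer stays valid */
    }
    assert(tls_get_counter() == p); /* same address for the thread's lifetime */
    r->final_value = *tls_get_counter();
    return NULL;
}

/* Returns 1 if each thread only saw its own increments. */
static int tls_usage_demo(void)
{
    struct tls_result r[2] = { { 1000, -1 }, { 2000, -1 } };
    pthread_t t[2];
    int created = 0;
    int i;

    for (i = 0; i < 2; i++) {
        if (pthread_create(&t[i], NULL, worker, &r[i]) != 0) {
            break;
        }
        created++;
    }
    for (i = 0; i < created; i++) {
        pthread_join(t[i], NULL);
    }
    return created == 2 &&
           r[0].final_value == 1000 && r[1].final_value == 2000;
}
```

Calling tls_usage_demo() from a test harness should return 1 when each thread has its own copy of counter. On the pthread-key and Windows branches of the real header, tls_alloc_foo() does actual work, so the "exactly once per thread" rule matters there even though it is a no-op in this sketch.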
Peter Maydell - July 1, 2013, 9:54 a.m.
On 1 July 2013 10:35, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> Fast TLS is not available on some platforms, but it is always nice to
> use it.  This wrapper implementation falls back to pthread_get/setspecific
> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
> support on Linux and Windows.
>
> The user shall call tls_alloc_foo() in every thread that needs to access
> the variable---exactly once and before any access.  foo is the name of
> the variable as passed to DECLARE_TLS and DEFINE_TLS.  Then,
> tls_get_foo() will return the address of the variable.  It is guaranteed
> to remain the same across the lifetime of a thread, so you can cache it.

>  ##########################################
> +# check for TLS runtime
> +
> +# Some versions of mingw include the "magic" definitions that make
> +# TLS work, some don't.  Check for it.
> +
> +if test "$mingw32" = yes; then
> +  cat > $TMPC << EOF
> +int main(void) {}

Execution falls off the end of function without returning a value
(I would expect the compiler to issue a warning about this.)

> +#ifndef QEMU_TLS_H
> +#define QEMU_TLS_H
> +
> +#if defined __linux__
> +#define DECLARE_TLS(type, x)                     \
> +extern __thread typeof(type) x;                  \
> +                                                 \
> +static inline typeof(type) *tls_get_##x(void)    \
> +{                                                \
> +    return &x;                                   \
> +}                                                \
> +                                                 \
> +static inline typeof(type) *tls_alloc_##x(void)  \
> +{                                                \
> +    return &x;                                   \
> +}                                                \
> +                                                 \
> +extern int dummy_##__LINE__

What's this for?

thanks
-- PMM
Paolo Bonzini - July 1, 2013, 10:14 a.m.
Il 01/07/2013 11:54, Peter Maydell ha scritto:
> On 1 July 2013 10:35, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>> From: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Fast TLS is not available on some platforms, but it is always nice to
>> use it.  This wrapper implementation falls back to pthread_get/setspecific
>> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
>> support on Linux and Windows.
>>
>> The user shall call tls_alloc_foo() in every thread that needs to access
>> the variable---exactly once and before any access.  foo is the name of
>> the variable as passed to DECLARE_TLS and DEFINE_TLS.  Then,
>> tls_get_foo() will return the address of the variable.  It is guaranteed
>> to remain the same across the lifetime of a thread, so you can cache it.
> 
>>  ##########################################
>> +# check for TLS runtime
>> +
>> +# Some versions of mingw include the "magic" definitions that make
>> +# TLS work, some don't.  Check for it.
>> +
>> +if test "$mingw32" = yes; then
>> +  cat > $TMPC << EOF
>> +int main(void) {}
> 
> Execution falls off the end of function without returning a value
> (I would expect the compiler to issue a warning about this.)
> 
>> +#ifndef QEMU_TLS_H
>> +#define QEMU_TLS_H
>> +
>> +#if defined __linux__
>> +#define DECLARE_TLS(type, x)                     \
>> +extern __thread typeof(type) x;                  \
>> +                                                 \
>> +static inline typeof(type) *tls_get_##x(void)    \
>> +{                                                \
>> +    return &x;                                   \
>> +}                                                \
>> +                                                 \
>> +static inline typeof(type) *tls_alloc_##x(void)  \
>> +{                                                \
>> +    return &x;                                   \
>> +}                                                \
>> +                                                 \
>> +extern int dummy_##__LINE__
> 
> What's this for?

It lets you use it as

DECLARE_TLS(type, x);

Many editors impose an indent without the trailing semicolon (they parse
it as a K&R function definition).

Paolo
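
To illustrate the idiom: when a macro's expansion ends in a function body, a semicolon written after the macro call is a stray empty declaration at file scope. Ending the expansion with a declaration that lacks its own semicolon makes the call site supply it. A minimal sketch, where DECLARE_GETTER and the getter_dummy_ prefix are hypothetical names chosen for this example only:

```c
#include <assert.h>

#define DECLARE_GETTER(type, x)                        \
    static type x##_storage;                           \
    static inline type *get_##x(void)                  \
    {                                                  \
        return &x##_storage;                           \
    }                                                  \
    extern int getter_dummy_##x  /* no ';' here: the caller writes it */

DECLARE_GETTER(int, foo);    /* reads like an ordinary declaration */
DECLARE_GETTER(long, bar);

static long getter_demo(void)
{
    /* each macro use produced its own storage and accessor */
    assert((void *)get_foo() != (void *)get_bar());
    *get_foo() = 40;
    *get_bar() = 2;
    return *get_foo() + *get_bar();
}
```

Deriving the dummy's name from x keeps the names unique per variable. That is a choice made for this sketch, not what the patch does (the patch pastes __LINE__).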
Stefan Hajnoczi - July 1, 2013, 12:34 p.m.
On Mon, Jul 01, 2013 at 10:54:56AM +0100, Peter Maydell wrote:
> On 1 July 2013 10:35, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > From: Paolo Bonzini <pbonzini@redhat.com>
> >
> > Fast TLS is not available on some platforms, but it is always nice to
> > use it.  This wrapper implementation falls back to pthread_get/setspecific
> > on POSIX systems that lack __thread, but uses the dynamic linker's TLS
> > support on Linux and Windows.
> >
> > The user shall call tls_alloc_foo() in every thread that needs to access
> > the variable---exactly once and before any access.  foo is the name of
> > the variable as passed to DECLARE_TLS and DEFINE_TLS.  Then,
> > tls_get_foo() will return the address of the variable.  It is guaranteed
> > to remain the same across the lifetime of a thread, so you can cache it.
> 
> >  ##########################################
> > +# check for TLS runtime
> > +
> > +# Some versions of mingw include the "magic" definitions that make
> > +# TLS work, some don't.  Check for it.
> > +
> > +if test "$mingw32" = yes; then
> > +  cat > $TMPC << EOF
> > +int main(void) {}
> 
> Execution falls off the end of function without returning a value
> (I would expect the compiler to issue a warning about this.)

You are right, gcc emits a warning.

> > +#ifndef QEMU_TLS_H
> > +#define QEMU_TLS_H
> > +
> > +#if defined __linux__
> > +#define DECLARE_TLS(type, x)                     \
> > +extern __thread typeof(type) x;                  \
> > +                                                 \
> > +static inline typeof(type) *tls_get_##x(void)    \
> > +{                                                \
> > +    return &x;                                   \
> > +}                                                \
> > +                                                 \
> > +static inline typeof(type) *tls_alloc_##x(void)  \
> > +{                                                \
> > +    return &x;                                   \
> > +}                                                \
> > +                                                 \
> > +extern int dummy_##__LINE__
> 
> What's this for?

It makes the DECLARE_TLS() macro use a semicolon.
Ed Maste - July 1, 2013, 6:52 p.m.
On 1 July 2013 05:35, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> Fast TLS is not available on some platforms, but it is always nice to
> use it.  This wrapper implementation falls back to pthread_get/setspecific
> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
> support on Linux and Windows.

The most recent version of this patch posted by Paolo that I see has:

+#if defined(__linux__) || defined(__FreeBSD__)
+#define DECLARE_TLS(type, x)                     \

while this one has only __linux__.  Do you mind picking up that
change, if this is likely to make it in via your work?
Peter Maydell - July 1, 2013, 7:25 p.m.
On 1 July 2013 19:52, Ed Maste <emaste@freebsd.org> wrote:
> The most recent version of this patch posted by Paolo that I see has:
>
> +#if defined(__linux__) || defined(__FreeBSD__)
> +#define DECLARE_TLS(type, x)                     \
>
> while this one has only __linux__.  Do you mind picking up that
> change, if this is likely to make it in via your work?

Does any OS have a __thread which compiles but is broken, or can
we just have a configure test for this? That would let MacOSX+clang
use __thread.

-- PMM
Ed Maste - July 1, 2013, 8 p.m.
On 1 July 2013 15:25, Peter Maydell <peter.maydell@linaro.org> wrote:
> Does any OS have a __thread which compiles but is broken, or can
> we just have a configure test for this? That would let MacOSX+clang
> use __thread.

I believe this was recently the case on NetBSD - code with __thread
would build, but fail at run time.  I think this applies to NetBSD 5.0
and earlier, but I see that TLS support is listed in the 6.0 release
notes.  So perhaps a __thread configure test with an override for the
known-failing case(s).
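
A run-time probe along these lines might look as follows. It is only a sketch with hypothetical function names. Because the failure described above shows up at run time rather than at build time, the probe has to be executed, which a cross-compiling configure cannot do; that is one reason an override list for known-failing systems would still be needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static __thread int probe = 7;      /* each thread should start from 7 */

static void *probe_child(void *arg)
{
    int *seen = arg;

    *seen = probe;                  /* expect the initializer, not 42 */
    probe = 99;                     /* must not leak into the parent */
    return NULL;
}

/* Returns 1 when __thread behaves as thread-local storage, else 0. */
static int thread_keyword_works(void)
{
    pthread_t t;
    int seen = -1;

    probe = 42;
    if (pthread_create(&t, NULL, probe_child, &seen) != 0) {
        return 0;
    }
    if (pthread_join(t, NULL) != 0) {
        return 0;
    }
    return seen == 7 && probe == 42;
}
```

A configure script could wrap this in a main() that exits with status 0 only when thread_keyword_works() returns 1.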
Richard Henderson - July 1, 2013, 8:30 p.m.
On 07/01/2013 12:25 PM, Peter Maydell wrote:
> Does any OS have a __thread which compiles but is broken, or can
> we just have a configure test for this? That would let MacOSX+clang
> use __thread.

I suspect that this will work.  Some targets may succeed in using gcc's
"emutls" path, which while slower than TLS is pretty much exactly the pthread
get/setspecific fallback that's been proposed elsewhere on this list.


r~
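
For comparison, the pthread key scheme that the patch's CONFIG_POSIX branch uses (pthread_getspecific/pthread_setspecific) can be sketched as below. This is an illustration only: calloc/free stand in for g_malloc0/g_free, and every function name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t counter_key;
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

static void counter_key_init(void)
{
    /* free() runs on each thread's datum when that thread exits */
    pthread_key_create(&counter_key, free);
}

/* Per-thread setup: allocate zeroed storage and bind it to the key. */
static long *counter_alloc(void)
{
    long *datum = calloc(1, sizeof(*datum));

    pthread_once(&counter_once, counter_key_init);
    pthread_setspecific(counter_key, datum);
    return datum;
}

/* Lookup: a library call on every access, unlike a __thread variable. */
static long *counter_get(void)
{
    return pthread_getspecific(counter_key);
}

static void *key_worker(void *arg)
{
    int *ok = arg;
    long *p = counter_alloc();

    if (p != NULL) {
        *p += 5;
        *ok = (counter_get() == p && *counter_get() == 5);
    }
    return NULL;
}

/* Returns 1 if the worker and the caller kept separate values. */
static int key_fallback_demo(void)
{
    pthread_t t;
    int ok = 0;
    long *mine = counter_alloc();

    if (mine == NULL) {
        return 0;
    }
    *mine = 1;
    if (pthread_create(&t, NULL, key_worker, &ok) == 0) {
        pthread_join(t, NULL);
    }
    ok = ok && *counter_get() == 1;
    pthread_setspecific(counter_key, NULL);  /* caller releases its own copy */
    free(mine);
    return ok;
}
```

The lookup cost in counter_get() is the reason the commit message calls the __thread path "fast" by comparison.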
Stefan Hajnoczi - July 2, 2013, 7:50 a.m.
On Mon, Jul 01, 2013 at 02:52:08PM -0400, Ed Maste wrote:
>  On 1 July 2013 05:35, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > From: Paolo Bonzini <pbonzini@redhat.com>
> >
> > Fast TLS is not available on some platforms, but it is always nice to
> > use it.  This wrapper implementation falls back to pthread_get/setspecific
> > on POSIX systems that lack __thread, but uses the dynamic linker's TLS
> > support on Linux and Windows.
> 
> The most recent version of this patch posted by Paolo that I see has:
> 
> +#if defined(__linux__) || defined(__FreeBSD__)
> +#define DECLARE_TLS(type, x)                     \
> 
> while this one has only __linux__.  Do you mind picking up that
> change, if this is likely to make it in via your work?

Since additional changes are necessary before these patches can be
merged, I hope Paolo can pick up my "tls_" rename and send the next
revision including his improvements.

Stefan
Paolo Bonzini - July 2, 2013, 7:54 a.m.
Il 01/07/2013 22:30, Richard Henderson ha scritto:
> On 07/01/2013 12:25 PM, Peter Maydell wrote:
>> Does any OS have a __thread which compiles but is broken, or can
>> we just have a configure test for this? That would let MacOSX+clang
>> use __thread.
> 
> I suspect that this will work.  Some targets may succeed in using gcc's
> "emutls" path, which while slower than TLS is pretty much exactly the pthread
> get/setspecific fallback that's been proposed elsewhere on this list.

We do not want to hit emutls on Windows, but that can be done simply by
reordering the three implementations.

Paolo
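
One reading of the reordering idea, as a sketch: if the selection were driven by a configure probe for __thread, a Windows toolchain that compiles __thread through emutls would pass the probe, so the Windows branch has to be tested first. DEMO_HAVE_THREAD_KEYWORD is a hypothetical macro standing in for such a probe result, and _WIN32 is used here instead of the patch's CONFIG_WIN32 so that the fragment compiles on its own.

```c
#include <assert.h>
#include <string.h>

#if defined(_WIN32)
/* Checked first so that Windows never reaches the __thread branch. */
#define TLS_BACKEND "win32 .tls section"
#elif defined(DEMO_HAVE_THREAD_KEYWORD)   /* hypothetical configure result */
#define TLS_BACKEND "__thread"
#else
#define TLS_BACKEND "pthread key"
#endif

/* Reports which of the three branches the preprocessor selected. */
static const char *tls_backend_name(void)
{
    return TLS_BACKEND;
}
```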
Jan Kiszka - July 4, 2013, 4:27 p.m.
On 2013-07-01 12:14, Paolo Bonzini wrote:
> Il 01/07/2013 11:54, Peter Maydell ha scritto:
>> On 1 July 2013 10:35, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> Fast TLS is not available on some platforms, but it is always nice to
>>> use it.  This wrapper implementation falls back to pthread_get/setspecific
>>> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
>>> support on Linux and Windows.
>>>
>>> The user shall call tls_alloc_foo() in every thread that needs to access
>>> the variable---exactly once and before any access.  foo is the name of
>>> the variable as passed to DECLARE_TLS and DEFINE_TLS.  Then,
>>> tls_get_foo() will return the address of the variable.  It is guaranteed
>>> to remain the same across the lifetime of a thread, so you can cache it.
>>
>>>  ##########################################
>>> +# check for TLS runtime
>>> +
>>> +# Some versions of mingw include the "magic" definitions that make
>>> +# TLS work, some don't.  Check for it.
>>> +
>>> +if test "$mingw32" = yes; then
>>> +  cat > $TMPC << EOF
>>> +int main(void) {}
>>
>> Execution falls off the end of function without returning a value
>> (I would expect the compiler to issue a warning about this.)
>>
>>> +#ifndef QEMU_TLS_H
>>> +#define QEMU_TLS_H
>>> +
>>> +#if defined __linux__
>>> +#define DECLARE_TLS(type, x)                     \
>>> +extern __thread typeof(type) x;                  \
>>> +                                                 \
>>> +static inline typeof(type) *tls_get_##x(void)    \
>>> +{                                                \
>>> +    return &x;                                   \
>>> +}                                                \
>>> +                                                 \
>>> +static inline typeof(type) *tls_alloc_##x(void)  \
>>> +{                                                \
>>> +    return &x;                                   \
>>> +}                                                \
>>> +                                                 \
>>> +extern int dummy_##__LINE__
>>
>> What's this for?
> 
> It lets you use it as
> 
> DECLARE_TLS(type, x);
> 
> Many editors impose an indent without the trailing semicolon (they parse
> it as a K&R function definition).

This workaround causes troubles here with gcc-4.5.1:

In file included from /data/qemu/include/exec/memory.h:29:0,
                 from /data/qemu/include/exec/ioport.h:29,
                 from /data/qemu/include/hw/hw.h:11,
                 from /data/qemu/exec.c:30:
/data/qemu/include/qemu/rcu.h:88:339: warning: redundant redeclaration of ‘dummy___LINE__’
/data/qemu/include/exec/cpu-all.h:362:359: note: previous declaration of ‘dummy___LINE__’ was here
/data/qemu/exec.c:77:24: warning: function declaration isn’t a prototype
/data/qemu/exec.c:77:41: error: invalid storage class for function ‘tls_get_cpu_single_env’
/data/qemu/exec.c:77:41: error: conflicting types for ‘tls_get_cpu_single_env’
/data/qemu/include/exec/cpu-all.h:362:148: note: previous definition of ‘tls_get_cpu_single_env’ was here

Jan
Paolo Bonzini - July 4, 2013, 4:38 p.m.
Il 04/07/2013 18:27, Jan Kiszka ha scritto:
> This workaround causes troubles here with gcc-4.5.1:
> 
> In file included from /data/qemu/include/exec/memory.h:29:0,
>                  from /data/qemu/include/exec/ioport.h:29,
>                  from /data/qemu/include/hw/hw.h:11,
>                  from /data/qemu/exec.c:30:
> /data/qemu/include/qemu/rcu.h:88:339: warning: redundant redeclaration of ‘dummy___LINE__’
> /data/qemu/include/exec/cpu-all.h:362:359: note: previous declaration of ‘dummy___LINE__’ was here
> /data/qemu/exec.c:77:24: warning: function declaration isn’t a prototype
> /data/qemu/exec.c:77:41: error: invalid storage class for function ‘tls_get_cpu_single_env’
> /data/qemu/exec.c:77:41: error: conflicting types for ‘tls_get_cpu_single_env’
> /data/qemu/include/exec/cpu-all.h:362:148: note: previous definition of ‘tls_get_cpu_single_env’ was here

Perhaps it helps to use glue(dummy, __LINE__).

Paolo
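
The 'dummy___LINE__' in the warning above shows the underlying problem: an operand of ## is pasted as written, so __LINE__ is never expanded. A two-level macro expands it first. The sketch below redefines glue()/xglue() locally in the usual two-level form so that it is self-contained; STR, DIRECT_NAME, GLUED_NAME and the two functions are hypothetical helpers that only make the resulting identifiers visible as strings.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x) STR_(x)

/* Two-level paste helper, redefined here for the demo. */
#define xglue(a, b) a##b
#define glue(a, b) xglue(a, b)

#define DIRECT_NAME dummy_##__LINE__        /* pastes the token __LINE__ itself */
#define GLUED_NAME  glue(dummy_, __LINE__)  /* __LINE__ expands before pasting */

/* Yields the literal identifier the patch's macro produces. */
static const char *direct_name(void)
{
    return STR(DIRECT_NAME);
}

/* Yields "dummy_" followed by the current line number. */
static const char *glued_name(void)
{
    return STR(GLUED_NAME);
}
```

Even with the line number pasted in, two headers that use the macro on the same line number would still produce identical names. Repeated extern declarations are legal C, but they can trigger the same "redundant redeclaration" warning.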

Patch

diff --git a/configure b/configure
index 0e0adde..8ccdf2e 100755
--- a/configure
+++ b/configure
@@ -284,6 +284,7 @@  fi
 ar="${AR-${cross_prefix}ar}"
 as="${AS-${cross_prefix}as}"
 cpp="${CPP-$cc -E}"
+nm="${NM-${cross_prefix}nm}"
 objcopy="${OBJCOPY-${cross_prefix}objcopy}"
 ld="${LD-${cross_prefix}ld}"
 libtool="${LIBTOOL-${cross_prefix}libtool}"
@@ -3163,6 +3164,22 @@  if test "$trace_backend" = "dtrace"; then
 fi
 
 ##########################################
+# check for TLS runtime
+
+# Some versions of mingw include the "magic" definitions that make
+# TLS work, some don't.  Check for it.
+
+if test "$mingw32" = yes; then
+  cat > $TMPC << EOF
+int main(void) {}
+EOF
+  compile_prog "" ""
+  if $nm $TMPE | grep _tls_used > /dev/null 2>&1; then
+    mingw32_tls_runtime=yes
+  fi
+fi
+
+##########################################
 # check and set a backend for coroutine
 
 # We prefer ucontext, but it's not always possible. The fallback
@@ -3621,6 +3638,9 @@  if test "$mingw32" = "yes" ; then
   version_micro=0
   echo "CONFIG_FILEVERSION=$version_major,$version_minor,$version_subminor,$version_micro" >> $config_host_mak
   echo "CONFIG_PRODUCTVERSION=$version_major,$version_minor,$version_subminor,$version_micro" >> $config_host_mak
+  if test "$mingw32_tls_runtime" = yes; then
+    echo "CONFIG_MINGW32_TLS_RUNTIME=y" >> $config_host_mak
+  fi
 else
   echo "CONFIG_POSIX=y" >> $config_host_mak
 fi
@@ -4043,6 +4063,7 @@  echo "OBJCC=$objcc" >> $config_host_mak
 echo "AR=$ar" >> $config_host_mak
 echo "AS=$as" >> $config_host_mak
 echo "CPP=$cpp" >> $config_host_mak
+echo "NM=$nm" >> $config_host_mak
 echo "OBJCOPY=$objcopy" >> $config_host_mak
 echo "LD=$ld" >> $config_host_mak
 echo "WINDRES=$windres" >> $config_host_mak
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
new file mode 100644
index 0000000..750ccb3
--- /dev/null
+++ b/include/qemu/tls.h
@@ -0,0 +1,139 @@ 
+/*
+ * Abstraction layer for defining and using TLS variables
+ *
+ * Copyright (c) 2011, 2013 Red Hat, Inc
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Authors:
+ *  Paolo Bonzini <pbonzini@redhat.com>
+ *  Peter Maydell <peter.maydell@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef QEMU_TLS_H
+#define QEMU_TLS_H
+
+#if defined __linux__
+#define DECLARE_TLS(type, x)                     \
+extern __thread typeof(type) x;                  \
+                                                 \
+static inline typeof(type) *tls_get_##x(void)    \
+{                                                \
+    return &x;                                   \
+}                                                \
+                                                 \
+static inline typeof(type) *tls_alloc_##x(void)  \
+{                                                \
+    return &x;                                   \
+}                                                \
+                                                 \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)                  \
+__thread typeof(type) x
+
+#elif defined CONFIG_POSIX
+typedef struct QEMUTLSValue {
+    pthread_key_t k;
+    pthread_once_t o;
+} QEMUTLSValue;
+
+#define DECLARE_TLS(type, x)                     \
+extern QEMUTLSValue x;                           \
+extern void init_##x(void);                      \
+                                                 \
+static inline typeof(type) *tls_get_##x(void)    \
+{                                                \
+    return pthread_getspecific(x.k);             \
+}                                                \
+                                                 \
+static inline typeof(type) *tls_alloc_##x(void)  \
+{                                                \
+    void *datum = g_malloc0(sizeof(type));       \
+    pthread_once(&x.o, init_##x);                \
+    pthread_setspecific(x.k, datum);             \
+    return datum;                                \
+}                                                \
+                                                 \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)                  \
+void init_##x(void) {                        \
+    pthread_key_create(&x.k, g_free);        \
+}                                            \
+                                             \
+QEMUTLSValue x = { .o = PTHREAD_ONCE_INIT }
+
+#elif defined CONFIG_WIN32
+
+/* The initial contents of TLS variables are placed in the .tls section.
+ * The linker takes all sections starting with ".tls$", sorts them and puts
+ * the contents in a single ".tls" section.  qemu-thread-win32.c defines
+ * special symbols in .tls$AAA and .tls$ZZZ that represent the beginning
+ * and end of TLS memory.  The linker and run-time library then cooperate
+ * to copy memory between those symbols in the TLS area of new threads.
+ *
+ * _tls_index holds the number of our module.  The executable should be
+ * zero, DLLs are numbered 1 and up.  The loader fills it in for us.
+ *
+ * Thus, Teb->ThreadLocalStoragePointer[_tls_index] is the base of
+ * the TLS segment for this (thread, module) pair.  Each segment has
+ * the same layout as this module's .tls segment and is initialized
+ * with the content of the .tls segment; offset 0 is the _tls_start variable.
+ * So, tls_get_##x passes us the offset of the passed variable relative to
+ * _tls_start, and we return that same offset plus the base of segment.
+ */
+
+typedef struct _TEB {
+    NT_TIB NtTib;
+    void *EnvironmentPointer;
+    void *x[3];
+    char **ThreadLocalStoragePointer;
+} TEB, *PTEB;
+
+extern int _tls_index;
+extern int _tls_start;
+
+static inline void *tls_var(size_t offset)
+{
+    PTEB Teb = NtCurrentTeb();
+    return (char *)(Teb->ThreadLocalStoragePointer[_tls_index]) + offset;
+}
+
+#define DECLARE_TLS(type, x)                                         \
+extern typeof(type) tls_##x __attribute__((section(".tls$QEMU")));   \
+                                                                     \
+static inline typeof(type) *tls_get_##x(void)                        \
+{                                                                    \
+    return tls_var((ULONG_PTR)&(tls_##x) - (ULONG_PTR)&_tls_start);  \
+}                                                                    \
+                                                                     \
+static inline typeof(type) *tls_alloc_##x(void)                      \
+{                                                                    \
+    typeof(type) *addr = tls_get_##x();                              \
+    memset((void *)addr, 0, sizeof(type));                           \
+    return addr;                                                     \
+}                                                                    \
+                                                                     \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)                                          \
+typeof(type) tls_##x __attribute__((section(".tls$QEMU")))
+
+#else
+#error No TLS abstraction available on this platform
+#endif
+
+#endif
diff --git a/tests/Makefile b/tests/Makefile
index c107489..1f5156a 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -42,6 +42,8 @@  check-unit-y += tests/test-xbzrle$(EXESUF)
 gcov-files-test-xbzrle-y = xbzrle.c
 check-unit-y += tests/test-cutils$(EXESUF)
 gcov-files-test-cutils-y += util/cutils.c
+check-unit-y += tests/test-tls$(EXESUF)
+gcov-files-test-tls-y +=
 check-unit-y += tests/test-mul64$(EXESUF)
 gcov-files-test-mul64-y = util/host-utils.c
 
@@ -98,6 +100,7 @@  tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o libqemuutil.a libqemustub.a
 tests/test-x86-cpuid$(EXESUF): tests/test-x86-cpuid.o
 tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o xbzrle.o page_cache.o libqemuutil.a
 tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
+tests/test-tls$(EXESUF): tests/test-tls.o libqemuutil.a
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-tls.c b/tests/test-tls.c
new file mode 100644
index 0000000..54e981d
--- /dev/null
+++ b/tests/test-tls.c
@@ -0,0 +1,87 @@ 
+/*
+ * Unit-tests for TLS wrappers
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+#include "qemu/atomic.h"
+#include "qemu/thread.h"
+#include "qemu/tls.h"
+
+DECLARE_TLS(volatile long long, cnt);
+DEFINE_TLS(volatile long long, cnt);
+
+#define NUM_THREADS 10
+
+int stop;
+
+static void *test_thread(void *arg)
+{
+    volatile long long *p_cnt = tls_alloc_cnt();
+    volatile long long **p_ret = arg;
+    long long exp = 0;
+
+    g_assert(tls_get_cnt() == p_cnt);
+    *p_ret = p_cnt;
+    g_assert(*p_cnt == 0);
+    while (atomic_mb_read(&stop) == 0) {
+        exp++;
+        (*p_cnt)++;
+        g_assert(*tls_get_cnt() == exp);
+    }
+
+    return NULL;
+}
+
+static void test_tls(void)
+{
+    volatile long long *addr[NUM_THREADS];
+    QemuThread t[NUM_THREADS];
+    int i;
+
+    for (i = 0; i < NUM_THREADS; i++) {
+        qemu_thread_create(&t[i], test_thread, &addr[i], QEMU_THREAD_JOINABLE);
+    }
+    g_usleep(1000000);
+    atomic_mb_set(&stop, 1);
+    for (i = 0; i < NUM_THREADS; i++) {
+        qemu_thread_join(&t[i]);
+    }
+    for (i = 1; i < NUM_THREADS; i++) {
+        g_assert(addr[i] != addr[i - 1]);
+    }
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/tls", test_tls);
+    return g_test_run();
+}
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 517878d..f75e404 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,23 @@ 
 #include <assert.h>
 #include <limits.h>
 
+/* TLS support.  Some versions of mingw32 provide it, others do not.  */
+
+#ifndef CONFIG_MINGW32_TLS_RUNTIME
+int __attribute__((section(".tls$AAA"))) _tls_start = 0;
+int __attribute__((section(".tls$ZZZ"))) _tls_end = 0;
+int _tls_index = 0;
+
+const IMAGE_TLS_DIRECTORY _tls_used __attribute__((used, section(".rdata$T"))) = {
+ (ULONG)(ULONG_PTR) &_tls_start, /* start of tls data */
+ (ULONG)(ULONG_PTR) &_tls_end,   /* end of tls data */
+ (ULONG)(ULONG_PTR) &_tls_index, /* address of tls_index */
+ (ULONG) 0,                      /* pointer to callbacks */
+ (ULONG) 0,                      /* size of tls zero fill */
+ (ULONG) 0                       /* characteristics */
+};
+#endif
+
 static void error_exit(int err, const char *msg)
 {
     char *pstr;