diff mbox

[v2] Use IE model for static variables in libc.so, libpthread.so and rtld

Message ID 1437488735-22994-1-git-send-email-siddhesh@redhat.com
State New
Headers show

Commit Message

Siddhesh Poyarekar July 21, 2015, 2:25 p.m. UTC
The recently introduced TLS variables in the thread-local destructor
implementation (__cxa_thread_atexit_impl) used the default GD access
model, resulting in a call to __tls_get_addr.  This causes a deadlock
with recent changes to the way TLS is initialized because DTV
allocations are delayed and hence despite knowing the offset to the
variable inside its TLS block, the thread has to take the global rtld
lock to safely update the TLS offset.

This causes deadlocks when a thread is instantiated and joined inside
a destructor of a dlopen'd DSO.  The correct long term fix is to
somehow not take the lock, but that will need a lot deeper change set
to alter the way in which the big rtld lock is used.

Instead, this patch just eliminates the call to __tls_get_addr for the
thread-local variables inside libc.so, libpthread.so and rtld by
building all of their units with -ftls-model=initial-exec.  I had
initially claimed that this might affect arm, but that was a false alarm
because arm uses -mtls-dialect, not tls-model.  Also, this causes only 2
additional TLS variables to change to IE - a static buffer in __inet_ntoa
and h_errno in libpthread.so.  So this change is not as wide-impacting as
it seems.

There were concerns that the static storage for TLS is limited and
hence we should not be using it.  Additionally, dynamically loaded
modules may result in libc.so looking for this static storage pretty
late in static binaries.  Both concerns are valid when using TLSDESC
since that is where one may attempt to allocate a TLS block from
static storage for even those variables that are not IE.  They're not
very strong arguments for the traditional TLS model though, since it
assumes that the static storage would be used sparingly and definitely
not by default.  Hence, for now this would only theoretically affect
ARM architectures.

The impact is hence limited to statically linked binaries that dlopen
modules that in turn load libc.so, all that on arm hardware.  It seems
like a small enough impact to justify fixing the larger problem that
currently affects everything everywhere.

This still does not solve the original problem completely.  That is,
it is still possible to deadlock on the big rtld lock with a small
tweak to the test case attached to this patch.  That problem is
however not a regression in 2.22 and hence could be tackled as a
separate project.  The test case is picked up as is from Alex's patch.

This change has been tested to verify that it does not cause any
issues on x86_64.  Once 2.23 opens, I'll remove all
attribute_tls_model_ie annotations from the libc.so and libpthread.so
sources.

ChangeLog:

	[BZ #18457]
	* nptl/Makefile (tests): New test case tst-join7.
	(modules-names): New test case module tst-join7mod.
	* nptl/tst-join7.c: New file.
	* nptl/tst-join7mod.c: New file.
	* Makeconfig (tls-model): Pass -ftls-model=initial-exec for
	all translation units in libc.so, libpthread.so and rtld.
---
 Makeconfig          |  6 +++++-
 nptl/Makefile       | 10 +++++++--
 nptl/tst-join7.c    | 46 ++++++++++++++++++++++++++++++++++++++++
 nptl/tst-join7mod.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 120 insertions(+), 3 deletions(-)
 create mode 100644 nptl/tst-join7.c
 create mode 100644 nptl/tst-join7mod.c

Comments

Joseph Myers July 22, 2015, 5:32 p.m. UTC | #1
On Tue, 21 Jul 2015, Siddhesh Poyarekar wrote:

> issues on x86_64.  Once 2.23 opens, I'll remove all
> attribute_tls_model_ie annotations from the libc.so and libpthread.so
> sources.

But not from header declarations of variables in those libraries that 
might be accessed by other libraries?
Siddhesh Poyarekar July 23, 2015, 2:04 a.m. UTC | #2
On Wed, Jul 22, 2015 at 05:32:44PM +0000, Joseph Myers wrote:
> But not from header declarations of variables in those libraries that 
> might be accessed by other libraries?

Yes.

Siddhesh
Siddhesh Poyarekar July 23, 2015, 7:25 a.m. UTC | #3
We're running out of time for the release deadline, so an early...

Ping!

On Tue, Jul 21, 2015 at 07:55:35PM +0530, Siddhesh Poyarekar wrote:
> The recently introduced TLS variables in the thread-local destructor
> implementation (__cxa_thread_atexit_impl) used the default GD access
> model, resulting in a call to __tls_get_addr.  This causes a deadlock
> with recent changes to the way TLS is initialized because DTV
> allocations are delayed and hence despite knowing the offset to the
> variable inside its TLS block, the thread has to take the global rtld
> lock to safely update the TLS offset.
> 
> This causes deadlocks when a thread is instantiated and joined inside
> a destructor of a dlopen'd DSO.  The correct long term fix is to
> somehow not take the lock, but that will need a lot deeper change set
> to alter the way in which the big rtld lock is used.
> 
> Instead, this patch just eliminates the call to __tls_get_addr for the
> thread-local variables inside libc.so, libpthread.so and rtld by
> building all of their units with -ftls-model=initial-exec.  I had
> initially claimed that this might affect arm, but that was a false alarm
> because arm uses -mtls-dialect, not tls-model.  Also, this causes only 2
> additional TLS variables to change to IE - a static buffer in __inet_ntoa
> and h_errno in libpthread.so.  So this change is not as wide-impacting as
> it seems.
> 
> There were concerns that the static storage for TLS is limited and
> hence we should not be using it.  Additionally, dynamically loaded
> modules may result in libc.so looking for this static storage pretty
> late in static binaries.  Both concerns are valid when using TLSDESC
> since that is where one may attempt to allocate a TLS block from
> static storage for even those variables that are not IE.  They're not
> very strong arguments for the traditional TLS model though, since it
> assumes that the static storage would be used sparingly and definitely
> not by default.  Hence, for now this would only theoretically affect
> ARM architectures.
> 
> The impact is hence limited to statically linked binaries that dlopen
> modules that in turn load libc.so, all that on arm hardware.  It seems
> like a small enough impact to justify fixing the larger problem that
> currently affects everything everywhere.
> 
> This still does not solve the original problem completely.  That is,
> it is still possible to deadlock on the big rtld lock with a small
> tweak to the test case attached to this patch.  That problem is
> however not a regression in 2.22 and hence could be tackled as a
> separate project.  The test case is picked up as is from Alex's patch.
> 
> This change has been tested to verify that it does not cause any
> issues on x86_64.  Once 2.23 opens, I'll remove all
> attribute_tls_model_ie annotations from the libc.so and libpthread.so
> sources.
> 
> ChangeLog:
> 
> 	[BZ #18457]
> 	* nptl/Makefile (tests): New test case tst-join7.
> 	(modules-names): New test case module tst-join7mod.
> 	* nptl/tst-join7.c: New file.
> 	* nptl/tst-join7mod.c: New file.
> 	* Makeconfig (tls-model): Pass -ftls-model=initial-exec for
> 	all translation units in libc.so, libpthread.so and rtld.
> ---
>  Makeconfig          |  6 +++++-
>  nptl/Makefile       | 10 +++++++--
>  nptl/tst-join7.c    | 46 ++++++++++++++++++++++++++++++++++++++++
>  nptl/tst-join7mod.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 120 insertions(+), 3 deletions(-)
>  create mode 100644 nptl/tst-join7.c
>  create mode 100644 nptl/tst-join7mod.c
> 
> diff --git a/Makeconfig b/Makeconfig
> index 7b46323..f136b88 100644
> --- a/Makeconfig
> +++ b/Makeconfig
> @@ -858,6 +858,10 @@ in-module = $(subst -,_,$(firstword $(libof-$(basename $(@F))) \
>  				    $(libof-$(@F)) \
>  				    libc))
>  
> +# Build ld.so, libc.so and libpthread.so with -ftls-model=initial-exec
> +tls-model = $(if $(filter libpthread rtld \
> +			  libc,$(in-module)),-ftls-model=initial-exec,)
> +
>  module-cppflags-real = -include $(common-objpfx)libc-modules.h \
>  		       -DMODULE_NAME=$(in-module)
>  
> @@ -883,7 +887,7 @@ CPPFLAGS = $(config-extra-cppflags) $(CPPUNDEFS) $(CPPFLAGS-config) \
>  override CFLAGS	= -std=gnu99 $(gnu89-inline-CFLAGS) $(config-extra-cflags) \
>  		  $(filter-out %frame-pointer,$(+cflags)) $(+gccwarn-c) \
>  		  $(sysdep-CFLAGS) $(CFLAGS-$(suffix $@)) $(CFLAGS-$(<F)) \
> -		  $(CFLAGS-$(@F)) \
> +		  $(CFLAGS-$(@F)) $(tls-model) \
>  		  $(foreach lib,$(libof-$(basename $(@F))) \
>  				$(libof-$(<F)) $(libof-$(@F)),$(CFLAGS-$(lib)))
>  override CXXFLAGS = $(c++-sysincludes) \
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 140f063..aaca0a4 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -245,7 +245,7 @@ tests = tst-typesizes \
>  	tst-basic7 \
>  	tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \
>  	tst-raise1 \
> -	tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 \
> +	tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 tst-join7 \
>  	tst-detach1 \
>  	tst-eintr1 tst-eintr2 tst-eintr3 tst-eintr4 tst-eintr5 \
>  	tst-tsd1 tst-tsd2 tst-tsd3 tst-tsd4 tst-tsd5 tst-tsd6 \
> @@ -327,7 +327,8 @@ endif
>  modules-names = tst-atfork2mod tst-tls3mod tst-tls4moda tst-tls4modb \
>  		tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
>  		tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
> -		tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod
> +		tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod \
> +		tst-join7mod
>  extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) tst-cleanup4aux.o
>  test-extras += $(modules-names) tst-cleanup4aux
>  test-modules = $(addprefix $(objpfx),$(addsuffix .so,$(modules-names)))
> @@ -532,6 +533,11 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
>  	$(evaluate-test)
>  endif
>  
> +$(objpfx)tst-join7: $(libdl) $(shared-thread-library)
> +$(objpfx)tst-join7.out: $(objpfx)tst-join7mod.so
> +$(objpfx)tst-join7mod.so: $(shared-thread-library)
> +LDFLAGS-tst-join7mod.so = -Wl,-soname,tst-join7mod.so
> +
>  $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
>  
>  $(objpfx)tst-fini1: $(shared-thread-library) $(objpfx)tst-fini1mod.so
> diff --git a/nptl/tst-join7.c b/nptl/tst-join7.c
> new file mode 100644
> index 0000000..439d0fc
> --- /dev/null
> +++ b/nptl/tst-join7.c
> @@ -0,0 +1,46 @@
> +/* Verify that TLS access in separate thread in a dlopened library does not
> +   deadlock.
> +   Copyright (C) 2015 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <dlfcn.h>
> +
> +/* When one dynamically loads a module, which spawns a thread to perform some
> +   activities, it could be possible that TLS storage is accessed for the first
> +   time in that thread.  This results in an allocation request within the
> +   thread, which could result in an attempt to take the rtld load_lock.  This
> +   is a problem because it would then deadlock with the dlopen (which owns the
> +   lock), if the main thread is waiting for the spawned thread to exit.  We can
> +   at least ensure that this problem does not occur due to accesses within
> +   libc.so, by marking TLS variables within libc.so as IE.  The problem of an
> +   arbitrary variable being accessed and constructed within such a thread still
> +   exists but this test case does not verify that.  */
> +
> +int
> +do_test (void)
> +{
> +  void *f = dlopen ("tst-join7mod.so", RTLD_NOW | RTLD_GLOBAL);
> +  if (f)
> +    dlclose (f);
> +  else
> +    return 1;
> +
> +  return 0;
> +}
> +
> +#define TEST_FUNCTION do_test ()
> +#include "../test-skeleton.c"
> diff --git a/nptl/tst-join7mod.c b/nptl/tst-join7mod.c
> new file mode 100644
> index 0000000..92bb381
> --- /dev/null
> +++ b/nptl/tst-join7mod.c
> @@ -0,0 +1,61 @@
> +/* Verify that TLS access in separate thread in a dlopened library does not
> +   deadlock - the module.
> +   Copyright (C) 2015 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <stdio.h>
> +#include <pthread.h>
> +#include <atomic.h>
> +
> +static pthread_t th;
> +static int running = 1;
> +
> +static void *
> +test_run (void *p)
> +{
> +  while (atomic_load_relaxed (&running))
> +    printf ("Test running\n");
> +  printf ("Test finished\n");
> +  return NULL;
> +}
> +
> +static void __attribute__ ((constructor))
> +do_init (void)
> +{
> +  int ret = pthread_create (&th, NULL, test_run, NULL);
> +
> +  if (ret != 0)
> +    {
> +      printf ("failed to create thread: %s (%d)\n", strerror (ret), ret);
> +      exit (1);
> +    }
> +}
> +
> +static void __attribute__ ((destructor))
> +do_end (void)
> +{
> +  atomic_store_relaxed (&running, 0);
> +  int ret = pthread_join (th, NULL);
> +
> +  if (ret != 0)
> +    {
> +      printf ("pthread_join: %s(%d)\n", strerror (ret), ret);
> +      exit (1);
> +    }
> +
> +  printf ("Thread joined\n");
> +}
> -- 
> 2.4.3
>
Roland McGrath July 24, 2015, 2:02 a.m. UTC | #4
It looks fine to me.
diff mbox

Patch

diff --git a/Makeconfig b/Makeconfig
index 7b46323..f136b88 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -858,6 +858,10 @@  in-module = $(subst -,_,$(firstword $(libof-$(basename $(@F))) \
 				    $(libof-$(@F)) \
 				    libc))
 
+# Build ld.so, libc.so and libpthread.so with -ftls-model=initial-exec
+tls-model = $(if $(filter libpthread rtld \
+			  libc,$(in-module)),-ftls-model=initial-exec,)
+
 module-cppflags-real = -include $(common-objpfx)libc-modules.h \
 		       -DMODULE_NAME=$(in-module)
 
@@ -883,7 +887,7 @@  CPPFLAGS = $(config-extra-cppflags) $(CPPUNDEFS) $(CPPFLAGS-config) \
 override CFLAGS	= -std=gnu99 $(gnu89-inline-CFLAGS) $(config-extra-cflags) \
 		  $(filter-out %frame-pointer,$(+cflags)) $(+gccwarn-c) \
 		  $(sysdep-CFLAGS) $(CFLAGS-$(suffix $@)) $(CFLAGS-$(<F)) \
-		  $(CFLAGS-$(@F)) \
+		  $(CFLAGS-$(@F)) $(tls-model) \
 		  $(foreach lib,$(libof-$(basename $(@F))) \
 				$(libof-$(<F)) $(libof-$(@F)),$(CFLAGS-$(lib)))
 override CXXFLAGS = $(c++-sysincludes) \
diff --git a/nptl/Makefile b/nptl/Makefile
index 140f063..aaca0a4 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -245,7 +245,7 @@  tests = tst-typesizes \
 	tst-basic7 \
 	tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \
 	tst-raise1 \
-	tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 \
+	tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 tst-join7 \
 	tst-detach1 \
 	tst-eintr1 tst-eintr2 tst-eintr3 tst-eintr4 tst-eintr5 \
 	tst-tsd1 tst-tsd2 tst-tsd3 tst-tsd4 tst-tsd5 tst-tsd6 \
@@ -327,7 +327,8 @@  endif
 modules-names = tst-atfork2mod tst-tls3mod tst-tls4moda tst-tls4modb \
 		tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
 		tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
-		tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod
+		tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod \
+		tst-join7mod
 extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) tst-cleanup4aux.o
 test-extras += $(modules-names) tst-cleanup4aux
 test-modules = $(addprefix $(objpfx),$(addsuffix .so,$(modules-names)))
@@ -532,6 +533,11 @@  $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
 	$(evaluate-test)
 endif
 
+$(objpfx)tst-join7: $(libdl) $(shared-thread-library)
+$(objpfx)tst-join7.out: $(objpfx)tst-join7mod.so
+$(objpfx)tst-join7mod.so: $(shared-thread-library)
+LDFLAGS-tst-join7mod.so = -Wl,-soname,tst-join7mod.so
+
 $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
 
 $(objpfx)tst-fini1: $(shared-thread-library) $(objpfx)tst-fini1mod.so
diff --git a/nptl/tst-join7.c b/nptl/tst-join7.c
new file mode 100644
index 0000000..439d0fc
--- /dev/null
+++ b/nptl/tst-join7.c
@@ -0,0 +1,46 @@ 
+/* Verify that TLS access in separate thread in a dlopened library does not
+   deadlock.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <dlfcn.h>
+
+/* When one dynamically loads a module, which spawns a thread to perform some
+   activities, it could be possible that TLS storage is accessed for the first
+   time in that thread.  This results in an allocation request within the
+   thread, which could result in an attempt to take the rtld load_lock.  This
+   is a problem because it would then deadlock with the dlopen (which owns the
+   lock), if the main thread is waiting for the spawned thread to exit.  We can
+   at least ensure that this problem does not occur due to accesses within
+   libc.so, by marking TLS variables within libc.so as IE.  The problem of an
+   arbitrary variable being accessed and constructed within such a thread still
+   exists but this test case does not verify that.  */
+
+int
+do_test (void)
+{
+  void *f = dlopen ("tst-join7mod.so", RTLD_NOW | RTLD_GLOBAL);
+  if (f)
+    dlclose (f);
+  else
+    return 1;
+
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
diff --git a/nptl/tst-join7mod.c b/nptl/tst-join7mod.c
new file mode 100644
index 0000000..92bb381
--- /dev/null
+++ b/nptl/tst-join7mod.c
@@ -0,0 +1,61 @@ 
+/* Verify that TLS access in separate thread in a dlopened library does not
+   deadlock - the module.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <pthread.h>
+#include <atomic.h>
+
+static pthread_t th;
+static int running = 1;
+
+static void *
+test_run (void *p)
+{
+  while (atomic_load_relaxed (&running))
+    printf ("Test running\n");
+  printf ("Test finished\n");
+  return NULL;
+}
+
+static void __attribute__ ((constructor))
+do_init (void)
+{
+  int ret = pthread_create (&th, NULL, test_run, NULL);
+
+  if (ret != 0)
+    {
+      printf ("failed to create thread: %s (%d)\n", strerror (ret), ret);
+      exit (1);
+    }
+}
+
+static void __attribute__ ((destructor))
+do_end (void)
+{
+  atomic_store_relaxed (&running, 0);
+  int ret = pthread_join (th, NULL);
+
+  if (ret != 0)
+    {
+      printf ("pthread_join: %s(%d)\n", strerror (ret), ret);
+      exit (1);
+    }
+
+  printf ("Thread joined\n");
+}