C++ PATCH for target/54908 (thread_local vs emutls)

Message ID 50F9CCAE.7070409@redhat.com
State New

Commit Message

Jason Merrill Jan. 18, 2013, 10:29 p.m. UTC
I've been thinking about how to deal with the problems with calling 
initialization functions in another translation unit; namely, that it's 
significant overhead even for cases where it isn't useful, and that it 
requires alias and weak reference support.  This patch introduces a new 
flag -fno-extern-tls-init that allows the user to disable support for 
calling a lazy initialization function in a different translation unit 
to avoid the overhead.  It also disables it by default for targets 
without support for aliases or weak references.

Tested x86_64-pc-linux-gnu.  Jack, does this fix the thread_local tests 
on Darwin?
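
For readers following the thread, the lazy-initialization scheme being discussed can be modeled in plain C++. This is a hand-written sketch, not the code GCC generates: i_storage, tls_guard, tls_init_model, i_init and i_wrapper are hypothetical stand-ins for the variable, the __tls_guard flag, the per-TU __tls_init function, the per-variable init symbol (_ZTH1i in the tests) and the wrapper (_ZTW1i). A null i_init models an init function that is absent or not called.

```cpp
#include <cassert>

// Hypothetical dynamic initializer; counts how often it runs.
static int init_calls = 0;
static int compute_initial_value() { ++init_calls; return 42; }

// Stand-ins for the thread_local variable and the per-TU guard.
static thread_local int i_storage;
static thread_local bool tls_guard = false;

// Model of the shared per-TU init function: runs at most once per thread.
static void tls_init_model() {
    if (!tls_guard) {
        tls_guard = true;
        i_storage = compute_initial_value();
    }
}

// Model of the per-variable init symbol. A null pointer stands for
// "no init function to call" (assumption made for illustration).
using init_fn = void (*)();
static init_fn i_init = tls_init_model;

// Model of the wrapper: every use of the variable goes through here.
int& i_wrapper() {
    if (i_init)
        i_init();
    return i_storage;
}
```

Calling i_wrapper() repeatedly on one thread runs the initializer once. Setting i_init to nullptr approximates what a non-defining TU sees under -fno-extern-tls-init: the wrapper returns the variable without attempting any initialization.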

Comments

Jack Howarth Jan. 19, 2013, 2:20 a.m. UTC | #1
On Fri, Jan 18, 2013 at 05:29:02PM -0500, Jason Merrill wrote:
> I've been thinking about how to deal with the problems with calling  
> initialization functions in another translation unit; namely, that it's  
> significant overhead even for cases where it isn't useful, and that it  
> requires alias and weak reference support.  This patch introduces a new  
> flag -fno-extern-tls-init that allows the user to disable support for  
> calling a lazy initialization function in a different translation unit  
> to avoid the overhead.  It also disables it by default for targets  
> without support for aliases or weak references.
>
> Tested x86_64-pc-linux-gnu.  Jack, does this fix the thread_local tests  
> on Darwin?

Jason,
   The proposed patch eliminates all of the previous failures from PR54908
at both -m32/-m64 on x86_64-apple-darwin12 but seems to introduce a new set of
failures...

FAIL: g++.dg/tls/thread_local-wrap3.C scan-assembler _ZTH1i
FAIL: g++.dg/gomp/tls-wrap3.C -std=c++98  scan-assembler _ZTH1i
FAIL: g++.dg/gomp/tls-wrap3.C -std=c++11  scan-assembler _ZTH1i

at both -m32 and -m64. The thread_local-wrap3.s generated with...

/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/testsuite/g++/../../xg++ -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/testsuite/g++/../../ /sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/gcc/testsuite/g++.dg/tls/thread_local-wrap3.C  -fno-diagnostics-show-caret  -nostdinc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include/x86_64-apple-darwin12.2.0 -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/libstdc++-v3/libsupc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/libstdc++-v3/include/backward -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/libstdc++-v3/testsuite/util -fmessage-length=0  -std=c++11  -S  -m64 -o thread_local-wrap3.s

is attached. Recompiling the testcase with -fextern-tls-init doesn't produce the missing _ZTH1i.
            Jack
ps I'll post full regression results tomorrow.


> commit b3cf0f73d853bb9f0d5bfde22995eafc244210d3
> Author: Jason Merrill <jason@redhat.com>
> Date:   Thu Jan 17 06:15:22 2013 -0500
> 
>     	PR target/54908
>     c-family/
>     	* c.opt (-fextern-tls-init): New.
>     	* c-opts.c (c_common_post_options): Handle it.
>     cp/
>     	* decl2.c (get_local_tls_init_fn): New.
>     	(get_tls_init_fn): Handle flag_extern_tls_init.  Don't bother
>     	with aliases for internal variables.  Don't use weakrefs if
>     	the variable needs destruction.
>     	(generate_tls_wrapper): Mark the wrapper as const if no
>     	initialization is needed.
>     	(handle_tls_init): Don't require aliases.
> 
> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
> index 3fabb36..1a922a8 100644
> --- a/gcc/c-family/c-opts.c
> +++ b/gcc/c-family/c-opts.c
> @@ -901,6 +901,20 @@ c_common_post_options (const char **pfilename)
>    else if (warn_narrowing == -1)
>      warn_narrowing = 0;
>  
> +  if (flag_extern_tls_init)
> +    {
> +#if !defined (ASM_OUTPUT_DEF) || !SUPPORTS_WEAK
> +      /* Lazy TLS initialization for a variable in another TU requires
> +	 alias and weak reference support. */
> +      if (flag_extern_tls_init > 0)
> +	sorry ("external TLS initialization functions not supported "
> +	       "on this target");
> +      flag_extern_tls_init = 0;
> +#else
> +      flag_extern_tls_init = 1;
> +#endif
> +    }
> +
>    if (flag_preprocess_only)
>      {
>        /* Open the output now.  We must do so even if flag_no_output is
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 187f3be..10ae84d 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -913,6 +913,9 @@ finput-charset=
>  C ObjC C++ ObjC++ Joined RejectNegative
>  -finput-charset=<cset>	Specify the default character set for source files
>  
> +fextern-tls-init
> +C++ ObjC++ Var(flag_extern_tls_init) Init(-1)
> +Support dynamic initialization of thread-local variables in a different translation unit
>  
>  fexternal-templates
>  C++ ObjC++ Ignore Warn(switch %qs is no longer supported)
> diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
> index 619d30d..4496395 100644
> --- a/gcc/cp/decl2.c
> +++ b/gcc/cp/decl2.c
> @@ -2812,6 +2812,28 @@ var_needs_tls_wrapper (tree var)
>  	  && !var_defined_without_dynamic_init (var));
>  }
>  
> +/* Get the FUNCTION_DECL for the shared TLS init function for this
> +   translation unit.  */
> +
> +static tree
> +get_local_tls_init_fn (void)
> +{
> +  tree sname = get_identifier ("__tls_init");
> +  tree fn = IDENTIFIER_GLOBAL_VALUE (sname);
> +  if (!fn)
> +    {
> +      fn = build_lang_decl (FUNCTION_DECL, sname,
> +			     build_function_type (void_type_node,
> +						  void_list_node));
> +      SET_DECL_LANGUAGE (fn, lang_c);
> +      TREE_PUBLIC (fn) = false;
> +      DECL_ARTIFICIAL (fn) = true;
> +      mark_used (fn);
> +      SET_IDENTIFIER_GLOBAL_VALUE (sname, fn);
> +    }
> +  return fn;
> +}
> +
>  /* Get a FUNCTION_DECL for the init function for the thread_local
>     variable VAR.  The init function will be an alias to the function
>     that initializes all the non-local TLS variables in the translation
> @@ -2824,6 +2846,18 @@ get_tls_init_fn (tree var)
>    if (!var_needs_tls_wrapper (var))
>      return NULL_TREE;
>  
> +  /* If -fno-extern-tls-init, assume that we don't need to call
> +     a tls init function for a variable defined in another TU.  */
> +  if (!flag_extern_tls_init && DECL_EXTERNAL (var))
> +    return NULL_TREE;
> +
> +#ifdef ASM_OUTPUT_DEF
> +  /* If the variable is internal, or if we can't generate aliases,
> +     call the local init function directly.  */
> +  if (!TREE_PUBLIC (var))
> +#endif
> +    return get_local_tls_init_fn ();
> +
>    tree sname = mangle_tls_init_fn (var);
>    tree fn = IDENTIFIER_GLOBAL_VALUE (sname);
>    if (!fn)
> @@ -2841,11 +2875,12 @@ get_tls_init_fn (tree var)
>        if (TREE_PUBLIC (var))
>  	{
>  	  tree obtype = strip_array_types (non_reference (TREE_TYPE (var)));
> -	  /* If the variable might have static initialization, make the
> -	     init function a weak reference.  */
> +	  /* If the variable is defined somewhere else and might have static
> +	     initialization, make the init function a weak reference.  */
>  	  if ((!TYPE_NEEDS_CONSTRUCTING (obtype)
>  	       || TYPE_HAS_CONSTEXPR_CTOR (obtype))
> -	      && TARGET_SUPPORTS_WEAK)
> +	      && TYPE_HAS_TRIVIAL_DESTRUCTOR (obtype)
> +	      && DECL_EXTERNAL (var))
>  	    declare_weak (fn);
>  	  else
>  	    DECL_WEAK (fn) = DECL_WEAK (var);
> @@ -2956,6 +2991,9 @@ generate_tls_wrapper (tree fn)
>  	  finish_if_stmt (if_stmt);
>  	}
>      }
> +  else
> +    /* If there's no initialization, the wrapper is a constant function.  */
> +    TREE_READONLY (fn) = true;
>    finish_return_stmt (convert_from_reference (var));
>    finish_function_body (body);
>    expand_or_defer_fn (finish_function (0));
> @@ -3861,15 +3899,6 @@ handle_tls_init (void)
>  
>    location_t loc = DECL_SOURCE_LOCATION (TREE_VALUE (vars));
>  
> -  #ifndef ASM_OUTPUT_DEF
> -  /* This currently requires alias support.  FIXME other targets could use
> -     small thunks instead of aliases.  */
> -  input_location = loc;
> -  sorry ("dynamic initialization of non-function-local thread_local "
> -	 "variables not supported on this target");
> -  return;
> -  #endif
> -
>    write_out_vars (vars);
>  
>    tree guard = build_decl (loc, VAR_DECL, get_identifier ("__tls_guard"),
> @@ -3882,14 +3911,7 @@ handle_tls_init (void)
>    DECL_TLS_MODEL (guard) = decl_default_tls_model (guard);
>    pushdecl_top_level_and_finish (guard, NULL_TREE);
>  
> -  tree fn = build_lang_decl (FUNCTION_DECL,
> -			     get_identifier ("__tls_init"),
> -			     build_function_type (void_type_node,
> -						  void_list_node));
> -  SET_DECL_LANGUAGE (fn, lang_c);
> -  TREE_PUBLIC (fn) = false;
> -  DECL_ARTIFICIAL (fn) = true;
> -  mark_used (fn);
> +  tree fn = get_local_tls_init_fn ();
>    start_preparsed_function (fn, NULL_TREE, SF_PRE_PARSED);
>    tree body = begin_function_body ();
>    tree if_stmt = begin_if_stmt ();
> @@ -3904,11 +3926,17 @@ handle_tls_init (void)
>        tree init = TREE_PURPOSE (vars);
>        one_static_initialization_or_destruction (var, init, true);
>  
> -      tree single_init_fn = get_tls_init_fn (var);
> -      cgraph_node *alias
> -	= cgraph_same_body_alias (cgraph_get_create_node (fn),
> -				  single_init_fn, fn);
> -      gcc_assert (alias != NULL);
> +#ifdef ASM_OUTPUT_DEF
> +      /* Output init aliases even with -fno-extern-tls-init.  */
> +      if (TREE_PUBLIC (var))
> +	{
> +          tree single_init_fn = get_tls_init_fn (var);
> +	  cgraph_node *alias
> +	    = cgraph_same_body_alias (cgraph_get_create_node (fn),
> +				      single_init_fn, fn);
> +	  gcc_assert (alias != NULL);
> +	}
> +#endif
>      }
>  
>    finish_then_clause (if_stmt);
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 9ef6c93..48ee779 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -2029,6 +2029,29 @@ exceptions in violation of the exception specifications; the compiler
>  still optimizes based on the specifications, so throwing an
>  unexpected exception results in undefined behavior at run time.
>  
> +@item -fextern-tls-init
> +@itemx -fno-extern-tls-init
> +@opindex fextern-tls-init
> +@opindex fno-extern-tls-init
> +The C++11 and OpenMP standards allow @samp{thread_local} and
> +@samp{threadprivate} variables to have dynamic (runtime)
> +initialization.  To support this, any use of such a variable goes
> +through a wrapper function that performs any necessary initialization.
> +When the use and definition of the variable are in the same
> +translation unit, this overhead can be optimized away, but when the
> +use is in a different translation unit there is significant overhead
> +even if the variable doesn't actually need dynamic initialization.  If
> +the programmer can be sure that no use of the variable in a
> +non-defining TU needs to trigger dynamic initialization (either
> +because the variable is statically initialized, or a use of the
> +variable in the defining TU will be executed before any uses in
> +another TU), they can avoid this overhead with the
> +@option{-fno-extern-tls-init} option.
> +
> +On targets that support symbol aliases, the default is
> +@option{-fextern-tls-init}.  On targets that do not support symbol
> +aliases, the default is @option{-fno-extern-tls-init}.
> +
>  @item -ffor-scope
>  @itemx -fno-for-scope
>  @opindex ffor-scope
> diff --git a/libgomp/testsuite/libgomp.c++/pr24455.C b/libgomp/testsuite/libgomp.c++/pr24455.C
> index 3185ca5..8256b66 100644
> --- a/libgomp/testsuite/libgomp.c++/pr24455.C
> +++ b/libgomp/testsuite/libgomp.c++/pr24455.C
> @@ -1,8 +1,7 @@
>  // { dg-do run }
>  // { dg-additional-sources pr24455-1.C }
>  // { dg-require-effective-target tls_runtime }
> -// { dg-options "-Wl,-G" { target powerpc-ibm-aix* } }
> -// { dg-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } }
> +// { dg-options "-fno-extern-tls-init" }
>  
>  extern "C" void abort (void);
>
.text
	.globl _main
_main:
LFB0:
	pushq	%rbp
LCFI0:
	movq	%rsp, %rbp
LCFI1:
	call	__ZTW1i
	movl	(%rax), %eax
	subl	$42, %eax
	popq	%rbp
LCFI2:
	ret
LFE0:
	.section __TEXT,__textcoal_nt,coalesced,pure_instructions
	.globl __ZTW1i
	.weak_definition __ZTW1i
	.private_extern __ZTW1i
__ZTW1i:
LFB1:
	pushq	%rbp
LCFI3:
	movq	%rsp, %rbp
LCFI4:
	movq	___emutls_v.i@GOTPCREL(%rip), %rax
	movq	%rax, %rdi
	call	___emutls_get_address
	popq	%rbp
LCFI5:
	ret
LFE1:
	.section __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support
EH_frame1:
	.set L$set$0,LECIE1-LSCIE1
	.long L$set$0
LSCIE1:
	.long	0
	.byte	0x1
	.ascii "zR\0"
	.byte	0x1
	.byte	0x78
	.byte	0x10
	.byte	0x1
	.byte	0x10
	.byte	0xc
	.byte	0x7
	.byte	0x8
	.byte	0x90
	.byte	0x1
	.align 3
LECIE1:
LSFDE1:
	.set L$set$1,LEFDE1-LASFDE1
	.long L$set$1
LASFDE1:
	.long	LASFDE1-EH_frame1
	.quad	LFB0-.
	.set L$set$2,LFE0-LFB0
	.quad L$set$2
	.byte	0
	.byte	0x4
	.set L$set$3,LCFI0-LFB0
	.long L$set$3
	.byte	0xe
	.byte	0x10
	.byte	0x86
	.byte	0x2
	.byte	0x4
	.set L$set$4,LCFI1-LCFI0
	.long L$set$4
	.byte	0xd
	.byte	0x6
	.byte	0x4
	.set L$set$5,LCFI2-LCFI1
	.long L$set$5
	.byte	0xc
	.byte	0x7
	.byte	0x8
	.align 3
LEFDE1:
LSFDE3:
	.set L$set$6,LEFDE3-LASFDE3
	.long L$set$6
LASFDE3:
	.long	LASFDE3-EH_frame1
	.quad	LFB1-.
	.set L$set$7,LFE1-LFB1
	.quad L$set$7
	.byte	0
	.byte	0x4
	.set L$set$8,LCFI3-LFB1
	.long L$set$8
	.byte	0xe
	.byte	0x10
	.byte	0x86
	.byte	0x2
	.byte	0x4
	.set L$set$9,LCFI4-LCFI3
	.long L$set$9
	.byte	0xd
	.byte	0x6
	.byte	0x4
	.set L$set$10,LCFI5-LCFI4
	.long L$set$10
	.byte	0xc
	.byte	0x7
	.byte	0x8
	.align 3
LEFDE3:
	.constructor
	.destructor
	.align 1
	.subsections_via_symbols

Jason Merrill Jan. 19, 2013, 5:15 a.m. UTC | #2
On 01/18/2013 09:20 PM, Jack Howarth wrote:
>     The proposed patch eliminates all of the previous failures from PR54908
> at both -m32/-m64 on x86_64-apple-darwin12 but seems to introduce a new set of
> failures...
>
> FAIL: g++.dg/tls/thread_local-wrap3.C scan-assembler _ZTH1i
> FAIL: g++.dg/gomp/tls-wrap3.C -std=c++98  scan-assembler _ZTH1i
> FAIL: g++.dg/gomp/tls-wrap3.C -std=c++11  scan-assembler _ZTH1i

Ah, yes; those are specifically testing for the aliases that we can't 
generate on darwin.  I'll add dg-require-alias to those tests.  Thanks 
for the testing.

> Recompiling the testcase with -fextern-tls-init doesn't produce the missing _ZTH1i.

Hmm, compiling with -fextern-tls-init should give an error for you. 
Doesn't it?

Jason
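
The error expected here follows from how the patch resolves the tri-state flag in c_common_post_options. The following is a minimal model of that logic, assuming the target capabilities are passed in as runtime booleans (in the patch they are the compile-time checks on ASM_OUTPUT_DEF and SUPPORTS_WEAK). The names Resolved and resolve_extern_tls_init are hypothetical.

```cpp
#include <cassert>

// Result of resolving the option: the final flag value and whether
// the "not supported on this target" diagnostic would be issued.
struct Resolved {
    int flag;
    bool sorry;
};

// flag: -1 = not given on the command line (the Init(-1) default),
//        0 = -fno-extern-tls-init, 1 = -fextern-tls-init.
Resolved resolve_extern_tls_init(int flag, bool has_alias, bool has_weak) {
    Resolved r{flag, false};
    if (flag != 0) {
        if (!has_alias || !has_weak) {
            // Only an explicit request is diagnosed; the default
            // silently falls back to "off".
            r.sorry = flag > 0;
            r.flag = 0;
        } else {
            r.flag = 1;
        }
    }
    return r;
}
```

Under this model, a target without alias support gets flag 0 by default with no diagnostic, and an explicit -fextern-tls-init on such a target yields the diagnostic. That is why an error is expected on Darwin here.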

Patch

commit b3cf0f73d853bb9f0d5bfde22995eafc244210d3
Author: Jason Merrill <jason@redhat.com>
Date:   Thu Jan 17 06:15:22 2013 -0500

    	PR target/54908
    c-family/
    	* c.opt (-fextern-tls-init): New.
    	* c-opts.c (c_common_post_options): Handle it.
    cp/
    	* decl2.c (get_local_tls_init_fn): New.
    	(get_tls_init_fn): Handle flag_extern_tls_init.  Don't bother
    	with aliases for internal variables.  Don't use weakrefs if
    	the variable needs destruction.
    	(generate_tls_wrapper): Mark the wrapper as const if no
    	initialization is needed.
    	(handle_tls_init): Don't require aliases.

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 3fabb36..1a922a8 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -901,6 +901,20 @@  c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
     warn_narrowing = 0;
 
+  if (flag_extern_tls_init)
+    {
+#if !defined (ASM_OUTPUT_DEF) || !SUPPORTS_WEAK
+      /* Lazy TLS initialization for a variable in another TU requires
+	 alias and weak reference support. */
+      if (flag_extern_tls_init > 0)
+	sorry ("external TLS initialization functions not supported "
+	       "on this target");
+      flag_extern_tls_init = 0;
+#else
+      flag_extern_tls_init = 1;
+#endif
+    }
+
   if (flag_preprocess_only)
     {
       /* Open the output now.  We must do so even if flag_no_output is
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 187f3be..10ae84d 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -913,6 +913,9 @@  finput-charset=
 C ObjC C++ ObjC++ Joined RejectNegative
 -finput-charset=<cset>	Specify the default character set for source files
 
+fextern-tls-init
+C++ ObjC++ Var(flag_extern_tls_init) Init(-1)
+Support dynamic initialization of thread-local variables in a different translation unit
 
 fexternal-templates
 C++ ObjC++ Ignore Warn(switch %qs is no longer supported)
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 619d30d..4496395 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2812,6 +2812,28 @@  var_needs_tls_wrapper (tree var)
 	  && !var_defined_without_dynamic_init (var));
 }
 
+/* Get the FUNCTION_DECL for the shared TLS init function for this
+   translation unit.  */
+
+static tree
+get_local_tls_init_fn (void)
+{
+  tree sname = get_identifier ("__tls_init");
+  tree fn = IDENTIFIER_GLOBAL_VALUE (sname);
+  if (!fn)
+    {
+      fn = build_lang_decl (FUNCTION_DECL, sname,
+			     build_function_type (void_type_node,
+						  void_list_node));
+      SET_DECL_LANGUAGE (fn, lang_c);
+      TREE_PUBLIC (fn) = false;
+      DECL_ARTIFICIAL (fn) = true;
+      mark_used (fn);
+      SET_IDENTIFIER_GLOBAL_VALUE (sname, fn);
+    }
+  return fn;
+}
+
 /* Get a FUNCTION_DECL for the init function for the thread_local
    variable VAR.  The init function will be an alias to the function
    that initializes all the non-local TLS variables in the translation
@@ -2824,6 +2846,18 @@  get_tls_init_fn (tree var)
   if (!var_needs_tls_wrapper (var))
     return NULL_TREE;
 
+  /* If -fno-extern-tls-init, assume that we don't need to call
+     a tls init function for a variable defined in another TU.  */
+  if (!flag_extern_tls_init && DECL_EXTERNAL (var))
+    return NULL_TREE;
+
+#ifdef ASM_OUTPUT_DEF
+  /* If the variable is internal, or if we can't generate aliases,
+     call the local init function directly.  */
+  if (!TREE_PUBLIC (var))
+#endif
+    return get_local_tls_init_fn ();
+
   tree sname = mangle_tls_init_fn (var);
   tree fn = IDENTIFIER_GLOBAL_VALUE (sname);
   if (!fn)
@@ -2841,11 +2875,12 @@  get_tls_init_fn (tree var)
       if (TREE_PUBLIC (var))
 	{
 	  tree obtype = strip_array_types (non_reference (TREE_TYPE (var)));
-	  /* If the variable might have static initialization, make the
-	     init function a weak reference.  */
+	  /* If the variable is defined somewhere else and might have static
+	     initialization, make the init function a weak reference.  */
 	  if ((!TYPE_NEEDS_CONSTRUCTING (obtype)
 	       || TYPE_HAS_CONSTEXPR_CTOR (obtype))
-	      && TARGET_SUPPORTS_WEAK)
+	      && TYPE_HAS_TRIVIAL_DESTRUCTOR (obtype)
+	      && DECL_EXTERNAL (var))
 	    declare_weak (fn);
 	  else
 	    DECL_WEAK (fn) = DECL_WEAK (var);
@@ -2956,6 +2991,9 @@  generate_tls_wrapper (tree fn)
 	  finish_if_stmt (if_stmt);
 	}
     }
+  else
+    /* If there's no initialization, the wrapper is a constant function.  */
+    TREE_READONLY (fn) = true;
   finish_return_stmt (convert_from_reference (var));
   finish_function_body (body);
   expand_or_defer_fn (finish_function (0));
@@ -3861,15 +3899,6 @@  handle_tls_init (void)
 
   location_t loc = DECL_SOURCE_LOCATION (TREE_VALUE (vars));
 
-  #ifndef ASM_OUTPUT_DEF
-  /* This currently requires alias support.  FIXME other targets could use
-     small thunks instead of aliases.  */
-  input_location = loc;
-  sorry ("dynamic initialization of non-function-local thread_local "
-	 "variables not supported on this target");
-  return;
-  #endif
-
   write_out_vars (vars);
 
   tree guard = build_decl (loc, VAR_DECL, get_identifier ("__tls_guard"),
@@ -3882,14 +3911,7 @@  handle_tls_init (void)
   DECL_TLS_MODEL (guard) = decl_default_tls_model (guard);
   pushdecl_top_level_and_finish (guard, NULL_TREE);
 
-  tree fn = build_lang_decl (FUNCTION_DECL,
-			     get_identifier ("__tls_init"),
-			     build_function_type (void_type_node,
-						  void_list_node));
-  SET_DECL_LANGUAGE (fn, lang_c);
-  TREE_PUBLIC (fn) = false;
-  DECL_ARTIFICIAL (fn) = true;
-  mark_used (fn);
+  tree fn = get_local_tls_init_fn ();
   start_preparsed_function (fn, NULL_TREE, SF_PRE_PARSED);
   tree body = begin_function_body ();
   tree if_stmt = begin_if_stmt ();
@@ -3904,11 +3926,17 @@  handle_tls_init (void)
       tree init = TREE_PURPOSE (vars);
       one_static_initialization_or_destruction (var, init, true);
 
-      tree single_init_fn = get_tls_init_fn (var);
-      cgraph_node *alias
-	= cgraph_same_body_alias (cgraph_get_create_node (fn),
-				  single_init_fn, fn);
-      gcc_assert (alias != NULL);
+#ifdef ASM_OUTPUT_DEF
+      /* Output init aliases even with -fno-extern-tls-init.  */
+      if (TREE_PUBLIC (var))
+	{
+          tree single_init_fn = get_tls_init_fn (var);
+	  cgraph_node *alias
+	    = cgraph_same_body_alias (cgraph_get_create_node (fn),
+				      single_init_fn, fn);
+	  gcc_assert (alias != NULL);
+	}
+#endif
     }
 
   finish_then_clause (if_stmt);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9ef6c93..48ee779 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2029,6 +2029,29 @@  exceptions in violation of the exception specifications; the compiler
 still optimizes based on the specifications, so throwing an
 unexpected exception results in undefined behavior at run time.
 
+@item -fextern-tls-init
+@itemx -fno-extern-tls-init
+@opindex fextern-tls-init
+@opindex fno-extern-tls-init
+The C++11 and OpenMP standards allow @samp{thread_local} and
+@samp{threadprivate} variables to have dynamic (runtime)
+initialization.  To support this, any use of such a variable goes
+through a wrapper function that performs any necessary initialization.
+When the use and definition of the variable are in the same
+translation unit, this overhead can be optimized away, but when the
+use is in a different translation unit there is significant overhead
+even if the variable doesn't actually need dynamic initialization.  If
+the programmer can be sure that no use of the variable in a
+non-defining TU needs to trigger dynamic initialization (either
+because the variable is statically initialized, or a use of the
+variable in the defining TU will be executed before any uses in
+another TU), they can avoid this overhead with the
+@option{-fno-extern-tls-init} option.
+
+On targets that support symbol aliases, the default is
+@option{-fextern-tls-init}.  On targets that do not support symbol
+aliases, the default is @option{-fno-extern-tls-init}.
+
 @item -ffor-scope
 @itemx -fno-for-scope
 @opindex ffor-scope
diff --git a/libgomp/testsuite/libgomp.c++/pr24455.C b/libgomp/testsuite/libgomp.c++/pr24455.C
index 3185ca5..8256b66 100644
--- a/libgomp/testsuite/libgomp.c++/pr24455.C
+++ b/libgomp/testsuite/libgomp.c++/pr24455.C
@@ -1,8 +1,7 @@ 
 // { dg-do run }
 // { dg-additional-sources pr24455-1.C }
 // { dg-require-effective-target tls_runtime }
-// { dg-options "-Wl,-G" { target powerpc-ibm-aix* } }
-// { dg-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } }
+// { dg-options "-fno-extern-tls-init" }
 
 extern "C" void abort (void);