[updated] Make emulated TLS lto-friendly.

Message ID D2E274C1-9DA9-4A96-8518-33176DB8E3F7@sandoe-acoustics.co.uk
State New

Commit Message

Iain Sandoe July 9, 2010, 12:10 p.m. UTC
On 8 Jul 2010, at 20:19, Richard Henderson wrote:

> On 07/08/2010 12:07 PM, IainS wrote:
>>
>> On 8 Jul 2010, at 00:22, Richard Henderson wrote:
>>> Do you really need this kind of thing anymore, since you're exposing
>>> the use of the control variable so early?  I would have thought that
>>> varpool.c would no longer need any special-casing for !have_tls.
>>
>> This is my understanding (which might be flawed and/or incomplete).
>>
>> Whilst we are on the parse side, building varpool and cgraph, the
>> relationship is not fully exposed (that happens when gimplification
>> is done).
>>
>> So my reasoning was that the "ghost/proxy" vars should be made to
>> track the user-land ones until then.
>
> Hmm.  So what you're saying is that there's extra cleanup work that
> needs to happen while lowering the representation.  It's not merely
> a matter of code substitution during gimplification.

Attached is an updated patch which addresses the points you raised,
apart from my response above.

IMO it is not the world's most elegant solution,
... perhaps in part owing to my not knowing the "Best Places" to do some
things,
... but I suspect also because the whole thing doesn't sit comfortably.

I'm not sure where we stand in terms of applying this.
I guess it remains an interim solution to LTO on emuTLS targets, with
the observation that perhaps a different approach is needed now that we
have LTO.

I've tested on i686-apple-darwin9 and cris-elf that the changes don't
regress.
(In fact, my local TLS testsuite is somewhat wider than the one in
trunk.)

If the intention is to apply this then, of course, I'd test on Linux
first (it should make no difference there).

Known Minus:
  it fails the regression Jakub pointed out.

Known Plus:
  it allows LTO;
  it actually performs better on CSE than trunk does (test added).

I'd hazard that the CSE improvement is worth more than the lost
trivial zero return.
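
To illustrate the CSE point, here is a minimal sketch (not the code GCC
emits).  It assumes the emulated-TLS lookup returns the same address on
every call within a thread, so the accesses in a function shaped like the
one in thr-cse-1.c can share a single lookup.  The names
mock_emutls_get_address, foo_naive, foo_cse and check_lookup_counts are
made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime lookup.  It plays the role of
   the emutls address helper and counts how often it is called.  */
static int storage = 1;		/* this thread's copy of `a', initially 1 */
static int lookup_calls = 0;

static void *
mock_emutls_get_address (void *control)
{
  (void) control;		/* a real helper would use the control var */
  lookup_calls++;
  return &storage;
}

/* Naive lowering: every access of `a' performs its own lookup.  */
int
foo_naive (int b, int c, int d)
{
  *(int *) mock_emutls_get_address (NULL) += b;
  *(int *) mock_emutls_get_address (NULL) -= c;
  *(int *) mock_emutls_get_address (NULL) += d;
  return *(int *) mock_emutls_get_address (NULL);
}

/* After CSE: the address is looked up once and then reused.  */
int
foo_cse (int b, int c, int d)
{
  int *a = (int *) mock_emutls_get_address (NULL);
  *a += b;
  *a -= c;
  *a += d;
  return *a;
}

/* Check that both variants compute the same value, with four lookups
   in the naive form and one in the CSE'd form.  Returns 0 on success.  */
int
check_lookup_counts (void)
{
  int r;

  storage = 1;
  lookup_calls = 0;
  r = foo_naive (2, 1, 3);
  assert (r == 5 && lookup_calls == 4);

  storage = 1;
  lookup_calls = 0;
  r = foo_cse (2, 1, 3);
  assert (r == 5 && lookup_calls == 1);

  return 0;
}
```

The scan-assembler-not pattern in thr-cse-1.c checks for the second shape:
the helper name should not appear twice in the output.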

> In which case I wonder if it wouldn't be better to do as Honza
> suggested and separate all of this out into a new pass_lower_emutls.
> Perhaps to be placed just after pass_lower_vector.  That placement
> is before pass_build_cgraph_edges, which I believe means you would
> not have to fix up the cgraph edges for the new function calls.  All
> you'd need to do is transform the tls variable references and fix up
> the varpool references.

My £0.02, having tangled with this for the last few weeks...

1/ a pass behind a gate of ! targetm.have_tls is a whole lot less
intrusive to non-emutls targets than what we have.
2/ it's likely more efficient for emutls targets since there's less
to-ing and fro-ing.
3/ probably more maintainable.
4/ certainly more transparent.
5/ (probably) loses all reference to emutls from varasm, varpool,
expr, & gimplify.

mind you, that doesn't mean I know how to write such a pass ...  ;-)
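
As a rough illustration of point 1/, here is a toy model of a gated pass.
It is not GCC's pass manager: the types toy_target and toy_pass, and every
function below, are invented for this sketch.  The only idea taken from
the discussion above is that the gate tests for the absence of native TLS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are invented for illustration and are
   not GCC's pass manager structures.  */
struct toy_target
{
  bool have_tls;		/* true when the target has native TLS */
};

struct toy_pass
{
  const char *name;
  bool (*gate) (const struct toy_target *);
  void (*execute) (int *work_done);
};

/* Gate: run the lowering only on targets without native TLS.  */
static bool
gate_lower_emutls (const struct toy_target *t)
{
  return !t->have_tls;
}

/* Stand-in for the real work of rewriting TLS references.  */
static void
execute_lower_emutls (int *work_done)
{
  (*work_done)++;
}

static const struct toy_pass toy_lower_emutls =
{
  "lower_emutls", gate_lower_emutls, execute_lower_emutls
};

/* Run PASS for target T; return how many times its body executed.  */
int
run_toy_pass (const struct toy_pass *pass, const struct toy_target *t)
{
  int work_done = 0;

  if (pass->gate == NULL || pass->gate (t))
    pass->execute (&work_done);
  return work_done;
}

/* Convenience wrapper: run the toy lowering for a target that does or
   does not have native TLS.  */
int
run_lower_emutls_for (bool have_tls)
{
  struct toy_target t = { have_tls };

  return run_toy_pass (&toy_lower_emutls, &t);
}
```

With this shape a native-TLS target never enters the lowering code at all
(run_lower_emutls_for (true) does no work), which is the sense in which
the approach is less intrusive.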

Iain.

gcc:

	PR target/44132
	* expr.c (emutls_var_address): Remove.
	(expand_expr_addr_expr_1): Remove TLS emulation hook.
	(expand_expr_real_1): Ditto.

	* gimplify.c (emutls_var_address): Add proc.
	(gimplify_decl_expr): Expand TLS vars.
	(gimplify_var_or_parm_decl): Ditto.
	(omp_notice_variable): Recognize TLS_MODEL_EMULATED.

	* passes.c (rest_of_decl_compilation): Substitute TLS control vars
	for the master.

	* varasm.c (decl_needs_tls_emulation_p): New.
	(get_emutls_init_templ_addr): Adjust DECL_PRESERVE_P and DECL_INITIAL.
	(emutls_decl): Copy TREE_ADDRESSABLE, DECL_PRESERVE_P, create an
	initializer when one is present on the user-var.
	(emutls_common_1): Remove comment.
	(emutls_finalize_control_var): Copy TREE_USED, add an initializer for
	size and align fields for cases with un-initialized user vars.
	(emutls_find_user_var_cb): New.
	(emutls_add_base_initializer): New.
	(asm_output_bss): Assert not a tls var needing emulation.
	(asm_output_aligned_bss): Ditto.
	(assemble_variable): Remove control var init. code.  Assert not a tls
	var needing emulation.  Provide a trivial initializer for size and
	align fields if one is not already set.
	(do_assemble_alias): Do not handle emutls vars here.
	(var_decl_for_asm): New.
	(finish_aliases_1): Walk the alias pairs substituting emutls controls
	for the user counterparts.

	* varpool.c (varpool_mark_needed_node): Do not handle TLS
	substitution here.
	(decide_is_variable_needed): Or here.
	(varpool_finalize_decl): Handle TLS substitution.
	Remove early calls to varpool_assemble_pending_decls().
	Check enqueuing of vars after all tests for need are complete.
	(varpool_analyze_pending_decls): Do not record references if the
	initializer is error_mark.

testsuite:
	
	PR target/44132
	* gcc.dg/tls/thr-init-1.c: New.
	* gcc.dg/tls/thr-init-2.c: New.
	* gcc.dg/torture/tls: New.
	* gcc.dg/torture/tls/tls-test.c: New.
	* gcc.dg/torture/tls/thr-init-1.c: New.
	* gcc.dg/torture/tls/tls.exp: New.
	* gcc.dg/torture/tls/thr-init-2.c: New.

	* lib/target-supports.exp (check_effective_target_tls_emulated): New.
	* gcc.dg/tls/thr-cse-1.c: New.


Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 161974)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -596,6 +596,23 @@ proc check_effective_target_tls_native {} {
     }]
 }
 
+# Return 1 if *emulated* thread local storage (TLS) is supported, 0 otherwise.
+
+proc check_effective_target_tls_emulated {} {
+    # VxWorks uses emulated TLS machinery, but with non-standard helper
+    # functions, so we fail to automatically detect it.
+    global target_triplet
+    if { [regexp ".*-.*-vxworks.*" $target_triplet] } {
+	return 1
+    }
+    
+    return [check_no_messages_and_pattern tls_emulated "emutls" assembly {
+	__thread int i;
+	int f (void) { return i; }
+	void g (int j) { i = j; }
+    }]
+}
+
 # Return 1 if TLS executables can run correctly, 0 otherwise.
 
 proc check_effective_target_tls_runtime {} {
Index: gcc/testsuite/gcc.dg/tls/thr-cse-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tls/thr-cse-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tls/thr-cse-1.c	(revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+/* { dg-require-effective-target tls_emulated } */
+
+/* Test that we only get one call to emutls_get_address when CSE is
+   active.  Note that the var _must_ be initialized for the scan asm
+   to work, since otherwise there will be an initializer which will,
+   correctly, call emutls_get_address.  */
+int foo (int b, int c, int d)
+{
+  static __thread int a=1;
+  a += b;
+  a -= c;
+  a += d;
+  return a;
+}
+
+/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_address.*" { target { ! *-wrs-vxworks } } } } */
+/* { dg-final { scan-assembler-not "tls_lookup.*tls_lookup.*" { target *-wrs-vxworks } } } */
+
Index: gcc/testsuite/gcc.dg/tls/thr-init-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tls/thr-init-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tls/thr-init-1.c	(revision 0)
@@ -0,0 +1,8 @@
+/* { dg-require-effective-target tls } */
+/* { dg-do compile } */
+
+static __thread int fstat ;
+static __thread int fstat = 1 ;
+static __thread int fstat ;
+static __thread int fstat = 2; /* { dg-error "redefinition of 'fstat'" } */
+				/* { dg-message "note: previous definition of 'fstat' was here" "" { target *-*-* } 5 } */
Index: gcc/testsuite/gcc.dg/tls/thr-init-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tls/thr-init-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tls/thr-init-2.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-require-effective-target tls } */
+/* { dg-do run } */
+
+extern void abort() ;
+
+static __thread int fstat ;
+static __thread int fstat = 1;
+
+int test_code(int b)
+{
+  fstat += b ;
+  return fstat;
+}
+
+int main (int ac, char *av[])
+{
+  int a = test_code(1);
+  
+  if ((a != 2) || (fstat != 2))
+    abort () ;
+  
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/torture/tls/tls-test.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/tls/tls-test.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/tls/tls-test.c	(revision 0)
@@ -0,0 +1,52 @@
+/* { dg-do run }  */
+/* { dg-require-effective-target tls  }  */
+/* { dg-require-effective-target pthread } */
+/* { dg-options "-pthread" } */
+
+#include <pthread.h>
+extern int printf (const char *, ...);
+__thread int a = 5; 
+int *volatile a_in_other_thread = (int *) 12345;
+
+static void *
+thread_func (void *arg)
+{
+  a_in_other_thread = &a;
+  a+=5;
+  *((int *) arg) = a;
+  return (void *)0;
+}
+
+int
+main ()
+{
+  pthread_t thread;
+  void *thread_retval;
+  int *volatile a_in_main_thread;
+  int *volatile again ;
+  int thr_a;
+
+  a_in_main_thread = &a;
+
+  if (pthread_create (&thread, (pthread_attr_t *)0, thread_func, &thr_a))
+    return 0;
+
+  if (pthread_join (thread, &thread_retval))
+    return 0;
+
+  again = &a;
+  if (again != a_in_main_thread)
+    {
+      printf ("FAIL: main thread addy changed from %p to %p\n",
+		(void *) a_in_main_thread, (void *) again);
+      return 1;
+    }
+
+  if (a != 5 || thr_a != 10 || (a_in_other_thread == a_in_main_thread))
+    {
+      printf ("FAIL: a= %d, thr_a = %d Addr = %p\n",
+		a, thr_a, (void *) a_in_other_thread);
+      return 1;
+    }
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/torture/tls/thr-init-1.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/tls/thr-init-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/tls/thr-init-1.c	(revision 0)
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-require-effective-target tls } */
+
+extern int printf (const char *, ...);
+extern void abort() ;
+
+int test_code(int b)
+{
+static __thread int fstat = 1;
+  fstat += b ;
+  return fstat;
+}
+
+int main (int ac, char *av[])
+{
+  int a = test_code(1);
+  
+  if ( a != 2 )
+    {
+      printf ("a=%d\n", a) ;
+      abort ();
+    }
+  
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/torture/tls/tls.exp
===================================================================
--- gcc/testsuite/gcc.dg/torture/tls/tls.exp	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/tls/tls.exp	(revision 0)
@@ -0,0 +1,36 @@
+#   Copyright (C) 2010 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# 
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
+        $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
Index: gcc/testsuite/gcc.dg/torture/tls/thr-init-2.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/tls/thr-init-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/tls/thr-init-2.c	(revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-require-effective-target tls } */
+
+extern int printf (const char *, ...);
+extern void abort() ;
+
+static __thread int fstat ;
+static __thread int fstat = 1;
+static __thread int fstat ;
+
+int test_code(int b)
+{
+  fstat += b ;
+  return fstat;
+}
+
+int main (int ac, char *av[])
+{
+  int a = test_code(1);
+  
+  if ( a != 2 || fstat != 2 )
+    {
+    printf ("a=%d fstat=%d\n", a, fstat) ;
+    abort ();
+    }
+  
+  return 0;
+}

Comments

Jack Howarth July 12, 2010, 2:47 p.m. UTC | #1
On Fri, Jul 09, 2010 at 01:10:16PM +0100, IainS wrote:
>
> On 8 Jul 2010, at 20:19, Richard Henderson wrote:
>
>> On 07/08/2010 12:07 PM, IainS wrote:
>>>
>>> On 8 Jul 2010, at 00:22, Richard Henderson wrote:
>>>> Do you really need this kind of thing anymore, since you're exposing
>>>> the use of the control variable so early?  I would have thought that
>>>> varpool.c would no longer need any special-casing for !have_tls.
>>>
>>> This is my understanding (which might be flawed and/or incomplete).
>>>
>>> Whilst we are on the parse side, building varpool and cgraph, the
>>> relationship is not fully exposed (that happens when gimplification is
>>> done).
>>>
>>> So my reasoning was that the "ghost/proxy" vars should be made to  
>>> track
>>> the user-land ones until then.
>>
>> Hmm.  So what you're saying is that there's extra cleanup work that
>> needs to happen while lowering the representation.  It's not merely
>> a matter of code substitution during gimplification.
>
> Attached is an updated patch which addresses the points you raised -  
> less my response above.

Richard,
   Could Iain just commit this current version of the patch with
a TODO for converting it into another pass before gcc 4.6 is released?
           Jack

>
> IMO it is not the world's most elegant solution..
> ... perhaps in part owing to my not knowing the "Best Places" to do some
> things
> ... but I suspect also because the whole thing doesn't sit comfortably..
>
> I'm not sure where we sit in terms of applying this..
> I guess it remains an interim solution to LTO on emuTLS targets, with  
> the observation that perhaps a different approach is needed now we have 
> LTO.
>
> I've tested on i686-apple-darwin9 and cris-elf that the changes don't  
> regress.
> (in fact, my local tls testsuite is somewhat wider than the one in  
> trunk)
>
> If it is intended to apply this then, of course, I'd test on linux first 
> (should be a nil response).
>
> Known Minus:
>  it fails the regression Jakub pointed out.
>
> Known Plus:
>   it allows lto
>  it actually performs better on CSE than trunk does (test added).
>
> I'd hazard that the CSE improvement is worth more than the lost trivial 
> zero return.
>
>> In which case I wonder if it wouldn't be better to do as Honza
>> suggested and separate all of this out into a new pass_lower_emutls.
>> Perhaps to be placed just after pass_lower_vector.  That placement
>> is before pass_build_cgraph_edges, which I believe means you would
>> not have to fix up the cgraph edges for the new function calls.  All
>> you'd need to do is transform the tls variable references and fix up
>> the varpool references.
>
> My £0.02 having tangled with this for the last few weeks...
>
> 1/ a pass behind a gate of ! targetm.have_tls is a whole lot less  
> intrusive to non-emutls targets than what we have
> 2/ it's likely more efficient for emutls targets since there's less
> to-ing and fro-ing.
> 3/ probably more maintainable.
> 4/ certainly more transparent.
> 5/ (probably) loses all reference to emutls from varasm, varpool, expr, & 
> gimplify..
>
> mind you, that doesn't mean I know how to write such a pass ...  ;-)
>
> Iain.
>
> gcc:
>
> 	PR target/44132
> 	* expr.c (emutls_var_address): Remove.
> 	(expand_expr_addr_expr_1): Remove TLS emulation hook.
> 	(expand_expr_real_1): Ditto.
>
> 	* gimplify.c (emutls_var_address): Add proc.
> 	(gimplify_decl_expr): expand TLS vars.
> 	(gimplify_var_or_parm_decl): Ditto.
> 	(omp_notice_variable): Recognize TLS_MODEL_EMULATED.
>
> 	* passes.c (rest_of_decl_compilation): Substitute TLS control vars for 
> the master.
>
> 	* varasm.c (decl_needs_tls_emulation_p): New.
> 	(get_emutls_init_templ_addr): Adjust DECL_PRESERVE_P and DECL_INITIAL.
> 	(emutls_decl): Copy TREE_ADDRESSABLE, DECL_PRESERVE_P, create an
> 	initializer when one is present on the user-var.
> 	(emutls_common_1): Remove comment.
> 	(emutls_finalize_control_var): Copy TREE_USED, add an initializer for  
> size and
> 	align fields for cases with un-initialized user vars.
> 	(emutls_find_user_var_cb): New.
> 	(emutls_add_base_initializer): New.
> 	(asm_output_bss): Assert not a tls var needing emulation.
> 	(asm_output_aligned_bss): Ditto.
> 	(assemble_variable): Remove control var init. code.  Assert not a tls  
> var needing
> 	emulation.  Provide a trivial initializer for size and align fields if 
> one is not already
> 	set. (do_assemble_alias): Do not handle emutls vars here.
> 	(var_decl_for_asm): New.
> 	(finish_aliases_1): Walk the alias pairs substituting emutls controls  
> for the user
> 	counterparts.
>
> 	* varpool.c (varpool_mark_needed_node): Do not handle TLS substitution 
> here.
> 	(decide_is_variable_needed): Or here.
> 	(varpool_finalize_decl): Handle TLS substitution.
> 	Remove early calls to varpool_assemble_pending_decls().
> 	Check enqueuing of vars after all tests for need are complete.
> 	(varpool_analyze_pending_decls): Do not record references if the  
> initializer is error_mark.
>
> testsuite:
> 	
> 	PR target/44132
> 	* gcc.dg/tls/thr-init-1.c: New.
> 	* gcc.dg/tls/thr-init-2.c: New.
> 	* gcc.dg/torture/tls: New.
> 	* gcc.dg/torture/tls/tls-test.c: New.
> 	* gcc.dg/torture/tls/thr-init-1.c: New.
> 	* gcc.dg/torture/tls/tls.exp: New.
> 	* gcc.dg/torture/tls/thr-init-2.c: New.
>
> 	* lib/target-supports.exp (check_effective_target_tls_emulated): New.
> 	* gcc.dg/tls/thr-cse-1.c: New.
>
>

> Index: gcc/expr.c
> ===================================================================
> --- gcc/expr.c	(revision 161974)
> +++ gcc/expr.c	(working copy)
> @@ -6828,21 +6828,7 @@ highest_pow2_factor_for_target (const_tree target,
>  
>    return MAX (factor, talign);
>  }
> -
> -/* Return &VAR expression for emulated thread local VAR.  */
>  
> -static tree
> -emutls_var_address (tree var)
> -{
> -  tree emuvar = emutls_decl (var);
> -  tree fn = built_in_decls [BUILT_IN_EMUTLS_GET_ADDRESS];
> -  tree arg = build_fold_addr_expr_with_type (emuvar, ptr_type_node);
> -  tree arglist = build_tree_list (NULL_TREE, arg);
> -  tree call = build_function_call_expr (UNKNOWN_LOCATION, fn, arglist);
> -  return fold_convert (build_pointer_type (TREE_TYPE (var)), call);
> -}
> -
> -
>  /* Subroutine of expand_expr.  Expand the two operands of a binary
>     expression EXP0 and EXP1 placing the results in OP0 and OP1.
>     The value may be stored in TARGET if TARGET is nonzero.  The
> @@ -6946,17 +6932,6 @@ expand_expr_addr_expr_1 (tree exp, rtx target, enu
>        break;
>  
>      case VAR_DECL:
> -      /* TLS emulation hook - replace __thread VAR's &VAR with
> -	 __emutls_get_address (&_emutls.VAR).  */
> -      if (! targetm.have_tls
> -	  && TREE_CODE (exp) == VAR_DECL
> -	  && DECL_THREAD_LOCAL_P (exp))
> -	{
> -	  exp = emutls_var_address (exp);
> -	  return expand_expr (exp, target, tmode, modifier);
> -	}
> -      /* Fall through.  */
> -
>      default:
>        /* If the object is a DECL, then expand it for its rtl.  Don't bypass
>  	 expand_expr, as that can have various side effects; LABEL_DECLs for
> @@ -8394,16 +8369,6 @@ expand_expr_real_1 (tree exp, rtx target, enum mac
>  	  && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>  	layout_decl (exp, 0);
>  
> -      /* TLS emulation hook - replace __thread vars with
> -	 *__emutls_get_address (&_emutls.var).  */
> -      if (! targetm.have_tls
> -	  && TREE_CODE (exp) == VAR_DECL
> -	  && DECL_THREAD_LOCAL_P (exp))
> -	{
> -	  exp = build_fold_indirect_ref_loc (loc, emutls_var_address (exp));
> -	  return expand_expr_real_1 (exp, target, tmode, modifier, NULL);
> -	}
> -
>        /* ... fall through ...  */
>  
>      case FUNCTION_DECL:
> Index: gcc/gimplify.c
> ===================================================================
> --- gcc/gimplify.c	(revision 161974)
> +++ gcc/gimplify.c	(working copy)
> @@ -1339,7 +1339,19 @@ gimplify_vla_decl (tree decl, gimple_seq *seq_p)
>    gimplify_ctxp->save_stack = true;
>  }
>  
> +/* Return &VAR expression for emulated thread local VAR.  */
>  
> +static tree
> +emutls_var_address (tree var)
> +{
> +  tree emuvar = emutls_decl (var);
> +  tree fn = built_in_decls [BUILT_IN_EMUTLS_GET_ADDRESS];
> +  tree arg = build_fold_addr_expr_with_type (emuvar, ptr_type_node);
> +  tree arglist = build_tree_list (NULL_TREE, arg);
> +  tree call = build_function_call_expr (UNKNOWN_LOCATION, fn, arglist);
> +  return fold_convert (build_pointer_type (TREE_TYPE (var)), call);
> +}
> +
>  /* Gimplifies a DECL_EXPR node *STMT_P by making any necessary allocation
>     and initialization explicit.  */
>  
> @@ -1354,6 +1366,18 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_
>    if (TREE_TYPE (decl) == error_mark_node)
>      return GS_ERROR;
>  
> +  /* TLS emulation hook - replace __thread VAR's &VAR with
> +     __emutls_get_address (&_emutls.VAR). We then ignore the original
> +     var.  */
> +  if (! targetm.have_tls
> +      && TREE_CODE (decl) == VAR_DECL
> +      && DECL_THREAD_LOCAL_P (decl))
> +    {
> +      stmt = build_fold_indirect_ref (emutls_var_address (decl));
> +      gimplify_and_add (stmt, seq_p);
> +      return GS_ALL_DONE;
> +    }
> +
>    if ((TREE_CODE (decl) == TYPE_DECL
>         || TREE_CODE (decl) == VAR_DECL)
>        && !TYPE_SIZES_GIMPLIFIED (TREE_TYPE (decl)))
> @@ -1873,6 +1897,17 @@ gimplify_var_or_parm_decl (tree *expr_p)
>        return GS_ERROR;
>      }
>  
> +  /* TLS emulation hook - replace __thread VAR's &VAR with
> +     __emutls_get_address (&_emutls.VAR).  */
> +  if (! targetm.have_tls
> +      && TREE_CODE (decl) == VAR_DECL
> +      && DECL_THREAD_LOCAL_P (decl))
> +    {
> +      gcc_assert (!DECL_HAS_VALUE_EXPR_P (decl));
> +      *expr_p = build_fold_indirect_ref (emutls_var_address (decl));
> +      return GS_OK;
> +    }
> +
>    /* When within an OpenMP context, notice uses of variables.  */
>    if (gimplify_omp_ctxp && omp_notice_variable (gimplify_omp_ctxp, decl, true))
>      return GS_ALL_DONE;
> @@ -5553,14 +5588,15 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx,
>    /* Threadprivate variables are predetermined.  */
>    if (is_global_var (decl))
>      {
> -      if (DECL_THREAD_LOCAL_P (decl))
> +      if (DECL_TLS_MODEL (decl) != TLS_MODEL_NONE)
>  	return omp_notice_threadprivate_variable (ctx, decl, NULL_TREE);
>  
>        if (DECL_HAS_VALUE_EXPR_P (decl))
>  	{
>  	  tree value = get_base_address (DECL_VALUE_EXPR (decl));
>  
> -	  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
> +	  if (value && DECL_P (value) 
> +	      && (DECL_TLS_MODEL (value) != TLS_MODEL_NONE))
>  	    return omp_notice_threadprivate_variable (ctx, decl, value);
>  	}
>      }
> Index: gcc/varasm.c
> ===================================================================
> --- gcc/varasm.c	(revision 161974)
> +++ gcc/varasm.c	(working copy)
> @@ -204,6 +204,14 @@ static GTY (()) tree emutls_object_type;
>  # define EMUTLS_SEPARATOR	"_"
>  #endif
>  
> +static int
> +decl_needs_tls_emulation_p (tree decl)
> +{
> +  return !targetm.have_tls 
> +	  && TREE_CODE (decl) == VAR_DECL 
> +	  && DECL_THREAD_LOCAL_P (decl);
> +}
> +
>  /* Create an IDENTIFIER_NODE by prefixing PREFIX to the
>     IDENTIFIER_NODE NAME's name.  */
>  
> @@ -322,7 +330,7 @@ get_emutls_init_templ_addr (tree decl)
>    DECL_IGNORED_P (to) = 1;
>    DECL_CONTEXT (to) = DECL_CONTEXT (decl);
>    DECL_SECTION_NAME (to) = DECL_SECTION_NAME (decl);
> -  DECL_PRESERVE_P (to) = DECL_PRESERVE_P (decl);
> +  DECL_PRESERVE_P (to) = 1;
>  
>    DECL_WEAK (to) = DECL_WEAK (decl);
>    if (DECL_ONE_ONLY (decl))
> @@ -337,7 +345,6 @@ get_emutls_init_templ_addr (tree decl)
>  
>    DECL_VISIBILITY_SPECIFIED (to) = DECL_VISIBILITY_SPECIFIED (decl);
>    DECL_INITIAL (to) = DECL_INITIAL (decl);
> -  DECL_INITIAL (decl) = NULL;
>  
>    varpool_finalize_decl (to);
>    return build_fold_addr_expr (to);
> @@ -388,8 +395,7 @@ emutls_decl (tree decl)
>        DECL_TLS_MODEL (to) = TLS_MODEL_EMULATED;
>        DECL_ARTIFICIAL (to) = 1;
>        DECL_IGNORED_P (to) = 1;
> -      /* FIXME: work around PR44132.  */
> -      DECL_PRESERVE_P (to) = 1;
> +
>        TREE_READONLY (to) = 0;
>        SET_DECL_ASSEMBLER_NAME (to, DECL_NAME (to));
>        if (DECL_ONE_ONLY (decl))
> @@ -413,18 +419,42 @@ emutls_decl (tree decl)
>    TREE_STATIC (to) = TREE_STATIC (decl);
>    TREE_USED (to) = TREE_USED (decl);
>    TREE_PUBLIC (to) = TREE_PUBLIC (decl);
> +  TREE_ADDRESSABLE (to) = TREE_ADDRESSABLE (decl);
>    DECL_EXTERNAL (to) = DECL_EXTERNAL (decl);
>    DECL_COMMON (to) = DECL_COMMON (decl);
>    DECL_WEAK (to) = DECL_WEAK (decl);
>    DECL_VISIBILITY (to) = DECL_VISIBILITY (decl);
>    DECL_VISIBILITY_SPECIFIED (to) = DECL_VISIBILITY_SPECIFIED (decl);
> +  DECL_PRESERVE_P (to) = DECL_PRESERVE_P (decl);
>    
>    /* Fortran might pass this to us.  */
>    DECL_RESTRICTED_P (to) = DECL_RESTRICTED_P (decl);
> +  
> +  /* As soon as we see an initializer (and providing one is not already
> +     present) we can setup the init. template.  */
> +  if (!DECL_INITIAL (to) 
> +       && DECL_INITIAL (decl) 
> +       && DECL_INITIAL (decl) != error_mark_node 
> +       && !DECL_EXTERNAL (to) 
> +       && !DECL_COMMON (to))
> +    {
> +      DECL_INITIAL (to) = targetm.emutls.var_init
> +			  (to, decl, get_emutls_init_templ_addr (decl));
>  
> +      /* Make sure the template is marked as needed early enough.
> +	 Without this, if the variable is placed in a
> +	 section-anchored block, the template will only be marked
> +	 when it's too late.  */
> +      record_references_in_initializer (to, false);
> +    }
> +
> +  /* Say we are not interested in emitting this Var.  */
> +  TREE_ASM_WRITTEN (decl) = 1;
>    return to;
>  }
>  
> +/* Add static constructors for emutls vars, where required.  */
> +
>  static int
>  emutls_common_1 (void **loc, void *xstmts)
>  {
> @@ -439,10 +469,6 @@ emutls_common_1 (void **loc, void *xstmts)
>  
>    word_type_node = lang_hooks.types.type_for_mode (word_mode, 1);
>  
> -  /* The idea was to call get_emutls_init_templ_addr here, but if we
> -     do this and there is an initializer, -fanchor_section loses,
> -     because it would be too late to ensure the template is
> -     output.  */
>    x = null_pointer_node;
>    args = tree_cons (NULL, x, NULL);
>    x = build_int_cst (word_type_node, DECL_ALIGN_UNIT (h->base.from));
> @@ -469,8 +495,22 @@ emutls_finalize_control_var (void **loc,
>    if (h != NULL) 
>      {
>        struct varpool_node *node = varpool_node (h->to);
> +      if (TREE_USED (h->base.from)) 
> +	TREE_USED (h->to) = 1;
> +
> +      /* We must ensure that the size and align fields are initialized
> +         in control vars, even when the value is not.  */
> +      if ((!DECL_INITIAL (h->base.from)
> +            || DECL_INITIAL (h->base.from) == error_mark_node)
> +	    && !DECL_COMMON (h->base.from)
> +	    && !DECL_EXTERNAL (h->base.from))
> +	{
> +	  DECL_INITIAL (h->to) = targetm.emutls.var_init
> +			(h->to, h->base.from, null_pointer_node);
> +	}
> +
>        /* Because varpool_finalize_decl () has side-effects,
> -         only apply to un-finalized vars.  */
> +         only call it for un-finalized vars.  */
>        if (node && !node->finalized) 
>  	varpool_finalize_decl (h->to);
>      }
> @@ -500,6 +540,39 @@ emutls_finish (void)
>      }
>  }
>  
> +/* Callback to check an htab entry against a supplied control var and
> +   substitute the original user var if they match.  */
> +static int
> +emutls_find_user_var_cb (void **loc, void *uvar)
> +{
> +  struct tree_map *h = *(struct tree_map **) loc;
> +  if (h != NULL
> +      && h->to == *((tree *)uvar))
> +    {
> +       *((tree *)uvar) = h->base.from;
> +       return 0;
> +    }
> +  return 1;
> +}
> +
> +/* In the case of uninitialized global vars, we have to find our way back
> +   to the user-var in order to build a trivial initializer for the size
> +   and align fields.  */
> +
> +static void
> +emutls_add_base_initializer (tree control)
> +{
> +  tree uvar = control;
> +  if (emutls_htab == NULL)
> +    return;
> +  htab_traverse_noresize (emutls_htab, emutls_find_user_var_cb, &uvar);
> +  /* If we didn't find the user var from the control, something is broken.  */
> +  gcc_assert (uvar != control);
> +  DECL_INITIAL (control) = targetm.emutls.var_init
> +			(control, uvar, null_pointer_node);
> +}
> +
> +
>  /* Helper routines for maintaining section_htab.  */
>  
>  static int
> @@ -792,6 +865,8 @@ asm_output_bss (FILE *file, tree decl ATTRIBUTE_UN
>    gcc_assert (strcmp (XSTR (XEXP (DECL_RTL (decl), 0), 0), name) == 0);
>    targetm.asm_out.globalize_decl_name (file, decl);
>    switch_to_section (bss_section);
> +  /* We don't emit the userland vars for emulated TLS, just the control.  */
> +  gcc_assert (!decl_needs_tls_emulation_p (decl));
>  #ifdef ASM_DECLARE_OBJECT_NAME
>    last_assemble_variable_decl = decl;
>    ASM_DECLARE_OBJECT_NAME (file, name, decl);
> @@ -818,6 +893,8 @@ asm_output_aligned_bss (FILE *file, tree decl ATTR
>  {
>    switch_to_section (bss_section);
>    ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT));
> +  /* We don't emit the userland vars for emulated TLS, just the control.  */
> +  gcc_assert (!decl_needs_tls_emulation_p (decl));
>  #ifdef ASM_DECLARE_OBJECT_NAME
>    last_assemble_variable_decl = decl;
>    ASM_DECLARE_OBJECT_NAME (file, name, decl);
> @@ -1233,7 +1310,10 @@ get_variable_section (tree decl, bool prefer_noswi
>    if (IN_NAMED_SECTION (decl))
>      return get_named_section (decl, NULL, reloc);
>  
> -  if (ADDR_SPACE_GENERIC_P (as)
> +  /* This should not be bss for an emulated TLS object.  */
> +  if (DECL_TLS_MODEL (decl) == TLS_MODEL_EMULATED)
> +    ;
> +  else if (ADDR_SPACE_GENERIC_P (as)
>        && !DECL_THREAD_LOCAL_P (decl)
>        && !(prefer_noswitch_p && targetm.have_switchable_bss_sections)
>        && bss_initializer_p (decl))
> @@ -2153,35 +2233,6 @@ assemble_variable (tree decl, int top_level ATTRIB
>    rtx decl_rtl, symbol;
>    section *sect;
>  
> -  if (! targetm.have_tls
> -      && TREE_CODE (decl) == VAR_DECL
> -      && DECL_THREAD_LOCAL_P (decl))
> -    {
> -      tree to = emutls_decl (decl);
> -
> -      /* If this variable is defined locally, then we need to initialize the
> -         control structure with size and alignment information.  We do this
> -	 at the last moment because tentative definitions can take a locally
> -	 defined but uninitialized variable and initialize it later, which
> -	 would result in incorrect contents.  */
> -      if (! DECL_EXTERNAL (to)
> -	  && (! DECL_COMMON (to)
> -	      || (DECL_INITIAL (decl)
> -		  && DECL_INITIAL (decl) != error_mark_node)))
> -	{
> -	  DECL_INITIAL (to) = targetm.emutls.var_init
> -	    (to, decl, get_emutls_init_templ_addr (decl));
> -
> -	  /* Make sure the template is marked as needed early enough.
> -	     Without this, if the variable is placed in a
> -	     section-anchored block, the template will only be marked
> -	     when it's too late.  */
> -	  record_references_in_initializer (to, false);
> -	}
> -
> -      decl = to;
> -    }
> -
>    last_assemble_variable_decl = 0;
>  
>    /* Normally no need to say anything here for external references,
> @@ -2238,6 +2289,19 @@ assemble_variable (tree decl, int top_level ATTRIB
>    if (flag_syntax_only)
>      return;
>  
> +  /* We don't emit the userland vars for emulated TLS - they should never
> +     get to here, only the control vars should be emitted.  */
> +  gcc_assert (! decl_needs_tls_emulation_p (decl));
> +  
> +  /* However, for the emutls control vars we must ensure that the size and
> +     align fields are initialized, even if the value is not.  */
> +  if (!targetm.have_tls 
> +      && TREE_CODE (decl) == VAR_DECL
> +      && DECL_TLS_MODEL (decl) == TLS_MODEL_EMULATED
> +      && !DECL_INITIAL (decl)
> +      && !DECL_COMMON (decl))
> +    emutls_add_base_initializer (decl);
> +
>    if (! dont_output_data
>        && ! host_integerp (DECL_SIZE_UNIT (decl), 1))
>      {
> @@ -5701,18 +5765,12 @@ do_assemble_alias (tree decl, tree target)
>    TREE_ASM_WRITTEN (decl) = 1;
>    TREE_ASM_WRITTEN (DECL_ASSEMBLER_NAME (decl)) = 1;
>  
> +  gcc_assert (! decl_needs_tls_emulation_p (decl));
> +
>    if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
>      {
>        ultimate_transparent_alias_target (&target);
>  
> -      if (!targetm.have_tls
> -	  && TREE_CODE (decl) == VAR_DECL
> -	  && DECL_THREAD_LOCAL_P (decl))
> -	{
> -	  decl = emutls_decl (decl);
> -	  target = get_emutls_object_name (target);
> -	}
> -
>        if (!TREE_SYMBOL_REFERENCED (target))
>  	weakref_targets = tree_cons (decl, target, weakref_targets);
>  
> @@ -5731,14 +5789,6 @@ do_assemble_alias (tree decl, tree target)
>        return;
>      }
>  
> -  if (!targetm.have_tls
> -      && TREE_CODE (decl) == VAR_DECL
> -      && DECL_THREAD_LOCAL_P (decl))
> -    {
> -      decl = emutls_decl (decl);
> -      target = get_emutls_object_name (target);
> -    }
> -
>  #ifdef ASM_OUTPUT_DEF
>    /* Make name accessible from other files, if appropriate.  */
>  
> @@ -5820,6 +5870,15 @@ remove_unreachable_alias_pairs (void)
>      }
>  }
>  
> +/* Look up the decl for a symbol in the varpool.  */
> +static tree
> +var_decl_for_asm (tree symbol)
> +{
> +  struct varpool_node *vnode = varpool_node_for_asm  (symbol);
> +  if (vnode) 
> +    return vnode->decl;
> +  return NULL;
> +}
>  
>  /* First pass of completing pending aliases.  Make sure that cgraph knows
>     which symbols will be required.  */
> @@ -5832,8 +5891,55 @@ finish_aliases_1 (void)
>  
>    for (i = 0; VEC_iterate (alias_pair, alias_pairs, i, p); i++)
>      {
> -      tree target_decl;
> +      tree target_decl=NULL;
>  
> +      /* When emulated TLS is in effect, redirect aliases so that they 
> +         are registered between the control vars.  */
> +      if (!targetm.have_tls 
> +          && TREE_CODE (p->decl) == VAR_DECL
> +          && DECL_TLS_MODEL (p->decl) != TLS_MODEL_NONE)
> +	{
> +	  tree tsym = p->target ;
> +	  target_decl = var_decl_for_asm (tsym) ;
> +	  if (!target_decl) 
> +	    {
> +	      /* If we didn't find the user's symbol, it could
> +	         be because the alias really refers to a control 
> +	         var.  */
> +	      tsym = get_emutls_object_name (p->target);
> +	      target_decl = var_decl_for_asm (tsym);
> +	    }
> +	  if (target_decl) 
> +	    {
> +	      struct varpool_node *vnode;
> +	      /* If it hasn't been done already, substitute the control
> +	         var for the original.  */
> +	      if (DECL_THREAD_LOCAL_P (p->decl))
> +		p->decl = emutls_decl (p->decl);
> +	      /* If not TLS target, we've made a mistake.  */
> +	      if (DECL_TLS_MODEL (target_decl) < TLS_MODEL_EMULATED)
> +		error ("TLS symbol %q+D aliased to non-TLS symbol %qE",
> +			p->decl, p->target);
> +	      /* If it's the original we need to substitute the control.  */
> +	      else if (DECL_THREAD_LOCAL_P (target_decl))
> +		{
> +		  target_decl = emutls_decl (target_decl);
> +		  tsym = get_emutls_object_name (p->target);
> +		}
> +	      /* else it's already the emulation control.  */
> +	      /* Mark the var needed.  */
> +	      vnode = varpool_node (target_decl);
> +	      if (vnode) 
> +	        {
> +		  varpool_mark_needed_node (vnode);
> +		  vnode->force_output = 1;
> +	        }
> +	      p->target = tsym;
> +	    }
> +	  /* Else we didn't find a decl for the symbol, which is an error
> +	     unless there's a weak ref.  */
> +	} 
> +      else
>        target_decl = find_decl_and_mark_needed (p->decl, p->target);
>        if (target_decl == NULL)
>  	{
> Index: gcc/passes.c
> ===================================================================
> --- gcc/passes.c	(revision 161974)
> +++ gcc/passes.c	(working copy)
> @@ -152,6 +152,16 @@ rest_of_decl_compilation (tree decl,
>  			  int top_level,
>  			  int at_end)
>  {
> +  if (! targetm.have_tls 
> +	&& !in_lto_p 
> +	&& TREE_CODE (decl) == VAR_DECL
> +	&& DECL_THREAD_LOCAL_P (decl))
> +    {
> +      /* Substitute the control var. for the user one.  */
> +      rest_of_decl_compilation (emutls_decl (decl), top_level, at_end);
> +      return;
> +    }
> +
>    /* We deferred calling assemble_alias so that we could collect
>       other attributes such as visibility.  Emit the alias now.  */
>    {
> Index: gcc/varpool.c
> ===================================================================
> --- gcc/varpool.c	(revision 161974)
> +++ gcc/varpool.c	(working copy)
> @@ -312,6 +312,14 @@ varpool_mark_needed_node (struct varpool_node *nod
>        && !TREE_ASM_WRITTEN (node->decl))
>      varpool_enqueue_needed_node (node);
>    node->needed = 1;
> +  /* If we need the var, and it's an emulated TLS entity, that
> +     means we need the control var.  */
> +  if (!targetm.have_tls && DECL_THREAD_LOCAL_P (node->decl))
> +    {
> +      struct varpool_node *cv_node;
> +      cv_node = varpool_node (emutls_decl (node->decl)) ;
> +      varpool_mark_needed_node (cv_node);
> +    }
>  }
>  
>  /* Reset the queue of needed nodes.  */
> @@ -346,17 +354,6 @@ decide_is_variable_needed (struct varpool_node *no
>        && !DECL_EXTERNAL (decl))
>      return true;
>  
> -  /* When emulating tls, we actually see references to the control
> -     variable, rather than the user-level variable.  */
> -  if (!targetm.have_tls
> -      && TREE_CODE (decl) == VAR_DECL
> -      && DECL_THREAD_LOCAL_P (decl))
> -    {
> -      tree control = emutls_decl (decl);
> -      if (decide_is_variable_needed (varpool_node (control), control))
> -	return true;
> -    }
> -
>    /* When not reordering top level variables, we have to assume that
>       we are going to keep everything.  */
>    if (flag_toplevel_reorder)
> @@ -381,14 +378,31 @@ varpool_finalize_decl (tree decl)
>       or local (in C, has internal linkage).  So do nothing more
>       if this function has already run.  */
>    if (node->finalized)
> +      return;
> +
> +  /* For emulated TLS vars, if we are in a position to finalize the userland
> +     var, then we should be able to finalize the control var too.  */
> +  if (!targetm.have_tls 
> +      && TREE_CODE (decl) == VAR_DECL
> +      && DECL_THREAD_LOCAL_P (decl))
>      {
> -      if (cgraph_global_info_ready)
> -	varpool_assemble_pending_decls ();
> +      tree control = emutls_decl (decl);
> +      /* If we didn't create an initializer in the preceding line,  then we 
> +         must now add a minimal one that sets the size and align fields.  */
> +      if (!DECL_INITIAL (control) 
> +	   && !DECL_COMMON (control)
> +	   && !DECL_EXTERNAL (control))
> +	{
> +	  DECL_INITIAL (control) = targetm.emutls.var_init
> +			(control, decl, null_pointer_node);
> +	  record_references_in_initializer (control, false);
> +	}
> +
> +      varpool_finalize_decl (control) ;
> +      node->finalized = true;    
>        return;
>      }
> -  if (node->needed)
> -    varpool_enqueue_needed_node (node);
> -  node->finalized = true;
> +
>    if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl))
>      node->force_output = true;
>  
> @@ -399,8 +413,11 @@ varpool_finalize_decl (tree decl)
>       there.  */
>    else if (TREE_PUBLIC (decl) && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
>      varpool_mark_needed_node (node);
> -  if (cgraph_global_info_ready)
> -    varpool_assemble_pending_decls ();
> +
> +  if (node->needed)
> +    varpool_enqueue_needed_node (node);
> +
> +  node->finalized = true;
>  }
>  
>  /* Return variable availability.  See cgraph.h for description of individual
> @@ -449,7 +466,7 @@ varpool_analyze_pending_decls (void)
>  	     already informed about increased alignment.  */
>            align_variable (decl, 0);
>  	}
> -      if (DECL_INITIAL (decl))
> +      if (DECL_INITIAL (decl) && (DECL_INITIAL (decl) != error_mark_node))
>  	record_references_in_initializer (decl, analyzed);
>        if (node->same_comdat_group)
>  	{

>
>

> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> --- gcc/testsuite/lib/target-supports.exp	(revision 161974)
> +++ gcc/testsuite/lib/target-supports.exp	(working copy)
> @@ -596,6 +596,23 @@ proc check_effective_target_tls_native {} {
>      }]
>  }
>  
> +# Return 1 if *emulated* thread local storage (TLS) is supported, 0 otherwise.
> +
> +proc check_effective_target_tls_emulated {} {
> +    # VxWorks uses emulated TLS machinery, but with non-standard helper
> +    # functions, so we fail to automatically detect it.
> +    global target_triplet
> +    if { [regexp ".*-.*-vxworks.*" $target_triplet] } {
> +	return 1
> +    }
> +    
> +    return [check_no_messages_and_pattern tls_emulated "emutls" assembly {
> +	__thread int i;
> +	int f (void) { return i; }
> +	void g (int j) { i = j; }
> +    }]
> +}
> +
>  # Return 1 if TLS executables can run correctly, 0 otherwise.
>  
>  proc check_effective_target_tls_runtime {} {
> Index: gcc/testsuite/gcc.dg/tls/thr-cse-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tls/thr-cse-1.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/tls/thr-cse-1.c	(revision 0)
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1" } */
> +/* { dg-require-effective-target tls_emulated } */
> +
> +/* Test that we only get one call to emutls_get_address when CSE is
> +   active.  Note that the var _must_ be initialized for the scan asm
> +   to work, since otherwise there will be an initializer which will,
> +   correctly, call emutls_get_address.  */
> +int foo (int b, int c, int d)
> +{
> +  static __thread int a=1;
> +  a += b;
> +  a -= c;
> +  a += d;
> +  return a;
> +}
> +
> +/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_address.*" { target { ! *-wrs-vxworks } } } } */
> +/* { dg-final { scan-assembler-not "tls_lookup.*tls_lookup.*" { target *-wrs-vxworks } } } */
> +
> Index: gcc/testsuite/gcc.dg/tls/thr-init-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tls/thr-init-1.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/tls/thr-init-1.c	(revision 0)
> @@ -0,0 +1,8 @@
> +/* { dg-require-effective-target tls } */
> +/* { dg-do compile } */
> +
> +static __thread int fstat ;
> +static __thread int fstat = 1 ;
> +static __thread int fstat ;
> +static __thread int fstat = 2; /* { dg-error "redefinition of 'fstat'" } */
> +				/* { dg-message "note: previous definition of 'fstat' was here" "" { target *-*-* } 5 } */
> Index: gcc/testsuite/gcc.dg/tls/thr-init-2.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tls/thr-init-2.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/tls/thr-init-2.c	(revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-require-effective-target tls } */
> +/* { dg-do run } */
> +
> +extern void abort() ;
> +
> +static __thread int fstat ;
> +static __thread int fstat = 1;
> +
> +int test_code(int b)
> +{
> +  fstat += b ;
> +  return fstat;
> +}
> +
> +int main (int ac, char *av[])
> +{
> +  int a = test_code(1);
> +  
> +  if ((a != 2) || (fstat != 2))
> +    abort () ;
> +  
> +  return 0;
> +}
> Index: gcc/testsuite/gcc.dg/torture/tls/tls-test.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/torture/tls/tls-test.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/torture/tls/tls-test.c	(revision 0)
> @@ -0,0 +1,52 @@
> +/* { dg-do run }  */
> +/* { dg-require-effective-target tls  }  */
> +/* { dg-require-effective-target pthread } */
> +/* { dg-options "-pthread" } */
> +
> +#include <pthread.h>
> +extern int printf (char *,...);
> +__thread int a = 5; 
> +int *volatile a_in_other_thread = (int *) 12345;
> +
> +static void *
> +thread_func (void *arg)
> +{
> +  a_in_other_thread = &a;
> +  a+=5;
> +  *((int *) arg) = a;
> +  return (void *)0;
> +}
> +
> +int
> +main ()
> +{
> +  pthread_t thread;
> +  void *thread_retval;
> +  int *volatile a_in_main_thread;
> +  int *volatile again ;
> +  int thr_a;
> +
> +  a_in_main_thread = &a;
> +
> +  if (pthread_create (&thread, (pthread_attr_t *)0, thread_func, &thr_a))
> +    return 0;
> +
> +  if (pthread_join (thread, &thread_retval))
> +    return 0;
> +
> +  again = &a;
> +  if (again != a_in_main_thread)
> +    {
> +      printf ("FAIL: main thread addy changed from 0x%0x to 0x%0x\n", 
> +		a_in_other_thread, again);
> +      return 1;
> +    }
> +
> +  if (a != 5 || thr_a != 10 || (a_in_other_thread == a_in_main_thread))
> +    {
> +      printf ("FAIL: a= %d, thr_a = %d Addr = 0x%0x\n", 
> +		a, thr_a, a_in_other_thread);
> +      return 1;
> +    }
> +  return 0;
> +}
> Index: gcc/testsuite/gcc.dg/torture/tls/thr-init-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/torture/tls/thr-init-1.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/torture/tls/thr-init-1.c	(revision 0)
> @@ -0,0 +1,25 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target tls } */
> +
> +extern int printf (char *,...);
> +extern void abort() ;
> +
> +int test_code(int b)
> +{
> +static __thread int fstat = 1;
> +  fstat += b ;
> +  return fstat;
> +}
> +
> +int main (int ac, char *av[])
> +{
> +  int a = test_code(1);
> +  
> +  if ( a != 2 )
> +    {
> +      printf ("a=%d\n", a) ;
> +      abort ();
> +    }
> +  
> +  return 0;
> +}
> Index: gcc/testsuite/gcc.dg/torture/tls/tls.exp
> ===================================================================
> --- gcc/testsuite/gcc.dg/torture/tls/tls.exp	(revision 0)
> +++ gcc/testsuite/gcc.dg/torture/tls/tls.exp	(revision 0)
> @@ -0,0 +1,36 @@
> +#   Copyright (C) 2010 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +# 
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +# 
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# GCC testsuite that uses the `dg.exp' driver.
> +
> +# Load support procs.
> +load_lib gcc-dg.exp
> +
> +# If a testcase doesn't have special options, use these.
> +global DEFAULT_CFLAGS
> +if ![info exists DEFAULT_CFLAGS] then {
> +    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
> +}
> +
> +# Initialize `dg'.
> +dg-init
> +
> +# Main loop.
> +gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
> +        $DEFAULT_CFLAGS
> +
> +# All done.
> +dg-finish
> Index: gcc/testsuite/gcc.dg/torture/tls/thr-init-2.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/torture/tls/thr-init-2.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/torture/tls/thr-init-2.c	(revision 0)
> @@ -0,0 +1,28 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target tls } */
> +
> +extern int printf (char *,...);
> +extern void abort() ;
> +
> +static __thread int fstat ;
> +static __thread int fstat = 1;
> +static __thread int fstat ;
> +
> +int test_code(int b)
> +{
> +  fstat += b ;
> +  return fstat;
> +}
> +
> +int main (int ac, char *av[])
> +{
> +  int a = test_code(1);
> +  
> +  if ( a != 2 || fstat != 2 )
> +    {
> +    printf ("a=%d fstat=%d\n", a, fstat) ;
> +    abort ();
> +    }
> +  
> +  return 0;
> +}

>
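
For readers following the thr-cse-1.c test above, the following is a minimal, single-threaded sketch of what the lowering amounts to: each access to the user variable goes through a lookup on its control variable, and CSE can collapse repeated lookups into one. It is only an illustration under stated assumptions, not the libgcc runtime and not code from the patch. Every `toy_*` name is hypothetical, and the per-thread part of the lookup is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an emutls control variable.  The size and
   align fields are always set; templ is NULL when the user variable has
   no initializer.  */
struct toy_control
{
  size_t size;
  size_t align;
  void *storage;	/* Toy only: a single slot, no per-thread lookup.  */
  const void *templ;
};

/* Counts lookups so the effect of CSE can be observed.  */
int toy_lookup_calls;

/* Toy counterpart of the runtime lookup: allocate on first use, copy
   the template if there is one, otherwise zero-fill.  */
void *
toy_get_address (struct toy_control *ctl)
{
  toy_lookup_calls++;
  if (ctl->storage == NULL)
    {
      ctl->storage = malloc (ctl->size);
      assert (ctl->storage != NULL);
      if (ctl->templ)
	memcpy (ctl->storage, ctl->templ, ctl->size);
      else
	memset (ctl->storage, 0, ctl->size);
    }
  return ctl->storage;
}

/* Models "static __thread int a = 1;": a template plus a control var.  */
const int toy_a_templ = 1;
struct toy_control toy_a_ctl
  = { sizeof (int), __alignof__ (int), NULL, &toy_a_templ };

/* Models "static __thread int b;": size and align only, no template.  */
struct toy_control toy_b_ctl
  = { sizeof (int), __alignof__ (int), NULL, NULL };

/* Naive lowering of foo from thr-cse-1.c: one lookup per access.  */
int
toy_foo_naive (int b, int c, int d)
{
  *(int *) toy_get_address (&toy_a_ctl) += b;
  *(int *) toy_get_address (&toy_a_ctl) -= c;
  *(int *) toy_get_address (&toy_a_ctl) += d;
  return *(int *) toy_get_address (&toy_a_ctl);
}

/* The shape the test hopes for once CSE has run: one lookup, reused.  */
int
toy_foo_cse (int b, int c, int d)
{
  int *a = (int *) toy_get_address (&toy_a_ctl);
  *a += b;
  *a -= c;
  *a += d;
  return *a;
}
```

In this toy the lookup has a visible side effect (the counter), so a compiler will not merge the calls by itself; the two functions are written out by hand to show the before and after shapes that the scan-assembler-not directives distinguish.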

Patch

Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 161974)
+++ gcc/expr.c	(working copy)
@@ -6828,21 +6828,7 @@  highest_pow2_factor_for_target (const_tree target,
 
   return MAX (factor, talign);
 }
-
-/* Return &VAR expression for emulated thread local VAR.  */
 
-static tree
-emutls_var_address (tree var)
-{
-  tree emuvar = emutls_decl (var);
-  tree fn = built_in_decls [BUILT_IN_EMUTLS_GET_ADDRESS];
-  tree arg = build_fold_addr_expr_with_type (emuvar, ptr_type_node);
-  tree arglist = build_tree_list (NULL_TREE, arg);
-  tree call = build_function_call_expr (UNKNOWN_LOCATION, fn, arglist);
-  return fold_convert (build_pointer_type (TREE_TYPE (var)), call);
-}
-
-
 /* Subroutine of expand_expr.  Expand the two operands of a binary
    expression EXP0 and EXP1 placing the results in OP0 and OP1.
    The value may be stored in TARGET if TARGET is nonzero.  The
@@ -6946,17 +6932,6 @@  expand_expr_addr_expr_1 (tree exp, rtx target, enu
       break;
 
     case VAR_DECL:
-      /* TLS emulation hook - replace __thread VAR's &VAR with
-	 __emutls_get_address (&_emutls.VAR).  */
-      if (! targetm.have_tls
-	  && TREE_CODE (exp) == VAR_DECL
-	  && DECL_THREAD_LOCAL_P (exp))
-	{
-	  exp = emutls_var_address (exp);
-	  return expand_expr (exp, target, tmode, modifier);
-	}
-      /* Fall through.  */
-
     default:
       /* If the object is a DECL, then expand it for its rtl.  Don't bypass
 	 expand_expr, as that can have various side effects; LABEL_DECLs for
@@ -8394,16 +8369,6 @@  expand_expr_real_1 (tree exp, rtx target, enum mac
 	  && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
 	layout_decl (exp, 0);
 
-      /* TLS emulation hook - replace __thread vars with
-	 *__emutls_get_address (&_emutls.var).  */
-      if (! targetm.have_tls
-	  && TREE_CODE (exp) == VAR_DECL
-	  && DECL_THREAD_LOCAL_P (exp))
-	{
-	  exp = build_fold_indirect_ref_loc (loc, emutls_var_address (exp));
-	  return expand_expr_real_1 (exp, target, tmode, modifier, NULL);
-	}
-
       /* ... fall through ...  */
 
     case FUNCTION_DECL:
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	(revision 161974)
+++ gcc/gimplify.c	(working copy)
@@ -1339,7 +1339,19 @@  gimplify_vla_decl (tree decl, gimple_seq *seq_p)
   gimplify_ctxp->save_stack = true;
 }
 
+/* Return &VAR expression for emulated thread local VAR.  */
 
+static tree
+emutls_var_address (tree var)
+{
+  tree emuvar = emutls_decl (var);
+  tree fn = built_in_decls [BUILT_IN_EMUTLS_GET_ADDRESS];
+  tree arg = build_fold_addr_expr_with_type (emuvar, ptr_type_node);
+  tree arglist = build_tree_list (NULL_TREE, arg);
+  tree call = build_function_call_expr (UNKNOWN_LOCATION, fn, arglist);
+  return fold_convert (build_pointer_type (TREE_TYPE (var)), call);
+}
+
 /* Gimplifies a DECL_EXPR node *STMT_P by making any necessary allocation
    and initialization explicit.  */
 
@@ -1354,6 +1366,18 @@  gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_
   if (TREE_TYPE (decl) == error_mark_node)
     return GS_ERROR;
 
+  /* TLS emulation hook - replace __thread VAR with
+     *__emutls_get_address (&_emutls.VAR).  We then ignore the original
+     var.  */
+  if (! targetm.have_tls
+      && TREE_CODE (decl) == VAR_DECL
+      && DECL_THREAD_LOCAL_P (decl))
+    {
+      stmt = build_fold_indirect_ref (emutls_var_address (decl));
+      gimplify_and_add (stmt, seq_p);
+      return GS_ALL_DONE;
+    }
+
   if ((TREE_CODE (decl) == TYPE_DECL
        || TREE_CODE (decl) == VAR_DECL)
       && !TYPE_SIZES_GIMPLIFIED (TREE_TYPE (decl)))
@@ -1873,6 +1897,17 @@  gimplify_var_or_parm_decl (tree *expr_p)
       return GS_ERROR;
     }
 
+  /* TLS emulation hook - replace __thread VAR with
+     *__emutls_get_address (&_emutls.VAR).  */
+  if (! targetm.have_tls
+      && TREE_CODE (decl) == VAR_DECL
+      && DECL_THREAD_LOCAL_P (decl))
+    {
+      gcc_assert (!DECL_HAS_VALUE_EXPR_P (decl));
+      *expr_p = build_fold_indirect_ref (emutls_var_address (decl));
+      return GS_OK;
+    }
+
   /* When within an OpenMP context, notice uses of variables.  */
   if (gimplify_omp_ctxp && omp_notice_variable (gimplify_omp_ctxp, decl, true))
     return GS_ALL_DONE;
@@ -5553,14 +5588,15 @@  omp_notice_variable (struct gimplify_omp_ctx *ctx,
   /* Threadprivate variables are predetermined.  */
   if (is_global_var (decl))
     {
-      if (DECL_THREAD_LOCAL_P (decl))
+      if (DECL_TLS_MODEL (decl) != TLS_MODEL_NONE)
 	return omp_notice_threadprivate_variable (ctx, decl, NULL_TREE);
 
       if (DECL_HAS_VALUE_EXPR_P (decl))
 	{
 	  tree value = get_base_address (DECL_VALUE_EXPR (decl));
 
-	  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
+	  if (value && DECL_P (value) 
+	      && (DECL_TLS_MODEL (value) != TLS_MODEL_NONE))
 	    return omp_notice_threadprivate_variable (ctx, decl, value);
 	}
     }
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 161974)
+++ gcc/varasm.c	(working copy)
@@ -204,6 +204,14 @@  static GTY (()) tree emutls_object_type;
 # define EMUTLS_SEPARATOR	"_"
 #endif
 
+static int
+decl_needs_tls_emulation_p (tree decl)
+{
+  return !targetm.have_tls 
+	  && TREE_CODE (decl) == VAR_DECL 
+	  && DECL_THREAD_LOCAL_P (decl);
+}
+
 /* Create an IDENTIFIER_NODE by prefixing PREFIX to the
    IDENTIFIER_NODE NAME's name.  */
 
@@ -322,7 +330,7 @@  get_emutls_init_templ_addr (tree decl)
   DECL_IGNORED_P (to) = 1;
   DECL_CONTEXT (to) = DECL_CONTEXT (decl);
   DECL_SECTION_NAME (to) = DECL_SECTION_NAME (decl);
-  DECL_PRESERVE_P (to) = DECL_PRESERVE_P (decl);
+  DECL_PRESERVE_P (to) = 1;
 
   DECL_WEAK (to) = DECL_WEAK (decl);
   if (DECL_ONE_ONLY (decl))
@@ -337,7 +345,6 @@  get_emutls_init_templ_addr (tree decl)
 
   DECL_VISIBILITY_SPECIFIED (to) = DECL_VISIBILITY_SPECIFIED (decl);
   DECL_INITIAL (to) = DECL_INITIAL (decl);
-  DECL_INITIAL (decl) = NULL;
 
   varpool_finalize_decl (to);
   return build_fold_addr_expr (to);
@@ -388,8 +395,7 @@  emutls_decl (tree decl)
       DECL_TLS_MODEL (to) = TLS_MODEL_EMULATED;
       DECL_ARTIFICIAL (to) = 1;
       DECL_IGNORED_P (to) = 1;
-      /* FIXME: work around PR44132.  */
-      DECL_PRESERVE_P (to) = 1;
+
       TREE_READONLY (to) = 0;
       SET_DECL_ASSEMBLER_NAME (to, DECL_NAME (to));
       if (DECL_ONE_ONLY (decl))
@@ -413,18 +419,42 @@  emutls_decl (tree decl)
   TREE_STATIC (to) = TREE_STATIC (decl);
   TREE_USED (to) = TREE_USED (decl);
   TREE_PUBLIC (to) = TREE_PUBLIC (decl);
+  TREE_ADDRESSABLE (to) = TREE_ADDRESSABLE (decl);
   DECL_EXTERNAL (to) = DECL_EXTERNAL (decl);
   DECL_COMMON (to) = DECL_COMMON (decl);
   DECL_WEAK (to) = DECL_WEAK (decl);
   DECL_VISIBILITY (to) = DECL_VISIBILITY (decl);
   DECL_VISIBILITY_SPECIFIED (to) = DECL_VISIBILITY_SPECIFIED (decl);
+  DECL_PRESERVE_P (to) = DECL_PRESERVE_P (decl);
   
   /* Fortran might pass this to us.  */
   DECL_RESTRICTED_P (to) = DECL_RESTRICTED_P (decl);
+  
+  /* As soon as we see an initializer (and providing one is not already
+     present) we can setup the init. template.  */
+  if (!DECL_INITIAL (to) 
+       && DECL_INITIAL (decl) 
+       && DECL_INITIAL (decl) != error_mark_node 
+       && !DECL_EXTERNAL (to) 
+       && !DECL_COMMON (to))
+    {
+      DECL_INITIAL (to) = targetm.emutls.var_init
+			  (to, decl, get_emutls_init_templ_addr (decl));
 
+      /* Make sure the template is marked as needed early enough.
+	 Without this, if the variable is placed in a
+	 section-anchored block, the template will only be marked
+	 when it's too late.  */
+      record_references_in_initializer (to, false);
+    }
+
+  /* Say we are not interested in emitting this Var.  */
+  TREE_ASM_WRITTEN (decl) = 1;
   return to;
 }
 
+/* Add static constructors for emutls vars, where required.  */
+
 static int
 emutls_common_1 (void **loc, void *xstmts)
 {
@@ -439,10 +469,6 @@  emutls_common_1 (void **loc, void *xstmts)
 
   word_type_node = lang_hooks.types.type_for_mode (word_mode, 1);
 
-  /* The idea was to call get_emutls_init_templ_addr here, but if we
-     do this and there is an initializer, -fanchor_section loses,
-     because it would be too late to ensure the template is
-     output.  */
   x = null_pointer_node;
   args = tree_cons (NULL, x, NULL);
   x = build_int_cst (word_type_node, DECL_ALIGN_UNIT (h->base.from));
@@ -469,8 +495,22 @@  emutls_finalize_control_var (void **loc,
   if (h != NULL) 
     {
       struct varpool_node *node = varpool_node (h->to);
+      if (TREE_USED (h->base.from)) 
+	TREE_USED (h->to) = 1;
+
+      /* We must ensure that the size and align fields are initialized
+         in control vars, even when the value is not.  */
+      if ((!DECL_INITIAL (h->base.from)
+            || DECL_INITIAL (h->base.from) == error_mark_node)
+	    && !DECL_COMMON (h->base.from)
+	    && !DECL_EXTERNAL (h->base.from))
+	{
+	  DECL_INITIAL (h->to) = targetm.emutls.var_init
+			(h->to, h->base.from, null_pointer_node);
+	}
+
       /* Because varpool_finalize_decl () has side-effects,
-         only apply to un-finalized vars.  */
+         only call it for un-finalized vars.  */
       if (node && !node->finalized) 
 	varpool_finalize_decl (h->to);
     }
@@ -500,6 +540,39 @@  emutls_finish (void)
     }
 }
 
+/* Callback to check an htab entry against a supplied control var and
+   substitute the original user var if they match.  */
+static int
+emutls_find_user_var_cb (void **loc, void *uvar)
+{
+  struct tree_map *h = *(struct tree_map **) loc;
+  if (h != NULL
+      && h->to == *((tree *)uvar))
+    {
+       *((tree *)uvar) = h->base.from;
+       return 0;
+    }
+  return 1;
+}
+
+/* In the case of uninitialized global vars, we have to find our way back
+   to the user-var in order to build a trivial initializer for the size
+   and align fields.  */
+
+static void
+emutls_add_base_initializer (tree control)
+{
+  tree uvar = control;
+  if (emutls_htab == NULL)
+    return;
+  htab_traverse_noresize (emutls_htab, emutls_find_user_var_cb, &uvar);
+  /* If we didn't find the user var from the control, something is broken.  */
+  gcc_assert (uvar != control);
+  DECL_INITIAL (control) = targetm.emutls.var_init
+			(control, uvar, null_pointer_node);
+}
+
+
 /* Helper routines for maintaining section_htab.  */
 
 static int
@@ -792,6 +865,8 @@  asm_output_bss (FILE *file, tree decl ATTRIBUTE_UN
   gcc_assert (strcmp (XSTR (XEXP (DECL_RTL (decl), 0), 0), name) == 0);
   targetm.asm_out.globalize_decl_name (file, decl);
   switch_to_section (bss_section);
+  /* We don't emit the userland vars for emulated TLS, just the control.  */
+  gcc_assert (!decl_needs_tls_emulation_p (decl));
 #ifdef ASM_DECLARE_OBJECT_NAME
   last_assemble_variable_decl = decl;
   ASM_DECLARE_OBJECT_NAME (file, name, decl);
@@ -818,6 +893,8 @@  asm_output_aligned_bss (FILE *file, tree decl ATTR
 {
   switch_to_section (bss_section);
   ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT));
+  /* We don't emit the userland vars for emulated TLS, just the control.  */
+  gcc_assert (!decl_needs_tls_emulation_p (decl));
 #ifdef ASM_DECLARE_OBJECT_NAME
   last_assemble_variable_decl = decl;
   ASM_DECLARE_OBJECT_NAME (file, name, decl);
@@ -1233,7 +1310,10 @@  get_variable_section (tree decl, bool prefer_noswi
   if (IN_NAMED_SECTION (decl))
     return get_named_section (decl, NULL, reloc);
 
-  if (ADDR_SPACE_GENERIC_P (as)
+  /* This should not be bss for an emulated TLS object.  */
+  if (DECL_TLS_MODEL (decl) == TLS_MODEL_EMULATED)
+    ;
+  else if (ADDR_SPACE_GENERIC_P (as)
       && !DECL_THREAD_LOCAL_P (decl)
       && !(prefer_noswitch_p && targetm.have_switchable_bss_sections)
       && bss_initializer_p (decl))
@@ -2153,35 +2233,6 @@  assemble_variable (tree decl, int top_level ATTRIB
   rtx decl_rtl, symbol;
   section *sect;
 
-  if (! targetm.have_tls
-      && TREE_CODE (decl) == VAR_DECL
-      && DECL_THREAD_LOCAL_P (decl))
-    {
-      tree to = emutls_decl (decl);
-
-      /* If this variable is defined locally, then we need to initialize the
-         control structure with size and alignment information.  We do this
-	 at the last moment because tentative definitions can take a locally
-	 defined but uninitialized variable and initialize it later, which
-	 would result in incorrect contents.  */
-      if (! DECL_EXTERNAL (to)
-	  && (! DECL_COMMON (to)
-	      || (DECL_INITIAL (decl)
-		  && DECL_INITIAL (decl) != error_mark_node)))
-	{
-	  DECL_INITIAL (to) = targetm.emutls.var_init
-	    (to, decl, get_emutls_init_templ_addr (decl));
-
-	  /* Make sure the template is marked as needed early enough.
-	     Without this, if the variable is placed in a
-	     section-anchored block, the template will only be marked
-	     when it's too late.  */
-	  record_references_in_initializer (to, false);
-	}
-
-      decl = to;
-    }
-
   last_assemble_variable_decl = 0;
 
   /* Normally no need to say anything here for external references,
@@ -2238,6 +2289,19 @@  assemble_variable (tree decl, int top_level ATTRIB
   if (flag_syntax_only)
     return;
 
+  /* We don't emit the userland vars for emulated TLS - they should never
+     get to here, only the control vars should be emitted.  */
+  gcc_assert (! decl_needs_tls_emulation_p (decl));
+  
+  /* However, for the emutls control vars we must ensure that the size
+     and align fields are initialized, even if the value is not.  */
+  if (!targetm.have_tls 
+      && TREE_CODE (decl) == VAR_DECL
+      && DECL_TLS_MODEL (decl) == TLS_MODEL_EMULATED
+      && !DECL_INITIAL (decl)
+      && !DECL_COMMON (decl))
+    emutls_add_base_initializer (decl);
+
   if (! dont_output_data
       && ! host_integerp (DECL_SIZE_UNIT (decl), 1))
     {
@@ -5701,18 +5765,12 @@  do_assemble_alias (tree decl, tree target)
   TREE_ASM_WRITTEN (decl) = 1;
   TREE_ASM_WRITTEN (DECL_ASSEMBLER_NAME (decl)) = 1;
 
+  gcc_assert (! decl_needs_tls_emulation_p (decl));
+
   if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
     {
       ultimate_transparent_alias_target (&target);
 
-      if (!targetm.have_tls
-	  && TREE_CODE (decl) == VAR_DECL
-	  && DECL_THREAD_LOCAL_P (decl))
-	{
-	  decl = emutls_decl (decl);
-	  target = get_emutls_object_name (target);
-	}
-
       if (!TREE_SYMBOL_REFERENCED (target))
 	weakref_targets = tree_cons (decl, target, weakref_targets);
 
@@ -5731,14 +5789,6 @@  do_assemble_alias (tree decl, tree target)
       return;
     }
 
-  if (!targetm.have_tls
-      && TREE_CODE (decl) == VAR_DECL
-      && DECL_THREAD_LOCAL_P (decl))
-    {
-      decl = emutls_decl (decl);
-      target = get_emutls_object_name (target);
-    }
-
 #ifdef ASM_OUTPUT_DEF
   /* Make name accessible from other files, if appropriate.  */
 
@@ -5820,6 +5870,15 @@  remove_unreachable_alias_pairs (void)
     }
 }
 
+/* Look up the decl for a symbol in the varpool.  */
+static tree
+var_decl_for_asm (tree symbol)
+{
+  struct varpool_node *vnode = varpool_node_for_asm  (symbol);
+  if (vnode) 
+    return vnode->decl;
+  return NULL;
+}
 
 /* First pass of completing pending aliases.  Make sure that cgraph knows
    which symbols will be required.  */
@@ -5832,8 +5891,55 @@  finish_aliases_1 (void)
 
   for (i = 0; VEC_iterate (alias_pair, alias_pairs, i, p); i++)
     {
-      tree target_decl;
+      tree target_decl=NULL;
 
+      /* When emulated TLS is in effect, redirect aliases so that they 
+         are registered between the control vars.  */
+      if (!targetm.have_tls 
+          && TREE_CODE (p->decl) == VAR_DECL
+          && DECL_TLS_MODEL (p->decl) != TLS_MODEL_NONE)
+	{
+	  tree tsym = p->target ;
+	  target_decl = var_decl_for_asm (tsym) ;
+	  if (!target_decl) 
+	    {
+	      /* If we didn't find the user's symbol, it could
+	         be because the alias really refers to a control 
+	         var.  */
+	      tsym = get_emutls_object_name (p->target);
+	      target_decl = var_decl_for_asm (tsym);
+	    }
+	  if (target_decl) 
+	    {
+	      struct varpool_node *vnode;
+	      /* If it hasn't been done already, substitute the control
+	         var for the original.  */
+	      if (DECL_THREAD_LOCAL_P (p->decl))
+		p->decl = emutls_decl (p->decl);
+	      /* If the target is not TLS, we've made a mistake.  */
+	      if (DECL_TLS_MODEL (target_decl) < TLS_MODEL_EMULATED)
+		error ("TLS symbol %q+D aliased to non-TLS symbol %qE",
+			p->decl, p->target);
+	      /* If it's the original we need to substitute the control.  */
+	      else if (DECL_THREAD_LOCAL_P (target_decl))
+		{
+		  target_decl = emutls_decl (target_decl);
+		  tsym = get_emutls_object_name (p->target);
+		}
+	      /* else it's already the emulation control.  */
+	      /* Mark the var needed.  */
+	      vnode = varpool_node (target_decl);
+	      if (vnode) 
+	        {
+		  varpool_mark_needed_node (vnode);
+		  vnode->force_output = 1;
+	        }
+	      p->target = tsym;
+	    }
+	  /* Else we didn't find a decl for the symbol, which is an error
+	     unless there's a weak ref.  */
+	} 
+      else
       target_decl = find_decl_and_mark_needed (p->decl, p->target);
       if (target_decl == NULL)
 	{
Index: gcc/passes.c
===================================================================
--- gcc/passes.c	(revision 161974)
+++ gcc/passes.c	(working copy)
@@ -152,6 +152,16 @@  rest_of_decl_compilation (tree decl,
 			  int top_level,
 			  int at_end)
 {
+  if (!targetm.have_tls
+      && !in_lto_p
+      && TREE_CODE (decl) == VAR_DECL
+      && DECL_THREAD_LOCAL_P (decl))
+    {
+      /* Substitute the control var. for the user one.  */
+      rest_of_decl_compilation (emutls_decl (decl), top_level, at_end);
+      return;
+    }
+
   /* We deferred calling assemble_alias so that we could collect
      other attributes such as visibility.  Emit the alias now.  */
   {
Index: gcc/varpool.c
===================================================================
--- gcc/varpool.c	(revision 161974)
+++ gcc/varpool.c	(working copy)
@@ -312,6 +312,14 @@  varpool_mark_needed_node (struct varpool_node *nod
       && !TREE_ASM_WRITTEN (node->decl))
     varpool_enqueue_needed_node (node);
   node->needed = 1;
+  /* If we need the var, and it's an emulated TLS entity, that
+     means we need the control var.  */
+  if (!targetm.have_tls && DECL_THREAD_LOCAL_P (node->decl))
+    {
+      struct varpool_node *cv_node;
+      cv_node = varpool_node (emutls_decl (node->decl));
+      varpool_mark_needed_node (cv_node);
+    }
 }
 
 /* Reset the queue of needed nodes.  */
@@ -346,17 +354,6 @@  decide_is_variable_needed (struct varpool_node *no
       && !DECL_EXTERNAL (decl))
     return true;
 
-  /* When emulating tls, we actually see references to the control
-     variable, rather than the user-level variable.  */
-  if (!targetm.have_tls
-      && TREE_CODE (decl) == VAR_DECL
-      && DECL_THREAD_LOCAL_P (decl))
-    {
-      tree control = emutls_decl (decl);
-      if (decide_is_variable_needed (varpool_node (control), control))
-	return true;
-    }
-
   /* When not reordering top level variables, we have to assume that
      we are going to keep everything.  */
   if (flag_toplevel_reorder)
@@ -381,14 +378,31 @@  varpool_finalize_decl (tree decl)
      or local (in C, has internal linkage).  So do nothing more
      if this function has already run.  */
   if (node->finalized)
+    return;
+
+  /* For emulated TLS vars, if we are in a position to finalize the userland
+     var, then we should be able to finalize the control var too.  */
+  if (!targetm.have_tls 
+      && TREE_CODE (decl) == VAR_DECL
+      && DECL_THREAD_LOCAL_P (decl))
     {
-      if (cgraph_global_info_ready)
-	varpool_assemble_pending_decls ();
+      tree control = emutls_decl (decl);
+      /* If we didn't create an initializer in the preceding line, then we
+         must now add a minimal one that sets the size and align fields.  */
+      if (!DECL_INITIAL (control) 
+	   && !DECL_COMMON (control)
+	   && !DECL_EXTERNAL (control))
+	{
+	  DECL_INITIAL (control) = targetm.emutls.var_init
+			(control, decl, null_pointer_node);
+	  record_references_in_initializer (control, false);
+	}
+
+      varpool_finalize_decl (control);
+      node->finalized = true;
       return;
     }
-  if (node->needed)
-    varpool_enqueue_needed_node (node);
-  node->finalized = true;
+
   if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl))
     node->force_output = true;
 
@@ -399,8 +413,11 @@  varpool_finalize_decl (tree decl)
      there.  */
   else if (TREE_PUBLIC (decl) && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
     varpool_mark_needed_node (node);
-  if (cgraph_global_info_ready)
-    varpool_assemble_pending_decls ();
+
+  if (node->needed)
+    varpool_enqueue_needed_node (node);
+
+  node->finalized = true;
 }
 
 /* Return variable availability.  See cgraph.h for description of individual
@@ -449,7 +466,7 @@  varpool_analyze_pending_decls (void)
 	     already informed about increased alignment.  */
           align_variable (decl, 0);
 	}
-      if (DECL_INITIAL (decl))
+      if (DECL_INITIAL (decl) && DECL_INITIAL (decl) != error_mark_node)
 	record_references_in_initializer (decl, analyzed);
       if (node->same_comdat_group)
 	{