Patchwork MIPS16 TLS support for GCC

login
register
mail settings
Submitter Richard Sandiford
Date Jan. 8, 2012, 11:20 a.m.
Message ID <87r4zaxwx6.fsf@firetop.home>
Download mbox | patch
Permalink /patch/134898/
State New
Headers show

Comments

Richard Sandiford - Jan. 8, 2012, 11:20 a.m.
Chung-Lin Tang <cltang@codesourcery.com> writes:
> Hi Richard,
> here are the patches for MIPS16 TLS.

Thanks for all the work.  It's great to see this hole in the MIPS16
support finally being plugged.

There seem to be three changes in the patch:

(1) Enable MIPS16 TLS (which must of course work independently
    of whether libgcc, libc, etc. were compiled as MIPS16).

(2) Allow the mips*-linux-gnu target libraries to be built as MIPS16.
    This relies on (1) but isn't required by (1).

(3) Define a built-in function for accessing the TLS pointer.  This is
    logically independent, although since the motivation was presumably
    to help build an external MIPS16 library, it's only worthwhile after (1).

I agree all three are good things to have, but I'd like to keep them as
separate patches.

Now that I've seen the implementation, (1) feels fairly low risk and
high value, so I'd like it to go in 4.7.

(2) is interesting if there is also a way to build those MIPS16 libraries
out of the box.  I'd like such a mechanism to be added at the same time,
so that the new support is easy to test.  This is still a 4.7 candidate
if it can be done in time, although it's probably a little tight.

You've given the built-in function the generic name __builtin_thread_pointer.
I agree that's a good idea, but I think we should also _treat_ it as a generic
function, and make it available on all targets.  The actual implementation
can be delegated to a target hook.  That makes (3) 4.8 material.

(Incidentally, I don't think it's correct to define the builtin if TLS
isn't available, so if we did keep it as a MIPS-specific function,
d->avail would need to be nonnull.  There would need to be some other
way of indicating that MIPS16 was OK.)

> The __mips16_rdhwr() function is not intended for general use, just
> tightly coupled compiler runtime support; therefore, it is only linked
> statically from libgcc.a, not exported from shared libgcc.

Well, a fair bit of libgcc is tightly-coupled compiler runtime support.
I don't really see any need to handle the new function differently from
the other __mips16 wrappers.  It's not like we're gaining any benefit in
the PIC call overhead: we can't turn JALRs into branches like we can for
nearby non-MIPS16-to-non-MIPS16 calls, so PIC calls will still go via
the GOT.  And we're not gaining any benefit in terms of ABI baggage either.
Future libgcc(.a)s would need to continue providing the function as
specified now, in order for future gccs to be able to link library
archives built with 4.7.

By making the function hidden, we lose the important ability to replace
all instances of it.  E.g. it isn't inconceivable that someone will find
a more efficient way to implement the function in cases where the RDHWR
is emulated (perhaps on a kernel other than Linux -- this support isn't
specific to *-linux-gnu).  Being able to replace all instances of a
function is useful for other things, such as profiling, debugging,
or working around processor errata.

> 5) The libgomp MIPS futex.h header has been adjusted; sys_futex0() has
> been modified to be static, non-inlined, nomips16 under MIPS16 mode (for
> sake of using 'syscall'). The inline assembly has also been fixed, as
> Maciej noticed a possible violation of the MIPS syscall restart
> convention; the 'li $2, #syscall_number' must be right before the
> syscall insn.

This change is OK as part of (2).

> 2012-01-06  Chung-Lin Tang  <cltang@codesourcery.com>
>
> 	gcc/
>         * config/mips/mips.h (CRT_CALL_STATIC_FUNCTION): Define versions
>         for MIPS16 O32.
>         * config/mips/mips-protos.h (mips_symbol_type): Add
>         SYMBOL_DTPREL_HI,SYMBOL_TPREL_HI. Update in comments.
>         (mips_output_tls_reloc_directive): New prototype.
>         * config/mips/mips-ftypes.def: Add (0, (POINTER)) entry.
>         * config/mips/predicates.md (tls_reloc_operand): New predicate.
>         * config/mips/mips.md (V0_REGNUM,PIC_JUMP_REGNUM): New constant.
>         (MIPS16 %dtprel_hi,%tprel_hi split pattern): New.
>         (consttable_tls_reloc): New.
>         (tls_get_tp_<mode>_mips16): New insn and split pattern.
>         (*tls_get_tp_<mode>_mips16_rdhwr): New insn pattern.

mips.c is missing from the changelog.

> Index: gcc/config/mips/mips.md
> ===================================================================
> --- gcc/config/mips/mips.md	(revision 182952)
> +++ gcc/config/mips/mips.md	(working copy)
> @@ -134,7 +134,9 @@
>  ])
>  
>  (define_constants
> -  [(TLS_GET_TP_REGNUM		3)
> +  [(V0_REGNUM			2)
> +   (TLS_GET_TP_REGNUM		3)
> +   (PIC_JUMP_REGNUM		25)
>     (RETURN_ADDR_REGNUM		31)
>     (CPRESTORE_SLOT_REGNUM	76)
>     (GOT_VERSION_REGNUM		79)

V0_REGNUM seems to be unused.  We should move PIC_FUNCTION_ADDR_REGNUM
here rather than add another name for it.

> @@ -3933,6 +3935,23 @@
>    operands[2] = mips_unspec_address (operands[1], SYMBOL_32_HIGH);
>  })
>  
> +;; MIPS16 %dtprel_hi,%tprel_hi split pattern. Similar transform
> +;; as above, for supporting MIPS16 TLS.
> +(define_split
> +  [(set (match_operand:SI 0 "d_operand")
> +	(high:SI (match_operand:SI 1 "tls_reloc_operand")))]
> +  "TARGET_MIPS16 && reload_completed"
> +  [(set (match_dup 0) (match_dup 2))
> +   (set (match_dup 0) (ashift:SI (match_dup 0) (const_int 16)))]
> +{
> +  /* SYMBOL_DTPREL_HI/TPREL_HI are ordered immediately after
> +     SYMBOL_DTPREL/TPREL respectively, so use unspec_type + 1.  */
> +  rtx unspec = XEXP (operands[1], 0);
> +  int unspec_type = XINT (unspec, 1);
> +  operands[2] = mips_unspec_address (XVECEXP (unspec, 0, 0),
> +				     unspec_type + 1 - UNSPEC_ADDRESS_FIRST);
> +})

mips_symbolic_constant_p allows an offset to be applied, in which case
the expression has the form:

   (const (plus (unspec [BASE] UNSPEC_FOO) (const_int OFFSET)))

I'm also not keen on hard-coding the +1 relationship.

I think I made the wrong call when adding SYMBOL_32_HIGH.  The idea
was to reuse the existing move patterns rather than add a special one.
That wasn't too bad when there was just this one simple case, but as
the patch shows, it doesn't generalise well.  Let's add a new "load
unshifted high" instruction instead.

> @@ -6593,6 +6623,60 @@
>     ; See tls_get_tp_<mode>
>     (set_attr "can_delay" "no")
>     (set_attr "mode" "<MODE>")])
> +
> +;; In MIPS16 mode, the TLS base pointer is accessed by a
> +;; libgcc helper function __mips16_rdhwr(), as 'rdhwr' is not
> +;; accessible in MIPS16.
> +;;
> +;; This is not represented as a call insn, to avoid the
> +;; unnecesarry clobbering of caller-save registers by a
> +;; function consisting only of: "rdhwr $3,$29; j $31; nop;"
> +;;
> +;; A $25 clobber is added to cater for a $25 load stub added by the
> +;; linker to __mips16_rdhwr when the call is made from non-PIC code.
> +
> +(define_insn_and_split "tls_get_tp_<mode>_mips16"
> +  [(set (match_operand:P 0 "register_operand" "=d")
> +	(unspec:P [(const_int 0)] UNSPEC_TLS_GET_TP))
> +   (clobber (reg:P TLS_GET_TP_REGNUM))
> +   (clobber (reg:P PIC_JUMP_REGNUM))
> +   (clobber (reg:P RETURN_ADDR_REGNUM))]
> +  "HAVE_AS_TLS && TARGET_MIPS16"
> +  "#"
> +  "&& reload_completed"
> +  [(parallel [(set (reg:P TLS_GET_TP_REGNUM)
> +  	      (unspec:P [(match_dup 1)] UNSPEC_TLS_GET_TP))
> +	      (clobber (reg:P PIC_JUMP_REGNUM))
> +	      (clobber (reg:P RETURN_ADDR_REGNUM))])
> +   (set (match_dup 0) (reg:P TLS_GET_TP_REGNUM)) ]
> +  {
> +    /* UNSPEC operand decides on direct/indirect pattern below.  */
> +    rtx sym = gen_rtx_SYMBOL_REF (Pmode, "__mips16_rdhwr");
> +    if (TARGET_ABSOLUTE_JUMPS)
> +      operands[1] = sym;
> +    else
> +      {
> +	operands[1] = gen_rtx_REG (Pmode, TLS_GET_TP_REGNUM);
> +	mips_emit_move (operands[1], sym);
> +      }
> +  }
> +  [(set_attr "type" "unknown")
> +   (set_attr "can_delay" "no")
> +   (set_attr "mode" "<MODE>")])

We should set "length" here, instead of "can_delay".  We should also
reuse the existing call-address handling -- call_insn_operand,
mips16_stub_function, etc. -- in order to make sure that things like
-mlong-calls are handled correctly.  It seems better to do that in
mips_get_tp (), then pass the call_insn_operand as an address.

> +(define_insn "*tls_get_tp_<mode>_mips16_rdhwr"
> +  [(set (reg:P TLS_GET_TP_REGNUM)
> +	(unspec:P [(match_operand:P 0 "")] UNSPEC_TLS_GET_TP))
> +	(clobber (reg:P PIC_JUMP_REGNUM))
> +	(clobber (reg:P RETURN_ADDR_REGNUM))]
> +  "HAVE_AS_TLS && TARGET_MIPS16"
> +  {
> +    return MIPS_CALL ("jal", operands, 0, -1);
> +  }
> +  [(set_attr "type" "call")
> +   (set_attr "can_delay" "no")
> +   (set_attr "mode" "<MODE>")])

Should be:

	(unspec:P [(match_operand:P 0 "call_insn_operand" "dS")]
		  UNSPEC_TLS_GET_TP))

because we need to tell post-reload passes what they can and can't do.
Should set "length" to 12 rather than "can_delay" to "no".

> Index: gcc/config/mips/mips.c
> ===================================================================
> --- gcc/config/mips/mips.c	(revision 182952)
> +++ gcc/config/mips/mips.c	(working copy)
> @@ -1763,6 +1767,8 @@ mips_symbolic_constant_p (rtx x, enum mips_symbol_
>      case SYMBOL_GOTTPREL:
>      case SYMBOL_TLS:
>      case SYMBOL_HALF:
> +    case SYMBOL_TPREL_HI:
> +    case SYMBOL_DTPREL_HI:
>        return false;
>      }

For the record, these types would need to have allowed the same offsets
as SYMBOL_TPREL and SYMBOL_DTPREL, otherwise we'd have ICEd after
splitting a SYMBOL + CONST_INT.  It's not important now though.

> @@ -1928,14 +1936,23 @@ mips_cannot_force_const_mem (enum machine_mode mod
>    if (mips_symbolic_constant_p (base, SYMBOL_CONTEXT_LEA, &type)
>        && type != SYMBOL_FORCE_TO_MEM)
>      {
> +      if (TARGET_MIPS16_PCREL_LOADS)
> +	{
> +	  /* Under MIPS16, TLS DTP/TP-relative offsets are loaded from the
> +	     constant pool, as this saves some code size compared to hi/lo
> +	     constructing.  */
> +	  if (type == SYMBOL_DTPREL || type == SYMBOL_TPREL)
> +	    return false;

As below, we should instead replace SYMBOL_FORCE_TO_MEM with a separate
predicate, since "force-to-mem"ness is now a separate property.

> @@ -2820,11 +2837,20 @@ mips_call_tls_get_addr (rtx sym, enum mips_symbol_
>  /* Return a pseudo register that contains the current thread pointer.  */
>  
>  static rtx
> -mips_get_tp (void)
> +mips_get_tp (rtx target)

This is part of (3), but: adjust the comment to mention the parameter.

> +  if (TARGET_MIPS16)
> +    {
> +      if (Pmode == DImode)
> +	emit_insn (gen_tls_get_tp_di_mips16 (tp));
> +      else
> +	emit_insn (gen_tls_get_tp_si_mips16 (tp));

So it looks like I missed a couple of places that should be using
PMODE_INSN.  Let's put the mode last -- tls_get_tp_mips16_{si,di} --
and use PMODE_INSN for both cases.  

> @@ -2875,13 +2895,23 @@ mips_legitimize_tls_address (rtx loc)
>  			    UNSPEC_TLS_LDM);
>        emit_libcall_block (insn, tmp1, v0, eqv);
>  
> -      tmp2 = mips_unspec_offset_high (NULL, tmp1, loc, SYMBOL_DTPREL);
> -      dest = gen_rtx_LO_SUM (Pmode, tmp2,
> -			     mips_unspec_address (loc, SYMBOL_DTPREL));
> +      if (TARGET_MIPS16_PCREL_LOADS)
> +	{
> +	  tmp2 = mips_force_temporary (NULL,
> +				       mips_unspec_address (loc,
> +							    SYMBOL_DTPREL));
> +	  dest = gen_rtx_PLUS (Pmode, tmp1, tmp2);
> +	}
> +      else
> +	{
> +	  tmp2 = mips_unspec_offset_high (NULL, tmp1, loc, SYMBOL_DTPREL);
> +	  dest = gen_rtx_LO_SUM (Pmode, tmp2,
> +				 mips_unspec_address (loc, SYMBOL_DTPREL));
> +	}

DEST is supposed to be a legitimate address (the sum of two registers isn't).
We should also use a higher-level test than TARGET_MIPS16_PCREL_LOADS.
Same for TPREL.

> @@ -17108,6 +17186,28 @@ mips_expand_vec_minmax (rtx target, rtx op0, rtx o
>    x = gen_rtx_IOR (vmode, t0, t1);
>    emit_insn (gen_rtx_SET (VOIDmode, target, x));
>  }
> +
> +/* Output a DTP/TP-relative relocation, used in MIPS16 TLS.  */
> +
> +void
> +mips_output_tls_reloc_directive (rtx x, rtx size)
> +{
> +  const char *dir = NULL;
> +  if (GET_CODE (x) == CONST)
> +    x = XEXP (x, 0);
> +  switch (UNSPEC_ADDRESS_TYPE (x))

As above, we allow small offsets to be applied to TPREL and DTPREL,
so we can't assume that we have (const (unspec ...)).

(In this case the HI/LO restriction doesn't apply, so we could actually
be a bit more relaxed in the offsets that we allow.  That's more
stage 1 material though.)

> Index: gcc/config/mips/mips.h
> ===================================================================
> --- gcc/config/mips/mips.h	(revision 182952)
> +++ gcc/config/mips/mips.h	(working copy)
> @@ -2841,8 +2841,32 @@ while (0)
>  	jal " USER_LABEL_PREFIX #FUNC "\n\
>  	" TEXT_SECTION_ASM_OP);
>  #endif
> +
> +#else
> +#if (defined _ABIO32 && _MIPS_SIM == _ABIO32)
> +#ifdef __PIC__
> +/* For MIPS16 PIC, construct the GP-value in $2 using PC-relative insns.  */
> +#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
> +   asm (SECTION_OP "\n\
> +	li $2,%hi(_gp_disp)\n\
> +	addiu $3,$pc,%lo(_gp_disp)\n\
> +	sll $2,16\n\
> +	addu $2,$3\n\
> +	lw $2,%got(" USER_LABEL_PREFIX #FUNC ")($2)\n\
> +	addiu $2,%lo(" USER_LABEL_PREFIX #FUNC ")\n\
> +	move $25,$2\n\
> +	jalr $2\n\
> +	" TEXT_SECTION_ASM_OP);
> +#else
> +#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
> +   asm (SECTION_OP "\n\
> +	jal " USER_LABEL_PREFIX #FUNC "\n\
> +	" TEXT_SECTION_ASM_OP);
>  #endif
> +#endif

This is part of (2), but: why is the !__PIC__ definition necessary for just
this one case?  If it's correct, then isn't it correct for !MIPS16 too?
And why isn't the default definition in crtstuff.c good enough?

I wanted to try this out on my set-up, so here's the modified version of (1).
I found I needed a GAS patch in order to get clean results.  I'll post
that patch soon.

The mips_init_relocs change is mostly reindentation, so it looks more
frightening that it is.

A consequence of the SYMBOL_FORCE_TO_MEM change is that -mcode-readable=pcrel
now allows non-PC-relative LEAs of absolute labels to be done via the
constant pool rather than using %hi and %lo.  I think this is a good
thing (it's shorter, and matches the corresponding symbol case)
so I've adjusted the testcase to allow it.

Tested on mips64-linux-gnu.  Does the patch look OK to you?

Thanks,
Richard


gcc/
2012-01-08  Chung-Lin Tang  <cltang@codesourcery.com>
	    Richard Sandiford  <rdsandiford@googlemail.com>

	* config/mips/mips-protos.h (SYMBOL_FORCE_TO_MEM): Delete.
	(SYMBOL_32_HIGH): Likewise.
	(mips_output_tls_reloc_directive): Declare.
	* config/mips/mips.h (PIC_FUNCTION_ADDR_REGNUM): Move to mips.md.
	(mips_use_pcrel_pool_p, mips_lo_relocs, mips_hi_relocs): Declare.
	* config/mips/mips.c (mips_use_pcrel_pool_p): New variable.
	(mips_lo_relocs, mips_hi_relocs): Make extern.
	(mips16_stub_function): Move up file.
	(mips_classify_symbol): Remove SYMBOL_FORCE_TO_MEM handling.
	(mips_symbolic_constant_p): Likewise.  Remove SYMBOL_32_HIGH too.
	(mips_symbol_insns_1): Likewise.  Check mips_use_pcrel_pool_p.
	(mips_cannot_force_const_mem): Use mips_use_pcrel_pool_p instead
	of SYMBOL_FORCE_TO_MEM.  Only check mips_tls_symbol_ref_1
	if it's false.
	(mips_get_tp): Add MIPS16 support.
	(mips_legitimize_tls_address): Remove MIPS16 sorry().
	Generalize DTPREL and TPREL handling.
	(mips_init_relocs): Initialize mips_use_pcrel_pool_p.
	Add MIPS16 TLS support.
	(mips_output_tls_reloc_directive): New function.
	(mips16_rewrite_pool_refs): Ignore UNSPEC_TLS_GET_TPs.
	* config/mips/predicates.md (symbolic_operand_with_high)
	(tls_reloc_operand): New predicates.
	(force_to_mem_operand): Use mips_use_pcrel_pool_p.
	* config/mips/mips.md (UNSPEC_UNSHIFTED_HIGH): New unspec.
	(PIC_FUNCTION_ADDR_REGNUM): Moved from mips.h.
	(*unshifted_high): New instruction.  Use it for MIPS16
	high splitter.
	(consttable_tls_reloc, tls_get_tp_mips16_<mode>): New patterns.
	(*tls_get_tp_mips16_call_<mode>): Likewise.

gcc/testsuite/
	* gcc.target/mips/code-readable-2.c: Allow the jump table address
	to be loaded from the constant pool, rather than via %hi and %lo.

libgcc/
2012-01-08  Chung-Lin Tang  <cltang@codesourcery.com>
	    Richard Sandiford  <rdsandiford@googlemail.com>

	* config/mips/libgcc-mips16.ver (__mips16_rdhwr): Add.
	* config/mips/mips16.S (__mips16_rdhwr): New function.
	* config/mips/t-mips16 (LIB1ASMFUNCS): Add _m16rdhwr.
Richard Sandiford - Jan. 15, 2012, 6:23 p.m.
Richard Sandiford <rdsandiford@googlemail.com> writes:
> gcc/
> 2012-01-08  Chung-Lin Tang  <cltang@codesourcery.com>
> 	    Richard Sandiford  <rdsandiford@googlemail.com>
>
> 	* config/mips/mips-protos.h (SYMBOL_FORCE_TO_MEM): Delete.
> 	(SYMBOL_32_HIGH): Likewise.
> 	(mips_output_tls_reloc_directive): Declare.
> 	* config/mips/mips.h (PIC_FUNCTION_ADDR_REGNUM): Move to mips.md.
> 	(mips_use_pcrel_pool_p, mips_lo_relocs, mips_hi_relocs): Declare.
> 	* config/mips/mips.c (mips_use_pcrel_pool_p): New variable.
> 	(mips_lo_relocs, mips_hi_relocs): Make extern.
> 	(mips16_stub_function): Move up file.
> 	(mips_classify_symbol): Remove SYMBOL_FORCE_TO_MEM handling.
> 	(mips_symbolic_constant_p): Likewise.  Remove SYMBOL_32_HIGH too.
> 	(mips_symbol_insns_1): Likewise.  Check mips_use_pcrel_pool_p.
> 	(mips_cannot_force_const_mem): Use mips_use_pcrel_pool_p instead
> 	of SYMBOL_FORCE_TO_MEM.  Only check mips_tls_symbol_ref_1
> 	if it's false.
> 	(mips_get_tp): Add MIPS16 support.
> 	(mips_legitimize_tls_address): Remove MIPS16 sorry().
> 	Generalize DTPREL and TPREL handling.
> 	(mips_init_relocs): Initialize mips_use_pcrel_pool_p.
> 	Add MIPS16 TLS support.
> 	(mips_output_tls_reloc_directive): New function.
> 	(mips16_rewrite_pool_refs): Ignore UNSPEC_TLS_GET_TPs.
> 	* config/mips/predicates.md (symbolic_operand_with_high)
> 	(tls_reloc_operand): New predicates.
> 	(force_to_mem_operand): Use mips_use_pcrel_pool_p.
> 	* config/mips/mips.md (UNSPEC_UNSHIFTED_HIGH): New unspec.
> 	(PIC_FUNCTION_ADDR_REGNUM): Moved from mips.h.
> 	(*unshifted_high): New instruction.  Use it for MIPS16
> 	high splitter.
> 	(consttable_tls_reloc, tls_get_tp_mips16_<mode>): New patterns.
> 	(*tls_get_tp_mips16_call_<mode>): Likewise.
>
> gcc/testsuite/
> 	* gcc.target/mips/code-readable-2.c: Allow the jump table address
> 	to be loaded from the constant pool, rather than via %hi and %lo.
>
> libgcc/
> 2012-01-08  Chung-Lin Tang  <cltang@codesourcery.com>
> 	    Richard Sandiford  <rdsandiford@googlemail.com>
>
> 	* config/mips/libgcc-mips16.ver (__mips16_rdhwr): Add.
> 	* config/mips/mips16.S (__mips16_rdhwr): New function.
> 	* config/mips/t-mips16 (LIB1ASMFUNCS): Add _m16rdhwr.

This wouldn't normally be "stage 4" material, but since it was posted
just before stage 3 closed, I went ahead and applied it.  I'll update
the web pages "soon".

Richard
Chung-Lin Tang - Feb. 3, 2012, 10:02 a.m.
Hi Richard,

Sorry for the delay to respond, and thanks for revising and committing
the changes to trunk. The revised changes look much cleaner :)

About the other concerns:

> (2) is interesting if there is also a way to build those MIPS16 libraries
> out of the box.  I'd like such a mechanism to be added at the same time,
> so that the new support is easy to test.  This is still a 4.7 candidate
> if it can be done in time, although it's probably a little tight.

I assume you mean the libgomp patch? That's mostly a potential bug fix;
as for MIP16 multilib additions, so far I haven't added any to the
default mips*-linux* configs, as I guess this should be vendor-stuff
(though the current additions should clear away any obstacles)

> You've given the built-in function the generic name __builtin_thread_pointer.
> I agree that's a good idea, but I think we should also _treat_ it as a generic
> function, and make it available on all targets.  The actual implementation
> can be delegated to a target hook.  That makes (3) 4.8 material.

Actually, I named and added __builtin_thread_pointer() only because it
was needed for building glibc (to avoid using 32-bit asm in MIPS16
mode), and because some other targets also have it (I'm sure ARM also
has it, and at least one other)

> (Incidentally, I don't think it's correct to define the builtin if TLS
> isn't available, so if we did keep it as a MIPS-specific function,
> d->avail would need to be nonnull.  There would need to be some other
> way of indicating that MIPS16 was OK.)

The function is essentially a wrapper for mips_get_tp(), which I don't
see any guarding for the no TLS case? FWIW, TARGET_HAVE_TLS is just
defined as HAVE_AS_TLS for MIPS...

>> The __mips16_rdhwr() function is not intended for general use, just
>> tightly coupled compiler runtime support; therefore, it is only linked
>> statically from libgcc.a, not exported from shared libgcc.
> 
> Well, a fair bit of libgcc is tightly-coupled compiler runtime support.
> I don't really see any need to handle the new function differently from
> the other __mips16 wrappers.  It's not like we're gaining any benefit in
> the PIC call overhead: we can't turn JALRs into branches like we can for
> nearby non-MIPS16-to-non-MIPS16 calls, so PIC calls will still go via
> the GOT.  And we're not gaining any benefit in terms of ABI baggage either.
> Future libgcc(.a)s would need to continue providing the function as
> specified now, in order for future gccs to be able to link library
> archives built with 4.7.
> 
> By making the function hidden, we lose the important ability to replace
> all instances of it.  E.g. it isn't inconceivable that someone will find
> a more efficient way to implement the function in cases where the RDHWR
> is emulated (perhaps on a kernel other than Linux -- this support isn't
> specific to *-linux-gnu).  Being able to replace all instances of a
> function is useful for other things, such as profiling, debugging,
> or working around processor errata.

We touched this internally as well, but interestingly came to the
opposite conclusion :)  We're not sure the __mips16_rdhwr() is
ultimately the best choice of API, plus the non-standardness of the
calling convention, hence simply decided to hide it from libgcc_s.so, in
case we need to change the MIPS16 TLS interface.

On the issue of hiding stuff in mips16.S, it may be suitable to raise
this issue here: there is a problem when MIPS16 hard-float calls go
through PLTs, because the hard-float stub functions in mips16.S use
v0/v1 as scratch registers, the same as MIPS16 PLT sequences generated
by BFD. (I think I described the scenario mostly correct, CCing Maciej
here who worked on this issue)  I think the solution was to mark a large
portion of the mips16.S functions as all hidden.

>> 5) The libgomp MIPS futex.h header has been adjusted; sys_futex0() has
>> been modified to be static, non-inlined, nomips16 under MIPS16 mode (for
>> sake of using 'syscall'). The inline assembly has also been fixed, as
>> Maciej noticed a possible violation of the MIPS syscall restart
>> convention; the 'li $2, #syscall_number' must be right before the
>> syscall insn.
> 
> This change is OK as part of (2).

Okay thanks, I commit this part soon.

Thanks,
Chung-Lin
Maciej W. Rozycki - Feb. 3, 2012, 11:56 a.m.
On Fri, 3 Feb 2012, Chung-Lin Tang wrote:

> On the issue of hiding stuff in mips16.S, it may be suitable to raise
> this issue here: there is a problem when MIPS16 hard-float calls go
> through PLTs, because the hard-float stub functions in mips16.S use
> v0/v1 as scratch registers, the same as MIPS16 PLT sequences generated
> by BFD. (I think I described the scenario mostly correct, CCing Maciej
> here who worked on this issue)  I think the solution was to mark a large
> portion of the mips16.S functions as all hidden.

 Not quite a large portion, just the function return stubs (which I reckon 
there are exactly four) that process values in $v0/$v1 as if they were 
arguments.  This is of course a non-standard calling convention that 
breaks the ABI and shouldn't be exposed to the dynamic linker and the PLT.  
Not even mentioning that three of these four stubs are not bigger than a 
PLT entry they would need, and the fourth is two or so instructions 
larger, so making them dynamic hardly makes point (the cost of the 
indirect call from PLT also matters).

 Given that this code (to support MIPS16 shared libraries) was obviously 
never fully functional before this effort, I think it is about the right 
time to remove some hacks I saw in glibc's dynamic linker that are meant 
to avoid clobbering $v0/$v1 before someone tries to rely on them for real.  

 And there's no reasonable way to support MIPS16 PLT entries without 
clobbering $v0/$v1 (to preempt your question -- MIPS16 code can of course 
work just fine with standard MIPS PLT entries (no pure-MIPS16 processors 
are possible), but on some processors the cost of flipping the ISA bit is 
prohibitive as it requires draining the pipeline to reconfigure the 
instruction decoder and as such should best be avoided in pure-MIPS16 
dynamic binaries).

  Maciej
Richard Sandiford - Feb. 3, 2012, 3:28 p.m.
Chung-Lin Tang <cltang@codesourcery.com> writes:
>> (2) is interesting if there is also a way to build those MIPS16 libraries
>> out of the box.  I'd like such a mechanism to be added at the same time,
>> so that the new support is easy to test.  This is still a 4.7 candidate
>> if it can be done in time, although it's probably a little tight.
>
> I assume you mean the libgomp patch? That's mostly a potential bug fix;

That, plus the CRT_CALL_STATIC_FUNCTION and crtfastmath.c changes.

> as for MIP16 multilib additions, so far I haven't added any to the
> default mips*-linux* configs, as I guess this should be vendor-stuff
> (though the current additions should clear away any obstacles)

Vendor stuff would be fine.  But to be clear, (2) isn't OK without
some way of getting MIPS16 multilibs out of the box, since it would
otherwise be dead code.

>> You've given the built-in function the generic name __builtin_thread_pointer.
>> I agree that's a good idea, but I think we should also _treat_ it as a generic
>> function, and make it available on all targets.  The actual implementation
>> can be delegated to a target hook.  That makes (3) 4.8 material.
>
> Actually, I named and added __builtin_thread_pointer() only because it
> was needed for building glibc (to avoid using 32-bit asm in MIPS16
> mode), and because some other targets also have it (I'm sure ARM also
> has it, and at least one other)

All the more reason to make it target-independent. :-)

>> (Incidentally, I don't think it's correct to define the builtin if TLS
>> isn't available, so if we did keep it as a MIPS-specific function,
>> d->avail would need to be nonnull.  There would need to be some other
>> way of indicating that MIPS16 was OK.)
>
> The function is essentially a wrapper for mips_get_tp(), which I don't
> see any guarding for the no TLS case? FWIW, TARGET_HAVE_TLS is just
> defined as HAVE_AS_TLS for MIPS...

This stuff is inherently guarded by TARGET_HAVE_TLS.

>>> 5) The libgomp MIPS futex.h header has been adjusted; sys_futex0() has
>>> been modified to be static, non-inlined, nomips16 under MIPS16 mode (for
>>> sake of using 'syscall'). The inline assembly has also been fixed, as
>>> Maciej noticed a possible violation of the MIPS syscall restart
>>> convention; the 'li $2, #syscall_number' must be right before the
>>> syscall insn.
>> 
>> This change is OK as part of (2).
>
> Okay thanks, I commit this part soon.

Please don't!  I was unnecessarily confusing here, sorry.  I was trying
to say earlier that (2) isn't OK as-is because people have no way of
testing it without changing the sources locally.  We need to add some
way of configuring MIPS16 multilibs at the same time.  The libgomp
change is OK as part of that patch, but not on its own.

Thanks,
Richard
Richard Sandiford - Feb. 3, 2012, 4:20 p.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
> On Fri, 3 Feb 2012, Chung-Lin Tang wrote:
>> On the issue of hiding stuff in mips16.S, it may be suitable to raise
>> this issue here: there is a problem when MIPS16 hard-float calls go
>> through PLTs, because the hard-float stub functions in mips16.S use
>> v0/v1 as scratch registers, the same as MIPS16 PLT sequences generated
>> by BFD. (I think I described the scenario mostly correct, CCing Maciej
>> here who worked on this issue)  I think the solution was to mark a large
>> portion of the mips16.S functions as all hidden.
>
>  Not quite a large portion, just the function return stubs (which I reckon 
> there are exactly four) that process values in $v0/$v1 as if they were 
> arguments.  This is of course a non-standard calling convention that 
> breaks the ABI and shouldn't be exposed to the dynamic linker and the PLT.  
> Not even mentioning that three of these four stubs are not bigger than a 
> PLT entry they would need, and the fourth is two or so instructions 
> larger, so making them dynamic hardly makes point (the cost of the 
> indirect call from PLT also matters).

But that isn't the only reason to use shared libraries.

>  Given that this code (to support MIPS16 shared libraries) was obviously 
> never fully functional before this effort, I think it is about the right 
> time to remove some hacks I saw in glibc's dynamic linker that are meant 
> to avoid clobbering $v0/$v1 before someone tries to rely on them for real.

FSF BFD doesn't generate MIPS16 PLTs though.  Is this a CS-local change?
If so, the criticism seems a little unfair.  The FSF code is self-
consistent: PLTs can be used for these stubs because the resolver
function preserves the necessary registers.  Traditional lazy binding
stubs are avoided by not using %call16.

I don't think changing the resolver function so that it preserves
fewer registers would be a good idea.  As well as the MIPS16 thing,
$3 is needed for _mcount.

You're right of course that PLTs can't be used for __mips16_rdhwr.
I'd forgotten about that, sorry.  Will fix this weekend.

If there's still some concern that __mips16_rdhwr might not have
the right ABI, then maybe we should simply emit a link-once function
in each object that needs it.  We could then switch to another function
(and another API) without having to keep the old one in libgcc.a for
compatibility.  It would also avoid the -shared-libgcc thing.

Admittedly that's just an off-the-top-of-my-head idea. :-)
What do you think?

Richard
Maciej W. Rozycki - Feb. 3, 2012, 5:33 p.m.
On Fri, 3 Feb 2012, Richard Sandiford wrote:

> >  Not quite a large portion, just the function return stubs (which I reckon 
> > there are exactly four) that process values in $v0/$v1 as if they were 
> > arguments.  This is of course a non-standard calling convention that 
> > breaks the ABI and shouldn't be exposed to the dynamic linker and the PLT.  
> > Not even mentioning that three of these four stubs are not bigger than a 
> > PLT entry they would need, and the fourth is two or so instructions 
> > larger, so making them dynamic hardly makes point (the cost of the 
> > indirect call from PLT also matters).
> 
> But that isn't the only reason to use shared libraries.

 That is true in the general case, but these MIPS16 stubs have essentially 
been cast in stone and there's little room if any for change here.  I 
can't imagine a case where static linking of these four stubs would cause 
trouble, but certainly I am open to any examples.

> >  Given that this code (to support MIPS16 shared libraries) was obviously 
> > never fully functional before this effort, I think it is about the right 
> > time to remove some hacks I saw in glibc's dynamic linker that are meant 
> > to avoid clobbering $v0/$v1 before someone tries to rely on them for real.
> 
> FSF BFD doesn't generate MIPS16 PLTs though.  Is this a CS-local change?

 A patch is in the queue, together with microMIPS PLT entries (that you 
have expected, haven't you?).  I can't give you a specific schedule, but I 
expect to propose it as soon as I've got the current batch of GDB changes 
followed by the outstanding binutils clean-ups I got your feedback on that 
I haven't finished processing yet, off my plate.

> If so, the criticism seems a little unfair.  The FSF code is self-
> consistent: PLTs can be used for these stubs because the resolver
> function preserves the necessary registers.  Traditional lazy binding
> stubs are avoided by not using %call16.

 Please don't take it as criticism but as a starting point for discussion. 
And certainly I did not intend to sound negative, I'm sorry if I did.  
I'm just raising points I noted as MIPS16 PLT entries were being 
developed.  I do think though that it's a bit unfortunate that such a 
creeping ABI change has been made here.  I think in case of doubt it's 
always safe to stick to the ABI.

 Lazy binding stubs are less of concern for some reason that is not 
entirely clear to me and we have no MIPS16 implementation in the queue 
(but we do have microMIPS stubs, as you might have expected -- to 
facilitate pure-microMIPS processors that may eventually appear).

> I don't think changing the resolver function so that it preserves
> fewer registers would be a good idea.  As well as the MIPS16 thing,
> $3 is needed for _mcount.

 I don't think _mcount has ever worked for dynamic libraries, has it? -- 
please correct me if I am wrong, but I believe `gprof' relies on memory 
segments to be contiguous which is certainly not the case for a dynamic 
executable where you have a separate set of mappings for the executable 
proper and then for each shared library loaded.

 I didn't know it required $3 for anything either -- I've thought it was 
$1 only.  How has it worked for standard MIPS code then?

 We have only about now got MIPS16 shared libraries to work -- are you 
sure removing code to save/restore $2/$3 in the dynamic linker is going to 
hit anyone?  While a small piece, this is still some wasted memory and 
execution time for every executable on every system and whether MIPS16 
code is involved or not (even on systems that do not support the MIPS16 
ASE at all), just to cope with an ABI anomaly of literally four functions 
only needed for some MIPS16 code (and which had not originally been 
expected to be ever used for dynamic linking -- until recently nobody even 
considered the use of MIPS16 code on a shared library system).  And I 
think you can still get into trouble if you use the wrong ld.so with your 
MIPS16 executable -- or has symbol versioning been used to ensure that 
ld.so bails out in this case?

> You're right of course that PLTs can't be used for __mips16_rdhwr.

 Unless you want to waste text (and possibly stack) space and time to save 
and restore call-clobbered registers in the caller that is. ;)

> I'd forgotten about that, sorry.  Will fix this weekend.
> 
> If there's still some concern that __mips16_rdhwr might not have
> the right ABI, then maybe we should simply emit a link-once function
> in each object that needs it.  We could then switch to another function
> (and another API) without having to keep the old one in libgcc.a for
> compatibility.  It would also avoid the -shared-libgcc thing.
> 
> Admittedly that's just an off-the-top-of-my-head idea. :-)
> What do you think?

 Actually I had that idea of a link-once function too, but it turned out 
quite complicated to do without rewriting some generic parts of GCC as it 
is currently not prepared to emit link-once functions outside C++ 
compilations.  It's been a while and I did lots of other stuff meanwhile, 
so please excuse me if I got anything wrong here.

  Maciej
Richard Sandiford - Feb. 4, 2012, 10:06 a.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>> I don't think changing the resolver function so that it preserves
>> fewer registers would be a good idea.  As well as the MIPS16 thing,
>> $3 is needed for _mcount.
>
>  I don't think _mcount has ever worked for dynamic libraries, has it? -- 
> please correct me if I am wrong, but I believe `gprof' relies on memory 
> segments to be contiguous which is certainly not the case for a dynamic 
> executable where you have a separate set of mappings for the executable 
> proper and then for each shared library loaded.
>
>  I didn't know it required $3 for anything either -- I've thought it was 
> $1 only.  How has it worked for standard MIPS code then?

Sorry, I misremembered, it was $2 (for the static chain register).
But you're probably right that it's not relevant.

>  We have only about now got MIPS16 shared libraries to work -- are you 
> sure removing code to save/restore $2/$3 in the dynamic linker is going to 
> hit anyone?

Pretty sure.  There are two separate points here.  Support for MIPS16
shared libraries went into the FSF tools in 2008, and I was able to
build working(!) shared libraries at that time.  (To be clear, these were
external libraries rather than things like libgcc.)  So it's not really
that new.

That doesn't mean that there weren't bugs in the implementation, of course.
But I think we're just going to have to agree to disagree on the
"working" thing.  It's probably moot anyway given the other point...

> While a small piece, this is still some wasted memory and 
> execution time for every executable on every system and whether MIPS16 
> code is involved or not (even on systems that do not support the MIPS16 
> ASE at all), just to cope with an ABI anomaly of literally four functions 
> only needed for some MIPS16 code (and which had not originally been 
> expected to be ever used for dynamic linking -- until recently nobody even 
> considered the use of MIPS16 code on a shared library system).  And I 
> think you can still get into trouble if you use the wrong ld.so with your 
> MIPS16 executable -- or has symbol versioning been used to ensure that 
> ld.so bails out in this case?

...to be clear, I'm talking here specifically about the behaviour of
the _PLT_ resolver function.  Only the PLT resolver function saves and
restores $2 and $3; the resolver for SVR4-style lazy binding stubs
doesn't.  (As I mentioned earlier, we're also not expecting the
resolver for the SVR4-style lazy binding stubs to preserve $2 and $3.)

MIPS PLTs are an extension to the original SVR4 ABI, so we were free
to choose the interface.  As you can tell from the glibc comments,
including $2 and $3 in the list was a deliberate decision, based on
the fact that there were already functions that would find it useful.
The PLT resolver has used this interface since the day it was added to
glibc (2008-10-01).  Its ABI has not changed, so there was no change
that would require an .so bump.  (Adding PLTs didn't itself require
an .so bump because old dynamic linkers would error out on them anyway.)

And because we're talking about PLTs -- which for MIPS are only used in
executables -- the question isn't really whether MIPS16 shared libraries
work or not.  It's a question of whether partly-MIPS16 dynamic executables
work.  And they do: this has been a regular part of my testing setup for
at least a couple of years now.

>> If there's still some concern that __mips16_rdhwr might not have
>> the right ABI, then maybe we should simply emit a link-once function
>> in each object that needs it.  We could then switch to another function
>> (and another API) without having to keep the old one in libgcc.a for
>> compatibility.  It would also avoid the -shared-libgcc thing.
>> 
>> Admittedly that's just an off-the-top-of-my-head idea. :-)
>> What do you think?
>
>  Actually I had that idea of a link-once function too, but it turned out 
> quite complicated to do without rewriting some generic parts of GCC as it 
> is currently not prepared to emit link-once functions outside C++ 
> compilations.  It's been a while and I did lots of other stuff meanwhile, 
> so please excuse me if I got anything wrong here.

Hmm, OK, I wouldn't have expected that.  But if you've tried making
__mips16_rdhwr link-once and had a bad experience with it, then yeah,
let's go with the hidden libgcc function.  It's just a shame that we're
having to force static linking of libgcc for this one case.

I'll take the relevant parts from Chung-Lin's patch and test them
over the weekend.

Richard
Maciej W. Rozycki - Feb. 6, 2012, 4:26 p.m.
On Sat, 4 Feb 2012, Richard Sandiford wrote:

> >  I don't think _mcount has ever worked for dynamic libraries, has it? -- 
> > please correct me if I am wrong, but I believe `gprof' relies on memory 
> > segments to be contiguous which is certainly not the case for a dynamic 
> > executable where you have a separate set of mappings for the executable 
> > proper and then for each shared library loaded.
> >
> >  I didn't know it required $3 for anything either -- I've thought it was 
> > $1 only.  How has it worked for standard MIPS code then?
> 
> Sorry, I misremembered, it was $2 (for the static chain register).

 Please elaborate anyway -- if you remember the details of $2 usage 
offhand, that is.  I'm curious.

> >  We have only about now got MIPS16 shared libraries to work -- are you 
> > sure removing code to save/restore $2/$3 in the dynamic linker is going to 
> > hit anyone?
> 
> Pretty sure.  There are two separate points here.  Support for MIPS16
> shared libraries went into the FSF tools in 2008, and I was able to
> build working(!) shared libraries at that time.  (To be clear, these were
> external libraries rather than things like libgcc.)  So it's not really
> that new.
> 
> That doesn't mean that there weren't bugs in the implementation, of course.
> But I think we're just going to have to agree to disagree on the
> "working" thing.  It's probably moot anyway given the other point...

 OK, perhaps soft-float (or if you happened to avoid FP altogether) could 
have worked.  With hard float we hit problems at least with symbol 
references emitted incorrectly by GCC in the context of some MIPS16 thunks 
as well as binutils failing to emit LA25 thunks for some MIPS16 thunks.  
So it wasn't just MIPS16 shared libraries that failed, but also dynamic 
MIPS16 executables.  So it looks to me like a part of the MIPS16 ABI 
wasn't covered at all (I'd expect all the "interesting" cases to have been 
deliberately tested before claiming victory).

 Chung-Lin may remember more details; I'd have to dig out old discussions.

> > While a small piece, this is still some wasted memory and 
> > execution time for every executable on every system and whether MIPS16 
> > code is involved or not (even on systems that do not support the MIPS16 
> > ASE at all), just to cope with an ABI anomaly of literally four functions 
> > only needed for some MIPS16 code (and which had not originally been 
> > expected to be ever used for dynamic linking -- until recently nobody even 
> > considered the use of MIPS16 code on a shared library system).  And I 
> > think you can still get into trouble if you use the wrong ld.so with your 
> > MIPS16 executable -- or has symbol versioning been used to ensure that 
> > ld.so bails out in this case?
> 
> ...to be clear, I'm talking here specifically about the behaviour of
> the _PLT_ resolver function.  Only the PLT resolver function saves and
> restores $2 and $3; the resolver for SVR4-style lazy binding stubs
> doesn't.  (As I mentioned earlier, we're also not expecting the
> resolver for the SVR4-style lazy binding stubs to preserve $2 and $3.)
> 
> MIPS PLTs are an extension to the original SVR4 ABI, so we were free
> to choose the interface.  As you can tell from the glibc comments,
> including $2 and $3 in the list was a deliberate decision, based on
> the fact that there were already functions that would find it useful.
> The PLT resolver has used this interface since the day it was added to
> glibc (2008-10-01).  Its ABI has not changed, so there was no change
> that would require an .so bump.  (Adding PLTs didn't itself require
> an .so bump because old dynamic linkers would error out on them anyway.)

 Fair enough -- I didn't realise that support for MIPS PLT and MIPS16 
dynamic executables was added at the same time.

> And because we're talking about PLTs -- which for MIPS are only used in
> executables -- the question isn't really whether MIPS16 shared libraries
> work or not.  It's a question of whether partly-MIPS16 dynamic executables
> work.  And they do: this has been a regular part of my testing setup for
> at least a couple of years now.

 Yet we had some problems to fix that made me believe it was work in 
progress that went upstream.  Sorry if that was unjust.

> >> If there's still some concern that __mips16_rdhwr might not have
> >> the right ABI, then maybe we should simply emit a link-once function
> >> in each object that needs it.  We could then switch to another function
> >> (and another API) without having to keep the old one in libgcc.a for
> >> compatibility.  It would also avoid the -shared-libgcc thing.
> >> 
> >> Admittedly that's just an off-the-top-of-my-head idea. :-)
> >> What do you think?
> >
> >  Actually I had that idea of a link-once function too, but it turned out 
> > quite complicated to do without rewriting some generic parts of GCC as it 
> > is currently not prepared to emit link-once functions outside C++ 
> > compilations.  It's been a while and I did lots of other stuff meanwhile, 
> > so please excuse me if I got anything wrong here.
> 
> Hmm, OK, I wouldn't have expected that.  But if you've tried making
> __mips16_rdhwr link-once and had a bad experience with it, then yeah,
> let's go with the hidden libgcc function.  It's just a shame that we're
> having to force static linking of libgcc for this one case.

 Well, it's just this single function that's pulled from libgcc.a -- all 
the rest will come from libgcc_s.so (unless hidden as well, that is).  
The benefit is you can always change the ABI of __mips16_rdhwr without 
worrying about existing executables -- they'll continue to work using 
their private piece of code unchanged.

  Maciej
Richard Sandiford - Feb. 6, 2012, 7:38 p.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>> >  We have only about now got MIPS16 shared libraries to work -- are you 
>> > sure removing code to save/restore $2/$3 in the dynamic linker is going to 
>> > hit anyone?
>> 
>> Pretty sure.  There are two separate points here.  Support for MIPS16
>> shared libraries went into the FSF tools in 2008, and I was able to
>> build working(!) shared libraries at that time.  (To be clear, these were
>> external libraries rather than things like libgcc.)  So it's not really
>> that new.
>> 
>> That doesn't mean that there weren't bugs in the implementation, of course.
>> But I think we're just going to have to agree to disagree on the
>> "working" thing.  It's probably moot anyway given the other point...
>
>  OK, perhaps soft-float (or if you happened to avoid FP altogether) could 
> have worked.  With hard float we hit problems at least with symbol 
> references emitted incorrectly by GCC in the context of some MIPS16 thunks 
> as well as binutils failing to emit LA25 thunks for some MIPS16 thunks.  
> So it wasn't just MIPS16 shared libraries that failed, but also dynamic 
> MIPS16 executables.  So it looks to me like a part of the MIPS16 ABI 
> wasn't covered at all (I'd expect all the "interesting" cases to have been 
> deliberately tested before claiming victory).

It was hard-float too FWIW.  As with all these things, some applications
just have a knack of avoiding certain bugs :-)

>> >> If there's still some concern that __mips16_rdhwr might not have
>> >> the right ABI, then maybe we should simply emit a link-once function
>> >> in each object that needs it.  We could then switch to another function
>> >> (and another API) without having to keep the old one in libgcc.a for
>> >> compatibility.  It would also avoid the -shared-libgcc thing.
>> >> 
>> >> Admittedly that's just an off-the-top-of-my-head idea. :-)
>> >> What do you think?
>> >
>> >  Actually I had that idea of a link-once function too, but it turned out 
>> > quite complicated to do without rewriting some generic parts of GCC as it 
>> > is currently not prepared to emit link-once functions outside C++ 
>> > compilations.  It's been a while and I did lots of other stuff meanwhile, 
>> > so please excuse me if I got anything wrong here.
>> 
>> Hmm, OK, I wouldn't have expected that.  But if you've tried making
>> __mips16_rdhwr link-once and had a bad experience with it, then yeah,
>> let's go with the hidden libgcc function.  It's just a shame that we're
>> having to force static linking of libgcc for this one case.
>
>  Well, it's just this single function that's pulled from libgcc.a -- all 
> the rest will come from libgcc_s.so (unless hidden as well, that is).  
> The benefit is you can always change the ABI of __mips16_rdhwr without 
> worrying about existing executables -- they'll continue to work using 
> their private piece of code unchanged.

We can't change the ABI of __mips16_rdhwr.  As I was saying before,
we need to keep this function around from now on so that static
libraries that contain objects built with gcc 4.7 can be linked
by later compilers (which will link against their own libgcc.a).
If we come up with another interface, the function will need to
have a new name, and we'll need to continue to provide the current
__mips16_rdhwr in libgcc.a for compatibility reasons.

Richard
Richard Sandiford - Feb. 6, 2012, 7:42 p.m.
[forking for different topics]

"Maciej W. Rozycki" <macro@codesourcery.com> writes:
> On Sat, 4 Feb 2012, Richard Sandiford wrote:
>> >  I don't think _mcount has ever worked for dynamic libraries, has it? -- 
>> > please correct me if I am wrong, but I believe `gprof' relies on memory 
>> > segments to be contiguous which is certainly not the case for a dynamic 
>> > executable where you have a separate set of mappings for the executable 
>> > proper and then for each shared library loaded.
>> >
>> >  I didn't know it required $3 for anything either -- I've thought it was 
>> > $1 only.  How has it worked for standard MIPS code then?
>> 
>> Sorry, I misremembered, it was $2 (for the static chain register).
>
>  Please elaborate anyway -- if you remember the details of $2 usage 
> offhand, that is.  I'm curious.

$2 was the traditional static chain pointer, and _mcount would preserve it
so that calls from nested functions would work correctly.  We since changed
GCC's own static pointer to $15, so we now have to move $15 to $2 before
calling _mcount and restore it afterwards, since there's no guarantee
that _mcount itself will preserve $15.

Or at least, that's the theory.  I have to admit to never using it in
"real world" situations.

Richard
Richard Henderson - Feb. 11, 2012, 2:41 a.m.
On 02/04/2012 02:06 AM, Richard Sandiford wrote:
>> >  Actually I had that idea of a link-once function too, but it turned out 
>> > quite complicated to do without rewriting some generic parts of GCC as it 
>> > is currently not prepared to emit link-once functions outside C++ 
>> > compilations.  It's been a while and I did lots of other stuff meanwhile, 
>> > so please excuse me if I got anything wrong here.
> Hmm, OK, I wouldn't have expected that.  But if you've tried making
> __mips16_rdhwr link-once and had a bad experience with it, then yeah,
> let's go with the hidden libgcc function.  It's just a shame that we're
> having to force static linking of libgcc for this one case.

The i386 target does it all the time for its __x86.get_pc_thunk.?x thingys.
So I really don't believe the "not prepared" comment.


r~
Maciej W. Rozycki - Feb. 11, 2012, 9:32 a.m.
On Sat, 11 Feb 2012, Richard Henderson wrote:

> On 02/04/2012 02:06 AM, Richard Sandiford wrote:
> >> >  Actually I had that idea of a link-once function too, but it turned out 
> >> > quite complicated to do without rewriting some generic parts of GCC as it 
> >> > is currently not prepared to emit link-once functions outside C++ 
> >> > compilations.  It's been a while and I did lots of other stuff meanwhile, 
> >> > so please excuse me if I got anything wrong here.
> > Hmm, OK, I wouldn't have expected that.  But if you've tried making
> > __mips16_rdhwr link-once and had a bad experience with it, then yeah,
> > let's go with the hidden libgcc function.  It's just a shame that we're
> > having to force static linking of libgcc for this one case.
> 
> The i386 target does it all the time for its __x86.get_pc_thunk.?x thingys.
> So I really don't believe the "not prepared" comment.

 I'm confused then, sorry -- Chung-Lin, can you look into it again then 
please?

  Maciej

Patch

Index: gcc/config/mips/mips-protos.h
===================================================================
--- gcc/config/mips/mips-protos.h	2012-01-08 08:43:13.000000000 +0000
+++ gcc/config/mips/mips-protos.h	2012-01-08 10:12:57.000000000 +0000
@@ -56,9 +56,6 @@  enum mips_symbol_context {
        The symbol's value will be calculated using a MIPS16 PC-relative
        calculation.
 
-   SYMBOL_FORCE_TO_MEM
-       The symbol's value must be forced to memory and loaded from there.
-
    SYMBOL_GOT_PAGE_OFST
        The symbol's value will be calculated by loading an address
        from the GOT and then applying a 16-bit offset.
@@ -94,9 +91,6 @@  enum mips_symbol_context {
        UNSPEC wrappers around SYMBOL_TLS, corresponding to the
        thread-local storage relocation operators.
 
-   SYMBOL_32_HIGH
-       For a 32-bit symbolic address X, this is the value of %hi(X).
-
    SYMBOL_64_HIGH
        For a 64-bit symbolic address X, this is the value of
        (%highest(X) << 16) + %higher(X).
@@ -116,7 +110,6 @@  enum mips_symbol_type {
   SYMBOL_ABSOLUTE,
   SYMBOL_GP_RELATIVE,
   SYMBOL_PC_RELATIVE,
-  SYMBOL_FORCE_TO_MEM,
   SYMBOL_GOT_PAGE_OFST,
   SYMBOL_GOT_DISP,
   SYMBOL_GOTOFF_PAGE,
@@ -129,7 +122,6 @@  enum mips_symbol_type {
   SYMBOL_DTPREL,
   SYMBOL_GOTTPREL,
   SYMBOL_TPREL,
-  SYMBOL_32_HIGH,
   SYMBOL_64_HIGH,
   SYMBOL_64_MID,
   SYMBOL_64_LOW,
@@ -260,6 +252,7 @@  extern void mips_push_asm_switch (struct
 extern void mips_pop_asm_switch (struct mips_asm_switch *);
 extern void mips_output_external (FILE *, tree, const char *);
 extern void mips_output_ascii (FILE *, const char *, size_t);
+extern const char *mips_output_tls_reloc_directive (rtx *);
 extern void mips_output_aligned_decl_common (FILE *, tree, const char *,
 					     unsigned HOST_WIDE_INT,
 					     unsigned int);
Index: gcc/config/mips/mips.h
===================================================================
--- gcc/config/mips/mips.h	2012-01-08 08:43:13.000000000 +0000
+++ gcc/config/mips/mips.h	2012-01-08 10:12:57.000000000 +0000
@@ -1785,8 +1785,6 @@  #define GLOBAL_POINTER_REGNUM (GP_REG_FI
    from there after reload.  */
 #define PIC_OFFSET_TABLE_REGNUM \
   (reload_completed ? REGNO (pic_offset_table_rtx) : GLOBAL_POINTER_REGNUM)
-
-#define PIC_FUNCTION_ADDR_REGNUM (GP_REG_FIRST + 25)
 
 /* Define the classes of registers for register constraints in the
    machine description.  Also define ranges of constants.
@@ -2868,6 +2866,9 @@  struct mips_asm_switch {
 extern int mips_dwarf_regno[];
 extern bool mips_split_p[];
 extern bool mips_split_hi_p[];
+extern bool mips_use_pcrel_pool_p[];
+extern const char *mips_lo_relocs[];
+extern const char *mips_hi_relocs[];
 extern enum processor mips_arch;        /* which cpu to codegen for */
 extern enum processor mips_tune;        /* which cpu to schedule for */
 extern int mips_isa;			/* architectural level */
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	2012-01-08 08:43:13.000000000 +0000
+++ gcc/config/mips/mips.c	2012-01-08 10:54:56.000000000 +0000
@@ -573,13 +573,17 @@  static GTY (()) int mips_output_filename
    can be split by mips_split_symbol.  */
 bool mips_split_hi_p[NUM_SYMBOL_TYPES];
 
+/* mips_use_pcrel_pool_p[X] is true if symbols of type X should be
+   forced into a PC-relative constant pool.  */
+bool mips_use_pcrel_pool_p[NUM_SYMBOL_TYPES];
+
 /* mips_lo_relocs[X] is the relocation to use when a symbol of type X
    appears in a LO_SUM.  It can be null if such LO_SUMs aren't valid or
    if they are matched by a special .md file pattern.  */
-static const char *mips_lo_relocs[NUM_SYMBOL_TYPES];
+const char *mips_lo_relocs[NUM_SYMBOL_TYPES];
 
 /* Likewise for HIGHs.  */
-static const char *mips_hi_relocs[NUM_SYMBOL_TYPES];
+const char *mips_hi_relocs[NUM_SYMBOL_TYPES];
 
 /* Target state for MIPS16.  */
 struct target_globals *mips16_globals;
@@ -1440,6 +1444,18 @@  mips_legitimate_constant_p (enum machine
   return mips_const_insns (x) > 0;
 }
 
+/* Return a SYMBOL_REF for a MIPS16 function called NAME.  */
+
+static rtx
+mips16_stub_function (const char *name)
+{
+  rtx x;
+
+  x = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name));
+  SYMBOL_REF_FLAGS (x) |= (SYMBOL_FLAG_EXTERNAL | SYMBOL_FLAG_FUNCTION);
+  return x;
+}
+
 /* Return true if symbols of type TYPE require a GOT access.  */
 
 static bool
@@ -1644,9 +1660,6 @@  mips_classify_symbol (const_rtx x, enum
       return SYMBOL_GOT_PAGE_OFST;
     }
 
-  if (TARGET_MIPS16_PCREL_LOADS && context != SYMBOL_CONTEXT_CALL)
-    return SYMBOL_FORCE_TO_MEM;
-
   return SYMBOL_ABSOLUTE;
 }
 
@@ -1709,8 +1722,6 @@  mips_symbolic_constant_p (rtx x, enum mi
   switch (*symbol_type)
     {
     case SYMBOL_ABSOLUTE:
-    case SYMBOL_FORCE_TO_MEM:
-    case SYMBOL_32_HIGH:
     case SYMBOL_64_HIGH:
     case SYMBOL_64_MID:
     case SYMBOL_64_LOW:
@@ -1776,6 +1787,17 @@  mips_symbolic_constant_p (rtx x, enum mi
 static int
 mips_symbol_insns_1 (enum mips_symbol_type type, enum machine_mode mode)
 {
+  if (mips_use_pcrel_pool_p[(int) type])
+    {
+      if (mode == MAX_MACHINE_MODE)
+	/* LEAs will be converted into constant-pool references by
+	   mips_reorg.  */
+	type = SYMBOL_PC_RELATIVE;
+      else
+	/* The constant must be loaded and then dereferenced.  */
+	return 0;
+    }
+
   switch (type)
     {
     case SYMBOL_ABSOLUTE:
@@ -1809,15 +1831,6 @@  mips_symbol_insns_1 (enum mips_symbol_ty
       /* The constant must be loaded using ADDIUPC or DADDIUPC first.  */
       return 0;
 
-    case SYMBOL_FORCE_TO_MEM:
-      /* LEAs will be converted into constant-pool references by
-	 mips_reorg.  */
-      if (mode == MAX_MACHINE_MODE)
-	return 1;
-
-      /* The constant must be loaded and then dereferenced.  */
-      return 0;
-
     case SYMBOL_GOT_DISP:
       /* The constant will have to be loaded from the GOT before it
 	 is used in an address.  */
@@ -1850,7 +1863,6 @@  mips_symbol_insns_1 (enum mips_symbol_ty
     case SYMBOL_GOTOFF_DISP:
     case SYMBOL_GOTOFF_CALL:
     case SYMBOL_GOTOFF_LOADGP:
-    case SYMBOL_32_HIGH:
     case SYMBOL_64_HIGH:
     case SYMBOL_64_MID:
     case SYMBOL_64_LOW:
@@ -1925,9 +1937,12 @@  mips_cannot_force_const_mem (enum machin
     return true;
 
   split_const (x, &base, &offset);
-  if (mips_symbolic_constant_p (base, SYMBOL_CONTEXT_LEA, &type)
-      && type != SYMBOL_FORCE_TO_MEM)
+  if (mips_symbolic_constant_p (base, SYMBOL_CONTEXT_LEA, &type))
     {
+      /* See whether we explicitly want these symbols in the pool.  */
+      if (mips_use_pcrel_pool_p[(int) type])
+	return false;
+
       /* The same optimization as for CONST_INT.  */
       if (SMALL_INT (offset) && mips_symbol_insns (type, MAX_MACHINE_MODE) > 0)
 	return true;
@@ -2822,13 +2837,18 @@  mips_call_tls_get_addr (rtx sym, enum mi
 static rtx
 mips_get_tp (void)
 {
-  rtx tp;
+  rtx tp, fn;
 
   tp = gen_reg_rtx (Pmode);
-  if (Pmode == DImode)
-    emit_insn (gen_tls_get_tp_di (tp));
+  if (TARGET_MIPS16)
+    {
+      fn = mips16_stub_function ("__mips16_rdhwr");
+      if (!call_insn_operand (fn, VOIDmode))
+	fn = force_reg (Pmode, fn);
+      emit_insn (PMODE_INSN (gen_tls_get_tp_mips16, (tp, fn)));
+    }
   else
-    emit_insn (gen_tls_get_tp_si (tp));
+    emit_insn (PMODE_INSN (gen_tls_get_tp, (tp)));
   return tp;
 }
 
@@ -2839,15 +2859,9 @@  mips_get_tp (void)
 static rtx
 mips_legitimize_tls_address (rtx loc)
 {
-  rtx dest, insn, v0, tp, tmp1, tmp2, eqv;
+  rtx dest, insn, v0, tp, tmp1, tmp2, eqv, offset;
   enum tls_model model;
 
-  if (TARGET_MIPS16)
-    {
-      sorry ("MIPS16 TLS");
-      return gen_reg_rtx (Pmode);
-    }
-
   model = SYMBOL_REF_TLS_MODEL (loc);
   /* Only TARGET_ABICALLS code can have more than one module; other
      code must be be static and should not use a GOT.  All TLS models
@@ -2875,9 +2889,15 @@  mips_legitimize_tls_address (rtx loc)
 			    UNSPEC_TLS_LDM);
       emit_libcall_block (insn, tmp1, v0, eqv);
 
-      tmp2 = mips_unspec_offset_high (NULL, tmp1, loc, SYMBOL_DTPREL);
-      dest = gen_rtx_LO_SUM (Pmode, tmp2,
-			     mips_unspec_address (loc, SYMBOL_DTPREL));
+      offset = mips_unspec_address (loc, SYMBOL_DTPREL);
+      if (mips_split_p[SYMBOL_DTPREL])
+	{
+	  tmp2 = mips_unspec_offset_high (NULL, tmp1, loc, SYMBOL_DTPREL);
+	  dest = gen_rtx_LO_SUM (Pmode, tmp2, offset);
+	}
+      else
+	dest = expand_binop (Pmode, add_optab, tmp1, offset,
+			     0, 0, OPTAB_DIRECT);
       break;
 
     case TLS_MODEL_INITIAL_EXEC:
@@ -2893,10 +2913,16 @@  mips_legitimize_tls_address (rtx loc)
       break;
 
     case TLS_MODEL_LOCAL_EXEC:
-      tp = mips_get_tp ();
-      tmp1 = mips_unspec_offset_high (NULL, tp, loc, SYMBOL_TPREL);
-      dest = gen_rtx_LO_SUM (Pmode, tmp1,
-			     mips_unspec_address (loc, SYMBOL_TPREL));
+      tmp1 = mips_get_tp ();
+      offset = mips_unspec_address (loc, SYMBOL_TPREL);
+      if (mips_split_p[SYMBOL_TPREL])
+	{
+	  tmp2 = mips_unspec_offset_high (NULL, tmp1, loc, SYMBOL_TPREL);
+	  dest = gen_rtx_LO_SUM (Pmode, tmp2, offset);
+	}
+      else
+	dest = expand_binop (Pmode, add_optab, tmp1, offset,
+			     0, 0, OPTAB_DIRECT);
       break;
 
     default:
@@ -5866,18 +5892,6 @@  struct mips16_stub {
 };
 static struct mips16_stub *mips16_stubs;
 
-/* Return a SYMBOL_REF for a MIPS16 function called NAME.  */
-
-static rtx
-mips16_stub_function (const char *name)
-{
-  rtx x;
-
-  x = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name));
-  SYMBOL_REF_FLAGS (x) |= (SYMBOL_FLAG_EXTERNAL | SYMBOL_FLAG_FUNCTION);
-  return x;
-}
-
 /* Return the two-character string that identifies floating-point
    return mode MODE in the name of a MIPS16 function stub.  */
 
@@ -7192,38 +7206,44 @@  mips_init_relocs (void)
 {
   memset (mips_split_p, '\0', sizeof (mips_split_p));
   memset (mips_split_hi_p, '\0', sizeof (mips_split_hi_p));
+  memset (mips_use_pcrel_pool_p, '\0', sizeof (mips_use_pcrel_pool_p));
   memset (mips_hi_relocs, '\0', sizeof (mips_hi_relocs));
   memset (mips_lo_relocs, '\0', sizeof (mips_lo_relocs));
 
-  if (ABI_HAS_64BIT_SYMBOLS)
+  if (TARGET_MIPS16_PCREL_LOADS)
+    mips_use_pcrel_pool_p[SYMBOL_ABSOLUTE] = true;
+  else
     {
-      if (TARGET_EXPLICIT_RELOCS)
+      if (ABI_HAS_64BIT_SYMBOLS)
 	{
-	  mips_split_p[SYMBOL_64_HIGH] = true;
-	  mips_hi_relocs[SYMBOL_64_HIGH] = "%highest(";
-	  mips_lo_relocs[SYMBOL_64_HIGH] = "%higher(";
-
-	  mips_split_p[SYMBOL_64_MID] = true;
-	  mips_hi_relocs[SYMBOL_64_MID] = "%higher(";
-	  mips_lo_relocs[SYMBOL_64_MID] = "%hi(";
-
-	  mips_split_p[SYMBOL_64_LOW] = true;
-	  mips_hi_relocs[SYMBOL_64_LOW] = "%hi(";
-	  mips_lo_relocs[SYMBOL_64_LOW] = "%lo(";
+	  if (TARGET_EXPLICIT_RELOCS)
+	    {
+	      mips_split_p[SYMBOL_64_HIGH] = true;
+	      mips_hi_relocs[SYMBOL_64_HIGH] = "%highest(";
+	      mips_lo_relocs[SYMBOL_64_HIGH] = "%higher(";
+
+	      mips_split_p[SYMBOL_64_MID] = true;
+	      mips_hi_relocs[SYMBOL_64_MID] = "%higher(";
+	      mips_lo_relocs[SYMBOL_64_MID] = "%hi(";
+
+	      mips_split_p[SYMBOL_64_LOW] = true;
+	      mips_hi_relocs[SYMBOL_64_LOW] = "%hi(";
+	      mips_lo_relocs[SYMBOL_64_LOW] = "%lo(";
 
-	  mips_split_p[SYMBOL_ABSOLUTE] = true;
-	  mips_lo_relocs[SYMBOL_ABSOLUTE] = "%lo(";
+	      mips_split_p[SYMBOL_ABSOLUTE] = true;
+	      mips_lo_relocs[SYMBOL_ABSOLUTE] = "%lo(";
+	    }
 	}
-    }
-  else
-    {
-      if (TARGET_EXPLICIT_RELOCS || mips_split_addresses_p () || TARGET_MIPS16)
+      else
 	{
-	  mips_split_p[SYMBOL_ABSOLUTE] = true;
-	  mips_hi_relocs[SYMBOL_ABSOLUTE] = "%hi(";
-	  mips_lo_relocs[SYMBOL_ABSOLUTE] = "%lo(";
-
-	  mips_lo_relocs[SYMBOL_32_HIGH] = "%hi(";
+	  if (TARGET_EXPLICIT_RELOCS
+	      || mips_split_addresses_p ()
+	      || TARGET_MIPS16)
+	    {
+	      mips_split_p[SYMBOL_ABSOLUTE] = true;
+	      mips_hi_relocs[SYMBOL_ABSOLUTE] = "%hi(";
+	      mips_lo_relocs[SYMBOL_ABSOLUTE] = "%lo(";
+	    }
 	}
     }
 
@@ -7291,16 +7311,23 @@  mips_init_relocs (void)
   mips_lo_relocs[SYMBOL_TLSGD] = "%tlsgd(";
   mips_lo_relocs[SYMBOL_TLSLDM] = "%tlsldm(";
 
-  mips_split_p[SYMBOL_DTPREL] = true;
-  mips_hi_relocs[SYMBOL_DTPREL] = "%dtprel_hi(";
-  mips_lo_relocs[SYMBOL_DTPREL] = "%dtprel_lo(";
-
-  mips_lo_relocs[SYMBOL_GOTTPREL] = "%gottprel(";
+  if (TARGET_MIPS16_PCREL_LOADS)
+    {
+      mips_use_pcrel_pool_p[SYMBOL_DTPREL] = true;
+      mips_use_pcrel_pool_p[SYMBOL_TPREL] = true;
+    }
+  else
+    {
+      mips_split_p[SYMBOL_DTPREL] = true;
+      mips_hi_relocs[SYMBOL_DTPREL] = "%dtprel_hi(";
+      mips_lo_relocs[SYMBOL_DTPREL] = "%dtprel_lo(";
 
-  mips_split_p[SYMBOL_TPREL] = true;
-  mips_hi_relocs[SYMBOL_TPREL] = "%tprel_hi(";
-  mips_lo_relocs[SYMBOL_TPREL] = "%tprel_lo(";
+      mips_split_p[SYMBOL_TPREL] = true;
+      mips_hi_relocs[SYMBOL_TPREL] = "%tprel_hi(";
+      mips_lo_relocs[SYMBOL_TPREL] = "%tprel_lo(";
+    }
 
+  mips_lo_relocs[SYMBOL_GOTTPREL] = "%gottprel(";
   mips_lo_relocs[SYMBOL_HALF] = "%half(";
 }
 
@@ -8082,6 +8109,29 @@  mips_output_ascii (FILE *stream, const c
   fprintf (stream, "\"\n");
 }
 
+/* Return the pseudo-op for full SYMBOL_(D)TPREL address *ADDR.
+   Update *ADDR with the operand that should be printed.  */
+
+const char *
+mips_output_tls_reloc_directive (rtx *addr)
+{
+  enum mips_symbol_type type;
+
+  type = mips_classify_symbolic_expression (*addr, SYMBOL_CONTEXT_LEA);
+  *addr = mips_strip_unspec_address (*addr);
+  switch (type)
+    {
+    case SYMBOL_DTPREL:
+      return Pmode == SImode ? ".dtprelword\t%0" : ".dtpreldword\t%0";
+
+    case SYMBOL_TPREL:
+      return Pmode == SImode ? ".tprelword\t%0" : ".tpreldword\t%0";
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Emit either a label, .comm, or .lcomm directive.  When using assembler
    macros, mark the symbol as written so that mips_asm_output_external
    won't emit an .extern for it.  STREAM is the output file, NAME is the
@@ -13783,6 +13833,10 @@  mips16_rewrite_pool_refs (rtx *x, void *
       return -1;
     }
 
+  /* Don't rewrite the __mips16_rdwr symbol.  */
+  if (GET_CODE (*x) == UNSPEC && XINT (*x, 1) == UNSPEC_TLS_GET_TP)
+    return -1;
+
   if (TARGET_MIPS16_TEXT_LOADS)
     mips16_rewrite_pool_constant (info->pool, x);
 
Index: gcc/config/mips/predicates.md
===================================================================
--- gcc/config/mips/predicates.md	2012-01-08 08:43:13.000000000 +0000
+++ gcc/config/mips/predicates.md	2012-01-08 10:12:57.000000000 +0000
@@ -288,12 +288,20 @@  (define_predicate "absolute_symbolic_ope
 	  && type == SYMBOL_ABSOLUTE);
 })
 
+(define_predicate "symbolic_operand_with_high"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum mips_symbol_type type;
+  return (mips_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &type)
+	  && mips_hi_relocs[(int) type]);
+})
+
 (define_predicate "force_to_mem_operand"
   (match_code "const,symbol_ref,label_ref")
 {
   enum mips_symbol_type symbol_type;
   return (mips_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &symbol_type)
-	  && symbol_type == SYMBOL_FORCE_TO_MEM);
+	  && mips_use_pcrel_pool_p[(int) symbol_type]);
 })
 
 (define_predicate "got_disp_operand"
@@ -312,6 +320,14 @@  (define_predicate "got_page_ofst_operand
 	  && type == SYMBOL_GOT_PAGE_OFST);
 })
 
+(define_predicate "tls_reloc_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum mips_symbol_type type;
+  return (mips_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &type)
+	  && (type == SYMBOL_DTPREL || type == SYMBOL_TPREL));
+})
+
 (define_predicate "symbol_ref_operand"
   (match_code "symbol_ref"))
 
Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md	2012-01-08 08:43:13.000000000 +0000
+++ gcc/config/mips/mips.md	2012-01-08 10:45:39.000000000 +0000
@@ -103,6 +103,7 @@  (define_c_enum "unspec" [
   UNSPEC_LOAD_GOT
   UNSPEC_TLS_LDM
   UNSPEC_TLS_GET_TP
+  UNSPEC_UNSHIFTED_HIGH
 
   ;; MIPS16 constant pools.
   UNSPEC_ALIGN
@@ -135,6 +136,7 @@  (define_c_enum "unspec" [
 
 (define_constants
   [(TLS_GET_TP_REGNUM		3)
+   (PIC_FUNCTION_ADDR_REGNUM	25)
    (RETURN_ADDR_REGNUM		31)
    (CPRESTORE_SLOT_REGNUM	76)
    (GOT_VERSION_REGNUM		79)
@@ -3924,14 +3926,19 @@  (define_insn_and_split "*lea64"
 ;;
 ;; on MIPS16 targets.
 (define_split
-  [(set (match_operand:SI 0 "d_operand")
-	(high:SI (match_operand:SI 1 "absolute_symbolic_operand")))]
+  [(set (match_operand:P 0 "d_operand")
+	(high:P (match_operand:P 1 "symbolic_operand_with_high")))]
   "TARGET_MIPS16 && reload_completed"
-  [(set (match_dup 0) (match_dup 2))
-   (set (match_dup 0) (ashift:SI (match_dup 0) (const_int 16)))]
-{
-  operands[2] = mips_unspec_address (operands[1], SYMBOL_32_HIGH);
-})
+  [(set (match_dup 0) (unspec:P [(match_dup 1)] UNSPEC_UNSHIFTED_HIGH))
+   (set (match_dup 0) (ashift:P (match_dup 0) (const_int 16)))])
+
+(define_insn "*unshifted_high"
+  [(set (match_operand:P 0 "d_operand" "=d")
+	(unspec:P [(match_operand:P 1 "symbolic_operand_with_high")]
+		  UNSPEC_UNSHIFTED_HIGH))]
+  ""
+  "li\t%0,%h1"
+  [(set_attr "extended_mips16" "yes")])
 
 ;; Insns to fetch a symbol from a big GOT.
 
@@ -6492,6 +6499,14 @@  (define_expand "mov<mode>cc"
 ;;  ....................
 ;;
 
+(define_insn "consttable_tls_reloc"
+  [(unspec_volatile [(match_operand 0 "tls_reloc_operand" "")
+		     (match_operand 1 "const_int_operand" "")]
+		    UNSPEC_CONSTTABLE_INT)]
+  "TARGET_MIPS16_PCREL_LOADS"
+  { return mips_output_tls_reloc_directive (&operands[0]); }
+  [(set (attr "length") (symbol_ref "INTVAL (operands[1])"))])
+
 (define_insn "consttable_int"
   [(unspec_volatile [(match_operand 0 "consttable_operand" "")
 		     (match_operand 1 "const_int_operand" "")]
@@ -6593,6 +6608,49 @@  (define_insn "*tls_get_tp_<mode>_split"
    ; See tls_get_tp_<mode>
    (set_attr "can_delay" "no")
    (set_attr "mode" "<MODE>")])
+
+;; In MIPS16 mode, the TLS base pointer is accessed by a
+;; libgcc helper function __mips16_rdhwr(), as 'rdhwr' is not
+;; accessible in MIPS16.
+;;
+;; This is not represented as a call insn, to avoid the
+;; unnecesarry clobbering of caller-save registers by a
+;; function consisting only of: "rdhwr $3,$29; j $31; nop;"
+;;
+;; A $25 clobber is added to cater for a $25 load stub added by the
+;; linker to __mips16_rdhwr when the call is made from non-PIC code.
+
+(define_insn_and_split "tls_get_tp_mips16_<mode>"
+  [(set (match_operand:P 0 "register_operand" "=d")
+	(unspec:P [(match_operand:P 1 "call_insn_operand" "dS")]
+		  UNSPEC_TLS_GET_TP))
+   (clobber (reg:P TLS_GET_TP_REGNUM))
+   (clobber (reg:P PIC_FUNCTION_ADDR_REGNUM))
+   (clobber (reg:P RETURN_ADDR_REGNUM))]
+  "HAVE_AS_TLS && TARGET_MIPS16"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (reg:P TLS_GET_TP_REGNUM)
+	  	   (unspec:P [(match_dup 1)] UNSPEC_TLS_GET_TP))
+	      (clobber (reg:P PIC_FUNCTION_ADDR_REGNUM))
+	      (clobber (reg:P RETURN_ADDR_REGNUM))])
+   (set (match_dup 0) (reg:P TLS_GET_TP_REGNUM))]
+  ""
+  [(set_attr "type" "multi")
+   (set_attr "length" "16")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*tls_get_tp_mips16_call_<mode>"
+  [(set (reg:P TLS_GET_TP_REGNUM)
+	(unspec:P [(match_operand:P 0 "call_insn_operand" "dS")]
+		  UNSPEC_TLS_GET_TP))
+   (clobber (reg:P PIC_FUNCTION_ADDR_REGNUM))
+   (clobber (reg:P RETURN_ADDR_REGNUM))]
+  "HAVE_AS_TLS && TARGET_MIPS16"
+  { return MIPS_CALL ("jal", operands, 0, -1); }
+  [(set_attr "type" "call")
+   (set_attr "length" "12")
+   (set_attr "mode" "<MODE>")])
 
 ;; Synchronization instructions.
 
Index: gcc/testsuite/gcc.target/mips/code-readable-2.c
===================================================================
--- gcc/testsuite/gcc.target/mips/code-readable-2.c	2012-01-08 10:13:04.000000000 +0000
+++ gcc/testsuite/gcc.target/mips/code-readable-2.c	2012-01-08 10:15:24.000000000 +0000
@@ -26,8 +26,7 @@  bar (void)
 
 /* { dg-final { scan-assembler-not "\tla\t" } } */
 /* { dg-final { scan-assembler-not "\t\\.half\t" } } */
-/* { dg-final { scan-assembler "%hi\\(\[^)\]*L" } } */
-/* { dg-final { scan-assembler "%lo\\(\[^)\]*L" } } */
+/* { dg-final { scan-assembler "\t\\.word\t\[^\n\]*L" } } */
 
 /* { dg-final { scan-assembler "\t\\.word\tk\n" } } */
 /* { dg-final { scan-assembler-not "%hi\\(k\\)" } } */
Index: libgcc/config/mips/libgcc-mips16.ver
===================================================================
--- libgcc/config/mips/libgcc-mips16.ver	2012-01-08 08:43:13.000000000 +0000
+++ libgcc/config/mips/libgcc-mips16.ver	2012-01-08 10:12:57.000000000 +0000
@@ -84,3 +84,7 @@  GCC_4.4.0 {
   __mips16_call_stub_dc_9
   __mips16_call_stub_dc_10
 }
+
+GCC_4.7.0 {
+  __mips16_rdhwr
+}
Index: libgcc/config/mips/mips16.S
===================================================================
--- libgcc/config/mips/mips16.S	2012-01-08 08:43:13.000000000 +0000
+++ libgcc/config/mips/mips16.S	2012-01-08 10:12:57.000000000 +0000
@@ -709,4 +709,15 @@  CALL_STUB_RET (__mips16_call_stub_dc_9,
 CALL_STUB_RET (__mips16_call_stub_dc_10, 10, DC)
 #endif
 #endif /* !__mips_single_float */
+
+#ifdef L_m16rdhwr
+STARTFN (__mips16_rdhwr)
+	.set	push
+	.set	mips32r2
+	.set	noreorder
+	rdhwr	$3,$29
+	.set	pop
+	j	$31
+	ENDFN (__mips16_rdhwr)
+#endif
 #endif
Index: libgcc/config/mips/t-mips16
===================================================================
--- libgcc/config/mips/t-mips16	2012-01-08 08:43:13.000000000 +0000
+++ libgcc/config/mips/t-mips16	2012-01-08 10:12:57.000000000 +0000
@@ -36,7 +36,8 @@  LIB1ASMFUNCS = _m16addsf3 _m16subsf3 _m1
 	_m16stubsc0 _m16stubsc1 _m16stubsc2 _m16stubsc5 _m16stubsc6 \
 	_m16stubsc9 _m16stubsc10 \
 	_m16stubdc0 _m16stubdc1 _m16stubdc2 _m16stubdc5 _m16stubdc6 \
-	_m16stubdc9 _m16stubdc10
+	_m16stubdc9 _m16stubdc10 \
+	_m16rdhwr
 
 SYNC = yes
 SYNC_CFLAGS = -mno-mips16