Message ID | 20130813194235.GJ1814@tucnak.redhat.com |
---|---|
State | New |
On Tue, Aug 13, 2013 at 9:42 PM, Jakub Jelinek <jakub@redhat.com> wrote:

> We right now ICE with -mcmodel=large -fpic on x86_64 on TLS GD and LD
> sequences, because obviously we can't call __tls_get_addr@plt there from code
> potentially more than 2GB away from the PLT slot.
>
> The attached patches add support for that in gcc and also teach the linker
> about those, because otherwise the linker will fail if you try to link such
> -mcmodel=large -fpic code into binaries or PIEs.
>
> To make transitions possible, we emit always
>         leaq foo@tlsgd(%rip), %rdi
>         movabsq $__tls_get_addr@pltoff, %rax
>         addq %rbx, %rax
>         call *%rax
> resp.
>         leaq foo@tlsld(%rip), %rdi
>         movabsq $__tls_get_addr@pltoff, %rax
>         addq %rbx, %rax
>         call *%rax
> sequences (22 bytes, 6 bytes longer than what we do for TLSGD for normal
> libraries).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, attached are also the
> sources I've used to test all the 3 different transitions.
>
> Ok for trunk and 4.8 branch (and binutils trunk)?

The implementation for x86 is technically OK, but I wonder if these
sequences should be documented in some authoritative document about
TLS relocations. The "ELF Handling For Thread-Local Storage" document
[1] doesn't mention the various code models for x86_64, so I was not able
to cross-check the implementation against the documentation.

[1] http://www.akkadia.org/drepper/tls.pdf

Thanks,
Uros.
On Wed, Aug 14, 2013 at 12:03 AM, Uros Bizjak <ubizjak@gmail.com> wrote:

> On Tue, Aug 13, 2013 at 9:42 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>
>> We right now ICE with -mcmodel=large -fpic on x86_64 on TLS GD and LD
>> sequences, because obviously we can't call __tls_get_addr@plt there from code
>> potentially more than 2GB away from the PLT slot.
>>
>> The attached patches add support for that in gcc and also teach the linker
>> about those, because otherwise the linker will fail if you try to link such
>> -mcmodel=large -fpic code into binaries or PIEs.
>>
>> To make transitions possible, we emit always
>>         leaq foo@tlsgd(%rip), %rdi
>>         movabsq $__tls_get_addr@pltoff, %rax
>>         addq %rbx, %rax
>>         call *%rax
>> resp.
>>         leaq foo@tlsld(%rip), %rdi
>>         movabsq $__tls_get_addr@pltoff, %rax
>>         addq %rbx, %rax
>>         call *%rax
>> sequences (22 bytes, 6 bytes longer than what we do for TLSGD for normal
>> libraries).
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, attached are also the
>> sources I've used to test all the 3 different transitions.
>>
>> Ok for trunk and 4.8 branch (and binutils trunk)?
>
> The implementation for x86 is technically OK, but I wonder if these
> sequences should be documented in some authoritative document about
> TLS relocations. The "ELF Handling For Thread-Local Storage" document
> [1] doesn't mention the various code models for x86_64, so I was not able
> to cross-check the implementation against the documentation.
>
> [1] http://www.akkadia.org/drepper/tls.pdf
>

I agree. We need to document the TLS code sequences for PIC/non-PIC
medium/large models first.
On Wed, Aug 14, 2013 at 09:03:24AM +0200, Uros Bizjak wrote:

> The implementation for x86 is technically OK, but I wonder if these
> sequences should be documented in some authoritative document about
> TLS relocations. The "ELF Handling For Thread-Local Storage" document
> [1] doesn't mention the various code models for x86_64, so I was not able
> to cross-check the implementation against the documentation.
>
> [1] http://www.akkadia.org/drepper/tls.pdf

Ping, are the patches ok for gcc trunk and binutils trunk?
Uli has kindly updated the docs some time ago.

Jakub
On Wed, Aug 28, 2013 at 11:37 AM, Jakub Jelinek <jakub@redhat.com> wrote:

> On Wed, Aug 14, 2013 at 09:03:24AM +0200, Uros Bizjak wrote:
>> The implementation for x86 is technically OK, but I wonder if these
>> sequences should be documented in some authoritative document about
>> TLS relocations. The "ELF Handling For Thread-Local Storage" document
>> [1] doesn't mention the various code models for x86_64, so I was not able
>> to cross-check the implementation against the documentation.
>>
>> [1] http://www.akkadia.org/drepper/tls.pdf
>
> Ping, are the patches ok for gcc trunk and binutils trunk?
> Uli has kindly updated the docs some time ago.

OK for gcc.

Thanks,
Uros.
On Wed, Aug 28, 2013 at 2:37 AM, Jakub Jelinek <jakub@redhat.com> wrote:

> On Wed, Aug 14, 2013 at 09:03:24AM +0200, Uros Bizjak wrote:
>> The implementation for x86 is technically OK, but I wonder if these
>> sequences should be documented in some authoritative document about
>> TLS relocations. The "ELF Handling For Thread-Local Storage" document
>> [1] doesn't mention the various code models for x86_64, so I was not able
>> to cross-check the implementation against the documentation.
>>
>> [1] http://www.akkadia.org/drepper/tls.pdf
>
> Ping, are the patches ok for gcc trunk and binutils trunk?
> Uli has kindly updated the docs some time ago.
>

Linker change is OK with testcases for GD and LD.

Thanks.
--- gcc/config/i386/i386.md.jj 2013-08-13 12:20:20.000000000 +0200
+++ gcc/config/i386/i386.md 2013-08-13 15:03:55.632194607 +0200
@@ -12303,11 +12303,33 @@ (define_insn "*tls_global_dynamic_64_<mo
    (set (attr "length")
         (symbol_ref "TARGET_X32 ? 15 : 16"))])
 
+(define_insn "*tls_global_dynamic_64_largepic"
+  [(set (match_operand:DI 0 "register_operand" "=a")
+        (call:DI
+         (mem:QI (plus:DI (match_operand:DI 2 "register_operand" "b")
+                          (match_operand:DI 3 "immediate_operand" "i")))
+         (match_operand 4)))
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand")]
+              UNSPEC_TLS_GD)]
+  "TARGET_64BIT && ix86_cmodel == CM_LARGE_PIC && !TARGET_PECOFF
+   && GET_CODE (operands[3]) == CONST
+   && GET_CODE (XEXP (operands[3], 0)) == UNSPEC
+   && XINT (XEXP (operands[3], 0), 1) == UNSPEC_PLTOFF"
+{
+  output_asm_insn
+    ("lea{q}\t{%E1@tlsgd(%%rip), %%rdi|rdi, %E1@tlsgd[rip]}", operands);
+  output_asm_insn ("movabs{q}\t{%3, %%rax|rax, %3}", operands);
+  output_asm_insn ("add{q}\t{%2, %%rax|rax, %2}", operands);
+  return "call\t{*%%rax|rax}";
+}
+  [(set_attr "type" "multi")
+   (set_attr "length" "22")])
+
 (define_expand "tls_global_dynamic_64_<mode>"
   [(parallel
     [(set (match_operand:P 0 "register_operand")
           (call:P
-           (mem:QI (match_operand 2 "constant_call_address_operand"))
+           (mem:QI (match_operand 2))
            (const_int 0)))
      (unspec:P [(match_operand 1 "tls_symbolic_operand")]
                UNSPEC_TLS_GD)])]
@@ -12365,11 +12387,32 @@ (define_insn "*tls_local_dynamic_base_64
   [(set_attr "type" "multi")
    (set_attr "length" "12")])
 
+(define_insn "*tls_local_dynamic_base_64_largepic"
+  [(set (match_operand:DI 0 "register_operand" "=a")
+        (call:DI
+         (mem:QI (plus:DI (match_operand:DI 1 "register_operand" "b")
+                          (match_operand:DI 2 "immediate_operand" "i")))
+         (match_operand 3)))
+   (unspec:DI [(const_int 0)] UNSPEC_TLS_LD_BASE)]
+  "TARGET_64BIT && ix86_cmodel == CM_LARGE_PIC && !TARGET_PECOFF
+   && GET_CODE (operands[2]) == CONST
+   && GET_CODE (XEXP (operands[2], 0)) == UNSPEC
+   && XINT (XEXP (operands[2], 0), 1) == UNSPEC_PLTOFF"
+{
+  output_asm_insn
+    ("lea{q}\t{%&@tlsld(%%rip), %%rdi|rdi, %&@tlsld[rip]}", operands);
+  output_asm_insn ("movabs{q}\t{%2, %%rax|rax, %2}", operands);
+  output_asm_insn ("add{q}\t{%1, %%rax|rax, %1}", operands);
+  return "call\t{*%%rax|rax}";
+}
+  [(set_attr "type" "multi")
+   (set_attr "length" "22")])
+
 (define_expand "tls_local_dynamic_base_64_<mode>"
   [(parallel
     [(set (match_operand:P 0 "register_operand")
           (call:P
-           (mem:QI (match_operand 1 "constant_call_address_operand"))
+           (mem:QI (match_operand 1))
            (const_int 0)))
      (unspec:P [(const_int 0)] UNSPEC_TLS_LD_BASE)])]
   "TARGET_64BIT")
--- gcc/config/i386/i386.c.jj 2013-08-13 12:20:20.000000000 +0200
+++ gcc/config/i386/i386.c 2013-08-13 14:42:32.449334139 +0200
@@ -13220,6 +13220,14 @@ ix86_tls_get_addr (void)
       ix86_tls_symbol = gen_rtx_SYMBOL_REF (Pmode, sym);
     }
 
+  if (ix86_cmodel == CM_LARGE_PIC && !TARGET_PECOFF)
+    {
+      rtx unspec = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, ix86_tls_symbol),
+                                   UNSPEC_PLTOFF);
+      return gen_rtx_PLUS (Pmode, pic_offset_table_rtx,
+                           gen_rtx_CONST (Pmode, unspec));
+    }
+
   return ix86_tls_symbol;
 }
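
For readers who want to exercise the new patterns, here is a minimal sketch of the kind of source that should reach them. It is not the testcase attached to the patch. The file name tls-large.c, the function names, and the build line are assumptions made for illustration. Which TLS model GCC picks for each variable depends on symbol visibility and optimization level, so the GD/LD labels in the comments are expectations, not guarantees.

```c
/* Hypothetical TLS exercise; NOT the testcase attached to the patch.
   Assumed build line: gcc -O2 -mcmodel=large -fpic -S tls-large.c */
#include <assert.h>

/* Externally visible TLS variable: a candidate for the global-dynamic
   (GD) access model when compiled with -fpic. */
__thread int gd_counter = 1;

/* File-local TLS variables: candidates for the local-dynamic (LD)
   access model when optimizing with -fpic. */
static __thread int ld_a = 2;
static __thread int ld_b = 3;

/* Increments the GD candidate and returns the new value. */
int bump_gd(void) { return ++gd_counter; }

/* Reads both LD candidates; one __tls_get_addr call may serve both. */
int sum_ld(void) { return ld_a + ld_b; }

/* Resets the variables to known values and checks both accessors.
   Returns 0 on success. */
int tls_selfcheck(void)
{
    gd_counter = 41;
    ld_a = 2;
    ld_b = 3;
    assert(bump_gd() == 42);
    assert(sum_ld() == 5);
    return 0;
}
```

With the patch applied, one would expect the assembly for bump_gd and sum_ld to contain the leaq / movabsq $__tls_get_addr@pltoff / addq / call *%rax sequence quoted in the first message instead of an ICE. Linking the object into an executable or PIE should then exercise the linker transitions, assuming a binutils build that includes the companion change.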