Message ID | CAFULd4bfEG1m8hTc=OFRg9UWdKkU-S+hfWxJTZYDqX6XG1d87A@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Mar 26, 2012 at 10:25 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > Hello! > > In a corner case of a reload, reload pass can generate partially > reloaded address, where not all registers get allocated to a hard reg. > When this address is checked with ix86_legitimate_address, it is > rejected, since in strict mode, pseudos are not valid address > registers. So, reload tries to legitimize following (partially > invalid) address: > > (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) > (reg:DI 97)) > (reg:DI 2 cx)) > > via generic way by pushing all components into a register, forming > (even more invalid) address that involves three registers (r8, r9 and > rcx): > > (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8) > (reg:DI 38 r9)) > (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8]) > (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal} > (nil)) > > BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2... > > Attached patch fixes this situation by (partially) reloading only > remaining pseudo(s), leaving UNSPEC in the address RTX. > > 2012-03-26 Uros Bizjak <ubizjak@gmail.com> > > PR target/52698 > * config/i386/i386-protos.h (ix86_legitimize_reload_address): > New prototype. > * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define. > * config/i386/i386.c: Include reload.h. > (ix86_legitimize_reload_address): New function. > > testsuite/ChangeLog: > > 2012-03-26 Uros Bizjak <ubizjak@gmail.com> > H.J. Lu <hongjiu.lu@intel.com> > > PR target/52698 > * gcc.target/i386/pr52698.c: New test. > > The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. > > Since fixing reload issues is some kind of black magic, I'd like to > ask Ulrich and Richard for their opinion on this approach. > > H.J., can you please test this fix on x32 testsuite for both address modes? > I am on it.
On Mon, Mar 26, 2012 at 10:41 AM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Mon, Mar 26, 2012 at 10:25 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >> Hello! >> >> In a corner case of a reload, reload pass can generate partially >> reloaded address, where not all registers get allocated to a hard reg. >> When this address is checked with ix86_legitimate_address, it is >> rejected, since in strict mode, pseudos are not valid address >> registers. So, reload tries to legitimize following (partially >> invalid) address: >> >> (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) >> (reg:DI 97)) >> (reg:DI 2 cx)) >> >> via generic way by pushing all components into a register, forming >> (even more invalid) address that involves three registers (r8, r9 and >> rcx): >> >> (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8) >> (reg:DI 38 r9)) >> (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8]) >> (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal} >> (nil)) >> >> BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2... >> >> Attached patch fixes this situation by (partially) reloading only >> remaining pseudo(s), leaving UNSPEC in the address RTX. >> >> 2012-03-26 Uros Bizjak <ubizjak@gmail.com> >> >> PR target/52698 >> * config/i386/i386-protos.h (ix86_legitimize_reload_address): >> New prototype. >> * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define. >> * config/i386/i386.c: Include reload.h. >> (ix86_legitimize_reload_address): New function. >> >> testsuite/ChangeLog: >> >> 2012-03-26 Uros Bizjak <ubizjak@gmail.com> >> H.J. Lu <hongjiu.lu@intel.com> >> >> PR target/52698 >> * gcc.target/i386/pr52698.c: New test. >> >> The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. >> >> Since fixing reload issues is some kind of black magic, I'd like to >> ask Ulrich and Richard for their opinion on this approach. >> >> H.J., can you please test this fix on x32 testsuite for both address modes? >> > > I am on it. There are no regressions in gcc x32 testsuite nor glibc tests for both address modes. But I didn't check code quality nor SPEC CPU performance.
On Mon, Mar 26, 2012 at 9:03 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >>> Attached patch fixes this situation by (partially) reloading only >>> remaining pseudo(s), leaving UNSPEC in the address RTX. >>> >>> 2012-03-26 Uros Bizjak <ubizjak@gmail.com> >>> >>> PR target/52698 >>> * config/i386/i386-protos.h (ix86_legitimize_reload_address): >>> New prototype. >>> * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define. >>> * config/i386/i386.c: Include reload.h. >>> (ix86_legitimize_reload_address): New function. >>> >>> testsuite/ChangeLog: >>> >>> 2012-03-26 Uros Bizjak <ubizjak@gmail.com> >>> H.J. Lu <hongjiu.lu@intel.com> >>> >>> PR target/52698 >>> * gcc.target/i386/pr52698.c: New test. >>> >>> The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. >>> >>> Since fixing reload issues is some kind of black magic, I'd like to >>> ask Ulrich and Richard for their opinion on this approach. >>> >>> H.J., can you please test this fix on x32 testsuite for both address modes? >>> >> >> I am on it. > > There are no regressions in gcc x32 testsuite nor glibc tests for > both address modes. But I didn't check code quality nor SPEC CPU > performance. This fixup should not trigger often (if at all), so there should be no effect on performance. So, committed. Uros.
On Tue, Mar 27, 2012 at 6:13 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote: >> Since fixing reload issues is some kind of black magic, I'd like to >> ask Ulrich and Richard for their opinion on this approach. > > Well, generally speaking, reload makes a lot of assumptions on how > addresses can look like; it needs to, since if a target rejects an > address as invalid as-is, reload must fix it up -- and it can do > so only by making assumptions ... > > Allowing a random UNSPEC as part of valid (non-strict) addresses > makes it really impossible for reload to understand what's going > on. Given that, I'd tend to agree that *if* you do that, you > then also have to help reload how to deal with such addresses > by providing a legitimize_reload_address hook as you did. > > Now, in this particular case, there might be another option to > avoid this hassle completely: I understand that this UNSPEC is > simply a magic marker to make the address use the fs: or gs: > segment override, right? Now that GCC supports address spaces, > it might be possible to model fs:/gs: relative addresses instead > by using a non-standard address space ... This is an interesting idea, I will look into it. (BTW: For now, I have just committed the proposed fixup). Thanks, Uros.
On Tue, Mar 27, 2012 at 9:37 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Mar 27, 2012 at 6:13 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote: > >>> Since fixing reload issues is some kind of black magic, I'd like to >>> ask Ulrich and Richard for their opinion on this approach. >> >> Well, generally speaking, reload makes a lot of assumptions on how >> addresses can look like; it needs to, since if a target rejects an >> address as invalid as-is, reload must fix it up -- and it can do >> so only by making assumptions ... >> >> Allowing a random UNSPEC as part of valid (non-strict) addresses >> makes it really impossible for reload to understand what's going >> on. Given that, I'd tend to agree that *if* you do that, you >> then also have to help reload how to deal with such addresses >> by providing a legitimize_reload_address hook as you did. >> >> Now, in this particular case, there might be another option to >> avoid this hassle completely: I understand that this UNSPEC is >> simply a magic marker to make the address use the fs: or gs: >> segment override, right? Now that GCC supports address spaces, >> it might be possible to model fs:/gs: relative addresses instead >> by using a non-standard address space ... > > This is an interesting idea, I will look into it. As I explained in: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01590.html we can remove *load_tp_x32_zext and use (define_insn "*load_tp_x32_<mode>" [(set (match_operand:SWI48x 0 "register_operand" "=r") (unspec:SWI48x [(const_int 0)] UNSPEC_TP))] "TARGET_X32" "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" [(set_attr "type" "imov") (set_attr "modrm" "0") (set_attr "length" "7") (set_attr "memory" "load") (set_attr "imm_disp" "false")]) to load %fs directly into %r32 or %r64.
On Tue, Mar 27, 2012 at 6:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >>> Well, generally speaking, reload makes a lot of assumptions on how >>> addresses can look like; it needs to, since if a target rejects an >>> address as invalid as-is, reload must fix it up -- and it can do >>> so only by making assumptions ... >>> >>> Allowing a random UNSPEC as part of valid (non-strict) addresses >>> makes it really impossible for reload to understand what's going >>> on. Given that, I'd tend to agree that *if* you do that, you >>> then also have to help reload how to deal with such addresses >>> by providing a legitimize_reload_address hook as you did. >>> >>> Now, in this particular case, there might be another option to >>> avoid this hassle completely: I understand that this UNSPEC is >>> simply a magic marker to make the address use the fs: or gs: >>> segment override, right? Now that GCC supports address spaces, >>> it might be possible to model fs:/gs: relative addresses instead >>> by using a non-standard address space ... >> >> This is an interesting idea, I will look into it. > > As I explained in: > > http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01590.html > > we can remove *load_tp_x32_zext and use > > (define_insn "*load_tp_x32_<mode>" > [(set (match_operand:SWI48x 0 "register_operand" "=r") > (unspec:SWI48x [(const_int 0)] UNSPEC_TP))] > "TARGET_X32" > "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" > [(set_attr "type" "imov") > (set_attr "modrm" "0") > (set_attr "length" "7") > (set_attr "memory" "load") > (set_attr "imm_disp" "false")]) > > to load %fs directly into %r32 or %r64. No, we can't. Uros.
On 03/27/12 09:37, Uros Bizjak wrote: >> > Now, in this particular case, there might be another option to >> > avoid this hassle completely: I understand that this UNSPEC is >> > simply a magic marker to make the address use the fs: or gs: >> > segment override, right? Now that GCC supports address spaces, >> > it might be possible to model fs:/gs: relative addresses instead >> > by using a non-standard address space ... > This is an interesting idea, I will look into it. Generallized segment overrides via non-standard address spaces has been on my to-do list for quite a while... r~
On Tue, Mar 27, 2012 at 9:57 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Mar 27, 2012 at 6:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > >>>> Well, generally speaking, reload makes a lot of assumptions on how >>>> addresses can look like; it needs to, since if a target rejects an >>>> address as invalid as-is, reload must fix it up -- and it can do >>>> so only by making assumptions ... >>>> >>>> Allowing a random UNSPEC as part of valid (non-strict) addresses >>>> makes it really impossible for reload to understand what's going >>>> on. Given that, I'd tend to agree that *if* you do that, you >>>> then also have to help reload how to deal with such addresses >>>> by providing a legitimize_reload_address hook as you did. >>>> >>>> Now, in this particular case, there might be another option to >>>> avoid this hassle completely: I understand that this UNSPEC is >>>> simply a magic marker to make the address use the fs: or gs: >>>> segment override, right? Now that GCC supports address spaces, >>>> it might be possible to model fs:/gs: relative addresses instead >>>> by using a non-standard address space ... >>> >>> This is an interesting idea, I will look into it. >> >> As I explained in: >> >> http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01590.html >> >> we can remove *load_tp_x32_zext and use >> >> (define_insn "*load_tp_x32_<mode>" >> [(set (match_operand:SWI48x 0 "register_operand" "=r") >> (unspec:SWI48x [(const_int 0)] UNSPEC_TP))] >> "TARGET_X32" >> "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" >> [(set_attr "type" "imov") >> (set_attr "modrm" "0") >> (set_attr "length" "7") >> (set_attr "memory" "load") >> (set_attr "imm_disp" "false")]) >> >> to load %fs directly into %r32 or %r64. > > No, we can't. > GCC needs to move the value in the %fs segment register into %r32 or %r64. This instruction "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" does exactly what GCC wants.
On Tue, Mar 27, 2012 at 7:28 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > GCC needs to move the value in the %fs segment > register into %r32 or %r64. This instruction > > "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" > > does exactly what GCC wants. Sorry, I really don't understand what you are trying to say. You are loading ptrmode (so, void *) pointer from %fs:0 to a DImode register. If you use movl, you can say that this instruction zero extends the value (void *, ptrmode, SImode) from a memory location pointed by %fs:0 to a DImode register. Please note the difference between: movl %fs:0, %eax and movl %fs, %eax. BTW: %fs is a 16bit register. Uros.
On Tue, Mar 27, 2012 at 10:53 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Mar 27, 2012 at 7:28 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > >> GCC needs to move the value in the %fs segment >> register into %r32 or %r64. This instruction >> >> "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" >> >> does exactly what GCC wants. > > Sorry, I really don't understand what you are trying to say. > > You are loading ptrmode (so, void *) pointer from %fs:0 to a DImode > register. If you use movl, you can say that this instruction zero > extends the value (void *, ptrmode, SImode) from a memory location > pointed by %fs:0 to a DImode register. Please note the difference > between: > > movl %fs:0, %eax > > and > > movl %fs, %eax. > > BTW: %fs is a 16bit register. > %fs and %gs are special in 64bit mode. For a memory operand "%fs:address", its effective address is base address of %fs + address. The base address of %fs and %fs are hidden. "mov %fs, %eax" will only access the visible part of %fs, which is the 16bit segment selector. In 64bit mode, UNSPEC_TP is the base address of %fs. To access the base address of %fs, we can use system call: int arch_prctl(int code, unsigned long addr); int arch_prctl(int code, unsigned long *addr); ARCH_SET_FS Set the 64-bit base for the FS register to addr. ARCH_GET_FS Return the 64-bit base value for the FS register of the current thread in the unsigned long pointed to by addr. BTW, 4 new instructions are added to read/write base address of %fs/%gs directly. For now, we have to use the system call to update base address of %fs, To read the base address of %fs, OS arranges that the base address of %fs points to a struct: typedef struct { void *tcb; /* Pointer to the TCB. Not necessarily the thread descriptor used by libpthread. */ ... } and sets up tcb == the base address of %fs. Then we can use "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" to move the base address of %fs into %r32 and %r64 directly. I hope this answers your questions. Thanks.
On Tue, Mar 27, 2012 at 8:34 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > %fs and %gs are special in 64bit mode. For a memory operand > "%fs:address", its effective address is base address of %fs + address. > The base address of %fs and %fs are hidden. "mov %fs, %eax" > will only access the visible part of %fs, which is the 16bit segment > selector. In 64bit mode, UNSPEC_TP is the base address of %fs. > To access the base address of %fs, we can use system call: > > int arch_prctl(int code, unsigned long addr); > int arch_prctl(int code, unsigned long *addr); > > ARCH_SET_FS > Set the 64-bit base for the FS register to addr. > > ARCH_GET_FS > Return the 64-bit base value for the FS register of the > current thread in the unsigned long pointed to by addr. > > BTW, 4 new instructions are added to read/write base address of > %fs/%gs directly. For now, we have to use the system call to > update base address of %fs, To read the base address of %fs, > OS arranges that the base address of %fs points to a struct: > > typedef struct > { > void *tcb; /* Pointer to the TCB. Not necessarily the > thread descriptor used by libpthread. */ > ... > } > > and sets up tcb == the base address of %fs. Then we can use > > "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}" > > to move the base address of %fs into %r32 and %r64 directly. > I hope this answers your questions. Let me say this way: please propose the patch that implements your suggestions. I will leave the approval of the patch to someone else. Uros.
On Tue, Mar 27, 2012 at 7:20 PM, Richard Henderson <rth@redhat.com> wrote: > On 03/27/12 09:37, Uros Bizjak wrote: >>> > Now, in this particular case, there might be another option to >>> > avoid this hassle completely: I understand that this UNSPEC is >>> > simply a magic marker to make the address use the fs: or gs: >>> > segment override, right? Now that GCC supports address spaces, >>> > it might be possible to model fs:/gs: relative addresses instead >>> > by using a non-standard address space ... >> This is an interesting idea, I will look into it. > > Generallized segment overrides via non-standard address spaces > has been on my to-do list for quite a while... What about release branches? While this issue didn't trigger there yet, it can be triggered by some bad RA decision, and the fixup is missing. Uros.
On 03/28/2012 02:40 AM, Uros Bizjak wrote: > What about release branches? While this issue didn't trigger there > yet, it can be triggered by some bad RA decision, and the fixup is > missing. I have no objection to your current patch being backported. It looked fairly safe. r~
On Tue, Mar 27, 2012 at 7:20 PM, Richard Henderson <rth@redhat.com> wrote: > On 03/27/12 09:37, Uros Bizjak wrote: >>> > Now, in this particular case, there might be another option to >>> > avoid this hassle completely: I understand that this UNSPEC is >>> > simply a magic marker to make the address use the fs: or gs: >>> > segment override, right? Now that GCC supports address spaces, >>> > it might be possible to model fs:/gs: relative addresses instead >>> > by using a non-standard address space ... >> This is an interesting idea, I will look into it. > > Generallized segment overrides via non-standard address spaces > has been on my to-do list for quite a while... I have looked a bit to implement this functionality, but address spaces are somehow coupled with a keyword that describes special address space. I guess we can't just reuse __thread keyword for this purpose. Ulrich, can you please provide some guidelines on how you think this proposed functionality should be implemented? Thanks, Uros.
Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 185807) +++ config/i386/i386.c (working copy) @@ -47,6 +47,7 @@ #include "target-def.h" #include "common/common-target.h" #include "langhooks.h" +#include "reload.h" #include "cgraph.h" #include "gimple.h" #include "dwarf2.h" @@ -12010,6 +12011,64 @@ return false; } +/* Our implementation of LEGITIMIZE_RELOAD_ADDRESS. Returns a value to + replace the input X, or the original X if no replacement is called for. + The output parameter *WIN is 1 if the calling macro should goto WIN, + 0 if it should not. */ + +bool +ix86_legitimize_reload_address (rtx x, + enum machine_mode mode ATTRIBUTE_UNUSED, + int opnum, int type, + int ind_levels ATTRIBUTE_UNUSED) +{ + /* Reload can generate: + + (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) + (reg:DI 97)) + (reg:DI 2 cx)) + + This RTX is rejected from ix86_legitimate_address_p due to + non-strictness of base register 97. Following this rejection, + reload pushes all three components into separate registers, + creating invalid memory address RTX. + + Following code reloads only the invalid part of the + memory address RTX. */ + + if (GET_CODE (x) == PLUS + && REG_P (XEXP (x, 1)) + && GET_CODE (XEXP (x, 0)) == PLUS + && REG_P (XEXP (XEXP (x, 0), 1))) + { + rtx base, index; + bool something_reloaded = false; + + base = XEXP (XEXP (x, 0), 1); + if (!REG_OK_FOR_BASE_STRICT_P (base)) + { + push_reload (base, NULL_RTX, &XEXP (XEXP (x, 0), 1), NULL, + BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, + opnum, (enum reload_type)type); + something_reloaded = true; + } + + index = XEXP (x, 1); + if (!REG_OK_FOR_INDEX_STRICT_P (index)) + { + push_reload (index, NULL_RTX, &XEXP (x, 1), NULL, + INDEX_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, + opnum, (enum reload_type)type); + something_reloaded = true; + } + + gcc_assert (something_reloaded); + return true; + } + + return false; +} + /* Recognizes RTL expressions that are valid memory addresses for an instruction. The MODE argument is the machine mode for the MEM expression that wants to use this address. Index: config/i386/i386.h =================================================================== --- config/i386/i386.h (revision 185807) +++ config/i386/i386.h (working copy) @@ -1630,6 +1630,17 @@ #define CONSTANT_ADDRESS_P(X) constant_address_p (X) +/* Try a machine-dependent way of reloading an illegitimate address + operand. If we find one, push the reload and jump to WIN. This + macro is used in only one place: `find_reloads_address' in reload.c. */ + +#define LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, INDL, WIN) \ +do { \ + if (ix86_legitimize_reload_address ((X), (MODE), (OPNUM), \ + (int)(TYPE), (INDL))) \ + goto WIN; \ +} while (0) + /* If defined, a C expression to determine the base term of address X. This macro is used in only one place: `find_base_term' in alias.c. Index: config/i386/i386-protos.h =================================================================== --- config/i386/i386-protos.h (revision 185807) +++ config/i386/i386-protos.h (working copy) @@ -65,7 +65,8 @@ extern bool constant_address_p (rtx); extern bool legitimate_pic_operand_p (rtx); extern bool legitimate_pic_address_disp_p (rtx); - +extern bool ix86_legitimize_reload_address (rtx, enum machine_mode, + int, int, int); extern void print_reg (rtx, int, FILE*); extern void ix86_print_operand (FILE *, rtx, int); Index: testsuite/gcc.target/i386/pr52698.c =================================================================== --- testsuite/gcc.target/i386/pr52698.c (revision 0) +++ testsuite/gcc.target/i386/pr52698.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-options "-O2 -mx32 -maddress-mode=long" } */ + +extern void abort (void); +static __thread unsigned char foo [32] +__attribute__ ((tls_model ("initial-exec"), aligned (sizeof (void *)))); + +void +test2 (void) +{ + unsigned int s; + for (s = 0; s < sizeof (foo); ++s) + { + if (foo [s] != s) + abort (); + foo [s] = sizeof (foo) - s; + } +}