Message ID | CAFULd4bXdwJmVtLLk9=y=7FW3aw4CudOgr=+r6H-xo3gORxMxQ@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Tue, Oct 4, 2011 at 10:06 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >>> This patch improves address combine for x32 by forcing the memory memory >>> operand of PLUS operation into register. Tested on Linux/x86-64 with >>> -mx32. OK for trunk? >> >> Does the patch fix >> >> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq] >> >> on x32 ? > > It does. > > Following patch is the same, but takes into account that non-matching > memory can only be in src2, so it avoids a bunch of unnecessary > checks. Can you please check the effects of the patch with some > codesize benchmark? OTOH, x86_64 and i686 targets can also benefit from this change. If combine can't create more complex address (covered by lea), then it will simply propagate memory operand back into the add insn. It looks to me that we can't loose here, so: /* Improve address combine. */ if (code == PLUS && MEM_P (src2)) src2 = force_reg (mode, src2); Any opinions? Uros.
On Tue, Oct 4, 2011 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Oct 4, 2011 at 10:06 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >>>> This patch improves address combine for x32 by forcing the memory memory >>>> operand of PLUS operation into register. Tested on Linux/x86-64 with >>>> -mx32. OK for trunk? >>> >>> Does the patch fix >>> >>> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq] >>> >>> on x32 ? >> >> It does. >> >> Following patch is the same, but takes into account that non-matching >> memory can only be in src2, so it avoids a bunch of unnecessary >> checks. Can you please check the effects of the patch with some >> codesize benchmark? > > OTOH, x86_64 and i686 targets can also benefit from this change. If > combine can't create more complex address (covered by lea), then it > will simply propagate memory operand back into the add insn. It looks > to me that we can't loose here, so: > > /* Improve address combine. */ > if (code == PLUS && MEM_P (src2)) > src2 = force_reg (mode, src2); > > Any opinions? This patch should fix PR 50603 as well as gcc.target/i386/pr45670.c. I will check its code size impact on SPEC CPU 2K/2006.
On Tue, Oct 4, 2011 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Oct 4, 2011 at 10:06 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >>>> This patch improves address combine for x32 by forcing the memory memory >>>> operand of PLUS operation into register. Tested on Linux/x86-64 with >>>> -mx32. OK for trunk? >>> >>> Does the patch fix >>> >>> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq] >>> >>> on x32 ? >> >> It does. >> >> Following patch is the same, but takes into account that non-matching >> memory can only be in src2, so it avoids a bunch of unnecessary >> checks. Can you please check the effects of the patch with some >> codesize benchmark? > > OTOH, x86_64 and i686 targets can also benefit from this change. If > combine can't create more complex address (covered by lea), then it > will simply propagate memory operand back into the add insn. It looks > to me that we can't loose here, so: > > /* Improve address combine. */ > if (code == PLUS && MEM_P (src2)) > src2 = force_reg (mode, src2); > > Any opinions? > It doesn't work with 64bit libstdc++: /export/gnu/import/git/gcc/libstdc++-v3/src/strstream.cc: In member function ‘void std::strstream::_ZTv0_n24_NSt9strstreamD1Ev()’: /export/gnu/import/git/gcc/libstdc++-v3/src/strstream.cc:418:1: error: insn does not satisfy its constraints: (insn 3 2 4 (set (reg:DI 59) (mem:DI (plus:DI (reg:DI 39 r10) (const_int -24 [0xffffffffffffffe8])) [0 S8 A8])) 62 {*movdi_internal_rex64} (nil)) /export/gnu/import/git/gcc/libstdc++-v3/src/strstream.cc:418:1: internal compiler error: in final_scan_insn, at final.c:2648 Please submit a full bug report, with preprocessed source if appropriate. See <http://gcc.gnu.org/bugs.html> for instructions. make[7]: *** [strstream.lo] Error 1 make[7]: *** Waiting for unfinished jobs....
Index: i386.c =================================================================== --- i386.c (revision 179489) +++ i386.c (working copy) @@ -15727,6 +15727,10 @@ if (MEM_P (src1) && !rtx_equal_p (dst, src1)) src1 = force_reg (mode, src1); + /* Improve address combine in x32 mode. */ + if (TARGET_X32 && code == PLUS && MEM_P (src2)) + src2 = force_reg (mode, src2); + operands[1] = src1; operands[2] = src2; return dst;