Patchwork PATCH: PR target/50603: [x32] Unnecessary lea

login
register
mail settings
Submitter Uros Bizjak
Date Oct. 4, 2011, 8:06 a.m.
Message ID <CAFULd4bXdwJmVtLLk9=y=7FW3aw4CudOgr=+r6H-xo3gORxMxQ@mail.gmail.com>
Download mbox | patch
Permalink /patch/117587/
State New
Headers show

Comments

Uros Bizjak - Oct. 4, 2011, 8:06 a.m.
On Tue, Oct 4, 2011 at 8:17 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Oct 4, 2011 at 1:00 AM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> This patch improves address combine for x32 by forcing the memory memory
>> operand of PLUS operation into register.  Tested on Linux/x86-64 with
>> -mx32.  OK for trunk?
>
> Does the patch fix
>
> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq]
>
> on x32 ?

It does.

Following patch is the same, but takes into account that non-matching
memory can only be in src2, so it avoids a bunch of unnecessary
checks. Can you please check the effects of the patch with some
codesize benchmark?

Please also add the test from the PR that checks for lea (similar to
pr45670.c test).


Uros.
Uros Bizjak - Oct. 4, 2011, 8:19 a.m.
On Tue, Oct 4, 2011 at 10:06 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> This patch improves address combine for x32 by forcing the memory memory
>>> operand of PLUS operation into register.  Tested on Linux/x86-64 with
>>> -mx32.  OK for trunk?
>>
>> Does the patch fix
>>
>> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq]
>>
>> on x32 ?
>
> It does.
>
> Following patch is the same, but takes into account that non-matching
> memory can only be in src2, so it avoids a bunch of unnecessary
> checks. Can you please check the effects of the patch with some
> codesize benchmark?

OTOH, x86_64 and i686 targets can also benefit from this change. If
combine can't create more complex address (covered by lea), then it
will simply propagate memory operand back into the add insn. It looks
to me that we can't loose here, so:

  /* Improve address combine.  */
  if (code == PLUS && MEM_P (src2))
    src2 = force_reg (mode, src2);

Any opinions?

Uros.
H.J. Lu - Oct. 4, 2011, 2:40 p.m.
On Tue, Oct 4, 2011 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Oct 4, 2011 at 10:06 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>> This patch improves address combine for x32 by forcing the memory memory
>>>> operand of PLUS operation into register.  Tested on Linux/x86-64 with
>>>> -mx32.  OK for trunk?
>>>
>>> Does the patch fix
>>>
>>> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq]
>>>
>>> on x32 ?
>>
>> It does.
>>
>> Following patch is the same, but takes into account that non-matching
>> memory can only be in src2, so it avoids a bunch of unnecessary
>> checks. Can you please check the effects of the patch with some
>> codesize benchmark?
>
> OTOH, x86_64 and i686 targets can also benefit from this change. If
> combine can't create more complex address (covered by lea), then it
> will simply propagate memory operand back into the add insn. It looks
> to me that we can't loose here, so:
>
>  /* Improve address combine.  */
>  if (code == PLUS && MEM_P (src2))
>    src2 = force_reg (mode, src2);
>
> Any opinions?

This patch should fix PR 50603 as well as gcc.target/i386/pr45670.c.
I will check its code size impact on SPEC CPU 2K/2006.
H.J. Lu - Oct. 4, 2011, 4:06 p.m.
On Tue, Oct 4, 2011 at 1:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Oct 4, 2011 at 10:06 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>> This patch improves address combine for x32 by forcing the memory memory
>>>> operand of PLUS operation into register.  Tested on Linux/x86-64 with
>>>> -mx32.  OK for trunk?
>>>
>>> Does the patch fix
>>>
>>> FAIL: gcc.target/i386/pr45670.c scan-assembler-not lea[lq]
>>>
>>> on x32 ?
>>
>> It does.
>>
>> Following patch is the same, but takes into account that non-matching
>> memory can only be in src2, so it avoids a bunch of unnecessary
>> checks. Can you please check the effects of the patch with some
>> codesize benchmark?
>
> OTOH, x86_64 and i686 targets can also benefit from this change. If
> combine can't create more complex address (covered by lea), then it
> will simply propagate memory operand back into the add insn. It looks
> to me that we can't loose here, so:
>
>  /* Improve address combine.  */
>  if (code == PLUS && MEM_P (src2))
>    src2 = force_reg (mode, src2);
>
> Any opinions?
>

It doesn't work with 64bit libstdc++:

/export/gnu/import/git/gcc/libstdc++-v3/src/strstream.cc: In member
function ‘void std::strstream::_ZTv0_n24_NSt9strstreamD1Ev()’:
/export/gnu/import/git/gcc/libstdc++-v3/src/strstream.cc:418:1: error:
insn does not satisfy its constraints:
(insn 3 2 4 (set (reg:DI 59)
        (mem:DI (plus:DI (reg:DI 39 r10)
                (const_int -24 [0xffffffffffffffe8])) [0 S8 A8])) 62
{*movdi_internal_rex64}
     (nil))
/export/gnu/import/git/gcc/libstdc++-v3/src/strstream.cc:418:1:
internal compiler error: in final_scan_insn, at final.c:2648
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
make[7]: *** [strstream.lo] Error 1
make[7]: *** Waiting for unfinished jobs....

Patch

Index: i386.c
===================================================================
--- i386.c	(revision 179489)
+++ i386.c	(working copy)
@@ -15727,6 +15727,10 @@ 
   if (MEM_P (src1) && !rtx_equal_p (dst, src1))
     src1 = force_reg (mode, src1);

+  /* Improve address combine in x32 mode.  */
+  if (TARGET_X32 && code == PLUS && MEM_P (src2))
+    src2 = force_reg (mode, src2);
+
   operands[1] = src1;
   operands[2] = src2;
   return dst;