From patchwork Thu Sep 27 18:04:58 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [v2, rtl-optimization] : Fix PR54457,
[x32] Fail to combine 64bit index + constant
From: Uros Bizjak
X-Patchwork-Id: 187430
Message-Id:
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford ,
Eric Botcazou
Date: Thu, 27 Sep 2012 20:04:58 +0200
On Thu, Sep 27, 2012 at 4:25 PM, Richard Sandiford
wrote:
>>>> I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
>>>> a good transformation, but why do we need to handle as special
>>>> the case where the subreg is itself the operand of a plus or minus?
>>>> I think it should happen regardless of where the subreg occurs.
>>>
>>> Don't we need to restrict this to the low part though?
>>
>> I have tried this approach with attached patch. Unfortunately,
>> although it survived bootstrap without libjava on x86_64, it failed
>> building libjava with:
>>
>> /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1299:0:
>> error: insn does not satisfy its constraints:
>> }
>> ^
>> (insn 237 398 399 7 (set (reg:SI 1 dx [125])
>> (plus:SI (subreg:SI (mult:DI (reg:DI 1 dx [orig:72 D.78627 ] [72])
>> (const_int 2 [0x2])) 0)
>> (reg:SI 5 di)))
>> /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1271
>> 240 {*leasi}
>> (expr_list:REG_DEAD (reg:DI 5 di)
>> (nil)))
>>
>> Original RTX was (subreg:SI (plus:DI (mult:DI (...) reg:DI))), which
>> is valid RTX pattern for lea insn, the above is not.
>>
>> Due to these problems, I think the safer approach is to limit the
>> transformation to (plus:SI (subreg:SI (plus:DI (...) 0)) RTXes, as was
>> the case with original patch. This approach would fix a specific
>> problem where simplify_plus_minus is not able to simplify the combined
>> RTX at combine time. Please note, that combined RTXes are always
>> checked for correctness at combine pass.
>
> I think instead the (subreg (plus ...)) handling should be applied
> to (subreg (mult ...)) too. IMO the correct form of the above address
> ought to be:
>
> (set (reg:SI 1 dx [125])
> (plus:SI (mult:SI (reg:SI 1 dx [orig:72 D.78627 ] [72])
> (const_int 2 [0x2]))
> (reg:SI 5 di))
Great, this works as expected!
After some off-line discussion with Richard, attached is v2 of the patch.
2012-09-27 Uros Bizjak
PR rtl-optimization/54457
* simplify-rtx.c (simplify_subreg):
Simplify (subreg:SI (op:DI (x:DI) (y:DI)) 0)
to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).
testsuite/ChangeLog:
2012-09-27 Uros Bizjak
PR rtl-optimization/54457
* gcc.target/i386/pr54457.c: New test.
Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.
BTW: I propose that we start with a limited selection of opcodes (PLUS,
MINUS and MULT), so that the x32 autotester will pick up and test the
patch with SImode addresses.
OK for mainline?
Uros.
Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c (revision 191808)
+++ simplify-rtx.c (working copy)
@@ -5689,6 +5689,28 @@ simplify_subreg (enum machine_mode outermode, rtx
return CONST0_RTX (outermode);
}
+ /* Simplify (subreg:SI (op:DI (x:DI) (y:DI)) 0)
+ to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)), where
+ the outer subreg is effectively a truncation to the original mode. */
+ if ((GET_CODE (op) == PLUS
+ || GET_CODE (op) == MINUS
+ || GET_CODE (op) == MULT)
+ && SCALAR_INT_MODE_P (outermode)
+ && SCALAR_INT_MODE_P (innermode)
+ && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
+ && byte == subreg_lowpart_offset (outermode, innermode))
+ {
+ rtx op0 = simplify_gen_subreg (outermode, XEXP (op, 0),
+ innermode, byte);
+ if (op0)
+ {
+ rtx op1 = simplify_gen_subreg (outermode, XEXP (op, 1),
+ innermode, byte);
+ if (op1)
+ return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
+ }
+ }
+
/* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
to (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
the outer subreg is effectively a truncation to the original mode. */
Index: testsuite/gcc.target/i386/pr54457.c
===================================================================
--- testsuite/gcc.target/i386/pr54457.c (revision 0)
+++ testsuite/gcc.target/i386/pr54457.c (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-options "-O2 -mx32 -maddress-mode=short" } */
+
+extern char array[40];
+
+char foo (long long position)
+{
+ return array[position + 1];
+}
+
+/* { dg-final { scan-assembler-not "add\[lq\]?\[^\n\]*1" } } */