[AArch64] Improve spill code - swap order in shl pattern

Message ID 000201d080ef$44300080$cc900180$@com
State New

Commit Message

Wilco Dijkstra April 27, 2015, 1:37 p.m. UTC
Various instructions are supported as integer operations as well as SIMD on AArch64. When register
pressure is high, lra-constraints inserts spill code without taking the allocation class into
account, and basically chooses the first available pattern that matches. Since this instruction has
the SIMD version first, it is usually chosen even though some of the operands are eventually allocated
to integer registers. The result is inefficient code, not only due to the higher latency of SIMD
instructions but also due to the extra int<->FP moves. Placing the integer variant first in the shl
pattern generates much better spill code. A few more patterns are the wrong way around, which
I'll address in a separate patch. I'm also looking into fixing lra-constraints to generate the
expected code by taking the allocno class into account in the cost calculations during spilling.
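
As a rough illustration (not part of the patch), the C functions below are the kind of source whose left shifts GCC can match against this pattern; the GPI iterator covers the 32-bit (SI) and 64-bit (DI) integer modes. The function names are made up for this sketch, and which alternative (lsl, shl or ushl) is actually emitted depends on the compiler version, options and register pressure, so no expected assembly is claimed here.

```c
#include <stdint.h>

/* Hypothetical helper: 64-bit left shift by a variable amount.
   The mask keeps the shift count in range, so the behaviour is defined
   for any n.  With both operands in general registers, the integer
   alternative (lsl) is the one we would want to see.  */
uint64_t shl64_var(uint64_t x, unsigned n)
{
    return x << (n & 63);
}

/* Hypothetical helper: 64-bit left shift by a constant amount.  */
uint64_t shl64_imm(uint64_t x)
{
    return x << 3;
}

/* Hypothetical helper: 32-bit variant, corresponding to the SI mode.  */
uint32_t shl32_var(uint32_t x, unsigned n)
{
    return x << (n & 31);
}
```

Compiling a file like this with an AArch64 GCC at -O2 -S, before and after the change, is one way to see which form of the shift is selected; the difference described above is most likely to show up in larger functions where register pressure forces spills.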

2015-04-27  Wilco Dijkstra  <wdijkstr@arm.com>

        * gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_<mode>3):
        Place integer variant first.

---
 gcc/config/aarch64/aarch64.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

James Greenhalgh April 27, 2015, 4:13 p.m. UTC | #1
On Mon, Apr 27, 2015 at 02:37:12PM +0100, Wilco Dijkstra wrote:
> Various instructions are supported as integer operations as well as SIMD on
> AArch64. When register pressure is high, lra-constraints inserts spill code
> without taking the allocation class into account, and basically chooses the
> first available pattern that matches. Since this instruction has the SIMD
> version first, it is usually chosen even though some of the operands are
> eventually allocated to integer registers. The result is inefficient code, not
> only due to the higher latency of SIMD instructions but also due to the extra
> int<->FP moves. Placing the integer variant first in the shl pattern
> generates much better spill code. A few more patterns are the wrong way
> around, which I'll address in a separate patch. I'm also looking into fixing
> lra-constraints to generate the expected code by taking the allocno class
> into account in the cost calculations during spilling.
> 
> 2015-04-27  Wilco Dijkstra  <wdijkstr@arm.com>
> 
>         * gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_<mode>3):
>         Place integer variant first.

OK, thanks for the fix.

Cheers,
James

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 7163025..baef56a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3334,17 +3334,17 @@ 
 
 ;; Logical left shift using SISD or Integer instruction
 (define_insn "*aarch64_ashl_sisd_or_int_<mode>3"
-  [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
+  [(set (match_operand:GPI 0 "register_operand" "=r,w,w")
         (ashift:GPI
-          (match_operand:GPI 1 "register_operand" "w,w,r")
-          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))]
+          (match_operand:GPI 1 "register_operand" "r,w,w")
+          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "rUs<cmode>,Us<cmode>,w")))]
   ""
   "@
+   lsl\t%<w>0, %<w>1, %<w>2
    shl\t%<rtn>0<vas>, %<rtn>1<vas>, %2
-   ushl\t%<rtn>0<vas>, %<rtn>1<vas>, %<rtn>2<vas>
-   lsl\t%<w>0, %<w>1, %<w>2"
-  [(set_attr "simd" "yes,yes,no")
-   (set_attr "type" "neon_shift_imm<q>, neon_shift_reg<q>,shift_reg")]
+   ushl\t%<rtn>0<vas>, %<rtn>1<vas>, %<rtn>2<vas>"
+  [(set_attr "simd" "no,yes,yes")
+   (set_attr "type" "shift_reg,neon_shift_imm<q>, neon_shift_reg<q>")]
 )
 
 ;; Logical right shift using SISD or Integer instruction