diff mbox

[ARM] Improve extendsidi without neon

Message ID 004d01ce1104$18c63f70$4a52be50$@yorsh@arm.com
State New
Headers show

Commit Message

Greta Yorsh Feb. 22, 2013, 1:54 p.m. UTC
This patch improves code generated for extension from SI to DI mode for core
registers when neon is not enabled.

Currently, if neon is enabled, extendsidi for core registers benefits from
the patch described here (r194558 from 17 Dec 2012):
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00984.html

Before r194558, the compiler used to split extendsidi before register
allocation (at split1 pass), which resulted in some cases in an extra
register being used and an extra register move. After r194558, if neon is
enabled, extendsidi is split after reload, generating better code.
Unfortunately, when neon is not enabled, the splitter is still triggered
before reload.

For example, 

$ cat negdi-1.c

signed long long extendsidi_negsi (signed int x)
{
  return -x;
}

$ arm-none-eabi-gcc negdi-1.c -S -O2 -o- -mfpu=vfpv4

extendsidi_negsi:
        rsb     r3, r0, #0      @ 6     *arm_negsi2     [length = 4]
        mov     r0, r3  @ 19    *arm_movsi_vfp/1        [length = 4]
        mov     r1, r3, asr #31 @ 20    *arm_shiftsi3   [length = 4]
        bx      lr      @ 25    *arm_return     [length = 12]

$ arm-none-eabi-gcc negdi-1.c -S -O2 -o- -mfpu=neon

extendsidi_negsi:
        rsb     r0, r0, #0      @ 6     *arm_negsi2     [length = 4]
        mov     r1, r0, asr #31 @ 20    *arm_shiftsi3   [length = 4]
        bx      lr      @ 23    *arm_return     [length = 12]

This patch changes the condition of splitters for extendsidi to trigger only
after reload.

In addition, this patch fixes a bug in zero_extend<mode>di2 and
extend<mode>di2 patterns. One of the alternatives in these patterns has a
constraint 'w' which is not valid when neon is disabled, but the patterns
don't have attributes to guard these alternatives by neon availability. This
might cause an ICE when neon is not available. Currently, it seems that the
patterns are only matched by RTL insns that are generated by splitters with
conditions on neon being enabled. However, it may be a latent bug. In any
case, the change in conditions of these splitters made by this patch
triggers the bug and causes an ICE. 

No regression on qemu for arm-none-eabi cortex-a15 arm/thumb neon/vfpv4
softfp/soft.

Bootstrap successful on Cortex-A15.

I haven't added a test case because tests that scan assembly for 'mov' are
very unstable.

Ok for trunk?
Thanks,
Greta

gcc/

2013-02-21  Greta Yorsh  <Greta.Yorsh@arm.com>

        * config/arm/arm.md (split for extendsidi): Update condition.
        (zero_extend<mode>di2,extend<mode>di2): Add an alternative.
        * config/arm/iterators.md (qhs_extenddi_cstr): Likewise.
        (qhs_zextenddi_cstr): Likewise.

Comments

Richard Earnshaw Feb. 22, 2013, 2:08 p.m. UTC | #1
On 22/02/13 13:54, Greta Yorsh wrote:
> This patch improves code generated for extension from SI to DI mode for core
> registers when neon is not enabled.
>
> Currently, if neon is enabled, extendsidi for core registers benefits from
> the patch described here (r194558 from 17 Dec 2012):
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00984.html
>
> Before r194558, the compiler used to split extendsidi before register
> allocation (at split1 pass), which resulted in some cases in an extra
> register being used and an extra register move. After r194558, if neon is
> enabled, extendsidi is split after reload, generating better code.
> Unfortunately, when neon is not enabled, the splitter is still triggered
> before reload.
>
> For example,
>
> $ cat negdi-1.c
>
> signed long long extendsidi_negsi (signed int x)
> {
>    return -x;
> }
>
> $ arm-none-eabi-gcc negdi-1.c -S -O2 -o- -mfpu=vfpv4
>
> extendsidi_negsi:
>          rsb     r3, r0, #0      @ 6     *arm_negsi2     [length = 4]
>          mov     r0, r3  @ 19    *arm_movsi_vfp/1        [length = 4]
>          mov     r1, r3, asr #31 @ 20    *arm_shiftsi3   [length = 4]
>          bx      lr      @ 25    *arm_return     [length = 12]
>
> $ arm-none-eabi-gcc negdi-1.c -S -O2 -o- -mfpu=neon
>
> extendsidi_negsi:
>          rsb     r0, r0, #0      @ 6     *arm_negsi2     [length = 4]
>          mov     r1, r0, asr #31 @ 20    *arm_shiftsi3   [length = 4]
>          bx      lr      @ 23    *arm_return     [length = 12]
>
> This patch changes the condition of splitters for extendsidi to trigger only
> after reload.
>
> In addition, this patch fixes a bug in zero_extend<mode>di2 and
> extend<mode>di2 patterns. One of the alternatives in these patterns has a
> constraint 'w' which is not valid when neon is disabled, but the patterns
> don't have attributes to guard these alternatives by neon availability. This
> might cause an ICE when neon is not available. Currently, it seems that the
> patterns are only matched by RTL insns that are generated by splitters with
> conditions on neon being enabled. However, it may be a latent bug. In any
> case, the change in conditions of these splitters made by this patch
> triggers the bug and causes an ICE.
>
> No regression on qemu for arm-none-eabi cortex-a15 arm/thumb neon/vfpv4
> softfp/soft.
>
> Bootstrap successful on Cortex-A15.
>
> I haven't added a test case because tests that scan assembly for 'mov' are
> very unstable.
>
> Ok for trunk?
> Thanks,
> Greta
>
> gcc/
>
> 2013-02-21  Greta Yorsh  <Greta.Yorsh@arm.com>
>
>          * config/arm/arm.md (split for extendsidi): Update condition.
>          (zero_extend<mode>di2,extend<mode>di2): Add an alternative.
>          * config/arm/iterators.md (qhs_extenddi_cstr): Likewise.
>          (qhs_zextenddi_cstr): Likewise.
>
>

OK.

R.
diff mbox

Patch

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 1cb1515b1fa57c6052b68eb8701616c1b80e7416..0000000000000000000000000000000000000000 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4491,36 +4491,35 @@  (define_expand "truncdfhf2"
 ;; Zero and sign extension instructions.
 
 (define_insn "zero_extend<mode>di2"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,r,?r")
+  [(set (match_operand:DI 0 "s_register_operand" "=w,r,?r,w")
         (zero_extend:DI (match_operand:QHSI 1 "<qhs_zextenddi_op>"
 					    "<qhs_zextenddi_cstr>")))]
   "TARGET_32BIT <qhs_zextenddi_cond>"
   "#"
-  [(set_attr "length" "8,4,8")
+  [(set_attr "length" "8,4,8,8")
+   (set_attr "arch" "neon_nota8,*,*,neon_onlya8")
    (set_attr "ce_count" "2")
    (set_attr "predicable" "yes")]
 )
 
 (define_insn "extend<mode>di2"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,r,?r,?r")
+  [(set (match_operand:DI 0 "s_register_operand" "=w,r,?r,?r,w")
         (sign_extend:DI (match_operand:QHSI 1 "<qhs_extenddi_op>"
 					    "<qhs_extenddi_cstr>")))]
   "TARGET_32BIT <qhs_sextenddi_cond>"
   "#"
-  [(set_attr "length" "8,4,8,8")
+  [(set_attr "length" "8,4,8,8,8")
    (set_attr "ce_count" "2")
    (set_attr "shift" "1")
    (set_attr "predicable" "yes")
-   (set_attr "arch" "*,*,a,t")]
+   (set_attr "arch" "neon_nota8,*,a,t,neon_onlya8")]
 )
 
 ;; Splits for all extensions to DImode
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
         (zero_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT && (!TARGET_NEON
-		    || (reload_completed
-			&& !(IS_VFP_REGNUM (REGNO (operands[0])))))"
+  "TARGET_32BIT && reload_completed && !IS_VFP_REGNUM (REGNO (operands[0]))"
   [(set (match_dup 0) (match_dup 1))]
 {
   rtx lo_part = gen_lowpart (SImode, operands[0]);
@@ -4546,9 +4545,7 @@  (define_split
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
         (sign_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT && (!TARGET_NEON
-		    || (reload_completed
-			&& !(IS_VFP_REGNUM (REGNO (operands[0])))))"
+  "TARGET_32BIT && reload_completed && !IS_VFP_REGNUM (REGNO (operands[0]))"
   [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (const_int 31)))]
 {
   rtx lo_part = gen_lowpart (SImode, operands[0]);
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 0b46800cb94bc67deab8e174a17391034af9d104..0000000000000000000000000000000000000000 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -429,8 +429,8 @@  (define_mode_attr qhs_zextenddi_op [(SI 
 (define_mode_attr qhs_extenddi_op [(SI "s_register_operand")
 				   (HI "nonimmediate_operand")
 				   (QI "arm_reg_or_extendqisi_mem_op")])
-(define_mode_attr qhs_extenddi_cstr [(SI "r,0,r,r") (HI "r,0,rm,rm") (QI "r,0,rUq,rm")])
-(define_mode_attr qhs_zextenddi_cstr [(SI "r,0,r") (HI "r,0,rm") (QI "r,0,rm")])
+(define_mode_attr qhs_extenddi_cstr [(SI "r,0,r,r,r") (HI "r,0,rm,rm,r") (QI "r,0,rUq,rm,r")])
+(define_mode_attr qhs_zextenddi_cstr [(SI "r,0,r,r") (HI "r,0,rm,r") (QI "r,0,rm,r")])
 
 ;; Mode attributes used for fixed-point support.
 (define_mode_attr qaddsub_suf [(V4UQQ "8") (V2UHQ "16") (UQQ "8") (UHQ "16")