Message ID | 000001cf5558$b40149a0$1c03dce0$@arm.com |
---|---|
State | New |
Headers | show |
-(define_split +; Split the load of 64-bit constant into two loads for high and low 32-bit parts respectively +; to see if we can load them in fewer instructions or fewer cycles. +; For the small 64-bit integer constants that satisfy constraint J, the instruction pattern +; thumb1_movdi_insn has a better way to handle them. >+(define_split >+ [(set (match_operand:ANY64 0 "arm_general_register_operand" "") >+ (match_operand:ANY64 1 "const_double_operand" ""))] >+ "TARGET_THUMB1 && reload_completed && !satisfies_constraint_J (operands[1])" >+ [(set (match_dup 0) (match_dup 1)) >+ (set (match_dup 2) (match_dup 3))] Not ok - this splitter used to kick in in ARM state, now you've turned it off. Look at the movdi patterns in ARM state which deal with it immediate moves with a #. So, the condition should read (TARGET_32BIT) || ( TARGET_THUMB1 && ....) Ok with that change and if no regressions. regards Ramana On Fri, Apr 11, 2014 at 8:36 AM, Terry Guo <terry.guo@arm.com> wrote: > Hi there, > > Current gcc prefers to using two LDR instructions to load 64bit constants. > This could miss some chances that 64bit load can be done in fewer > instructions or fewer cycles. For example, below code to load 0x100000001 > > mov r0, #1 > mov r1, #1 > > is better than current solution: > > ldr r1, .L2+4 > ldr r0, .L2 > .L2: > .word 1 > .word 1 > > The attached patch intends to split 64bit load to take advantage of such > chances. Tested with gcc regression test on cortex-m0. No new regressions. > > Is it ok to stage 1? > > BR, > Terry > > gcc/ > 2014-04-11 Terry Guo <terry.guo@arm.com> > > * config/arm/arm.md (split 64-bit constant for Thumb1): New split > pattern. > > gcc/testsuite/ > 2014-04-11 Terry Guo <terry.guo@arm.com> > > * gcc.target/arm/thumb1-load-64bit-constant-1.c: New test. > * gcc.target/arm/thumb1-load-64bit-constant-2.c: Ditto. > * gcc.target/arm/thumb1-load-64bit-constant-3.c: Ditto.
> -----Original Message----- > From: Ramana Radhakrishnan [mailto:ramana.gcc@googlemail.com] > Sent: Wednesday, May 21, 2014 4:56 PM > To: Terry Guo > Cc: gcc-patches; Richard Earnshaw; Ramana Radhakrishnan > Subject: Re: [Patch, GCC/Thumb1] Improve 64bit constant load for Thumb1 > > -(define_split > +; Split the load of 64-bit constant into two loads for high and low > 32-bit parts respectively > +; to see if we can load them in fewer instructions or fewer cycles. > +; For the small 64-bit integer constants that satisfy constraint J, > the instruction pattern > +; thumb1_movdi_insn has a better way to handle them. > >+(define_split > >+ [(set (match_operand:ANY64 0 "arm_general_register_operand" "") > >+ (match_operand:ANY64 1 "const_double_operand" ""))] > >+ "TARGET_THUMB1 && reload_completed && !satisfies_constraint_J > (operands[1])" > >+ [(set (match_dup 0) (match_dup 1)) > >+ (set (match_dup 2) (match_dup 3))] > > Not ok - this splitter used to kick in in ARM state, now you've turned it off. > Look at the movdi patterns in ARM state which deal with it immediate > moves with a #. > > So, the condition should read > > (TARGET_32BIT) || ( TARGET_THUMB1 && ....) > > Ok with that change and if no regressions. > > regards > Ramana > Thanks for reviewing. This is a total new splitter dedicated for Thumb1 targets. The one for ARM isn't touched. BR, Terry > > > > On Fri, Apr 11, 2014 at 8:36 AM, Terry Guo <terry.guo@arm.com> wrote: > > Hi there, > > > > Current gcc prefers to using two LDR instructions to load 64bit constants. > > This could miss some chances that 64bit load can be done in fewer > > instructions or fewer cycles. For example, below code to load > > 0x100000001 > > > > mov r0, #1 > > mov r1, #1 > > > > is better than current solution: > > > > ldr r1, .L2+4 > > ldr r0, .L2 > > .L2: > > .word 1 > > .word 1 > > > > The attached patch intends to split 64bit load to take advantage of > > such chances. Tested with gcc regression test on cortex-m0. No new > regressions. > > > > Is it ok to stage 1? > > > > BR, > > Terry > > > > gcc/ > > 2014-04-11 Terry Guo <terry.guo@arm.com> > > > > * config/arm/arm.md (split 64-bit constant for Thumb1): New > > split pattern. > > > > gcc/testsuite/ > > 2014-04-11 Terry Guo <terry.guo@arm.com> > > > > * gcc.target/arm/thumb1-load-64bit-constant-1.c: New test. > > * gcc.target/arm/thumb1-load-64bit-constant-2.c: Ditto. > > * gcc.target/arm/thumb1-load-64bit-constant-3.c: Ditto.
On Wed, May 21, 2014 at 10:29 AM, Terry Guo <terry.guo@arm.com> wrote: > > >> -----Original Message----- >> From: Ramana Radhakrishnan [mailto:ramana.gcc@googlemail.com] >> Sent: Wednesday, May 21, 2014 4:56 PM >> To: Terry Guo >> Cc: gcc-patches; Richard Earnshaw; Ramana Radhakrishnan >> Subject: Re: [Patch, GCC/Thumb1] Improve 64bit constant load for Thumb1 >> >> -(define_split >> +; Split the load of 64-bit constant into two loads for high and low >> 32-bit parts respectively >> +; to see if we can load them in fewer instructions or fewer cycles. >> +; For the small 64-bit integer constants that satisfy constraint J, >> the instruction pattern >> +; thumb1_movdi_insn has a better way to handle them. >> >+(define_split >> >+ [(set (match_operand:ANY64 0 "arm_general_register_operand" "") >> >+ (match_operand:ANY64 1 "const_double_operand" ""))] >> >+ "TARGET_THUMB1 && reload_completed && !satisfies_constraint_J >> (operands[1])" >> >+ [(set (match_dup 0) (match_dup 1)) >> >+ (set (match_dup 2) (match_dup 3))] >> >> Not ok - this splitter used to kick in in ARM state, now you've turned it off. >> Look at the movdi patterns in ARM state which deal with it immediate >> moves with a #. >> >> So, the condition should read >> >> (TARGET_32BIT) || ( TARGET_THUMB1 && ....) >> >> Ok with that change and if no regressions. >> >> regards >> Ramana >> > > Thanks for reviewing. This is a total new splitter dedicated for Thumb1 targets. The one for ARM isn't touched. > Bah - sorry I must have misread the patch. This is OK - please apply. I was changing a very similar splitter yesterday so that I could define TARGET_SUPPORTS_WIDE_INT and which is why I must have misread it. Ramana > BR, > Terry > >> >> >> >> On Fri, Apr 11, 2014 at 8:36 AM, Terry Guo <terry.guo@arm.com> wrote: >> > Hi there, >> > >> > Current gcc prefers to using two LDR instructions to load 64bit constants. >> > This could miss some chances that 64bit load can be done in fewer >> > instructions or fewer cycles. For example, below code to load >> > 0x100000001 >> > >> > mov r0, #1 >> > mov r1, #1 >> > >> > is better than current solution: >> > >> > ldr r1, .L2+4 >> > ldr r0, .L2 >> > .L2: >> > .word 1 >> > .word 1 >> > >> > The attached patch intends to split 64bit load to take advantage of >> > such chances. Tested with gcc regression test on cortex-m0. No new >> regressions. >> > >> > Is it ok to stage 1? >> > >> > BR, >> > Terry >> > >> > gcc/ >> > 2014-04-11 Terry Guo <terry.guo@arm.com> >> > >> > * config/arm/arm.md (split 64-bit constant for Thumb1): New >> > split pattern. >> > >> > gcc/testsuite/ >> > 2014-04-11 Terry Guo <terry.guo@arm.com> >> > >> > * gcc.target/arm/thumb1-load-64bit-constant-1.c: New test. >> > * gcc.target/arm/thumb1-load-64bit-constant-2.c: Ditto. >> > * gcc.target/arm/thumb1-load-64bit-constant-3.c: Ditto. > > >
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 2ddda02..b339e83 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -6422,7 +6422,26 @@ (set_attr "pool_range" "*,*,*,*,*,*,1018,*,*") (set_attr "conds" "set,clob,*,*,nocond,nocond,nocond,nocond,nocond")]) -(define_split +; Split the load of 64-bit constant into two loads for high and low 32-bit parts respectively +; to see if we can load them in fewer instructions or fewer cycles. +; For the small 64-bit integer constants that satisfy constraint J, the instruction pattern +; thumb1_movdi_insn has a better way to handle them. +(define_split + [(set (match_operand:ANY64 0 "arm_general_register_operand" "") + (match_operand:ANY64 1 "const_double_operand" ""))] + "TARGET_THUMB1 && reload_completed && !satisfies_constraint_J (operands[1])" + [(set (match_dup 0) (match_dup 1)) + (set (match_dup 2) (match_dup 3))] + " + operands[2] = gen_highpart (SImode, operands[0]); + operands[3] = gen_highpart_mode (SImode, GET_MODE (operands[0]), + operands[1]); + operands[0] = gen_lowpart (SImode, operands[0]); + operands[1] = gen_lowpart (SImode, operands[1]); + " +) + +(define_split [(set (match_operand:SI 0 "register_operand" "") (match_operand:SI 1 "const_int_operand" ""))] "TARGET_THUMB1 && satisfies_constraint_J (operands[1])" diff --git a/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-1.c b/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-1.c new file mode 100644 index 0000000..9537aaf --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-1.c @@ -0,0 +1,14 @@ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-do compile } */ +/* { dg-options "-Os" } */ +/* { dg-skip-if "" { ! { arm_thumb1 } } } */ + +extern long long madd (long long a, long long b); + +long long +foo () +{ + return madd (0x0000000100000001LL, 0x0000011100000001LL); +} + +/* { dg-final { scan-assembler-not "ldr" } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-2.c b/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-2.c new file mode 100644 index 0000000..836682b --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-2.c @@ -0,0 +1,14 @@ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-skip-if "" { ! { arm_thumb1 } } } */ + +extern long long madd (long long a); + +long long +foo () +{ + return madd (0x0000000100000001LL); +} + +/* { dg-final { scan-assembler-not "ldr" } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-3.c b/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-3.c new file mode 100644 index 0000000..cf4d0be --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb1-load-64bit-constant-3.c @@ -0,0 +1,14 @@ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-do compile } */ +/* { dg-options "-Os" } */ +/* { dg-skip-if "" { ! { arm_thumb1 } } } */ + +long long +foo (int len) +{ + return (long long) (((long long) 1 << len) - 1); +} + +/* { dg-final { scan-assembler-not "ldr" } } */ +/* { dg-final { scan-assembler-times "neg" 1 } } */ +