Message ID | 20170525132626.t536vqs4yhhtvwyh@e107464-lin.cambridge.arm.com |
---|---|
State | New |
Headers | show |
Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: > This patch tackles the issue reported in PR71607. This patch takes a different > approach for disabling the creation of literal pools. Instead of disabling the > patterns that would normally transform the rtl into actual literal pools, it > disables the creation of this literal pool rtl by making the target hook > TARGET_CANNOT_FORCE_CONST_MEM return true if arm_disable_literal_pool is true. > I added patterns to split floating point constants for both SF and DFmode. A > pattern to handle the addressing of label_refs had to be included as well since > all "memory_operand" patterns are disabled when TARGET_CANNOT_FORCE_CONST_MEM > returns true. Also the pattern for splitting 32-bit immediates had to be > changed, it was not accepting unsigned 32-bit unsigned integers with the MSB > set. I believe const_int_operand expects the mode of the operand to be set to > VOIDmode and not SImode. I have only changed it in the patterns that were > affecting this code, though I suggest looking into changing it in the rest of > the ARM backend. I couldn't see the const_int_operand bit in the attached patch, but: const_int_operand *should* usually be used with the logical integer mode, such as SImode or DImode. const_ints are supposed to be stored in sign-extended form, so a 32-bit integer with the MSB set should be 0xffffffff80000000|x instead of 0x80000000|x. It's a bug if you have one where that isn't true. In the patch it looks like this could come from: > diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md > index befdea9edd9..d8f77e2ffe4 100644 > --- a/gcc/config/arm/vfp.md > +++ b/gcc/config/arm/vfp.md > @@ -2079,3 +2079,40 @@ > ;; fmdhr et al (VFPv1) > ;; Support for xD (single precision only) variants. > ;; fmrrs, fmsrr > + > +;; Split an immediate DF move to two immediate SI moves. > +(define_insn_and_split "no_literal_pool_df_immediate" > + [(set (match_operand:DF 0 "s_register_operand" "") > + (match_operand:DF 1 "const_double_operand" ""))] > + "TARGET_THUMB2 && arm_disable_literal_pool > + && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE > + && vfp3_const_double_rtx (operands[1]))" > + "#" > + "&& !reload_completed" > + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) > + (set (subreg:SI (match_dup 1) 4) (match_dup 3)) > + (set (match_dup 0) (match_dup 1))] > + " > + long buf[2]; > + real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode); > + operands[2] = GEN_INT ((int) buf[0]); > + operands[3] = GEN_INT ((int) buf[1]); > + operands[1] = gen_reg_rtx (DFmode); > + ") > + > +;; Split an immediate SF move to one immediate SI move. > +(define_insn_and_split "no_literal_pool_sf_immediate" > + [(set (match_operand:SF 0 "s_register_operand" "") > + (match_operand:SF 1 "const_double_operand" ""))] > + "TARGET_THUMB2 && arm_disable_literal_pool > + && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))" > + "#" > + "&& !reload_completed" > + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) > + (set (match_dup 0) (match_dup 1))] > + " > + long buf; > + real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode); > + operands[2] = GEN_INT ((int) buf); > + operands[1] = gen_reg_rtx (SFmode); > + ") ...these two splits, where the GEN_INTs should probably be: gen_int_mode (..., SImode); instead. Thanks, Richard
On 31/05/2017 09:19:40, Richard Sandiford wrote: > const_ints are supposed to be stored in sign-extended form, so a 32-bit > integer with the MSB set should be 0xffffffff80000000|x instead of > 0x80000000|x. It's a bug if you have one where that isn't true. > > In the patch it looks like this could come from: > ...these two splits, where the GEN_INTs should probably be: > > gen_int_mode (..., SImode); > > instead. Hi Richard, thanks for the tip. Is there a test case that could produce an incorrect result? I've attempted to create one using negative doubles and floats but haven't succeeded. Thanks,
Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: > On 31/05/2017 09:19:40, Richard Sandiford wrote: >> const_ints are supposed to be stored in sign-extended form, so a 32-bit >> integer with the MSB set should be 0xffffffff80000000|x instead of >> 0x80000000|x. It's a bug if you have one where that isn't true. >> >> In the patch it looks like this could come from: >> ...these two splits, where the GEN_INTs should probably be: >> >> gen_int_mode (..., SImode); >> >> instead. > > Hi Richard, thanks for the tip. Is there a test case that could produce an > incorrect result? I've attempted to create one using negative doubles and > floats but haven't succeeded. Just to check, are you testing with --enable-checking=yes,rtl? When the values you tried were split, did you get the sign-extended form or the zero-extended form? Thanks, Richard
On 31/05/2017 14:11:43, Richard Sandiford wrote: > Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: > > On 31/05/2017 09:19:40, Richard Sandiford wrote: > >> const_ints are supposed to be stored in sign-extended form, so a 32-bit > >> integer with the MSB set should be 0xffffffff80000000|x instead of > >> 0x80000000|x. It's a bug if you have one where that isn't true. > >> > >> In the patch it looks like this could come from: > >> ...these two splits, where the GEN_INTs should probably be: > >> > >> gen_int_mode (..., SImode); > >> > >> instead. > > > > Hi Richard, thanks for the tip. Is there a test case that could produce an > > incorrect result? I've attempted to create one using negative doubles and > > floats but haven't succeeded. > > Just to check, are you testing with --enable-checking=yes,rtl? > > When the values you tried were split, did you get the sign-extended form > or the zero-extended form? > > Thanks, > Richard I've now rebuilt with --enable-checking=yes,rtl and it appears that the split values are being correctly sign-extended in the rtl and appear correctly in the assembly. However, if you believe it is safer to use gen_int_mode(), I'll respin the patch accordingly.
Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: > On 31/05/2017 14:11:43, Richard Sandiford wrote: >> Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: >> > On 31/05/2017 09:19:40, Richard Sandiford wrote: >> >> const_ints are supposed to be stored in sign-extended form, so a 32-bit >> >> integer with the MSB set should be 0xffffffff80000000|x instead of >> >> 0x80000000|x. It's a bug if you have one where that isn't true. >> >> >> >> In the patch it looks like this could come from: >> >> ...these two splits, where the GEN_INTs should probably be: >> >> >> >> gen_int_mode (..., SImode); >> >> >> >> instead. >> > >> > Hi Richard, thanks for the tip. Is there a test case that could produce an >> > incorrect result? I've attempted to create one using negative doubles and >> > floats but haven't succeeded. >> >> Just to check, are you testing with --enable-checking=yes,rtl? >> >> When the values you tried were split, did you get the sign-extended form >> or the zero-extended form? >> >> Thanks, >> Richard > > I've now rebuilt with --enable-checking=yes,rtl and it appears that the split > values are being correctly sign-extended in the rtl and appear correctly in the > assembly. > > However, if you believe it is safer to use gen_int_mode(), I'll respin the > patch accordingly. Yeah, I think it would be safer. But if they were already correctly sign-extended, then what did you mean by: Also the pattern for splitting 32-bit immediates had to be changed, it was not accepting unsigned 32-bit unsigned integers with the MSB set. I believe const_int_operand expects the mode of the operand to be set to VOIDmode and not SImode. I have only changed it in the patterns that were affecting this code, though I suggest looking into changing it in the rest of the ARM backend. Thanks, Richard
On 01/06/2017 07:15:47, Richard Sandiford wrote: > Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: > > On 31/05/2017 14:11:43, Richard Sandiford wrote: > >> Prakhar Bahuguna <prakhar.bahuguna@arm.com> writes: > >> > On 31/05/2017 09:19:40, Richard Sandiford wrote: > >> >> const_ints are supposed to be stored in sign-extended form, so a 32-bit > >> >> integer with the MSB set should be 0xffffffff80000000|x instead of > >> >> 0x80000000|x. It's a bug if you have one where that isn't true. > >> >> > >> >> In the patch it looks like this could come from: > >> >> ...these two splits, where the GEN_INTs should probably be: > >> >> > >> >> gen_int_mode (..., SImode); > >> >> > >> >> instead. > >> > > >> > Hi Richard, thanks for the tip. Is there a test case that could produce an > >> > incorrect result? I've attempted to create one using negative doubles and > >> > floats but haven't succeeded. > >> > >> Just to check, are you testing with --enable-checking=yes,rtl? > >> > >> When the values you tried were split, did you get the sign-extended form > >> or the zero-extended form? > >> > >> Thanks, > >> Richard > > > > I've now rebuilt with --enable-checking=yes,rtl and it appears that the split > > values are being correctly sign-extended in the rtl and appear correctly in the > > assembly. > > > > However, if you believe it is safer to use gen_int_mode(), I'll respin the > > patch accordingly. > > Yeah, I think it would be safer. But if they were already correctly > sign-extended, then what did you mean by: > > Also the pattern for splitting 32-bit immediates had to be changed, it > was not accepting unsigned 32-bit unsigned integers with the MSB > set. I believe const_int_operand expects the mode of the operand to be > set to VOIDmode and not SImode. I have only changed it in the patterns > that were affecting this code, though I suggest looking into changing > it in the rest of the ARM backend. > > Thanks, > Richard This part of the patch was written by Andre. After checking with him, it seems that some of the confusion arises from the comment on real_to_target() which states "There are always 32 bits in each long, no matter the size of the host long". While this may imply the value is zero-extended on hosts with wider longs, it seems like the value is always correctly sign-extended and thus gen_int_mode() should be unnecessary. As for why VOIDmode is used with the values casted to int, there is a reason for why it is done this way to get it working but this has been long-forgotten. I only have the code and this message to rely on.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index f3a6b64b168..ea371e27797 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -8633,7 +8633,16 @@ arm_tls_referenced_p (rtx x) { const_rtx x = *iter; if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0) - return true; + { + /* ARM currently does not provide relocations to encode TLS variables + into AArch32 instructions, only data, so there is no way to + currently implement these if a literal pool is disabled. */ + if (arm_disable_literal_pool) + sorry ("accessing thread-local storage is not currently supported " + "with -mpure-code or -mslow-flash-data"); + + return true; + } /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are TLS offsets, not real symbol references. */ @@ -16340,6 +16349,7 @@ static void push_minipool_fix (rtx_insn *insn, HOST_WIDE_INT address, rtx *loc, machine_mode mode, rtx value) { + gcc_assert (!arm_disable_literal_pool); Mfix * fix = (Mfix *) obstack_alloc (&minipool_obstack, sizeof (* fix)); fix->insn = insn; @@ -16391,10 +16401,6 @@ push_minipool_fix (rtx_insn *insn, HOST_WIDE_INT address, rtx *loc, int arm_max_const_double_inline_cost () { - /* Let the value get synthesized to avoid the use of literal pools. */ - if (arm_disable_literal_pool) - return 99; - return ((optimize_size || arm_ld_sched) ? 3 : 4); } @@ -17341,6 +17347,11 @@ arm_reorg (void) if (!optimize) split_all_insns_noflow (); + /* Make sure we do not attempt to create a literal pool even though it should + no longer be necessary to create any. */ + if (arm_disable_literal_pool) + return ; + minipool_fix_head = minipool_fix_tail = NULL; /* The first insn must always be a note, or the code below won't diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 21cfe3a4c31..f9365cde504 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -233,10 +233,6 @@ (match_test "arm_restrict_it")) (const_string "no") - (and (eq_attr "use_literal_pool" "yes") - (match_test "arm_disable_literal_pool")) - (const_string "no") - (eq_attr "arch_enabled" "no") (const_string "no")] (const_string "yes"))) @@ -5878,8 +5874,9 @@ (match_operand:ANY64 1 "immediate_operand" ""))] "TARGET_32BIT && reload_completed - && (arm_const_double_inline_cost (operands[1]) - <= arm_max_const_double_inline_cost ())" + && (arm_disable_literal_pool + || (arm_const_double_inline_cost (operands[1]) + <= arm_max_const_double_inline_cost ()))" [(const_int 0)] " arm_split_constant (SET, SImode, curr_insn, diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md index befdea9edd9..d8f77e2ffe4 100644 --- a/gcc/config/arm/vfp.md +++ b/gcc/config/arm/vfp.md @@ -2079,3 +2079,40 @@ ;; fmdhr et al (VFPv1) ;; Support for xD (single precision only) variants. ;; fmrrs, fmsrr + +;; Split an immediate DF move to two immediate SI moves. +(define_insn_and_split "no_literal_pool_df_immediate" + [(set (match_operand:DF 0 "s_register_operand" "") + (match_operand:DF 1 "const_double_operand" ""))] + "TARGET_THUMB2 && arm_disable_literal_pool + && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE + && vfp3_const_double_rtx (operands[1]))" + "#" + "&& !reload_completed" + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) + (set (subreg:SI (match_dup 1) 4) (match_dup 3)) + (set (match_dup 0) (match_dup 1))] + " + long buf[2]; + real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode); + operands[2] = GEN_INT ((int) buf[0]); + operands[3] = GEN_INT ((int) buf[1]); + operands[1] = gen_reg_rtx (DFmode); + ") + +;; Split an immediate SF move to one immediate SI move. +(define_insn_and_split "no_literal_pool_sf_immediate" + [(set (match_operand:SF 0 "s_register_operand" "") + (match_operand:SF 1 "const_double_operand" ""))] + "TARGET_THUMB2 && arm_disable_literal_pool + && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))" + "#" + "&& !reload_completed" + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) + (set (match_dup 0) (match_dup 1))] + " + long buf; + real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode); + operands[2] = GEN_INT ((int) buf); + operands[1] = gen_reg_rtx (SFmode); + ") diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c similarity index 100% rename from gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c rename to gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c new file mode 100644 index 00000000000..6e76043daee --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-march=armv7e-m -mfloat-abi=hard -O2 -mthumb -mslow-flash-data" } */ + +float f (float); + +const float max = 0.01f; + +int +g (float in) +{ + if (f (in) + f (in) < max) + return 0; + return 1; +} + +double foo (void) +{ + return 0xF1EC7A5239123AF; +} + +double bar (void) +{ + return 0.0f; +} diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c new file mode 100644 index 00000000000..fe7a12bf99b --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-march=armv7e-m -mfloat-abi=hard -mthumb -mslow-flash-data" } */ + +/* From PR71607 */ + +float b; +void fn1 (); + +float +fn2 () +{ + return 1.1f; +} + +void +fn3 () +{ + float a[2]; + a[1] = b; + fn1 (a); +} diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c new file mode 100644 index 00000000000..cc5aea4437f --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-march=armv7e-m -mfloat-abi=hard -O2 -mthumb -mslow-flash-data" } */ + +double __attribute__ ((target ("fpu=fpv5-d16"))) +foo (void) +{ + return 1.0f; +} + +float __attribute__ ((target ("fpu=fpv5-d16"))) +bar (void) +{ + return 1.0f; +} + +float __attribute__ ((target ("fpu=fpv5-sp-d16"))) +baz (void) +{ + return 1.0f; +} + +/* { dg-final { scan-assembler-times "#1\\.0e\\+0" 3 } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c new file mode 100644 index 00000000000..b9161c4a4b4 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-march=armv7e-m -mfloat-abi=hard -O2 -mthumb -mslow-flash-data" } */ + +double __attribute__ ((target ("fpu=fpv5-sp-d16"))) +foo (void) +{ + return 1.0f; +} + +/* { dg-final { scan-assembler-not "#1\\.0e\\+0" } } */ diff --git a/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c b/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c new file mode 100644 index 00000000000..283201fdd97 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target tls_native } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-options "-mslow-flash-data" } */ + +__thread int x = 0; + +int +bar () +{ + return x; /* { dg-message "sorry, unimplemented: accessing thread-local storage is not currently supported with -mpure-code or -mslow-flash-data" } */ +} +