Message ID | 528B9527.2050800@arm.com |
---|---|
State | New |
Headers | show |
On Tue, Nov 19, 2013 at 5:43 PM, Richard Earnshaw wrote: > The test is unfortunately ARM specific -- I'm not aware of any generic > code that triggers this. Isn't this an insn constraint issue, then? I'd expect recog to reject the pattern if the right constraints are used. Ciao! Steven
On Tue, Nov 19, 2013 at 5:43 PM, Richard Earnshaw wrote: > PR 54300 is a problem in regcprop where the compiler sees > (parallel [(set (x) (y) > (set (y) (x)]) (REG_UNUSED (y)) > > as a single-set insn (since the other operand, y, is not used) and > replaces a use of x with a use of y. However, it fails to take into > account that y has been clobbered in the insn itself. Ah, wait. Incorrect use of single_set(). Try ------ } - set = single_set (insn); + set = multiple_sets (insn) ? NULL_RTX : single_set (insn); extract_insn (insn); if (! constrain_operands (1)) fatal_insn_not_found (insn); preprocess_constraints (); ------ Ciao! Steven
On 19/11/13 17:17, Steven Bosscher wrote: > On Tue, Nov 19, 2013 at 5:43 PM, Richard Earnshaw wrote: >> PR 54300 is a problem in regcprop where the compiler sees >> (parallel [(set (x) (y) >> (set (y) (x)]) (REG_UNUSED (y)) >> >> as a single-set insn (since the other operand, y, is not used) and >> replaces a use of x with a use of y. However, it fails to take into >> account that y has been clobbered in the insn itself. > > Ah, wait. Incorrect use of single_set(). > > Try > ------ > } > > - set = single_set (insn); > + set = multiple_sets (insn) ? NULL_RTX : single_set (insn); > extract_insn (insn); > if (! constrain_operands (1)) > fatal_insn_not_found (insn); > preprocess_constraints (); > ------ > > Ciao! > Steven > That won't handle parallel [ (set (x) (y)) (clobber (y))] R.
On Tue, Nov 19, 2013 at 6:22 PM, Richard Earnshaw wrote: >>> as a single-set insn (since the other operand, y, is not used) and >>> replaces a use of x with a use of y. However, it fails to take into >>> account that y has been clobbered in the insn itself. >> >> Ah, wait. Incorrect use of single_set(). >> >> Try >> ------ >> } >> >> - set = single_set (insn); >> + set = multiple_sets (insn) ? NULL_RTX : single_set (insn); >> extract_insn (insn); >> if (! constrain_operands (1)) >> fatal_insn_not_found (insn); >> preprocess_constraints (); >> ------ ... > That won't handle > > parallel [ (set (x) (y)) > (clobber (y))] Doesn't have to, AFAICT. y will be killed via kill_clobbered_value. Ciao! Steven
On 11/19/13 10:17, Steven Bosscher wrote: > On Tue, Nov 19, 2013 at 5:43 PM, Richard Earnshaw wrote: >> PR 54300 is a problem in regcprop where the compiler sees >> (parallel [(set (x) (y) >> (set (y) (x)]) (REG_UNUSED (y)) >> >> as a single-set insn (since the other operand, y, is not used) and >> replaces a use of x with a use of y. However, it fails to take into >> account that y has been clobbered in the insn itself. > > Ah, wait. Incorrect use of single_set(). > > Try > ------ > } > > - set = single_set (insn); > + set = multiple_sets (insn) ? NULL_RTX : single_set (insn); > extract_insn (insn); > if (! constrain_operands (1)) > fatal_insn_not_found (insn); > preprocess_constraints (); single_set is defined to be true if we have a parallel with multiple sets where all but one of those multiple sets are marked with REG_UNUSED. I don't remember the history here, but that may have been to better support divmod and similar insns. jeff
On Tue, Nov 19, 2013 at 6:27 PM, Jeff Law wrote: > I don't remember the history here, but that may have been to better support > divmod and similar insns. Yes. In the GCC3 days it was important for sincos on i386, and on mk68 it used to be important for some of the funnier patterns. Not sure if it's still useful today, though. Might be worth looking into, just to avoid the confusion in the future. There's been confusion about this before, where people assumed single_set really means "just one SET in this pattern". (ISTR fixing gcse.c's hash_scan_rtx for this at some point...?). But that's not the semantics of single_set. The proper test for "just one SET" is (!multiple_sets && single_set). At least, that's how I've always coded it... Ciao! Steven
On 11/19/13 10:32, Steven Bosscher wrote: > > Yes. In the GCC3 days it was important for sincos on i386, and on mk68 > it used to be important for some of the funnier patterns. Not sure if > it's still useful today, though. Might be worth looking into, just to > avoid the confusion in the future. I doubt it's changed all that much :-) > > There's been confusion about this before, where people assumed > single_set really means "just one SET in this pattern". (ISTR fixing > gcse.c's hash_scan_rtx for this at some point...?). But that's not the > semantics of single_set. Yes. And I'd expect confusion to continue :( Not sure if creating renaming to capture the actual semantics would help here. > > The proper test for "just one SET" is (!multiple_sets && single_set). > At least, that's how I've always coded it... Seems reasonable for those cases where you have to ensure there really is just one set. jeff
On 11/19/13 09:43, Richard Earnshaw wrote: > PR 54300 is a problem in regcprop where the compiler sees > (parallel [(set (x) (y) > (set (y) (x)]) (REG_UNUSED (y)) > > as a single-set insn (since the other operand, y, is not used) and > replaces a use of x with a use of y. However, it fails to take into > account that y has been clobbered in the insn itself. > > I considered changing single_set() to not return this case, but then > decided that would potentially cause missed optimization opportunities > in passes like combine which do know how to deal with cases like this. > > The fix consists of two parts: > a) Spotting the unused sets and ensuring that their values are killed in > the value chains > b) Disabling the simple-move optimization when we've killed something in a). > > The test is unfortunately ARM specific -- I'm not aware of any generic > code that triggers this. > > gcc/ > > PR rtl-optimization/54300 > * regcprop.c (copyprop_hardreg_forward_1): Ensure any unused > outputs in a single-set are killed from the value chains. > > gcc/testsuite: > > PR rtl-optimization/54300 > * gcc.target/arm/pr54300.C: New test. > > Bootstrapped/tested on x86_64 and tested on arm-eabi. This is good. Please install. Thanks, Jeff
On 19/11/13 17:48, Jeff Law wrote: > On 11/19/13 10:32, Steven Bosscher wrote: >> >> Yes. In the GCC3 days it was important for sincos on i386, and on mk68 >> it used to be important for some of the funnier patterns. Not sure if >> it's still useful today, though. Might be worth looking into, just to >> avoid the confusion in the future. > I doubt it's changed all that much :-) > >> >> There's been confusion about this before, where people assumed >> single_set really means "just one SET in this pattern". (ISTR fixing >> gcse.c's hash_scan_rtx for this at some point...?). But that's not the >> semantics of single_set. > Yes. And I'd expect confusion to continue :( Not sure if creating > renaming to capture the actual semantics would help here. > >> >> The proper test for "just one SET" is (!multiple_sets && single_set). >> At least, that's how I've always coded it... > Seems reasonable for those cases where you have to ensure there really > is just one set. > > > jeff > Provided we correctly note the other values that are killed, we can handle multiple sets safely. The one restriction we have to watch is where the dead set operations kill input values to the live set operation. I've committed my patch to trunk. I'll leave it to gestate a couple of days, but this is also needed on the active release branches as well. R.
On 20/11/13 13:57, Richard Earnshaw wrote: > On 19/11/13 17:48, Jeff Law wrote: >> On 11/19/13 10:32, Steven Bosscher wrote: >>> >>> Yes. In the GCC3 days it was important for sincos on i386, and on mk68 >>> it used to be important for some of the funnier patterns. Not sure if >>> it's still useful today, though. Might be worth looking into, just to >>> avoid the confusion in the future. >> I doubt it's changed all that much :-) >> >>> >>> There's been confusion about this before, where people assumed >>> single_set really means "just one SET in this pattern". (ISTR fixing >>> gcse.c's hash_scan_rtx for this at some point...?). But that's not the >>> semantics of single_set. >> Yes. And I'd expect confusion to continue :( Not sure if creating >> renaming to capture the actual semantics would help here. >> >>> >>> The proper test for "just one SET" is (!multiple_sets && single_set). >>> At least, that's how I've always coded it... >> Seems reasonable for those cases where you have to ensure there really >> is just one set. >> >> >> jeff >> > > Provided we correctly note the other values that are killed, we can > handle multiple sets safely. The one restriction we have to watch is > where the dead set operations kill input values to the live set operation. > > I've committed my patch to trunk. > > I'll leave it to gestate a couple of days, but this is also needed on > the active release branches as well. > Well, a bit more than a few days... 4.8 backport has now been applied. 4.7 should follow shortly. R.
Index: regcprop.c =================================================================== --- regcprop.c (revision 204974) +++ regcprop.c (working copy) @@ -747,6 +747,7 @@ copyprop_hardreg_forward_1 (basic_block int n_ops, i, alt, predicated; bool is_asm, any_replacements; rtx set; + rtx link; bool replaced[MAX_RECOG_OPERANDS]; bool changed = false; struct kill_set_value_data ksvd; @@ -815,6 +816,23 @@ copyprop_hardreg_forward_1 (basic_block if (recog_op_alt[i][alt].earlyclobber) kill_value (recog_data.operand[i], vd); + /* If we have dead sets in the insn, then we need to note these as we + would clobbers. */ + for (link = REG_NOTES (insn); link; link = XEXP (link, 1)) + { + if (REG_NOTE_KIND (link) == REG_UNUSED) + { + kill_value (XEXP (link, 0), vd); + /* Furthermore, if the insn looked like a single-set, + but the dead store kills the source value of that + set, then we can no-longer use the plain move + special case below. */ + if (set + && reg_overlap_mentioned_p (XEXP (link, 0), SET_SRC (set))) + set = NULL; + } + } + /* Special-case plain move instructions, since we may well be able to do the move from a different register class. */ if (set && REG_P (SET_SRC (set))) Index: testsuite/gcc.target/arm/pr54300.C =================================================================== --- testsuite/gcc.target/arm/pr54300.C (revision 0) +++ testsuite/gcc.target/arm/pr54300.C (revision 0) @@ -0,0 +1,61 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon } */ +/* { dg-options "-O2" } */ +/* { dg-add-options arm_neon } */ + +#include <arm_neon.h> +#include <stdlib.h> + +struct __attribute__ ((aligned(8))) _v16u8_ { + uint8x16_t val; + _v16u8_( const int16x8_t &src) { val = vreinterpretq_u8_s16(src); } + operator int16x8_t () const { return vreinterpretq_s16_u8(val); } +}; +typedef struct _v16u8_ v16u8; + +struct __attribute__ ((aligned(4))) _v8u8_ { + uint8x8_t val; + _v8u8_( const uint8x8_t &src) { val = src; } + operator int16x4_t () const { return vreinterpret_s16_u8(val); } +}; +typedef struct _v8u8_ v8u8; + +typedef v16u8 v8i16; +typedef int32x4_t v4i32; +typedef const short cv1i16; +typedef const unsigned char cv1u8; +typedef const v8i16 cv8i16; + +static inline __attribute__((always_inline)) v8u8 zero_64(){ return vdup_n_u8( 0 ); } + +static inline __attribute__((always_inline)) v8i16 loadlo_8i16( cv8i16* p ){ + return vcombine_s16( vld1_s16( (cv1i16 *)p ), zero_64() ); +} +static inline __attribute__((always_inline)) v8i16 _loadlo_8i16( cv8i16* p, int offset ){ + return loadlo_8i16( (cv8i16*)(&((cv1u8*)p)[offset]) ); +} + +void __attribute__((noinline)) +test(unsigned short *_Inp, int32_t *_Out, + unsigned int s1v, unsigned int dv0, + unsigned int smask_v) +{ + int32x4_t c = vdupq_n_s32(0); + + for(unsigned int sv=0 ; sv!=dv0 ; sv=(sv+s1v)&smask_v ) + { + int32x4_t s; + s = vmovl_s16( vget_low_s16( _loadlo_8i16( (cv8i16*) _Inp, sv ) ) ); + c = vaddq_s32( c, s ); + } + vst1q_s32( _Out, c ); +} + +main() +{ + unsigned short a[4] = {1, 2, 3, 4}; + int32_t b[4] = {0, 0, 0, 0}; + test(a, b, 1, 1, ~0); + if (b[0] != 1 || b[1] != 2 || b[2] != 3 || b[3] != 4) + abort(); +}