Message ID | CABu31nPu+1QxCQDv71C=3V_qS61B3GkAd37Eh7WTZMqBTann_w@mail.gmail.com |
---|---|
State | New |
Headers | show |
Steven Bosscher <stevenb.gcc@gmail.com> wrote: >Hello, > >This patch is necessary to make ARM pass the test suite with LRA >enabled. The symptom is recog failing to recognize a store_minmaxsi >insn, see: > http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html > >But I am not sure if that's also the root cause of the problem, or >whether the ARM back end should not let recognition of insn patterns >be dependent on the state of the profile flags. > >The pattern for store_minmaxsi (in arm.md) is: > >(define_insn "*store_minmaxsi" > [(set (match_operand:SI 0 "memory_operand" "=m") > (match_operator:SI 3 "minmax_operator" > [(match_operand:SI 1 "s_register_operand" "r") > (match_operand:SI 2 "s_register_operand" "r")])) > (clobber (reg:CC CC_REGNUM))] > "TARGET_32BIT && optimize_insn_for_size_p()" > "* > operands[3] = gen_rtx_fmt_ee (minmax_code (operands[3]), SImode, > operands[1], operands[2]); > output_asm_insn (\"cmp\\t%1, %2\", operands); > if (TARGET_THUMB2) > output_asm_insn (\"ite\t%d3\", operands); > output_asm_insn (\"str%d3\\t%1, %0\", operands); > output_asm_insn (\"str%D3\\t%2, %0\", operands); > return \"\"; > " > [(set_attr "conds" "clob") > (set (attr "length") > (if_then_else (eq_attr "is_thumb" "yes") > (const_int 14) > (const_int 12))) > (set_attr "type" "store1")] >) > > >Note the insn condition uses optimize_insn_for_size_p(). This means >the pattern can be valid or invalid dependent on the context where the >insn appears: in hot or cold code. IMHO this behavior should not be >allowed. The back end cannot expect the middle end to know at all >times the context that the insn appears in, and more importantly >whether a pattern is valid or not is independent of where the insn >appears: That is a *cost* issue! > >It seems to me that the ARM back end should be fixed here, not LRA's >check_rtl. > >Comments&thoughts? The intent is to allow this also to control combine and split, but certainly making insns invalid after the fact is bad. think of sinking a previously hot insn into a cold path... Honza, how was this supposed to work? Richard. >Ciao! >Steven
Hi, I'm agree. I looked at the ARM backend and it occurs that the usage of optimize_insn_for_size_p() was added to only use store_minmax in cold path because of some performance issue. But in any case its usage doesn't shrink the number of instruction, if we are in ARM mode 3 are needed : 1 comparison, 1 conditional move and 1 store in O1, O2 and O3 and 1 comparison and 2 conditional sotres in Os : cmp ip, r0 movlt r0, ip str r0, [lr, r3] or cmp r5, r4 strle r5, [lr, r3] strgt r4, [lr, r3] and in Thumb mode 4 are needed in both cases because of the it insertion. Thus I think that this insn can be removed. Is it ok for you ? Thanks Yvan On 10 November 2013 14:17, Richard Biener <richard.guenther@gmail.com> wrote: > Steven Bosscher <stevenb.gcc@gmail.com> wrote: >>Hello, >> >>This patch is necessary to make ARM pass the test suite with LRA >>enabled. The symptom is recog failing to recognize a store_minmaxsi >>insn, see: >> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html >> >>But I am not sure if that's also the root cause of the problem, or >>whether the ARM back end should not let recognition of insn patterns >>be dependent on the state of the profile flags. >> >>The pattern for store_minmaxsi (in arm.md) is: >> >>(define_insn "*store_minmaxsi" >> [(set (match_operand:SI 0 "memory_operand" "=m") >> (match_operator:SI 3 "minmax_operator" >> [(match_operand:SI 1 "s_register_operand" "r") >> (match_operand:SI 2 "s_register_operand" "r")])) >> (clobber (reg:CC CC_REGNUM))] >> "TARGET_32BIT && optimize_insn_for_size_p()" >> "* >> operands[3] = gen_rtx_fmt_ee (minmax_code (operands[3]), SImode, >> operands[1], operands[2]); >> output_asm_insn (\"cmp\\t%1, %2\", operands); >> if (TARGET_THUMB2) >> output_asm_insn (\"ite\t%d3\", operands); >> output_asm_insn (\"str%d3\\t%1, %0\", operands); >> output_asm_insn (\"str%D3\\t%2, %0\", operands); >> return \"\"; >> " >> [(set_attr "conds" "clob") >> (set (attr "length") >> (if_then_else (eq_attr "is_thumb" "yes") >> (const_int 14) >> (const_int 12))) >> (set_attr "type" "store1")] >>) >> >> >>Note the insn condition uses optimize_insn_for_size_p(). This means >>the pattern can be valid or invalid dependent on the context where the >>insn appears: in hot or cold code. IMHO this behavior should not be >>allowed. The back end cannot expect the middle end to know at all >>times the context that the insn appears in, and more importantly >>whether a pattern is valid or not is independent of where the insn >>appears: That is a *cost* issue! >> >>It seems to me that the ARM back end should be fixed here, not LRA's >>check_rtl. >> >>Comments&thoughts? > > The intent is to allow this also to control combine and split, but certainly making insns invalid after the fact is bad. think of sinking a previously hot insn into a cold path... > > Honza, how was this supposed to work? > > Richard. > >>Ciao! >>Steven > >
On 15/11/13 14:19, Yvan Roux wrote: > Hi, > > I'm agree. I looked at the ARM backend and it occurs that the usage > of optimize_insn_for_size_p() was added to only use store_minmax in > cold path because of some performance issue. But in any case its > usage doesn't shrink the number of instruction, if we are in ARM mode > 3 are needed : 1 comparison, 1 conditional move and 1 store in O1, O2 > and O3 and 1 comparison and 2 conditional sotres in Os : > > cmp ip, r0 > movlt r0, ip > str r0, [lr, r3] > Sometimes 4 will be needed, since both original register values may remain live. However, I'm inclined to agree that while it should be possible to decide at the *function* level whether or not an insn is valid, doing so at the block level is probably unsafe. R. > or > > cmp r5, r4 > strle r5, [lr, r3] > strgt r4, [lr, r3] > > and in Thumb mode 4 are needed in both cases because of the it > insertion. Thus I think that this insn can be removed. Is it ok for > you ? > > Thanks > Yvan > > > On 10 November 2013 14:17, Richard Biener <richard.guenther@gmail.com> wrote: >> Steven Bosscher <stevenb.gcc@gmail.com> wrote: >>> Hello, >>> >>> This patch is necessary to make ARM pass the test suite with LRA >>> enabled. The symptom is recog failing to recognize a store_minmaxsi >>> insn, see: >>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html >>> >>> But I am not sure if that's also the root cause of the problem, or >>> whether the ARM back end should not let recognition of insn patterns >>> be dependent on the state of the profile flags. >>> >>> The pattern for store_minmaxsi (in arm.md) is: >>> >>> (define_insn "*store_minmaxsi" >>> [(set (match_operand:SI 0 "memory_operand" "=m") >>> (match_operator:SI 3 "minmax_operator" >>> [(match_operand:SI 1 "s_register_operand" "r") >>> (match_operand:SI 2 "s_register_operand" "r")])) >>> (clobber (reg:CC CC_REGNUM))] >>> "TARGET_32BIT && optimize_insn_for_size_p()" >>> "* >>> operands[3] = gen_rtx_fmt_ee (minmax_code (operands[3]), SImode, >>> operands[1], operands[2]); >>> output_asm_insn (\"cmp\\t%1, %2\", operands); >>> if (TARGET_THUMB2) >>> output_asm_insn (\"ite\t%d3\", operands); >>> output_asm_insn (\"str%d3\\t%1, %0\", operands); >>> output_asm_insn (\"str%D3\\t%2, %0\", operands); >>> return \"\"; >>> " >>> [(set_attr "conds" "clob") >>> (set (attr "length") >>> (if_then_else (eq_attr "is_thumb" "yes") >>> (const_int 14) >>> (const_int 12))) >>> (set_attr "type" "store1")] >>> ) >>> >>> >>> Note the insn condition uses optimize_insn_for_size_p(). This means >>> the pattern can be valid or invalid dependent on the context where the >>> insn appears: in hot or cold code. IMHO this behavior should not be >>> allowed. The back end cannot expect the middle end to know at all >>> times the context that the insn appears in, and more importantly >>> whether a pattern is valid or not is independent of where the insn >>> appears: That is a *cost* issue! >>> >>> It seems to me that the ARM back end should be fixed here, not LRA's >>> check_rtl. >>> >>> Comments&thoughts? >> >> The intent is to allow this also to control combine and split, but certainly making insns invalid after the fact is bad. think of sinking a previously hot insn into a cold path... >> >> Honza, how was this supposed to work? >> >> Richard. >> >>> Ciao! >>> Steven >> >> >
Index: lra.c =================================================================== --- lra.c (revision 204618) +++ lra.c (working copy) @@ -2022,26 +2022,32 @@ check_rtl (bool final_p) lra_assert (! final_p || reload_completed); FOR_EACH_BB (bb) - FOR_BB_INSNS (bb, insn) - if (NONDEBUG_INSN_P (insn) - && GET_CODE (PATTERN (insn)) != USE - && GET_CODE (PATTERN (insn)) != CLOBBER - && GET_CODE (PATTERN (insn)) != ASM_INPUT) - { - if (final_p) - { - extract_insn (insn); - lra_assert (constrain_operands (1)); - continue; - } - /* LRA code is based on assumption that all addresses can be - correctly decomposed. LRA can generate reloads for - decomposable addresses. The decomposition code checks the - correctness of the addresses. So we don't need to check - the addresses here. */ - if (insn_invalid_p (insn, false)) - fatal_insn_not_found (insn); - } + { + /* Whether an insn is recognized or not may depend on the hotness of + the insn: Some back ends use it as a layman's "enable" attribute. */ + rtl_profile_for_bb (bb); + + FOR_BB_INSNS (bb, insn) + if (NONDEBUG_INSN_P (insn) + && GET_CODE (PATTERN (insn)) != USE + && GET_CODE (PATTERN (insn)) != CLOBBER + && GET_CODE (PATTERN (insn)) != ASM_INPUT) + { + if (final_p) + { + extract_insn (insn); + lra_assert (constrain_operands (1)); + continue; + } + /* LRA code is based on assumption that all addresses can be + correctly decomposed. LRA can generate reloads for + decomposable addresses. The decomposition code checks the + correctness of the addresses. So we don't need to check + the addresses here. */ + if (insn_invalid_p (insn, false)) + fatal_insn_not_found (insn); + } + } } #endif /* #ifdef ENABLE_CHECKING */