Message ID: 00854266aef1cbae6d3620f78122e6131c37b73c.1431268135.git.segher@kernel.crashing.org
State: New
On 05/10/2015 10:13 AM, Segher Boessenkool wrote:
> Combine has its own ideas of what is "canonical" RTL, forcing all
> backends to have special patterns in their machine description for the
> "more simplified" patterns combine often creates, even though the
> backend already has patterns for a more general form.  Backends that
> do not implement those patterns get less well optimised code.
>
> This patch lifts that burden for two cases: combine often converts
> an AND (with, say, 0xff) to a ZERO_EXTEND of a SUBREG; and an LSHIFTRT
> followed by an AND to a ZERO_EXTRACT.  This is perfectly helpful for
> e.g. MEMs, but not nice if you have instructions to do more generic
> masking (like PowerPC rlwinm, and similar on some other archs).
>
> With this patch, if recog_for_combine fails, and there are any
> ZERO_EXT* in the pattern to be matched, it tries again with those
> expressed as AND etc.  If that also fails it rolls back the changes,
> because it might still match after e.g. splitting, and we want to
> try the ZERO_EXT* for that as well.
>
> Tested on powerpc-linux, before and after removing many patterns
> from the machine description, and checked that the only changes in
> the bootstrapped compiler are new and removed functions.
>
> I'll also test on x86_64-linux before committing.
>
>
> Segher
>
>
> 2015-05-10  Segher Boessenkool  <segher@kernel.crashing.org>
>
> 	* combine.c (recog_for_combine_1): New function, factored out
> 	from recog_for_combine.
> 	(change_zero_ext): New function.
> 	(recog_for_combine): If recog fails, try again with the pattern
> 	modified by change_zero_ext; if that still fails, restore the
> 	pattern.

I like it.  The extensions are the most obvious candidates to attack, but I wonder if there are others (like the whole "ASHIFT vs MULT" stuff that we were recently looking at for ARM).

jeff
On Sun, May 10, 2015 at 10:15:34PM -0600, Jeff Law wrote:
> > 	(recog_for_combine): If recog fails, try again with the pattern
> > 	modified by change_zero_ext; if that still fails, restore the
> > 	pattern.
> I like it.  Attacking the extensions are the most obvious candidates,
> but I wonder if there's others (like the whole "ASHIFT vs MULT" stuff
> that we were recently looking at for ARM).

Yeah, I thought about that as well.  But that case, MULT instead of a shift outside of a MEM, is simply non-canonical RTL; combine creates it on purpose, and it just shouldn't.

There will certainly be other cases, though; hopefully not overlapping ones, since I don't want to recog an exponentially expanding number of times ;-)

Segher
Hi Segher,

On 10/05/15 17:13, Segher Boessenkool wrote:
> Combine has its own ideas of what is "canonical" RTL, forcing all
> backends to have special patterns in their machine description for the
> "more simplified" patterns combine often creates, even though the
> backend already has patterns for a more general form.  Backends that
> do not implement those patterns get less well optimised code.
>
> This patch lifts that burden for two cases: combine often converts
> an AND (with, say, 0xff) to a ZERO_EXTEND of a SUBREG; and an LSHIFTRT
> followed by an AND to a ZERO_EXTRACT.  This is perfectly helpful for
> e.g. MEMs, but not nice if you have instructions to do more generic
> masking (like PowerPC rlwinm, and similar on some other archs).
>
> With this patch, if recog_for_combine fails, and there are any
> ZERO_EXT* in the pattern to be matched, it tries again with those
> expressed as AND etc.  If that also fails it rolls back the changes,
> because it might still match after e.g. splitting, and we want to
> try the ZERO_EXT* for that as well.
>
> Tested on powerpc-linux, before and after removing many patterns
> from the machine description, and checked that the only changes in
> the bootstrapped compiler are new and removed functions.

Does this patch mean we can remove any patterns in the backend that look like:

- [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
-       (zero_extract:SI (match_operand:SI 1 "gpc_reg_operand" "r")
-                        (match_operand:SI 2 "const_int_operand" "i")
-                        (match_operand:SI 3 "const_int_operand" "i")))]

as long as we have an equivalent and-with-mask pattern?

Thanks,
Kyrill

> I'll also test on x86_64-linux before committing.
>
>
> Segher
>
>
> 2015-05-10  Segher Boessenkool  <segher@kernel.crashing.org>
>
> 	* combine.c (recog_for_combine_1): New function, factored out
> 	from recog_for_combine.
> 	(change_zero_ext): New function.
> 	(recog_for_combine): If recog fails, try again with the pattern
> 	modified by change_zero_ext; if that still fails, restore the
> 	pattern.
>
> ---
>  gcc/combine.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 107 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 1e4d65e..896d9d2 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -10849,21 +10849,11 @@ simplify_shift_const (rtx x, enum rtx_code code, machine_mode result_mode,
>  }
>
>
> -/* Like recog, but we receive the address of a pointer to a new pattern.
> -   We try to match the rtx that the pointer points to.
> -   If that fails, we may try to modify or replace the pattern,
> -   storing the replacement into the same pointer object.
> -
> -   Modifications include deletion or addition of CLOBBERs.
> -
> -   PNOTES is a pointer to a location where any REG_UNUSED notes added for
> -   the CLOBBERs are placed.
> -
> -   The value is the final insn code from the pattern ultimately matched,
> -   or -1.  */
> +/* A subroutine of recog_for_combine.  See there for arguments and
> +   return value.  */
>
>  static int
> -recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
> +recog_for_combine_1 (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
>  {
>    rtx pat = *pnewpat;
>    rtx pat_without_clobbers;
> @@ -11010,6 +11000,110 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
>
>    return insn_code_number;
>  }
> +
> +/* Change every ZERO_EXTRACT and ZERO_EXTEND of a SUBREG that can be
> +   expressed as an AND and maybe an LSHIFTRT, to that formulation.
> +   Return whether anything was so changed.  */
> +
> +static bool
> +change_zero_ext (rtx *src)
> +{
> +  bool changed = false;
> +
> +  subrtx_ptr_iterator::array_type array;
> +  FOR_EACH_SUBRTX_PTR (iter, array, src, NONCONST)
> +    {
> +      rtx x = **iter;
> +      machine_mode mode = GET_MODE (x);
> +      int size;
> +
> +      if (GET_CODE (x) == ZERO_EXTRACT
> +          && CONST_INT_P (XEXP (x, 1))
> +          && CONST_INT_P (XEXP (x, 2))
> +          && GET_MODE (XEXP (x, 0)) == mode)
> +        {
> +          size = INTVAL (XEXP (x, 1));
> +
> +          int start = INTVAL (XEXP (x, 2));
> +          if (BITS_BIG_ENDIAN)
> +            start = GET_MODE_PRECISION (mode) - size - start;
> +
> +          x = gen_rtx_LSHIFTRT (mode, XEXP (x, 0), GEN_INT (start));
> +        }
> +      else if (GET_CODE (x) == ZERO_EXTEND
> +               && GET_CODE (XEXP (x, 0)) == SUBREG
> +               && GET_MODE (SUBREG_REG (XEXP (x, 0))) == mode
> +               && subreg_lowpart_p (XEXP (x, 0)))
> +        {
> +          size = GET_MODE_PRECISION (GET_MODE (XEXP (x, 0)));
> +          x = SUBREG_REG (XEXP (x, 0));
> +        }
> +      else
> +        continue;
> +
> +      unsigned HOST_WIDE_INT mask = 1;
> +      mask <<= size;
> +      mask--;
> +
> +      x = gen_rtx_AND (mode, x, GEN_INT (mask));
> +
> +      SUBST (**iter, x);
> +      changed = true;
> +    }
> +
> +  return changed;
> +}
> +
> +/* Like recog, but we receive the address of a pointer to a new pattern.
> +   We try to match the rtx that the pointer points to.
> +   If that fails, we may try to modify or replace the pattern,
> +   storing the replacement into the same pointer object.
> +
> +   Modifications include deletion or addition of CLOBBERs.  If the
> +   instruction will still not match, we change ZERO_EXTEND and ZERO_EXTRACT
> +   to the equivalent AND and perhaps LSHIFTRT patterns, and try with that
> +   (and undo if that fails).
> +
> +   PNOTES is a pointer to a location where any REG_UNUSED notes added for
> +   the CLOBBERs are placed.
> +
> +   The value is the final insn code from the pattern ultimately matched,
> +   or -1.  */
> +
> +static int
> +recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
> +{
> +  rtx pat = PATTERN (insn);
> +  int insn_code_number = recog_for_combine_1 (pnewpat, insn, pnotes);
> +  if (insn_code_number >= 0 || check_asm_operands (pat))
> +    return insn_code_number;
> +
> +  void *marker = get_undo_marker ();
> +  bool changed = false;
> +
> +  if (GET_CODE (pat) == SET)
> +    changed = change_zero_ext (&SET_SRC (pat));
> +  else if (GET_CODE (pat) == PARALLEL)
> +    {
> +      int i;
> +      for (i = 0; i < XVECLEN (pat, 0); i++)
> +        {
> +          rtx set = XVECEXP (pat, 0, i);
> +          if (GET_CODE (set) == SET)
> +            changed |= change_zero_ext (&SET_SRC (set));
> +        }
> +    }
> +
> +  if (changed)
> +    {
> +      insn_code_number = recog_for_combine_1 (pnewpat, insn, pnotes);
> +
> +      if (insn_code_number < 0)
> +        undo_to_marker (marker);
> +    }
> +
> +  return insn_code_number;
> +}
>
>  /* Like gen_lowpart_general but for use by combine.  In combine it
>     is not possible to create any new pseudoregs.  However, it is
On Tue, May 12, 2015 at 09:20:18AM +0100, Kyrill Tkachov wrote:
> Does this patch mean we can remove any patterns in
> the backend that look like:
>
> - [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
> -       (zero_extract:SI (match_operand:SI 1 "gpc_reg_operand" "r")
> -                        (match_operand:SI 2 "const_int_operand" "i")
> -                        (match_operand:SI 3 "const_int_operand" "i")))]
>
> as long as we have an equivalent and-with-mask pattern?

Yes, exactly.  But you can also keep them; combine will find either form.  Older / CISC targets will usually want the zero_ext* patterns, while newer / RISC targets will more often write something as an AND (and LSHIFTRT).  You can also have both in the same target: for example, many targets will want to write their plain zero_extend patterns as just that, because you want to allow both mem and reg in the same pattern, for reload.

Segher