Message ID: CAJK_mQ3R90TsoKhoA5WMkPjn31UO8bkqc5Bwhaj9t+SyQ982Cg@mail.gmail.com
State: New
On 01/14/15 04:27, Venkataramanan Kumar wrote:
> Hi all,
>
> When trying to debug the GCC combiner pass with the test case in PR63949
> (ref https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63949) I came across
> this code.
>
> This code in "make_compound_operation" assumes that all PLUS and MINUS
> RTXes are of "MEM" type for scalar int modes, and tries to optimize based
> on that assumption.
>
>   /* Select the code to be used in recursive calls.  Once we are inside an
>      address, we stay there.  If we have a comparison, set to COMPARE,
>      but once inside, go back to our default of SET.  */
>
>   next_code = (code == MEM ? MEM
>                : ((code == PLUS || code == MINUS)
>                   && SCALAR_INT_MODE_P (mode)) ? MEM
>                : ((code == COMPARE || COMPARISON_P (x))
>                   && XEXP (x, 1) == const0_rtx) ? COMPARE
>                : in_code == COMPARE ? SET : in_code);
>
> next_code is passed as in_code via recursive calls to
> "make_compound_operation".  Based on that we convert shift patterns to
> MULT patterns:
>
>   case ASHIFT:
>     /* Convert shifts by constants into multiplications if inside
>        an address.  */
>     if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
>         && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
>         && INTVAL (XEXP (x, 1)) >= 0
>         && SCALAR_INT_MODE_P (mode))
>       {
>
> Now I tried to tighten it further by adding a check that in_code is also
> MEM.  Not sure if this is the right thing to do, but this MEM assumption
> seems to have been very relaxed up to now.
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 101cf35..1353f54 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -7696,7 +7696,8 @@ make_compound_operation (rtx x, enum rtx_code in_code)
>
>    next_code = (code == MEM ? MEM
>                 : ((code == PLUS || code == MINUS)
> -                  && SCALAR_INT_MODE_P (mode)) ? MEM
> +                  && SCALAR_INT_MODE_P (mode)
> +                  && (in_code == MEM)) ? MEM
>                 : ((code == COMPARE || COMPARISON_P (x))
>                    && XEXP (x, 1) == const0_rtx) ? COMPARE
>                 : in_code == COMPARE ? SET : in_code);
>
> This passed bootstrap on x86_64 and the GCC tests are not regressing.
> On AArch64 it passed bootstrap, and the test case in the PR passes, but a
> few tests failed (we fail to generate adds and subs), because the AArch64
> backend only has patterns for extended adds and subs based on
> multiplication.
>
> If this change is correct, then I may need to add shift-based patterns to
> AArch64.  Not sure about other targets either.
>
> Requesting further comments/help on this.
>
> I am looking to get it fixed in stage 1.

So the first question I would ask here is what precisely are you trying
to accomplish?  Is there some code where making this change is important,
or is it strictly a theoretical problem?  If the latter, can we make it
concrete with a well crafted testcase?

jeff
On 01/14/2015 03:27 AM, Venkataramanan Kumar wrote:
>    next_code = (code == MEM ? MEM
>                 : ((code == PLUS || code == MINUS)
> -                  && SCALAR_INT_MODE_P (mode)) ? MEM
> +                  && SCALAR_INT_MODE_P (mode)
> +                  && (in_code == MEM)) ? MEM
>                 : ((code == COMPARE || COMPARISON_P (x))
>                    && XEXP (x, 1) == const0_rtx) ? COMPARE
>                 : in_code == COMPARE ? SET : in_code);

Isn't this change the same as simply deleting the condition?  If we're
testing in_code == MEM, isn't that the same as just returning in_code,
as the last condition does?

Seconded Law's request for more information...

r~
Hi Jeff and Richard,

On 15 January 2015 at 03:10, Jeff Law <law@redhat.com> wrote:
> On 01/14/15 04:27, Venkataramanan Kumar wrote:
>> [... full original message quoted ...]
>
> So the first question I would ask here is what precisely are you trying to
> accomplish?  Is there some code where making this change is important or is
> it strictly a theoretical problem?  If the latter, can we make it concrete
> with a well crafted testcase?
>
> jeff

The test case which I am working on is from PR63949:

  int subsi_sxth (int a, short i)
  {
    /* { dg-final { scan-assembler "sub\tw\[0-9\]+,.*sxth #?1" } } */
    return a - ((int)i << 1);
  }

The expression "a - ((int)i << 1)" is not a memory expression, but the
combiner assumes the MINUS RTX is a memory address and processes all of
its subexpressions under that assumption.

While processing ((int)i << 1), the combiner first hits this code:

  /* See if we have operations between an ASHIFTRT and an ASHIFT.
     If so, try to merge the shifts into a SIGN_EXTEND.  We could also
     do this for some cases of SIGN_EXTRACT, but it doesn't seem worth
     the effort; the case checked for occurs on Alpha.  */

  if (!OBJECT_P (lhs)
      && ! (GET_CODE (lhs) == SUBREG
            && (OBJECT_P (SUBREG_REG (lhs))))
      && CONST_INT_P (rhs)
      && INTVAL (rhs) < HOST_BITS_PER_WIDE_INT
      && INTVAL (rhs) < mode_width
      && (new_rtx = extract_left_shift (lhs, INTVAL (rhs))) != 0)
    new_rtx = make_extraction (mode,
                               make_compound_operation (new_rtx, next_code),
                               0, NULL_RTX, mode_width - INTVAL (rhs),
                               code == LSHIFTRT, 0, in_code == COMPARE);

The input RTX is

  (ashiftrt:SI (ashift:SI (reg:SI 1 x1 [ i ])
                          (const_int 16 [0x10]))
               (const_int 15 [0xf]))

and extract_left_shift generates

  (ashift:SI (reg:SI 1 x1 [ i ]) (const_int 16 [0x10]))

Before we call make_extraction, there is another call to
"make_compound_operation", which converts the ASHIFT to a MULT RTX:

  switch (code)
    {
    case ASHIFT:
      /* Convert shifts by constants into multiplications if inside
         an address.  */
      if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
          && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
          && INTVAL (XEXP (x, 1)) >= 0
          && SCALAR_INT_MODE_P (mode))
        {

After extraction, what we get is a SIGN_EXTRACT and an outer SUBREG
expression.  The combiner says:

  Failed to match this instruction:
  (set (reg/i:SI 0 x0)
       (minus:SI (reg:SI 0 x0 [ a ])
                 (subreg:SI (sign_extract:DI (mult:DI (reg:DI 1 x1 [ i ])
                                                      (const_int 2 [0x2]))
                                             (const_int 17 [0x11])
                                             (const_int 0 [0])) 0)))

But when we don't convert the ASHIFT to a MULT, there is code in
make_extraction that works on the ASHIFT RTX:

  else if (GET_CODE (inner) == ASHIFT
           && CONST_INT_P (XEXP (inner, 1))
           && pos_rtx == 0 && pos == 0
           && len > UINTVAL (XEXP (inner, 1)))
    {
      /* We're extracting the least significant bits of an rtx
         (ashift X (const_int C)), where LEN > C.  Extract the
         least significant (LEN - C) bits of X, giving an rtx
         whose mode is MODE, then shift it left C times.  */
      new_rtx = make_extraction (mode, XEXP (inner, 0),
                                 0, 0, len - INTVAL (XEXP (inner, 1)),
                                 unsignedp, in_dest, in_compare);

and the combiner then reports:

  Successfully matched this instruction:
  (set (reg/i:SI 0 x0)
       (minus:SI (reg:SI 0 x0 [ a ])
                 (ashift:SI (sign_extend:SI (reg:HI 1 x1 [ i ]))
                            (const_int 1 [0x1]))))

So the MEM assumption on MINUS and PLUS RTXes seems wrong to me.
Please let me know your suggestions on this.

regards,
Venkat.
On 01/16/2015 12:13 AM, Venkataramanan Kumar wrote:
> The test case which I am working on is from PR63949:
>
>   int subsi_sxth (int a, short i)
>   {
>     /* { dg-final { scan-assembler "sub\tw\[0-9\]+,.*sxth #?1" } } */
>     return a - ((int)i << 1);
>   }
>
> The expression "a - ((int)i << 1)" is not a memory expression, but the
> combiner assumes the MINUS RTX is a memory address and processes all of
> its subexpressions under that assumption.

Coming back to this old thread.

And presumably when it treats this expression as a MEM, we then do the
usual canonicalizations of turning ASHIFT into MULT when we wanted to
keep the code as an ASHIFT, right?

The code in question comes from 1992:

  Fri Jun 26 07:06:19 1992  Richard Kenner  (kenner@vlsi1.ultra.nyu.edu)

  [ ... ]

  * combine.c (make_compound_operation): Treat PLUS and MINUS
    the same when passing down the code to the next level; for
    consistency, an ASHIFT inside either gets turned into a MULT.

Sadly, in that era of development, we didn't require nearly as many
testcases as we do now.  Similarly, patches didn't have to be posted
with a rationale.  I've checked my gcc2 archives just to be sure, and
there's nothing about this patch in them.

From reading the actual patch Kenner installed, I'm confident this was
not a correctness issue, but instead an attempt to improve the quality
of the combiner output.

So I think you've got a couple of paths forward.  First, you could
remove the PLUS/MINUS special casing and then verify that nothing
changes across a few targets (x86_64, ppc64, AArch64 would be my
recommendations).  And by nothing changes, I mean the assembly
before/after is the same for all the files in a bootstrap.  It's a
fairly high hurdle.

Alternately, you could find a way to accurately track whether we're
inside a MEM, where we want to canonicalize things slightly
differently.  Once we can accurately track that, we no longer need the
hack for PLUS/MINUS.

Jeff
Jeff Law <law@redhat.com> writes:
> [...]
>
> Alternately, you could find a way to accurately track whether we're
> inside a MEM, where we want to canonicalize things slightly
> differently.  Once we can accurately track that, we no longer need the
> hack for PLUS/MINUS.

A third, even higher hurdle would be to remove the idea that the
canonical form of an expression depends on whether it's in a MEM or
not.  It just causes confusion when an lea is created from an existing
MEM address.

:-)

Richard
On Wed, May 06, 2015 at 08:32:13AM +0100, Richard Sandiford wrote:
> > Alternately, you could find a way to accurately track if we're inside a
> > MEM, where we want to canonicalize things slightly differently.  Once we
> > can accurately track if we're inside a MEM, then we no longer have to
> > have the hack for PLUS/MINUS.
>
> A third even higher hurdle would be to remove the idea that the
> canonical form of an expression depends on whether it's in a MEM
> or not.  It just causes confusion when an lea is created from an
> existing MEM address.
> :-)

Fourth: whip combine into shape until it no longer has its special
"more strict" ideas of what is canonical.

Fifth (higher, higher!): relax some of the ad-hoc "canonicalisation"
rules, and make recog recognise *both* forms (or more than two, in some
cases).

Segher
diff --git a/gcc/combine.c b/gcc/combine.c
index 101cf35..1353f54 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7696,7 +7696,8 @@ make_compound_operation (rtx x, enum rtx_code in_code)
 
   next_code = (code == MEM ? MEM
                : ((code == PLUS || code == MINUS)
-                  && SCALAR_INT_MODE_P (mode)) ? MEM
+                  && SCALAR_INT_MODE_P (mode)
+                  && (in_code == MEM)) ? MEM
                : ((code == COMPARE || COMPARISON_P (x))
                   && XEXP (x, 1) == const0_rtx) ? COMPARE
                : in_code == COMPARE ? SET : in_code);