diff mbox

[RFC] Tighten memory type assumption in RTL combiner pass.

Message ID CAJK_mQ3R90TsoKhoA5WMkPjn31UO8bkqc5Bwhaj9t+SyQ982Cg@mail.gmail.com
State New
Headers show

Commit Message

Venkataramanan Kumar Jan. 14, 2015, 11:27 a.m. UTC
Hi all,

When trying to debug GCC combiner pass with the test case in PR63949
ref https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63949 I came across this code.

This code in "make_compound_operation" assumes that all PLUS and MINUS
RTX are "MEM" type for scalar int modes and tries to optimize based on
that assumption.

/* Select the code to be used in recursive calls.  Once we are inside an
      address, we stay there.  If we have a comparison, set to COMPARE,
      but once inside, go back to our default of SET.  */

   next_code = (code == MEM ? MEM
                : ((code == PLUS || code == MINUS)
                   && SCALAR_INT_MODE_P (mode)) ? MEM
                : ((code == COMPARE || COMPARISON_P (x))
                   && XEXP (x, 1) == const0_rtx) ? COMPARE
                : in_code == COMPARE ? SET : in_code);

 next_code is passed as in_code via recursive calls to
"make_compound_operation". Based on that we are converting shift
pattern to MULT pattern.

case ASHIFT:
      /* Convert shifts by constants into multiplications if inside
         an address.  */
      if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
          && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
          && INTVAL (XEXP (x, 1)) >= 0
          && SCALAR_INT_MODE_P (mode))
        {

Now I tried to tighten it further by adding check to see in_code is
also MEM type.
Not sure if this right thing to do.  But this assumption about MEM
seems to be very relaxed before.



This passed bootstrap on x86_64 and  GCC tests are not regressing.
On Aarch64 passed bootstrap tests, test case in PR passed, but few
tests failed (failed to generate adds and subs), because there are
patterns (extended adds and subs) based on multiplication only in
Aarch64 backend.

if this change is correct then I may need to add patterns in Aarch64
based on shifts. Not sure about targets also.

Requesting further comments/help about this.

I am looking to get it fixed in stage 1.

regards,
Venkat.

Comments

Jeff Law Jan. 14, 2015, 9:40 p.m. UTC | #1
On 01/14/15 04:27, Venkataramanan Kumar wrote:
> Hi all,
>
> When trying to debug GCC combiner pass with the test case in PR63949
> ref https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63949 I came across this code.
>
> This code in "make_compound_operation" assumes that all PLUS and MINUS
> RTX are "MEM" type for scalar int modes and tries to optimize based on
> that assumption.
>
> /* Select the code to be used in recursive calls.  Once we are inside an
>        address, we stay there.  If we have a comparison, set to COMPARE,
>        but once inside, go back to our default of SET.  */
>
>     next_code = (code == MEM ? MEM
>                  : ((code == PLUS || code == MINUS)
>                     && SCALAR_INT_MODE_P (mode)) ? MEM
>                  : ((code == COMPARE || COMPARISON_P (x))
>                     && XEXP (x, 1) == const0_rtx) ? COMPARE
>                  : in_code == COMPARE ? SET : in_code);
>
>   next_code is passed as in_code via recursive calls to
> "make_compound_operation". Based on that we are converting shift
> pattern to MULT pattern.
>
> case ASHIFT:
>        /* Convert shifts by constants into multiplications if inside
>           an address.  */
>        if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
>            && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
>            && INTVAL (XEXP (x, 1)) >= 0
>            && SCALAR_INT_MODE_P (mode))
>          {
>
> Now I tried to tighten it further by adding check to see in_code is
> also MEM type.
> Not sure if this right thing to do.  But this assumption about MEM
> seems to be very relaxed before.
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 101cf35..1353f54 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -7696,7 +7696,8 @@ make_compound_operation (rtx x, enum rtx_code in_code)
>
>     next_code = (code == MEM ? MEM
>                 : ((code == PLUS || code == MINUS)
> -                 && SCALAR_INT_MODE_P (mode)) ? MEM
> +                 && SCALAR_INT_MODE_P (mode)
> +                 && (in_code == MEM)) ? MEM
>                 : ((code == COMPARE || COMPARISON_P (x))
>                    && XEXP (x, 1) == const0_rtx) ? COMPARE
>                 : in_code == COMPARE ? SET : in_code);
>
>
> This passed bootstrap on x86_64 and  GCC tests are not regressing.
> On Aarch64 passed bootstrap tests, test case in PR passed, but few
> tests failed (failed to generate adds and subs), because there are
> patterns (extended adds and subs) based on multiplication only in
> Aarch64 backend.
>
> if this change is correct then I may need to add patterns in Aarch64
> based on shifts. Not sure about targets also.
>
> Requesting further comments/help about this.
>
> I am looking to get it fixed in stage 1.
So the first question I would ask here is what precisely are you trying 
to accomplish?  Is there some code where making this change is important 
or is it strictly a theoretical problem?  If the latter, can we make it 
concrete with a well crafted testcase?

jeff
Richard Henderson Jan. 14, 2015, 10:20 p.m. UTC | #2
On 01/14/2015 03:27 AM, Venkataramanan Kumar wrote:
>    next_code = (code == MEM ? MEM
>                : ((code == PLUS || code == MINUS)
> -                 && SCALAR_INT_MODE_P (mode)) ? MEM
> +                 && SCALAR_INT_MODE_P (mode)
> +                 && (in_code == MEM)) ? MEM
>                : ((code == COMPARE || COMPARISON_P (x))
>                   && XEXP (x, 1) == const0_rtx) ? COMPARE
>                : in_code == COMPARE ? SET : in_code);

Isn't this change the same as simply deleting the condition?

If we're testing in_code == MEM, isn't that the same as just
returning in_code, as the last condition does?

Seconded Law's request for more information...


r~
Venkataramanan Kumar Jan. 16, 2015, 7:13 a.m. UTC | #3
Hi jeff and Richard

On 15 January 2015 at 03:10, Jeff Law <law@redhat.com> wrote:
> On 01/14/15 04:27, Venkataramanan Kumar wrote:
>>
>> Hi all,
>>
>> When trying to debug GCC combiner pass with the test case in PR63949
>> ref https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63949 I came across this
>> code.
>>
>> This code in "make_compound_operation" assumes that all PLUS and MINUS
>> RTX are "MEM" type for scalar int modes and tries to optimize based on
>> that assumption.
>>
>> /* Select the code to be used in recursive calls.  Once we are inside an
>>        address, we stay there.  If we have a comparison, set to COMPARE,
>>        but once inside, go back to our default of SET.  */
>>
>>     next_code = (code == MEM ? MEM
>>                  : ((code == PLUS || code == MINUS)
>>                     && SCALAR_INT_MODE_P (mode)) ? MEM
>>                  : ((code == COMPARE || COMPARISON_P (x))
>>                     && XEXP (x, 1) == const0_rtx) ? COMPARE
>>                  : in_code == COMPARE ? SET : in_code);
>>
>>   next_code is passed as in_code via recursive calls to
>> "make_compound_operation". Based on that we are converting shift
>> pattern to MULT pattern.
>>
>> case ASHIFT:
>>        /* Convert shifts by constants into multiplications if inside
>>           an address.  */
>>        if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
>>            && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
>>            && INTVAL (XEXP (x, 1)) >= 0
>>            && SCALAR_INT_MODE_P (mode))
>>          {
>>
>> Now I tried to tighten it further by adding check to see in_code is
>> also MEM type.
>> Not sure if this right thing to do.  But this assumption about MEM
>> seems to be very relaxed before.
>>
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 101cf35..1353f54 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -7696,7 +7696,8 @@ make_compound_operation (rtx x, enum rtx_code
>> in_code)
>>
>>     next_code = (code == MEM ? MEM
>>                 : ((code == PLUS || code == MINUS)
>> -                 && SCALAR_INT_MODE_P (mode)) ? MEM
>> +                 && SCALAR_INT_MODE_P (mode)
>> +                 && (in_code == MEM)) ? MEM
>>                 : ((code == COMPARE || COMPARISON_P (x))
>>                    && XEXP (x, 1) == const0_rtx) ? COMPARE
>>                 : in_code == COMPARE ? SET : in_code);
>>
>>
>> This passed bootstrap on x86_64 and  GCC tests are not regressing.
>> On Aarch64 passed bootstrap tests, test case in PR passed, but few
>> tests failed (failed to generate adds and subs), because there are
>> patterns (extended adds and subs) based on multiplication only in
>> Aarch64 backend.
>>
>> if this change is correct then I may need to add patterns in Aarch64
>> based on shifts. Not sure about targets also.
>>
>> Requesting further comments/help about this.
>>
>> I am looking to get it fixed in stage 1.
>
> So the first question I would ask here is what precisely are you trying to
> accomplish?  Is there some code where making this change is important or is
> it strictly a theoretical problem?  If the latter, can we make it concrete
> with a well crafted testcase?
>
> jeff

The below test case which I am working on is from the PR63949

int   subsi_sxth (int a, short  i)
{
  /* { dg-final { scan-assembler "sub\tw\[0-9\]+,.*sxth #?1" } } */
  return a - ((int)i << 1);
}

The expression "a - ((int)i << 1)" is not a memory expression.
But combiner assumes that MINUS RTX as memory, and process all RTX
sub expressions with the memory assumption.

While processing ((int)i << 1) in the combiner, it first hits this code.

     /* See if we have operations between an ASHIFTRT and an ASHIFT.
         If so, try to merge the shifts into a SIGN_EXTEND.  We could
         also do this for some cases of SIGN_EXTRACT, but it doesn't
         seem worth the effort; the case checked for occurs on Alpha.  */

      if (!OBJECT_P (lhs)
          && ! (GET_CODE (lhs) == SUBREG
                && (OBJECT_P (SUBREG_REG (lhs))))
          && CONST_INT_P (rhs)
          && INTVAL (rhs) < HOST_BITS_PER_WIDE_INT
          && INTVAL (rhs) < mode_width
          && (new_rtx = extract_left_shift (lhs, INTVAL (rhs))) != 0)
        new_rtx = make_extraction
(mode,make_compound_operation(
new_rtx,next_code),
                               0, NULL_RTX, mode_width - INTVAL (rhs),
                               code == LSHIFTRT, 0, in_code == COMPARE);

The input RTX is

(ashiftrt:SI (ashift:SI (reg:SI 1 x1 [ i ])
        (const_int 16 [0x10]))
    (const_int 15 [0xf]))

And the function extract_left_shift generates

ashift:SI (reg:SI 1 x1 [ i ])
        (const_int 16 [0x10])

Before we call make_extraction, there is another call to
"make_compound_operation" again which converts ASHIFT to mult RTX.

In "make_compound_operation"

switch (code)
    {
    case ASHIFT:
      /* Convert shifts by constants into multiplications if inside
         an address.  */
      if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
          && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
          && INTVAL (XEXP (x, 1)) >= 0
          && SCALAR_INT_MODE_P (mode))
        {


After extraction what we get a sign_extract and an outer SUBREG expression.
Combiner Says Failed to mismatch this

set (reg/i:SI 0 x0)
    (minus:SI (reg:SI 0 x0 [ a ])
        (subreg:SI (sign_extract:DI (mult:DI (reg:DI 1 x1 [ i ])
                    (const_int 2 [0x2]))
                (const_int 17 [0x11])
                (const_int 0 [0])) 0)))


But when we don't convert ASHIFT to MULT then we have a code in
make_extraction that works on ASHIFT RTX

else if (GET_CODE (inner) == ASHIFT
           && CONST_INT_P (XEXP (inner, 1))
           && pos_rtx == 0 && pos == 0
           && len > UINTVAL (XEXP (inner, 1)))
    {
      /* We're extracting the least significant bits of an rtx
         (ashift X (const_int C)), where LEN > C.  Extract the
         least significant (LEN - C) bits of X, giving an rtx
         whose mode is MODE, then shift it left C times.  */
      new_rtx = make_extraction (mode, XEXP (inner, 0),
                             0, 0, len - INTVAL (XEXP (inner, 1)),
                             unsignedp, in_dest, in_compare);


Now combiner Successfully matched this instruction

(set (reg/i:SI 0 x0)
    (minus:SI (reg:SI 0 x0 [ a ])
        (ashift:SI (sign_extend:SI (reg:HI 1 x1 [ i ]))
            (const_int 1 [0x1]))))


So MEM assumptions on MINUS and PLUX RTX seems to be wrong for me.

Please let me know your suggestions on this.

regards,
Venkat.
Jeff Law April 24, 2015, 12:14 a.m. UTC | #4
On 01/16/2015 12:13 AM, Venkataramanan Kumar wrote:
>
> The below test case which I am working on is from the PR63949
>
> int   subsi_sxth (int a, short  i)
> {
>    /* { dg-final { scan-assembler "sub\tw\[0-9\]+,.*sxth #?1" } } */
>    return a - ((int)i << 1);
> }
>
> The expression "a - ((int)i << 1)" is not a memory expression.
> But combiner assumes that MINUS RTX as memory, and process all RTX
> sub expressions with the memory assumption.
Coming back to this old thread.

And presumably when it treats this expression as a MEM, then we do the 
usual canonicalizations of turning ASHIFT into a MULT when we wanted to 
keep the code as an ASHIFT, right?

The code in question comes from 1992:

Fri Jun 26 07:06:19 1992  Richard Kenner  (kenner@vlsi1.ultra.nyu.edu)

[ ... ]


         * combine.c (make_compound_operation): Treat PLUS and MINUS
         the same when passing down the code to the next level; for
         consistency, an ASHIFT inside either gets turned into a MULT.

Sadly, in that era of development, we didn't require nearly as many 
testcases as we do now.  Similarly patches didn't have to be posted with 
a rationale for the patch.  I've checked my gcc2 archives just to be 
sure and there's nothing about this patch in them.

 From reading the actual patch Kenner installed, I'm confident this was 
not a correctness issue, but instead was trying to improve the quality 
of the combiner output.

So I think you've got a couple paths forward.  First you could remove 
the PLUS/MINUS special casing and then verify nothing changes across a 
few targets (x86_64, ppc64, AArch64 would be my recommendations).  And 
by nothing changes, I mean the assembly before/after is the same for all 
the files in a bootstrap.  It's a fairly high hurdle.

Alternately, you could find a way to accurately track if we're inside a 
MEM, where we want to canonicalize things slightly differently.  Once we 
can accurately track if we're inside a MEM, then we no longer have to 
have the hack for PLUS/MINUS.

Jeff
Richard Sandiford May 6, 2015, 7:32 a.m. UTC | #5
Jeff Law <law@redhat.com> writes:
> On 01/16/2015 12:13 AM, Venkataramanan Kumar wrote:
>>
>> The below test case which I am working on is from the PR63949
>>
>> int   subsi_sxth (int a, short  i)
>> {
>>    /* { dg-final { scan-assembler "sub\tw\[0-9\]+,.*sxth #?1" } } */
>>    return a - ((int)i << 1);
>> }
>>
>> The expression "a - ((int)i << 1)" is not a memory expression.
>> But combiner assumes that MINUS RTX as memory, and process all RTX
>> sub expressions with the memory assumption.
> Coming back to this old thread.
>
> And presumably when it treats this expression as a MEM, then we do the 
> usual canonicalizations of turning ASHIFT into a MULT when we wanted to 
> keep the code as an ASHIFT, right?
>
> The code in question comes from 1992:
>
> Fri Jun 26 07:06:19 1992  Richard Kenner  (kenner@vlsi1.ultra.nyu.edu)
>
> [ ... ]
>
>
>          * combine.c (make_compound_operation): Treat PLUS and MINUS
>          the same when passing down the code to the next level; for
>          consistency, an ASHIFT inside either gets turned into a MULT.
>
> Sadly, in that era of development, we didn't require nearly as many 
> testcases as we do now.  Similarly patches didn't have to be posted with 
> a rationale for the patch.  I've checked my gcc2 archives just to be 
> sure and there's nothing about this patch in them.
>
>  From reading the actual patch Kenner installed, I'm confident this was 
> not a correctness issue, but instead was trying to improve the quality 
> of the combiner output.
>
> So I think you've got a couple paths forward.  First you could remove 
> the PLUS/MINUS special casing and then verify nothing changes across a 
> few targets (x86_64, ppc64, AArch64 would be my recommendations).  And 
> by nothing changes, I mean the assembly before/after is the same for all 
> the files in a bootstrap.  It's a fairly high hurdle.
>
> Alternately, you could find a way to accurately track if we're inside a 
> MEM, where we want to canonicalize things slightly differently.  Once we 
> can accurately track if we're inside a MEM, then we no longer have to 
> have the hack for PLUS/MINUS.

A third even higher hurdle would be to remove the idea that the
canonical form of an expression depends on whether it's in a MEM
or not.  It just causes confusion when an lea is created from an
existing MEM address.
:-)

Richard
Segher Boessenkool May 6, 2015, 12:29 p.m. UTC | #6
On Wed, May 06, 2015 at 08:32:13AM +0100, Richard Sandiford wrote:
> > Alternately, you could find a way to accurately track if we're inside a 
> > MEM, where we want to canonicalize things slightly differently.  Once we 
> > can accurately track if we're inside a MEM, then we no longer have to 
> > have the hack for PLUS/MINUS.
> 
> A third even higher hurdle would be to remove the idea that the
> canonical form of an expression depends on whether it's in a MEM
> or not.  It just causes confusion when an lea is created from an
> existing MEM address.
> :-)

Fourth, whip combine into shape until it no longer has its special
"more strict" ideas of what is canonical.

Fifth (higher, higher!), relax some of the ad-hoc "canonicalisation"
rules, make recog recognise *both* forms (or more than two in some
cases, even).


Segher
diff mbox

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index 101cf35..1353f54 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7696,7 +7696,8 @@  make_compound_operation (rtx x, enum rtx_code in_code)

   next_code = (code == MEM ? MEM
               : ((code == PLUS || code == MINUS)
-                 && SCALAR_INT_MODE_P (mode)) ? MEM
+                 && SCALAR_INT_MODE_P (mode)
+                 && (in_code == MEM)) ? MEM
               : ((code == COMPARE || COMPARISON_P (x))
                  && XEXP (x, 1) == const0_rtx) ? COMPARE
               : in_code == COMPARE ? SET : in_code);