
[RFA,PR,rtl-optimization/47477] Type narrowing in match.pd

Message ID 54DA7029.4070001@redhat.com
State New

Commit Message

Jeff Law Feb. 10, 2015, 8:55 p.m. UTC
This PR was originally a minor issue where we regressed on this kind of 
sequence:

typedef struct toto_s *toto_t;
toto_t add (toto_t a, toto_t b) {
   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
   return (toto_t)(intptr_t) tmp;
}


There was talk of trying to peephole this in the x86 backend.  But later 
Jakub speculated that if we had good type narrowing this could be done 
in the tree optimizers...

Soooo, here we go.  I didn't do anything with logicals, as those are 
already handled elsewhere in match.pd.  I didn't try to handle MULT; in 
the early experiments I did, it was a loss because of the existing 
mechanisms for widening multiplications.

Interestingly enough, this patch seems to help out libjava more than 
anything else in a GCC build, and it really only helps a few routines. 
There weren't any routines I could see where the code regressed after 
this patch.  This is probably an indicator that these things aren't 
*that* common, or the existing shortening code is better than we 
thought, or some important shortening case is missing.


I think we should pull the other tests from 47477 which are not 
regressions out into their own bug for future work.  Or alternately, 
when this fix is checked in, remove the regression marker in 47477.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for 
the trunk?

Comments

Richard Biener Feb. 11, 2015, 10:56 a.m. UTC | #1
On Tue, Feb 10, 2015 at 9:55 PM, Jeff Law <law@redhat.com> wrote:
>
> This PR was originally a minor issue where we regressed on this kind of
> sequence:
>
> typedef struct toto_s *toto_t;
> toto_t add (toto_t a, toto_t b) {
>   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
>   return (toto_t)(intptr_t) tmp;
> }
>
>
> There was talk of trying to peephole this in the x86 backend.  But later
> Jakub speculated that if we had good type narrowing this could be done in
> the tree optimizers...
>
> Soooo, here we go.  I didn't do anything with logicals, as those are already
> handled elsewhere in match.pd.  I didn't try to handle MULT; in the early
> experiments I did, it was a loss because of the existing mechanisms for
> widening multiplications.
>
> Interestingly enough, this patch seems to help out libjava more than
> anything else in a GCC build and it really only helps a few routines. There
> weren't any routines I could see where the code regressed after this patch.
> This is probably an indicator that these things aren't *that* common, or the
> existing shortening code is better than we thought, or some important
> shortening case is missing.
>
>
> I think we should pull the other tests from 47477 which are not regressions
> out into their own bug for future work.  Or alternately, when this fix is
> checked in remove the regression marker in 47477.
>
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for the
> trunk?
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7f3816c..7a95029 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
> +       simplifier to narrow arithmetic.
> +

Please reference PR47477 from here

>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..abc703e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>     (logs (pows @0 @1))
>     (mult @1 (logs @0)))))
>
> +/* If we have a narrowing conversion of an arithmetic operation where
> +   both operands are widening conversions from the same type as the outer
> +   narrowing conversion.  Then convert the innermost operands to a suitable
> +   unsigned type (to avoid introducing undefined behaviour), perform the
> +   operation and convert the result to the desired type.
> +
> +   This narrows the arithmetic operation.  */

This is also a part of what shorten_binary_op does, so please refer to
that in the comment so we can eventually complete this from there
and remove shorten_binary_op.

> +(for op (plus minus)
> +  (simplify
> +    (convert (op (convert@2 @0) (convert @1)))
> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
> +         && TREE_TYPE (@0) == type

Elsewhere in match.pd we use

(GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
 || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))

for type equality checks.  And I think even for GENERIC you want
to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
== TYPE_MAIN_VARIANT (TREE_TYPE (@1)).

> +         && INTEGRAL_TYPE_P (type)

So instead of testing TREE_TYPE (@0) == type we probably want
to just assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION
(type) to allow sign-changes.  Say for

short i, j;
(unsigned short) (i + j)

we still want to narrow the widened i and j.

> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
> (@0))
> +        /* This prevents infinite recursion.  */
> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))

How can it recurse with the type precision constraint right above?
If you think it's necessary then nesting that condition below as

> +      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
             (if (utype != TREE_TYPE (@2))

avoids evaluating unsigned_type_for twice.

Note that technically we don't need to perform the operation in an unsigned
type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus

           (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
             (convert (op @0 @1)))
           (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
             (convert (op (convert:utype @0) (convert:utype @1))))))

You can see that GCC exploits that with -fwrapv.  Maybe this
simplification should be split out into a separate pattern though,
tackling sign-changes for binary ops only.

Thanks,
Richard.

> +        (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 15d5e2d..76e5254 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       PR rtl-optimization/47477
> +       * gcc.dg/tree-ssa/narrow-arith-1.c: New test.
> +
>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> +  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> +  return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> +   if the match.pd pattern is successfully matched.  */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1
> "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> +
> +
>
Jeff Law Feb. 11, 2015, 5:05 p.m. UTC | #2
On 02/11/15 03:56, Richard Biener wrote:
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 7f3816c..7a95029 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2015-02-10  Jeff Law  <law@redhat.com>
>> +
>> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
>> +       simplifier to narrow arithmetic.
>> +
>
> Please reference PR47477 from here
Doh.  Fixed.


>
>>   2015-02-10  Richard Biener  <rguenther@suse.de>
>>
>>          PR tree-optimization/64909
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 81c4ee6..abc703e 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>>      (logs (pows @0 @1))
>>      (mult @1 (logs @0)))))
>>
>> +/* If we have a narrowing conversion of an arithmetic operation where
>> +   both operands are widening conversions from the same type as the outer
>> +   narrowing conversion.  Then convert the innermost operands to a suitable
>> +   unsigned type (to avoid introducing undefined behaviour), perform the
>> +   operation and convert the result to the desired type.
>> +
>> +   This narrows the arithmetic operation.  */
>
> This is also a part of what shorten_binary_op does, so please refer to
> that in the comment so we can eventually complete this from there
> and remove shorten_binary_op.
Done.  I've created a block comment meant to cover this pattern and any 
additions we need along the way to moving shortening/narrowing out of 
the front-ends.


>
>> +(for op (plus minus)
>> +  (simplify
>> +    (convert (op (convert@2 @0) (convert @1)))
>> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
>> +         && TREE_TYPE (@0) == type
>
> Elsewhere in match.pd we use
>
> (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
>   || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))
>
> for type equality checks.  And I think even for GENERIC you want
> to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
> == TYPE_MAIN_VARIANT (TREE_TYPE (@1)).
Seems reasonable.  Makes for some ugly formatting though.  It might 
make sense to factor that test out so we don't end up repeating it senselessly.

>
>> +         && INTEGRAL_TYPE_P (type)
>
> So instead of testing TREE_TYPE (@0) == type we probably want
> to just assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION
> (type) to allow sign-changes.  Say for
Yea, makes sense.

>
> short i, j;
> (unsigned short) (i + j)
>
> we still want to narrow the widened i and j.
Right.
>
>> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
>> (@0))
>> +        /* This prevents infinite recursion.  */
>> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
>
> How can it recurse with the type precision constraint right above?
It may no longer be needed.  The precision check used to be >=, but that 
regressed some Java code.  So it got tightened in the last iteration 
before posting.

>
> Note that technically we don't need to perform the operation in an unsigned
> type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus
>
>             (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>               (convert (op @0 @1)))
>             (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
>               (convert (op (convert:utype @0) (convert:utype @1))))))
>
> You can see that GCC exploits that with -fwrapv.  Maybe this
> simplification should be split out into a separate pattern though,
> tackling sign-changes for binary ops only.
No strong opinion on separate pattern or integrating into existing pattern.

jeff
Bin.Cheng Feb. 12, 2015, 7:01 a.m. UTC | #3
On Wed, Feb 11, 2015 at 4:55 AM, Jeff Law <law@redhat.com> wrote:
>
> This PR was originally a minor issue where we regressed on this kind of
> sequence:
>
> typedef struct toto_s *toto_t;
> toto_t add (toto_t a, toto_t b) {
>   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
>   return (toto_t)(intptr_t) tmp;
> }
>
>
> There was talk of trying to peephole this in the x86 backend.  But later
> Jakub speculated that if we had good type narrowing this could be done in
> the tree optimizers...
>
> Soooo, here we go.  I didn't do anything with logicals, as those are already
> handled elsewhere in match.pd.  I didn't try to handle MULT; in the early
> experiments I did, it was a loss because of the existing mechanisms for
> widening multiplications.
>
> Interestingly enough, this patch seems to help out libjava more than
> anything else in a GCC build and it really only helps a few routines. There
> weren't any routines I could see where the code regressed after this patch.
> This is probably an indicator that these things aren't *that* common, or the
> existing shortening code is better than we thought, or some important
> shortening case is missing.

Cool that we are trying to simplify type conversions using the generic
match facility.  I have thought about type promotion in match.pd too.
For example, with (unsigned long long)(unsigned long)(int_expr), if we
can prove int_expr is always positive (in my case, this is from VRP
information), then the first conversion can be dropped.  I suppose this
is another way of doing (and related to?  I didn't look at the code) the
sign/zero extension elimination work using VRP?

Thanks,
bin
>
>
> I think we should pull the other tests from 47477 which are not regressions
> out into their own bug for future work.  Or alternately, when this fix is
> checked in remove the regression marker in 47477.
>
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for the
> trunk?
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7f3816c..7a95029 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
> +       simplifier to narrow arithmetic.
> +
>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..abc703e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>     (logs (pows @0 @1))
>     (mult @1 (logs @0)))))
>
> +/* If we have a narrowing conversion of an arithmetic operation where
> +   both operands are widening conversions from the same type as the outer
> +   narrowing conversion.  Then convert the innermost operands to a suitable
> +   unsigned type (to avoid introducing undefined behaviour), perform the
> +   operation and convert the result to the desired type.
> +
> +   This narrows the arithmetic operation.  */
> +(for op (plus minus)
> +  (simplify
> +    (convert (op (convert@2 @0) (convert @1)))
> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
> +         && TREE_TYPE (@0) == type
> +         && INTEGRAL_TYPE_P (type)
> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
> (@0))
> +        /* This prevents infinite recursion.  */
> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
> +      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> +        (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 15d5e2d..76e5254 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       PR rtl-optimization/47477
> +       * gcc.dg/tree-ssa/narrow-arith-1.c: New test.
> +
>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> +  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> +  return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> +   if the match.pd pattern is successfully matched.  */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1
> "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> +
> +
>
Richard Biener Feb. 12, 2015, 1:02 p.m. UTC | #4
On Wed, Feb 11, 2015 at 6:05 PM, Jeff Law <law@redhat.com> wrote:
> On 02/11/15 03:56, Richard Biener wrote:
>>>
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 7f3816c..7a95029 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,8 @@
>>> +2015-02-10  Jeff Law  <law@redhat.com>
>>> +
>>> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
>>> +       simplifier to narrow arithmetic.
>>> +
>>
>>
>> Please reference PR47477 from here
>
> Doh.  Fixed.
>
>
>>
>>>   2015-02-10  Richard Biener  <rguenther@suse.de>
>>>
>>>          PR tree-optimization/64909
>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>> index 81c4ee6..abc703e 100644
>>> --- a/gcc/match.pd
>>> +++ b/gcc/match.pd
>>> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>>>      (logs (pows @0 @1))
>>>      (mult @1 (logs @0)))))
>>>
>>> +/* If we have a narrowing conversion of an arithmetic operation where
>>> +   both operands are widening conversions from the same type as the
>>> outer
>>> +   narrowing conversion.  Then convert the innermost operands to a
>>> suitable
>>> +   unsigned type (to avoid introducing undefined behaviour), perform the
>>> +   operation and convert the result to the desired type.
>>> +
>>> +   This narrows the arithmetic operation.  */
>>
>>
>> This is also a part of what shorten_binary_op does, so please refer to
>> that in the comment so we can eventually complete this from there
>> and remove shorten_binary_op.
>
> Done.  I've created a block comment meant to cover this pattern and any
> additions we need along the way to moving shortening/narrowing out of the
> front-ends.
>
>
>>
>>> +(for op (plus minus)
>>> +  (simplify
>>> +    (convert (op (convert@2 @0) (convert @1)))
>>> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
>>> +         && TREE_TYPE (@0) == type
>>
>>
>> Elsewhere in match.pd we use
>>
>> (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
>>   || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))
>>
>> for type equality checks.  And I think even for GENERIC you want
>> to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
>> == TYPE_MAIN_VARIANT (TREE_TYPE (@1)).
>
> Seems reasonable.  Makes for some ugly formatting though.  It might make
> sense to factor that test out so we don't end up repeating it senselessly.
>
>>
>>> +         && INTEGRAL_TYPE_P (type)
>>
>>
>> So instead of testing TREE_TYPE (@0) == type we probably want
>> to just assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION
>> (type) to allow sign-changes.  Say for
>
> Yea, makes sense.
>
>>
>> short i, j;
>> (unsigned short) (i + j)
>>
>> we still want to narrow the widened i and j.
>
> Right.
>>
>>
>>> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
>>> (@0))
>>> +        /* This prevents infinite recursion.  */
>>> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
>>
>>
>> How can it recurse with the type precision constraint right above?
>
> It may no longer be needed.  The precision check used to be >=, but that
> regressed some Java code.  So it got tightened in the last iteration before
> posting.
>
>>
>> Note that technically we don't need to perform the operation in an
>> unsigned
>> type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus
>>
>>             (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>>               (convert (op @0 @1)))
>>             (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
>>               (convert (op (convert:utype @0) (convert:utype @1))))))
>>
>> You can see that GCC exploits that with -fwrapv.  Maybe this
>> simplification should be split out into a separate pattern though,
>> tackling sign-changes for binary ops only.
>
> No strong opinion on separate pattern or integrating into existing pattern.

Yeah, I'd leave it as you did it for now.  Btw, -ENOPATCH.

Richard.

> jeff
>
Jeff Law Feb. 12, 2015, 4:43 p.m. UTC | #5
On 02/12/15 00:01, Bin.Cheng wrote:
>
> Cool that we are trying to simplify type conversions using the generic
> match facility.  I have thought about type promotion in match.pd too.
match.pd really feels like a meta-language which describes how to 
combine gimple or generic statements.


> For example, with (unsigned long long)(unsigned long)(int_expr), if we
> can prove int_expr is always positive (in my case, this is from VRP
> information), then the first conversion can be dropped.  I suppose this
> is another way of doing (and related to?  I didn't look at the code) the
> sign/zero extension elimination work using VRP?
In theory, given VRP information attached to an SSA_NAME you could query 
that information in the conditional part of the match.pd pattern.

Another patch I'm working on is best described as detecting cases where 
an outer bit-and masks off all the bits outside the innermost operand's 
type.  Under the right conditions, this also allows for narrowing.

Jeff
Jeff Law Feb. 12, 2015, 4:55 p.m. UTC | #6
On 02/12/15 06:02, Richard Biener wrote:
>
> Yeah, I'd leave it as you did it for now.  Btw, -ENOPATCH.
Because it needed to spin through another round of testing :-)   I'll be 
posting it after my morning meetings.

jeff

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7f3816c..7a95029 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@ 
+2015-02-10  Jeff Law  <law@redhat.com>
+
+	* match.pd (convert (plus/minus (convert @0) (convert @1): New
+	simplifier to narrow arithmetic.
+
 2015-02-10  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/64909
diff --git a/gcc/match.pd b/gcc/match.pd
index 81c4ee6..abc703e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1018,3 +1018,21 @@  along with GCC; see the file COPYING3.  If not see
    (logs (pows @0 @1))
    (mult @1 (logs @0)))))
 
+/* If we have a narrowing conversion of an arithmetic operation where
+   both operands are widening conversions from the same type as the outer
+   narrowing conversion.  Then convert the innermost operands to a suitable
+   unsigned type (to avoid introducing undefined behaviour), perform the
+   operation and convert the result to the desired type. 
+
+   This narrows the arithmetic operation.  */
+(for op (plus minus)
+  (simplify
+    (convert (op (convert@2 @0) (convert @1)))
+    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
+         && TREE_TYPE (@0) == type
+         && INTEGRAL_TYPE_P (type)
+         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
+	 /* This prevents infinite recursion.  */
+	 && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
+      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
+        (convert (op (convert:utype @0) (convert:utype @1)))))))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 15d5e2d..76e5254 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@ 
+2015-02-10  Jeff Law  <law@redhat.com>
+
+	PR rtl-optimization/47477
+	* gcc.dg/tree-ssa/narrow-arith-1.c: New test.
+
 2015-02-10  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/64909
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
new file mode 100644
index 0000000..104cb6f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
@@ -0,0 +1,22 @@ 
+/* PR tree-optimization/47477 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -w" } */
+/* { dg-require-effective-target ilp32 } */
+
+typedef int int64_t __attribute__ ((__mode__ (__DI__)));
+typedef int * intptr_t;
+
+typedef struct toto_s *toto_t;
+toto_t add (toto_t a, toto_t b) {
+  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
+  return (toto_t)(intptr_t) tmp;
+}
+
+/* For an ILP32 target there'll be 6 casts when we start, but just 4
+   if the match.pd pattern is successfully matched.  */
+/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
+