Tree-level fix for PR 69526
diff mbox

Message ID CAFiYyc2UKpz4krcaxow_=yF6_eFZU_=M5k+6VTHZscDz+g=2uw@mail.gmail.com
State New
Headers show

Commit Message

Richard Biener Sept. 14, 2016, 1:05 p.m. UTC
On Mon, Aug 22, 2016 at 4:58 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote:
>> if !inner_ovf (just set that to false if !check_inner_ovf to simplify
>> checks please).
>> you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2
>> (if T is wider than the inner type which I think you should check and
>> which should
>> simplify things).
>
> I simplified the control flow a little and split both parts of the
> combination into separate patterns.
>
>> You can then combine (T)@1 and @2 where I think you fail to handle mixed
>> MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for
> integers).
>
> Is it ok to do something like this (see patch) in order to deal with
> MINUS_EXPR and then perform a wi::add?
>
>         if (inner_op == MINUS_EXPR)
>                 w1 = wi::neg (w1);
>
>         if (outer_op == MINUS_EXPR)
>                 w2 = wi::neg (w2);

Yes.

>> The testcase is rather unspecific as to what testcases shoudl fold and
> what not
>> given your very simple scan and mixing should/should-not simplify
> cases.  Please
>> consider splitting it up and make it a run test that verifies the
>> bogus transforms
>> do not take place.
>
> I divided the testcases into signed/unsigned and should simplify/should
> not simplify. The bogus unsigned transforms are now checked via assert().
>
>> Doing
> [...]
>> causes such stmts to be folded twice as substitute_and_fold does
> [...]
>> which is less than ideal.  I think that given we have fold_stmt following
>> SSA edges and thus not only stmts we propagated into are possibly
>> interesting to fold (but also their single-uses, recursively), we should
>> evaluate the compile-time cost of doing the fold_stmt unconditionally.
>
> Only using update_stmt seems to avoid the double call to fold_stmt but
> is obviously the wrong thing to do as this then tries to update
> everything that matches the pattern in tree-vrp.c. Copying some checks
> from match.pd to tree-vrp.c would help but isn't there a proper way to
> prevent duplicating the checking logic?

I think what triggers it is that you return 'true', not that you call
update_stmt.

> In addition, is the compile-time cost checking something I could do for
> this patch or a general thing to be done later?

I meant to do sth like

this would need compile-time cost evaluation (and avoid the tree-vrp.c
folding part
of your patch).

>> tree.c doesn't look like the best fit.  I think putting it into
>> tree-vrp.c is better
>> and I think that extract_range_from_binary_expr_1 itself should
> compute what we
>> want here as additional output.  Conservative handling for all but
> plus/minus is
>> ok with me.
>
> I kept the extra function for now because I find
> extract_range_from_binary_expr_1 somewhat lengthy and hard to follow
> already :)

Heh, but I think duplicating code is even worse.

Just looking at the simpler pattern right now:

+  /* ((T)(A)) +- CST -> (T)(A +- CST)  */
+#if GIMPLE
+   (for outer_op (plus minus)
+    (simplify
+     (outer_op (convert @0) INTEGER_CST@2)
+     /* Perform binary operation inside the cast if the constant fits
+       and there is no overflow.  */
+       (with
+       {
+         bool simplify = false;
+
+         wide_int cst = @2;
+         tree inner_type = TREE_TYPE (@0);
+         tree cst_inner = wide_int_to_tree (inner_type, cst);

What prevents CST being truncated here?  It looks like
(long)intvar + 0xffff00000000L will be mishandled that way.

+         bool inner_ovf = true;
+
+         if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type)
+             || TREE_CODE (inner_type) != INTEGER_TYPE)
+           simplify = false;
+         else
+         {
+           inner_ovf = binop_overflow (outer_op, @0, cst_inner, inner_type);
+           if (!inner_ovf)
+             simplify = true;
+         }
+       }
+       (if (simplify && @0)

@0 is always "true"

+        (convert (outer_op @0 (convert { @2; }))))

You already have @2 converted -- cst_inner. Thus just use

    (convert (outer_op @0 { cst_inner; }))

+       )))
+#endif


Now onto the other pattern.

+  /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST)  */
+#if GIMPLE
+   (for outer_op (plus minus)
+     (for inner_op (plus minus)
+       (simplify
+        (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
+          (with
+           {
+          /* If the inner operation is wrapped inside a conversion, we have to
+             check it overflows/underflows and can only then perform the
+             simplification, i.e. add the second constant to the first
+             (wrapped) and convert afterwards.  This fixes
+             https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */
+           bool inner_ovf = false;
+
+           bool combine_ovf = true;
+           tree cst = NULL_TREE;
+           bool combine_constants = false;
+           bool types_ok = false;
+
+           tree inner_type = TREE_TYPE (@3);
+           /* Check overflow in wrapped binop? */
+           bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type);
+           /* Check overflow when combining constants? */
+           bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (type);

I think that should be ! TYPE_OVERFLOW_WRAPS (type) instead.

+
+           signop s1 = TYPE_SIGN (type);
+
+           if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type))
+             {

OTOH given that you use this to decide whether you can use a combined constant
that doesn't look necessary (if the constant is correct, that is) --
what you need
to make sure is that the final operation, (T)(A) +- CST, does not overflow
if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the
original operation.  I think that always holds, thus the combine_ovf check looks
superfluous to me.

Now looking at binop_overflow (that I don't like very much, but ...)

+/* Check if the binary operation overflows.  */
+bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type)
+{

new-line after 'bool'


+  if (TREE_CODE (type) != INTEGER_TYPE)

INTEGRAL_TYPE_P

+    return true;
+
+  if (TREE_CODE (t1) != SSA_NAME)
+    return true;

that seems unnecessary

You have duplicate code to handle range-info for SSA name t1/t2,
please split it out.

+      if (vrtype2 == VR_ANTI_RANGE)
+       {
+         bool ovf;
+         wide_int tmpmin = wi::add (min2, 1, st2, &ovf);
+         if (st2 == SIGNED && ovf)
+           return true;
+         wide_int tmpmax = wi::sub (max2, 1, st2, &ovf);
+         if (st2 == SIGNED && ovf)
+           return true;
+         min2 = tmpmin;
+         max2 = tmpmax;

this looks wrong, from ~[0, 0] you get min == -1 and max == 1 while
in reality min == INT_MIN and max == INT_MAX.  I suggest to give up
conservatively for anti-ranges.

+  signop sgn = TYPE_SIGN (TREE_TYPE (t1)) == SIGNED
+    || TYPE_SIGN (TREE_TYPE (t2)) == SIGNED ? SIGNED : UNSIGNED;

uh, you want to handle mixed-sign inputs?  why?

+  if (op == MINUS_EXPR)
+    {
+      wmin = wi::sub (min1, min2, sgn, &ovf);
+      if (sgn == SIGNED && (ovf || wi::gt_p (wmin, min1, sgn)))
+       return true;
+      wmax = wi::sub (max1, max2, sgn, &ovf);
+      if (sgn == SIGNED && (ovf || wi::gt_p (wmax, max1, sgn)))
+       return true;
+    }

I think you may want to simplify this and instead compute the min/max
values into widest_ints, perform the operation (conservatively all four
variants, min1 OP min2, min1 OP max2, max1 OP min2, max1 OP max2)
and verify if the result fits the desired type via wi::fits_to_tree_p.

> Wouldn't it be better to "separate concerns"/split it up in
> the long run and merge the functionality needed here at some time?

For sure splitting out the PLUS/MINUS handling from
extract_range_from_binary_expr_1
is fine with me, not sure how it immediately helps this patch.

Finally please mention the PR that made you work on this in the ChangeLog
(does it actually fix the issue or at least improve it?

Thanks,
Richard.

> Bootstrapped and reg-tested on s390x, bootstrap on x86 running.
>
> Regards
>  Robin
>
>
> gcc/ChangeLog:
>
> 2016-08-22  Robin Dapp  <rdapp@linux.vnet.ibm.com>
>
>         * match.pd: Introduce patterns to simplify binary operations wrapped
> inside a cast.
>         * tree-vrp.c (bool binop_overflow):
>         (simplify_plus_or_minus_using_ranges): Make use of newly introduced
> patterns.
>         (simplify_stmt_using_ranges): Call simplify_plus_or_minus_using_ranges
>         * tree-vrp.h: New file.
>         * gimple-match-head.c: Include tree-vrp.h
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-08-22  Robin Dapp  <rdapp@linux.vnet.ibm.com>
>
>         * gcc.dg/wrapped-binop-simplify-signed-1.c: New test.
>         * gcc.dg/wrapped-binop-simplify-signed-2.c: New test.
>         * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test.
>         * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test.

Comments

Jeff Law Sept. 14, 2016, 4:50 p.m. UTC | #1
[ Another patch I'd started working through, but hadn't completed... ]

On 09/14/2016 07:05 AM, Richard Biener wrote:
> On Mon, Aug 22, 2016 at 4:58 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote:
>>> if !inner_ovf (just set that to false if !check_inner_ovf to simplify
>>> checks please).
>>> you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2
>>> (if T is wider than the inner type which I think you should check and
>>> which should
>>> simplify things).
>>
>> I simplified the control flow a little and split both parts of the
>> combination into separate patterns.
>>
>>> You can then combine (T)@1 and @2 where I think you fail to handle mixed
>>> MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for
>> integers).
>>
>> Is it ok to do something like this (see patch) in order to deal with
>> MINUS_EXPR and then perform a wi::add?
>>
>>         if (inner_op == MINUS_EXPR)
>>                 w1 = wi::neg (w1);
>>
>>         if (outer_op == MINUS_EXPR)
>>                 w2 = wi::neg (w2);
>
> Yes.
>
I was concerned about the case where w1 or w2 might be the minimum 
(negative) integer for its type.  That creates a constant that can't be 
represented.  I'm not familiar enough with the rest of hte code yet to 
know if that's problematical.


>
>>> tree.c doesn't look like the best fit.  I think putting it into
>>> tree-vrp.c is better
>>> and I think that extract_range_from_binary_expr_1 itself should
>> compute what we
>>> want here as additional output.  Conservative handling for all but
>> plus/minus is
>>> ok with me.
>>
>> I kept the extra function for now because I find
>> extract_range_from_binary_expr_1 somewhat lengthy and hard to follow
>> already :)
>
> Heh, but I think duplicating code is even worse.
I was going to suggest a pre-patch that just refactored the various 
cases in extract_range_from_binary_expr_1 into their own functions. 
THat might it easier to keep things manageable.

[ ... ]

>
> Now looking at binop_overflow (that I don't like very much, but ...)
Note that the naked "return true" in there makes 95% of that function 
unreachable code.  That's where I stopped without looking deeply at the 
rest of the code.

Jeff
Robin Dapp Oct. 14, 2016, 8:33 a.m. UTC | #2
Ping :)

Patch
diff mbox

Index: tree-ssa-propagate.c
===================================================================
--- tree-ssa-propagate.c        (revision 240133)
+++ tree-ssa-propagate.c        (working copy)
@@ -1105,10 +1105,10 @@  substitute_and_fold_dom_walker::before_d
       /* Replace real uses in the statement.  */
       did_replace |= replace_uses_in (stmt, get_value_fn);

-      /* If we made a replacement, fold the statement.  */
-      if (did_replace)
+      /* Fold the statement.  */
+      if (fold_stmt (&i, follow_single_use_edges))
        {
-         fold_stmt (&i, follow_single_use_edges);
+         did_replace = true;
          stmt = gsi_stmt (i);
        }