diff mbox series

PR middle-end/100810: Penalize IV candidates with undefined value bases

Message ID 001b01d79e30$7441b100$5cc51300$@nextmovesoftware.com
State New
Headers show
Series PR middle-end/100810: Penalize IV candidates with undefined value bases | expand

Commit Message

Roger Sayle Aug. 31, 2021, 6:21 a.m. UTC
Time to hopefully earn some goodwill from the team; this patch fixes
a P1 wrong-code-on-valid regression in ivopts.  Many thanks to Andrew
Pinski for help with the analysis.

Consider the code fragment below:

        int i;
        for (j=0; j<10; j++)
          i++;

This results in a loop containing two induction variables, i and j,
where j is initialized, but i isn't (typically indicated in tree dumps
by qualified ssa names like i(D).  In PR 100810, the loop optimizers
end up selecting i as the "best" candidate (perhaps because having no
initialization it's cheaper) which leads to problems in later passes
when (the equivalent of) j is considered not to have the value 10 after
the loop, as its definition is now computed from an undefined/uninitialized
value.

The fix below is to add a field to track whether any "iv_group" contains
a use of an iv based on a undefined value, and then prohibit IVs that are
based on undefined values from being candidates for groups that don't use
undefined values.  This may seem lenient, but it allows an IV with an
undefined base to be a candidate for itself, and is a sufficient condition
to avoid the above bug/regression.  A stricter condition might be to only
allow "undefined_value iv"s as candidates for iv_groups where *all*
uses are to "undefined_value ivs"?  My concern was that this might lead
to cases/loops that no longer have suitable candidates (i.e. a possible
performance regression).

Hopefully, the tree-loop optimization experts agree with my analysis/fix.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?

2021-08-31  Roger Sayle  <roger@nextmovesoftware.com>
	    Andrew Pinski  <apinski@marvell.com>

gcc/ChangeLog
	PR middle-end/100810
	* tree-ssa-loop-ivopts.c (struct iv_group): Add a new
	uses_undefined_value_p field, indicating this group has
	uses of iv's whose base is ssa_undefined_value_p.
	(record_use): Update uses_undefined_value_p as required.
	(record_group): Initialize uses_undefined_value_p to false.
	(determine_group_iv_cost_generic): Consider a candidate with
	a ssa_undefined_value_p base to have infinite_cost for a
	group where uses_undefined_value_p is false.

gcc/testsuite/ChangeLog
	PR middle-end/100810
	* gcc.dg/pr100810.c: New test case.

Roger
--

/* { dg-do run } */
/* { dg-options "-O2" } */

int a, b = 1, c = 1, e, f = 1, g, h, j;
volatile int d;
static void k() {
  int i;
  h = b;
  if (c && a >= 0) {
    while (a) {
      i++;
      h--;
    }
    if (g)
      for (h = 0; h < 2; h++)
        ;
    if (!b)
      i &&d;
  }
}
static void l() {
  for (; j < 1; j++)
    if (!e && c && f)
      k();
}
int main() {
  if (f)
    l();
  if (h != 1)
    __builtin_abort();
  return 0;
}

Comments

Richard Biener Aug. 31, 2021, 12:46 p.m. UTC | #1
On Tue, Aug 31, 2021 at 8:22 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Time to hopefully earn some goodwill from the team; this patch fixes
> a P1 wrong-code-on-valid regression in ivopts.  Many thanks to Andrew
> Pinski for help with the analysis.
>
> Consider the code fragment below:
>
>         int i;
>         for (j=0; j<10; j++)
>           i++;
>
> This results in a loop containing two induction variables, i and j,
> where j is initialized, but i isn't (typically indicated in tree dumps
> by qualified ssa names like i(D).  In PR 100810, the loop optimizers
> end up selecting i as the "best" candidate (perhaps because having no
> initialization it's cheaper) which leads to problems in later passes
> when (the equivalent of) j is considered not to have the value 10 after
> the loop, as its definition is now computed from an undefined/uninitialized
> value.
>
> The fix below is to add a field to track whether any "iv_group" contains
> a use of an iv based on a undefined value, and then prohibit IVs that are
> based on undefined values from being candidates for groups that don't use
> undefined values.  This may seem lenient, but it allows an IV with an
> undefined base to be a candidate for itself, and is a sufficient condition
> to avoid the above bug/regression.  A stricter condition might be to only
> allow "undefined_value iv"s as candidates for iv_groups where *all*
> uses are to "undefined_value ivs"?  My concern was that this might lead
> to cases/loops that no longer have suitable candidates (i.e. a possible
> performance regression).
>
> Hopefully, the tree-loop optimization experts agree with my analysis/fix.

So the reason why the generated code is "wrong" is that we end up with

  # i_16 = PHI <i_24(D)(19), i_17(7)>
  _40 = (unsigned int) b.7_13;
  _47 = (unsigned int) i_24(D);
  _49 = _40 + _47;
  _39 = (unsigned int) i_16;
  _22 = -_39;
  _33 = _22 + _49;

where we relate i_24(D) and i_16 = PHI <i_24(D)(19), ...> but two
evaluations of the same undefined SSA name do not necessarily
yield the same value.  The undefined SSA names pop in via
SCEV and niter analysis so trying to fix this in IVOPTs is plugging
the hole only in a single place.  Not only do the SSA names not
evaluate to the same value in the end, but here CCP assumes
that UNDEFINED - UNDEFINED is UNDEFINED but
i_24(D) - i_24(D) is _not_ UNDEFINED.

So

Visiting statement:
_40 = (unsigned int) b.7_13;
which is likely CONSTANT
Lattice value changed to VARYING.  Adding SSA edges to worklist.

Visiting statement:
_47 = (unsigned int) i_24(D);
which is likely UNDEFINED
Lattice value changed to UNDEFINED.  Adding SSA edges to worklist.
marking stmt to be not simulated again

Visiting statement:
_49 = _47 + _40;
which is likely UNDEFINED

this last conclusion looks wrong to me, in fact most of the UNDEFINED
proapgation handling in CCP seems to assume that i_24(D) - i_24(D)
is not necessarily zero which is in conflict with us happily pulling
such not "stabilized" operands into expressions and propagating them
at will.

ISTR a paper about this very issue from the clang/llvm folks and they
inventing some special notion for not-quite-undefined or so ...

I don't have a good answer yet to the problem at hand (also given
CCP isn't the only place that doesn't treat i_24(D) as having the
same value in all contexts).  The classical testcase for this would be

int __attribute__((noipa)) foo (int i)
{
  int j, k, m;
  int val = j;
  if (i)
    k = val;
  else
    k = 3;
  if (i)
    m = val;
  else
    m = 2;
  return m - k;
}
int main()
{
  if (foo (1) != 0)
    __builtin_abort ();
}

where 'k' is optimistically 3 from merging UNDEFINED and 3
from two executable edges.  The reasoning is that eventually
only initialized values may contribute to sth meaningful which
fails to consider X - X.  We have not considered this to be a bug
but we now have to work towards making the compiler itself
avoid creating such cancellations ...

Richard.

> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> and "make -k check" with no new failures.  Ok for mainline?
>
> 2021-08-31  Roger Sayle  <roger@nextmovesoftware.com>
>             Andrew Pinski  <apinski@marvell.com>
>
> gcc/ChangeLog
>         PR middle-end/100810
>         * tree-ssa-loop-ivopts.c (struct iv_group): Add a new
>         uses_undefined_value_p field, indicating this group has
>         uses of iv's whose base is ssa_undefined_value_p.
>         (record_use): Update uses_undefined_value_p as required.
>         (record_group): Initialize uses_undefined_value_p to false.
>         (determine_group_iv_cost_generic): Consider a candidate with
>         a ssa_undefined_value_p base to have infinite_cost for a
>         group where uses_undefined_value_p is false.
>
> gcc/testsuite/ChangeLog
>         PR middle-end/100810
>         * gcc.dg/pr100810.c: New test case.
>
> Roger
> --
>
diff mbox series

Patch

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4a498ab..ca8f526 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -432,6 +432,8 @@  struct iv_group
   struct iv_cand *selected;
   /* To indicate this is a doloop use group.  */
   bool doloop_p;
+  /* This group uses undefined values.  */
+  bool uses_undefined_value_p;
   /* Uses in the group.  */
   vec<struct iv_use *> vuses;
 };
@@ -1540,6 +1542,12 @@  record_use (struct iv_group *group, tree *use_p, struct iv *iv,
   use->addr_offset = addr_offset;
 
   group->vuses.safe_push (use);
+
+  /* Record uses of undefined values.  */
+  if (TREE_CODE (iv->base) == SSA_NAME
+      && ssa_undefined_value_p (iv->base))
+    group->uses_undefined_value_p = true;
+
   return use;
 }
 
@@ -1582,6 +1590,7 @@  record_group (struct ivopts_data *data, enum use_type type)
   group->related_cands = BITMAP_ALLOC (NULL);
   group->vuses.create (1);
   group->doloop_p = false;
+  group->uses_undefined_value_p = false;
 
   data->vgroups.safe_push (group);
   return group;
@@ -4960,6 +4969,12 @@  determine_group_iv_cost_generic (struct ivopts_data *data,
      the candidate.  */
   if (cand->pos == IP_ORIGINAL && cand->incremented_at == use->stmt)
     cost = no_cost;
+  /* Disallow using an iv based on an undefined value as a candidate
+     replacement for a group that uses only defined values.  */
+  else if (!group->uses_undefined_value_p
+	   && TREE_CODE (cand->iv->base) == SSA_NAME
+	   && ssa_undefined_value_p (cand->iv->base))
+    cost = infinite_cost;
   else
     cost = get_computation_cost (data, use, cand, false,
 				 &inv_vars, NULL, &inv_expr);