diff mbox series

PR tree-optimization/109462 - Don't use ANY PHI equivalences in range-on-entry.

Message ID 951d67a7-5eb7-35e5-5e68-ddd6e3d05e3f@redhat.com
State New
Headers show
Series PR tree-optimization/109462 - Don't use ANY PHI equivalences in range-on-entry. | expand

Commit Message

Andrew MacLeod April 11, 2023, 11:52 p.m. UTC
This is a carry over from PR 108139.

When we have a PHI node which has 2 arguments and one is undefined, we 
create an equivalence between the LHS and the non-undefined PHI 
argument.  THis allows us to perform certain optimizations.

The problem is, when we are evaluating range-on-entry in the cache, its 
depends on where that equivalence is made, from where we have no context.

a_3 = <b_2 (3), c_3 (4)>

if c_3 is undefined,  then a_3 is equivalent to b_2... but b_2 is not 
equivalence to a_3 everywhere..   its a one way thing.

108139 fixed this by not evaluating any equivalences if the equivalence 
was the LHS.

What it missed, was it possible we are calculating the range of a_3.   
b_2 is not defined in a phi node, so it happily used the equivalence.  
This PR demonstrates that we can't always use that equivlence either 
without more context.  There can be places in the IL where a_3 is used, 
but b_2 has moved to a new value within a loop.

So we can't do this if either NAME or the equivalence is equal via a PHI 
node with an undefined argument.

Unfortunately, this unsafe assumption is why PR 101912 is fixed.   
Fixing this issue properly is going to cause that to reopen as it is 
unsafe. (That PR is  a false uninitialized warning issue, rather than an 
wrong-code issue)

This bootstraps on x86_64-pc-linux-gnu  with that single regression, 
which I have XFAILed for now.  OK for trunk?   Once Jakub verifies it 
actually fixes the execution problem.   we have no executable test . yet.

Andrew

Comments

Jakub Jelinek April 12, 2023, 8:20 a.m. UTC | #1
On Tue, Apr 11, 2023 at 07:52:29PM -0400, Andrew MacLeod wrote:
> This bootstraps on x86_64-pc-linux-gnu  with that single regression, which I
> have XFAILed for now.  OK for trunk?

Yes.

>   Once Jakub verifies it actually fixes
> the execution problem.   we have no executable test . yet.

I have verified this fix both on the original clang testcase, and
on a self-contained testcase I've reduced overnight and this morning.

Ok to commit it to trunk incrementally after your commit?

BTW, I've wondered if it is just this uninitialized phi arg problem or if
I could reproduce it also if the phi arg was initialized but had the same
range as the current iteration is known to have due to a comparison (even
when the actual value is from the previous loop's iteration).  In the
test below, that would be Result.c = r_paren; before
while (!TheLexer.LexFromRawLexer (I)) loop.  But it wasn't miscompiled in
that case.

2023-04-12  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/109462
	* g++.dg/opt/pr109462.C: New test.

--- gcc/testsuite/g++.dg/opt/pr109462.C.jj	2023-04-12 09:58:23.085603031 +0200
+++ gcc/testsuite/g++.dg/opt/pr109462.C	2023-04-12 09:54:22.472079711 +0200
@@ -0,0 +1,94 @@
+// PR tree-optimization/109462
+// { dg-do run { target c++11 } }
+// { dg-options "-O2" }
+
+struct A {
+  A (const char *);
+  A (const char *, int);
+  bool empty ();
+  int size ();
+  bool equals (A);
+  A trim (char);
+  A trim ();
+};
+[[gnu::noipa]] A::A (const char *) {}
+[[gnu::noipa]] A::A (const char *, int) { __builtin_abort (); }
+[[gnu::noipa]] bool A::empty () { __builtin_abort (); }
+[[gnu::noipa]] int A::size () { __builtin_abort (); }
+[[gnu::noipa]] bool A::equals (A) { return true; }
+[[gnu::noipa]] A A::trim (char) { __builtin_abort (); }
+[[gnu::noipa]] A A::trim () { __builtin_abort (); }
+
+enum B { raw_identifier = 6, l_paren = 21, r_paren = 22 };
+[[gnu::noipa]] bool isAnyIdentifier (B) { return true; }
+[[gnu::noipa]] bool isStringLiteral (B) { __builtin_abort (); }
+
+struct C {
+  B c;
+  B getKind () { return c; }
+  bool is (B x) { return c == x; }
+  unsigned getLength () { __builtin_abort (); }
+  A getRawIdentifier () {
+    A x ("");
+    c == raw_identifier ? void () : __builtin_abort ();
+    return x;
+  }
+  const char *getLiteralData ();
+};
+[[gnu::noipa]] const char *C::getLiteralData () { __builtin_abort (); }
+
+struct D {
+  D ();
+  bool LexFromRawLexer (C &);
+};
+[[gnu::noipa]] D::D () {}
+[[gnu::noipa]] bool D::LexFromRawLexer (C &t) {
+  static int cnt;
+  C tok[] = { { raw_identifier }, { l_paren }, { raw_identifier }, { r_paren } };
+  t = tok[cnt++];
+  return false;
+}
+
+bool ok = false;
+[[gnu::noipa]] void reportEmptyContextError ()
+{
+  ok = true;
+}
+
+[[gnu::noipa]] void
+VisitObjCMessageExpr ()
+{
+  D TheLexer;
+  C I;
+  C Result;
+  int p_count = 0;
+  while (!TheLexer.LexFromRawLexer (I)) {
+    if (I.getKind () == l_paren)
+      ++p_count;
+    if (I.getKind () == r_paren) {
+      if (p_count == 1)
+        break;
+      --p_count;
+    }
+    Result = I;
+  }
+  if (isAnyIdentifier (Result.getKind ())) {
+    if (Result.getRawIdentifier ().equals ("nil")) {
+      reportEmptyContextError ();
+      return;
+    }
+  }
+  if (!isStringLiteral (Result.getKind ()))
+    return;
+  A Comment = A (Result.getLiteralData (), Result.getLength ()).trim ('"');
+  if ((Comment.trim ().size () == 0 && Comment.size () > 0) || Comment.empty ())
+    reportEmptyContextError ();
+}
+
+int
+main ()
+{
+  VisitObjCMessageExpr ();
+  if (!ok)
+    __builtin_abort ();
+}


	Jakub
Richard Biener April 12, 2023, 9:12 a.m. UTC | #2
On Wed, Apr 12, 2023 at 1:52 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> This is a carry over from PR 108139.
>
> When we have a PHI node which has 2 arguments and one is undefined, we
> create an equivalence between the LHS and the non-undefined PHI
> argument.  THis allows us to perform certain optimizations.
>
> The problem is, when we are evaluating range-on-entry in the cache, its
> depends on where that equivalence is made, from where we have no context.
>
> a_3 = <b_2 (3), c_3 (4)>
>
> if c_3 is undefined,  then a_3 is equivalent to b_2... but b_2 is not
> equivalence to a_3 everywhere..   its a one way thing.
>
> 108139 fixed this by not evaluating any equivalences if the equivalence
> was the LHS.
>
> What it missed, was it possible we are calculating the range of a_3.
> b_2 is not defined in a phi node, so it happily used the equivalence.
> This PR demonstrates that we can't always use that equivlence either
> without more context.  There can be places in the IL where a_3 is used,
> but b_2 has moved to a new value within a loop.

I think that's only possible when b_2 flows in via a backedge (from BB3).
So isn't this all about backedges?  Indeed creating equivalences across
backedges is futile with SSA.  I think ranger requires dominators, so
to have the above scenario - a_3 used after the b_2 definition - requires
BB3 to be dominated by the a_3 definition which is what you could check.

>
> So we can't do this if either NAME or the equivalence is equal via a PHI
> node with an undefined argument.
>
> Unfortunately, this unsafe assumption is why PR 101912 is fixed.
> Fixing this issue properly is going to cause that to reopen as it is
> unsafe. (That PR is  a false uninitialized warning issue, rather than an
> wrong-code issue)
>
> This bootstraps on x86_64-pc-linux-gnu  with that single regression,
> which I have XFAILed for now.  OK for trunk?   Once Jakub verifies it
> actually fixes the execution problem.   we have no executable test . yet.
>
> Andrew
>
>
>
Jakub Jelinek April 12, 2023, 10:58 a.m. UTC | #3
On Wed, Apr 12, 2023 at 11:12:17AM +0200, Richard Biener wrote:
> > 108139 fixed this by not evaluating any equivalences if the equivalence
> > was the LHS.
> >
> > What it missed, was it possible we are calculating the range of a_3.
> > b_2 is not defined in a phi node, so it happily used the equivalence.
> > This PR demonstrates that we can't always use that equivlence either
> > without more context.  There can be places in the IL where a_3 is used,
> > but b_2 has moved to a new value within a loop.
> 
> I think that's only possible when b_2 flows in via a backedge (from BB3).
> So isn't this all about backedges?  Indeed creating equivalences across

Apparently backedges and undefined phi arg, without that it doesn't seem
to trigger because then it isn't considered equivalent even if the two have
same range.

> backedges is futile with SSA.  I think ranger requires dominators, so
> to have the above scenario - a_3 used after the b_2 definition - requires
> BB3 to be dominated by the a_3 definition which is what you could check.

Would be nice.

Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
it has exactly what happens in this PR, undefined phi arg from the
pre-header and uses of the previous iteration's value (i.e. across
backedge).

	Jakub
Richard Biener April 12, 2023, 11:01 a.m. UTC | #4
On Wed, Apr 12, 2023 at 12:59 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Apr 12, 2023 at 11:12:17AM +0200, Richard Biener wrote:
> > > 108139 fixed this by not evaluating any equivalences if the equivalence
> > > was the LHS.
> > >
> > > What it missed, was it possible we are calculating the range of a_3.
> > > b_2 is not defined in a phi node, so it happily used the equivalence.
> > > This PR demonstrates that we can't always use that equivlence either
> > > without more context.  There can be places in the IL where a_3 is used,
> > > but b_2 has moved to a new value within a loop.
> >
> > I think that's only possible when b_2 flows in via a backedge (from BB3).
> > So isn't this all about backedges?  Indeed creating equivalences across
>
> Apparently backedges and undefined phi arg, without that it doesn't seem
> to trigger because then it isn't considered equivalent even if the two have
> same range.
>
> > backedges is futile with SSA.  I think ranger requires dominators, so
> > to have the above scenario - a_3 used after the b_2 definition - requires
> > BB3 to be dominated by the a_3 definition which is what you could check.
>
> Would be nice.
>
> Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
> it has exactly what happens in this PR, undefined phi arg from the
> pre-header and uses of the previous iteration's value (i.e. across
> backedge).

Well yes, that's what's not allowed.  So when the PHI dominates the
to-be-equivalenced argument edge src then the equivalence isn't
valid because there's a place (that very source block for example) a use of the
PHI lhs could appear and where we'd then mixup iterations.

>
>         Jakub
>
Andrew MacLeod April 12, 2023, 1:12 p.m. UTC | #5
On 4/12/23 04:20, Jakub Jelinek wrote:
> On Tue, Apr 11, 2023 at 07:52:29PM -0400, Andrew MacLeod wrote:
>> This bootstraps on x86_64-pc-linux-gnu  with that single regression, which I
>> have XFAILed for now.  OK for trunk?
> Yes.
>
>>     Once Jakub verifies it actually fixes
>> the execution problem.   we have no executable test . yet.
> I have verified this fix both on the original clang testcase, and
> on a self-contained testcase I've reduced overnight and this morning.
>
> Ok to commit it to trunk incrementally after your commit?


Sure. I just pushed it.

Andrew
Andrew MacLeod April 12, 2023, 8:55 p.m. UTC | #6
On 4/12/23 07:01, Richard Biener wrote:
> On Wed, Apr 12, 2023 at 12:59 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>
>> Would be nice.
>>
>> Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
>> it has exactly what happens in this PR, undefined phi arg from the
>> pre-header and uses of the previous iteration's value (i.e. across
>> backedge).
> Well yes, that's what's not allowed.  So when the PHI dominates the
> to-be-equivalenced argument edge src then the equivalence isn't
> valid because there's a place (that very source block for example) a use of the
> PHI lhs could appear and where we'd then mixup iterations.
>
If we want to implement this cleaner, then as you say, we don't create 
the equivalence if the PHI node dominates the argument edge.  The 
attached patch does just that, removing the both the "fix" for 108139 
and the just committed one for 109462, replacing them with catching this 
at the time of equivalence registering.

It bootstraps and passes all regressions tests.
Do you want me to check this into trunk?

Andrew

PS    Of course, we still fail 101912.   The only way I see us being 
able to do anything with that is to effectively peel the first iteration 
off, either physically,  or logically with the path ranger to determine 
if a given use  is actually reachable by the undefined value.

<bb2>

   <bb 9> :
   # prevcorr_7 = PHI <prevcorr_19(D)(2), corr_22(8)>
   # leapcnt_8 = PHI <0(2), leapcnt_26(8)>
   if (leapcnt_8 < n_16)           // 0 < n_16
     goto <bb 3>; [INV]

   <bb 3> :
   corr_22 = getint ();
   if (corr_22 <= 0)
     goto <bb 7>; [INV]
   else
     goto <bb 4>; [INV]

   <bb 4> :
   _1 = corr_22 == 1;
   _2 = leapcnt_8 != 0;      // [0, 0] = 0 != 0
   _3 = _1 & _2;             // [0, 0] = 0 & _2
   if (_3 != 0)                // 4->5 is not taken on the path starting 
2->9
     goto <bb 5>; [INV]
   else
     goto <bb 8>; [INV]

   <bb 5> :                 // We know this path is not taken when 
prevcorr_7  == prevcorr_19(D)(2)
   if (prevcorr_7 != 1)
     goto <bb 6>; [INV]
   else
     goto <bb 8>; [INV]

   <bb 6> :
   _5 = prevcorr_7 + -1;
   if (prevcorr_7 != 2)
     goto <bb 7>; [INV]
   else
     goto <bb 8>; [INV]

Using the path ranger (Would it even need tweaks aldy?) , before issuing 
the warning the uninit code could easily start at each use, construct 
the path(s) to that use from the unitialized value, and determine  when 
prevcorr is uninitialized, 2->9->3->4->5 will not be executed  and of 
course,neither will 2->9->3->4->5->6

   I think threading already does something similar?
Richard Biener April 13, 2023, 1:56 p.m. UTC | #7
On Wed, Apr 12, 2023 at 10:55 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
>
> On 4/12/23 07:01, Richard Biener wrote:
> > On Wed, Apr 12, 2023 at 12:59 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >>
> >> Would be nice.
> >>
> >> Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
> >> it has exactly what happens in this PR, undefined phi arg from the
> >> pre-header and uses of the previous iteration's value (i.e. across
> >> backedge).
> > Well yes, that's what's not allowed.  So when the PHI dominates the
> > to-be-equivalenced argument edge src then the equivalence isn't
> > valid because there's a place (that very source block for example) a use of the
> > PHI lhs could appear and where we'd then mixup iterations.
> >
> If we want to implement this cleaner, then as you say, we don't create
> the equivalence if the PHI node dominates the argument edge.  The
> attached patch does just that, removing the both the "fix" for 108139
> and the just committed one for 109462, replacing them with catching this
> at the time of equivalence registering.
>
> It bootstraps and passes all regressions tests.
> Do you want me to check this into trunk?

Uh, it looks a bit convoluted.  Wouldn't the following be enough?  OK
if that works
(or fixed if I messed up trivially)

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index e81f6b3699e..9c29012e160 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -776,7 +776,11 @@ fold_using_range::range_of_phi (vrange &r, gphi
*phi, fur_source &src)
          if (!seen_arg)
            {
              seen_arg = true;
-             single_arg = arg;
+             // Avoid registering an equivalence if the PHI dominates the
+             // argument edge.  See PR 108139/109462.
+             if (dom_info_available_p (CDI_DOMINATORS)
+                 && !dominated_by_p (CDI_DOMINATORS, e->src, gimple_bb (phi)))
+               single_arg = arg;
            }
          else if (single_arg != arg)
            single_arg = NULL_TREE;


> Andrew
>
> PS    Of course, we still fail 101912.   The only way I see us being
> able to do anything with that is to effectively peel the first iteration
> off, either physically,  or logically with the path ranger to determine
> if a given use  is actually reachable by the undefined value.
>
> <bb2>
>
>    <bb 9> :
>    # prevcorr_7 = PHI <prevcorr_19(D)(2), corr_22(8)>
>    # leapcnt_8 = PHI <0(2), leapcnt_26(8)>
>    if (leapcnt_8 < n_16)           // 0 < n_16
>      goto <bb 3>; [INV]
>
>    <bb 3> :
>    corr_22 = getint ();
>    if (corr_22 <= 0)
>      goto <bb 7>; [INV]
>    else
>      goto <bb 4>; [INV]
>
>    <bb 4> :
>    _1 = corr_22 == 1;
>    _2 = leapcnt_8 != 0;      // [0, 0] = 0 != 0
>    _3 = _1 & _2;             // [0, 0] = 0 & _2
>    if (_3 != 0)                // 4->5 is not taken on the path starting
> 2->9
>      goto <bb 5>; [INV]
>    else
>      goto <bb 8>; [INV]
>
>    <bb 5> :                 // We know this path is not taken when
> prevcorr_7  == prevcorr_19(D)(2)
>    if (prevcorr_7 != 1)
>      goto <bb 6>; [INV]
>    else
>      goto <bb 8>; [INV]
>
>    <bb 6> :
>    _5 = prevcorr_7 + -1;
>    if (prevcorr_7 != 2)
>      goto <bb 7>; [INV]
>    else
>      goto <bb 8>; [INV]
>
> Using the path ranger (Would it even need tweaks aldy?) , before issuing
> the warning the uninit code could easily start at each use, construct
> the path(s) to that use from the unitialized value, and determine  when
> prevcorr is uninitialized, 2->9->3->4->5 will not be executed  and of
> course,neither will 2->9->3->4->5->6
>
>    I think threading already does something similar?

It does quite more convoluted things than that - it computes predicates on
paths with its own representation & simplifications.  It might be worth
trying if replacing this with a lot of path rangers would help - but then
it heavily relies on relation simplification it implements (and at the
same time it does that very imperfectly).

Richard.

>
>
Andrew MacLeod April 13, 2023, 3:48 p.m. UTC | #8
On 4/13/23 09:56, Richard Biener wrote:
> On Wed, Apr 12, 2023 at 10:55 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>>
>> On 4/12/23 07:01, Richard Biener wrote:
>>> On Wed, Apr 12, 2023 at 12:59 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>>> Would be nice.
>>>>
>>>> Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
>>>> it has exactly what happens in this PR, undefined phi arg from the
>>>> pre-header and uses of the previous iteration's value (i.e. across
>>>> backedge).
>>> Well yes, that's what's not allowed.  So when the PHI dominates the
>>> to-be-equivalenced argument edge src then the equivalence isn't
>>> valid because there's a place (that very source block for example) a use of the
>>> PHI lhs could appear and where we'd then mixup iterations.
>>>
>> If we want to implement this cleaner, then as you say, we don't create
>> the equivalence if the PHI node dominates the argument edge.  The
>> attached patch does just that, removing the both the "fix" for 108139
>> and the just committed one for 109462, replacing them with catching this
>> at the time of equivalence registering.
>>
>> It bootstraps and passes all regressions tests.
>> Do you want me to check this into trunk?
> Uh, it looks a bit convoluted.  Wouldn't the following be enough?  OK
> if that works
> (or fixed if I messed up trivially)
>
> diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
> index e81f6b3699e..9c29012e160 100644
> --- a/gcc/gimple-range-fold.cc
> +++ b/gcc/gimple-range-fold.cc
> @@ -776,7 +776,11 @@ fold_using_range::range_of_phi (vrange &r, gphi
> *phi, fur_source &src)
>            if (!seen_arg)
>              {
>                seen_arg = true;
> -             single_arg = arg;
> +             // Avoid registering an equivalence if the PHI dominates the
> +             // argument edge.  See PR 108139/109462.
> +             if (dom_info_available_p (CDI_DOMINATORS)
> +                 && !dominated_by_p (CDI_DOMINATORS, e->src, gimple_bb (phi)))
> +               single_arg = arg;
>              }
>            else if (single_arg != arg)
>              single_arg = NULL_TREE;
>
>
It would exposes a slight hole.. in cases where there is more than one 
copy of the name, ie:

for a_2 = PHI <c_3, c_3>  we currently will create an equivalence 
between  a_2 and c_3 because its considered a single argument.  Not a 
big deal for this case since all arguments are c_3, but the hole would 
be when we have something like:

a_2 = PHI<c_3, c_3, d_4(D)>    if d_4 is undefined, then with the above 
patch we would only check the dominance of the first edge with c_3. we'd 
need to check all of them.

The patch is slightly convoluted because we always defer checking the 
edge/processing single arguments until we think there is a reason to 
(for performance).  My patch simple does the deferred check on the 
previous edge and sets the new one so that we would check both edges are 
valid before setting the equivalence.  Even as it is with this deferred 
check we're about 0.4% slower in VRP. IF we didnt do this deferring, 
then every PHI is going to have a check.

And along the way, remove the boolean seen_arg because having 
single_arg_edge set produces the same information.

Perhaps it would be cleaner to simply defer the entire thing to the end, 
like so.
Performance is pretty much identical in the end.

Bootstraps on x86_64-pc-linux-gnu, regressions are running. Assuming no 
regressions pop up,   OK for trunk?

Andrew
Richard Biener April 13, 2023, 4:18 p.m. UTC | #9
> Am 13.04.2023 um 17:49 schrieb Andrew MacLeod <amacleod@redhat.com>:
> 
> 
>> On 4/13/23 09:56, Richard Biener wrote:
>>> On Wed, Apr 12, 2023 at 10:55 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>>> 
>>> On 4/12/23 07:01, Richard Biener wrote:
>>>> On Wed, Apr 12, 2023 at 12:59 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> Would be nice.
>>>>> 
>>>>> Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
>>>>> it has exactly what happens in this PR, undefined phi arg from the
>>>>> pre-header and uses of the previous iteration's value (i.e. across
>>>>> backedge).
>>>> Well yes, that's what's not allowed.  So when the PHI dominates the
>>>> to-be-equivalenced argument edge src then the equivalence isn't
>>>> valid because there's a place (that very source block for example) a use of the
>>>> PHI lhs could appear and where we'd then mixup iterations.
>>>> 
>>> If we want to implement this cleaner, then as you say, we don't create
>>> the equivalence if the PHI node dominates the argument edge.  The
>>> attached patch does just that, removing the both the "fix" for 108139
>>> and the just committed one for 109462, replacing them with catching this
>>> at the time of equivalence registering.
>>> 
>>> It bootstraps and passes all regressions tests.
>>> Do you want me to check this into trunk?
>> Uh, it looks a bit convoluted.  Wouldn't the following be enough?  OK
>> if that works
>> (or fixed if I messed up trivially)
>> 
>> diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
>> index e81f6b3699e..9c29012e160 100644
>> --- a/gcc/gimple-range-fold.cc
>> +++ b/gcc/gimple-range-fold.cc
>> @@ -776,7 +776,11 @@ fold_using_range::range_of_phi (vrange &r, gphi
>> *phi, fur_source &src)
>>           if (!seen_arg)
>>             {
>>               seen_arg = true;
>> -             single_arg = arg;
>> +             // Avoid registering an equivalence if the PHI dominates the
>> +             // argument edge.  See PR 108139/109462.
>> +             if (dom_info_available_p (CDI_DOMINATORS)
>> +                 && !dominated_by_p (CDI_DOMINATORS, e->src, gimple_bb (phi)))
>> +               single_arg = arg;
>>             }
>>           else if (single_arg != arg)
>>             single_arg = NULL_TREE;
>> 
>> 
> It would exposes a slight hole.. in cases where there is more than one copy of the name, ie:
> 
> for a_2 = PHI <c_3, c_3>  we currently will create an equivalence between  a_2 and c_3 because its considered a single argument.  Not a big deal for this case since all arguments are c_3, but the hole would be when we have something like:
> 
> a_2 = PHI<c_3, c_3, d_4(D)>    if d_4 is undefined, then with the above patch we would only check the dominance of the first edge with c_3. we'd need to check all of them.

I didn’t think so, but on a second thought for multiple backedges and an irreducible region you are right.  Once we see an edge with the reverse domination we’re fine though.  I originally checked all edges but thought checking the first is enough.

> The patch is slightly convoluted because we always defer checking the edge/processing single arguments until we think there is a reason to (for performance).  My patch simple does the deferred check on the previous edge and sets the new one so that we would check both edges are valid before setting the equivalence.  Even as it is with this deferred check we're about 0.4% slower in VRP. IF we didnt do this deferring, then every PHI is going to have a check.
> 
> And along the way, remove the boolean seen_arg because having single_arg_edge set produces the same information.
> 
> Perhaps it would be cleaner to simply defer the entire thing to the end, like so.
> Performance is pretty much identical in the end.
> 
> Bootstraps on x86_64-pc-linux-gnu, regressions are running. Assuming no regressions pop up,   OK for trunk?
> 
> Andrew
> 
> 
> 
> 
> 
> <462c.patch>
Richard Biener April 13, 2023, 4:20 p.m. UTC | #10
> Am 13.04.2023 um 17:49 schrieb Andrew MacLeod <amacleod@redhat.com>:
> 
> 
>> On 4/13/23 09:56, Richard Biener wrote:
>>> On Wed, Apr 12, 2023 at 10:55 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>>> 
>>> On 4/12/23 07:01, Richard Biener wrote:
>>>> On Wed, Apr 12, 2023 at 12:59 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> Would be nice.
>>>>> 
>>>>> Though, I'm afraid it still wouldn't fix the PR101912 testcase, because
>>>>> it has exactly what happens in this PR, undefined phi arg from the
>>>>> pre-header and uses of the previous iteration's value (i.e. across
>>>>> backedge).
>>>> Well yes, that's what's not allowed.  So when the PHI dominates the
>>>> to-be-equivalenced argument edge src then the equivalence isn't
>>>> valid because there's a place (that very source block for example) a use of the
>>>> PHI lhs could appear and where we'd then mixup iterations.
>>>> 
>>> If we want to implement this cleaner, then as you say, we don't create
>>> the equivalence if the PHI node dominates the argument edge.  The
>>> attached patch does just that, removing the both the "fix" for 108139
>>> and the just committed one for 109462, replacing them with catching this
>>> at the time of equivalence registering.
>>> 
>>> It bootstraps and passes all regressions tests.
>>> Do you want me to check this into trunk?
>> Uh, it looks a bit convoluted.  Wouldn't the following be enough?  OK
>> if that works
>> (or fixed if I messed up trivially)
>> 
>> diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
>> index e81f6b3699e..9c29012e160 100644
>> --- a/gcc/gimple-range-fold.cc
>> +++ b/gcc/gimple-range-fold.cc
>> @@ -776,7 +776,11 @@ fold_using_range::range_of_phi (vrange &r, gphi
>> *phi, fur_source &src)
>>           if (!seen_arg)
>>             {
>>               seen_arg = true;
>> -             single_arg = arg;
>> +             // Avoid registering an equivalence if the PHI dominates the
>> +             // argument edge.  See PR 108139/109462.
>> +             if (dom_info_available_p (CDI_DOMINATORS)
>> +                 && !dominated_by_p (CDI_DOMINATORS, e->src, gimple_bb (phi)))
>> +               single_arg = arg;
>>             }
>>           else if (single_arg != arg)
>>             single_arg = NULL_TREE;
>> 
>> 
> It would exposes a slight hole.. in cases where there is more than one copy of the name, ie:
> 
> for a_2 = PHI <c_3, c_3>  we currently will create an equivalence between  a_2 and c_3 because its considered a single argument.  Not a big deal for this case since all arguments are c_3, but the hole would be when we have something like:
> 
> a_2 = PHI<c_3, c_3, d_4(D)>    if d_4 is undefined, then with the above patch we would only check the dominance of the first edge with c_3. we'd need to check all of them.
> 
> The patch is slightly convoluted because we always defer checking the edge/processing single arguments until we think there is a reason to (for performance).  My patch simple does the deferred check on the previous edge and sets the new one so that we would check both edges are valid before setting the equivalence.  Even as it is with this deferred check we're about 0.4% slower in VRP. IF we didnt do this deferring, then every PHI is going to have a check.
> 
> And along the way, remove the boolean seen_arg because having single_arg_edge set produces the same information.
> 
> Perhaps it would be cleaner to simply defer the entire thing to the end, like so.
> Performance is pretty much identical in the end.
> 
> Bootstraps on x86_64-pc-linux-gnu, regressions are running. Assuming no regressions pop up,   OK for trunk?

Yes.  I like this more.  OK

Richard 

> Andrew
> 
> 
> 
> 
> 
> <462c.patch>
diff mbox series

Patch

commit 90848fb75cf91a45edd355d2b1485ef835099609
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Tue Apr 11 17:29:03 2023 -0400

    Don't use ANY PHI equivalences in range-on-entry.
    
    PR 108139 dissallows PHI equivalencies in the on-entry calculator, but
    it was only checking if the equivlaence was a PHI.  In this case, NAME
    itself is a PHI with an equivlaence caused by an undefined value, so we
    also need to check that case.  Unfortunately this un-fixes 101912.
    
            PR tree-optimization/109462
            gcc/
            * gimple-range-cache.cc (ranger_cache::fill_block_cache): Don't
            check for equivalences if NAME is a phi node.
    
            gcc/testsuite/
            * gcc.dg/uninit-pr101912.c: XFAIL the warning.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 6a098d8ec28..3b52f1e734c 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1218,7 +1218,9 @@  ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
 	  fprintf (dump_file, "\n");
 	}
       // See if any equivalences can refine it.
-      if (m_oracle)
+      // PR 109462, like 108139 below, a one way equivalence introduced
+      // by a PHI node can also be through the definition side.  Disallow it.
+      if (m_oracle && !is_a<gphi *> (SSA_NAME_DEF_STMT (name)))
 	{
 	  tree equiv_name;
 	  relation_kind rel;
diff --git a/gcc/testsuite/gcc.dg/uninit-pr101912.c b/gcc/testsuite/gcc.dg/uninit-pr101912.c
index 1550c03436d..62cd2a0c73e 100644
--- a/gcc/testsuite/gcc.dg/uninit-pr101912.c
+++ b/gcc/testsuite/gcc.dg/uninit-pr101912.c
@@ -11,7 +11,7 @@  tzloadbody (void)
   for (int i = 0; i < n; i++)
     {
       int corr = getint ();
-      if (corr < 1 || (corr == 1 && !(leapcnt == 0 || (prevcorr < corr ? corr == prevcorr + 1 : (corr == prevcorr || corr == prevcorr - 1))))) /* { dg-bogus "uninitialized" } */
+      if (corr < 1 || (corr == 1 && !(leapcnt == 0 || (prevcorr < corr ? corr == prevcorr + 1 : (corr == prevcorr || corr == prevcorr - 1))))) /* { dg-bogus "uninitialized" "pr101912" { xfail *-*-* } } */
 	return -1;
 
       prevcorr = corr;