diff mbox series

[PUSHED] Hybrid EVRP and testcases

Message ID 700c236b-4387-8d9a-152d-43d059cd5b4e@redhat.com
State New
Headers show
Series [PUSHED] Hybrid EVRP and testcases | expand

Commit Message

Andrew MacLeod Oct. 6, 2020, 5:05 p.m. UTC
I have now checked in the hybrid EVRP pass.

We have resolved all the issue we are aware of with a full Fedora build, 
but if any more issues arise, please let us know. (And Im sure you will :-)

I made some minor tweaks.   the option to the new -fevrp-mode  flag are now:

legacy             : classic EVRP mode
ranger             : Ranger only mode
*legacy-first     :  Query ranges with EVRP first, and if that fails try 
the ranger*
ranger-first     : Query the ranger first, then evrp
ranger-trace   : Ranger-only mode plus Show range tracing info in the dump
ranger-debug : Ranger-only mode, and also include all cache debugging info
trace                : Hybrid mode with range tracing info
debug              : Hybrid mode with cache debugging as well as tracing

The default is still *legacy-first*.

********
If there is a compilation problem, and the problem goes away with    
-fevrp-mode=legacy
the ranger is to blame, and let us know asap.
********

Attached is the patch which was applied.

-------------------------------------------

These are the initial test case differences, and the detailed analysis 
is below:

1)  We now XPASS analyzer/pr94851-1.c.
2) -disable-tree-evrp added to gcc.dg/pr81192.c
3) -disable-tree-evrp added to gcc.dg/tree-ssa/pr77445-2.c
4) -disable-tree-evrp added to tree-ssa/ssa-dom-thread-6.c
5) -disable-tree-evrp added to tree-ssa/ssa-dom-thread-7.c


mostly it is one of 2 things:

1) we propagate a constant into into PHI that wasnt happening before, 
EVRP didn't  handle anything other than single entry blocks well.
2) switches are processed in a lot more detail, which again propagate a 
lot of values into PHIs, and it then triggers more threading.

Last minute update... It also turns out the analyzer xpass may be noise. 
we did change the IL, but it sounds like there are other analyzer things 
at play at the moment :-)

--------------------------------------------------------------------------------------------------------------------------------------------------------
1)  We now XPASS analyzer/pr94851-1.c.


It was xfailing with:
In function ‘pamark’:
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:43:13: 
warning: leak of ‘p’ [CWE-401] [-Wanalyzer-malloc-leak]
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:29:6: note: 
(1) following ‘false’ branch (when ‘p’ is NULL)...
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:23: note: 
(2) ...to here
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:23: note: 
(3) allocated here
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:8: note: 
(4) assuming ‘p’ is non-NULL
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:8: note: 
(5) following ‘false’ branch (when ‘p’ is non-NULL)...
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:35:15: note: 
(6) ...to here
/gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:43:13: note: 
(7) ‘p’ leaks here; was allocated at (3)

now we produce:
XPASS: gcc.dg/analyzer/pr94851-1.c bogus leak (test for bogus messages, 
line 43)


THe reason is in the IL:
  <bb 4> :
   # p_9 = PHI <p_16(2), p_21(3)>
   # last_11 = PHI <p_16(2), p_9(3)>
   if (p_9 != 0B)
     goto <bb 5>; [INV]
   else
     goto <bb 6>; [INV]              --> This outgoing edge

   <bb 5> :
   _3 = p_9->m_name;
   _4 = (char) _32;
   if (_3 != _4)
     goto <bb 3>; [INV]
   else
     goto <bb 6>; [INV]

   <bb 6> :
   # p_2 = PHI <p_9(4), p_9(5)>         <<<----   THis PHI node
   # last_17 = PHI <last_11(4), last_11(5)>
   if (p_2 != 0B)
     goto <bb 7>; [INV]
   else
     goto <bb 8>; [INV]

   <bb 7> :
   printf ("over writing mark %c\n", _32);
   goto <bb 13>; [INV]


The ranger propagates the p_9 == 0 from the else branch into the PHI 
argument on edge 4->6
  <bb 6> :
   # p_2 = PHI <0B(4), p_9(5)>

which the threaders can then bypass the print in bb7 on one path, and 
that seems to resolve the current issue.

THe IL produced by the time we get to .optimized is identical, we just 
clear it up early enough for the analyzer to use now.

-------------------------------------------------------------------------------------------------------------------

2) -fdisable-tree-evrp added to gcc.dg/pr81192.c to enable the test to pass

new version of evrp sees
<bb 3> :
   if (j_8(D) != 2147483647)
     goto <bb 4>; [50.00%]
   else
     goto <bb 5>; [50.00%]
<bb 4> :
   iftmp.2_11 = j_8(D) + 1;
<bb 5> :
   # iftmp.2_12 = PHI <j_8(D)(3), iftmp.2_11(4)>

EVRP now recognizes a constant can be propagated into the 3->5 edge and
produces
   # iftmp.2_12 = PHI <2147483647(3), iftmp.2_11(4)>
which causes the situation being tested to dissappear before we get to 
PRE.  */

-------------------------------------------------------------------------------------------------------------------

3) -disable-tree-evrp added to gcc.dg/tree-ssa/pr77445-2.c

Aldy investigated this, and basically we are threading 6 more paths on 
x86_64 which is changing  the IL in visible ways.
Disabling evrp allows the threaders to test what they are looking for.

-------------------------------------------------------------------------

4) and 5)

along the same vein, we are threading anew opportunies in PHIs... these 
2 tests were changes to include the new evrp result.

Aldy has added comments to the test cases

+   # state_10 = PHI <state_11(7), 0(10), state_11(5), 1(6), 
state_11(8), 2(9), state_11(11)>
+
+   Now with the new evrp, we get:
+
+   # state_10 = PHI <0(7), 0(10), state_11(5), 1(6), 0(8), 2(9), 1(11)>
+
+   Thus, we have 3 more paths that are known to be constant and can be
+   threaded.  Which means that by the second threading pass, we can
+   only find one profitable path.
+
+   For the record, all these extra constants are better paths coming
+   out of switches.  For example:
+
+     SWITCH_BB -> BBx -> BBy -> BBz -> PHI
+
+   We now know the value of the switch index at PHI.  */

and

+$ diff clean/a.c.105t.mergephi2 a.c.105t.mergephi2
+252c252
+<   # s_50 = PHI <s_49(10), 5(14), s_51(18), s_51(22), 1(26), 1(29), 
1(31), s_51(5), 4(12), 1(15), 5(17), 1(19), 3(21), 1(23), 6(25), 7(28), 
s_51(30)>
+---
+>   # s_50 = PHI <s_49(10), 5(14), 4(18), 5(22), 1(26), 1(29), 1(31), 
s_51(5), 4(12), 1(15), 5(17), 1(19), 3(21), 1(23), 6(25), 7(28), 7(30)>
+272a273
+
+  I spot checked a few and they all have the same pattern.  We are
+  basically tracking the switch index better through multiple
+  paths.  */



Andrew

Comments

Richard Biener Oct. 7, 2020, 6:10 a.m. UTC | #1
On Tue, Oct 6, 2020 at 7:06 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> I have now checked in the hybrid EVRP pass.
>
> We have resolved all the issue we are aware of with a full Fedora build,
> but if any more issues arise, please let us know. (And Im sure you will :-)
>
> I made some minor tweaks.   the option to the new -fevrp-mode  flag are now:
>
> legacy             : classic EVRP mode
> ranger             : Ranger only mode
> *legacy-first     :  Query ranges with EVRP first, and if that fails try
> the ranger*
> ranger-first     : Query the ranger first, then evrp
> ranger-trace   : Ranger-only mode plus Show range tracing info in the dump
> ranger-debug : Ranger-only mode, and also include all cache debugging info
> trace                : Hybrid mode with range tracing info
> debug              : Hybrid mode with cache debugging as well as tracing
>
> The default is still *legacy-first*.

We'll have to keep -fevrp-mode forever so can you instead make it a --param
since I hope it is transitional only?  It certainly shouldn't be a
user-visible flag, should it?

Richard.

>
> ********
> If there is a compilation problem, and the problem goes away with
> -fevrp-mode=legacy
> the ranger is to blame, and let us know asap.
> ********
>
> Attached is the patch which was applied.
>
> -------------------------------------------
>
> These are the initial test case differences, and the detailed analysis
> is below:
>
> 1)  We now XPASS analyzer/pr94851-1.c.
> 2) -disable-tree-evrp added to gcc.dg/pr81192.c
> 3) -disable-tree-evrp added to gcc.dg/tree-ssa/pr77445-2.c
> 4) -disable-tree-evrp added to tree-ssa/ssa-dom-thread-6.c
> 5) -disable-tree-evrp added to tree-ssa/ssa-dom-thread-7.c
>
>
> mostly it is one of 2 things:
>
> 1) we propagate a constant into into PHI that wasnt happening before,
> EVRP didn't  handle anything other than single entry blocks well.
> 2) switches are processed in a lot more detail, which again propagate a
> lot of values into PHIs, and it then triggers more threading.
>
> Last minute update... It also turns out the analyzer xpass may be noise.
> we did change the IL, but it sounds like there are other analyzer things
> at play at the moment :-)
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> 1)  We now XPASS analyzer/pr94851-1.c.
>
>
> It was xfailing with:
> In function ‘pamark’:
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:43:13:
> warning: leak of ‘p’ [CWE-401] [-Wanalyzer-malloc-leak]
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:29:6: note:
> (1) following ‘false’ branch (when ‘p’ is NULL)...
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:23: note:
> (2) ...to here
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:23: note:
> (3) allocated here
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:8: note:
> (4) assuming ‘p’ is non-NULL
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:32:8: note:
> (5) following ‘false’ branch (when ‘p’ is non-NULL)...
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:35:15: note:
> (6) ...to here
> /gcc/master/gcc/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c:43:13: note:
> (7) ‘p’ leaks here; was allocated at (3)
>
> now we produce:
> XPASS: gcc.dg/analyzer/pr94851-1.c bogus leak (test for bogus messages,
> line 43)
>
>
> THe reason is in the IL:
>   <bb 4> :
>    # p_9 = PHI <p_16(2), p_21(3)>
>    # last_11 = PHI <p_16(2), p_9(3)>
>    if (p_9 != 0B)
>      goto <bb 5>; [INV]
>    else
>      goto <bb 6>; [INV]              --> This outgoing edge
>
>    <bb 5> :
>    _3 = p_9->m_name;
>    _4 = (char) _32;
>    if (_3 != _4)
>      goto <bb 3>; [INV]
>    else
>      goto <bb 6>; [INV]
>
>    <bb 6> :
>    # p_2 = PHI <p_9(4), p_9(5)>         <<<----   THis PHI node
>    # last_17 = PHI <last_11(4), last_11(5)>
>    if (p_2 != 0B)
>      goto <bb 7>; [INV]
>    else
>      goto <bb 8>; [INV]
>
>    <bb 7> :
>    printf ("over writing mark %c\n", _32);
>    goto <bb 13>; [INV]
>
>
> The ranger propagates the p_9 == 0 from the else branch into the PHI
> argument on edge 4->6
>   <bb 6> :
>    # p_2 = PHI <0B(4), p_9(5)>
>
> which the threaders can then bypass the print in bb7 on one path, and
> that seems to resolve the current issue.
>
> THe IL produced by the time we get to .optimized is identical, we just
> clear it up early enough for the analyzer to use now.
>
> -------------------------------------------------------------------------------------------------------------------
>
> 2) -fdisable-tree-evrp added to gcc.dg/pr81192.c to enable the test to pass
>
> new version of evrp sees
> <bb 3> :
>    if (j_8(D) != 2147483647)
>      goto <bb 4>; [50.00%]
>    else
>      goto <bb 5>; [50.00%]
> <bb 4> :
>    iftmp.2_11 = j_8(D) + 1;
> <bb 5> :
>    # iftmp.2_12 = PHI <j_8(D)(3), iftmp.2_11(4)>
>
> EVRP now recognizes a constant can be propagated into the 3->5 edge and
> produces
>    # iftmp.2_12 = PHI <2147483647(3), iftmp.2_11(4)>
> which causes the situation being tested to dissappear before we get to
> PRE.  */
>
> -------------------------------------------------------------------------------------------------------------------
>
> 3) -disable-tree-evrp added to gcc.dg/tree-ssa/pr77445-2.c
>
> Aldy investigated this, and basically we are threading 6 more paths on
> x86_64 which is changing  the IL in visible ways.
> Disabling evrp allows the threaders to test what they are looking for.
>
> -------------------------------------------------------------------------
>
> 4) and 5)
>
> along the same vein, we are threading anew opportunies in PHIs... these
> 2 tests were changes to include the new evrp result.
>
> Aldy has added comments to the test cases
>
> +   # state_10 = PHI <state_11(7), 0(10), state_11(5), 1(6),
> state_11(8), 2(9), state_11(11)>
> +
> +   Now with the new evrp, we get:
> +
> +   # state_10 = PHI <0(7), 0(10), state_11(5), 1(6), 0(8), 2(9), 1(11)>
> +
> +   Thus, we have 3 more paths that are known to be constant and can be
> +   threaded.  Which means that by the second threading pass, we can
> +   only find one profitable path.
> +
> +   For the record, all these extra constants are better paths coming
> +   out of switches.  For example:
> +
> +     SWITCH_BB -> BBx -> BBy -> BBz -> PHI
> +
> +   We now know the value of the switch index at PHI.  */
>
> and
>
> +$ diff clean/a.c.105t.mergephi2 a.c.105t.mergephi2
> +252c252
> +<   # s_50 = PHI <s_49(10), 5(14), s_51(18), s_51(22), 1(26), 1(29),
> 1(31), s_51(5), 4(12), 1(15), 5(17), 1(19), 3(21), 1(23), 6(25), 7(28),
> s_51(30)>
> +---
> +>   # s_50 = PHI <s_49(10), 5(14), 4(18), 5(22), 1(26), 1(29), 1(31),
> s_51(5), 4(12), 1(15), 5(17), 1(19), 3(21), 1(23), 6(25), 7(28), 7(30)>
> +272a273
> +
> +  I spot checked a few and they all have the same pattern.  We are
> +  basically tracking the switch index better through multiple
> +  paths.  */
>
>
>
> Andrew
Andrew MacLeod Oct. 7, 2020, 2:02 p.m. UTC | #2
On 10/7/20 2:10 AM, Richard Biener wrote:
> On Tue, Oct 6, 2020 at 7:06 PM Andrew MacLeod via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> I have now checked in the hybrid EVRP pass.
>>
>> We have resolved all the issue we are aware of with a full Fedora build,
>> but if any more issues arise, please let us know. (And Im sure you will :-)
>>
>> I made some minor tweaks.   the option to the new -fevrp-mode  flag are now:
>>
>> legacy             : classic EVRP mode
>> ranger             : Ranger only mode
>> *legacy-first     :  Query ranges with EVRP first, and if that fails try
>> the ranger*
>> ranger-first     : Query the ranger first, then evrp
>> ranger-trace   : Ranger-only mode plus Show range tracing info in the dump
>> ranger-debug : Ranger-only mode, and also include all cache debugging info
>> trace                : Hybrid mode with range tracing info
>> debug              : Hybrid mode with cache debugging as well as tracing
>>
>> The default is still *legacy-first*.
> We'll have to keep -fevrp-mode forever so can you instead make it a --param
> since I hope it is transitional only?  It certainly shouldn't be a
> user-visible flag, should it?
>
> Richard.
>
>
doesnt need to be user visible, just to give us the ability to turn 
things on and off.

Sure. I'll try to get to that shortly.
Aldy Hernandez Oct. 7, 2020, 3:19 p.m. UTC | #3
On 10/7/20 8:10 AM, Richard Biener wrote:
 > On Tue, Oct 6, 2020 at 7:06 PM Andrew MacLeod via Gcc-patches
 > <gcc-patches@gcc.gnu.org> wrote:
 >>
 >> I have now checked in the hybrid EVRP pass.
 >>
 >> We have resolved all the issue we are aware of with a full Fedora build,
 >> but if any more issues arise, please let us know. (And Im sure you 
will :-)
 >>
 >> I made some minor tweaks.   the option to the new -fevrp-mode  flag 
are now:
 >>
 >> legacy             : classic EVRP mode
 >> ranger             : Ranger only mode
 >> *legacy-first     :  Query ranges with EVRP first, and if that fails try
 >> the ranger*
 >> ranger-first     : Query the ranger first, then evrp
 >> ranger-trace   : Ranger-only mode plus Show range tracing info in 
the dump
 >> ranger-debug : Ranger-only mode, and also include all cache 
debugging info
 >> trace                : Hybrid mode with range tracing info
 >> debug              : Hybrid mode with cache debugging as well as tracing
 >>
 >> The default is still *legacy-first*.
 >
 > We'll have to keep -fevrp-mode forever so can you instead make it a 
--param
 > since I hope it is transitional only?  It certainly shouldn't be a
 > user-visible flag, should it?

Sounds reasonable.

Attached is a patch to do so.  It basically moves the code from 
common.opt to params.opt.  I also removed the undocumented modifier, as 
all --params are supposed to be for internal use only.

OK?

	* common.opt (-fevrp-mode): Rename and move...
	* params.opt (--param=evrp-mode): ...here.
	* gimple-range.h (DEBUG_RANGE_CACHE): Use param_evrp_mode instead
	of flag_evrp_mode.
	* gimple-ssa-evrp.c (rvrp_folder): Same.
	(hybrid_folder): Same.
	(execute_early_vrp): Same.
---
  gcc/common.opt        | 31 -------------------------------
  gcc/gimple-range.h    |  2 +-
  gcc/gimple-ssa-evrp.c |  8 ++++----
  gcc/params.opt        | 31 +++++++++++++++++++++++++++++++
  4 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index e2bd90c450d..7e789d1c47f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2870,37 +2870,6 @@ ftree-vrp
  Common Report Var(flag_tree_vrp) Init(0) Optimization
  Perform Value Range Propagation on trees.

-fevrp-mode=
-Common Undocumented Joined RejectNegative Enum(evrp_mode) 
Var(flag_evrp_mode) Init(EVRP_MODE_EVRP_FIRST) Optimization
--fevrp-mode=[legacy|ranger|legacy-first|ranger-first|ranger-trace|ranger-debug|trace|debug] 
Specifies the mode Early VRP should operate in.
-
-Enum
-Name(evrp_mode) Type(enum evrp_mode) UnknownError(unknown evrp mode %qs)
-
-EnumValue
-Enum(evrp_mode) String(legacy) Value(EVRP_MODE_EVRP_ONLY)
-
-EnumValue
-Enum(evrp_mode) String(ranger) Value(EVRP_MODE_RVRP_ONLY)
-
-EnumValue
-Enum(evrp_mode) String(legacy-first) Value(EVRP_MODE_EVRP_FIRST)
-
-EnumValue
-Enum(evrp_mode) String(ranger-first) Value(EVRP_MODE_RVRP_FIRST)
-
-EnumValue
-Enum(evrp_mode) String(ranger-trace) Value(EVRP_MODE_RVRP_TRACE)
-
-EnumValue
-Enum(evrp_mode) String(ranger-debug) Value(EVRP_MODE_RVRP_DEBUG)
-
-EnumValue
-Enum(evrp_mode) String(trace) Value(EVRP_MODE_TRACE)
-
-EnumValue
-Enum(evrp_mode) String(debug) Value(EVRP_MODE_DEBUG)
-
  fsplit-paths
  Common Report Var(flag_split_paths) Init(0) Optimization
  Split paths leading to loop backedges.
diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
index 4d35e72795f..041dc7c2a97 100644
--- a/gcc/gimple-range.h
+++ b/gcc/gimple-range.h
@@ -165,6 +165,6 @@ private:
  };

  // Flag to enable debugging the various internal Caches.
-#define DEBUG_RANGE_CACHE (dump_file && (flag_evrp_mode & EVRP_MODE_DEBUG))
+#define DEBUG_RANGE_CACHE (dump_file && (param_evrp_mode & 
EVRP_MODE_DEBUG))

  #endif // GCC_GIMPLE_RANGE_STMT_H
diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
index 6be32d7a3f6..363e2ab6816 100644
--- a/gcc/gimple-ssa-evrp.c
+++ b/gcc/gimple-ssa-evrp.c
@@ -118,7 +118,7 @@ public:

    rvrp_folder () : substitute_and_fold_engine (), m_simplifier ()
    {
-    if (flag_evrp_mode & EVRP_MODE_TRACE)
+    if (param_evrp_mode & EVRP_MODE_TRACE)
        m_ranger = new trace_ranger ();
      else
        m_ranger = new gimple_ranger ();
@@ -175,7 +175,7 @@ class hybrid_folder : public evrp_folder
  public:
    hybrid_folder (bool evrp_first)
    {
-    if (flag_evrp_mode & EVRP_MODE_TRACE)
+    if (param_evrp_mode & EVRP_MODE_TRACE)
        m_ranger = new trace_ranger ();
      else
        m_ranger = new gimple_ranger ();
@@ -307,8 +307,8 @@ execute_early_vrp ()
    scev_initialize ();
    calculate_dominance_info (CDI_DOMINATORS);

-  // only the last 2 bits matter for choosing the folder.
-  switch (flag_evrp_mode & EVRP_MODE_RVRP_FIRST)
+  // Only the last 2 bits matter for choosing the folder.
+  switch (param_evrp_mode & EVRP_MODE_RVRP_FIRST)
      {
      case EVRP_MODE_EVRP_ONLY:
        {
diff --git a/gcc/params.opt b/gcc/params.opt
index 6f308a10da0..d770c55143b 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -102,6 +102,37 @@ Maximum size (in bytes) of objects tracked bytewise 
by dead store elimination.
  Common Joined UInteger Var(param_early_inlining_insns) Init(6) 
Optimization Param
  Maximal estimated growth of function body caused by early inlining of 
single call.

+-param=evrp-mode=
+Common Joined Var(param_evrp_mode) Enum(evrp_mode) 
Init(EVRP_MODE_EVRP_FIRST) Param Optimization
+--param=evrp-mode=[legacy|ranger|legacy-first|ranger-first|ranger-trace|ranger-debug|trace|debug] 
Specifies the mode Early VRP should operate in.
+
+Enum
+Name(evrp_mode) Type(enum evrp_mode) UnknownError(unknown evrp mode %qs)
+
+EnumValue
+Enum(evrp_mode) String(legacy) Value(EVRP_MODE_EVRP_ONLY)
+
+EnumValue
+Enum(evrp_mode) String(ranger) Value(EVRP_MODE_RVRP_ONLY)
+
+EnumValue
+Enum(evrp_mode) String(legacy-first) Value(EVRP_MODE_EVRP_FIRST)
+
+EnumValue
+Enum(evrp_mode) String(ranger-first) Value(EVRP_MODE_RVRP_FIRST)
+
+EnumValue
+Enum(evrp_mode) String(ranger-trace) Value(EVRP_MODE_RVRP_TRACE)
+
+EnumValue
+Enum(evrp_mode) String(ranger-debug) Value(EVRP_MODE_RVRP_DEBUG)
+
+EnumValue
+Enum(evrp_mode) String(trace) Value(EVRP_MODE_TRACE)
+
+EnumValue
+Enum(evrp_mode) String(debug) Value(EVRP_MODE_DEBUG)
+
  -param=fsm-maximum-phi-arguments=
  Common Joined UInteger Var(param_fsm_maximum_phi_arguments) Init(100) 
IntegerRange(1, 999999) Param Optimization
  Maximum number of arguments a PHI may have before the FSM threader 
will not try to thread through its block.
Andrew MacLeod Oct. 7, 2020, 4:04 p.m. UTC | #4
On 10/7/20 11:19 AM, Aldy Hernandez wrote:
>
>
> On 10/7/20 8:10 AM, Richard Biener wrote:
> > On Tue, Oct 6, 2020 at 7:06 PM Andrew MacLeod via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> I have now checked in the hybrid EVRP pass.
> >>
> >> We have resolved all the issue we are aware of with a full Fedora 
> build,
> >> but if any more issues arise, please let us know. (And Im sure you 
> will :-)
> >>
> >> I made some minor tweaks.   the option to the new -fevrp-mode  flag 
> are now:
> >>
> >> legacy             : classic EVRP mode
> >> ranger             : Ranger only mode
> >> *legacy-first     :  Query ranges with EVRP first, and if that 
> fails try
> >> the ranger*
> >> ranger-first     : Query the ranger first, then evrp
> >> ranger-trace   : Ranger-only mode plus Show range tracing info in 
> the dump
> >> ranger-debug : Ranger-only mode, and also include all cache 
> debugging info
> >> trace                : Hybrid mode with range tracing info
> >> debug              : Hybrid mode with cache debugging as well as 
> tracing
> >>
> >> The default is still *legacy-first*.
> >
> > We'll have to keep -fevrp-mode forever so can you instead make it a 
> --param
> > since I hope it is transitional only?  It certainly shouldn't be a
> > user-visible flag, should it?
>
> Sounds reasonable.
>
> Attached is a patch to do so.  It basically moves the code from 
> common.opt to params.opt.  I also removed the undocumented modifier, 
> as all --params are supposed to be for internal use only.
>
> OK?
>
>     * common.opt (-fevrp-mode): Rename and move...
>     * params.opt (--param=evrp-mode): ...here.
>     * gimple-range.h (DEBUG_RANGE_CACHE): Use param_evrp_mode instead
>     of flag_evrp_mode.
>     * gimple-ssa-evrp.c (rvrp_folder): Same.
>     (hybrid_folder): Same.
>     (execute_early_vrp): Same.
OK.  thanks.
Andrew
diff mbox series

Patch

2020-10-06  Andrew MacLeod  <amacleod@redhat.com>

	* gcc.dg/pr81192.c: Disable EVRP pass.
	* gcc.dg/tree-ssa/pr77445-2.c: Ditto.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Adjust for new threads.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Ditto.


diff --git a/gcc/testsuite/gcc.dg/pr81192.c b/gcc/testsuite/gcc.dg/pr81192.c
index 0049f371b3d..71bbc13a0e9 100644
--- a/gcc/testsuite/gcc.dg/pr81192.c
+++ b/gcc/testsuite/gcc.dg/pr81192.c
@@ -1,4 +1,20 @@ 
-/* { dg-options "-Os -fdump-tree-pre-details" } */
+/* { dg-options "-Os -fdump-tree-pre-details -fdisable-tree-evrp" } */
+
+/* Disable tree-evrp because the new version of evrp sees
+<bb 3> :
+  if (j_8(D) != 2147483647)
+    goto <bb 4>; [50.00%]
+  else
+    goto <bb 5>; [50.00%]
+<bb 4> :
+  iftmp.2_11 = j_8(D) + 1;
+<bb 5> :
+  # iftmp.2_12 = PHI <j_8(D)(3), iftmp.2_11(4)>
+
+EVRP now recognizes a constant can be propagated into the 3->5 edge and
+produces
+  # iftmp.2_12 = PHI <2147483647(3), iftmp.2_11(4)> 
+which causes the situation being tested to dissapear before we get to PRE.  */
 
 #if __SIZEOF_INT__ == 2
 #define unsigned __UINT32_TYPE__
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
index 9c22c538da8..cf74e156109 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread-details-blocks-stats" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-thread-details-blocks-stats" } */
 typedef enum STATES {
 	START=0,
 	INVALID,
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
index 551fbac3dad..16a9ef4e28a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -1,7 +1,41 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-thread2-details" } */
-/* { dg-final { scan-tree-dump-times "FSM" 3 "thread1" } } */
-/* { dg-final { scan-tree-dump-times "FSM" 5 "thread2" } } */
+
+/* All the threads in the thread1 dump start on a X->BB12 edge, as can
+   be seen in the dump:
+
+     Registering FSM jump thread: (x, 12) incoming edge; ...
+     etc
+     etc
+
+   Before the new evrp, we were threading paths that started at the
+   following edges:
+
+      Registering FSM jump thread: (10, 12) incoming edge
+      Registering FSM jump thread:  (6, 12) incoming edge
+      Registering FSM jump thread:  (9, 12) incoming edge
+
+   This was because the PHI at BB12 had constant values coming in from
+   BB10, BB6, and BB9:
+
+   # state_10 = PHI <state_11(7), 0(10), state_11(5), 1(6), state_11(8), 2(9), state_11(11)>
+
+   Now with the new evrp, we get:
+
+   # state_10 = PHI <0(7), 0(10), state_11(5), 1(6), 0(8), 2(9), 1(11)>
+
+   Thus, we have 3 more paths that are known to be constant and can be
+   threaded.  Which means that by the second threading pass, we can
+   only find one profitable path.
+
+   For the record, all these extra constants are better paths coming
+   out of switches.  For example:
+
+     SWITCH_BB -> BBx -> BBy -> BBz -> PHI
+
+   We now know the value of the switch index at PHI.  */
+/* { dg-final { scan-tree-dump-times "FSM" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "FSM" 1 "thread2" } } */
 
 int sum0, sum1, sum2, sum3;
 int foo (char *s, char **ret)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index a3395578f78..bad5bc1d003 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,20 +1,31 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 16"  "thread1" } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 9" "thread2" } } */
+
+/* Here we have the same issue as was commented in ssa-dom-thread-6.c.
+   The PHI coming into the threader has a lot more constants, so the
+   threader can thread more paths.
+
+$ diff clean/a.c.105t.mergephi2 a.c.105t.mergephi2 
+252c252
+<   # s_50 = PHI <s_49(10), 5(14), s_51(18), s_51(22), 1(26), 1(29), 1(31), s_51(5), 4(12), 1(15), 5(17), 1(19), 3(21), 1(23), 6(25), 7(28), s_51(30)>
+---
+>   # s_50 = PHI <s_49(10), 5(14), 4(18), 5(22), 1(26), 1(29), 1(31), s_51(5), 4(12), 1(15), 5(17), 1(19), 3(21), 1(23), 6(25), 7(28), 7(30)>
+272a273
+
+  I spot checked a few and they all have the same pattern.  We are
+  basically tracking the switch index better through multiple
+  paths.  */
+
+/* { dg-final { scan-tree-dump "Jumps threaded: 19"  "thread1" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 8" "thread2" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom2" } } */
+
 /* aarch64 has the highest CASE_VALUES_THRESHOLD in GCC.  It's high enough
    to change decisions in switch expansion which in turn can expose new
    jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom3" { target { ! aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "vrp2" { target { ! aarch64*-*-* } } } } */
 
-/* Most architectures get 3 threadable paths here, whereas aarch64 and
-   possibly others get 5.  We really should rewrite threading tests to
-   test a specific IL sequence, not gobs of code whose IL can vary
-   from architecture to architecture.  */
-/* { dg-final { scan-tree-dump "Jumps threaded: \[35\]" "thread3" } } */
-
 enum STATE {
   S0=0,
   SI,