diff mbox series

Add Ranger temporal cache

Message ID 029e94ea-d93a-71df-6185-701e174b74bf@redhat.com
State New
Headers show
Series Add Ranger temporal cache | expand

Commit Message

Andrew MacLeod Nov. 4, 2020, 6:12 p.m. UTC
PR 97515 highlighted a bit of silliness that results when we calculate a 
bunch of ranges by traversing a back edge, and set some values.  Then we 
eventually visit that block during the DOM walk, and discover the value 
can be improved, sometimes dramatically.  It is already cached, so 
unfortunately we don't visit it again...

The situation is described in comment 4 : 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97515#c4

I have created a temporal cache for the ranger that basically adds a 
timestamp to the global cache.

The timestamp maintains the time a global range was calculated 
(monotonically increasing based on "set") and  a list of up to 2 
directly dependent ssa-names that whenever we access the global value, 
their timestamp is checked to see if they are newer.  Any time the 
global value for a dependent is newer, then the current global value is 
considered stale and the ranger will recalculate using the newer values 
of the dependent.

whats that mean?

Using the PR testcase,  a back edge calculation for a PHI requires a 
range for ui_8:
   ui_8 = ~xe_3;
and at this time, we only know that xe_3 is <=0 based on the branch 
feeding the statement..  This ui_8 is calculated as [-1, +INF] globally 
and stored.
As the caluclting comtniues, we actually discover that xe_3 has to be -1.

When the EVRp dom walk eventually gets to this statement, we know xe_3 
evaluates to [-1, -1] and we fodl this statement to
   ui_8 = -1
Unfortunately, the global cache still have [-1, +INF] for ui_8 and has 
no way to know to reevalaute.

With the temporal cache operating, when we figure out that xe_3 
evaluates to [-1,-1], xe_3 gets a timestamp that is newer than that of ui_8.
when range_of_stmt is called on ui_8 now, it fails the "current" check, 
and the ranger proceeds to recalculate ui_8 using the new value of x_3, 
and we get the proper result of [-1, -1] store for the global value.
With this patch, this testcase now comes out of EVRP  looking like:

e7 (int gg)
{
   int ui;
   int xe;
   _Bool _1;
   int _2;

   <bb 2> :

   <bb 3> :
   _1 = 0;
   _2 = (int) _1;
   goto <bb 3>; [INV]

}

Time wise its pretty good.  It basically consumes the time I saved with 
the previous cache tweaks, and the overall time now is almost exactly 
the same as it was before.    So we're still pretty zippy.

This  integrates with the previous cache changes so that when this 
global value for ui_8 is updated, any changes are automatically 
propagated into the global cache as well.

I have also updated the testcase to ensure that it now produces the 
above code with a single goto.


This bootstraps on x86_64-pc-linux-gnu, no regressions, and pushed.

Andrew

PS.  Before the next stage 1, I intend to use the preexisting dependency 
chains in the GORI engine instead of this one-or-two name timestamp entry.
Currentlt the  drawback is that only dependent values are checked, so 
intervening calculations will not trigger a recalculation.   If we use 
the GORI dependency chains, then everything in the dependency chain will 
be recognized as stale, and we'll get even more cases. Combined with 
improvements planned for how dependency chain ranges are calculated by 
GORI, we could get even more interesting results.
diff mbox series

Patch

commit e86fd6a17cdb26710d1f13c9a47a3878c76028f9
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Wed Nov 4 12:59:15 2020 -0500

    Add Ranger temporal cache
    
    Add a timestamp to supplement the global range cache to detect when a value
    may become stale.
    
            gcc/
            PR tree-optimization/97515
            * gimple-range-cache.h (class ranger_cache): New prototypes plus
            temporal cache pointer.
            * gimple-range-cache.cc (struct range_timestamp): New.
            (class temporal_cache): New.
            (temporal_cache::temporal_cache): New.
            (temporal_cache::~temporal_cache): New.
            (temporal_cache::get_timestamp): New.
            (temporal_cache::set_dependency): New.
            (temporal_cache::temporal_value): New.
            (temporal_cache::current_p): New.
            (temporal_cache::set_timestamp): New.
            (temporal_cache::set_always_current): New.
            (ranger_cache::ranger_cache): Allocate the temporal cache.
            (ranger_cache::~ranger_cache): Free temporal cache.
            (ranger_cache::get_non_stale_global_range): New.
            (ranger_cache::set_global_range): Add a timestamp.
            (ranger_cache::register_dependency): New.  Add timestamp dependency.
            * gimple-range.cc (gimple_ranger::range_of_range_op): Add operand
            dependencies.
            (gimple_ranger::range_of_phi): Ditto.
            (gimple_ranger::range_of_stmt): Check if global range is stale, and
            recalculate if so.
            gcc/testsuite/
            * gcc.dg/pr97515.c: Check listing for folding of entire function.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index cca9025abba..b01563c83f9 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -476,6 +476,140 @@  ssa_global_cache::dump (FILE *f)
   fputc ('\n', f);
 }
 
+// --------------------------------------------------------------------------
+
+
+// This struct provides a timestamp for a global range calculation.
+// it contains the time counter, as well as a limited number of ssa-names
+// that it is dependent upon.  If the timestamp for any of the dependent names
+// Are newer, then this range could need updating.
+
+struct range_timestamp
+{
+  unsigned time;
+  unsigned ssa1;
+  unsigned ssa2;
+};
+
+// This class will manage the timestamps for each ssa_name.
+// When a value is calcualted, its timestamp is set to the current time.
+// The ssanames it is dependent on have already been calculated, so they will
+// have older times.  If one fo those values is ever calculated again, it
+// will get a newer timestamp, and the "current_p" check will fail.
+
+class temporal_cache
+{
+public:
+  temporal_cache ();
+  ~temporal_cache ();
+  bool current_p (tree name) const;
+  void set_timestamp (tree name);
+  void set_dependency (tree name, tree dep);
+  void set_always_current (tree name);
+private:
+  unsigned temporal_value (unsigned ssa) const;
+  const range_timestamp *get_timestamp (unsigned ssa) const;
+  range_timestamp *get_timestamp (unsigned ssa);
+
+  unsigned m_current_time;
+  vec <range_timestamp> m_timestamp;
+};
+
+
+inline
+temporal_cache::temporal_cache ()
+{
+  m_current_time = 1;
+  m_timestamp.create (0);
+  m_timestamp.safe_grow_cleared (num_ssa_names);
+}
+
+inline
+temporal_cache::~temporal_cache ()
+{
+  m_timestamp.release ();
+}
+
+// Return a pointer to the timetamp for ssa-name at index SSA, if there is
+// one, otherwise return NULL.
+
+inline const range_timestamp *
+temporal_cache::get_timestamp (unsigned ssa) const
+{
+  if (ssa >= m_timestamp.length ())
+    return NULL;
+  return &(m_timestamp[ssa]);
+}
+
+// Return a reference to the timetamp for ssa-name at index SSA.  If the index
+// is past the end of the vector, extend the vector.
+
+inline range_timestamp *
+temporal_cache::get_timestamp (unsigned ssa)
+{
+  if (ssa >= m_timestamp.length ())
+    m_timestamp.safe_grow_cleared (num_ssa_names + 20);
+  return &(m_timestamp[ssa]);
+}
+
+// This routine will fill NAME's next operand slot with DEP if DEP is a valid
+// SSA_NAME and there is a free slot.
+
+inline void
+temporal_cache::set_dependency (tree name, tree dep)
+{
+  if (dep && TREE_CODE (dep) == SSA_NAME)
+    {
+      gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name)));
+      range_timestamp& ts = *(get_timestamp (SSA_NAME_VERSION (name)));
+      if (!ts.ssa1)
+	ts.ssa1 = SSA_NAME_VERSION (dep);
+      else if (!ts.ssa2 && ts.ssa1 != SSA_NAME_VERSION (name))
+	ts.ssa2 = SSA_NAME_VERSION (dep);
+    }
+}
+
+// Return the timestamp value for SSA, or 0 if there isnt one.
+inline unsigned
+temporal_cache::temporal_value (unsigned ssa) const
+{
+  const range_timestamp *ts = get_timestamp (ssa);
+  return ts ? ts->time : 0;
+}
+
+// Return TRUE if the timestampe for NAME is newer than any of its dependents.
+
+bool
+temporal_cache::current_p (tree name) const
+{
+  const range_timestamp *ts = get_timestamp (SSA_NAME_VERSION (name));
+  if (!ts || ts->time == 0)
+    return true;
+  // Any non-registered dependencies will have a value of 0 and thus be older.
+  // Return true if time is newer than either dependent.
+  return ts->time > temporal_value (ts->ssa1)
+	 && ts->time > temporal_value (ts->ssa2);
+}
+
+// This increments the global timer and sets the timestamp for NAME.
+
+inline void
+temporal_cache::set_timestamp (tree name)
+{
+  gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name)));
+  get_timestamp (SSA_NAME_VERSION (name))->time = ++m_current_time;
+}
+
+// Set the timestamp to 0, marking it as "always up to date".
+
+inline void
+temporal_cache::set_always_current (tree name)
+{
+  gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name)));
+  get_timestamp (SSA_NAME_VERSION (name))->time = 0;
+}
+
+
 // --------------------------------------------------------------------------
 
 ranger_cache::ranger_cache (gimple_ranger &q) : query (q)
@@ -488,10 +622,12 @@  ranger_cache::ranger_cache (gimple_ranger &q) : query (q)
   m_poor_value_list.create (0);
   m_poor_value_list.safe_grow_cleared (20);
   m_poor_value_list.truncate (0);
+  m_temporal = new temporal_cache;
 }
 
 ranger_cache::~ranger_cache ()
 {
+  delete m_temporal;
   m_poor_value_list.release ();
   m_workback.release ();
   m_update_list.release ();
@@ -529,6 +665,32 @@  ranger_cache::get_global_range (irange &r, tree name) const
   return m_globals.get_global_range (r, name);
 }
 
+// Get the global range for NAME, and return in R if the value is not stale.
+// If the range is set, but is stale, mark it current and return false.
+// If it is not set pick up the legacy global value, mark it current, and
+// return false.
+// Note there is always a value returned in R. The return value indicates
+// whether that value is an up-to-date calculated value or not..
+
+bool
+ranger_cache::get_non_stale_global_range (irange &r, tree name)
+{
+  if (m_globals.get_global_range (r, name))
+    {
+      if (m_temporal->current_p (name))
+	return true;
+    }
+  else
+    {
+      // Global has never been accessed, so pickup the legacy global value.
+      r = gimple_range_global (name);
+      m_globals.set_global_range (name, r);
+    }
+  // After a stale check failure, mark the value as always current until a
+  // new one is set.
+  m_temporal->set_always_current (name);
+  return false;
+}
 //  Set the global range of NAME to R.
 
 void
@@ -546,6 +708,18 @@  ranger_cache::set_global_range (tree name, const irange &r)
 
       propagate_updated_value (name, bb);
     }
+  // Mark the value as up-to-date.
+  m_temporal->set_timestamp (name);
+}
+
+// Register a dependency on DEP to name.  If the timestamp for DEP is ever
+// greateer than the timestamp for NAME, then it is newer and NAMEs value
+// becomes stale.
+
+void
+ranger_cache::register_dependency (tree name, tree dep)
+{
+  m_temporal->set_dependency (name, dep);
 }
 
 // Push a request for a new lookup in block BB of name.  Return true if
diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h
index 0e84ab0c4e6..c5749fefb5f 100644
--- a/gcc/gimple-range-cache.h
+++ b/gcc/gimple-range-cache.h
@@ -97,7 +97,9 @@  public:
   bool block_range (irange &r, basic_block bb, tree name, bool calc = true);
 
   bool get_global_range (irange &r, tree name) const;
+  bool get_non_stale_global_range (irange &r, tree name);
   void set_global_range (tree name, const irange &r);
+  void register_dependency (tree name, tree dep);
 
   non_null_ref m_non_null;
 
@@ -106,6 +108,7 @@  public:
 private:
   ssa_global_cache m_globals;
   block_range_cache m_on_entry;
+  class temporal_cache *m_temporal;
   void add_to_update (basic_block bb);
   void fill_block_cache (tree name, basic_block bb, basic_block def_bb);
   void propagate_cache (tree name);
diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 8fdcc310111..ef65e00cc1d 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -415,12 +415,20 @@  bool
 gimple_ranger::range_of_range_op (irange &r, gimple *s)
 {
   int_range_max range1, range2;
+  tree lhs = gimple_get_lhs (s);
   tree type = gimple_expr_type (s);
   gcc_checking_assert (irange::supports_type_p (type));
 
   tree op1 = gimple_range_operand1 (s);
   tree op2 = gimple_range_operand2 (s);
 
+  if (lhs)
+    {
+      // Register potential dependencies for stale value tracking.
+      m_cache.register_dependency (lhs, op1);
+      m_cache.register_dependency (lhs, op2);
+    }
+
   if (range_of_non_trivial_assignment (r, s))
     return true;
 
@@ -501,6 +509,9 @@  gimple_ranger::range_of_phi (irange &r, gphi *phi)
       tree arg = gimple_phi_arg_def (phi, x);
       edge e = gimple_phi_arg_edge (phi, x);
 
+      // Register potential dependencies for stale value tracking.
+      m_cache.register_dependency (phi_def, arg);
+
       range_on_edge (arg_range, e, arg);
       r.union_ (arg_range);
       // Once the value reaches varying, stop looking.
@@ -1009,18 +1020,12 @@  gimple_ranger::range_of_stmt (irange &r, gimple *s, tree name)
   if (!gimple_range_ssa_p (name))
     return false;
 
-  // If this STMT has already been processed, return that value.
-  if (m_cache.get_global_range (r, name))
+  // Check if the stmt has already been processed, and is not stale.
+  if (m_cache.get_non_stale_global_range (r, name))
     return true;
 
-  // Avoid infinite recursion by initializing global cache
-  int_range_max tmp = gimple_range_global (name);
-  m_cache.set_global_range (name, tmp);
-
+  // Otherwise calculate a new value and save it.
   calc_stmt (r, s, name);
-
-  if (is_a<gphi *> (s))
-    r.intersect (tmp);
   m_cache.set_global_range (name, r);
   return true;
 }
diff --git a/gcc/testsuite/gcc.dg/pr97515.c b/gcc/testsuite/gcc.dg/pr97515.c
index 2b6185ec90b..84f145a261f 100644
--- a/gcc/testsuite/gcc.dg/pr97515.c
+++ b/gcc/testsuite/gcc.dg/pr97515.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
 
 int
 e7 (int gg)
@@ -19,3 +19,7 @@  e7 (int gg)
 
   return xe;
 }
+
+/* EVRP should be able to reduce this to a single goto.  */
+ 
+/* { dg-final { scan-tree-dump-times "goto" 1 "evrp" } } */