From patchwork Wed Nov 4 18:12:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew MacLeod X-Patchwork-Id: 1394443 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gcc.gnu.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=G38XMeMw; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CRFB11cXLz9sSs for ; Thu, 5 Nov 2020 05:12:41 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1A24039878D9; Wed, 4 Nov 2020 18:12:39 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1A24039878D9 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1604513559; bh=B+u+oUGhC11k+oC73WfboNsLazvjd8bvORGmied4l1Y=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=G38XMeMwbFYyKvKDjVtj32eil5KWyfN1PyQcKcQiXtCqj5YfCQNwBbl8XezU7aZxw UemfV90yE9PWL6fUAD9sfwT0O0Hrm6P50Frw3Td1vhhWqsE34hbWBly21Too/ewSD8 v5jBw4RAR+zS1y+qeYoVGhyFYLhYvZzZS+pjLdiY= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by sourceware.org (Postfix) with ESMTP id 6E1C5385141D for ; Wed, 4 Nov 2020 18:12:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 6E1C5385141D Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-480-1DWGwyd0Ph6tBpCZtKpEJA-1; Wed, 04 Nov 2020 13:12:31 -0500 X-MC-Unique: 1DWGwyd0Ph6tBpCZtKpEJA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 229F787951A for ; Wed, 4 Nov 2020 18:12:30 +0000 (UTC) Received: from [10.10.119.201] (ovpn-119-201.rdu2.redhat.com [10.10.119.201]) by smtp.corp.redhat.com (Postfix) with ESMTP id 63F825C5FD; Wed, 4 Nov 2020 18:12:29 +0000 (UTC) To: gcc-patches Subject: [PATCH] Add Ranger temporal cache Message-ID: <029e94ea-d93a-71df-6185-701e174b74bf@redhat.com> Date: Wed, 4 Nov 2020 13:12:28 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Andrew MacLeod via Gcc-patches From: Andrew MacLeod Reply-To: Andrew MacLeod Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" PR 97515 highlighted a bit of silliness that results when we calculate a bunch of ranges by traversing a back edge, and set some values.  Then we eventually visit that block during the DOM walk, and discover the value can be improved, sometimes dramatically.  It is already cached, so unfortunately we don't visit it again... The situation is described in comment 4 : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97515#c4 I have created a temporal cache for the ranger that basically adds a timestamp to the global cache. The timestamp maintains the time a global range was calculated (monotonically increasing based on "set") and  a list of up to 2 directly dependent ssa-names that whenever we access the global value, their timestamp is checked to see if they are newer.  Any time the global value for a dependent is newer, then the current global value is considered stale and the ranger will recalculate using the newer values of the dependent. whats that mean? Using the PR testcase,  a back edge calculation for a PHI requires a range for ui_8:   ui_8 = ~xe_3; and at this time, we only know that xe_3 is <=0 based on the branch feeding the statement..  This ui_8 is calculated as [-1, +INF] globally and stored. As the caluclting comtniues, we actually discover that xe_3 has to be -1. When the EVRp dom walk eventually gets to this statement, we know xe_3 evaluates to [-1, -1] and we fodl this statement to   ui_8 = -1 Unfortunately, the global cache still have [-1, +INF] for ui_8 and has no way to know to reevalaute. With the temporal cache operating, when we figure out that xe_3 evaluates to [-1,-1], xe_3 gets a timestamp that is newer than that of ui_8. when range_of_stmt is called on ui_8 now, it fails the "current" check, and the ranger proceeds to recalculate ui_8 using the new value of x_3, and we get the proper result of [-1, -1] store for the global value. With this patch, this testcase now comes out of EVRP  looking like: e7 (int gg) {   int ui;   int xe;   _Bool _1;   int _2;   :   :   _1 = 0;   _2 = (int) _1;   goto ; [INV] } Time wise its pretty good.  It basically consumes the time I saved with the previous cache tweaks, and the overall time now is almost exactly the same as it was before.    So we're still pretty zippy. This  integrates with the previous cache changes so that when this global value for ui_8 is updated, any changes are automatically propagated into the global cache as well. I have also updated the testcase to ensure that it now produces the above code with a single goto. This bootstraps on x86_64-pc-linux-gnu, no regressions, and pushed. Andrew PS.  Before the next stage 1, I intend to use the preexisting dependency chains in the GORI engine instead of this one-or-two name timestamp entry. Currentlt the  drawback is that only dependent values are checked, so intervening calculations will not trigger a recalculation.   If we use the GORI dependency chains, then everything in the dependency chain will be recognized as stale, and we'll get even more cases. Combined with improvements planned for how dependency chain ranges are calculated by GORI, we could get even more interesting results. commit e86fd6a17cdb26710d1f13c9a47a3878c76028f9 Author: Andrew MacLeod Date: Wed Nov 4 12:59:15 2020 -0500 Add Ranger temporal cache Add a timestamp to supplement the global range cache to detect when a value may become stale. gcc/ PR tree-optimization/97515 * gimple-range-cache.h (class ranger_cache): New prototypes plus temporal cache pointer. * gimple-range-cache.cc (struct range_timestamp): New. (class temporal_cache): New. (temporal_cache::temporal_cache): New. (temporal_cache::~temporal_cache): New. (temporal_cache::get_timestamp): New. (temporal_cache::set_dependency): New. (temporal_cache::temporal_value): New. (temporal_cache::current_p): New. (temporal_cache::set_timestamp): New. (temporal_cache::set_always_current): New. (ranger_cache::ranger_cache): Allocate the temporal cache. (ranger_cache::~ranger_cache): Free temporal cache. (ranger_cache::get_non_stale_global_range): New. (ranger_cache::set_global_range): Add a timestamp. (ranger_cache::register_dependency): New. Add timestamp dependency. * gimple-range.cc (gimple_ranger::range_of_range_op): Add operand dependencies. (gimple_ranger::range_of_phi): Ditto. (gimple_ranger::range_of_stmt): Check if global range is stale, and recalculate if so. gcc/testsuite/ * gcc.dg/pr97515.c: Check listing for folding of entire function. diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc index cca9025abba..b01563c83f9 100644 --- a/gcc/gimple-range-cache.cc +++ b/gcc/gimple-range-cache.cc @@ -476,6 +476,140 @@ ssa_global_cache::dump (FILE *f) fputc ('\n', f); } +// -------------------------------------------------------------------------- + + +// This struct provides a timestamp for a global range calculation. +// it contains the time counter, as well as a limited number of ssa-names +// that it is dependent upon. If the timestamp for any of the dependent names +// Are newer, then this range could need updating. + +struct range_timestamp +{ + unsigned time; + unsigned ssa1; + unsigned ssa2; +}; + +// This class will manage the timestamps for each ssa_name. +// When a value is calcualted, its timestamp is set to the current time. +// The ssanames it is dependent on have already been calculated, so they will +// have older times. If one fo those values is ever calculated again, it +// will get a newer timestamp, and the "current_p" check will fail. + +class temporal_cache +{ +public: + temporal_cache (); + ~temporal_cache (); + bool current_p (tree name) const; + void set_timestamp (tree name); + void set_dependency (tree name, tree dep); + void set_always_current (tree name); +private: + unsigned temporal_value (unsigned ssa) const; + const range_timestamp *get_timestamp (unsigned ssa) const; + range_timestamp *get_timestamp (unsigned ssa); + + unsigned m_current_time; + vec m_timestamp; +}; + + +inline +temporal_cache::temporal_cache () +{ + m_current_time = 1; + m_timestamp.create (0); + m_timestamp.safe_grow_cleared (num_ssa_names); +} + +inline +temporal_cache::~temporal_cache () +{ + m_timestamp.release (); +} + +// Return a pointer to the timetamp for ssa-name at index SSA, if there is +// one, otherwise return NULL. + +inline const range_timestamp * +temporal_cache::get_timestamp (unsigned ssa) const +{ + if (ssa >= m_timestamp.length ()) + return NULL; + return &(m_timestamp[ssa]); +} + +// Return a reference to the timetamp for ssa-name at index SSA. If the index +// is past the end of the vector, extend the vector. + +inline range_timestamp * +temporal_cache::get_timestamp (unsigned ssa) +{ + if (ssa >= m_timestamp.length ()) + m_timestamp.safe_grow_cleared (num_ssa_names + 20); + return &(m_timestamp[ssa]); +} + +// This routine will fill NAME's next operand slot with DEP if DEP is a valid +// SSA_NAME and there is a free slot. + +inline void +temporal_cache::set_dependency (tree name, tree dep) +{ + if (dep && TREE_CODE (dep) == SSA_NAME) + { + gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name))); + range_timestamp& ts = *(get_timestamp (SSA_NAME_VERSION (name))); + if (!ts.ssa1) + ts.ssa1 = SSA_NAME_VERSION (dep); + else if (!ts.ssa2 && ts.ssa1 != SSA_NAME_VERSION (name)) + ts.ssa2 = SSA_NAME_VERSION (dep); + } +} + +// Return the timestamp value for SSA, or 0 if there isnt one. +inline unsigned +temporal_cache::temporal_value (unsigned ssa) const +{ + const range_timestamp *ts = get_timestamp (ssa); + return ts ? ts->time : 0; +} + +// Return TRUE if the timestampe for NAME is newer than any of its dependents. + +bool +temporal_cache::current_p (tree name) const +{ + const range_timestamp *ts = get_timestamp (SSA_NAME_VERSION (name)); + if (!ts || ts->time == 0) + return true; + // Any non-registered dependencies will have a value of 0 and thus be older. + // Return true if time is newer than either dependent. + return ts->time > temporal_value (ts->ssa1) + && ts->time > temporal_value (ts->ssa2); +} + +// This increments the global timer and sets the timestamp for NAME. + +inline void +temporal_cache::set_timestamp (tree name) +{ + gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name))); + get_timestamp (SSA_NAME_VERSION (name))->time = ++m_current_time; +} + +// Set the timestamp to 0, marking it as "always up to date". + +inline void +temporal_cache::set_always_current (tree name) +{ + gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name))); + get_timestamp (SSA_NAME_VERSION (name))->time = 0; +} + + // -------------------------------------------------------------------------- ranger_cache::ranger_cache (gimple_ranger &q) : query (q) @@ -488,10 +622,12 @@ ranger_cache::ranger_cache (gimple_ranger &q) : query (q) m_poor_value_list.create (0); m_poor_value_list.safe_grow_cleared (20); m_poor_value_list.truncate (0); + m_temporal = new temporal_cache; } ranger_cache::~ranger_cache () { + delete m_temporal; m_poor_value_list.release (); m_workback.release (); m_update_list.release (); @@ -529,6 +665,32 @@ ranger_cache::get_global_range (irange &r, tree name) const return m_globals.get_global_range (r, name); } +// Get the global range for NAME, and return in R if the value is not stale. +// If the range is set, but is stale, mark it current and return false. +// If it is not set pick up the legacy global value, mark it current, and +// return false. +// Note there is always a value returned in R. The return value indicates +// whether that value is an up-to-date calculated value or not.. + +bool +ranger_cache::get_non_stale_global_range (irange &r, tree name) +{ + if (m_globals.get_global_range (r, name)) + { + if (m_temporal->current_p (name)) + return true; + } + else + { + // Global has never been accessed, so pickup the legacy global value. + r = gimple_range_global (name); + m_globals.set_global_range (name, r); + } + // After a stale check failure, mark the value as always current until a + // new one is set. + m_temporal->set_always_current (name); + return false; +} // Set the global range of NAME to R. void @@ -546,6 +708,18 @@ ranger_cache::set_global_range (tree name, const irange &r) propagate_updated_value (name, bb); } + // Mark the value as up-to-date. + m_temporal->set_timestamp (name); +} + +// Register a dependency on DEP to name. If the timestamp for DEP is ever +// greateer than the timestamp for NAME, then it is newer and NAMEs value +// becomes stale. + +void +ranger_cache::register_dependency (tree name, tree dep) +{ + m_temporal->set_dependency (name, dep); } // Push a request for a new lookup in block BB of name. Return true if diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h index 0e84ab0c4e6..c5749fefb5f 100644 --- a/gcc/gimple-range-cache.h +++ b/gcc/gimple-range-cache.h @@ -97,7 +97,9 @@ public: bool block_range (irange &r, basic_block bb, tree name, bool calc = true); bool get_global_range (irange &r, tree name) const; + bool get_non_stale_global_range (irange &r, tree name); void set_global_range (tree name, const irange &r); + void register_dependency (tree name, tree dep); non_null_ref m_non_null; @@ -106,6 +108,7 @@ public: private: ssa_global_cache m_globals; block_range_cache m_on_entry; + class temporal_cache *m_temporal; void add_to_update (basic_block bb); void fill_block_cache (tree name, basic_block bb, basic_block def_bb); void propagate_cache (tree name); diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc index 8fdcc310111..ef65e00cc1d 100644 --- a/gcc/gimple-range.cc +++ b/gcc/gimple-range.cc @@ -415,12 +415,20 @@ bool gimple_ranger::range_of_range_op (irange &r, gimple *s) { int_range_max range1, range2; + tree lhs = gimple_get_lhs (s); tree type = gimple_expr_type (s); gcc_checking_assert (irange::supports_type_p (type)); tree op1 = gimple_range_operand1 (s); tree op2 = gimple_range_operand2 (s); + if (lhs) + { + // Register potential dependencies for stale value tracking. + m_cache.register_dependency (lhs, op1); + m_cache.register_dependency (lhs, op2); + } + if (range_of_non_trivial_assignment (r, s)) return true; @@ -501,6 +509,9 @@ gimple_ranger::range_of_phi (irange &r, gphi *phi) tree arg = gimple_phi_arg_def (phi, x); edge e = gimple_phi_arg_edge (phi, x); + // Register potential dependencies for stale value tracking. + m_cache.register_dependency (phi_def, arg); + range_on_edge (arg_range, e, arg); r.union_ (arg_range); // Once the value reaches varying, stop looking. @@ -1009,18 +1020,12 @@ gimple_ranger::range_of_stmt (irange &r, gimple *s, tree name) if (!gimple_range_ssa_p (name)) return false; - // If this STMT has already been processed, return that value. - if (m_cache.get_global_range (r, name)) + // Check if the stmt has already been processed, and is not stale. + if (m_cache.get_non_stale_global_range (r, name)) return true; - // Avoid infinite recursion by initializing global cache - int_range_max tmp = gimple_range_global (name); - m_cache.set_global_range (name, tmp); - + // Otherwise calculate a new value and save it. calc_stmt (r, s, name); - - if (is_a (s)) - r.intersect (tmp); m_cache.set_global_range (name, r); return true; } diff --git a/gcc/testsuite/gcc.dg/pr97515.c b/gcc/testsuite/gcc.dg/pr97515.c index 2b6185ec90b..84f145a261f 100644 --- a/gcc/testsuite/gcc.dg/pr97515.c +++ b/gcc/testsuite/gcc.dg/pr97515.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -fdump-tree-evrp" } */ int e7 (int gg) @@ -19,3 +19,7 @@ e7 (int gg) return xe; } + +/* EVRP should be able to reduce this to a single goto. */ + +/* { dg-final { scan-tree-dump-times "goto" 1 "evrp" } } */