diff mbox

stabilize store merging

Message ID or1su83ah3.fsf@lxoliva.fsfla.org
State New
Headers show

Commit Message

Alexandre Oliva March 7, 2017, 11:58 p.m. UTC
Don't let pointer randomization change the order in which we process
store chains.  This may cause SSA_NAMEs to be released in different
order, and if they're reused later, they may cause differences in SSA
partitioning, leading to differences in expand, and ultimately to
different code.

bootstrap-debug-lean (-fcompare-debug) on i686-linux-gnu has failed in
haifa-sched.c since r245196 exposed the latent ordering problem in
store merging.  In this case, the IR differences (different SSA names
selected for copies in out-of-SSA, resulting in some off-by-one
differences in pseudos) were not significant enough to be visible in
the compiler output.

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

for  gcc/ChangeLog

	* gimple-ssa-store-merging.c (INCLUDE_MAP): Define.
	(struct imm_store_chain_info): Add seqno field and ctor parm.
	(class pass_store_merging): Add m_store_seq field.
	(pass_store_merging::terminate_and_process_all_chains):
	Iterate over m_store_seq.
	(pass_store_merging::terminate_all_aliasing_chains):
	Likewise.
	(pass_store_merging::terminate_and_release_chain):
	Erase the m_store_seq entry.
	(pass_store_merging::execute): Check for debug stmts first.
	Compute seqno, add the m_store_seq entry.
---
 gcc/gimple-ssa-store-merging.c |   58 +++++++++++++++++++++++++++++-----------
 1 file changed, 42 insertions(+), 16 deletions(-)

Comments

Richard Biener March 8, 2017, 10:36 a.m. UTC | #1
On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> Don't let pointer randomization change the order in which we process
> store chains.  This may cause SSA_NAMEs to be released in different
> order, and if they're reused later, they may cause differences in SSA
> partitioning, leading to differences in expand, and ultimately to
> different code.
>
> bootstrap-debug-lean (-fcompare-debug) on i686-linux-gnu has failed in
> haifa-sched.c since r245196 exposed the latent ordering problem in
> store merging.  In this case, the IR differences (different SSA names
> selected for copies in out-of-SSA, resulting in some off-by-one
> differences in pseudos) were not significant enough to be visible in
> the compiler output.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Any reason you chose to maintain a new hash-map instead of at
interesting places gather to-be-processed chains into a vector and
sort that after IDs?  Traversing the hash-map still doesn't get you
a reliable order but only one depending on the chain creation and
thus stmt walk order - plus the hash function and size - which may be
good enough
to fix the issue at hand of course.  But it will for example still make
testcase reduction fragile if shrinking the hash-map by removing unrelated
code ends up processing things in different order.

So - if we fix this can we fix it by properly sorting things?

Thanks,
Richard.

> for  gcc/ChangeLog
>
>         * gimple-ssa-store-merging.c (INCLUDE_MAP): Define.
>         (struct imm_store_chain_info): Add seqno field and ctor parm.
>         (class pass_store_merging): Add m_store_seq field.
>         (pass_store_merging::terminate_and_process_all_chains):
>         Iterate over m_store_seq.
>         (pass_store_merging::terminate_all_aliasing_chains):
>         Likewise.
>         (pass_store_merging::terminate_and_release_chain):
>         Erase the m_store_seq entry.
>         (pass_store_merging::execute): Check for debug stmts first.
>         Compute seqno, add the m_store_seq entry.
> ---
>  gcc/gimple-ssa-store-merging.c |   58 +++++++++++++++++++++++++++++-----------
>  1 file changed, 42 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
> index 17ac94a..0bf7d89 100644
> --- a/gcc/gimple-ssa-store-merging.c
> +++ b/gcc/gimple-ssa-store-merging.c
> @@ -103,6 +103,7 @@
>    [p + 4B] (16-bit) := 0xabcd;     //  val & 0x00000000ffff;  */
>
>  #include "config.h"
> +#define INCLUDE_MAP
>  #include "system.h"
>  #include "coretypes.h"
>  #include "backend.h"
> @@ -675,11 +676,12 @@ merged_store_group::apply_stores ()
>
>  struct imm_store_chain_info
>  {
> +  unsigned seqno;
>    tree base_addr;
>    auto_vec<struct store_immediate_info *> m_store_info;
>    auto_vec<merged_store_group *> m_merged_store_groups;
>
> -  imm_store_chain_info (tree b_a) : base_addr (b_a) {}
> +  imm_store_chain_info (unsigned seq, tree b_a) : seqno (seq), base_addr (b_a) {}
>    bool terminate_and_process_chain ();
>    bool coalesce_immediate_stores ();
>    bool output_merged_store (merged_store_group *);
> @@ -718,6 +720,18 @@ public:
>  private:
>    hash_map<tree_operand_hash, struct imm_store_chain_info *> m_stores;
>
> +  /* Use m_store_seq to order elements in m_stores, and iterate over
> +     them in a predictable way.  It maps sequence numbers to the base
> +     addresses stored as keys in m_stores.  Using this order avoids
> +     extraneous differences in the compiler output just because of
> +     tree pointer variations (e.g. different chains end up in
> +     different positions of m_stores, so they are handled in different
> +     orders, so they allocate or release SSA names in different
> +     orders, and when they get reused, subsequent passes end up
> +     getting different SSA names, which may ultimately change
> +     decisions when going out of SSA).  */
> +  std::map<unsigned, tree> m_store_seq;
> +
>    bool terminate_and_process_all_chains ();
>    bool terminate_all_aliasing_chains (imm_store_chain_info **,
>                                       bool, gimple *);
> @@ -730,11 +744,16 @@ private:
>  bool
>  pass_store_merging::terminate_and_process_all_chains ()
>  {
> -  hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter
> -    = m_stores.begin ();
>    bool ret = false;
> -  for (; iter != m_stores.end (); ++iter)
> -    ret |= terminate_and_release_chain ((*iter).second);
> +  for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (),
> +        iter = next; iter != m_store_seq.end (); iter = next)
> +    {
> +      next++;
> +      tree base_addr = (*iter).second;
> +      ret |= terminate_and_release_chain (*m_stores.get (base_addr));
> +    }
> +  gcc_assert (m_stores.elements () == 0);
> +  gcc_assert (m_store_seq.empty ());
>
>    return ret;
>  }
> @@ -799,15 +818,17 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
>         }
>      }
>
> -  hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter
> -    = m_stores.begin ();
> -
>    /* Check for aliasing with all other store chains.  */
> -  for (; iter != m_stores.end (); ++iter)
> +  for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (),
> +        iter = next; iter != m_store_seq.end (); iter = next)
>      {
> +      next++;
> +      unsigned seqno = (*iter).first;
> +      tree base_addr = (*iter).second;
> +
>        /* We already checked all the stores in chain_info and terminated the
>          chain if necessary.  Skip it here.  */
> -      if (chain_info && (*chain_info) == (*iter).second)
> +      if (chain_info && (*chain_info)->seqno == seqno)
>         continue;
>
>        /* We can't use the base object here as that does not reliably exist.
> @@ -815,11 +836,11 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
>          minimum and maximum offset and the maximum size we could improve
>          things here).  */
>        ao_ref chain_ref;
> -      ao_ref_init_from_ptr_and_size (&chain_ref, (*iter).first, NULL_TREE);
> +      ao_ref_init_from_ptr_and_size (&chain_ref, base_addr, NULL_TREE);
>        if (ref_maybe_used_by_stmt_p (stmt, &chain_ref)
>           || stmt_may_clobber_ref_p_1 (stmt, &chain_ref))
>         {
> -         terminate_and_release_chain ((*iter).second);
> +         terminate_and_release_chain (*m_stores.get (base_addr));
>           ret = true;
>         }
>      }
> @@ -835,6 +856,7 @@ bool
>  pass_store_merging::terminate_and_release_chain (imm_store_chain_info *chain_info)
>  {
>    bool ret = chain_info->terminate_and_process_chain ();
> +  m_store_seq.erase (chain_info->seqno);
>    m_stores.remove (chain_info->base_addr);
>    delete chain_info;
>    return ret;
> @@ -1336,6 +1358,9 @@ pass_store_merging::execute (function *fun)
>         {
>           gimple *stmt = gsi_stmt (gsi);
>
> +         if (is_gimple_debug (stmt))
> +           continue;
> +
>           if (gimple_has_volatile_ops (stmt))
>             {
>               /* Terminate all chains.  */
> @@ -1346,9 +1371,6 @@ pass_store_merging::execute (function *fun)
>               continue;
>             }
>
> -         if (is_gimple_debug (stmt))
> -           continue;
> -
>           if (gimple_assign_single_p (stmt) && gimple_vdef (stmt)
>               && !stmt_can_throw_internal (stmt)
>               && lhs_valid_for_store_merging_p (gimple_assign_lhs (stmt)))
> @@ -1454,13 +1476,17 @@ pass_store_merging::execute (function *fun)
>
>                   /* Store aliases any existing chain?  */
>                   terminate_all_aliasing_chains (chain_info, false, stmt);
> +                 unsigned next_seqno = m_store_seq.empty () ? 0
> +                   : m_store_seq.rbegin()->first + 1;
>                   /* Start a new chain.  */
>                   struct imm_store_chain_info *new_chain
> -                   = new imm_store_chain_info (base_addr);
> +                   = new imm_store_chain_info (next_seqno, base_addr);
>                   info = new store_immediate_info (bitsize, bitpos,
>                                                    stmt, 0);
>                   new_chain->m_store_info.safe_push (info);
>                   m_stores.put (base_addr, new_chain);
> +                 m_store_seq.insert (m_store_seq.end (),
> +                                     std::make_pair (new_chain->seqno, base_addr));
>                   if (dump_file && (dump_flags & TDF_DETAILS))
>                     {
>                       fprintf (dump_file,
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Alexandre Oliva March 9, 2017, 11:02 p.m. UTC | #2
On Mar  8, 2017, Richard Biener <richard.guenther@gmail.com> wrote:

> On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> Don't let pointer randomization change the order in which we process
>> store chains.  This may cause SSA_NAMEs to be released in different
>> order, and if they're reused later, they may cause differences in SSA
>> partitioning, leading to differences in expand, and ultimately to
>> different code.

> Any reason you chose to maintain a new hash-map instead of at
> interesting places gather to-be-processed chains into a vector and
> sort that after IDs?

Where would we get such ids?  AFAICT base_addrs don't have to be decls,
they can be arbitrary expressions.

> Traversing the hash-map still doesn't get you
> a reliable order but only one depending on the chain creation and
> thus stmt walk order

True.  That's what all we need in general: output depends on stable
inputs only (relative order of stmts within BBs), not on random flukes
like pointer values within the compiler process.

> But it will for example still make testcase reduction fragile if
> shrinking the hash-map by removing unrelated code ends up processing
> things in different order.

True, if you remove or move the stmts that initiate a chain within a BB,
you will change the processing order.  There are many passes that will
process stmts or insns in the order they appear, so shuffling them will
yield different SSA version assignments, pseudo numbers and whatnot.
Why should this pass be held to a higher standard?

> So - if we fix this can we fix it by properly sorting things?

I don't see a way to do better, considering there's no real ID we could
use for sorting.  Do you?

Even if we were to use decl IDs rather than pointers in the tree
hashing, that would still make for impredictable ordering, because decl
ids are not necessarily stable across e.g. debug and nondebug
compilations.  Their order is, but if they were to appear within more
complex exprs, as we may have in this pass, hash computation would not
ensure decl id ordering is preserved.
Richard Biener March 10, 2017, 9:42 a.m. UTC | #3
On Fri, Mar 10, 2017 at 12:02 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Mar  8, 2017, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> Don't let pointer randomization change the order in which we process
>>> store chains.  This may cause SSA_NAMEs to be released in different
>>> order, and if they're reused later, they may cause differences in SSA
>>> partitioning, leading to differences in expand, and ultimately to
>>> different code.
>
>> Any reason you chose to maintain a new hash-map instead of at
>> interesting places gather to-be-processed chains into a vector and
>> sort that after IDs?
>
> Where would we get such ids?  AFAICT base_addrs don't have to be decls,
> they can be arbitrary expressions.
>
>> Traversing the hash-map still doesn't get you
>> a reliable order but only one depending on the chain creation and
>> thus stmt walk order
>
> True.  That's what all we need in general: output depends on stable
> inputs only (relative order of stmts within BBs), not on random flukes
> like pointer values within the compiler process.
>
>> But it will for example still make testcase reduction fragile if
>> shrinking the hash-map by removing unrelated code ends up processing
>> things in different order.
>
> True, if you remove or move the stmts that initiate a chain within a BB,
> you will change the processing order.  There are many passes that will
> process stmts or insns in the order they appear, so shuffling them will
> yield different SSA version assignments, pseudo numbers and whatnot.
> Why should this pass be held to a higher standard?
>
>> So - if we fix this can we fix it by properly sorting things?
>
> I don't see a way to do better, considering there's no real ID we could
> use for sorting.  Do you?

Just use the ID you use but instead of maintaining a hash-map and traverse
that collect work items from the other hash-table into a vec and then sort
after the ID.

I was just thinking that if we're going all the way of having IDs then
walking a hash-map of those IDs is not as good as we can do
with similar cost?

Richard.

>
> Even if we were to use decl IDs rather than pointers in the tree
> hashing, that would still make for impredictable ordering, because decl
> ids are not necessarily stable across e.g. debug and nondebug
> compilations.  Their order is, but if they were to appear within more
> complex exprs, as we may have in this pass, hash computation would not
> ensure decl id ordering is preserved.
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Alexandre Oliva March 10, 2017, 9 p.m. UTC | #4
On Mar 10, 2017, Richard Biener <richard.guenther@gmail.com> wrote:

>>> So - if we fix this can we fix it by properly sorting things?

>> I don't see a way to do better, considering there's no real ID we could
>> use for sorting.  Do you?

> Just use the ID you use but instead of maintaining a hash-map and traverse
> that collect work items from the other hash-table into a vec and then sort
> after the ID.

Why is that better?  You wrote 'hash-map' again, so I won't assume it
was a mistake any more.  I'm using a map for the seqno-to-chain mapping;
that's a lot more predictable performance- and size-wise.  Iterating
over a hashmap requires one to go over all of the holes, and you get
shuffled results that you then have to sort every time you'll iterate
over the entries.  You may have to do so multiple times, and if the
number of chains is large (is that relevant in practice?) you may have
to resize it multiple times.

Whereas with the map I proposed, we always insert at the end, we iterate
efficiently without having to do additional sorting, we most often
remove from the head, and in the exceptional aliasing cases in which we
remove from the middle, we do so without much regard to the element
count.

Now, if there's something you dislike about maps, we could make it a
doubly-linked list, or maybe even a singly-linked list.  I could replace
seqno with a pointer to the next entry in the seq list.  We'd just have
to keep a pointer to the first entry instead of the newly-added map.  We
don't even need a pointer to the last element: we could make it work
like a stack, rather than a queue, since the sequence of elements
doesn't matter much, as long as we make it consistent.  That would
amount to more boilerplace list-maintenance code, but it's certainly
doable, it costs just a pointer vs an int in the chains and an
additional pointer over your suggestion, but it saves the shuffling and
sorting.

> I was just thinking that if we're going all the way of having IDs then
> walking a hash-map of those IDs is not as good as we can do
> with similar cost?

You seem to consider walking a hash_map (which we did before) pretty
costly, but then I'm not proposin us to do so; the performance features
of a map are not the same as those of a hash_map, and I was proposing a
map.  Please reassess under this light, and considering the revised
suggestion above.
diff mbox

Patch

diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index 17ac94a..0bf7d89 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -103,6 +103,7 @@ 
   [p + 4B] (16-bit) := 0xabcd;     //  val & 0x00000000ffff;  */
 
 #include "config.h"
+#define INCLUDE_MAP
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
@@ -675,11 +676,12 @@  merged_store_group::apply_stores ()
 
 struct imm_store_chain_info
 {
+  unsigned seqno;
   tree base_addr;
   auto_vec<struct store_immediate_info *> m_store_info;
   auto_vec<merged_store_group *> m_merged_store_groups;
 
-  imm_store_chain_info (tree b_a) : base_addr (b_a) {}
+  imm_store_chain_info (unsigned seq, tree b_a) : seqno (seq), base_addr (b_a) {}
   bool terminate_and_process_chain ();
   bool coalesce_immediate_stores ();
   bool output_merged_store (merged_store_group *);
@@ -718,6 +720,18 @@  public:
 private:
   hash_map<tree_operand_hash, struct imm_store_chain_info *> m_stores;
 
+  /* Use m_store_seq to order elements in m_stores, and iterate over
+     them in a predictable way.  It maps sequence numbers to the base
+     addresses stored as keys in m_stores.  Using this order avoids
+     extraneous differences in the compiler output just because of
+     tree pointer variations (e.g. different chains end up in
+     different positions of m_stores, so they are handled in different
+     orders, so they allocate or release SSA names in different
+     orders, and when they get reused, subsequent passes end up
+     getting different SSA names, which may ultimately change
+     decisions when going out of SSA).  */
+  std::map<unsigned, tree> m_store_seq;
+
   bool terminate_and_process_all_chains ();
   bool terminate_all_aliasing_chains (imm_store_chain_info **,
 				      bool, gimple *);
@@ -730,11 +744,16 @@  private:
 bool
 pass_store_merging::terminate_and_process_all_chains ()
 {
-  hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter
-    = m_stores.begin ();
   bool ret = false;
-  for (; iter != m_stores.end (); ++iter)
-    ret |= terminate_and_release_chain ((*iter).second);
+  for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (),
+	 iter = next; iter != m_store_seq.end (); iter = next)
+    {
+      next++;
+      tree base_addr = (*iter).second;
+      ret |= terminate_and_release_chain (*m_stores.get (base_addr));
+    }
+  gcc_assert (m_stores.elements () == 0);
+  gcc_assert (m_store_seq.empty ());
 
   return ret;
 }
@@ -799,15 +818,17 @@  pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
 	}
     }
 
-  hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter
-    = m_stores.begin ();
-
   /* Check for aliasing with all other store chains.  */
-  for (; iter != m_stores.end (); ++iter)
+  for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (),
+	 iter = next; iter != m_store_seq.end (); iter = next)
     {
+      next++;
+      unsigned seqno = (*iter).first;
+      tree base_addr = (*iter).second;
+
       /* We already checked all the stores in chain_info and terminated the
 	 chain if necessary.  Skip it here.  */
-      if (chain_info && (*chain_info) == (*iter).second)
+      if (chain_info && (*chain_info)->seqno == seqno)
 	continue;
 
       /* We can't use the base object here as that does not reliably exist.
@@ -815,11 +836,11 @@  pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
 	 minimum and maximum offset and the maximum size we could improve
 	 things here).  */
       ao_ref chain_ref;
-      ao_ref_init_from_ptr_and_size (&chain_ref, (*iter).first, NULL_TREE);
+      ao_ref_init_from_ptr_and_size (&chain_ref, base_addr, NULL_TREE);
       if (ref_maybe_used_by_stmt_p (stmt, &chain_ref)
 	  || stmt_may_clobber_ref_p_1 (stmt, &chain_ref))
 	{
-	  terminate_and_release_chain ((*iter).second);
+	  terminate_and_release_chain (*m_stores.get (base_addr));
 	  ret = true;
 	}
     }
@@ -835,6 +856,7 @@  bool
 pass_store_merging::terminate_and_release_chain (imm_store_chain_info *chain_info)
 {
   bool ret = chain_info->terminate_and_process_chain ();
+  m_store_seq.erase (chain_info->seqno);
   m_stores.remove (chain_info->base_addr);
   delete chain_info;
   return ret;
@@ -1336,6 +1358,9 @@  pass_store_merging::execute (function *fun)
 	{
 	  gimple *stmt = gsi_stmt (gsi);
 
+	  if (is_gimple_debug (stmt))
+	    continue;
+
 	  if (gimple_has_volatile_ops (stmt))
 	    {
 	      /* Terminate all chains.  */
@@ -1346,9 +1371,6 @@  pass_store_merging::execute (function *fun)
 	      continue;
 	    }
 
-	  if (is_gimple_debug (stmt))
-	    continue;
-
 	  if (gimple_assign_single_p (stmt) && gimple_vdef (stmt)
 	      && !stmt_can_throw_internal (stmt)
 	      && lhs_valid_for_store_merging_p (gimple_assign_lhs (stmt)))
@@ -1454,13 +1476,17 @@  pass_store_merging::execute (function *fun)
 
 		  /* Store aliases any existing chain?  */
 		  terminate_all_aliasing_chains (chain_info, false, stmt);
+		  unsigned next_seqno = m_store_seq.empty () ? 0
+		    : m_store_seq.rbegin()->first + 1;
 		  /* Start a new chain.  */
 		  struct imm_store_chain_info *new_chain
-		    = new imm_store_chain_info (base_addr);
+		    = new imm_store_chain_info (next_seqno, base_addr);
 		  info = new store_immediate_info (bitsize, bitpos,
 						   stmt, 0);
 		  new_chain->m_store_info.safe_push (info);
 		  m_stores.put (base_addr, new_chain);
+		  m_store_seq.insert (m_store_seq.end (),
+				      std::make_pair (new_chain->seqno, base_addr));
 		  if (dump_file && (dump_flags & TDF_DETAILS))
 		    {
 		      fprintf (dump_file,