diff mbox series

Improve handling of memory operands in ipa-icf 4/4

Message ID 20201115232042.GA91420@kam.mff.cuni.cz
State New
Headers show
Series Improve handling of memory operands in ipa-icf 4/4 | expand

Commit Message

Jan Hubicka Nov. 15, 2020, 11:20 p.m. UTC
Hi,
this patch implements the logic to drop base alias sets (and thus also access
paths) from memory operands when doing so helps merging.  This is done
by extending func_checker::compare_operand to record mismatched pairs
and then processing this in sem_function::equals_private when comparsion
suceeds.

This is controlled by -fipa-icf-alias-sets

The patch drops too early, so we may end up processing function twice.  Also if
merging is not performed we lose code quality for no win (this is rare case).
My original plan was to remember the mismatched parameter and apply them only
after merging decisions are finished, but I was not sure how to do that in
ipa-icf.  In particular we need to ensure transitivity. In particular if
function foo is merged to bar, we also need to be sure that we dropped
base alias setsin functions tht are called by bar even if they themselves
are not merged. Martin, is there easy way to implement this on top of current ICF?

Patch improves ICF code size savings on cc1plus to 1.3% compared to 0.6% before patch
and 3% with -fno-strict-aliasing. 6802 functions are merged and 3975 base alias sets
are dropped, 3642 originating from hash-table.h.

memory stats are:
 ipa lto gimple in                  :   1.01 (  3%)   0.47 ( 12%)   1.56 (  4%)   228M ( 14%)
 tree operand scan                  :   0.40 (  1%)   0.16 (  4%)   0.49 (  1%)    39M (  2%)
 ipa icf                            :   1.96 (  6%)   0.20 (  5%)   2.14 (  6%)    12M (  1%)
 TOTAL                              :  31.99          3.99         36.07         1632M

compared to --disable-ipa-icf:
 TOTAL                              :  26.52          2.44         29.21         1354M

To recover more of -fno-strict-alias-analysis we could have
-fipa-icf-alias-sets 4-state
 -fipa-icf-alias-sets=0 do not drop any TBAA
 -fipa-icf-alias-sets=1 drop base alias sets
 -fipa-icf-alias-sets=2 drop pointer ref alias sets
 -fipa-icf-alias-sets=3 also drop ref alias sets to 0 for completely incompatible types.

On GCC most common reason is now diference in pointer types. so =2
should get the 3% of code size.

Bootstrapped/regtested x86_64-linux.
OK for mainline?  I would be also fine with making this wait for next
stage1 if that looks too intrusive (with part 3 of series at least icf
is not consuming a lot of memory for nothing), but there ought to be
again very nice savings on libreoffice which I think had double-digit
saving for icf (13% if I recall correctly).  I am busy tomorrow but will
start looking into firefox, clang and libreoffice builds again on
wednesday.

Honza

	* invoke.texi (-fipa-icf-alias-sets): Document.
	* common.opt (ipa-icf-alias-sets): New flag.
	* ipa-icf-gimple.c (func_checker::func_checker): Initialize
	m_compare_base_alias_sets.
	(func_checker::hash_operand): Pass m_compare_base_alias_sets
	to hash_ao_ref.
	(func_checker::compare_operand): If asked to record
	mismatched base alias sets.
	* ipa-icf-gimple.h (func_checker::m_compare_base_alias_sets): New
	variable.
	(func_checker::func_checker): Update prototype.
	(func_checker::m_mismatched_base_alias_set): New variable.
	* ipa-icf.c: Include ipa-modref-tree.h and ipa-modref.h
	(drop_base_alias_sets): New function.
	(sem_function::equals_private): Use it.
	(sem_function::hash_stmt): Only record
	base alias sets if asked to.
	* ipa-modref.c (ipa_modref_analyze): New function.
	* ipa-modref.h (ipa_modref_analyze): Declare.
	* tree-ssa-alias-compare.h (ao_compare::hash_ao_ref): Update prototype.
	* tree-ssa-alias.c (ao_compare::compare_ao_refs): Add
	base_alias_set argument.

Comments

Richard Biener Nov. 16, 2020, 9:26 a.m. UTC | #1
On Mon, 16 Nov 2020, Jan Hubicka wrote:

> Hi,
> this patch implements the logic to drop base alias sets (and thus also access
> paths) from memory operands when doing so helps merging.  This is done
> by extending func_checker::compare_operand to record mismatched pairs
> and then processing this in sem_function::equals_private when comparsion
> suceeds.
> 
> This is controlled by -fipa-icf-alias-sets
> 
> The patch drops too early, so we may end up processing function twice.  Also if
> merging is not performed we lose code quality for no win (this is rare case).
> My original plan was to remember the mismatched parameter and apply them only
> after merging decisions are finished, but I was not sure how to do that in
> ipa-icf.  In particular we need to ensure transitivity. In particular if
> function foo is merged to bar, we also need to be sure that we dropped
> base alias setsin functions tht are called by bar even if they themselves
> are not merged. Martin, is there easy way to implement this on top of current ICF?
> 
> Patch improves ICF code size savings on cc1plus to 1.3% compared to 0.6% before patch
> and 3% with -fno-strict-aliasing. 6802 functions are merged and 3975 base alias sets
> are dropped, 3642 originating from hash-table.h.
> 
> memory stats are:
>  ipa lto gimple in                  :   1.01 (  3%)   0.47 ( 12%)   1.56 (  4%)   228M ( 14%)
>  tree operand scan                  :   0.40 (  1%)   0.16 (  4%)   0.49 (  1%)    39M (  2%)
>  ipa icf                            :   1.96 (  6%)   0.20 (  5%)   2.14 (  6%)    12M (  1%)
>  TOTAL                              :  31.99          3.99         36.07         1632M
> 
> compared to --disable-ipa-icf:
>  TOTAL                              :  26.52          2.44         29.21         1354M

Compared to -fno-ipa-icf-alias-sets?

> To recover more of -fno-strict-alias-analysis we could have
> -fipa-icf-alias-sets 4-state
>  -fipa-icf-alias-sets=0 do not drop any TBAA
>  -fipa-icf-alias-sets=1 drop base alias sets
>  -fipa-icf-alias-sets=2 drop pointer ref alias sets
>  -fipa-icf-alias-sets=3 also drop ref alias sets to 0 for completely incompatible types.

I'm concerned about the test matrix exploding here ... it's already
quite difficult to create testcases for possible alias breakage with ICF.

Note the option name isn't very good for users.  Isn't this just
a kind of aggressiveness?  Thus -fipa-icf-aggressive?  That is,
we perform more merging even if the merging might come at cost
of possibly doing less optimizations on the merged result?

It again makes me think of leaving inline clones alone, thus
have "optimized" and "merged" function bodies.  You've said
this might still break things, but I don't remember the details.

How does ICF fare with modref analysis?  I guess we'll have to drop
modref summaries once we do any alias-set squashing?

> On GCC most common reason is now diference in pointer types. so =2
> should get the 3% of code size.
> 
> Bootstrapped/regtested x86_64-linux.
> OK for mainline?  I would be also fine with making this wait for next
> stage1 if that looks too intrusive (with part 3 of series at least icf
> is not consuming a lot of memory for nothing), but there ought to be
> again very nice savings on libreoffice which I think had double-digit
> saving for icf (13% if I recall correctly).  I am busy tomorrow but will
> start looking into firefox, clang and libreoffice builds again on
> wednesday.

It definitely looks a bit rushed to me and I'd rather get modref
and the fn spec propagation bits ironed out before throwing in ICF.

But decide for yourself here - in principle the patch looks good
but I have no idea of confidence in the approach ;)

Thanks,
Richard.

> Honza
> 
> 	* invoke.texi (-fipa-icf-alias-sets): Document.
> 	* common.opt (ipa-icf-alias-sets): New flag.
> 	* ipa-icf-gimple.c (func_checker::func_checker): Initialize
> 	m_compare_base_alias_sets.
> 	(func_checker::hash_operand): Pass m_compare_base_alias_sets
> 	to hash_ao_ref.
> 	(func_checker::compare_operand): If asked to record
> 	mismatched base alias sets.
> 	* ipa-icf-gimple.h (func_checker::m_compare_base_alias_sets): New
> 	variable.
> 	(func_checker::func_checker): Update prototype.
> 	(func_checker::m_mismatched_base_alias_set): New variable.
> 	* ipa-icf.c: Include ipa-modref-tree.h and ipa-modref.h
> 	(drop_base_alias_sets): New function.
> 	(sem_function::equals_private): Use it.
> 	(sem_function::hash_stmt): Only record
> 	base alias sets if asked to.
> 	* ipa-modref.c (ipa_modref_analyze): New function.
> 	* ipa-modref.h (ipa_modref_analyze): Declare.
> 	* tree-ssa-alias-compare.h (ao_compare::hash_ao_ref): Update prototype.
> 	* tree-ssa-alias.c (ao_compare::compare_ao_refs): Add
> 	base_alias_set argument.
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 9552cebe0d6..a5178bfa7e5 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1857,6 +1857,10 @@ fipa-icf-functions
>  Common Report Var(flag_ipa_icf_functions) Optimization
>  Perform Identical Code Folding for functions.
>  
> +fipa-icf-alias-sets
> +Common Report Var(flag_ipa_icf_alias_sets) Init(1) Optimization
> +Perform Identical Code Folding even if base alias set info needs to be eliminated
> +
>  fipa-icf-variables
>  Common Report Var(flag_ipa_icf_variables) Optimization
>  Perform Identical Code Folding for variables.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 322ec357bb6..76bb4862431 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11043,6 +11043,13 @@ equivalences that are found only by GCC and equivalences found only by Gold.
>  
>  This flag is enabled by default at @option{-O2} and @option{-Os}.
>  
> +@item -fipa-icf-alias-sets
> +@opindex fipa-icf-alias-sets
> +Perform Identical Code Folding of memory accesses that are not complatible
> +according to type based alias analysis rules.  This implies removing the
> +type information from the accesses possibly degrading optimization.
> +Enabled by default.
> +
>  @item -flive-patching=@var{level}
>  @opindex flive-patching
>  Control GCC's optimizations to produce output suitable for live-patching.
> diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
> index ffb1baddbdb..e05d14d9c17 100644
> --- a/gcc/ipa-icf-gimple.c
> +++ b/gcc/ipa-icf-gimple.c
> @@ -54,12 +54,14 @@ namespace ipa_icf_gimple {
>  
>  func_checker::func_checker (tree source_func_decl, tree target_func_decl,
>  			    bool ignore_labels, bool tbaa,
> +			    bool compare_base_alias_sets,
>  			    hash_set<symtab_node *> *ignored_source_nodes,
>  			    hash_set<symtab_node *> *ignored_target_nodes)
>    : m_source_func_decl (source_func_decl), m_target_func_decl (target_func_decl),
>      m_ignored_source_nodes (ignored_source_nodes),
>      m_ignored_target_nodes (ignored_target_nodes),
> -    m_ignore_labels (ignore_labels), m_tbaa (tbaa)
> +    m_ignore_labels (ignore_labels), m_tbaa (tbaa),
> +    m_compare_base_alias_sets (compare_base_alias_sets)
>  {
>    function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
>    function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
> @@ -259,7 +261,8 @@ func_checker::hash_operand (const_tree arg, inchash::hash &hstate,
>      {
>        ao_ref ref;
>        ao_ref_init (&ref, const_cast <tree> (arg));
> -      return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, hstate);
> +      return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa,
> +		          m_compare_base_alias_sets, hstate);
>      }
>    else
>      return hash_operand (arg, hstate, flags);
> @@ -335,18 +338,28 @@ func_checker::compare_operand (tree t1, tree t2, operand_access_type access)
>        if (flags & SEMANTICS)
>  	return return_false_with_msg
>  		("compare_ao_refs failed (semantic difference)");
> -      if (flags & BASE_ALIAS_SET)
> -	return return_false_with_msg
> -		("compare_ao_refs failed (base alias set difference)");
>        if (flags & REF_ALIAS_SET)
>  	return return_false_with_msg
>  		 ("compare_ao_refs failed (ref alias set difference)");
> -      if (flags & ACCESS_PATH)
> -	return return_false_with_msg
> -		 ("compare_ao_refs failed (access path difference)");
>        if (flags & DEPENDENCE_CLIQUE)
>  	return return_false_with_msg
>  		 ("compare_ao_refs failed (dependence clique difference)");
> +      if (!m_compare_base_alias_sets)
> +	{
> +	  if (ao_ref_base_alias_ptr_type (&ref1)
> +	      != ao_ref_alias_ptr_type (&ref1))
> +	    m_mismatched_base_alias_set.safe_push (t1);
> +	  if (ao_ref_base_alias_ptr_type (&ref2)
> +	      != ao_ref_alias_ptr_type (&ref2))
> +	    m_mismatched_base_alias_set.safe_push (t2);
> +	  return true;
> +	}
> +      if (flags & BASE_ALIAS_SET)
> +	return return_false_with_msg
> +		("compare_ao_refs failed (base alias set difference)");
> +      if (flags & ACCESS_PATH)
> +	return return_false_with_msg
> +		 ("compare_ao_refs failed (access path difference)");
>        gcc_unreachable ();
>      }
>    else
> diff --git a/gcc/ipa-icf-gimple.h b/gcc/ipa-icf-gimple.h
> index 40f7474738e..9745fdd9278 100644
> --- a/gcc/ipa-icf-gimple.h
> +++ b/gcc/ipa-icf-gimple.h
> @@ -125,7 +125,7 @@ public:
>    func_checker ():
>      m_source_func_decl (NULL_TREE), m_target_func_decl (NULL_TREE),
>      m_ignored_source_nodes (NULL), m_ignored_target_nodes (NULL),
> -    m_ignore_labels (false), m_tbaa (true)
> +    m_ignore_labels (false), m_tbaa (true), m_compare_base_alias_sets (NULL)
>    {
>      m_source_ssa_names.create (0);
>      m_target_ssa_names.create (0);
> @@ -140,6 +140,7 @@ public:
>    func_checker (tree source_func_decl, tree target_func_decl,
>  		bool ignore_labels = false,
>  		bool tbaa = true,
> +		bool compare_base_alias_sets = true,
>  		hash_set<symtab_node *> *ignored_source_nodes = NULL,
>  		hash_set<symtab_node *> *ignored_target_nodes = NULL);
>  
> @@ -243,6 +244,9 @@ public:
>    /* Return access type of a given operand.  */
>    static operand_access_type get_operand_access_type
>  		 (operand_access_type_map *map, tree);
> +
> +  /* Memory operands that have conficting alias sets.  */
> +  auto_vec<tree> m_mismatched_base_alias_set;
>  private:
>    /* Vector mapping source SSA names to target ones.  */
>    vec <int> m_source_ssa_names;
> @@ -264,7 +268,6 @@ private:
>       declaration comparison.  */
>    hash_set<symtab_node *> *m_ignored_target_nodes;
>  
> -  /* Source to target edge map.  */
>    hash_map <edge, edge> m_edge_map;
>  
>    /* Source to target declaration map.  */
> @@ -279,6 +282,8 @@ private:
>    /* Flag if we should compare type based alias analysis info.  */
>    bool m_tbaa;
>  
> +  /* Flag if we should compare base alias sets.  */
> +  bool m_compare_base_alias_sets;
>  public:
>    /* Return true if two operands are equal.  The flags fields can be used
>       to specify OEP flags described above.  */
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 27eeda3a319..33fbb58ebc6 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -88,6 +88,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-vector-builder.h"
>  #include "symtab-thunks.h"
>  #include "alias.h"
> +#include "ipa-modref-tree.h"
> +#include "ipa-modref.h"
>  
>  using namespace ipa_icf_gimple;
>  
> @@ -823,6 +825,72 @@ sem_function::equals (sem_item *item,
>    return eq;
>  }
>  
> +/* Drop base alias sets in function DECL for all operands in
> +   MISMATCHED_BASE_ALIAS_SET_OPS.  */
> +
> +static void
> +drop_base_alias_sets (tree decl,
> +		      hash_set <tree> &mismatched_base_alias_set_ops)
> +{
> +  function *func = DECL_STRUCT_FUNCTION (decl);
> +  unsigned int n = 0;
> +  basic_block bb;
> +
> +  FOR_EACH_BB_FN (bb, func)
> +    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> +	 gsi_next (&gsi))
> +      for (unsigned i = 0; i < gimple_num_ops (gsi_stmt (gsi)); ++i)
> +	if (mismatched_base_alias_set_ops.contains
> +	      (gimple_op (gsi_stmt (gsi), i)))
> +	  {
> +	    tree *op = gimple_op_ptr (gsi_stmt (gsi), i);
> +	    ao_ref ref;
> +	    ao_ref_init (&ref, *op);
> +	    n++;
> +	    tree ptrtype = ao_ref_alias_ptr_type (&ref);
> +
> +	    if (dump_enabled_p ())
> +	      {
> +		dump_user_location_t loc (gsi_stmt (gsi));
> +		dump_printf_loc (MSG_NOTE, loc,
> +				 "Dropping base alias set to enable"
> +				 " identical code folding.\n");
> +	      }
> +
> +	    while (handled_component_p (*op))
> +	      op = &TREE_OPERAND (*op, 0);
> +
> +	    /* Adjust MEM_REF.  This will also make it view
> +	       converted so we will drop access path.  */
> +	    if (TREE_CODE (*op) == MEM_REF
> +		|| TREE_CODE (*op) == TARGET_MEM_REF)
> +	      {
> +		TREE_OPERAND (*op, 1)
> +		  = fold_convert (ptrtype, TREE_OPERAND (*op, 1));
> +	      }
> +	    /* If there is no MEM_REF introduce new one.  */
> +	    else
> +	      {
> +		tree t = *op;
> +
> +		*op = build2 (MEM_REF, TREE_TYPE (t),
> +			      build1 (ADDR_EXPR, ptrtype, t),
> +			      build_int_cst (ptrtype, 0));
> +		TREE_THIS_VOLATILE (*op) = TREE_THIS_VOLATILE (t);
> +	      }
> +	    if (flag_checking)
> +	      {
> +		ao_ref_init (&ref, gimple_op (gsi_stmt (gsi), i));
> +		gcc_assert (ao_ref_alias_ptr_type (&ref)
> +			    == ao_ref_base_alias_ptr_type (&ref));
> +		gcc_assert (ao_ref_alias_set (&ref)
> +			    == ao_ref_base_alias_set (&ref));
> +	      }
> +	  }
> +  if (n)
> +    ipa_modref_analyze (cgraph_node::get (decl));
> +}
> +
>  /* Processes function equality comparison.  */
>  
>  bool
> @@ -850,6 +918,8 @@ sem_function::equals_private (sem_item *item)
>  				false,
>  				opt_for_fn (m_compared_func->decl,
>  					    flag_strict_aliasing),
> +				!opt_for_fn (m_compared_func->decl,
> +					     flag_ipa_icf_alias_sets),
>  				&refs_set,
>  				&m_compared_func->refs_set);
>    arg1 = DECL_ARGUMENTS (decl);
> @@ -920,6 +990,32 @@ sem_function::equals_private (sem_item *item)
>      if (!compare_phi_node (bb_sorted[i]->bb, m_compared_func->bb_sorted[i]->bb))
>        return return_false_with_msg ("PHI node comparison returns false");
>  
> +  /* If necessary walk alias body and drop mase alias sets.  */
> +  if (result && m_checker->m_mismatched_base_alias_set.length ())
> +    {
> +      hash_set <tree> mismatched_base_alias_set_ops;
> +      /* Record mismatched operands.  */
> +      for (unsigned i = 0; i < m_checker->m_mismatched_base_alias_set.length (); i++)
> +	{
> +	  tree op = m_checker->m_mismatched_base_alias_set [i];
> +	  mismatched_base_alias_set_ops.add (op);
> +	}
> +      /* Drop in both bodies; it is not clear what body will win in the
> +	 merging and thus process both.
> +
> +	 TODO: It is possible that merging will fail.
> +	 In that case we lose alias info for no win.  This is bit hard to track
> +	 because proving two function equal also enables merging of callers
> +	 of the functions.
> +	 
> +	 We can also keep the body with alias set around for inlining,
> +	 but again this requires quite careful tracking of the transitive
> +	 closure of all merges.  */
> +      drop_base_alias_sets (decl, mismatched_base_alias_set_ops);
> +      drop_base_alias_sets (m_compared_func->decl,
> +			    mismatched_base_alias_set_ops);
> +    }
> +
>    return result;
>  }
>  
> @@ -1461,9 +1557,12 @@ sem_function::hash_stmt (gimple *stmt, inchash::hash &hstate)
>  		tree t = ao_ref_alias_ptr_type (&ref);
>  		if (variably_modified_type_p (t, NULL_TREE))
>  		  memory_access_types.safe_push (t);
> -		t = ao_ref_base_alias_ptr_type (&ref);
> -		if (variably_modified_type_p (t, NULL_TREE))
> -		  memory_access_types.safe_push (t);
> +		if (!flag_ipa_icf_alias_sets)
> +		  {
> +		    t = ao_ref_base_alias_ptr_type (&ref);
> +		    if (variably_modified_type_p (t, NULL_TREE))
> +		      memory_access_types.safe_push (t);
> +		  }
>  	      }
>  	  }
>  	/* Consider nocf_check attribute in hash as it affects code
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index 155bf59fc53..b09a2fcb7a2 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -2117,6 +2117,30 @@ modref_summaries_lto::insert (struct cgraph_node *node, modref_summary_lto *)
>    pop_cfun ();
>  }
>  
> +/* Recompute summary after body was updated.  */
> +
> +void
> +ipa_modref_analyze (struct cgraph_node *node)
> +{
> +  if (optimization_summaries
> +      && optimization_summaries->get (node))
> +    {
> +      optimization_summaries->remove (node);
> +      optimization_summaries->insert
> +	      (node, optimization_summaries->get_create (node));
> +    }
> +  if (summaries && summaries->get (node))
> +    {
> +      summaries->remove (node);
> +      summaries->insert (node, summaries->get_create (node));
> +    }
> +  if (summaries_lto && summaries_lto->get (node))
> +    {
> +      summaries_lto->remove (node);
> +      summaries_lto->insert (node, summaries_lto->get_create (node));
> +    }
> +}
> +
>  /* Called when new clone is inserted to callgraph late.  */
>  
>  void
> diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
> index 7decabd5744..982c620468a 100644
> --- a/gcc/ipa-modref.h
> +++ b/gcc/ipa-modref.h
> @@ -41,5 +41,6 @@ struct GTY(()) modref_summary
>  modref_summary *get_modref_function_summary (cgraph_node *func);
>  void ipa_modref_c_finalize ();
>  void ipa_merge_modref_summary_after_inlining (cgraph_edge *e);
> +void ipa_modref_analyze (cgraph_node *node);
>  
>  #endif
> diff --git a/gcc/tree-ssa-alias-compare.h b/gcc/tree-ssa-alias-compare.h
> index 0e8409a7565..13bfa3381b5 100644
> --- a/gcc/tree-ssa-alias-compare.h
> +++ b/gcc/tree-ssa-alias-compare.h
> @@ -37,7 +37,7 @@ class ao_compare : public operand_compare
>    int compare_ao_refs (ao_ref *ref1, ao_ref *ref2, bool lto_streaming_safe,
>  		       bool tbaa);
>    void hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa,
> -		    inchash::hash &hstate);
> +		    bool base_alias_set, inchash::hash &hstate);
>  };
>  
>  #endif
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 5ebbb087285..52dd0055905 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -4169,11 +4169,14 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
>    return flags;
>  }
>  
> -/* Hash REF to HSTATE.  If LTO_STREAMING_SAFE do not use alias sets
> +/* Hash REF to HSTATE. If TBAA is false do not hash info relevant
> +   for type based alias anslysi.  If BASE_ALIS_SET is false
> +   do not hash base alias set and the access path.
> +   If LTO_STREAMING_SAFE do not use alias sets
>     and canonical types.  */
>  void
>  ao_compare::hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa,
> -			 inchash::hash &hstate)
> +			 bool base_alias_set, inchash::hash &hstate)
>  {
>    tree base = ao_ref_base (ref);
>    tree tbase = base;
> @@ -4209,6 +4212,7 @@ ao_compare::hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa,
>    if (!lto_streaming_safe && tbaa)
>      {
>        hstate.add_int (ao_ref_alias_set (ref));
> -      hstate.add_int (ao_ref_base_alias_set (ref));
> +      if (base_alias_set)
> +	hstate.add_int (ao_ref_base_alias_set (ref));
>      }
>  }
>
Martin Liška Nov. 18, 2020, 12:23 p.m. UTC | #2
On 11/16/20 12:20 AM, Jan Hubicka wrote:
> This is controlled by -fipa-icf-alias-sets
> 
> The patch drops too early, so we may end up processing function twice.  Also if
> merging is not performed we lose code quality for no win (this is rare case).
> My original plan was to remember the mismatched parameter and apply them only
> after merging decisions are finished, but I was not sure how to do that in
> ipa-icf.  In particular we need to ensure transitivity. In particular if
> function foo is merged to bar, we also need to be sure that we dropped
> base alias setsin functions tht are called by bar even if they themselves
> are not merged. Martin, is there easy way to implement this on top of current ICF?

Well, you will need to create a set of merged functions and then traverse all
callers of these (via cgraph_node callers). It should not be so difficult, or?

> 
> Patch improves ICF code size savings on cc1plus to 1.3% compared to 0.6% before patch
> and 3% with -fno-strict-aliasing. 6802 functions are merged and 3975 base alias sets
> are dropped, 3642 originating from hash-table.h.
> 
> memory stats are:
>   ipa lto gimple in                  :   1.01 (  3%)   0.47 ( 12%)   1.56 (  4%)   228M ( 14%)
>   tree operand scan                  :   0.40 (  1%)   0.16 (  4%)   0.49 (  1%)    39M (  2%)
>   ipa icf                            :   1.96 (  6%)   0.20 (  5%)   2.14 (  6%)    12M (  1%)
>   TOTAL                              :  31.99          3.99         36.07         1632M
> 
> compared to --disable-ipa-icf:
>   TOTAL                              :  26.52          2.44         29.21         1354M
> 
> To recover more of -fno-strict-alias-analysis we could have
> -fipa-icf-alias-sets 4-state
>   -fipa-icf-alias-sets=0 do not drop any TBAA
>   -fipa-icf-alias-sets=1 drop base alias sets
>   -fipa-icf-alias-sets=2 drop pointer ref alias sets
>   -fipa-icf-alias-sets=3 also drop ref alias sets to 0 for completely incompatible types.
> 
> On GCC most common reason is now diference in pointer types. so =2
> should get the 3% of code size.
> 
> Bootstrapped/regtested x86_64-linux.
> OK for mainline?  I would be also fine with making this wait for next
> stage1 if that looks too intrusive (with part 3 of series at least icf
> is not consuming a lot of memory for nothing), but there ought to be
> again very nice savings on libreoffice which I think had double-digit
> saving for icf (13% if I recall correctly).  I am busy tomorrow but will
> start looking into firefox, clang and libreoffice builds again on
> wednesday.

I would recommend deferring that to the next stage1.

Thanks for working on that Honza,
Martin
Jan Hubicka Nov. 19, 2020, 10:14 a.m. UTC | #3
> On 11/16/20 12:20 AM, Jan Hubicka wrote:
> > This is controlled by -fipa-icf-alias-sets
> > 
> > The patch drops too early, so we may end up processing function twice.  Also if
> > merging is not performed we lose code quality for no win (this is rare case).
> > My original plan was to remember the mismatched parameter and apply them only
> > after merging decisions are finished, but I was not sure how to do that in
> > ipa-icf.  In particular we need to ensure transitivity. In particular if
> > function foo is merged to bar, we also need to be sure that we dropped
> > base alias setsin functions tht are called by bar even if they themselves
> > are not merged. Martin, is there easy way to implement this on top of current ICF?
> 
> Well, you will need to create a set of merged functions and then traverse all
> callers of these (via cgraph_node callers). It should not be so difficult, or?

Well, imagine you have function A1 and A2 
and calls A1->B2
and       A2->B3
and there is also B3.

Now A1 is ICF equivalent to A2
and also B1,B2,B3 are ICF equivalent if some TBAA info is dropped.

ICF merges A2 to A1
it also considers to merge B2,B3 to B1 but concludes it is not benefical
at the very end (because some of them have address taken and
constructing wrapper is too expensive)

The comparsions done are
 B1:B2
 B1:B3
 A1:A2
So after comparing we have info what to drop in B1 to make merging B2->B1
and B3->B1 valid.  We also have info what to drop in B2 to make B1->B2
valid and in B3 to make B3->B1 valid.

But we meed to drop info in B2 to make B3->B2 valid to make call path
alias A2 of A1->B2 safe.

Honza
Martin Liška Nov. 20, 2020, 8:51 a.m. UTC | #4
On 11/19/20 11:14 AM, Jan Hubicka wrote:
>> On 11/16/20 12:20 AM, Jan Hubicka wrote:
>>> This is controlled by -fipa-icf-alias-sets
>>>
>>> The patch drops too early, so we may end up processing function twice.  Also if
>>> merging is not performed we lose code quality for no win (this is rare case).
>>> My original plan was to remember the mismatched parameter and apply them only
>>> after merging decisions are finished, but I was not sure how to do that in
>>> ipa-icf.  In particular we need to ensure transitivity. In particular if
>>> function foo is merged to bar, we also need to be sure that we dropped
>>> base alias setsin functions tht are called by bar even if they themselves
>>> are not merged. Martin, is there easy way to implement this on top of current ICF?
>>
>> Well, you will need to create a set of merged functions and then traverse all
>> callers of these (via cgraph_node callers). It should not be so difficult, or?
> 

Hey.

> Well, imagine you have function A1 and A2
> and calls A1->B2
> and       A2->B3
> and there is also B3.

You likely mean A1->B1 and A2->B2, right?

> 
> Now A1 is ICF equivalent to A2
> and also B1,B2,B3 are ICF equivalent if some TBAA info is dropped.

ICF works in a way that we have classes of functions (groups) that we know
that are equivalent. And we we do, we subdivide these classes. Then when
e.g. foo1 and foo2 are known to be different then a class
with bar1->foo1 and bar2->foo2 is split as well.

> 
> ICF merges A2 to A1
> it also considers to merge B2,B3 to B1 but concludes it is not benefical
> at the very end (because some of them have address taken and
> constructing wrapper is too expensive)

We never do a revert of a decision!

> 
> The comparsions done are
>   B1:B2
>   B1:B3
>   A1:A2
> So after comparing we have info what to drop in B1 to make merging B2->B1
> and B3->B1 valid.  We also have info what to drop in B2 to make B1->B2
> valid and in B3 to make B3->B1 valid.
> 
> But we meed to drop info in B2 to make B3->B2 valid to make call path
> alias A2 of A1->B2 safe.

So what you need is to skip a division of groups based on the "strict aliasing"
and mark all functions that will need a drop operation.

Hope it helps?

Martin

> 
> Honza
>
diff mbox series

Patch

diff --git a/gcc/common.opt b/gcc/common.opt
index 9552cebe0d6..a5178bfa7e5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1857,6 +1857,10 @@  fipa-icf-functions
 Common Report Var(flag_ipa_icf_functions) Optimization
 Perform Identical Code Folding for functions.
 
+fipa-icf-alias-sets
+Common Report Var(flag_ipa_icf_alias_sets) Init(1) Optimization
+Perform Identical Code Folding even if base alias set info needs to be eliminated
+
 fipa-icf-variables
 Common Report Var(flag_ipa_icf_variables) Optimization
 Perform Identical Code Folding for variables.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 322ec357bb6..76bb4862431 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11043,6 +11043,13 @@  equivalences that are found only by GCC and equivalences found only by Gold.
 
 This flag is enabled by default at @option{-O2} and @option{-Os}.
 
+@item -fipa-icf-alias-sets
+@opindex fipa-icf-alias-sets
+Perform Identical Code Folding of memory accesses that are not complatible
+according to type based alias analysis rules.  This implies removing the
+type information from the accesses possibly degrading optimization.
+Enabled by default.
+
 @item -flive-patching=@var{level}
 @opindex flive-patching
 Control GCC's optimizations to produce output suitable for live-patching.
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index ffb1baddbdb..e05d14d9c17 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -54,12 +54,14 @@  namespace ipa_icf_gimple {
 
 func_checker::func_checker (tree source_func_decl, tree target_func_decl,
 			    bool ignore_labels, bool tbaa,
+			    bool compare_base_alias_sets,
 			    hash_set<symtab_node *> *ignored_source_nodes,
 			    hash_set<symtab_node *> *ignored_target_nodes)
   : m_source_func_decl (source_func_decl), m_target_func_decl (target_func_decl),
     m_ignored_source_nodes (ignored_source_nodes),
     m_ignored_target_nodes (ignored_target_nodes),
-    m_ignore_labels (ignore_labels), m_tbaa (tbaa)
+    m_ignore_labels (ignore_labels), m_tbaa (tbaa),
+    m_compare_base_alias_sets (compare_base_alias_sets)
 {
   function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
   function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
@@ -259,7 +261,8 @@  func_checker::hash_operand (const_tree arg, inchash::hash &hstate,
     {
       ao_ref ref;
       ao_ref_init (&ref, const_cast <tree> (arg));
-      return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, hstate);
+      return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa,
+		          m_compare_base_alias_sets, hstate);
     }
   else
     return hash_operand (arg, hstate, flags);
@@ -335,18 +338,28 @@  func_checker::compare_operand (tree t1, tree t2, operand_access_type access)
       if (flags & SEMANTICS)
 	return return_false_with_msg
 		("compare_ao_refs failed (semantic difference)");
-      if (flags & BASE_ALIAS_SET)
-	return return_false_with_msg
-		("compare_ao_refs failed (base alias set difference)");
       if (flags & REF_ALIAS_SET)
 	return return_false_with_msg
 		 ("compare_ao_refs failed (ref alias set difference)");
-      if (flags & ACCESS_PATH)
-	return return_false_with_msg
-		 ("compare_ao_refs failed (access path difference)");
       if (flags & DEPENDENCE_CLIQUE)
 	return return_false_with_msg
 		 ("compare_ao_refs failed (dependence clique difference)");
+      if (!m_compare_base_alias_sets)
+	{
+	  if (ao_ref_base_alias_ptr_type (&ref1)
+	      != ao_ref_alias_ptr_type (&ref1))
+	    m_mismatched_base_alias_set.safe_push (t1);
+	  if (ao_ref_base_alias_ptr_type (&ref2)
+	      != ao_ref_alias_ptr_type (&ref2))
+	    m_mismatched_base_alias_set.safe_push (t2);
+	  return true;
+	}
+      if (flags & BASE_ALIAS_SET)
+	return return_false_with_msg
+		("compare_ao_refs failed (base alias set difference)");
+      if (flags & ACCESS_PATH)
+	return return_false_with_msg
+		 ("compare_ao_refs failed (access path difference)");
       gcc_unreachable ();
     }
   else
diff --git a/gcc/ipa-icf-gimple.h b/gcc/ipa-icf-gimple.h
index 40f7474738e..9745fdd9278 100644
--- a/gcc/ipa-icf-gimple.h
+++ b/gcc/ipa-icf-gimple.h
@@ -125,7 +125,7 @@  public:
   func_checker ():
     m_source_func_decl (NULL_TREE), m_target_func_decl (NULL_TREE),
     m_ignored_source_nodes (NULL), m_ignored_target_nodes (NULL),
-    m_ignore_labels (false), m_tbaa (true)
+    m_ignore_labels (false), m_tbaa (true), m_compare_base_alias_sets (NULL)
   {
     m_source_ssa_names.create (0);
     m_target_ssa_names.create (0);
@@ -140,6 +140,7 @@  public:
   func_checker (tree source_func_decl, tree target_func_decl,
 		bool ignore_labels = false,
 		bool tbaa = true,
+		bool compare_base_alias_sets = true,
 		hash_set<symtab_node *> *ignored_source_nodes = NULL,
 		hash_set<symtab_node *> *ignored_target_nodes = NULL);
 
@@ -243,6 +244,9 @@  public:
   /* Return access type of a given operand.  */
   static operand_access_type get_operand_access_type
 		 (operand_access_type_map *map, tree);
+
+  /* Memory operands that have conficting alias sets.  */
+  auto_vec<tree> m_mismatched_base_alias_set;
 private:
   /* Vector mapping source SSA names to target ones.  */
   vec <int> m_source_ssa_names;
@@ -264,7 +268,6 @@  private:
      declaration comparison.  */
   hash_set<symtab_node *> *m_ignored_target_nodes;
 
-  /* Source to target edge map.  */
   hash_map <edge, edge> m_edge_map;
 
   /* Source to target declaration map.  */
@@ -279,6 +282,8 @@  private:
   /* Flag if we should compare type based alias analysis info.  */
   bool m_tbaa;
 
+  /* Flag if we should compare base alias sets.  */
+  bool m_compare_base_alias_sets;
 public:
   /* Return true if two operands are equal.  The flags fields can be used
      to specify OEP flags described above.  */
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 27eeda3a319..33fbb58ebc6 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -88,6 +88,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-vector-builder.h"
 #include "symtab-thunks.h"
 #include "alias.h"
+#include "ipa-modref-tree.h"
+#include "ipa-modref.h"
 
 using namespace ipa_icf_gimple;
 
@@ -823,6 +825,72 @@  sem_function::equals (sem_item *item,
   return eq;
 }
 
+/* Drop base alias sets in function DECL for all operands in
+   MISMATCHED_BASE_ALIAS_SET_OPS.  */
+
+static void
+drop_base_alias_sets (tree decl,
+		      hash_set <tree> &mismatched_base_alias_set_ops)
+{
+  function *func = DECL_STRUCT_FUNCTION (decl);
+  unsigned int n = 0;
+  basic_block bb;
+
+  FOR_EACH_BB_FN (bb, func)
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      for (unsigned i = 0; i < gimple_num_ops (gsi_stmt (gsi)); ++i)
+	if (mismatched_base_alias_set_ops.contains
+	      (gimple_op (gsi_stmt (gsi), i)))
+	  {
+	    tree *op = gimple_op_ptr (gsi_stmt (gsi), i);
+	    ao_ref ref;
+	    ao_ref_init (&ref, *op);
+	    n++;
+	    tree ptrtype = ao_ref_alias_ptr_type (&ref);
+
+	    if (dump_enabled_p ())
+	      {
+		dump_user_location_t loc (gsi_stmt (gsi));
+		dump_printf_loc (MSG_NOTE, loc,
+				 "Dropping base alias set to enable"
+				 " identical code folding.\n");
+	      }
+
+	    while (handled_component_p (*op))
+	      op = &TREE_OPERAND (*op, 0);
+
+	    /* Adjust MEM_REF.  This will also make it view
+	       converted so we will drop access path.  */
+	    if (TREE_CODE (*op) == MEM_REF
+		|| TREE_CODE (*op) == TARGET_MEM_REF)
+	      {
+		TREE_OPERAND (*op, 1)
+		  = fold_convert (ptrtype, TREE_OPERAND (*op, 1));
+	      }
+	    /* If there is no MEM_REF introduce new one.  */
+	    else
+	      {
+		tree t = *op;
+
+		*op = build2 (MEM_REF, TREE_TYPE (t),
+			      build1 (ADDR_EXPR, ptrtype, t),
+			      build_int_cst (ptrtype, 0));
+		TREE_THIS_VOLATILE (*op) = TREE_THIS_VOLATILE (t);
+	      }
+	    if (flag_checking)
+	      {
+		ao_ref_init (&ref, gimple_op (gsi_stmt (gsi), i));
+		gcc_assert (ao_ref_alias_ptr_type (&ref)
+			    == ao_ref_base_alias_ptr_type (&ref));
+		gcc_assert (ao_ref_alias_set (&ref)
+			    == ao_ref_base_alias_set (&ref));
+	      }
+	  }
+  if (n)
+    ipa_modref_analyze (cgraph_node::get (decl));
+}
+
 /* Processes function equality comparison.  */
 
 bool
@@ -850,6 +918,8 @@  sem_function::equals_private (sem_item *item)
 				false,
 				opt_for_fn (m_compared_func->decl,
 					    flag_strict_aliasing),
+				!opt_for_fn (m_compared_func->decl,
+					     flag_ipa_icf_alias_sets),
 				&refs_set,
 				&m_compared_func->refs_set);
   arg1 = DECL_ARGUMENTS (decl);
@@ -920,6 +990,32 @@  sem_function::equals_private (sem_item *item)
     if (!compare_phi_node (bb_sorted[i]->bb, m_compared_func->bb_sorted[i]->bb))
       return return_false_with_msg ("PHI node comparison returns false");
 
+  /* If necessary walk alias body and drop mase alias sets.  */
+  if (result && m_checker->m_mismatched_base_alias_set.length ())
+    {
+      hash_set <tree> mismatched_base_alias_set_ops;
+      /* Record mismatched operands.  */
+      for (unsigned i = 0; i < m_checker->m_mismatched_base_alias_set.length (); i++)
+	{
+	  tree op = m_checker->m_mismatched_base_alias_set [i];
+	  mismatched_base_alias_set_ops.add (op);
+	}
+      /* Drop in both bodies; it is not clear what body will win in the
+	 merging and thus process both.
+
+	 TODO: It is possible that merging will fail.
+	 In that case we lose alias info for no win.  This is bit hard to track
+	 because proving two function equal also enables merging of callers
+	 of the functions.
+	 
+	 We can also keep the body with alias set around for inlining,
+	 but again this requires quite careful tracking of the transitive
+	 closure of all merges.  */
+      drop_base_alias_sets (decl, mismatched_base_alias_set_ops);
+      drop_base_alias_sets (m_compared_func->decl,
+			    mismatched_base_alias_set_ops);
+    }
+
   return result;
 }
 
@@ -1461,9 +1557,12 @@  sem_function::hash_stmt (gimple *stmt, inchash::hash &hstate)
 		tree t = ao_ref_alias_ptr_type (&ref);
 		if (variably_modified_type_p (t, NULL_TREE))
 		  memory_access_types.safe_push (t);
-		t = ao_ref_base_alias_ptr_type (&ref);
-		if (variably_modified_type_p (t, NULL_TREE))
-		  memory_access_types.safe_push (t);
+		if (!flag_ipa_icf_alias_sets)
+		  {
+		    t = ao_ref_base_alias_ptr_type (&ref);
+		    if (variably_modified_type_p (t, NULL_TREE))
+		      memory_access_types.safe_push (t);
+		  }
 	      }
 	  }
 	/* Consider nocf_check attribute in hash as it affects code
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 155bf59fc53..b09a2fcb7a2 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -2117,6 +2117,30 @@  modref_summaries_lto::insert (struct cgraph_node *node, modref_summary_lto *)
   pop_cfun ();
 }
 
+/* Recompute summary after body was updated.  */
+
+void
+ipa_modref_analyze (struct cgraph_node *node)
+{
+  if (optimization_summaries
+      && optimization_summaries->get (node))
+    {
+      optimization_summaries->remove (node);
+      optimization_summaries->insert
+	      (node, optimization_summaries->get_create (node));
+    }
+  if (summaries && summaries->get (node))
+    {
+      summaries->remove (node);
+      summaries->insert (node, summaries->get_create (node));
+    }
+  if (summaries_lto && summaries_lto->get (node))
+    {
+      summaries_lto->remove (node);
+      summaries_lto->insert (node, summaries_lto->get_create (node));
+    }
+}
+
 /* Called when new clone is inserted to callgraph late.  */
 
 void
diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
index 7decabd5744..982c620468a 100644
--- a/gcc/ipa-modref.h
+++ b/gcc/ipa-modref.h
@@ -41,5 +41,6 @@  struct GTY(()) modref_summary
 modref_summary *get_modref_function_summary (cgraph_node *func);
 void ipa_modref_c_finalize ();
 void ipa_merge_modref_summary_after_inlining (cgraph_edge *e);
+void ipa_modref_analyze (cgraph_node *node);
 
 #endif
diff --git a/gcc/tree-ssa-alias-compare.h b/gcc/tree-ssa-alias-compare.h
index 0e8409a7565..13bfa3381b5 100644
--- a/gcc/tree-ssa-alias-compare.h
+++ b/gcc/tree-ssa-alias-compare.h
@@ -37,7 +37,7 @@  class ao_compare : public operand_compare
   int compare_ao_refs (ao_ref *ref1, ao_ref *ref2, bool lto_streaming_safe,
 		       bool tbaa);
   void hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa,
-		    inchash::hash &hstate);
+		    bool base_alias_set, inchash::hash &hstate);
 };
 
 #endif
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 5ebbb087285..52dd0055905 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -4169,11 +4169,14 @@  ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
   return flags;
 }
 
-/* Hash REF to HSTATE.  If LTO_STREAMING_SAFE do not use alias sets
+/* Hash REF to HSTATE. If TBAA is false do not hash info relevant
+   for type based alias anslysi.  If BASE_ALIS_SET is false
+   do not hash base alias set and the access path.
+   If LTO_STREAMING_SAFE do not use alias sets
    and canonical types.  */
 void
 ao_compare::hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa,
-			 inchash::hash &hstate)
+			 bool base_alias_set, inchash::hash &hstate)
 {
   tree base = ao_ref_base (ref);
   tree tbase = base;
@@ -4209,6 +4212,7 @@  ao_compare::hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa,
   if (!lto_streaming_safe && tbaa)
     {
       hstate.add_int (ao_ref_alias_set (ref));
-      hstate.add_int (ao_ref_base_alias_set (ref));
+      if (base_alias_set)
+	hstate.add_int (ao_ref_base_alias_set (ref));
     }
 }