diff mbox

[3/5] IPA ICF pass

Message ID 20140926204654.GD26922@atrey.karlin.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka Sept. 26, 2014, 8:46 p.m. UTC
Hi,
this is on ipa-icf-gimple.c

@@ -2827,11 +2829,19 @@ cgraph_node::verify_node (void)
 			    {
 			      if (verify_edge_corresponds_to_fndecl (e, decl))
 				{
-				  error ("edge points to wrong declaration:");
-				  debug_tree (e->callee->decl);
-				  fprintf (stderr," Instead of:");
-				  debug_tree (decl);
-				  error_found = true;
+				  /* The edge can be redirected in WPA by IPA ICF.
+				     Following check really ensures that it's
+				     not the case.  */
+
+				  cgraph_node *current_node = cgraph_node::get (decl);
+				  if (!current_node || !current_node->icf_merged)

I would move this into verify_edge_corresponds_to_fndecl.

Comments

Martin Liška Oct. 11, 2014, 12:05 a.m. UTC | #1
On 09/26/2014 09:46 PM, Jan Hubicka wrote:
> Hi,
> this is on ipa-icf-gimple.c
> 
> @@ -2827,11 +2829,19 @@ cgraph_node::verify_node (void)
>  			    {
>  			      if (verify_edge_corresponds_to_fndecl (e, decl))
>  				{
> -				  error ("edge points to wrong declaration:");
> -				  debug_tree (e->callee->decl);
> -				  fprintf (stderr," Instead of:");
> -				  debug_tree (decl);
> -				  error_found = true;
> +				  /* The edge can be redirected in WPA by IPA ICF.
> +				     Following check really ensures that it's
> +				     not the case.  */
> +
> +				  cgraph_node *current_node = cgraph_node::get (decl);
> +				  if (!current_node || !current_node->icf_merged)
> 
> I would move this into verify_edge_corresponds_to_fndecl.
> 
> diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
> new file mode 100644
> index 0000000..7031eaa
> --- /dev/null
> +++ b/gcc/ipa-icf-gimple.c
> @@ -0,0 +1,384 @@
> +/* Interprocedural Identical Code Folding pass
> +   Copyright (C) 2014 Free Software Foundation, Inc.
> +
> +   Contributed by Jan Hubicka <hubicka@ucw.cz> and Martin Liska <mliska@suse.cz>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> 
> Please add toplevel comment about what the code does and how to use it.
> 
> +namespace ipa_icf {
> +
> +/* Basic block equivalence comparison function that returns true if
> +   basic blocks BB1 and BB2 (from functions FUNC1 and FUNC2) correspond.  */
> ... to each other?
> I would add short comment that as comparsion goes you build voclabulary
> of equivalences of variables/ssanames etc.
> So people reading the code do not get lost at very beggining.
> 
> +
> +bool
> +func_checker::compare_bb (sem_bb *bb1, sem_bb *bb2)
> +{
> +  unsigned i;
> +  gimple_stmt_iterator gsi1, gsi2;
> +  gimple s1, s2;
> +
> +  if (bb1->nondbg_stmt_count != bb2->nondbg_stmt_count
> +      || bb1->edge_count != bb2->edge_count)
> +    return RETURN_FALSE ();
> 
> The UPPERCASE looks ugly.  I see that RETURN_FALSE is a warpper for return_false_with_msg
> that outputs line and file information.
> 
> I would make it lowercase even if it is macro. You may consider using
> CXX_MEM_STAT_INFO style default argument to avoid function macro completely.
> Probably not big win given that it won't save you from preprocesor mess.
> +
> +  gsi1 = gsi_start_bb (bb1->bb);
> +  gsi2 = gsi_start_bb (bb2->bb);
> +
> +  for (i = 0; i < bb1->nondbg_stmt_count; i++)
> +    {
> +      if (is_gimple_debug (gsi_stmt (gsi1)))
> +	gsi_next_nondebug (&gsi1);
> +
> +      if (is_gimple_debug (gsi_stmt (gsi2)))
> +	gsi_next_nondebug (&gsi2);
> +
> +      s1 = gsi_stmt (gsi1);
> +      s2 = gsi_stmt (gsi2);
> +
> +      if (gimple_code (s1) != gimple_code (s2))
> +	return RETURN_FALSE_WITH_MSG ("gimple codes are different");
> 
> I think you need to compare EH here.  Consider case where one unit
> is compiled with -fno-exception and thus all EH regions are removed,
> while other function has EH regions in it.  Those are not equivalent.
> 
> EH region is obtained by lookup_stmt_eh and then you need to comapre
> them for match as you do with gimple_resx_regoin.
> 
> +  t1 = gimple_call_fndecl (s1);
> +  t2 = gimple_call_fndecl (s2);
> +
> +  /* Function pointer variables are not supported yet.  */
> 
> They seems to be, compare_operand seems just right.
> 
> +
> +/* Verifies for given GIMPLEs S1 and S2 that
> +   label statements are semantically equivalent.  */
> +
> +bool
> +func_checker::compare_gimple_label (gimple g1, gimple g2)
> +{
> +  if (m_ignore_labels)
> +    return true;
> +
> +  tree t1 = gimple_label_label (g1);
> +  tree t2 = gimple_label_label (g2);
> +
> +  return compare_tree_ssa_label (t1, t2);
> +}
> 
> I would expect the main BB loop to record BB in which label belongs to
> and the BB assciatio neing checked here.
> Otherwise I do not see how switch statements are compared to not have
> different permutations of targets. Also note that one BB may have
> multiple labels in them and they are equivalent.
> 
> Also I would punt on occurence of FORCED_LABEL. Those are tricky as they
> may be passed around and compared for address and no one really defines
> what should happen.  Better to avoid those.

Hi.

I will remove this support in the pass.

> 
> +
> +/* Verifies for given GIMPLEs S1 and S2 that
> +   switch statements are semantically equivalent.  */
> +
> +bool
> +func_checker::compare_gimple_switch (gimple g1, gimple g2)
> +{
> +  unsigned lsize1, lsize2, i;
> +  tree t1, t2, low1, low2, high1, high2;
> +
> +  lsize1 = gimple_switch_num_labels (g1);
> +  lsize2 = gimple_switch_num_labels (g2);
> +
> +  if (lsize1 != lsize2)
> +    return false;
> +
> +  t1 = gimple_switch_index (g1);
> +  t2 = gimple_switch_index (g2);
> +
> +  if (TREE_CODE (t1) != SSA_NAME || TREE_CODE(t2) != SSA_NAME)
> +    return false;
> 
> Why non-SSA_NAMES are excluded? I see that constants should be optimized out,
> but I do not see a need for specific test here.
> +
> +  if (!compare_operand (t1, t2))
> +    return false;
> +
> +  for (i = 0; i < lsize1; i++)
> +    {
> +      low1 = CASE_LOW (gimple_switch_label (g1, i));
> +      low2 = CASE_LOW (gimple_switch_label (g2, i));
> +
> +      if ((low1 != NULL) != (low2 != NULL)
> +	  || (low1 && low2 && TREE_INT_CST_LOW (low1) != TREE_INT_CST_LOW (low2)))
> +	return false;
> 
> You want to compare integers for equivality.  Just use tree_int_cst_equal.
> +
> +      high1 = CASE_HIGH (gimple_switch_label (g1, i));
> +      high2 = CASE_HIGH (gimple_switch_label (g2, i));
> +
> +      if ((high1 != NULL) != (high2 != NULL)
> +	  || (high1 && high2
> +	      && TREE_INT_CST_LOW (high1) != TREE_INT_CST_LOW (high2)))
> +	return false;
> 
> Same here.
> +    }
> +
> +  return true;
> +}
> +
> +/* Verifies for given GIMPLEs S1 and S2 that
> +   return statements are semantically equivalent.  */
> +
> +bool
> +func_checker::compare_gimple_return (gimple g1, gimple g2)
> +{
> +  tree t1, t2;
> +
> +  t1 = gimple_return_retval (g1);
> +  t2 = gimple_return_retval (g2);
> +
> +  /* Void return type.  */
> +  if (t1 == NULL && t2 == NULL)
> +    return true;
> +  else
> +    return compare_operand (t1, t2);
> 
> Do you check somewhere DECL_BY_REFERENCE (DECL_RESULT (struct function))? Those needs to match, too.
> Perhaps it is always the case that SSA form differs, but I would test it.
> +}
> +
> +/* Verifies for given GIMPLEs S1 and S2 that
> +   goto statements are semantically equivalent.  */
> +
> +bool
> +func_checker::compare_gimple_goto (gimple g1, gimple g2)
> +{
> +  tree dest1, dest2;
> +
> +  dest1 = gimple_goto_dest (g1);
> +  dest2 = gimple_goto_dest (g2);
> +
> +  if (TREE_CODE (dest1) != TREE_CODE (dest2) || TREE_CODE (dest1) != SSA_NAME)
> +    return false;
> +
> +  return compare_operand (dest1, dest2);
> 
> You probably need to care only about indirect gotos, the direct ones are checked by
> CFG compare.  So is the condtional jump.

It looks that this code is visited quite rare.

> +
> +/* Verifies for given GIMPLEs S1 and S2 that ASM statements are equivalent.
> +   For the beginning, the pass only supports equality for
> +   '__asm__ __volatile__ ("", "", "", "memory")'.  */
> +
> +bool
> +func_checker::compare_gimple_asm (gimple g1, gimple g2)
> +{
> +  if (gimple_asm_volatile_p (g1) != gimple_asm_volatile_p (g2))
> +    return false;
> +
> +  if (gimple_asm_ninputs (g1) || gimple_asm_ninputs (g2))
> +    return false;
> +
> +  if (gimple_asm_noutputs (g1) || gimple_asm_noutputs (g2))
> +    return false;
> +
> +  if (gimple_asm_nlabels (g1) || gimple_asm_nlabels (g2))
> +    return false;
> +
> +  if (gimple_asm_nclobbers (g1) != gimple_asm_nclobbers (g2))
> +    return false;
> +
> +  for (unsigned i = 0; i < gimple_asm_nclobbers (g1); i++)
> +    {
> +      tree clobber1 = TREE_VALUE (gimple_asm_clobber_op (g1, i));
> +      tree clobber2 = TREE_VALUE (gimple_asm_clobber_op (g2, i));
> +
> +      if (!operand_equal_p (clobber1, clobber2, OEP_ONLY_CONST))
> +	return false;
> +    }
> +
> 
> Even asm statements with no inputs or outputs can differ by the actual
> asm statement. Compare it too.
> 
> Comparing inputs/outputs/labels should be very easy to do.
> 
> Compare all gimple_asm_n* for equivalency.

This makes fully sense, but I don't understand what kind of operands do you mean?

> At the end walk operands and watch the case they are TREE_LIST.
> THen compare TREE_VALUE (op) of the list for operand_equal_p
> and TREE_VALUE (TREE_PURPOSE (op)) for equivalency
> (those are the constraints)
> 
> If they are not (clobbers are not, those are just strings), operand_equal_p
> should do.
> 
> +  return true;
> +}
> +
> +} // ipa_icf namespace
> 
> Otherwise I think ipa-gimple-icf is quite fine now.
> Please send updated version and I think it can go to mainline before the actual ipa-icf.

I renamed both files and put them to a newly created namespace ipa_icf_gimple.

Thank you,
Martin
Jan Hubicka Oct. 11, 2014, 8:23 a.m. UTC | #2
> > +/* Verifies for given GIMPLEs S1 and S2 that
> > +   goto statements are semantically equivalent.  */
> > +
> > +bool
> > +func_checker::compare_gimple_goto (gimple g1, gimple g2)
> > +{
> > +  tree dest1, dest2;
> > +
> > +  dest1 = gimple_goto_dest (g1);
> > +  dest2 = gimple_goto_dest (g2);
> > +
> > +  if (TREE_CODE (dest1) != TREE_CODE (dest2) || TREE_CODE (dest1) != SSA_NAME)
> > +    return false;
> > +
> > +  return compare_operand (dest1, dest2);
> > 
> > You probably need to care only about indirect gotos, the direct ones are checked by
> > CFG compare.  So is the condtional jump.
> 
> It looks that this code is visited quite rare.

Hmm, perhaps it is called only for indirect calls, because all others are not represented as statements.
> 
> > +
> > +/* Verifies for given GIMPLEs S1 and S2 that ASM statements are equivalent.
> > +   For the beginning, the pass only supports equality for
> > +   '__asm__ __volatile__ ("", "", "", "memory")'.  */
> > +
> > +bool
> > +func_checker::compare_gimple_asm (gimple g1, gimple g2)
> > +{
> > +  if (gimple_asm_volatile_p (g1) != gimple_asm_volatile_p (g2))
> > +    return false;
> > +
> > +  if (gimple_asm_ninputs (g1) || gimple_asm_ninputs (g2))
> > +    return false;
> > +
> > +  if (gimple_asm_noutputs (g1) || gimple_asm_noutputs (g2))
> > +    return false;
> > +
> > +  if (gimple_asm_nlabels (g1) || gimple_asm_nlabels (g2))
> > +    return false;
> > +
> > +  if (gimple_asm_nclobbers (g1) != gimple_asm_nclobbers (g2))
> > +    return false;
> > +
> > +  for (unsigned i = 0; i < gimple_asm_nclobbers (g1); i++)
> > +    {
> > +      tree clobber1 = TREE_VALUE (gimple_asm_clobber_op (g1, i));
> > +      tree clobber2 = TREE_VALUE (gimple_asm_clobber_op (g2, i));
> > +
> > +      if (!operand_equal_p (clobber1, clobber2, OEP_ONLY_CONST))
> > +	return false;
> > +    }
> > +
> > 
> > Even asm statements with no inputs or outputs can differ by the actual
> > asm statement. Compare it too.
> > 
> > Comparing inputs/outputs/labels should be very easy to do.
> > 
> > Compare all gimple_asm_n* for equivalency.
> 
> This makes fully sense, but I don't understand what kind of operands do you mean?

You can look some other code dealing with gimple asm statements.  You can just compare
gimple_op for 0.... gimple_num_ops and be ready to deal with TREE_LIST as described
bellow. 

Honza
> 
> > At the end walk operands and watch the case they are TREE_LIST.
> > THen compare TREE_VALUE (op) of the list for operand_equal_p
> > and TREE_VALUE (TREE_PURPOSE (op)) for equivalency
> > (those are the constraints)
> > 
> > If they are not (clobbers are not, those are just strings), operand_equal_p
> > should do.
> > 
> > +  return true;
> > +}
> > +
> > +} // ipa_icf namespace
> > 
> > Otherwise I think ipa-gimple-icf is quite fine now.
> > Please send updated version and I think it can go to mainline before the actual ipa-icf.
> 
> I renamed both files and put them to a newly created namespace ipa_icf_gimple.
> 
> Thank you,
> Martin
diff mbox

Patch

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
new file mode 100644
index 0000000..7031eaa
--- /dev/null
+++ b/gcc/ipa-icf-gimple.c
@@ -0,0 +1,384 @@ 
+/* Interprocedural Identical Code Folding pass
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   Contributed by Jan Hubicka <hubicka@ucw.cz> and Martin Liska <mliska@suse.cz>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */

Please add toplevel comment about what the code does and how to use it.

+namespace ipa_icf {
+
+/* Basic block equivalence comparison function that returns true if
+   basic blocks BB1 and BB2 (from functions FUNC1 and FUNC2) correspond.  */
... to each other?
I would add short comment that as comparsion goes you build voclabulary
of equivalences of variables/ssanames etc.
So people reading the code do not get lost at very beggining.

+
+bool
+func_checker::compare_bb (sem_bb *bb1, sem_bb *bb2)
+{
+  unsigned i;
+  gimple_stmt_iterator gsi1, gsi2;
+  gimple s1, s2;
+
+  if (bb1->nondbg_stmt_count != bb2->nondbg_stmt_count
+      || bb1->edge_count != bb2->edge_count)
+    return RETURN_FALSE ();

The UPPERCASE looks ugly.  I see that RETURN_FALSE is a warpper for return_false_with_msg
that outputs line and file information.

I would make it lowercase even if it is macro. You may consider using
CXX_MEM_STAT_INFO style default argument to avoid function macro completely.
Probably not big win given that it won't save you from preprocesor mess.
+
+  gsi1 = gsi_start_bb (bb1->bb);
+  gsi2 = gsi_start_bb (bb2->bb);
+
+  for (i = 0; i < bb1->nondbg_stmt_count; i++)
+    {
+      if (is_gimple_debug (gsi_stmt (gsi1)))
+	gsi_next_nondebug (&gsi1);
+
+      if (is_gimple_debug (gsi_stmt (gsi2)))
+	gsi_next_nondebug (&gsi2);
+
+      s1 = gsi_stmt (gsi1);
+      s2 = gsi_stmt (gsi2);
+
+      if (gimple_code (s1) != gimple_code (s2))
+	return RETURN_FALSE_WITH_MSG ("gimple codes are different");

I think you need to compare EH here.  Consider case where one unit
is compiled with -fno-exception and thus all EH regions are removed,
while other function has EH regions in it.  Those are not equivalent.

EH region is obtained by lookup_stmt_eh and then you need to comapre
them for match as you do with gimple_resx_regoin.

+  t1 = gimple_call_fndecl (s1);
+  t2 = gimple_call_fndecl (s2);
+
+  /* Function pointer variables are not supported yet.  */

They seems to be, compare_operand seems just right.

+
+/* Verifies for given GIMPLEs S1 and S2 that
+   label statements are semantically equivalent.  */
+
+bool
+func_checker::compare_gimple_label (gimple g1, gimple g2)
+{
+  if (m_ignore_labels)
+    return true;
+
+  tree t1 = gimple_label_label (g1);
+  tree t2 = gimple_label_label (g2);
+
+  return compare_tree_ssa_label (t1, t2);
+}

I would expect the main BB loop to record BB in which label belongs to
and the BB assciatio neing checked here.
Otherwise I do not see how switch statements are compared to not have
different permutations of targets. Also note that one BB may have
multiple labels in them and they are equivalent.

Also I would punt on occurence of FORCED_LABEL. Those are tricky as they
may be passed around and compared for address and no one really defines
what should happen.  Better to avoid those.

+
+/* Verifies for given GIMPLEs S1 and S2 that
+   switch statements are semantically equivalent.  */
+
+bool
+func_checker::compare_gimple_switch (gimple g1, gimple g2)
+{
+  unsigned lsize1, lsize2, i;
+  tree t1, t2, low1, low2, high1, high2;
+
+  lsize1 = gimple_switch_num_labels (g1);
+  lsize2 = gimple_switch_num_labels (g2);
+
+  if (lsize1 != lsize2)
+    return false;
+
+  t1 = gimple_switch_index (g1);
+  t2 = gimple_switch_index (g2);
+
+  if (TREE_CODE (t1) != SSA_NAME || TREE_CODE(t2) != SSA_NAME)
+    return false;

Why non-SSA_NAMES are excluded? I see that constants should be optimized out,
but I do not see a need for specific test here.
+
+  if (!compare_operand (t1, t2))
+    return false;
+
+  for (i = 0; i < lsize1; i++)
+    {
+      low1 = CASE_LOW (gimple_switch_label (g1, i));
+      low2 = CASE_LOW (gimple_switch_label (g2, i));
+
+      if ((low1 != NULL) != (low2 != NULL)
+	  || (low1 && low2 && TREE_INT_CST_LOW (low1) != TREE_INT_CST_LOW (low2)))
+	return false;

You want to compare integers for equivality.  Just use tree_int_cst_equal.
+
+      high1 = CASE_HIGH (gimple_switch_label (g1, i));
+      high2 = CASE_HIGH (gimple_switch_label (g2, i));
+
+      if ((high1 != NULL) != (high2 != NULL)
+	  || (high1 && high2
+	      && TREE_INT_CST_LOW (high1) != TREE_INT_CST_LOW (high2)))
+	return false;

Same here.
+    }
+
+  return true;
+}
+
+/* Verifies for given GIMPLEs S1 and S2 that
+   return statements are semantically equivalent.  */
+
+bool
+func_checker::compare_gimple_return (gimple g1, gimple g2)
+{
+  tree t1, t2;
+
+  t1 = gimple_return_retval (g1);
+  t2 = gimple_return_retval (g2);
+
+  /* Void return type.  */
+  if (t1 == NULL && t2 == NULL)
+    return true;
+  else
+    return compare_operand (t1, t2);

Do you check somewhere DECL_BY_REFERENCE (DECL_RESULT (struct function))? Those needs to match, too.
Perhaps it is always the case that SSA form differs, but I would test it.
+}
+
+/* Verifies for given GIMPLEs S1 and S2 that
+   goto statements are semantically equivalent.  */
+
+bool
+func_checker::compare_gimple_goto (gimple g1, gimple g2)
+{
+  tree dest1, dest2;
+
+  dest1 = gimple_goto_dest (g1);
+  dest2 = gimple_goto_dest (g2);
+
+  if (TREE_CODE (dest1) != TREE_CODE (dest2) || TREE_CODE (dest1) != SSA_NAME)
+    return false;
+
+  return compare_operand (dest1, dest2);

You probably need to care only about indirect gotos, the direct ones are checked by
CFG compare.  So is the condtional jump.
+
+/* Verifies for given GIMPLEs S1 and S2 that ASM statements are equivalent.
+   For the beginning, the pass only supports equality for
+   '__asm__ __volatile__ ("", "", "", "memory")'.  */
+
+bool
+func_checker::compare_gimple_asm (gimple g1, gimple g2)
+{
+  if (gimple_asm_volatile_p (g1) != gimple_asm_volatile_p (g2))
+    return false;
+
+  if (gimple_asm_ninputs (g1) || gimple_asm_ninputs (g2))
+    return false;
+
+  if (gimple_asm_noutputs (g1) || gimple_asm_noutputs (g2))
+    return false;
+
+  if (gimple_asm_nlabels (g1) || gimple_asm_nlabels (g2))
+    return false;
+
+  if (gimple_asm_nclobbers (g1) != gimple_asm_nclobbers (g2))
+    return false;
+
+  for (unsigned i = 0; i < gimple_asm_nclobbers (g1); i++)
+    {
+      tree clobber1 = TREE_VALUE (gimple_asm_clobber_op (g1, i));
+      tree clobber2 = TREE_VALUE (gimple_asm_clobber_op (g2, i));
+
+      if (!operand_equal_p (clobber1, clobber2, OEP_ONLY_CONST))
+	return false;
+    }
+

Even asm statements with no inputs or outputs can differ by the actual
asm statement. Compare it too.

Comparing inputs/outputs/labels should be very easy to do.

Compare all gimple_asm_n* for equivalency.
At the end walk operands and watch the case they are TREE_LIST.
THen compare TREE_VALUE (op) of the list for operand_equal_p
and TREE_VALUE (TREE_PURPOSE (op)) for equivalency
(those are the constraints)

If they are not (clobbers are not, those are just strings), operand_equal_p
should do.

+  return true;
+}
+
+} // ipa_icf namespace

Otherwise I think ipa-gimple-icf is quite fine now.
Please send updated version and I think it can go to mainline before the actual ipa-icf.