Patchwork [RFC] Less TODO_remove_unused_locals

login
register
mail settings
Submitter Richard Guenther
Date Feb. 20, 2013, 1:13 p.m.
Message ID <alpine.LNX.2.00.1302201357090.3543@zhemvz.fhfr.qr>
Download mbox | patch
Permalink /patch/222078/
State New
Headers show

Comments

Richard Guenther - Feb. 20, 2013, 1:13 p.m.
On Wed, 20 Feb 2013, Richard Biener wrote:

> 
> Hunting for the "we're getting slower" bits I noticed that
> TODO_remove_unused_locals is a big part of execute_function_todo
> (and accounts for 1% of compile-time of ac.f90).
> The following patch removes most of the remove_unused_locals
> calls based on the fact that with anonymous SSA names now available
> we should never create new locals (wishful thinking of course ...)
> and the important places to remove unused stuff are driven by
> 1) avoid creating yet another copy of the unused stuff, thus do
> it before inlining, on the callee;  2) avoid pinning unused memory
> while we operate on other function bodies, thus, do it at the end
> of non-IPA pass pipelines
> 
> In the end this asks for more explicit placement and thus a
> real pass ... but the following should be enough as a RFC and
> be good enough for 4.8.
> 
> We keep doing remove_unused_locals after going into SSA
> and now after releasing unused SSA names (that should suffice
> to achieve 1) for early and IPA inlining and for 2) pre-IPA.
> post-IPA we perform it after IPA inline transform and at right
> before expanding to RTL (avoid expanding unused stack vars).
> 
> I'm scheduling a bootstrap & regtest.
> 
> Ok for trunk?

As requested, here is some statistics.  stage2 gcc/ compiled with
-fdump-statistics-stats, summed over all files with

   clear_unused_block_pointer ();

gives us before the patch (first number is the static pass number,
last is the pass name, the number of removed vars is second):

20 323409 ssa
23 131864 einline
25 29311 copyrename
26 2178 ccp
27 9771 forwprop
30 29 fre
31 60212 copyprop
33 13849 cddce
35 10 tailr
39 355 local-pure-const
40 4347 fnsplit
-----
subtotal 575335

52 46056 inline
59 10834 copyrename
61 3721 ccp
62 83 forwprop
66 826 fre
67 4821 copyprop
69 5017 vrp
70 24764 dce
73 2 ifcombine
74 662 phiopt
76 137 ch
80 2909 copyrename
81 212 dom
82 1472 phicprop
85 1322 dce
86 51 forwprop
87 4 phiopt
90 48 ccp
91 19 copyprop
95 6 pre
101 5 lim
102 883 copyprop
103 3082 dceloop
125 2 loopdone
129 589 vrp
131 49 dom
132 145 phicprop
133 316 cddce
137 5 forwprop
138 1 phiopt
142 458 copyrename
168 2955 optimized
--------
total 686791

and after the patch, adjusted as below (eh, I failed to notice
we schedule remove_unused_locals if TODO_cleanup_cfg did sth...):

20 323396 ssa
23 107755 einline
41 135260 release_ssa
------
subtotal 566411

52 47513 inline
168 61713 optimized
--------
total 675637

as expected the most unused locals are removed before final inlining.
I suppose the difference in (sub-)total is because patched we copy
less unused locals due to the extra remove-unused locals at
release_ssa time.  I can't explain the 13 difference for into-SSA.
I suppose I should also count the number of remaining locals ...

Scheduled for re-bootstrap / regtest.

Thanks,
Richard.

2013-02-20  Richard Biener  <rguenther@suse.de>

	* tree-call-cdce.c (tree_call_cdce): Do not remove unused locals.
	* tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Likewise.
	* tree-ssa-dce.c (perform_tree_ssa_dce): Likewise.
	* tree-ssa-copyrename.c (copy_rename_partition_coalesce): Do
	not return anything.
	(rename_ssa_copies): Do not remove unused locals.
	* tree-ssa-ccp.c (do_ssa_ccp): Likewise.
	* tree-ssanames.c (pass_release_ssa_names): Remove unused
	locals first.
	* passes.c (execute_function_todo): Do not schedule unused locals
	removal if cleanup_tree_cfg did something.
	* tree-ssa-live.c (remove_unused_locals): Dump statistics
	about the number of removed locals.

Index: gcc/tree-call-cdce.c
===================================================================
*** gcc/tree-call-cdce.c.orig	2013-02-20 12:49:24.000000000 +0100
--- gcc/tree-call-cdce.c	2013-02-20 12:51:00.528421709 +0100
*************** tree_call_cdce (void)
*** 898,908 ****
        /* As we introduced new control-flow we need to insert PHI-nodes
           for the call-clobbers of the remaining call.  */
        mark_virtual_operands_for_renaming (cfun);
!       return (TODO_update_ssa | TODO_cleanup_cfg | TODO_ggc_collect
!               | TODO_remove_unused_locals);
      }
!   else
!     return 0;
  }
  
  static bool
--- 898,907 ----
        /* As we introduced new control-flow we need to insert PHI-nodes
           for the call-clobbers of the remaining call.  */
        mark_virtual_operands_for_renaming (cfun);
!       return TODO_update_ssa;
      }
! 
!   return 0;
  }
  
  static bool
Index: gcc/tree-ssa-forwprop.c
===================================================================
*** gcc/tree-ssa-forwprop.c.orig	2013-02-20 12:49:24.000000000 +0100
--- gcc/tree-ssa-forwprop.c	2013-02-20 12:51:00.556422021 +0100
*************** ssa_forward_propagate_and_combine (void)
*** 2936,2942 ****
  		  && forward_propagate_addr_expr (lhs, rhs))
  		{
  		  release_defs (stmt);
- 		  todoflags |= TODO_remove_unused_locals;
  		  gsi_remove (&gsi, true);
  		}
  	      else
--- 2936,2941 ----
*************** ssa_forward_propagate_and_combine (void)
*** 2961,2967 ****
  							       off)))))
  		{
  		  release_defs (stmt);
- 		  todoflags |= TODO_remove_unused_locals;
  		  gsi_remove (&gsi, true);
  		}
  	      else if (is_gimple_min_invariant (rhs))
--- 2960,2965 ----
Index: gcc/tree-ssa-dce.c
===================================================================
*** gcc/tree-ssa-dce.c.orig	2013-02-20 12:49:24.000000000 +0100
--- gcc/tree-ssa-dce.c	2013-02-20 12:51:00.579422275 +0100
*************** perform_tree_ssa_dce (bool aggressive)
*** 1607,1616 ****
    free_edge_list (el);
  
    if (something_changed)
!     return (TODO_update_ssa | TODO_cleanup_cfg | TODO_ggc_collect
! 	    | TODO_remove_unused_locals);
!   else
!     return 0;
  }
  
  /* Pass entry points.  */
--- 1607,1614 ----
    free_edge_list (el);
  
    if (something_changed)
!     return TODO_update_ssa | TODO_cleanup_cfg;
!   return 0;
  }
  
  /* Pass entry points.  */
Index: gcc/tree-ssa-copyrename.c
===================================================================
*** gcc/tree-ssa-copyrename.c.orig	2013-02-20 12:49:24.000000000 +0100
--- gcc/tree-ssa-copyrename.c	2013-02-20 12:55:59.374738833 +0100
*************** static struct
*** 113,119 ****
  /* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
     Choose a representative for the partition, and send debug info to DEBUG.  */
  
! static bool
  copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
  {
    int p1, p2, p3;
--- 113,119 ----
  /* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
     Choose a representative for the partition, and send debug info to DEBUG.  */
  
! static void
  copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
  {
    int p1, p2, p3;
*************** copy_rename_partition_coalesce (var_map
*** 146,152 ****
      {
        if (debug)
  	fprintf (debug, " : Already coalesced.\n");
!       return false;
      }
  
    rep1 = partition_to_var (map, p1);
--- 146,152 ----
      {
        if (debug)
  	fprintf (debug, " : Already coalesced.\n");
!       return;
      }
  
    rep1 = partition_to_var (map, p1);
*************** copy_rename_partition_coalesce (var_map
*** 154,160 ****
    root1 = SSA_NAME_VAR (rep1);
    root2 = SSA_NAME_VAR (rep2);
    if (!root1 && !root2)
!     return false;
  
    /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
    abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
--- 154,160 ----
    root1 = SSA_NAME_VAR (rep1);
    root2 = SSA_NAME_VAR (rep2);
    if (!root1 && !root2)
!     return;
  
    /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
    abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
*************** copy_rename_partition_coalesce (var_map
*** 163,169 ****
      {
        if (debug)
  	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
!       return false;
      }
  
    /* Partitions already have the same root, simply merge them.  */
--- 163,169 ----
      {
        if (debug)
  	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
!       return;
      }
  
    /* Partitions already have the same root, simply merge them.  */
*************** copy_rename_partition_coalesce (var_map
*** 172,178 ****
        p1 = partition_union (map->var_partition, p1, p2);
        if (debug)
  	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
!       return false;
      }
  
    /* Never attempt to coalesce 2 different parameters.  */
--- 172,178 ----
        p1 = partition_union (map->var_partition, p1, p2);
        if (debug)
  	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
!       return;
      }
  
    /* Never attempt to coalesce 2 different parameters.  */
*************** copy_rename_partition_coalesce (var_map
*** 181,187 ****
      {
        if (debug)
          fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
!       return false;
      }
  
    if ((root1 && TREE_CODE (root1) == RESULT_DECL)
--- 181,187 ----
      {
        if (debug)
          fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
!       return;
      }
  
    if ((root1 && TREE_CODE (root1) == RESULT_DECL)
*************** copy_rename_partition_coalesce (var_map
*** 189,195 ****
      {
        if (debug)
          fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
!       return false;
      }
  
    ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
--- 189,195 ----
      {
        if (debug)
          fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
!       return;
      }
  
    ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
*************** copy_rename_partition_coalesce (var_map
*** 206,212 ****
  	{
  	  if (debug)
  	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
! 	  return false;
  	}
        else
  	ign2 = true;
--- 206,212 ----
  	{
  	  if (debug)
  	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
! 	  return;
  	}
        else
  	ign2 = true;
*************** copy_rename_partition_coalesce (var_map
*** 220,226 ****
  	{
  	  if (debug)
  	    fprintf (debug, " : 2 default defs. No coalesce.\n");
! 	  return false;
  	}
        else
          {
--- 220,226 ----
  	{
  	  if (debug)
  	    fprintf (debug, " : 2 default defs. No coalesce.\n");
! 	  return;
  	}
        else
          {
*************** copy_rename_partition_coalesce (var_map
*** 240,246 ****
      {
        if (debug)
  	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
!       return false;
      }
  
    /* Don't coalesce if the new chosen root variable would be read-only.
--- 240,246 ----
      {
        if (debug)
  	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
!       return;
      }
  
    /* Don't coalesce if the new chosen root variable would be read-only.
*************** copy_rename_partition_coalesce (var_map
*** 253,259 ****
      {
        if (debug)
  	fprintf (debug, " : Readonly variable.  No coalesce.\n");
!       return false;
      }
  
    /* Don't coalesce if the two variables aren't type compatible .  */
--- 253,259 ----
      {
        if (debug)
  	fprintf (debug, " : Readonly variable.  No coalesce.\n");
!       return;
      }
  
    /* Don't coalesce if the two variables aren't type compatible .  */
*************** copy_rename_partition_coalesce (var_map
*** 266,272 ****
      {
        if (debug)
  	fprintf (debug, " : Incompatible types.  No coalesce.\n");
!       return false;
      }
  
    /* Merge the two partitions.  */
--- 266,272 ----
      {
        if (debug)
  	fprintf (debug, " : Incompatible types.  No coalesce.\n");
!       return;
      }
  
    /* Merge the two partitions.  */
*************** copy_rename_partition_coalesce (var_map
*** 288,294 ****
  			  TDF_SLIM);
        fprintf (debug, "\n");
      }
-   return true;
  }
  
  
--- 288,293 ----
*************** rename_ssa_copies (void)
*** 308,314 ****
    gimple stmt, phi;
    unsigned x;
    FILE *debug;
-   bool updated = false;
  
    memset (&stats, 0, sizeof (stats));
  
--- 307,312 ----
*************** rename_ssa_copies (void)
*** 330,336 ****
  	      tree lhs = gimple_assign_lhs (stmt);
  	      tree rhs = gimple_assign_rhs1 (stmt);
  
! 	      updated |= copy_rename_partition_coalesce (map, lhs, rhs, debug);
  	    }
  	}
      }
--- 328,334 ----
  	      tree lhs = gimple_assign_lhs (stmt);
  	      tree rhs = gimple_assign_rhs1 (stmt);
  
! 	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
  	    }
  	}
      }
*************** rename_ssa_copies (void)
*** 358,365 ****
  	      {
  		tree arg = PHI_ARG_DEF (phi, i);
  		if (TREE_CODE (arg) == SSA_NAME)
! 		  updated |= copy_rename_partition_coalesce (map, res, arg,
! 							     debug);
  	      }
  	  /* Else if all arguments are in the same partition try to merge
  	     it with the result.  */
--- 356,363 ----
  	      {
  		tree arg = PHI_ARG_DEF (phi, i);
  		if (TREE_CODE (arg) == SSA_NAME)
! 		  copy_rename_partition_coalesce (map, res, arg,
! 						  debug);
  	      }
  	  /* Else if all arguments are in the same partition try to merge
  	     it with the result.  */
*************** rename_ssa_copies (void)
*** 390,398 ****
  		    }
  		}
  	      if (all_p_same == 1)
! 		updated |= copy_rename_partition_coalesce (map, res,
! 							   PHI_ARG_DEF (phi, 0),
! 							   debug);
  	    }
          }
      }
--- 388,396 ----
  		    }
  		}
  	      if (all_p_same == 1)
! 		copy_rename_partition_coalesce (map, res,
! 						PHI_ARG_DEF (phi, 0),
! 						debug);
  	    }
          }
      }
*************** rename_ssa_copies (void)
*** 426,432 ****
    statistics_counter_event (cfun, "copies coalesced",
  			    stats.coalesced);
    delete_var_map (map);
!   return updated ? TODO_remove_unused_locals : 0;
  }
  
  /* Return true if copy rename is to be performed.  */
--- 424,430 ----
    statistics_counter_event (cfun, "copies coalesced",
  			    stats.coalesced);
    delete_var_map (map);
!   return 0;
  }
  
  /* Return true if copy rename is to be performed.  */
Index: gcc/tree-ssa-ccp.c
===================================================================
*** gcc/tree-ssa-ccp.c.orig	2013-02-20 12:49:24.000000000 +0100
--- gcc/tree-ssa-ccp.c	2013-02-20 12:51:00.580422285 +0100
*************** do_ssa_ccp (void)
*** 2108,2114 ****
    ccp_initialize ();
    ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
    if (ccp_finalize ())
!     todo = (TODO_cleanup_cfg | TODO_update_ssa | TODO_remove_unused_locals);
    free_dominance_info (CDI_DOMINATORS);
    return todo;
  }
--- 2108,2114 ----
    ccp_initialize ();
    ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
    if (ccp_finalize ())
!     todo = (TODO_cleanup_cfg | TODO_update_ssa);
    free_dominance_info (CDI_DOMINATORS);
    return todo;
  }
Index: gcc/tree-ssanames.c
===================================================================
*** gcc/tree-ssanames.c.orig	2013-02-20 12:49:24.000000000 +0100
--- gcc/tree-ssanames.c	2013-02-20 12:51:29.583743873 +0100
*************** struct gimple_opt_pass pass_release_ssa_
*** 455,461 ****
    PROP_ssa,				/* properties_required */
    0,					/* properties_provided */
    0,					/* properties_destroyed */
!   0,					/* todo_flags_start */
!   0              			/* todo_flags_finish */
   }
  };
--- 455,461 ----
    PROP_ssa,				/* properties_required */
    0,					/* properties_provided */
    0,					/* properties_destroyed */
!   TODO_remove_unused_locals,		/* todo_flags_start */
!   0					/* todo_flags_finish */
   }
  };
Index: gcc/passes.c
===================================================================
*** gcc/passes.c.orig	2013-02-11 11:41:01.000000000 +0100
--- gcc/passes.c	2013-02-20 12:54:45.700917657 +0100
*************** execute_function_todo (void *data)
*** 1918,1927 ****
    /* Always cleanup the CFG before trying to update SSA.  */
    if (flags & TODO_cleanup_cfg)
      {
!       bool cleanup = cleanup_tree_cfg ();
! 
!       if (cleanup && (cfun->curr_properties & PROP_ssa))
! 	flags |= TODO_remove_unused_locals;
  
        /* When cleanup_tree_cfg merges consecutive blocks, it may
  	 perform some simplistic propagation when removing single
--- 1918,1924 ----
    /* Always cleanup the CFG before trying to update SSA.  */
    if (flags & TODO_cleanup_cfg)
      {
!       cleanup_tree_cfg ();
  
        /* When cleanup_tree_cfg merges consecutive blocks, it may
  	 perform some simplistic propagation when removing single
Index: gcc/tree-ssa-live.c
===================================================================
*** gcc/tree-ssa-live.c.orig	2013-01-11 10:54:52.000000000 +0100
--- gcc/tree-ssa-live.c	2013-02-20 12:55:30.066411897 +0100
*************** remove_unused_locals (void)
*** 889,895 ****
        dstidx++;
      }
    if (dstidx != num)
!     cfun->local_decls->truncate (dstidx);
  
    remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
    clear_unused_block_pointer ();
--- 889,898 ----
        dstidx++;
      }
    if (dstidx != num)
!     {
!       statistics_counter_event (cfun, "unused VAR_DECLs removed", num - dstidx);
!       cfun->local_decls->truncate (dstidx);
!     }
  
    remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
    clear_unused_block_pointer ();
Jan Hubicka - Feb. 21, 2013, 10:16 a.m.
>> Hunting for the "we're getting slower" bits I noticed that
>> TODO_remove_unused_locals is a big part of execute_function_todo
>> (and accounts for 1% of compile-time of ac.f90).
>> The following patch removes most of the remove_unused_locals
>> calls based on the fact that with anonymous SSA names now available
>> we should never create new locals (wishful thinking of course ...)

We still can kill references to existing ones.

>> and the important places to remove unused stuff are driven by
>> 1) avoid creating yet another copy of the unused stuff, thus do
>> it before inlining, on the callee;  2) avoid pinning unused memory
>> while we operate on other function bodies, thus, do it at the end
>> of non-IPA pass pipelines
>>
>> In the end this asks for more explicit placement and thus a
>> real pass ... but the following should be enough as a RFC and
>> be good enough for 4.8.

Yes, it looks good to me.  I introduced this remove_unused_locals stuff
and I was aware it is executed quite too often. I meant to get some  
data on this
but never got around that and moreover the number of execution has increased
over the years as we increased number of passes...

My recollection is that at that time the pass also did dropping of  
some flags (ADDRESSABLE?) promoting into Gimple registers.  This was  
motivation to keep it working often since it was code quality thing  
too. I suppose this is now done elsewhere.

Honza
Richard Guenther - Feb. 21, 2013, 10:29 a.m.
On Thu, 21 Feb 2013, Jan Hubicka wrote:

> 
> > > Hunting for the "we're getting slower" bits I noticed that
> > > TODO_remove_unused_locals is a big part of execute_function_todo
> > > (and accounts for 1% of compile-time of ac.f90).
> > > The following patch removes most of the remove_unused_locals
> > > calls based on the fact that with anonymous SSA names now available
> > > we should never create new locals (wishful thinking of course ...)
> 
> We still can kill references to existing ones.
> 
> > > and the important places to remove unused stuff are driven by
> > > 1) avoid creating yet another copy of the unused stuff, thus do
> > > it before inlining, on the callee;  2) avoid pinning unused memory
> > > while we operate on other function bodies, thus, do it at the end
> > > of non-IPA pass pipelines
> > > 
> > > In the end this asks for more explicit placement and thus a
> > > real pass ... but the following should be enough as a RFC and
> > > be good enough for 4.8.
> 
> Yes, it looks good to me.  I introduced this remove_unused_locals stuff
> and I was aware it is executed quite too often. I meant to get some data on
> this
> but never got around that and moreover the number of execution has increased
> over the years as we increased number of passes...
> 
> My recollection is that at that time the pass also did dropping of some flags
> (ADDRESSABLE?) promoting into Gimple registers.  This was motivation to keep
> it working often since it was code quality thing too. I suppose this is now
> done elsewhere.

Yes, that's done with TODO_update_address_taken which is done in a more
controlled fashion.

Richard.

Patch

Index: gcc/tree-ssa-live.c
===================================================================
--- gcc/tree-ssa-live.c (revision 196170)
+++ gcc/tree-ssa-live.c (working copy)
@@ -889,7 +889,10 @@  remove_unused_locals (void)
       dstidx++;
     }
   if (dstidx != num)
-    cfun->local_decls->truncate (dstidx);
+    {
+      statistics_counter_event (cfun, "unused VAR_DECLs removed", num - 
dstidx);
+      cfun->local_decls->truncate (dstidx);
+    }
 
   remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));