Speed up find_loc_in_1pdv (PR debug/41371)

Submitter Alexandre Oliva
Date June 10, 2010, 10:25 a.m.
Message ID <or39wvp4jx.fsf@livre.localdomain>
Permalink /patch/55194/
State New

Comments

Alexandre Oliva - June 10, 2010, 10:25 a.m.
On Jun  4, 2010, Richard Guenther <rguenther@suse.de> wrote:

> On Fri, 4 Jun 2010, Jakub Jelinek wrote:
>> Hi!
>> 
>> This is a patch from Alex, I've just bootstrapped/regtested it on
>> x86_64-linux and i686-linux.  find_loc_in_1pdv doesn't mark
>> the original VALUE as VALUE_RECURSED_INTO, so the recursion is deeper than
>> needed and in cases like in the PR41371 testcases that's horribly expensive.
>> E.g. on the wine testcase this patch speeds up release checking trunk gcc
>> compilation from more than 7 minutes to 56 seconds and similar improvements
>> can be seen on the other testcases.
>> 
>> Ok for trunk?

> Ok.  Can you also backport this to the 4.5 branch?

Here's a patch that should speed things up further.  Since we're
operating on a star-canonicalized variable set, we can refrain from
visiting non-canonical VALUEs, which only point back to the canonical
value.  This saves a *lot* of pointless looking up and recursing.

Regstrapped on x86_64-linux-gnu trunk, verified that it returns the same
values as the version Jakub checked in.  Ok for trunk?  Ok for 4.5?

Richard Guenther - June 10, 2010, 10:29 a.m.
On Thu, 10 Jun 2010, Alexandre Oliva wrote:

> Here's a patch that should speed things up further.  Since we're
> operating on a star-canonicalized variable set, we can refrain from
> visiting non-canonical VALUEs, which only point back to the canonical
> value.  This saves a *lot* of pointless looking up and recursing.
> 
> Regstrapped on x86_64-linux-gnu trunk, verified that it returns the same
> values as the version Jakub checked in.  Ok for trunk?  Ok for 4.5?

There's some odd spacing here:

+      dv = dv_from_value (node->loc);
+      rvar = (variable)        htab_find_with_hash (vars, dv, dv_htab_hash (dv));

Ok for trunk and the branch.  Do you have updated timings for the
testcases in the PR?

Thanks,
Richard.
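
For readers following the thread, the lookup strategy described above
(skip non-canonical VALUEs, hop only to the canonical one) can be
sketched outside of GCC roughly as follows.  This is a minimal
illustration under stated assumptions, not the code in var-tracking.c:
struct value, struct loc, the uid ordering ("lower uid is more
canonical"), find_reg and demo_lookup are all hypothetical stand-ins
for the real VALUE rtx, location_chain and canon_value_cmp, and the
tail call is written as a loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's VALUE rtx and
   location_chain; these are NOT the real var-tracking.c types.  */
enum loc_kind { LOC_REG, LOC_VALUE };

struct value;

struct loc
{
  enum loc_kind kind;
  int regno;		/* Meaningful when kind == LOC_REG.  */
  struct value *val;	/* Meaningful when kind == LOC_VALUE.  */
  struct loc *next;
};

struct value
{
  int uid;		/* Assumption: lower uid == more canonical.  */
  struct loc *chain;	/* Location list of this value.  */
};

/* Return the node for register REGNO in the equivalence set of V, or
   NULL.  Assumes star-canonical form: the canonical value lists every
   sibling and every real location, while each non-canonical value
   lists only the canonical value.  */
static struct loc *
find_reg (int regno, struct value *v)
{
  while (v)
    {
      struct value *canon = NULL;
      struct loc *node;

      for (node = v->chain; node; node = node->next)
	{
	  if (node->kind == LOC_REG)
	    {
	      if (node->regno == regno)
		return node;
	      continue;
	    }

	  /* A VALUE that is not more canonical than V would only
	     point back to V, so there is nothing to gain by
	     visiting it.  */
	  if (node->val->uid >= v->uid)
	    continue;

	  /* Otherwise V is non-canonical, and this must be the only
	     entry in its chain.  Hop to the canonical value.  */
	  assert (node == v->chain && !node->next);
	  canon = node->val;
	  break;
	}

      v = canon;	/* NULL ends the search.  */
    }

  return NULL;
}

/* Build the set {v1, v2, v3} with v1 canonical and register 5 as the
   only real location, then search for REGNO starting from the value
   whose uid is START_UID (1, 2 or 3).  Returns 1 if found, else 0.  */
int
demo_lookup (int start_uid, int regno)
{
  struct value v[3] = { { 1, NULL }, { 2, NULL }, { 3, NULL } };
  struct loc reg5 = { LOC_REG, 5, NULL, NULL };
  struct loc to_v3 = { LOC_VALUE, 0, &v[2], &reg5 };
  struct loc to_v2 = { LOC_VALUE, 0, &v[1], &to_v3 };
  struct loc v2_to_v1 = { LOC_VALUE, 0, &v[0], NULL };
  struct loc v3_to_v1 = { LOC_VALUE, 0, &v[0], NULL };

  v[0].chain = &to_v2;
  v[1].chain = &v2_to_v1;
  v[2].chain = &v3_to_v1;

  return find_reg (regno, &v[start_uid - 1]) != NULL;
}
```

In this toy model a search that starts at a non-canonical value makes
exactly one hop, to the canonical value, and never descends into a
sibling.  That is the saving the message above attributes to skipping
non-canonical VALUEs.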

Patch

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/41371
	* var-tracking.c (find_loc_in_1pdv): Remove recursion, only
	tail-recurse into canonical node.  Fast-forward over
	non-canonical VALUEs.

Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c.orig	2010-06-09 02:43:28.000000000 -0300
+++ gcc/var-tracking.c	2010-06-09 03:03:51.000000000 -0300
@@ -2479,125 +2479,81 @@  dv_changed_p (decl_or_value dv)
 
 /* Return a location list node whose loc is rtx_equal to LOC, in the
    location list of a one-part variable or value VAR, or in that of
-   any values recursively mentioned in the location lists.  */
+   any values recursively mentioned in the location lists.  VARS must
+   be in star-canonical form.  */
 
 static location_chain
 find_loc_in_1pdv (rtx loc, variable var, htab_t vars)
 {
   location_chain node;
   enum rtx_code loc_code;
-  location_chain ret = NULL;
-  int unmark_self = 0;
-#ifdef ENABLE_CHECKING
-  static int mark_count;
-#endif
 
   if (!var)
-    return ret;
+    return NULL;
 
 #ifdef ENABLE_CHECKING
   gcc_assert (dv_onepart_p (var->dv));
 #endif
 
   if (!var->n_var_parts)
-    return ret;
+    return NULL;
 
 #ifdef ENABLE_CHECKING
   gcc_assert (var->var_part[0].offset == 0);
+  gcc_assert (loc != dv_as_opaque (var->dv));
 #endif
 
   loc_code = GET_CODE (loc);
   for (node = var->var_part[0].loc_chain; node; node = node->next)
     {
+      decl_or_value dv;
+      variable rvar;
+
       if (GET_CODE (node->loc) != loc_code)
 	{
 	  if (GET_CODE (node->loc) != VALUE)
 	    continue;
 	}
       else if (loc == node->loc)
-	{
-	  ret = node;
-	  break;
-	}
+	return node;
       else if (loc_code != VALUE)
 	{
 	  if (rtx_equal_p (loc, node->loc))
-	    {
-	      ret = node;
-	      break;
-	    }
+	    return node;
 	  continue;
 	}
-      if (!VALUE_RECURSED_INTO (node->loc))
-	{
-	  decl_or_value dv = dv_from_value (node->loc);
-	  variable rvar = (variable)
-	    htab_find_with_hash (vars, dv, dv_htab_hash (dv));
 
-	  if (rvar)
+      /* Since we're in star-canonical form, we don't need to visit
+	 non-canonical nodes: one-part variables and non-canonical
+	 values would only point back to the canonical node.  */
+      if (dv_is_value_p (var->dv)
+	  && !canon_value_cmp (node->loc, dv_as_value (var->dv)))
+	{
+	  /* Skip all subsequent VALUEs.  */
+	  while (node->next && GET_CODE (node->next->loc) == VALUE)
 	    {
-	      location_chain where;
-
-	      if (!unmark_self)
-		{
-		  if (dv_is_value_p (var->dv)
-		      && !VALUE_RECURSED_INTO (dv_as_value (var->dv)))
-		    {
-		      unmark_self = 1;
+	      node = node->next;
 #ifdef ENABLE_CHECKING
-		      mark_count++;
-#endif
-		      VALUE_RECURSED_INTO (dv_as_value (var->dv)) = true;
-		    }
-		  else
-		    unmark_self = -1;
-		}
-
-#ifdef ENABLE_CHECKING
-	      mark_count++;
-	      /* The recursion count is bounded because we're
-		 searching in a star-canonicalized set, i.e., each
-		 equivalence set of values is arranged so that the
-		 canonical value has all locations and equivalent
-		 values, whereas equivalent values only point back to
-		 the canonical.  So, if we start at the canonical
-		 value, we'll recurse at most into each sibling, so
-		 the recurse limit will be 2.  If we start at a
-		 non-canonical value, we'll recurse into the
-		 canonical, and from there to other siblings, so
-		 recurse limit will be 3.  If we start at a one-part
-		 variable, we add one level of recursion, but we don't
-		 count it.  */
-	      gcc_assert (mark_count <= 3);
-#endif
-	      VALUE_RECURSED_INTO (node->loc) = true;
-	      if ((where = find_loc_in_1pdv (loc, rvar, vars)))
-		{
-#ifdef ENABLE_CHECKING
-		  mark_count--;
-#endif
-		  VALUE_RECURSED_INTO (node->loc) = false;
-		  ret = where;
-		  break;
-		}
-	      VALUE_RECURSED_INTO (node->loc) = false;
-#ifdef ENABLE_CHECKING
-	      mark_count--;
+	      gcc_assert (!canon_value_cmp (node->loc,
+					    dv_as_value (var->dv)));
 #endif
+	      if (loc == node->loc)
+		return node;
 	    }
+	  continue;
 	}
-    }
 
-  if (unmark_self > 0)
-    {
-      VALUE_RECURSED_INTO (dv_as_value (var->dv)) = false;
 #ifdef ENABLE_CHECKING
-      mark_count--;
-      gcc_assert (mark_count == 0);
+      gcc_assert (node == var->var_part[0].loc_chain);
+      gcc_assert (!node->next);
 #endif
+
+      dv = dv_from_value (node->loc);
+      rvar = (variable)	htab_find_with_hash (vars, dv, dv_htab_hash (dv));
+      return find_loc_in_1pdv (loc, rvar, vars);
     }
 
-  return ret;
+  return NULL;
 }
 
 /* Hash table iteration argument passed to variable_merge.  */