diff mbox

[1/2] PR 63340: Avoid harmful union classes in ira-costs.c

Message ID 87tx3pjm1n.fsf@e105548-lin.cambridge.arm.com
State New
Headers show

Commit Message

Richard Sandiford Sept. 30, 2014, 1:05 p.m. UTC
This patch is the first of two to fix PR 63340, which is an ia64
regression caused by:

2014-09-22  Richard Sandiford  <richard.sandiford@arm.com>

	* hard-reg-set.h: Include hash-table.h.
	(target_hard_regs): Add a finalize method and a x_simplifiable_subregs
	field.
	* target-globals.c (target_globals::~target_globals): Call
	hard_regs->finalize.
	* rtl.h (subreg_shape): New structure.
	(shape_of_subreg): New function.
	(simplifiable_subregs): Declare.
	* reginfo.c (simplifiable_subreg): New structure.
	(simplifiable_subregs_hasher): Likewise.
	(simplifiable_subregs): New function.
	(invalid_mode_changes): Delete.
	(alid_mode_changes, valid_mode_changes_obstack): New variables.
	(record_subregs_of_mode): Remove subregs_of_mode parameter.
	Record valid mode changes in valid_mode_changes.
	(find_subregs_of_mode): Remove subregs_of_mode parameter.
	Update calls to record_subregs_of_mode.
	(init_subregs_of_mode): Remove invalid_mode_changes and bitmap
	handling.  Initialize new variables.  Update call to
	find_subregs_of_mode.
	(invalid_mode_change_p): Check new variables instead of
	invalid_mode_changes.
	(finish_subregs_of_mode): Finalize new variables instead of
	invalid_mode_changes.
	(target_hard_regs::finalize): New function.
	* ira-costs.c (print_allocno_costs): Call invalid_mode_change_p
	even when CLASS_CANNOT_CHANGE_MODE is undefined.

After that patch, we consider a hard register to be invalid for a pseudo
register if the hard register doesn't allow all the mode changes
required by the pseudo register.  A class is invalid if all hard
registers in it are invalid.  The problem was that this redefines the
behaviour for union classes like ia64's GR_AND_FR_REGS.  Before the
patch, validity was based entirely on CANNOT_CHANGE_MODE_CLASS, which
rejects any superset of FR_REGS if a mode change is invalid for FR_REGS.
After the patch, mode changes allowed by GR_REGS are also allowed by
GR_AND_FR_REGS.

As explained in the big block comment in the patch, there's an argument
whether the mode changes allowed by union classes should be the union or
the intersection of the mode changes allowed by subclasses.  Both are
right in some cases and wrong in others.  The upshot is that we really
shouldn't be using union classes if only one of the subclasses is valid.

This first patch lays the groundwork for the main fix by:

(a) taking the register mode as well as the "approximate" allocation
    class into account in setup_regno_cost_classes_by_aclass, so that
    a register's cost classes are always "correct" for its mode.

(b) checking contains_reg_of_mode when setting up the cost classes
    rather than when using them, so that this can be taken into account
    when deciding whether union classes are useful.

(c) excluding classes that allow the same set of registers as existing
    cost classes, such as GR_AND_FR_REGS if only GR_REGS or FR_REGS
    are valid.

(d) using the cost_classes "index" array to map excluded classes to the
    appropriate subclass, so that we can still take the subunion of
    two cost classes and expect it to have a valid index.  E.g. if:

      A = { 1          }
      B = {    2, 3, 4 }
      C = { 1, 2, 3    }
      D = { 1, 2, 3, 4 }

    and if 4 is invalid for a particular pseudo register, A, B and C
    are useful but D is redundant with C.  A \subunion B gives D,
    then the lookup will map it to C.

This significantly reduces the number of redundant classes in the
cost_classes structure, so it's also a minor compile-time improvement.
The time for -O0 fold-const.ii on x86_64 improved by ~0.5%.
(record_reg_classes was previously the hottest function in the
compilation, after the patch it goes down to number 2, though
it's still costly.)

I did a diff of the assembly output before and after the patch
on x86_64-linux-gnu, powerpc64-linux-gnu, s390x-linux-gnu and
aarch64-linux-gnu.  There were some minor register allocation
changes in a handful files, but nothing major.

Tested on x86_64-linux-gnu, powerpc64-linux-gnu and aarch64-elf.
OK to install?

Thanks,
Richard


gcc/
	PR rtl-optimization/63340 (part 1)
	* ira-costs.c (all_cost_classes): New variable.
	(complete_cost_classes): New function, split out from...
	(setup_cost_classes): ...here.
	(initiate_regno_cost_classes): Set up all_cost_classes.
	(restrict_cost_classes): New function.
	(setup_regno_cost_classes_by_aclass): Restrict the cost classes to
	registers that are valid for the register's mode.
	(setup_regno_cost_classes_by_mode): Model the mode cache as a
	restriction of all_cost_classes to a particular mode.
	(print_allocno_costs): Remove contains_reg_of_mode check.
	(print_pseudo_costs, find_costs_and_classes): Likewise.
diff mbox

Patch

Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c	2014-09-29 15:56:20.561355514 +0100
+++ gcc/ira-costs.c	2014-09-30 08:34:47.311926931 +0100
@@ -176,6 +176,31 @@  cost_classes_hasher::remove (value_type
 /* Map mode -> cost classes for pseudo of give mode.  */
 static cost_classes_t cost_classes_mode_cache[MAX_MACHINE_MODE];
 
+/* Cost classes that include all classes in ira_important_classes.  */
+static cost_classes all_cost_classes;
+
+/* Use the array of classes in CLASSES_PTR to fill out the rest of
+   the structure.  */
+static void
+complete_cost_classes (cost_classes_t classes_ptr)
+{
+  for (int i = 0; i < N_REG_CLASSES; i++)
+    classes_ptr->index[i] = -1;
+  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    classes_ptr->hard_regno_index[i] = -1;
+  for (int i = 0; i < classes_ptr->num; i++)
+    {
+      enum reg_class cl = classes_ptr->classes[i];
+      classes_ptr->index[cl] = i;
+      for (int j = ira_class_hard_regs_num[cl] - 1; j >= 0; j--)
+	{
+	  unsigned int hard_regno = ira_class_hard_regs[cl][j];
+	  if (classes_ptr->hard_regno_index[hard_regno] < 0)
+	    classes_ptr->hard_regno_index[hard_regno] = i;
+	}
+    }
+}
+
 /* Initialize info about the cost classes for each pseudo.  */
 static void
 initiate_regno_cost_classes (void)
@@ -189,6 +214,10 @@  initiate_regno_cost_classes (void)
   memset (cost_classes_mode_cache, 0,
 	  sizeof (cost_classes_t) * MAX_MACHINE_MODE);
   cost_classes_htab = new hash_table<cost_classes_hasher> (200);
+  all_cost_classes.num = ira_important_classes_num;
+  for (int i = 0; i < ira_important_classes_num; i++)
+    all_cost_classes.classes[i] = ira_important_classes[i];
+  complete_cost_classes (&all_cost_classes);
 }
 
 /* Create new cost classes from cost classes FROM and set up members
@@ -200,27 +229,105 @@  initiate_regno_cost_classes (void)
 setup_cost_classes (cost_classes_t from)
 {
   cost_classes_t classes_ptr;
-  enum reg_class cl;
-  int i, j, hard_regno;
 
   classes_ptr = (cost_classes_t) ira_allocate (sizeof (struct cost_classes));
   classes_ptr->num = from->num;
-  for (i = 0; i < N_REG_CLASSES; i++)
-    classes_ptr->index[i] = -1;
-  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-    classes_ptr->hard_regno_index[i] = -1;
-  for (i = 0; i < from->num; i++)
+  for (int i = 0; i < from->num; i++)
+    classes_ptr->classes[i] = from->classes[i];
+  complete_cost_classes (classes_ptr);
+  return classes_ptr;
+}
+
+/* Return a version of FULL that only considers registers in REGS that are
+   valid for mode MODE.  Both FULL and the returned class are globally
+   allocated.  */
+static cost_classes_t
+restrict_cost_classes (cost_classes_t full, enum machine_mode mode,
+		       const HARD_REG_SET &regs)
+{
+  static struct cost_classes narrow;
+  int map[N_REG_CLASSES];
+  narrow.num = 0;
+  for (int i = 0; i < full->num; i++)
     {
-      cl = classes_ptr->classes[i] = from->classes[i];
-      classes_ptr->index[cl] = i;
-      for (j = ira_class_hard_regs_num[cl] - 1; j >= 0; j--)
+      /* Assume that we'll drop the class.  */
+      map[i] = -1;
+
+      /* Ignore classes that are too small for the mode.  */
+      enum reg_class cl = full->classes[i];
+      if (!contains_reg_of_mode[cl][mode])
+	continue;
+
+      /* Calculate the set of registers in CL that belong to REGS and
+	 are valid for MODE.  */
+      HARD_REG_SET valid_for_cl;
+      COPY_HARD_REG_SET (valid_for_cl, reg_class_contents[cl]);
+      AND_HARD_REG_SET (valid_for_cl, regs);
+      AND_COMPL_HARD_REG_SET (valid_for_cl,
+			      ira_prohibited_class_mode_regs[cl][mode]);
+      AND_COMPL_HARD_REG_SET (valid_for_cl, ira_no_alloc_regs);
+      if (hard_reg_set_empty_p (valid_for_cl))
+	continue;
+
+      /* Don't use this class if the set of valid registers is a subset
+	 of an existing class.  For example, suppose we have two classes
+	 GR_REGS and FR_REGS and a union class GR_AND_FR_REGS.  Suppose
+	 that the mode changes allowed by FR_REGS are not as general as
+	 the mode changes allowed by GR_REGS.
+
+	 In this situation, the mode changes for GR_AND_FR_REGS could
+	 either be seen as the union or the intersection of the mode
+	 changes allowed by the two subclasses.  The justification for
+	 the union-based definition would be that, if you want a mode
+	 change that's only allowed by GR_REGS, you can pick a register
+	 from the GR_REGS subclass.  The justification for the
+	 intersection-based definition would be that every register
+	 from the class would allow the mode change.
+
+	 However, if we have a register that needs to be in GR_REGS,
+	 using GR_AND_FR_REGS with the intersection-based definition
+	 would be too pessimistic, since it would bring in restrictions
+	 that only apply to FR_REGS.  Conversely, if we have a register
+	 that needs to be in FR_REGS, using GR_AND_FR_REGS with the
+	 union-based definition would lose the extra restrictions
+	 placed on FR_REGS.  GR_AND_FR_REGS is therefore only useful
+	 for cases where GR_REGS and FP_REGS are both valid.  */
+      int pos;
+      for (pos = 0; pos < narrow.num; ++pos)
 	{
-	  hard_regno = ira_class_hard_regs[cl][j];
-	  if (classes_ptr->hard_regno_index[hard_regno] < 0)
-	    classes_ptr->hard_regno_index[hard_regno] = i;
+	  enum reg_class cl2 = narrow.classes[pos];
+	  if (hard_reg_set_subset_p (valid_for_cl, reg_class_contents[cl2]))
+	    break;
+	}
+      map[i] = pos;
+      if (pos == narrow.num)
+	{
+	  /* If several classes are equivalent, prefer to use the one
+	     that was chosen as the allocno class.  */
+	  enum reg_class cl2 = ira_allocno_class_translate[cl];
+	  if (ira_class_hard_regs_num[cl] == ira_class_hard_regs_num[cl2])
+	    cl = cl2;
+	  narrow.classes[narrow.num++] = cl;
 	}
     }
-  return classes_ptr;
+  if (narrow.num == full->num)
+    return full;
+
+  cost_classes **slot = cost_classes_htab->find_slot (&narrow, INSERT);
+  if (*slot == NULL)
+    {
+      cost_classes_t classes = setup_cost_classes (&narrow);
+      /* Map equivalent classes to the representative that we chose above.  */
+      for (int i = 0; i < ira_important_classes_num; i++)
+	{
+	  enum reg_class cl = ira_important_classes[i];
+	  int index = full->index[cl];
+	  if (index >= 0)
+	    classes->index[cl] = map[index];
+	}
+      *slot = classes;
+    }
+  return *slot;
 }
 
 /* Setup cost classes for pseudo REGNO whose allocno class is ACLASS.
@@ -270,6 +377,13 @@  setup_regno_cost_classes_by_aclass (int
 	}
       classes_ptr = cost_classes_aclass_cache[aclass] = (cost_classes_t) *slot;
     }
+  if (regno_reg_rtx[regno] != NULL_RTX)
+    /* Restrict the classes to those that are valid for REGNO's mode
+       (which might for example exclude singleton classes if the mode requires
+       two registers).  */
+    classes_ptr = restrict_cost_classes (classes_ptr,
+					 PSEUDO_REGNO_MODE (regno),
+					 reg_class_contents[ALL_REGS]);
   regno_cost_classes[regno] = classes_ptr;
 }
 
@@ -282,36 +396,11 @@  setup_regno_cost_classes_by_aclass (int
 static void
 setup_regno_cost_classes_by_mode (int regno, enum machine_mode mode)
 {
-  static struct cost_classes classes;
-  cost_classes_t classes_ptr;
-  enum reg_class cl;
-  int i;
-  cost_classes **slot;
-  HARD_REG_SET temp;
-
-  if ((classes_ptr = cost_classes_mode_cache[mode]) == NULL)
-    {
-      classes.num = 0;
-      for (i = 0; i < ira_important_classes_num; i++)
-	{
-	  cl = ira_important_classes[i];
-	  COPY_HARD_REG_SET (temp, ira_prohibited_class_mode_regs[cl][mode]);
-	  IOR_HARD_REG_SET (temp, ira_no_alloc_regs);
-	  if (hard_reg_set_subset_p (reg_class_contents[cl], temp))
-	    continue;
-	  classes.classes[classes.num++] = cl;
-	}
-      slot = cost_classes_htab->find_slot (&classes, INSERT);
-      if (*slot == NULL)
-	{
-	  classes_ptr = setup_cost_classes (&classes);
-	  *slot = classes_ptr;
-	}
-      else
-	classes_ptr = (cost_classes_t) *slot;
-      cost_classes_mode_cache[mode] = (cost_classes_t) *slot;
-    }
-  regno_cost_classes[regno] = classes_ptr;
+  if (cost_classes_mode_cache[mode] == NULL)
+    cost_classes_mode_cache[mode]
+      = restrict_cost_classes (&all_cost_classes, mode,
+			       reg_class_contents[ALL_REGS]);
+  regno_cost_classes[regno] = cost_classes_mode_cache[mode];
 }
 
 /* Finilize info about the cost classes for each pseudo.  */
@@ -1437,8 +1526,7 @@  print_allocno_costs (FILE *f)
       for (k = 0; k < cost_classes_ptr->num; k++)
 	{
 	  rclass = cost_classes[k];
-	  if (contains_reg_of_mode[rclass][PSEUDO_REGNO_MODE (regno)]
-	      && ! invalid_mode_change_p (regno, (enum reg_class) rclass))
+	  if (! invalid_mode_change_p (regno, (enum reg_class) rclass))
 	    {
 	      fprintf (f, " %s:%d", reg_class_names[rclass],
 		       COSTS (costs, i)->cost[k]);
@@ -1476,8 +1564,7 @@  print_pseudo_costs (FILE *f)
       for (k = 0; k < cost_classes_ptr->num; k++)
 	{
 	  rclass = cost_classes[k];
-	  if (contains_reg_of_mode[rclass][PSEUDO_REGNO_MODE (regno)]
-	      && ! invalid_mode_change_p (regno, (enum reg_class) rclass))
+	  if (! invalid_mode_change_p (regno, (enum reg_class) rclass))
 	    fprintf (f, " %s:%d", reg_class_names[rclass],
 		     COSTS (costs, regno)->cost[k]);
 	}
@@ -1718,8 +1805,7 @@  find_costs_and_classes (FILE *dump_file)
 	      rclass = cost_classes[k];
 	      /* Ignore classes that are too small or invalid for this
 		 operand.  */
-	      if (! contains_reg_of_mode[rclass][PSEUDO_REGNO_MODE (i)]
-		  || invalid_mode_change_p (i, (enum reg_class) rclass))
+	      if (invalid_mode_change_p (i, (enum reg_class) rclass))
 		continue;
 	      if (i_costs[k] < best_cost)
 		{
@@ -1812,8 +1898,7 @@  find_costs_and_classes (FILE *dump_file)
 			continue;
 		      /* Ignore classes that are too small or invalid
 			 for this operand.  */
-		      if (! contains_reg_of_mode[rclass][PSEUDO_REGNO_MODE (i)]
-			  || invalid_mode_change_p (i, (enum reg_class) rclass))
+		      if (invalid_mode_change_p (i, (enum reg_class) rclass))
 			;
 		      else if (total_a_costs[k] < best_cost)
 			{