Patchwork Update postreload.c to avoid long compilation time

login
register
mail settings
Submitter Martin Thuresson
Date July 30, 2010, 11:07 p.m.
Message ID <AANLkTi=UapBhS8xBT5LdOFn7YL_dOKJPVzu7S-NN0uBs@mail.gmail.com>
Download mbox | patch
Permalink /patch/60384/
State New
Headers show

Comments

Martin Thuresson - July 30, 2010, 11:07 p.m.
This patch updates the handling of temporary registers in postreload.c
to avoid very long build times in certain files.

By creating new registers instead of updating one existing one using
SET_REGNO it avoids going through the scan df structure.

Im new to gcc RTL and appreciate any feedback.

In the (inspired from real code) attached source, the time
for "reload CSE regs" went down from 20 usr seconds to less than 1.

2010-07-30  Martin Thuresson  <martint@google.com>

	* postreload.c (reload_cse_simplify, reload_cse_simplify_operands,
	reload_cse_regs_1): Define testreg in reload_cse_simplify_operands
	instead of passing it as function parameter.

Thanks,
Martin
Michael Matz - July 30, 2010, 11:57 p.m.
Hi,

On Fri, 30 Jul 2010, Martin Thuresson wrote:

> This patch updates the handling of temporary registers in postreload.c
> to avoid very long build times in certain files.
> 
> By creating new registers instead of updating one existing one using
> SET_REGNO it avoids going through the scan df structure.

But this also means generating heaps of garbage for the collector to 
collect.  I think it would be better to not loose the current optimization 
of doing just one gen_reg_RTX(), and rather create a variant of SET_REGNO 
that doesn't call df_ref_change_reg_with_loc; SET_REGNO_RAW or something 
(or just inline the "XCUINT (RTX, 0, REG) = N" instead of using SET_REGNO 
at that place).


Ciao,
Michael.
Martin Thuresson - July 31, 2010, midnight
On Fri, Jul 30, 2010 at 4:57 PM, Michael Matz <matz@suse.de> wrote:
>
> Hi,
>
> On Fri, 30 Jul 2010, Martin Thuresson wrote:
>
> > This patch updates the handling of temporary registers in postreload.c
> > to avoid very long build times in certain files.
> >
> > By creating new registers instead of updating one existing one using
> > SET_REGNO it avoids going through the scan df structure.
>
> But this also means generating heaps of garbage for the collector to
> collect.  I think it would be better to not loose the current optimization
> of doing just one gen_reg_RTX(), and rather create a variant of SET_REGNO
> that doesn't call df_ref_change_reg_with_loc; SET_REGNO_RAW or something
> (or just inline the "XCUINT (RTX, 0, REG) = N" instead of using SET_REGNO
> at that place).

Thanks for the feedback. I like your suggestion of SET_REGNO_RAW.
I'll update my patch.
Martin

>
> Ciao,
> Michael.
Bernd Schmidt - July 31, 2010, 12:03 a.m.
On 07/31/2010 01:07 AM, Martin Thuresson wrote:
> This patch updates the handling of temporary registers in postreload.c
> to avoid very long build times in certain files.
> 
> By creating new registers instead of updating one existing one using
> SET_REGNO it avoids going through the scan df structure.
> 
> Im new to gcc RTL and appreciate any feedback.
> 
> In the (inspired from real code) attached source, the time
> for "reload CSE regs" went down from 20 usr seconds to less than 1.

I have trouble reproducing the slowness.  Which target are you using?

It's a bit unfortunate, and probably avoidable, to create additional
garbage RTL.  Maybe what's really needed is
  df_set_flags (DF_DEFER_INSN_RESCAN);
at the top of postreload?  Can you try that?  Or, even simpler, avoid
the SET_REGNO macro, and check all other occurrences of it.  There seem
to be a few, e.g. in caller-save.c or ira.c, which also probably
shouldn't invoke df.


Bernd
Paolo Bonzini - July 31, 2010, 9:48 a.m.
On 07/31/2010 02:03 AM, Bernd Schmidt wrote:
> It's a bit unfortunate, and probably avoidable, to create additional
> garbage RTL.  Maybe what's really needed is
>    df_set_flags (DF_DEFER_INSN_RESCAN);
> at the top of postreload?  Can you try that?

That's a possibility, but considering postreload is a heavy user of 
note_uses/note_stores, deferring rescans makes it impossible to switch 
it to use the operand caches someday.

> Or, even simpler, avoid
> the SET_REGNO macro, and check all other occurrences of it.  There seem
> to be a few, e.g. in caller-save.c or ira.c, which also probably
> shouldn't invoke df.

Better, yes.

Paolo

Patch

--- postreload.c.orig	2010-07-27 09:37:16.000000000 -0700
+++ postreload.c	2010-07-30 13:53:37.000000000 -0700
@@ -51,10 +51,10 @@  along with GCC; see the file COPYING3.  
 #include "dbgcnt.h"
 
 static int reload_cse_noop_set_p (rtx);
-static void reload_cse_simplify (rtx, rtx);
+static void reload_cse_simplify (rtx);
 static void reload_cse_regs_1 (rtx);
 static int reload_cse_simplify_set (rtx, rtx);
-static int reload_cse_simplify_operands (rtx, rtx);
+static int reload_cse_simplify_operands (rtx);
 
 static void reload_combine (void);
 static void reload_combine_note_use (rtx *, rtx, int, rtx);
@@ -92,7 +92,7 @@  reload_cse_noop_set_p (rtx set)
 
 /* Try to simplify INSN.  */
 static void
-reload_cse_simplify (rtx insn, rtx testreg)
+reload_cse_simplify (rtx insn)
 {
   rtx body = PATTERN (insn);
 
@@ -120,7 +120,7 @@  reload_cse_simplify (rtx insn, rtx testr
       if (count > 0)
 	apply_change_group ();
       else
-	reload_cse_simplify_operands (insn, testreg);
+	reload_cse_simplify_operands (insn);
     }
   else if (GET_CODE (body) == PARALLEL)
     {
@@ -177,7 +177,7 @@  reload_cse_simplify (rtx insn, rtx testr
       if (count > 0)
 	apply_change_group ();
       else
-	reload_cse_simplify_operands (insn, testreg);
+	reload_cse_simplify_operands (insn);
     }
 }
 
@@ -202,7 +202,6 @@  static void
 reload_cse_regs_1 (rtx first)
 {
   rtx insn;
-  rtx testreg = gen_rtx_REG (VOIDmode, -1);
 
   cselib_init (CSELIB_RECORD_MEMORY);
   init_alias_analysis ();
@@ -210,7 +209,7 @@  reload_cse_regs_1 (rtx first)
   for (insn = first; insn; insn = NEXT_INSN (insn))
     {
       if (INSN_P (insn))
-	reload_cse_simplify (insn, testreg);
+	reload_cse_simplify (insn);
 
       cselib_process_insn (insn);
     }
@@ -371,7 +370,7 @@  reload_cse_simplify_set (rtx set, rtx in
    hard registers.  */
 
 static int
-reload_cse_simplify_operands (rtx insn, rtx testreg)
+reload_cse_simplify_operands (rtx insn)
 {
   int i, j;
 
@@ -469,7 +468,7 @@  reload_cse_simplify_operands (rtx insn, 
 			       1);
 	      if (! apply_change_group ())
 		return 0;
-	      return reload_cse_simplify_operands (insn, testreg);
+	      return reload_cse_simplify_operands (insn);
 	    }
 	  else
 	    /* ??? There might be arithmetic operations with memory that are
@@ -524,13 +523,11 @@  reload_cse_simplify_operands (rtx insn, 
       for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 	{
 	  enum reg_class rclass = NO_REGS;
+          rtx testreg = gen_rtx_REG (mode, regno);
 
 	  if (! TEST_HARD_REG_BIT (equiv_regs[i], regno))
 	    continue;
 
-	  SET_REGNO (testreg, regno);
-	  PUT_MODE (testreg, mode);
-
 	  /* We found a register equal to this operand.  Now look for all
 	     alternatives that can accept this register and have not been
 	     assigned a register they can use yet.  */