Patchwork New optimization for reload_combine

login
register
mail settings
Submitter Bernd Schmidt
Date July 19, 2010, 10 a.m.
Message ID <4C442253.8030805@codesourcery.com>
Download mbox | patch
Permalink /patch/59187/
State New
Headers show

Comments

Bernd Schmidt - July 19, 2010, 10 a.m.
On 07/17/2010 05:03 PM, H.J. Lu wrote:
> On Sat, Jul 17, 2010 at 7:25 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 07/17/2010 04:38 AM, H.J. Lu wrote:
>>> This caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44970
>>
>> Apparently, the sse_prologue_save_insn is broken.
>>
> 
> It is more than that. It failed to boostrap on Linux/ia32 when
> configured with
> 
> --enable-clocale=gnu --with-system-zlib --enable-shared
> --with-demangler-in-ld  --with-fpmath=sse

I've committed the following fix for the bootstrap comparison failures.
 DEBUG_INSNs got in the way, as they do.  This also fixes a problem
observed with sh libjava, we need to avoid moving something after an
insn that can throw.  Another fix which I found by inspection is for the
use_ruid bookkeeping, the interaction between the new and the old code
could have caused problems there.

This leaves the x86_64 bootstrap failure, which needs to be fixed by an
x86 maintainer (I don't know enough about what's required), and a
spec2k6 regression reported by HJ.  HJ, have you verified that that was
actually caused by the same revision, and not just a range of revisions?
 If it was indeed my patch, can you either extract a test case, or
compile the failing program to assembly files, then send me a diff of
before/after?


Bernd
* postreload.c (reload_combine_closest_single_use): Ignore the
	number of uses for DEBUG_INSNs.
	(fixup_debug_insns): New static function.
	(reload_combine_recognize_const_pattern): Use it.  Don't let the
	main loop be affected by DEBUG_INSNs.
	Really disallow moving adds past a jump insn.
	(reload_combine_recognize_pattern): Don't update use_ruid here.
	(reload_combine_note_use): Do it here.
	(reload_combine): Use control_flow_insn_p rather than JUMP_P.
H.J. Lu - July 19, 2010, 1:27 p.m.
On Mon, Jul 19, 2010 at 3:00 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 07/17/2010 05:03 PM, H.J. Lu wrote:
>> On Sat, Jul 17, 2010 at 7:25 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>> On 07/17/2010 04:38 AM, H.J. Lu wrote:
>>>> This caused:
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44970
>>>
>>> Apparently, the sse_prologue_save_insn is broken.
>>>
>>
>> It is more than that. It failed to boostrap on Linux/ia32 when
>> configured with
>>
>> --enable-clocale=gnu --with-system-zlib --enable-shared
>> --with-demangler-in-ld  --with-fpmath=sse
>
> I've committed the following fix for the bootstrap comparison failures.
>  DEBUG_INSNs got in the way, as they do.  This also fixes a problem
> observed with sh libjava, we need to avoid moving something after an
> insn that can throw.  Another fix which I found by inspection is for the
> use_ruid bookkeeping, the interaction between the new and the old code
> could have caused problems there.
>
> This leaves the x86_64 bootstrap failure, which needs to be fixed by an
> x86 maintainer (I don't know enough about what's required), and a
> spec2k6 regression reported by HJ.  HJ, have you verified that that was
> actually caused by the same revision, and not just a range of revisions?

I verified that revision 162270 is the cause.

>  If it was indeed my patch, can you either extract a test case, or
> compile the failing program to assembly files, then send me a diff of
> before/after?

I will do that. It will take a while.
Bernd Schmidt - July 20, 2010, 11:45 a.m.
On 07/19/2010 12:00 PM, Bernd Schmidt wrote:
> I've committed the following fix for the bootstrap comparison failures.
>  DEBUG_INSNs got in the way, as they do.  This also fixes a problem
> observed with sh libjava, we need to avoid moving something after an
> insn that can throw.  Another fix which I found by inspection is for the
> use_ruid bookkeeping, the interaction between the new and the old code
> could have caused problems there.

It's still producing different code for -g vs. no-debug on powerpc.  It
turns out that we only store a limited amount of uses for a reg, and
somehow the powerpc compiler produces a huge amount of debug_insns,
which fill the buffer and prevent us from doing the optimization on real
insns.

Not sure yet how to fix it, probably by not tracking DEBUG_INSNs at all
and letting them get out of date.

Do we have any data whether this whole DEBUG_INSN stuff is actually
useful?  It was still controversial when it was committed, and I've
personally not seen any improvement in trying to debug a cc1 compiled at
-O2.  Is there any evidence that it has actually made debugging easier,
or are we carrying around all this complexity for no gain?


Bernd
Jeff Law - July 20, 2010, 3:33 p.m.
On 07/20/10 05:45, Bernd Schmidt wrote:
> On 07/19/2010 12:00 PM, Bernd Schmidt wrote:
>    
>> I've committed the following fix for the bootstrap comparison failures.
>>   DEBUG_INSNs got in the way, as they do.  This also fixes a problem
>> observed with sh libjava, we need to avoid moving something after an
>> insn that can throw.  Another fix which I found by inspection is for the
>> use_ruid bookkeeping, the interaction between the new and the old code
>> could have caused problems there.
>>      
> It's still producing different code for -g vs. no-debug on powerpc.  It
> turns out that we only store a limited amount of uses for a reg, and
> somehow the powerpc compiler produces a huge amount of debug_insns,
> which fill the buffer and prevent us from doing the optimization on real
> insns.
>
> Not sure yet how to fix it, probably by not tracking DEBUG_INSNs at all
> and letting them get out of date.
>    
The only middle ground I see would be to track them, but if the buffer 
fills, you go through and purge all the DEBUG_INSN entries from the 
buffer and associated data structures.  Otherwise I think you have to 
track them without bound or ignore them completely.


Jeff

Patch

Index: postreload.c
===================================================================
--- postreload.c	(revision 162277)
+++ postreload.c	(working copy)
@@ -816,7 +816,8 @@  reload_combine_purge_reg_uses_after_ruid
 
 /* Find the use of REGNO with the ruid that is highest among those
    lower than RUID_LIMIT, and return it if it is the only use of this
-   reg in the insn.  Return NULL otherwise.  */
+   reg in the insn (or if the insn is a debug insn).  Return NULL
+   otherwise.  */
 
 static struct reg_use *
 reload_combine_closest_single_use (unsigned regno, int ruid_limit)
@@ -830,7 +831,8 @@  reload_combine_closest_single_use (unsig
   retval = NULL;
   for (i = use_idx; i < RELOAD_COMBINE_MAX_USES; i++)
     {
-      int this_ruid = reg_state[regno].reg_use[i].ruid;
+      struct reg_use *use = reg_state[regno].reg_use + i; 
+      int this_ruid = use->ruid;
       if (this_ruid >= ruid_limit)
 	continue;
       if (this_ruid > best_ruid)
@@ -838,7 +840,7 @@  reload_combine_closest_single_use (unsig
 	  best_ruid = this_ruid;
 	  retval = reg_state[regno].reg_use + i;
 	}
-      else if (this_ruid == best_ruid)
+      else if (this_ruid == best_ruid && !DEBUG_INSN_P (use->insn))
 	retval = NULL;
     }
   if (last_label_ruid >= best_ruid)
@@ -846,6 +848,44 @@  reload_combine_closest_single_use (unsig
   return retval;
 }
 
+/* After we've moved an add insn, fix up any debug insns that occur between
+   the old location of the add and the new location.  REGNO is the destination
+   register of the add insn; REG is the corresponding RTX.  REPLACEMENT is
+   the SET_SRC of the add.  MIN_RUID specifies the ruid of the insn after
+   which we've placed the add, we ignore any debug insns after it.  */
+
+static void
+fixup_debug_insns (unsigned regno, rtx reg, rtx replacement, int min_ruid)
+{
+  struct reg_use *use;
+  int from = reload_combine_ruid;
+  for (;;)
+    {
+      rtx t;
+      rtx use_insn = NULL_RTX;
+      if (from < min_ruid)
+	break;
+      use = reload_combine_closest_single_use (regno, from);
+      if (use)
+	{
+	  from = use->ruid;
+	  use_insn = use->insn;
+	}
+      else
+	break;
+      
+      if (NONDEBUG_INSN_P (use->insn))
+	continue;
+      t = INSN_VAR_LOCATION_LOC (use_insn);
+      t = simplify_replace_rtx (t, reg, copy_rtx (replacement));
+      validate_change (use->insn,
+		       &INSN_VAR_LOCATION_LOC (use->insn), t, 0);
+      reload_combine_purge_insn_uses (use_insn);
+      reload_combine_note_use (&PATTERN (use_insn), use_insn,
+			       use->ruid, NULL_RTX);
+    }
+}
+
 /* Called by reload_combine when scanning INSN.  This function tries to detect
    patterns where a constant is added to a register, and the result is used
    in an address.
@@ -913,6 +953,10 @@  reload_combine_recognize_const_pattern (
 	/* Start the search for the next use from here.  */
 	from_ruid = use->ruid;
 
+      /* We'll fix up DEBUG_INSNs after we're done.  */
+      if (use && DEBUG_INSN_P (use->insn))
+	continue;
+
       if (use && GET_MODE (*use->usep) == Pmode)
 	{
 	  rtx use_insn = use->insn;
@@ -921,7 +965,7 @@  reload_combine_recognize_const_pattern (
 	  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn));
 
 	  /* Avoid moving the add insn past a jump.  */
-	  if (must_move_add && use_ruid < last_jump_ruid)
+	  if (must_move_add && use_ruid <= last_jump_ruid)
 	    break;
 
 	  /* If the add clobbers another hard reg in parallel, don't move
@@ -1019,6 +1063,8 @@  reload_combine_recognize_const_pattern (
     /* Process the add normally.  */
     return false;
 
+  fixup_debug_insns (regno, reg, src, add_moved_after_ruid);
+  
   reorder_insns (insn, insn, add_moved_after_insn);
   reload_combine_purge_reg_uses_after_ruid (regno, add_moved_after_ruid);
   reload_combine_split_ruids (add_moved_after_ruid - 1);
@@ -1155,11 +1201,6 @@  reload_combine_recognize_pattern (rtx in
 		   reg_state[regno].reg_use[i].ruid,
 		   reg_state[regno].reg_use[i].containing_mem);
 
-	      if (reg_state[REGNO (base)].use_ruid
-		  > reg_state[regno].use_ruid)
-		reg_state[REGNO (base)].use_ruid
-		  = reg_state[regno].use_ruid;
-
 	      /* Delete the reg-reg addition.  */
 	      delete_insn (insn);
 
@@ -1277,7 +1318,7 @@  reload_combine (void)
 
       reload_combine_ruid++;
 
-      if (JUMP_P (insn))
+      if (control_flow_insn_p (insn))
 	last_jump_ruid = reload_combine_ruid;
 
       if (reload_combine_recognize_const_pattern (insn)
@@ -1495,8 +1536,14 @@  reload_combine_note_use (rtx *xp, rtx in
 	    reg_state[regno].all_offsets_match = true;
 	    reg_state[regno].use_ruid = ruid;
 	  }
-	else if (! rtx_equal_p (offset, reg_state[regno].offset))
-	  reg_state[regno].all_offsets_match = false;
+	else
+	  {
+	    if (reg_state[regno].use_ruid > ruid)
+	      reg_state[regno].use_ruid = ruid;
+
+	    if (! rtx_equal_p (offset, reg_state[regno].offset))
+	      reg_state[regno].all_offsets_match = false;
+	  }
 
 	reg_state[regno].reg_use[use_index].insn = insn;
 	reg_state[regno].reg_use[use_index].ruid = ruid;