Patchwork [RS6000] Stack tie fix.

login
register
mail settings
Submitter Alan Modra
Date April 12, 2012, 1:22 p.m.
Message ID <20120412132252.GA10114@bubble.grove.modra.org>
Download mbox | patch
Permalink /patch/152065/
State New
Headers show

Comments

Alan Modra - April 12, 2012, 1:22 p.m.
On Wed, Apr 11, 2012 at 11:00:04AM +0200, Richard Guenther wrote:
> On Wed, Apr 11, 2012 at 6:15 AM, Alan Modra <amodra@gmail.com> wrote:
> Well - you are expecting generic code to understand your arch-specific
> UNSPEC.  It of course cannot.

Right.

>  (USE (mem:BLK (reg 1)))
>  (CLOBBER (mem:BLK (reg 1)))
> 
> repeated, for each register (maybe you can avoid the USE if the CLOBBER

I tried that.  It doesn't work without something else in the insn to
stop rtl-dce deleting it, so you may as well use SETs.  But thanks for
the prod in the right direction.  We do get slightly better results
when the regs are not hidden away in an UNSPEC, for instance
non-stack writes/reads are seen by the alias oracle to not conflict
with the epilogue frame deallocation.

Bootstrapped etc. powerpc-linux.  OK to apply, David?

	PR target/52828
	* config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
	tie regs on destination of sets.  Delete forward declaration.
	(rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
	(rs6000_emit_prologue): Likewise.
	(rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
	and gen_stack_tie.
	(is_mem_ref): Use tie_operand to recognise stack ties.
	* config/rs6000/predicates.md (tie_operand): New.
	* config/rs6000/rs6000.md (UNSPEC_TIE): Delete.
	(restore_stack_block): Generate new stack tie rtl.
	(restore_stack_nonlocal): Likewise.
	(stack_tie): Update.
	(frame_tie): Delete.
David Edelsohn - April 12, 2012, 3:34 p.m.
On Thu, Apr 12, 2012 at 9:22 AM, Alan Modra <amodra@gmail.com> wrote:

> I tried that.  It doesn't work without something else in the insn to
> stop rtl-dce deleting it, so you may as well use SETs.  But thanks for
> the prod in the right direction.  We do get slightly better results
> when the regs are not hidden away in an UNSPEC, for instance
> non-stack writes/reads are seen by the alias oracle to not conflict
> with the epilogue frame deallocation.
>
> Bootstrapped etc. powerpc-linux.  OK to apply, David?
>
>        PR target/52828
>        * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
>        tie regs on destination of sets.  Delete forward declaration.
>        (rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
>        (rs6000_emit_prologue): Likewise.
>        (rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
>        and gen_stack_tie.
>        (is_mem_ref): Use tie_operand to recognise stack ties.
>        * config/rs6000/predicates.md (tie_operand): New.
>        * config/rs6000/rs6000.md (UNSPEC_TIE): Delete.
>        (restore_stack_block): Generate new stack tie rtl.
>        (restore_stack_nonlocal): Likewise.
>        (stack_tie): Update.
>        (frame_tie): Delete.

This probably is getting close to the best we can do with GCC's RTL
alias analysis infrastructure.

This version is okay, but I also want to give Richi and Olivier an
opportunity to comment if they still have any concerns.

Thanks, David
Richard Guenther - April 12, 2012, 4:03 p.m.
On Thu, Apr 12, 2012 at 5:34 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Thu, Apr 12, 2012 at 9:22 AM, Alan Modra <amodra@gmail.com> wrote:
>
>> I tried that.  It doesn't work without something else in the insn to
>> stop rtl-dce deleting it, so you may as well use SETs.  But thanks for
>> the prod in the right direction.  We do get slightly better results
>> when the regs are not hidden away in an UNSPEC, for instance
>> non-stack writes/reads are seen by the alias oracle to not conflict
>> with the epilogue frame deallocation.
>>
>> Bootstrapped etc. powerpc-linux.  OK to apply, David?
>>
>>        PR target/52828
>>        * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
>>        tie regs on destination of sets.  Delete forward declaration.
>>        (rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
>>        (rs6000_emit_prologue): Likewise.
>>        (rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
>>        and gen_stack_tie.
>>        (is_mem_ref): Use tie_operand to recognise stack ties.
>>        * config/rs6000/predicates.md (tie_operand): New.
>>        * config/rs6000/rs6000.md (UNSPEC_TIE): Delete.
>>        (restore_stack_block): Generate new stack tie rtl.
>>        (restore_stack_nonlocal): Likewise.
>>        (stack_tie): Update.
>>        (frame_tie): Delete.
>
> This probably is getting close to the best we can do with GCC's RTL
> alias analysis infrastructure.
>
> This version is okay, but I also want to give Richi and Olivier an
> opportunity to comment if they still have any concerns.

It looks fine to me.

Richard.

> Thanks, David
Olivier Hainque - April 12, 2012, 4:07 p.m.
On Apr 12, 2012, at 17:34 , David Edelsohn wrote:
> This version is okay, but I also want to give Richi and Olivier an
> opportunity to comment if they still have any concerns.

 Thanks :) I'm rebuilding our old compiler with the patch
 and will double check how things go on our original failing case.

 Olivier
Olivier Hainque - April 12, 2012, 5:18 p.m.
On Apr 12, 2012, at 18:07 , Olivier Hainque wrote:

> I'm rebuilding our old compiler with the patch
> and will double check how things go on our original
> failing case.

 All is well :) Thanks Alan!

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 186373)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -925,7 +925,6 @@  static const char *rs6000_invalid_within_doloop (c
 static bool rs6000_legitimate_address_p (enum machine_mode, rtx, bool);
 static bool rs6000_debug_legitimate_address_p (enum machine_mode, rtx, bool);
 static rtx rs6000_generate_compare (rtx, enum machine_mode);
-static void rs6000_emit_stack_tie (void);
 static bool spe_func_has_64bit_regs_p (void);
 static rtx gen_frame_mem_offset (enum machine_mode, rtx, int);
 static unsigned rs6000_hash_constant (rtx);
@@ -18505,12 +18504,29 @@  rs6000_aix_asm_output_dwarf_table_ref (char * fram
    and the change to the stack pointer.  */
 
 static void
-rs6000_emit_stack_tie (void)
+rs6000_emit_stack_tie (rtx fp, bool hard_frame_needed)
 {
-  rtx mem = gen_frame_mem (BLKmode,
-			   gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));
+  rtvec p;
+  int i;
+  rtx regs[3];
 
-  emit_insn (gen_stack_tie (mem));
+  i = 0;
+  regs[i++] = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+  if (hard_frame_needed)
+    regs[i++] = gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM);
+  if (!(REGNO (fp) == STACK_POINTER_REGNUM
+	|| (hard_frame_needed
+	    && REGNO (fp) == HARD_FRAME_POINTER_REGNUM)))
+    regs[i++] = fp;
+
+  p = rtvec_alloc (i);
+  while (--i >= 0)
+    {
+      rtx mem = gen_frame_mem (BLKmode, regs[i]);
+      RTVEC_ELT (p, i) = gen_rtx_SET (VOIDmode, mem, const0_rtx);
+    }
+
+  emit_insn (gen_stack_tie (gen_rtx_PARALLEL (VOIDmode, p)));
 }
 
 /* Emit the correct code for allocating stack space, as insns.
@@ -19130,7 +19146,7 @@  rs6000_emit_stack_reset (rs6000_stack_t *info,
       || (TARGET_SPE_ABI
 	  && info->spe_64bit_regs_used != 0
 	  && info->first_gp_reg_save != 32))
-    rs6000_emit_stack_tie ();
+    rs6000_emit_stack_tie (frame_reg_rtx, frame_pointer_needed);
   
   if (frame_reg_rtx != sp_reg_rtx)
     {
@@ -19350,7 +19366,7 @@  rs6000_emit_prologue (void)
 	}
       rs6000_emit_allocate_stack (info->total_size, copy_reg);
       if (frame_reg_rtx != sp_reg_rtx)
-	rs6000_emit_stack_tie ();
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
     }
 
   /* Handle world saves specially here.  */
@@ -19854,7 +19870,7 @@  rs6000_emit_prologue (void)
 	sp_offset = info->total_size;
       rs6000_emit_allocate_stack (info->total_size, copy_reg);
       if (frame_reg_rtx != sp_reg_rtx)
-	rs6000_emit_stack_tie ();
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
     }
 
   /* Set frame pointer, if needed.  */
@@ -20425,13 +20441,7 @@  rs6000_emit_epilogue (int sibcall)
       /* Prevent reordering memory accesses against stack pointer restore.  */
       else if (cfun->calls_alloca
 	       || offset_below_red_zone_p (-info->total_size))
-	{
-	  rtx mem1 = gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx);
-	  rtx mem2 = gen_rtx_MEM (BLKmode, sp_reg_rtx);
-	  MEM_NOTRAP_P (mem1) = 1;
-	  MEM_NOTRAP_P (mem2) = 1;
-	  emit_insn (gen_frame_tie (mem1, mem2));
-	}
+	rs6000_emit_stack_tie (frame_reg_rtx, true);
 
       insn = emit_insn (gen_add3_insn (frame_reg_rtx, hard_frame_pointer_rtx,
 				       GEN_INT (info->total_size)));
@@ -20444,11 +20454,7 @@  rs6000_emit_epilogue (int sibcall)
       /* Prevent reordering memory accesses against stack pointer restore.  */
       if (cfun->calls_alloca
 	  || offset_below_red_zone_p (-info->total_size))
-	{
-	  rtx mem = gen_rtx_MEM (BLKmode, sp_reg_rtx);
-	  MEM_NOTRAP_P (mem) = 1;
-	  emit_insn (gen_stack_tie (mem));
-	}
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
       insn = emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx,
 				       GEN_INT (info->total_size)));
       sp_offset = 0;
@@ -22835,8 +22841,7 @@  is_mem_ref (rtx pat)
   bool ret = false;
 
   /* stack_tie does not produce any real memory traffic.  */
-  if (GET_CODE (pat) == UNSPEC
-      && XINT (pat, 1) == UNSPEC_TIE)
+  if (tie_operand (pat, VOIDmode))
     return false;
 
   if (GET_CODE (pat) == MEM)
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 186373)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1451,3 +1451,13 @@ 
 
   return 1;
 })
+
+;; Return 1 if OP is a stack tie operand.
+(define_predicate "tie_operand"
+  (match_code "parallel")
+{
+  return (GET_CODE (XVECEXP (op, 0, 0)) == SET
+	  && GET_CODE (XEXP (XVECEXP (op, 0, 0), 0)) == MEM
+	  && GET_MODE (XEXP (XVECEXP (op, 0, 0), 0)) == BLKmode
+	  && XEXP (XVECEXP (op, 0, 0), 1) == const0_rtx);
+})
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 186373)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -73,7 +73,6 @@ 
 (define_c_enum "unspec"
   [UNSPEC_FRSP			; frsp for POWER machines
    UNSPEC_PROBE_STACK		; probe stack memory reference
-   UNSPEC_TIE			; tie stack contents and stack pointer
    UNSPEC_TOCPTR		; address of a word pointing to the TOC
    UNSPEC_TOC			; address of the TOC (more-or-less)
    UNSPEC_MOVSI_GOT
@@ -11989,17 +11988,23 @@ 
 (define_expand "restore_stack_block"
   [(set (match_dup 2) (match_dup 3))
    (set (match_dup 4) (match_dup 2))
-   (set (match_dup 5) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
+   (match_dup 5)
    (set (match_operand 0 "register_operand" "")
 	(match_operand 1 "register_operand" ""))]
   ""
   "
 {
+  rtvec p;
+
   operands[1] = force_reg (Pmode, operands[1]);
   operands[2] = gen_reg_rtx (Pmode);
   operands[3] = gen_frame_mem (Pmode, operands[0]);
   operands[4] = gen_frame_mem (Pmode, operands[1]);
-  operands[5] = gen_frame_mem (BLKmode, operands[0]);
+  p = rtvec_alloc (1);
+  RTVEC_ELT (p, 0) = gen_rtx_SET (VOIDmode,
+				  gen_frame_mem (BLKmode, operands[0]),
+				  const0_rtx);
+  operands[5] = gen_rtx_PARALLEL (VOIDmode, p);
 }")
 
 (define_expand "save_stack_nonlocal"
@@ -12022,12 +12027,13 @@ 
   [(set (match_dup 2) (match_operand 1 "memory_operand" ""))
    (set (match_dup 3) (match_dup 4))
    (set (match_dup 5) (match_dup 2))
-   (set (match_dup 6) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
+   (match_dup 6)
    (set (match_operand 0 "register_operand" "") (match_dup 3))]
   ""
   "
 {
   int units_per_word = (TARGET_32BIT) ? 4 : 8;
+  rtvec p;
 
   /* Restore the backchain from the first word, sp from the second.  */
   operands[2] = gen_reg_rtx (Pmode);
@@ -12035,7 +12041,11 @@ 
   operands[1] = adjust_address_nv (operands[1], Pmode, 0);
   operands[4] = adjust_address_nv (operands[1], Pmode, units_per_word);
   operands[5] = gen_frame_mem (Pmode, operands[3]);
-  operands[6] = gen_frame_mem (BLKmode, operands[0]);
+  p = rtvec_alloc (1);
+  RTVEC_ELT (p, 0) = gen_rtx_SET (VOIDmode,
+				  gen_frame_mem (BLKmode, operands[0]),
+				  const0_rtx);
+  operands[6] = gen_rtx_PARALLEL (VOIDmode, p);
 }")
 
 ;; TOC register handling.
@@ -15899,25 +15909,15 @@ 
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
 
-; These are to explain that changes to the stack pointer should
-; not be moved over stores to stack memory.
+; This is to explain that changes to the stack pointer should
+; not be moved over loads from or stores to stack memory.
 (define_insn "stack_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-        (unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
+  [(match_parallel 0 "tie_operand"
+		   [(set (mem:BLK (reg 1)) (const_int 0))])]
   ""
   ""
   [(set_attr "length" "0")])
 
-; Like stack_tie, but depend on both fp and sp based memory.
-(define_insn "frame_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-	(unspec:BLK [(match_dup 0)
-		     (match_operand:BLK 1 "memory_operand" "m")] UNSPEC_TIE))]
-  ""
-  ""
-  [(set_attr "length" "0")])
-
-
 (define_expand "epilogue"
   [(use (const_int 0))]
   ""