Patchwork [lra] a patch to fix ppc bootstrap failure

login
register
mail settings
Submitter Vladimir Makarov
Date Dec. 5, 2012, 5:35 a.m.
Message ID <50BEDD3C.1050306@redhat.com>
Download mbox | patch
Permalink /patch/203786/
State New
Headers show

Comments

Vladimir Makarov - Dec. 5, 2012, 5:35 a.m.
After so many changes in LRA ppc target was broken.  The following patch 
fixes PPC bootstrap.

The patch was successfully bootstrapped on x86/x86-64 and PPC.

Committed as rev. 194180.

2012-12-05  Vladimir Makarov  <vmakarov@redhat.com>

         * rtl.h (struct rtx_def): Add comment for field jump.
         (SUBREG_MATCH_RELOAD_P): New macro.
         * lra-constraints.c (curr_insn_set): New.
         (match_reload): Use SUBREG_MATCH_RELOAD_P.
         (check_and_process_move): Use curr_insn_set. Process only single
         set insns.  Don't initialize sec_mem_p and change_p.
         (simplify_operand_subreg): Check SUBREG_MATCH_RELOAD_P.
         (process_alt_operands): Use #if HAVE_ATTR_enabled instead of
         #ifdef.  Add code to remove cycling.
         (process_address): Process even if non-null disp. Reload inner
         instead of disp when base and index are null.
         (curr_insn_transform): Initialize sec_mem_p and change_p.  Set up
         curr_insn_set.  Call check_and_process_move only for single set
         insns.
         * rs6000/rs6000.c (legitimate_lo_sum_address_p): Allow TFmode too.
Michael Meissner - Dec. 5, 2012, 8:24 p.m.
On Wed, Dec 05, 2012 at 12:35:56AM -0500, Vladimir Makarov wrote:
> After so many changes in LRA ppc target was broken.  The following
> patch fixes PPC bootstrap.
> 
> The patch was successfully bootstrapped on x86/x86-64 and PPC.
> 
> Committed as rev. 194180.
> 
> 2012-12-05  Vladimir Makarov  <vmakarov@redhat.com>
> 
>         * rtl.h (struct rtx_def): Add comment for field jump.
>         (SUBREG_MATCH_RELOAD_P): New macro.
>         * lra-constraints.c (curr_insn_set): New.
>         (match_reload): Use SUBREG_MATCH_RELOAD_P.
>         (check_and_process_move): Use curr_insn_set. Process only single
>         set insns.  Don't initialize sec_mem_p and change_p.
>         (simplify_operand_subreg): Check SUBREG_MATCH_RELOAD_P.
>         (process_alt_operands): Use #if HAVE_ATTR_enabled instead of
>         #ifdef.  Add code to remove cycling.
>         (process_address): Process even if non-null disp. Reload inner
>         instead of disp when base and index are null.
>         (curr_insn_transform): Initialize sec_mem_p and change_p.  Set up
>         curr_insn_set.  Call check_and_process_move only for single set
>         insns.
>         * rs6000/rs6000.c (legitimate_lo_sum_address_p): Allow TFmode too.

Well I get further, using the LRA branch, but there are two problems.

The first problem is I build the compiler with decimal floating point enabled,
and it fails in the bootstrap.  As a workaround, I built the compiler using the
--disable-decimal-float option, and it got further.  I will dig out the files
in a separate message.

The second problem occurs in the build when I'm building libgfortran, using
BOOT_CFLAGS='-O2 -mcpu=power7 -g -flra' or using -O3 instead of -O2.

If you compile the attached file with the -std=gnu99 -mcpu=power7 -O3
-funroll-loops -g options, line 2010 of lra.c raises an assertion failure for
one insn.  If I don't do vectorization and loop unrolling, the file compiles
fine.  I went into the debugger and printed up the insn that is causing the
failure:

(insn 4489 4488 4490 320 (set (reg:CC 65 lr [1369])
        (reg:CC 65 lr [1369])) 360 {*movcc_internal1}
     (expr_list:REG_DEAD (reg:CC 65 lr [1369])
        (nil)))

I walked into the previoius and next insns, and they too were generating nop
moves of a register to itself.  However in this case, the rs6000 port is
inconsistant, in that it says the LR register can hold CC/CCUNS/CCFP values,
but the movcc insn doesn't allow the LR/CTR.

I suspect this is a holdover from the original register allocator, and I'm
going to play with restricting the types that can go in LR/CTR.  I know in the
4.4 time frame, I initially tried to disallow floating point values in those
registers, and found one spec benchmark was trying to spill from a FP register
to the LR register (which is not a good idea), and I had to go back to allow
such bogus uses.
Michael Meissner - Dec. 5, 2012, 10:51 p.m.
This is the file that causes the boostrap to fail if --enable-decimal-float is
used as a configuration option on the LRA branch.  You need to compile this in
32-bit mode with either -mcpu=power7, -mcpu=power6, or -mhard-dfp to enable the
decimal instructions.  I used -m32 -O2 -O3 -mcpu=power6.

The error message is:

/home/meissner/fsf-src/lra/libgcc/dfp-bit.c:248:1: error: unrecognizable insn:
 }
 ^
(insn 291 25 289 2 (set (reg:DD 199)
        (subreg:DD (reg:SD 33 1 [ arg_a ]) 0)) /home/meissner/fsf-src/lra/libgcc/dfp-bit.c:104 -1
     (expr_list:REG_DEAD (reg:DI 33 1)
        (nil)))
/home/meissner/fsf-src/lra/libgcc/dfp-bit.c:248:1: internal compiler error: in extract_insn, at recog.c:2152
0x107b198b _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /home/meissner/fsf-src/lra/gcc/rtl-error.c:110
0x107b19f7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /home/meissner/fsf-src/lra/gcc/rtl-error.c:118
0x1074d09b extract_insn(rtx_def*)
        /home/meissner/fsf-src/lra/gcc/recog.c:2152
0x10664a03 check_rtl
        /home/meissner/fsf-src/lra/gcc/lra.c:2009
0x106674bf lra(_IO_FILE*)
        /home/meissner/fsf-src/lra/gcc/lra.c:2371
0x105ebd03 do_reload
        /home/meissner/fsf-src/lra/gcc/ira.c:4624
0x105ebfef rest_of_handle_reload
        /home/meissner/fsf-src/lra/gcc/ira.c:4737
Vladimir Makarov - Dec. 6, 2012, 3:31 a.m.
On 12-12-05 5:51 PM, Michael Meissner wrote:
> This is the file that causes the boostrap to fail if --enable-decimal-float is
> used as a configuration option on the LRA branch.  You need to compile this in
> 32-bit mode with either -mcpu=power7, -mcpu=power6, or -mhard-dfp to enable the
> decimal instructions.  I used -m32 -O2 -O3 -mcpu=power6.
I managed to reproduce it.  I'll look at this tomorrow.
Thanks, Mike.
> The error message is:
>
> /home/meissner/fsf-src/lra/libgcc/dfp-bit.c:248:1: error: unrecognizable insn:
>   }
>   ^
> (insn 291 25 289 2 (set (reg:DD 199)
>          (subreg:DD (reg:SD 33 1 [ arg_a ]) 0)) /home/meissner/fsf-src/lra/libgcc/dfp-bit.c:104 -1
>       (expr_list:REG_DEAD (reg:DI 33 1)
>          (nil)))
>

Patch

Index: rtl.h
===================================================================
--- rtl.h	(revision 194149)
+++ rtl.h	(working copy)
@@ -267,7 +267,8 @@  struct GTY((chain_next ("RTX_NEXT (&%h)"
      1 in a SET that is for a return.
      In a CODE_LABEL, part of the two-bit alternate entry field.
      1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c.
-     1 in a VALUE is SP_BASED_VALUE_P in cselib.c.  */
+     1 in a VALUE is SP_BASED_VALUE_P in cselib.c.
+     1 in a SUBREG generated by LRA for matching reload.  */
   unsigned int jump : 1;
   /* In a CODE_LABEL, part of the two-bit alternate entry field.
      1 in a MEM if it cannot trap.
@@ -1415,6 +1416,11 @@  do {									\
   ((RTL_FLAG_CHECK1("SUBREG_PROMOTED_UNSIGNED_P", (RTX), SUBREG)->volatil) \
    ? -1 : (int) (RTX)->unchanging)
 
+/* True if the subreg was generated by LRA for matching reload.  Such
+   subregs are valid only during LRA.  */
+#define SUBREG_MATCH_RELOAD_P(RTX)	\
+  (RTL_FLAG_CHECK1("SUBREG_MATCH_RELOAD_P", (RTX), SUBREG)->jump)
+
 /* Access various components of an ASM_OPERANDS rtx.  */
 
 #define ASM_OPERANDS_TEMPLATE(RTX) XCSTR (RTX, 0, ASM_OPERANDS)
Index: lra-constraints.c
===================================================================
--- lra-constraints.c	(revision 194149)
+++ lra-constraints.c	(working copy)
@@ -136,10 +136,11 @@ 
    reload insns.  */
 static int bb_reload_num;
 
-/* The current insn being processed and corresponding its data (basic
-   block, the insn data, the insn static data, and the mode of each
-   operand).  */
+/* The current insn being processed and corresponding its single set
+   (NULL otherwise), its data (basic block, the insn data, the insn
+   static data, and the mode of each operand).  */
 static rtx curr_insn;
+static rtx curr_insn_set;
 static basic_block curr_bb;
 static lra_insn_recog_data_t curr_id;
 static struct lra_static_insn_data *curr_static_id;
@@ -684,6 +685,7 @@  match_reload (signed char out, signed ch
 	    new_out_reg = gen_lowpart_SUBREG (outmode, reg);
 	  else
 	    new_out_reg = gen_rtx_SUBREG (outmode, reg, 0);
+	  SUBREG_MATCH_RELOAD_P (new_out_reg) = 1;
 	  /* If the input reg is dying here, we can use the same hard
 	     register for REG and IN_RTX.  We do it only for original
 	     pseudos as reload pseudos can die although original
@@ -707,6 +709,7 @@  match_reload (signed char out, signed ch
 	     it at the end of LRA work.  */
 	  clobber = emit_clobber (new_out_reg);
 	  LRA_TEMP_CLOBBER_P (PATTERN (clobber)) = 1;
+	  SUBREG_MATCH_RELOAD_P (new_in_reg) = 1;
 	  if (GET_CODE (in_rtx) == SUBREG)
 	    {
 	      rtx subreg_reg = SUBREG_REG (in_rtx);
@@ -848,23 +851,22 @@  emit_spill_move (bool to_p, rtx mem_pseu
 }
 
 /* Process a special case insn (register move), return true if we
-   don't need to process it anymore.  Return that RTL was changed
-   through CHANGE_P and macro SECONDARY_MEMORY_NEEDED says to use
-   secondary memory through SEC_MEM_P.	*/
+   don't need to process it anymore.  INSN should be a single set
+   insn.  Set up that RTL was changed through CHANGE_P and macro
+   SECONDARY_MEMORY_NEEDED says to use secondary memory through
+   SEC_MEM_P.  */
 static bool
 check_and_process_move (bool *change_p, bool *sec_mem_p)
 {
   int sregno, dregno;
-  rtx set, dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg;
+  rtx dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg;
   enum reg_class dclass, sclass, secondary_class;
   enum machine_mode sreg_mode;
   secondary_reload_info sri;
 
-  *sec_mem_p = *change_p = false;
-  if ((set = single_set (curr_insn)) == NULL)
-    return false;
-  dreg = dest = SET_DEST (set);
-  sreg = src = SET_SRC (set);
+  lra_assert (curr_insn_set != NULL_RTX);
+  dreg = dest = SET_DEST (curr_insn_set);
+  sreg = src = SET_SRC (curr_insn_set);
   /* Quick check on the right move insn which does not need
      reloads.  */
   if ((dclass = get_op_class (dest)) != NO_REGS
@@ -991,7 +993,7 @@  check_and_process_move (bool *change_p,
       if (GET_CODE (src) == SUBREG)
 	SUBREG_REG (src) = new_reg;
       else
-	SET_SRC (set) = new_reg;
+	SET_SRC (curr_insn_set) = new_reg;
     }
   else
     {
@@ -1188,7 +1190,10 @@  simplify_operand_subreg (int nop, enum m
        && (hard_regno_nregs[hard_regno][GET_MODE (reg)]
 	   >= hard_regno_nregs[hard_regno][mode])
        && simplify_subreg_regno (hard_regno, GET_MODE (reg),
-				 SUBREG_BYTE (operand), mode) < 0)
+				 SUBREG_BYTE (operand), mode) < 0
+       /* Don't reload subreg for matching reload.  It is actually
+	  valid subreg in LRA.  */
+       && ! SUBREG_MATCH_RELOAD_P (operand))
       || CONSTANT_P (reg) || GET_CODE (reg) == PLUS || MEM_P (reg))
     {
       enum op_type type = curr_static_id->operand[nop].type;
@@ -1372,7 +1377,7 @@  process_alt_operands (int only_alternati
   for (nalt = 0; nalt < n_alternatives; nalt++)
     {
       /* Loop over operands for one constraint alternative.  */
-#ifdef HAVE_ATTR_enabled
+#if HAVE_ATTR_enabled
       if (curr_id->alternative_enabled_p != NULL
 	  && ! curr_id->alternative_enabled_p[nalt])
 	continue;
@@ -1988,6 +1993,22 @@  process_alt_operands (int only_alternati
 	  if (early_clobber_p && operand_reg[nop] != NULL_RTX)
 	    early_clobbered_nops[early_clobbered_regs_num++] = nop;
 	}
+      if (curr_insn_set != NULL_RTX && n_operands == 2
+	  && ((! curr_alt_win[0] && curr_alt_win[1]
+	       && REG_P (no_subreg_reg_operand[1])
+	       && in_class_p (no_subreg_reg_operand[1], curr_alt[0], NULL))
+	      || (curr_alt_win[0] && ! curr_alt_win[1]
+		  && REG_P (no_subreg_reg_operand[0])
+		  && in_class_p (no_subreg_reg_operand[0], curr_alt[1], NULL)
+		  && (! CONST_POOL_OK_P (curr_operand_mode[1],
+					 no_subreg_reg_operand[1])
+		      || (targetm.preferred_reload_class
+			  (no_subreg_reg_operand[1],
+			   (enum reg_class) curr_alt[1]) != NO_REGS)))))
+	/* We have a move insn and a new reload insn will be similar
+	   to the current insn.  We should avoid such situation as it
+	   results in LRA cycling.  */
+	overall += LRA_MAX_REJECT;
       ok_p = true;
       curr_alt_dont_inherit_ops_num = 0;
       for (nop = 0; nop < early_clobbered_regs_num; nop++)
@@ -2334,24 +2355,23 @@  process_address (int nop, rtx *before, r
   /* There are three cases where the shape of *AD.INNER may now be invalid:
 
      1) the original address was valid, but either elimination or
-	equiv_address_substitution applied a displacement that made
-	it invalid.
+	equiv_address_substitution was applied and that made
+	the address invalid.
 
      2) the address is an invalid symbolic address created by
 	force_const_to_mem.
 
      3) the address is a frame address with an invalid offset.
 
-     All these cases involve a displacement and a non-autoinc address,
-     so there is no point revalidating other types.  */
-  if (ad.disp == NULL || ad.autoinc_p || valid_address_p (&ad))
+     All these cases involve a non-autoinc address, so there is no
+     point revalidating other types.  */
+  if (ad.autoinc_p || valid_address_p (&ad))
     return change_p;
 
   /* Any index existed before LRA started, so we can assume that the
      presence and shape of the index is valid.  */
   push_to_sequence (*before);
-  gcc_assert (ad.segment == NULL);
-  gcc_assert (ad.disp == ad.disp_term);
+  lra_assert (ad.disp == ad.disp_term);
   if (ad.base == NULL)
     {
       if (ad.index == NULL)
@@ -2359,25 +2379,25 @@  process_address (int nop, rtx *before, r
 	  int code = -1;
 	  enum reg_class cl = base_reg_class (ad.mode, ad.as,
 					      SCRATCH, SCRATCH);
-	  rtx disp = *ad.disp;
+	  rtx addr = *ad.inner;
 
-	  new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "disp");
+	  new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "addr");
 #ifdef HAVE_lo_sum
 	  {
 	    rtx insn;
 	    rtx last = get_last_insn ();
 
-	    /* disp => lo_sum (new_base, disp), case (2) above.  */
+	    /* addr => lo_sum (new_base, addr), case (2) above.  */
 	    insn = emit_insn (gen_rtx_SET
 			      (VOIDmode, new_reg,
-			       gen_rtx_HIGH (Pmode, copy_rtx (disp))));
+			       gen_rtx_HIGH (Pmode, copy_rtx (addr))));
 	    code = recog_memoized (insn);
 	    if (code >= 0)
 	      {
-		*ad.disp = gen_rtx_LO_SUM (Pmode, new_reg, disp);
+		*ad.inner = gen_rtx_LO_SUM (Pmode, new_reg, addr);
 		if (! valid_address_p (ad.mode, *ad.outer, ad.as))
 		  {
-		    *ad.disp = disp;
+		    *ad.inner = addr;
 		    code = -1;
 		  }
 	      }
@@ -2387,9 +2407,9 @@  process_address (int nop, rtx *before, r
 #endif
 	  if (code < 0)
 	    {
-	      /* disp => new_base, case (2) above.  */
-	      lra_emit_move (new_reg, disp);
-	      *ad.disp = new_reg;
+	      /* addr => new_base, case (2) above.  */
+	      lra_emit_move (new_reg, addr);
+	      *ad.inner = new_reg;
 	    }
 	}
       else
@@ -2601,7 +2621,10 @@  curr_insn_transform (void)
   no_input_reloads_p = no_output_reloads_p = false;
   goal_alt_number = -1;
 
-  if (check_and_process_move (&change_p, &sec_mem_p))
+  change_p = sec_mem_p = false;
+  curr_insn_set = single_set (curr_insn);
+  if (curr_insn_set != NULL_RTX
+      && check_and_process_move (&change_p, &sec_mem_p))
     return change_p;
 
   /* JUMP_INSNs and CALL_INSNs are not allowed to have any output
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 194149)
+++ config/rs6000/rs6000.c	(working copy)
@@ -5523,7 +5523,7 @@  legitimate_lo_sum_address_p (enum machin
       if (GET_MODE_SIZE (mode) > UNITS_PER_WORD
 	  && !(/* ??? Assume floating point reg based on mode?  */
 	       TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-	       && (mode == DFmode || mode == DDmode)))
+	       && (mode == DFmode || mode == DDmode || mode == TFmode)))
 	return false;
 
       return CONSTANT_P (x) || toc_ok_p;