Patchwork PowerPC shrink-wrap support benchmark gains

Submitter Alan Modra
Date Sept. 26, 2011, 2 p.m.
Message ID <20110926140005.GK10321@bubble.grove.modra.org>
Permalink /patch/116426/
State New

Comments

Alan Modra - Sept. 26, 2011, 2 p.m.
This patch increases opportunities for shrink-wrapping.  With this
applied together with http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00754.html,
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01499.html, the fix
mentioned in http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01016.html,
and my powerpc changes, I finally see some gains in CPU2006.

453.povray shows a massive 12% gain in one particular setup, while
the other benchmarks in the suite mostly show small improvements.
Some of this gain is likely due to having an extra GPR to play with
before we start saving regs.

	* gcc/ifcvt.c (dead_or_predicable): Disable if-conversion when
	doing so is likely to kill a shrink-wrapping opportunity.
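
As an illustration of the kind of function this targets, here is a
minimal sketch.  It is not taken from CPU2006 or from the testsuite,
and the names deref_after_call and note_call are hypothetical.  It
assumes an ABI in which the first argument register is also the return
value register, and a compiler that does not inline note_call.  The
early return only has to set the return register, while the other path
keeps P live across a call and so wants P in a call-saved register.
If "return 0" is hoisted above the test, it overwrites the incoming P,
so the copy of P (and hence the prologue) has to move into the first
block.

```c
#include <stddef.h>

/* Hypothetical counter, only here so the call has a visible effect.  */
int calls_made;

/* Hypothetical stand-in for any out-of-line work.  */
void
note_call (void)
{
  calls_made++;
}

int
deref_after_call (const int *p)
{
  if (p == NULL)
    return 0;           /* fast path: only sets the return register */

  note_call ();         /* slow path: P must survive this call */
  return *p;
}
```

With shrink-wrapping the hope is that the NULL case returns without
ever running the prologue.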

Patch

diff -urp -x'*~' -x'*.orig' -x'*.rej' -x.svn gcc-alanshrink1/gcc/ifcvt.c gcc-current/gcc/ifcvt.c
--- gcc-alanshrink1/gcc/ifcvt.c	2011-09-26 15:29:14.000000000 +0930
+++ gcc-current/gcc/ifcvt.c	2011-09-26 14:24:53.000000000 +0930
@@ -4138,6 +4138,64 @@  dead_or_predicable (basic_block test_bb,
       FOR_BB_INSNS (merge_bb, insn)
 	if (NONDEBUG_INSN_P (insn))
 	  df_simulate_find_defs (insn, merge_set);
+
+      /* If shrink-wrapping, disable this optimization when test_bb is
+	 the first basic block and merge_bb exits.  The idea is to not
+	 move code setting up a return register as that may clobber a
+	 register used to pass function parameters, which then must be
+	 saved in callee-saved regs.  A callee-saved reg requires the
+	 prologue, killing a shrink-wrap opportunity.  */
+      if ((flag_shrink_wrap && !epilogue_completed)
+	  && ENTRY_BLOCK_PTR->next_bb == test_bb
+	  && single_succ_p (new_dest)
+	  && single_succ (new_dest) == EXIT_BLOCK_PTR
+	  && bitmap_intersect_p (df_get_live_in (new_dest), merge_set))
+	{
+	  regset return_regs;
+	  unsigned int i;
+
+	  return_regs = BITMAP_ALLOC (&reg_obstack);
+
+	  /* Start off with the intersection of regs used to pass
+	     params and regs used to return values.  */
+	  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+	    if (FUNCTION_ARG_REGNO_P (i)
+		&& FUNCTION_VALUE_REGNO_P (i))
+	      bitmap_set_bit (return_regs, INCOMING_REGNO (i));
+
+	  bitmap_and_into (return_regs, df_get_live_out (ENTRY_BLOCK_PTR));
+	  bitmap_and_into (return_regs, df_get_live_in (EXIT_BLOCK_PTR));
+	  if (!bitmap_empty_p (return_regs))
+	    {
+	      FOR_BB_INSNS_REVERSE (new_dest, insn)
+		if (NONDEBUG_INSN_P (insn))
+		  {
+		    df_ref *def_rec;
+		    unsigned int uid = INSN_UID (insn);
+
+		    /* If this insn sets any reg in return_regs..  */
+		    for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++)
+		      {
+			df_ref def = *def_rec;
+			unsigned r = DF_REF_REGNO (def);
+
+			if (bitmap_bit_p (return_regs, r))
+			  break;
+		      }
+		    /* ..then add all reg uses to the set of regs
+		       we're interested in.  */
+		    if (*def_rec)
+		      df_simulate_uses (insn, return_regs);
+		  }
+	      if (bitmap_intersect_p (merge_set, return_regs))
+		{
+		  BITMAP_FREE (return_regs);
+		  BITMAP_FREE (merge_set);
+		  return FALSE;
+		}
+	    }
+	  BITMAP_FREE (return_regs);
+	}
     }
 
  no_body:
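
The check added above can be modelled outside GCC with plain bit masks.
The following is a toy sketch only: struct toy_insn and hoist_blocked
are hypothetical names, none of this is GCC's df API, and it assumes
that adding an insn's uses to the set is all that df_simulate_uses
contributes here.  Bit N of each mask stands for hard register N, and
return_regs is expected to arrive already reduced to the live registers
that both pass an argument and return a value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction: one bit per hard register.  */
struct toy_insn
{
  uint32_t defs;        /* registers written */
  uint32_t uses;        /* registers read */
};

/* Return true if moving code that sets MERGE_SET into the first block
   should be refused.  BB holds the N insns of the exiting block.  */
bool
hoist_blocked (const struct toy_insn *bb, size_t n,
               uint32_t return_regs, uint32_t merge_set)
{
  if (return_regs == 0)
    return false;

  /* Walk the block backwards.  Any insn that sets a register we care
     about makes its inputs interesting as well.  */
  for (size_t i = n; i-- > 0; )
    if (bb[i].defs & return_regs)
      return_regs |= bb[i].uses;

  /* Refuse if the moved code would set one of those registers.  */
  return (merge_set & return_regs) != 0;
}
```

For a block "r9 = r4; r3 = r9" with r3 as the return register, the walk
grows the set to {r3, r9, r4}, so moving code that sets r4 is refused
while moving code that sets only r5 is not.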