diff mbox

make ifcvt try harder to use conditional moves

Message ID 20101018200551.GG2806@nightcrawler
State New
Headers show

Commit Message

Nathan Froyd Oct. 18, 2010, 8:05 p.m. UTC
The patch below makes ifcvt try a little harder to use conditional
moves.  If we have (pseudo-RTL):

  if VAL == 0
    goto L1
  else
    goto L2

L1:
  rX:M1 = (subreg:M1 rT:M2 B)
  goto L3

L2:
  rX:M1 = (subreg:M1 rF:M2 B)
  goto L3

L3:
  rY:M2 = (zero_extend:M2 rX:M1)	# or similar

ifcvt tries to do a conditional move in machine mode M1, which the
machine might not have.  If it has a conditional move in mode M2,
however, we can do:

  rZ:M2 = cmov VAL == 0, rT:M2, rF:M2
  rX:M1 = (subreg:M1 rZ:M2 B)
  rY:M2 = (zero_extend:M2 rX:M1)

which eliminates branches and may enable further simplications of subreg
hackery.

Built cross to powerpc-eabispe (has cmov) and bootstrap on
x86_64-linux-gnu, testing in progress.  OK to commit?

-Nathan

	* ifcvt.c (noce_emit_cmove): If both of the values are SUBREGs, try
	emitting the conditional move in the inner mode of the SUBREG.

Comments

Eric Botcazou Oct. 20, 2010, 5:53 p.m. UTC | #1
> 	* ifcvt.c (noce_emit_cmove): If both of the values are SUBREGs, try
> 	emitting the conditional move in the inner mode of the SUBREG.

OK for mainline, modulo:

> +     If we can't create new pseudos, though, don't bother.  */
> +  if (!can_create_pseudo_p ())
> +    return NULL_RTX;

Write "if (reload_completed)" for the sake of consistency; no need to retest.
Nathan Froyd Oct. 21, 2010, 1:13 p.m. UTC | #2
On Wed, Oct 20, 2010 at 07:53:22PM +0200, Eric Botcazou wrote:
> > 	* ifcvt.c (noce_emit_cmove): If both of the values are SUBREGs, try
> > 	emitting the conditional move in the inner mode of the SUBREG.
> 
> OK for mainline, modulo:
> 
> > +     If we can't create new pseudos, though, don't bother.  */
> > +  if (!can_create_pseudo_p ())
> > +    return NULL_RTX;
> 
> Write "if (reload_completed)" for the sake of consistency; no need to retest.

I did this, but am confused: we have this handy macro in rtl.h for such
testing; why not use it in this case?  I thought the use of the macro
was preferred.

-Nathan
Eric Botcazou Oct. 21, 2010, 2:05 p.m. UTC | #3
> I did this, but am confused: we have this handy macro in rtl.h for such
> testing; why not use it in this case?  I thought the use of the macro
> was preferred.

Yes, but ifcvt.c uses reload_completed throughout the code, no need to make 
the reader scratch his head as to why sometimes the former is used and 
sometimes the latter.
diff mbox

Patch

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index eb96182..c86418c 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1337,6 +1337,9 @@  static rtx
 noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
 		 rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue)
 {
+  rtx target;
+  int unsignedp;
+
   /* If earliest == jump, try to build the cmove insn directly.
      This is helpful when combine has created some complex condition
      (like for alpha's cmovlbs) that we can't hope to regenerate
@@ -1371,10 +1374,62 @@  noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
     return NULL_RTX;
 
 #if HAVE_conditional_move
-  return emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
-				vtrue, vfalse, GET_MODE (x),
-			        (code == LTU || code == GEU
-				 || code == LEU || code == GTU));
+  unsignedp = (code == LTU || code == GEU
+	       || code == LEU || code == GTU);
+
+  target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
+				  vtrue, vfalse, GET_MODE (x),
+				  unsignedp);
+  if (target)
+    return target;
+
+  /* We might be faced with a situation like:
+
+     x = (reg:M TARGET)
+     vtrue = (subreg:M (reg:N VTRUE) BYTE)
+     vfalse = (subreg:M (reg:N VFALSE) BYTE)
+
+     We can't do a conditional move in mode M, but it's possible that we
+     could do a conditional move in mode N instead and take a subreg of
+     the result.
+
+     If we can't create new pseudos, though, don't bother.  */
+  if (!can_create_pseudo_p ())
+    return NULL_RTX;
+
+  if (GET_CODE (vtrue) == SUBREG && GET_CODE (vfalse) == SUBREG)
+    {
+      rtx reg_vtrue = SUBREG_REG (vtrue);
+      rtx reg_vfalse = SUBREG_REG (vfalse);
+      unsigned int byte_vtrue = SUBREG_BYTE (vtrue);
+      unsigned int byte_vfalse = SUBREG_BYTE (vfalse);
+      rtx promoted_target;
+
+      if (GET_MODE (reg_vtrue) != GET_MODE (reg_vfalse)
+	  || byte_vtrue != byte_vfalse
+	  || (SUBREG_PROMOTED_VAR_P (vtrue)
+	      != SUBREG_PROMOTED_VAR_P (vfalse))
+	  || (SUBREG_PROMOTED_UNSIGNED_P (vtrue)
+	      != SUBREG_PROMOTED_UNSIGNED_P (vfalse)))
+	return NULL_RTX;
+
+      promoted_target = gen_reg_rtx (GET_MODE (reg_vtrue));
+
+      target = emit_conditional_move (promoted_target, code, cmp_a, cmp_b,
+				      VOIDmode, reg_vtrue, reg_vfalse,
+				      GET_MODE (reg_vtrue), unsignedp);
+      /* Nope, couldn't do it in that mode either.  */
+      if (!target)
+	return NULL_RTX;
+
+      target = gen_rtx_SUBREG (GET_MODE (vtrue), promoted_target, byte_vtrue);
+      SUBREG_PROMOTED_VAR_P (target) = SUBREG_PROMOTED_VAR_P (vtrue);
+      SUBREG_PROMOTED_UNSIGNED_SET (target, SUBREG_PROMOTED_UNSIGNED_P (vtrue));
+      emit_move_insn (x, target);
+      return x;
+    }
+  else
+    return NULL_RTX;
 #else
   /* We'll never get here, as noce_process_if_block doesn't call the
      functions involved.  Ifdef code, however, should be discouraged