rs6000: Optimise SImode cstore on 64-bit

Message ID 848d1b5499f57ba7bd5271312f28eacdd91777ad.1449020085.git.segher@kernel.crashing.org
State New

Commit Message

Segher Boessenkool Dec. 2, 2015, 1:55 a.m. UTC
On 64-bit we can do comparisons of 32-bit values by extending those
values to 64-bit, subtracting them, and then getting the high bit of
the result.  For registers this is always cheaper than using the carry
bit sequence; and if the comparison involves a constant, this is cheaper
than the sequence we previously generated in half of the cases (and the
same cost in the other cases).
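
The extend/subtract/high-bit idea described above can be sketched in plain C. This is only an illustration of the arithmetic, not the RTL the patch emits; the function names (lt_signed, lt_unsigned, ge_signed, gt_unsigned, check_examples) are made up for this sketch. It relies on the difference of two extended 32-bit values always fitting in 64 bits, so bit 63 of the result is a reliable sign bit.

```c
#include <assert.h>
#include <stdint.h>

/* Signed a < b: sign-extend both operands to 64 bits, subtract,
   and take bit 63 of the difference.  */
static uint32_t lt_signed(int32_t a, int32_t b)
{
    uint64_t diff = (uint64_t)(int64_t)a - (uint64_t)(int64_t)b;
    return (uint32_t)(diff >> 63);
}

/* Unsigned a < b: the same sequence, but with zero-extension.  */
static uint32_t lt_unsigned(uint32_t a, uint32_t b)
{
    uint64_t diff = (uint64_t)a - (uint64_t)b;
    return (uint32_t)(diff >> 63);
}

/* a >= b is the inverse of a < b, so flip the low bit.  */
static uint32_t ge_signed(int32_t a, int32_t b)
{
    return lt_signed(a, b) ^ 1u;
}

/* a > b is b < a, so swap the operands and reuse the LT form.  */
static uint32_t gt_unsigned(uint32_t a, uint32_t b)
{
    return lt_unsigned(b, a);
}

/* A few spot checks, including the extreme operand values.  */
static void check_examples(void)
{
    assert(lt_signed(-1, 0) == 1);
    assert(lt_signed(INT32_MIN, INT32_MAX) == 1);
    assert(lt_signed(INT32_MAX, INT32_MIN) == 0);
    assert(lt_unsigned(0, UINT32_MAX) == 1);
    assert(lt_unsigned(UINT32_MAX, 0) == 0);
    assert(ge_signed(5, 5) == 1);
    assert(gt_unsigned(3, 2) == 1);
}
```

The same trick would not be safe on the 32-bit values directly, since a 32-bit subtraction can overflow and give the wrong sign; the extra 32 bits of headroom are what make it work.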

After this, the only sequence left that is using the mfcr insn is the
one doing signed comparison of Pmode registers.

Testing in progress.  Okay for trunk if that succeeds?


Segher


2015-12-01  Segher Boessenkool  <segher@kernel.crashing.org>

	* config/rs6000/rs6000.md (cstore_si_as_di): New expander.
	(cstore<mode>4): Use it.

---
 gcc/config/rs6000/rs6000.md | 52 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

Comments

David Edelsohn Dec. 2, 2015, 1:59 a.m. UTC | #1
On Tue, Dec 1, 2015 at 8:55 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On 64-bit we can do comparisons of 32-bit values by extending those
> values to 64-bit, subtracting them, and then getting the high bit of
> the result.  For registers this is always cheaper than using the carry
> bit sequence; and if the comparison involves a constant, this is cheaper
> than the sequence we previously generated in half of the cases (and the
> same cost in the other cases).
>
> After this, the only sequence left that is using the mfcr insn is the
> one doing signed comparison of Pmode registers.
>
> Testing in progress.  Okay for trunk if that succeeds?

Okay.

Thanks, David

Alan Modra Dec. 2, 2015, 3:20 a.m. UTC | #2
On Wed, Dec 02, 2015 at 01:55:17AM +0000, Segher Boessenkool wrote:
> +  emit_insn (gen_subdi3 (tmp, op1, op2));
> +  emit_insn (gen_lshrdi3 (tmp2, tmp, GEN_INT (63)));
> +  emit_insn (gen_anddi3 (tmp3, tmp2, const1_rtx));

Why the AND?  The top 63 bits are already clear.

Segher Boessenkool Dec. 2, 2015, 3:39 a.m. UTC | #3
On Wed, Dec 02, 2015 at 01:50:46PM +1030, Alan Modra wrote:
> On Wed, Dec 02, 2015 at 01:55:17AM +0000, Segher Boessenkool wrote:
> > +  emit_insn (gen_subdi3 (tmp, op1, op2));
> > +  emit_insn (gen_lshrdi3 (tmp2, tmp, GEN_INT (63)));
> > +  emit_insn (gen_anddi3 (tmp3, tmp2, const1_rtx));
> 
> Why the AND?  The top 63 bits are already clear.

Ha, yes.  Thanks.  In a previous version I shifted by less, in which
case GCC is smart enough to make it 63 anyway.  63 is always correct
as well, and simpler because you don't need the AND.  But I forgot
to take it out :-)
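
To illustrate the point about the AND: a logical right shift of a 64-bit value by 63 can only yield 0 or 1, so masking the result with 1 changes nothing. A minimal C sketch follows; the helper names (high_bit, high_bit_masked, check_and_is_redundant) are hypothetical and only model the lshrdi3/anddi3 pair quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Logical shift right by 63: only the old bit 63 survives, in bit 0.  */
static uint64_t high_bit(uint64_t x)
{
    return x >> 63;
}

/* The same shift followed by the AND with 1 from the posted patch.  */
static uint64_t high_bit_masked(uint64_t x)
{
    return (x >> 63) & 1;
}

/* Compare both forms on a handful of boundary values.  */
static void check_and_is_redundant(void)
{
    const uint64_t samples[] = {
        UINT64_C(0),
        UINT64_C(1),
        UINT64_C(0x7fffffffffffffff),
        UINT64_C(0x8000000000000000),
        UINT64_C(0xffffffff00000001),
        UINT64_MAX,
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(high_bit(samples[i]) == high_bit_masked(samples[i]));
}
```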


Segher

Patch

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a500d67..a599372 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -10564,6 +10564,53 @@  (define_expand "cstore<mode>4_unsigned"
   DONE;
 })
 
+(define_expand "cstore_si_as_di"
+  [(use (match_operator 1 "unsigned_comparison_operator"
+         [(match_operand:SI 2 "gpc_reg_operand")
+          (match_operand:SI 3 "reg_or_short_operand")]))
+   (clobber (match_operand:SI 0 "register_operand"))]
+  ""
+{
+  int uns_flag = unsigned_comparison_operator (operands[1], VOIDmode) ? 1 : 0;
+  enum rtx_code cond_code = signed_condition (GET_CODE (operands[1]));
+
+  rtx op1 = gen_reg_rtx (DImode);
+  rtx op2 = gen_reg_rtx (DImode);
+  convert_move (op1, operands[2], uns_flag);
+  convert_move (op2, operands[3], uns_flag);
+
+  if (cond_code == GT || cond_code == LE)
+    {
+      cond_code = swap_condition (cond_code);
+      std::swap (op1, op2);
+    }
+
+  rtx tmp = gen_reg_rtx (DImode);
+  rtx tmp2 = gen_reg_rtx (DImode);
+  rtx tmp3 = gen_reg_rtx (DImode);
+  emit_insn (gen_subdi3 (tmp, op1, op2));
+  emit_insn (gen_lshrdi3 (tmp2, tmp, GEN_INT (63)));
+  emit_insn (gen_anddi3 (tmp3, tmp2, const1_rtx));
+
+  rtx tmp4;
+  switch (cond_code)
+    {
+    default:
+      gcc_unreachable ();
+    case LT:
+      tmp4 = tmp3;
+      break;
+    case GE:
+      tmp4 = gen_reg_rtx (DImode);
+      emit_insn (gen_xordi3 (tmp4, tmp3, const1_rtx));
+      break;
+    }
+
+  convert_move (operands[0], tmp4, 1);
+
+  DONE;
+})
+
 (define_expand "cstore<mode>4_signed_imm"
   [(use (match_operator 1 "signed_comparison_operator"
          [(match_operand:GPR 2 "gpc_reg_operand")
@@ -10688,6 +10735,11 @@  (define_expand "cstore<mode>4"
     emit_insn (gen_cstore<mode>4_unsigned (operands[0], operands[1],
 					   operands[2], operands[3]));
 
+  /* For comparisons smaller than Pmode we can cheaply do things in Pmode.  */
+  else if (<MODE>mode == SImode && Pmode == DImode)
+    emit_insn (gen_cstore_si_as_di (operands[0], operands[1],
+				    operands[2], operands[3]));
+
   /* For signed comparisons against a constant, we can do some simple
      bit-twiddling.  */
   else if (signed_comparison_operator (operands[1], VOIDmode)