Patchwork [rs6000] Enable scalar shifts of vectors

Submitter Richard Henderson
Date Oct. 12, 2011, 10:32 p.m.
Message ID <4E96158E.5030106@redhat.com>
Permalink /patch/119315/
State New

Comments

Richard Henderson - Oct. 12, 2011, 10:32 p.m.
I suppose technically the middle-end could be improved to implement
ashl<mode> as vashl<mode> by broadcasting the scalar, but Altivec
is the only extant SIMD ISA that would make use of this.  All of
the others can arrange for constant shifts to be encoded into the
insn, and so implement the ashl<mode> named pattern.

Tested on ppc64-linux, --with-cpu=G5.

Ok?


r~


	* config/rs6000/rs6000.c (rs6000_expand_vector_broadcast): New.
	* config/rs6000/rs6000-protos.h: Update.
	* config/rs6000/vector.md (ashl<VEC_I>3): New.
	(lshr<VEC_I>3, ashr<VEC_I>3): New.
commit 63a6b475bcde403cc4e220827370e6ecea9aad33
Author: Richard Henderson <rth@twiddle.net>
Date:   Mon Oct 10 12:34:59 2011 -0700

    rs6000: Implement scalar shifts of vectors.
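
The broadcast idea in the message above can be sketched in plain C. This is only a scalar model, not GCC code: the type `v4u32` and the functions `broadcast_model`, `vashl_model` and `ashl_model` are hypothetical names chosen for illustration, and a four-lane vector of 32-bit elements is assumed.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 4 };

/* Hypothetical stand-in for a 4 x 32-bit vector register. */
typedef struct { uint32_t lane[LANES]; } v4u32;

/* Copy one scalar into every lane (a "splat"). */
static v4u32 broadcast_model(uint32_t s)
{
    v4u32 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = s;
    return r;
}

/* Vector-by-vector shift: lane i of x is shifted by lane i of counts.
   Counts are assumed to be below 32 so the C shift is well defined. */
static v4u32 vashl_model(v4u32 x, v4u32 counts)
{
    v4u32 r;
    for (int i = 0; i < LANES; ++i) {
        assert(counts.lane[i] < 32);
        r.lane[i] = x.lane[i] << counts.lane[i];
    }
    return r;
}

/* Vector-by-scalar shift built from the two pieces above:
   broadcast the count, then reuse the vector-by-vector shift. */
static v4u32 ashl_model(v4u32 x, uint32_t count)
{
    return vashl_model(x, broadcast_model(count));
}
```

With this model, `ashl_model` applied to lanes {1, 2, 3, 4} and a count of 3 gives {8, 16, 24, 32}, the same result as calling `vashl_model` with the count vector {3, 3, 3, 3}.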
David Miller - Oct. 12, 2011, 10:37 p.m.
From: Richard Henderson <rth@redhat.com>
Date: Wed, 12 Oct 2011 15:32:46 -0700

> I suppose technically the middle-end could be improved to implement
> ashl<mode> as vashl<mode> by broadcasting the scalar, but Altivec
> is the only extant SIMD ISA that would make use of this.  All of
> the others can arrange for constant shifts to be encoded into the
> insn, and so implement the ashl<mode> named pattern.

I'm pretty sure Sparc's VIS3 can do this too, see the
'<vis3_shift_insn><vbits>_vis' patterns in sparc.md
Richard Henderson - Oct. 12, 2011, 10:49 p.m.
On 10/12/2011 03:37 PM, David Miller wrote:
> From: Richard Henderson <rth@redhat.com>
> Date: Wed, 12 Oct 2011 15:32:46 -0700
> 
>> I suppose technically the middle-end could be improved to implement
>> ashl<mode> as vashl<mode> by broadcasting the scalar, but Altivec
>> is the only extant SIMD ISA that would make use of this.  All of
>> the others can arrange for constant shifts to be encoded into the
>> insn, and so implement the ashl<mode> named pattern.
> 
> I'm pretty sure Sparc's VIS3 can do this too, see the
> '<vis3_shift_insn><vbits>_vis' patterns in sparc.md

Ok, if I read the rtl correctly, you can perform a vector shift, where each shift count comes from the corresponding element of op2.  But VIS has no vector shift where the shift count comes from a single scalar (immediate or register)?

If so, please rename this pattern to the "v<shift_pat_name><mode>3" form and I'll work on more middle-end support for re-use of the v<shift_pat_name> optab.


r~
David Miller - Oct. 12, 2011, 10:52 p.m.
From: Richard Henderson <rth@redhat.com>
Date: Wed, 12 Oct 2011 15:49:28 -0700

> Ok, if I read the rtl correctly, you can perform a vector shift,
> where each shift count comes from the corresponding element of op2.
> But VIS has no vector shift where the shift count comes from a
> single scalar (immediate or register)?

That's correct.

> If so, please rename this pattern to the "v<shift_pat_name><mode>3"
> form and I'll work on more middle-end support for re-use of the
> v<shift_pat_name> optab.

Will do, thanks Richard.
David Edelsohn - Oct. 13, 2011, 6:36 p.m.
On Wed, Oct 12, 2011 at 6:32 PM, Richard Henderson <rth@redhat.com> wrote:
> I suppose technically the middle-end could be improved to implement
> ashl<mode> as vashl<mode> by broadcasting the scalar, but Altivec
> is the only extant SIMD ISA that would make use of this.  All of
> the others can arrange for constant shifts to be encoded into the
> insn, and so implement the ashl<mode> named pattern.
>
> Tested on ppc64-linux, --with-cpu=G5.

Richard,

Are there testcases in the GCC testsuite that exercise these patterns?

Thanks, David
Richard Henderson - Oct. 13, 2011, 6:43 p.m.
On 10/13/2011 11:36 AM, David Edelsohn wrote:
> Are there testcases in the GCC testsuite that exercise these patterns?

I thought the vectorizer would use them.  E.g. gcc.dg/vect/vect-shift-3.c.

I see that I should have added ppc to check_effective_target_vect_shift_scalar,
though, to enable even more testing.


r~
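
A loop of roughly the following shape is the kind of code a vectorizer could turn into a vector shift by a scalar. It is an illustrative sketch only and not the contents of gcc.dg/vect/vect-shift-3.c; the function name `shift_all` and the array size `N` are assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { N = 32 };

/* Every element is shifted by the same loop-invariant count, so a
   vectorizer may use one vector-by-scalar shift per vector of elements. */
void shift_all(unsigned int a[N], unsigned int count)
{
    /* Shifting by the full bit width or more is undefined in C. */
    assert(count < sizeof(unsigned int) * CHAR_BIT);

    for (size_t i = 0; i < N; ++i)
        a[i] <<= count;
}
```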
David Edelsohn - Oct. 13, 2011, 9:05 p.m.
On Wed, Oct 12, 2011 at 6:32 PM, Richard Henderson <rth@redhat.com> wrote:
> I suppose technically the middle-end could be improved to implement
> ashl<mode> as vashl<mode> by broadcasting the scalar, but Altivec
> is the only extant SIMD ISA that would make use of this.  All of
> the others can arrange for constant shifts to be encoded into the
> insn, and so implement the ashl<mode> named pattern.
>
> Tested on ppc64-linux, --with-cpu=G5.
>
> Ok?
>
>
> r~
>
>
>        * config/rs6000/rs6000.c (rs6000_expand_vector_broadcast): New.
>        * config/rs6000/rs6000-protos.h: Update.
>        * config/rs6000/vector.md (ashl<VEC_I>3): New.
>        (lshr<VEC_I>3, ashr<VEC_I>3): New.

The patch is fine.

Thanks, David
Michael Meissner - Oct. 14, 2011, 7:35 p.m.
On Thu, Oct 13, 2011 at 11:43:35AM -0700, Richard Henderson wrote:
> On 10/13/2011 11:36 AM, David Edelsohn wrote:
> > Are there testcases in the GCC testsuite that exercise these patterns?
> 
> I thought the vectorizer would use them.  E.g. gcc.dg/vect/vect-shift-3.c.
> 
> I see that I should have added ppc to check_effective_target_vect_shift_scalar,
> though, to enable even more testing.

I tried this patch on trunk, and I'm not seeing any changes in the code.  I'll
include the test case and asm as attachments.

This is due to the code I put into tree-vect-generic.c (in
expand_vector_operations_1) that converts between vector shift by vector and
vector shift by scalar.  Note that AMD's XOP shifts are also vector/vector
shifts.

The code shifting by a scalar is pretty bad in that it recalculates the splat of
the shift element on every loop iteration, rather than doing the splat once
before the loop.  We also have the problem we've had for a couple of years that
if the type is signed char or signed short, the compiler wants to promote the
items to int and does this by several unpacks and repacks.
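
The splat problem described above can be modelled in plain C. The sketch below is neither compiler output nor GCC code: `v4u32`, `splat`, `shift_lanes`, `shift_block` and both loop functions are hypothetical names, and a four-lane vector of 32-bit elements is assumed. The first loop rebuilds the loop-invariant splat on every iteration; the second builds it once before the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 4 };

/* Hypothetical stand-in for a 4 x 32-bit vector register. */
typedef struct { uint32_t lane[LANES]; } v4u32;

/* Copy one scalar into every lane. */
static v4u32 splat(uint32_t s)
{
    v4u32 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = s;
    return r;
}

/* Shift lane i of x left by lane i of counts (counts assumed < 32). */
static v4u32 shift_lanes(v4u32 x, v4u32 counts)
{
    v4u32 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = x.lane[i] << counts.lane[i];
    return r;
}

/* Load LANES elements, shift them, and store them back in place. */
static void shift_block(uint32_t *p, v4u32 counts)
{
    v4u32 x;
    memcpy(&x, p, sizeof x);
    x = shift_lanes(x, counts);
    memcpy(p, &x, sizeof x);
}

/* Shape complained about: the splat is recomputed on each iteration. */
static void shift_splat_in_loop(uint32_t *a, size_t n, uint32_t count)
{
    assert(n % LANES == 0 && count < 32);
    for (size_t i = 0; i < n; i += LANES)
        shift_block(a + i, splat(count));
}

/* Preferred shape: the splat is formed once, before the loop. */
static void shift_splat_hoisted(uint32_t *a, size_t n, uint32_t count)
{
    assert(n % LANES == 0 && count < 32);
    const v4u32 counts = splat(count);
    for (size_t i = 0; i < n; i += LANES)
        shift_block(a + i, counts);
}
```

Both functions compute the same result; they differ only in how often the splat is formed. An optimizing C compiler may hoist the call in this toy model by itself, so it shows the shape of the two code sequences rather than reproducing the behaviour reported here.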

Patch

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 73da0f6..4dee23f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -55,6 +55,7 @@  extern void rs6000_expand_vector_init (rtx, rtx);
 extern void paired_expand_vector_init (rtx, rtx);
 extern void rs6000_expand_vector_set (rtx, rtx, int);
 extern void rs6000_expand_vector_extract (rtx, rtx, int);
+extern rtx rs6000_expand_vector_broadcast (enum machine_mode, rtx);
 extern void build_mask64_2_operands (rtx, rtx *);
 extern int expand_block_clear (rtx[]);
 extern int expand_block_move (rtx[]);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 63c0f0c..786736d 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4890,6 +4890,35 @@  rs6000_expand_vector_extract (rtx target, rtx vec, int elt)
   emit_move_insn (target, adjust_address_nv (mem, inner_mode, 0));
 }
 
+/* Broadcast an element to all parts of a vector, loaded into a register.
+   Used to turn vector shifts by a scalar into vector shifts by a vector.  */
+
+rtx
+rs6000_expand_vector_broadcast (enum machine_mode mode, rtx elt)
+{
+  rtx repl, vec[16];
+  int i, n;
+
+  n = GET_MODE_NUNITS (mode);
+  for (i = 0; i < n; ++i)
+    vec[i] = elt;
+
+  if (CONSTANT_P (elt))
+    {
+      repl = gen_rtx_CONST_VECTOR (mode, gen_rtvec_v (n, vec));
+      repl = force_reg (mode, repl);
+    }
+  else
+    {
+      rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (n, vec));
+      repl = gen_reg_rtx (mode);
+      rs6000_expand_vector_init (repl, par);
+    }
+
+  return repl;
+}
+
+
 /* Generates shifts and masks for a pair of rldicl or rldicr insns to
    implement ANDing by the mask IN.  */
 void
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 0179cd9..24b473e 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -987,6 +987,16 @@ 
   "TARGET_ALTIVEC"
   "")
 
+(define_expand "ashl<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand" "")
+	(ashift:VEC_I
+	  (match_operand:VEC_I 1 "vint_operand" "")
+	  (match_operand:<VEC_base> 2 "nonmemory_operand" "")))]
+  "TARGET_ALTIVEC"
+{
+  operands[2] = rs6000_expand_vector_broadcast (<MODE>mode, operands[2]);
+})
+
 ;; Expanders for logical shift right on each vector element
 (define_expand "vlshr<mode>3"
   [(set (match_operand:VEC_I 0 "vint_operand" "")
@@ -995,6 +1005,16 @@ 
   "TARGET_ALTIVEC"
   "")
 
+(define_expand "lshr<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand" "")
+	(lshiftrt:VEC_I
+	  (match_operand:VEC_I 1 "vint_operand" "")
+	  (match_operand:<VEC_base> 2 "nonmemory_operand" "")))]
+  "TARGET_ALTIVEC"
+{
+  operands[2] = rs6000_expand_vector_broadcast (<MODE>mode, operands[2]);
+})
+
 ;; Expanders for arithmetic shift right on each vector element
 (define_expand "vashr<mode>3"
   [(set (match_operand:VEC_I 0 "vint_operand" "")
@@ -1002,6 +1022,16 @@ 
 			(match_operand:VEC_I 2 "vint_operand" "")))]
   "TARGET_ALTIVEC"
   "")
+
+(define_expand "ashr<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand" "")
+	(ashiftrt:VEC_I
+	  (match_operand:VEC_I 1 "vint_operand" "")
+	  (match_operand:<VEC_base> 2 "nonmemory_operand" "")))]
+  "TARGET_ALTIVEC"
+{
+  operands[2] = rs6000_expand_vector_broadcast (<MODE>mode, operands[2]);
+})
 
 ;; Vector reduction expanders for VSX