diff mbox series

[RFC,rs6000,v2] folding of vec_splat

Message ID 1534434645.18801.8.camel@brimstone.rchland.ibm.com
State New
Headers show
Series [RFC,rs6000,v2] folding of vec_splat | expand

Commit Message

will schmidt Aug. 16, 2018, 3:50 p.m. UTC
Hi
  Enable GIMPLE folding of the vec_splat() intrinsic. (v2).
    
This uses the tree_vec_extract() function out of tree-vect-generic.c
to retrieve the splat value, which is a BIT_FIELD_REF.   That function is
made non-static as part of this change.
    
Testcases are already in-tree.
    
V2 updates, per feedback previously received.
Forced arg1 into range (modulo #elements) before attempting to extract
the splat value.
Removed the (now unnecessary) code that did bounds-checking before calling
the tree_vec_extract helper.
Used arg0_type rather than lhs_type for calculating the tree size.

OK for trunk?
    
Thanks,
-Will
    
[gcc]

2018-08-16  Will Schmidt  <will_schmidt@vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
	  early gimple folding of vec_splat().
	* tree-vect-generic.c: Remove static from tree_vec_extract() definition.
	* gimple-fold.h:  Add an extern define for tree_vec_extract().

Comments

Segher Boessenkool Aug. 16, 2018, 8:51 p.m. UTC | #1
Hi Will,

On Thu, Aug 16, 2018 at 10:50:45AM -0500, Will Schmidt wrote:
> 2018-08-16  Will Schmidt  <will_schmidt@vnet.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
> 	  early gimple folding of vec_splat().

Continuation lines should be indented to the *, not to the text after it.

> +	/* Only fold the vec_splat_*() if arg1 is a constant
> +	   5-bit unsigned literal.  */
> +	if (TREE_CODE (arg1) != INTEGER_CST || TREE_INT_CST_LOW (arg1) > 0x1f)
> +	  return false;

Does this need to check for negative as well?  So something with IN_RANGE
then.

> +	/* force arg1 into range.  */
> +	tree new_arg1 = build_int_cst (arg1_type,
> +				       TREE_INT_CST_LOW (arg1) % n_elts);

Or is the range check unnecessary completely, since you have this?

> +	tree splat;
> +	if (TREE_CODE (arg0) == VECTOR_CST)
> +	  splat = VECTOR_CST_ELT (arg0, TREE_INT_CST_LOW (new_arg1));
> +	else
> +	  {
> +	    /* Determine (in bits) the length and start location of the
> +	       splat value for a call to the tree_vec_extract helper.  */
> +	    int tree_size_in_bits = TREE_INT_CST_LOW (size_in_bytes (arg0_type))
> +				    * BITS_PER_UNIT;
> +	    int splat_elem_size = tree_size_in_bits / n_elts;
> +	    int splat_start_bit = TREE_INT_CST_LOW (new_arg1) * splat_elem_size;
> +	    /* Do not attempt to early-fold if the size + specified offset into
> +	       the vector would touch outside of the source vector.  */
> +	    tree len = build_int_cst (bitsizetype, splat_elem_size);
> +	    tree start = build_int_cst (bitsizetype, splat_start_bit);
> +	    splat = tree_vec_extract (gsi, TREE_TYPE (lhs_type), arg0,
> +				      len, start);
> +	}

This closing brace should be indented two spaces more.

> -static inline tree
> +tree
>  tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
>  		  tree t, tree bitsize, tree bitpos)

It could use a comment, too (what the args are, etc.)

Other than those nits, looks fine to me.  Maybe Richard or Bill have
more comments?


Segher
will schmidt Aug. 16, 2018, 11:20 p.m. UTC | #2
On Thu, 2018-08-16 at 15:51 -0500, Segher Boessenkool wrote:
> Hi Will,
> 
> On Thu, Aug 16, 2018 at 10:50:45AM -0500, Will Schmidt wrote:
> > 2018-08-16  Will Schmidt  <will_schmidt@vnet.ibm.com>
> > 
> > 	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
> > 	  early gimple folding of vec_splat().
> 
> Continuation lines should be indented to the *, not to the text after it.
OK

> 
> > +	/* Only fold the vec_splat_*() if arg1 is a constant
> > +	   5-bit unsigned literal.  */
> > +	if (TREE_CODE (arg1) != INTEGER_CST || TREE_INT_CST_LOW (arg1) > 0x1f)
> > +	  return false;
> 
> Does this need to check for negative as well?  So something with IN_RANGE
> then.

> 
> > +	/* force arg1 into range.  */
> > +	tree new_arg1 = build_int_cst (arg1_type,
> > +				       TREE_INT_CST_LOW (arg1) % n_elts);
> 
> Or is the range check unnecessary completely, since you have this?

I can be convinced either way. :-)
I think i still want both.  The first rejects (from gimple-folding) any
values that are outside of the 5-bit range as specified by the function
definition.
The second (modulo) then maps any 'valid' values into what is reasonable
for the data type being splatted.

While trying to convince myself one way or another, I do notice that
today (pre-folding), i can get out-of-range errors such as 
  /tmp/ccP0s6iJ.s:781: Error: operand out of range (-1 is not between
  0 and 3)
while with folding enabled, and this modulo in place, we compile without
warning.  So there is a behavior change, for which I have mixed
feelings.  :-)


> > +	tree splat;
> > +	if (TREE_CODE (arg0) == VECTOR_CST)
> > +	  splat = VECTOR_CST_ELT (arg0, TREE_INT_CST_LOW (new_arg1));
> > +	else
> > +	  {
> > +	    /* Determine (in bits) the length and start location of the
> > +	       splat value for a call to the tree_vec_extract helper.  */
> > +	    int tree_size_in_bits = TREE_INT_CST_LOW (size_in_bytes (arg0_type))
> > +				    * BITS_PER_UNIT;
> > +	    int splat_elem_size = tree_size_in_bits / n_elts;
> > +	    int splat_start_bit = TREE_INT_CST_LOW (new_arg1) * splat_elem_size;
> > +	    /* Do not attempt to early-fold if the size + specified offset into
> > +	       the vector would touch outside of the source vector.  */
> > +	    tree len = build_int_cst (bitsizetype, splat_elem_size);
> > +	    tree start = build_int_cst (bitsizetype, splat_start_bit);
> > +	    splat = tree_vec_extract (gsi, TREE_TYPE (lhs_type), arg0,
> > +				      len, start);
> > +	}
> 
> This closing brace should be indented two spaces more.
Ok

> > -static inline tree
> > +tree
> >  tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
> >  		  tree t, tree bitsize, tree bitpos)
> 
> It could use a comment, too (what the args are, etc.)

I can definitely add some commentary to my call into this function.
At a glance, the functions nearby tree_vec_extract do have a couple
lines of description each, so I'll look and see if I can come up with
anything reasonable here.

> Other than those nits, looks fine to me.  Maybe Richard or Bill have
> more comments?

Thanks for the review. 
-Will

> 
> 
> Segher
>
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 97b922f..ec92e6a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15766,10 +15766,58 @@  rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 	 tree splat_tree = build_vector_from_val (TREE_TYPE (lhs), splat_value);
 	 g = gimple_build_assign (lhs, splat_tree);
 	 gimple_set_location (g, gimple_location (stmt));
 	 gsi_replace (gsi, g, true);
 	 return true;
+	}
+
+    /* Flavors of vec_splat.  */
+    /* a = vec_splat (b, 0x3) becomes a = { b[3],b[3],b[3],...};  */
+    case ALTIVEC_BUILTIN_VSPLTB:
+    case ALTIVEC_BUILTIN_VSPLTH:
+    case ALTIVEC_BUILTIN_VSPLTW:
+    case VSX_BUILTIN_XXSPLTD_V2DI:
+    case VSX_BUILTIN_XXSPLTD_V2DF:
+      {
+	arg0 = gimple_call_arg (stmt, 0); /* input vector.  */
+	arg1 = gimple_call_arg (stmt, 1); /* index into arg0.  */
+	/* Only fold the vec_splat_*() if arg1 is a constant
+	   5-bit unsigned literal.  */
+	if (TREE_CODE (arg1) != INTEGER_CST || TREE_INT_CST_LOW (arg1) > 0x1f)
+	  return false;
+	lhs = gimple_call_lhs (stmt);
+	tree lhs_type = TREE_TYPE (lhs);
+	tree arg0_type = TREE_TYPE (arg0);
+	tree arg1_type = TREE_TYPE (arg1);
+	int n_elts = VECTOR_CST_NELTS (arg0);
+	/* force arg1 into range.  */
+	tree new_arg1 = build_int_cst (arg1_type,
+				       TREE_INT_CST_LOW (arg1) % n_elts);
+	tree splat;
+	if (TREE_CODE (arg0) == VECTOR_CST)
+	  splat = VECTOR_CST_ELT (arg0, TREE_INT_CST_LOW (new_arg1));
+	else
+	  {
+	    /* Determine (in bits) the length and start location of the
+	       splat value for a call to the tree_vec_extract helper.  */
+	    int tree_size_in_bits = TREE_INT_CST_LOW (size_in_bytes (arg0_type))
+				    * BITS_PER_UNIT;
+	    int splat_elem_size = tree_size_in_bits / n_elts;
+	    int splat_start_bit = TREE_INT_CST_LOW (new_arg1) * splat_elem_size;
+	    /* Do not attempt to early-fold if the size + specified offset into
+	       the vector would touch outside of the source vector.  */
+	    tree len = build_int_cst (bitsizetype, splat_elem_size);
+	    tree start = build_int_cst (bitsizetype, splat_start_bit);
+	    splat = tree_vec_extract (gsi, TREE_TYPE (lhs_type), arg0,
+				      len, start);
+	}
+	/* And finally, build the new vector.  */
+	tree splat_tree = build_vector_from_val (lhs_type, splat);
+	g = gimple_build_assign (lhs, splat_tree);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
       }
 
     /* vec_mergel (integrals).  */
     case ALTIVEC_BUILTIN_VMRGLH:
     case ALTIVEC_BUILTIN_VMRGLW:
diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
index 04e9bfa..e634180 100644
--- a/gcc/gimple-fold.h
+++ b/gcc/gimple-fold.h
@@ -59,10 +59,11 @@  extern tree gimple_fold_indirect_ref (tree);
 extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *);
 extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *);
 extern bool arith_code_with_undefined_signed_overflow (tree_code);
 extern gimple_seq rewrite_to_defined_overflow (gimple *);
 extern void replace_call_with_value (gimple_stmt_iterator *, tree);
+extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, tree);
 
 /* gimple_build, functionally matching fold_buildN, outputs stmts
    int the provided sequence, matching and simplifying them on-the-fly.
    Supposed to replace force_gimple_operand (fold_buildN (...), ...).  */
 extern tree gimple_build (gimple_seq *, location_t,
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 909f790..1c9701d 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -118,11 +118,11 @@  build_word_mode_vector_type (int nunits)
 
 typedef tree (*elem_op_func) (gimple_stmt_iterator *,
 			      tree, tree, tree, tree, tree, enum tree_code,
 			      tree);
 
-static inline tree
+tree
 tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
 		  tree t, tree bitsize, tree bitpos)
 {
   if (TREE_CODE (t) == SSA_NAME)
     {