Patchwork [rtl] combine a vec_concat of 2 vec_selects from the same vector

login
register
mail settings
Submitter Marc Glisse
Date March 16, 2013, 4:14 p.m.
Message ID <alpine.DEB.2.02.1303161137300.10203@stedding.saclay.inria.fr>
Download mbox | patch
Permalink /patch/228223/
State New
Headers show

Comments

Marc Glisse - March 16, 2013, 4:14 p.m.
On Sat, 16 Mar 2013, Eric Botcazou wrote:

>> My current understanding of simplify-rtx is that we should only do "safe"
>> optimizations in it (make sure we only create expressions that every
>> target will recognize), and if I want more advanced optimizations, I
>> should do them elsewhere (not sure where). So I should probably at least
>> restrict this one to the case where the result and XEXP (trueop0, 0) have
>> the same mode.
>
> Yes, simplify-rtx should only canonicalize or simplify expressions. 
> Merging 2 VEC_SELECTs into a single one seems to fall into the second 
> category, but if no target can take advantage of it, that's probably a 
> bit questionable indeed.

Thank you for the comment.

I successfully regtested the attached trivial patch on x86_64-linux-gnu:

2013-03-16  Marc Glisse  <marc.glisse@inria.fr>

 	* simplify-rtx.c (simplify_binary_operation_1) <VEC_CONCAT>:
 	Restrict the transformation to equal modes.


The original use case (and the testcase) are still handled. I don't think 
any target supports size(output) > size(input) for vec_select, so we don't 
lose anything. For size(output) < size(input), I can see one case in the 
x86 backend that could have used it:

#include <x86intrin.h>
__v2sf f(__v4sf x){ return (__v2sf){x[0],x[1]}; }

which ideally would exactly match sse_storelps (extractps is a 
vec_select:SF V4SF), but sadly it writes to memory instead of using a 
vec_concat so it doesn't benefit from the simplify-rtx code. sse_storehps 
is similar.
(tree optimizations should have merged the 2 BIT_FIELD_REFs earlier, but 
that's a different issue)


Even with the restriction we can still generate arbitrary permutations, 
which not all targets will recognize for every vector mode, but I guess 
there will be time to think about it if someone comes up with a testcase 
where this causes a problem.

Is the patch ok for trunk (once 4.8.0 is out and the new server up)?
Eric Botcazou - March 16, 2013, 4:45 p.m.
> I successfully regtested the attached trivial patch on x86_64-linux-gnu:
> 
> 2013-03-16  Marc Glisse  <marc.glisse@inria.fr>
> 
>  	* simplify-rtx.c (simplify_binary_operation_1) <VEC_CONCAT>:
>  	Restrict the transformation to equal modes.

OK for 4.9 and 4.8.1 if you add a comment line about the restriction.

Patch

Index: gcc/simplify-rtx.c

===================================================================
--- gcc/simplify-rtx.c	(revision 196701)

+++ gcc/simplify-rtx.c	(working copy)

@@ -3619,21 +3619,22 @@  simplify_binary_operation_1 (enum rtx_co

 							   i - in_n_elts);
 		  }
 	      }
 
 	    return gen_rtx_CONST_VECTOR (mode, v);
 	  }
 
 	/* Try to merge VEC_SELECTs from the same vector into a single one.  */
 	if (GET_CODE (trueop0) == VEC_SELECT
 	    && GET_CODE (trueop1) == VEC_SELECT
-	    && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0)))

+	    && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))

+	    && GET_MODE (XEXP (trueop0, 0)) == mode)

 	  {
 	    rtx par0 = XEXP (trueop0, 1);
 	    rtx par1 = XEXP (trueop1, 1);
 	    int len0 = XVECLEN (par0, 0);
 	    int len1 = XVECLEN (par1, 0);
 	    rtvec vec = rtvec_alloc (len0 + len1);
 	    for (int i = 0; i < len0; i++)
 	      RTVEC_ELT (vec, i) = XVECEXP (par0, 0, i);
 	    for (int i = 0; i < len1; i++)
 	      RTVEC_ELT (vec, len0 + i) = XVECEXP (par1, 0, i);