Message ID | alpine.DEB.2.02.1205072120220.3230@laptop-mg.saclay.inria.fr |
---|---|
State | New |
Headers | show |
Looks like a good idea. Marc Glisse <marc.glisse@inria.fr> writes: > For the testsuite, since the patch is not in a particular target, it would > be better to have a generic test (in gcc.dg?), but I don't really know how > to write a generic one, so would a test in gcc.target/i386 that scans > the asm for shuf or perm be ok? Tree-level vectorisation tests tend to go in gcc.dg/vector, but it's hard to generalise rtl-level transforms like these. An x86-only test sounds good to me FWIW. > Index: simplify-rtx.c > =================================================================== > --- simplify-rtx.c (revision 187228) > +++ simplify-rtx.c (working copy) > @@ -3268,10 +3268,32 @@ simplify_binary_operation_1 (enum rtx_co > > if (GET_MODE (vec) == mode) > return vec; > } > > + /* If we build {a,b} then permute it, build the result directly. */ > + if (XVECLEN (trueop1, 0) == 2 > + && CONST_INT_P (XVECEXP (trueop1, 0, 0)) > + && CONST_INT_P (XVECEXP (trueop1, 0, 1)) > + && GET_CODE (trueop0) == VEC_CONCAT > + && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop0, 1)) > + && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT > + && GET_MODE (XEXP (trueop0, 0)) == mode) > + { > + int offset0 = INTVAL (XVECEXP (trueop1, 0, 0)) % 2; > + int offset1 = INTVAL (XVECEXP (trueop1, 0, 1)) % 2; > + rtx baseop = XEXP (trueop0, 0); > + rtx baseop0 = XEXP (baseop , 0); > + rtx baseop1 = XEXP (baseop , 1); > + baseop0 = avoid_constant_pool_reference (baseop0); > + baseop1 = avoid_constant_pool_reference (baseop1); > + > + return simplify_gen_binary (VEC_CONCAT, mode, > + offset0 ? baseop1 : baseop0, > + offset1 ? baseop1 : baseop0); > + } > + I know you said that generalising it could be done later, and that's fine, but it looks in some ways like it would be easier to go straight for the more general: && GET_CODE (trueop0) == VEC_CONCAT && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT && GET_MODE (XEXP (trueop0, 0)) == mode && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT && GET_MODE (XEXP (trueop0, 1)) == mode) { unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0)); unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1)); rtx op0, op1; gcc_assert (i0 < 4 && i1 < 4); op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2); op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2); return simplify_gen_binary (VEC_CONCAT, mode, op0, op1); } (completely untested). avoid_constant_pool_reference shouldn't be called here. Very minor, but this code probably belongs in the else part of the if (!VECTOR_MODE_P (mode)) block. Richard
On Tue, 8 May 2012, Richard Sandiford wrote: > I know you said that generalising it could be done later, > and that's fine, but it looks in some ways like it would > be easier to go straight for the more general: > > && GET_CODE (trueop0) == VEC_CONCAT > && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT > && GET_MODE (XEXP (trueop0, 0)) == mode > && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT > && GET_MODE (XEXP (trueop0, 1)) == mode) > { > unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0)); > unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1)); > rtx op0, op1; > > gcc_assert (i0 < 4 && i1 < 4); > op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2); > op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2); > > return simplify_gen_binary (VEC_CONCAT, mode, op0, op1); > } Yes, I hesitated. > (completely untested). avoid_constant_pool_reference shouldn't be > called here. I wasn't quite sure what it was for, but it looked safer to call it for nothing than to forget it ;-) > Very minor, but this code probably belongs in the else part of the > if (!VECTOR_MODE_P (mode)) block. Thanks, I'll update the patch with your comments, add a testcase and ChangeLog and re-send it here.
Marc Glisse <marc.glisse@inria.fr> writes: > On Tue, 8 May 2012, Richard Sandiford wrote: >> I know you said that generalising it could be done later, >> and that's fine, but it looks in some ways like it would >> be easier to go straight for the more general: >> >> && GET_CODE (trueop0) == VEC_CONCAT >> && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT >> && GET_MODE (XEXP (trueop0, 0)) == mode >> && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT >> && GET_MODE (XEXP (trueop0, 1)) == mode) >> { >> unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0)); >> unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1)); >> rtx op0, op1; >> >> gcc_assert (i0 < 4 && i1 < 4); >> op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2); >> op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2); >> >> return simplify_gen_binary (VEC_CONCAT, mode, op0, op1); >> } > > Yes, I hesitated. Realised afterwards that both versions need to check GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1 to be scalar. Sorry for not noticing first time. Richard
On Tue, 8 May 2012, Richard Sandiford wrote: > Marc Glisse <marc.glisse@inria.fr> writes: >> On Tue, 8 May 2012, Richard Sandiford wrote: >>> I know you said that generalising it could be done later, >>> and that's fine, but it looks in some ways like it would >>> be easier to go straight for the more general: >>> >>> && GET_CODE (trueop0) == VEC_CONCAT >>> && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT >>> && GET_MODE (XEXP (trueop0, 0)) == mode >>> && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT >>> && GET_MODE (XEXP (trueop0, 1)) == mode) >>> { >>> unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0)); >>> unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1)); >>> rtx op0, op1; >>> >>> gcc_assert (i0 < 4 && i1 < 4); >>> op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2); >>> op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2); >>> >>> return simplify_gen_binary (VEC_CONCAT, mode, op0, op1); >>> } >> >> Yes, I hesitated. > > Realised afterwards that both versions need to check > GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1 > to be scalar. Sorry for not noticing first time. I thought that was a consequence of XVECLEN (trueop1, 0) == 2 (in the lines before your first &&) ie the result (which has mode "mode") is a vec_select of 2 objects.
Marc Glisse <marc.glisse@inria.fr> writes: >> Realised afterwards that both versions need to check >> GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1 >> to be scalar. Sorry for not noticing first time. > > I thought that was a consequence of > > XVECLEN (trueop1, 0) == 2 > > (in the lines before your first &&) > ie the result (which has mode "mode") is a vec_select of 2 objects. Yeah, you're right of course. Richard
Index: simplify-rtx.c =================================================================== --- simplify-rtx.c (revision 187228) +++ simplify-rtx.c (working copy) @@ -3268,10 +3268,32 @@ simplify_binary_operation_1 (enum rtx_co if (GET_MODE (vec) == mode) return vec; } + /* If we build {a,b} then permute it, build the result directly. */ + if (XVECLEN (trueop1, 0) == 2 + && CONST_INT_P (XVECEXP (trueop1, 0, 0)) + && CONST_INT_P (XVECEXP (trueop1, 0, 1)) + && GET_CODE (trueop0) == VEC_CONCAT + && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop0, 1)) + && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT + && GET_MODE (XEXP (trueop0, 0)) == mode) + { + int offset0 = INTVAL (XVECEXP (trueop1, 0, 0)) % 2; + int offset1 = INTVAL (XVECEXP (trueop1, 0, 1)) % 2; + rtx baseop = XEXP (trueop0, 0); + rtx baseop0 = XEXP (baseop , 0); + rtx baseop1 = XEXP (baseop , 1); + baseop0 = avoid_constant_pool_reference (baseop0); + baseop1 = avoid_constant_pool_reference (baseop1); + + return simplify_gen_binary (VEC_CONCAT, mode, + offset0 ? baseop1 : baseop0, + offset1 ? baseop1 : baseop0); + } + return 0; case VEC_CONCAT: { enum machine_mode op0_mode = (GET_MODE (trueop0) != VOIDmode ? GET_MODE (trueop0)