Message ID | CAFiYyc075Bxczhy606-AGj_m414fCv6GHhWwuoA87+BA7e57mQ@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Tue, 30 Aug 2011, Richard Guenther wrote: > oh, hum - now I remember ;) Eventually the C frontend should handle > this not via the function call mechanism but similar to how Joseph > added __builtin_complex support with > > 2011-08-19 Joseph Myers <joseph@codesourcery.com> > > * c-parser.c (c_parser_postfix_expression): Handle RID_BUILTIN_COMPLEX. > * doc/extend.texi (__builtin_complex): Document. > > and then emit VEC_SHUFFLE_EXPRs directly from the frontend. Joseph? It's probably time to refactor the parsing code before adding yet another pseudo-builtin. Considering just those all of whose operands are expressions (there are more where types are involved), we have __builtin_complex (two operands) and __builtin_choose_expr (three operands). How about a helper that parses a parenthesized list of expressions (using c_parser_expr_list, disabling all folding and conversions), gives an error if the number of expressions is wrong, then returns an error status and the list? Pass the keyword to this function and it can give a "wrong number of arguments" error that says which pseudo-builtin is involved, rather than less friendly parse errors - so these things would act a bit more like built-in functions while still being purely front-end syntax for GENERIC and GIMPLE operations. Then c_parser_postfix_expression would only have the code that deals with semantics, without duplicating the generic code for parsing lists.
On Tue, Aug 30, 2011 at 2:03 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > On Tue, Aug 30, 2011 at 4:31 AM, Artem Shinkarov > <artyom.shinkaroff@gmail.com> wrote: >> Hi >> >> This is a patch for the explicit vector shuffling we have discussed a >> long time ago here: >> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01092.html >> >> The new patch introduces the new tree code, as we agreed, and expands >> this code by checking the vshuffle pattern in the backend. >> >> The patch at the moment lacks of some examples, but mainly it works >> fine for me. It would be nice if i386 gurus could look into the way I >> am doing the expansion. >> >> Middle-end parts seems to be more or less fine, they have not changed >> much from the previous time. > > +@code{__builtin_shuffle (vec, mask)} and > +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct > > the latter would be __builtin_shuffle2. Why?? That was the syntax we agreed on that elegantly handles both cases in one place. > +bool > +expand_vec_shuffle_expr_p (enum machine_mode mode, tree v0, > + tree v1, tree mask) > +{ > +#define inner_type_size(vec) \ > + GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (TREE_TYPE (vec)))) > > missing comment. No #defines like this please, just initialize > two temporary variables. > > + > +rtx > +expand_vec_shuffle_expr (tree type, tree v0, tree v1, tree mask, rtx target) > +{ > > comment. > > +vshuffle: > + gcc_assert (v1 == v0); > + > + icode = direct_optab_handler (vshuffle_optab, mode); > > hmm, so we don't have a vshuffle2 optab but always go via the > builtin function, but only for constant masks there? I wonder > if we should arrange for targets to only support a vshuffle > optab (thus, transition away from the builtin) and so > unconditionally have a vshuffle2 optab only (with possibly > equivalent v1 and v0?) I have only implemented the case with non-constant mask that supports only one argument. I think that it would be enough for the first version. Later we can introduce vshuffle2 pattern and reuse the code that expands vshuffle at the moment. > I suppose Richard might remember what he had in mind back > when we discussed this. > > Index: gcc/c-typeck.c > =================================================================== > --- gcc/c-typeck.c (revision 177758) > +++ gcc/c-typeck.c (working copy) > @@ -2815,6 +2815,68 @@ build_function_call_vec (location_t loc, > && !check_builtin_function_arguments (fundecl, nargs, argarray)) > return error_mark_node; > > + /* Typecheck a builtin function which is declared with variable > + argument list. */ > + if (fundecl && DECL_BUILT_IN (fundecl) > + && DECL_BUILT_IN_CLASS (fundecl) == BUILT_IN_NORMAL) > > just add to check_builtin_function_arguments which is called right > in front of your added code. > > + /* Here we change the return type of the builtin function > + from int f(...) --> t f(...) where t is a type of the > + first argument. */ > + fundecl = copy_node (fundecl); > + TREE_TYPE (fundecl) = build_function_type (TREE_TYPE (firstarg), > + TYPE_ARG_TYPES (TREE_TYPE (fundecl))); > + function = build_fold_addr_expr (fundecl); > > oh, hum - now I remember ;) Eventually the C frontend should handle > this not via the function call mechanism but similar to how Joseph > added __builtin_complex support with > > 2011-08-19 Joseph Myers <joseph@codesourcery.com> > > * c-parser.c (c_parser_postfix_expression): Handle RID_BUILTIN_COMPLEX. > * doc/extend.texi (__builtin_complex): Document. > > and then emit VEC_SHUFFLE_EXPRs directly from the frontend. Joseph? > > FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (inside_init), ix, value) > - if (!CONSTANT_CLASS_P (value)) > + if (!CONSTANT_CLASS_P (value)) > > watch out for spurious whitespace changes. > > Index: gcc/gimplify.c > =================================================================== > --- gcc/gimplify.c (revision 177758) > +++ gcc/gimplify.c (working copy) > @@ -7050,6 +7050,7 @@ gimplify_expr (tree *expr_p, gimple_seq > break; > > case BIT_FIELD_REF: > + case VEC_SHUFFLE_EXPR: > > I don't think that's quite the right place given the is_gimple_lvalue > predicate on the first operand. More like > > case VEC_SHUFFLE_EXPR: > goto expr_3; > > +/* Vector shuffle expression. A = VEC_SHUFFLE_EXPR<v0, v1, maks> > > typo, mask > > + means > + > + freach i in length (mask): > + A = mask[i] < length (v0) ? v0[mask[i]] : v1[mask[i]] > +*/ > +DEFTREECODE (VEC_SHUFFLE_EXPR, "vec_shuffle_expr", tcc_expression, 3) > > what is the (is there any?) constraint on the operand types, especially > the mask type? > > Index: gcc/gimple.c > =================================================================== > --- gcc/gimple.c (revision 177758) > +++ gcc/gimple.c (working copy) > @@ -2623,6 +2623,7 @@ get_gimple_rhs_num_ops (enum tree_code c > || (SYM) == ADDR_EXPR \ > || (SYM) == WITH_SIZE_EXPR \ > || (SYM) == SSA_NAME \ > + || (SYM) == VEC_SHUFFLE_EXPR \ > || (SYM) == VEC_COND_EXPR) ? GIMPLE_SINGLE_RHS \ > : GIMPLE_INVALID_RHS), > #define END_OF_BASE_TREE_CODES (unsigned char) GIMPLE_INVALID_RHS, > > please make it GIMPLE_TERNARY_RHS instead. > > which requires adjustment at least here: > > Index: gcc/tree-ssa-operands.c > =================================================================== > --- gcc/tree-ssa-operands.c (revision 177758) > +++ gcc/tree-ssa-operands.c (working copy) > @@ -943,6 +943,7 @@ get_expr_operands (gimple stmt, tree *ex > > case COND_EXPR: > case VEC_COND_EXPR: > + case VEC_SHUFFLE_EXPR: > > > I think it would be nicer if the builtin would be handled by the frontend > not as builtin but like __builtin_complex and we'd just deal with > VEC_SHUFFLE_EXPR throughout the middle-end, eventually > lowering it in tree-vect-generic.c. So I didn't look at the lowering > code in detail because that would obviously change then. > > Defering to Joseph for a decision here and to x86 maintainers for > the target specific bits. > > Thanks, > Richard. I'll go and see how the __builtin_complex are treated, and try to adjust the patch. Thanks, Artem. >> ChangeLog: >> 2011-08-30 Artjoms Sinkarovs <artyom.shinkaroff@gmailc.com> >> >> gcc/ >> * optabs.c (expand_vec_shuffle_expr_p): New function. Checks >> if given expression can be expanded by the target. >> (expand_vec_shuffle_expr): New function. Expand VEC_SHUFFLE_EXPR >> using target vector instructions. >> * optabs.h: New optab vshuffle. >> (expand_vec_shuffle_expr_p): New prototype. >> (expand_vec_shuffle_expr): New prototype. >> * genopinit.c: Adjust to support vecshuffle. >> * builtins.def: New builtin __builtin_shuffle. >> * c-typeck.c (build_function_call_vec): Typecheck >> __builtin_shuffle, allowing only two or three arguments. >> Change the type of builtin depending on the arguments. >> (digest_init): Warn when constructor has less elements than >> vector type. >> * gimplify.c (gimplify_exp): Adjusted to support VEC_SHUFFLE_EXPR. >> * tree.def: New tree code VEC_SHUFFLE_EXPR. >> * tree-vect-generic.c (vector_element): New function. Returns an >> element of the vector at the given position. >> (lower_builtin_shuffle): Change builtin_shuffle with VEC_SHUFLLE_EXPR >> or expand an expression piecewise. >> (expand_vector_operations_1): Adjusted. >> (gate_expand_vector_operations_noop): New gate function. >> * gimple.c (get_gimple_rhs_num_ops): Adjust. >> * passes.c: Move veclower down. >> * tree-pretty-print.c (dump_generic_node): Recognize >> VEC_SHUFFLE_EXPR as valid expression. >> * tree-ssa-operands: Adjust. >> >> gcc/config/i386 >> * sse.md: (sseshuffint) New mode_attr. Correspondence between the >> vector and the type of the mask when shuffling. >> (vecshuffle<mode>): New expansion. >> * i386-protos.h (ix86_expand_vshuffle): New prototype. >> * i386.c (ix86_expand_vshuffle): Expand vshuffle using pshufb. >> (ix86_vectorize_builtin_vec_perm_ok): Adjust. >> >> gcc/doc >> * extend.texi: Adjust. >> >> gcc/testsuite >> * gcc.c-torture/execute/vect-shuffle-2.c: New test. >> * gcc.c-torture/execute/vect-shuffle-4.c: New test. >> * gcc.c-torture/execute/vect-shuffle-1.c: New test. >> * gcc.c-torture/execute/vect-shuffle-3.c: New test. >> >> bootstrapped on x86_64-unknown-linux-gnu. The AVX parts are not >> tested, because I don't have actual hardware. It works with -mavx, the >> assembler code looks fine to me. I'll test it on a real hardware in >> couple of days. >> >> >> >> Thanks, >> Artem Shinkarov. >> >
On Aug 30, 2011, at 10:01 AM, Artem Shinkarov wrote: >>> The patch at the moment lacks of some examples, but mainly it works >>> fine for me. It would be nice if i386 gurus could look into the way I >>> am doing the expansion. >>> >>> Middle-end parts seems to be more or less fine, they have not changed >>> much from the previous time. >> >> +@code{__builtin_shuffle (vec, mask)} and >> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >> >> the latter would be __builtin_shuffle2. > > Why?? > That was the syntax we agreed on that elegantly handles both cases in one place. If you're going to add vector shuffling builtins, you might consider adding the same builtin that clang has for compatibility: http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector It should be straight-forward to map it into the same IR. -Chris
On Tue, Aug 30, 2011 at 7:01 PM, Artem Shinkarov <artyom.shinkaroff@gmail.com> wrote: > On Tue, Aug 30, 2011 at 2:03 PM, Richard Guenther > <richard.guenther@gmail.com> wrote: >> On Tue, Aug 30, 2011 at 4:31 AM, Artem Shinkarov >> <artyom.shinkaroff@gmail.com> wrote: >>> Hi >>> >>> This is a patch for the explicit vector shuffling we have discussed a >>> long time ago here: >>> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01092.html >>> >>> The new patch introduces the new tree code, as we agreed, and expands >>> this code by checking the vshuffle pattern in the backend. >>> >>> The patch at the moment lacks of some examples, but mainly it works >>> fine for me. It would be nice if i386 gurus could look into the way I >>> am doing the expansion. >>> >>> Middle-end parts seems to be more or less fine, they have not changed >>> much from the previous time. >> >> +@code{__builtin_shuffle (vec, mask)} and >> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >> >> the latter would be __builtin_shuffle2. > > Why?? > That was the syntax we agreed on that elegantly handles both cases in one place. Ah, then there was a case below that mentions __builtin_shuffle2 that needs adjusting then. >> +bool >> +expand_vec_shuffle_expr_p (enum machine_mode mode, tree v0, >> + tree v1, tree mask) >> +{ >> +#define inner_type_size(vec) \ >> + GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (TREE_TYPE (vec)))) >> >> missing comment. No #defines like this please, just initialize >> two temporary variables. >> >> + >> +rtx >> +expand_vec_shuffle_expr (tree type, tree v0, tree v1, tree mask, rtx target) >> +{ >> >> comment. >> >> +vshuffle: >> + gcc_assert (v1 == v0); >> + >> + icode = direct_optab_handler (vshuffle_optab, mode); >> >> hmm, so we don't have a vshuffle2 optab but always go via the >> builtin function, but only for constant masks there? I wonder >> if we should arrange for targets to only support a vshuffle >> optab (thus, transition away from the builtin) and so >> unconditionally have a vshuffle2 optab only (with possibly >> equivalent v1 and v0?) > > I have only implemented the case with non-constant mask that supports > only one argument. I think that it would be enough for the first > version. Later we can introduce vshuffle2 pattern and reuse the code > that expands vshuffle at the moment. Ok. >> I suppose Richard might remember what he had in mind back >> when we discussed this. >> >> Index: gcc/c-typeck.c >> =================================================================== >> --- gcc/c-typeck.c (revision 177758) >> +++ gcc/c-typeck.c (working copy) >> @@ -2815,6 +2815,68 @@ build_function_call_vec (location_t loc, >> && !check_builtin_function_arguments (fundecl, nargs, argarray)) >> return error_mark_node; >> >> + /* Typecheck a builtin function which is declared with variable >> + argument list. */ >> + if (fundecl && DECL_BUILT_IN (fundecl) >> + && DECL_BUILT_IN_CLASS (fundecl) == BUILT_IN_NORMAL) >> >> just add to check_builtin_function_arguments which is called right >> in front of your added code. >> >> + /* Here we change the return type of the builtin function >> + from int f(...) --> t f(...) where t is a type of the >> + first argument. */ >> + fundecl = copy_node (fundecl); >> + TREE_TYPE (fundecl) = build_function_type (TREE_TYPE (firstarg), >> + TYPE_ARG_TYPES (TREE_TYPE (fundecl))); >> + function = build_fold_addr_expr (fundecl); >> >> oh, hum - now I remember ;) Eventually the C frontend should handle >> this not via the function call mechanism but similar to how Joseph >> added __builtin_complex support with >> >> 2011-08-19 Joseph Myers <joseph@codesourcery.com> >> >> * c-parser.c (c_parser_postfix_expression): Handle RID_BUILTIN_COMPLEX. >> * doc/extend.texi (__builtin_complex): Document. >> >> and then emit VEC_SHUFFLE_EXPRs directly from the frontend. Joseph? >> >> FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (inside_init), ix, value) >> - if (!CONSTANT_CLASS_P (value)) >> + if (!CONSTANT_CLASS_P (value)) >> >> watch out for spurious whitespace changes. >> >> Index: gcc/gimplify.c >> =================================================================== >> --- gcc/gimplify.c (revision 177758) >> +++ gcc/gimplify.c (working copy) >> @@ -7050,6 +7050,7 @@ gimplify_expr (tree *expr_p, gimple_seq >> break; >> >> case BIT_FIELD_REF: >> + case VEC_SHUFFLE_EXPR: >> >> I don't think that's quite the right place given the is_gimple_lvalue >> predicate on the first operand. More like >> >> case VEC_SHUFFLE_EXPR: >> goto expr_3; >> >> +/* Vector shuffle expression. A = VEC_SHUFFLE_EXPR<v0, v1, maks> >> >> typo, mask >> >> + means >> + >> + freach i in length (mask): >> + A = mask[i] < length (v0) ? v0[mask[i]] : v1[mask[i]] >> +*/ >> +DEFTREECODE (VEC_SHUFFLE_EXPR, "vec_shuffle_expr", tcc_expression, 3) >> >> what is the (is there any?) constraint on the operand types, especially >> the mask type? >> >> Index: gcc/gimple.c >> =================================================================== >> --- gcc/gimple.c (revision 177758) >> +++ gcc/gimple.c (working copy) >> @@ -2623,6 +2623,7 @@ get_gimple_rhs_num_ops (enum tree_code c >> || (SYM) == ADDR_EXPR \ >> || (SYM) == WITH_SIZE_EXPR \ >> || (SYM) == SSA_NAME \ >> + || (SYM) == VEC_SHUFFLE_EXPR \ >> || (SYM) == VEC_COND_EXPR) ? GIMPLE_SINGLE_RHS \ >> : GIMPLE_INVALID_RHS), >> #define END_OF_BASE_TREE_CODES (unsigned char) GIMPLE_INVALID_RHS, >> >> please make it GIMPLE_TERNARY_RHS instead. >> >> which requires adjustment at least here: >> >> Index: gcc/tree-ssa-operands.c >> =================================================================== >> --- gcc/tree-ssa-operands.c (revision 177758) >> +++ gcc/tree-ssa-operands.c (working copy) >> @@ -943,6 +943,7 @@ get_expr_operands (gimple stmt, tree *ex >> >> case COND_EXPR: >> case VEC_COND_EXPR: >> + case VEC_SHUFFLE_EXPR: >> >> >> I think it would be nicer if the builtin would be handled by the frontend >> not as builtin but like __builtin_complex and we'd just deal with >> VEC_SHUFFLE_EXPR throughout the middle-end, eventually >> lowering it in tree-vect-generic.c. So I didn't look at the lowering >> code in detail because that would obviously change then. >> >> Defering to Joseph for a decision here and to x86 maintainers for >> the target specific bits. >> >> Thanks, >> Richard. > > I'll go and see how the __builtin_complex are treated, and try to > adjust the patch. Thanks, also see Josephs comments on this. Richard. > > Thanks, > Artem. > >>> ChangeLog: >>> 2011-08-30 Artjoms Sinkarovs <artyom.shinkaroff@gmailc.com> >>> >>> gcc/ >>> * optabs.c (expand_vec_shuffle_expr_p): New function. Checks >>> if given expression can be expanded by the target. >>> (expand_vec_shuffle_expr): New function. Expand VEC_SHUFFLE_EXPR >>> using target vector instructions. >>> * optabs.h: New optab vshuffle. >>> (expand_vec_shuffle_expr_p): New prototype. >>> (expand_vec_shuffle_expr): New prototype. >>> * genopinit.c: Adjust to support vecshuffle. >>> * builtins.def: New builtin __builtin_shuffle. >>> * c-typeck.c (build_function_call_vec): Typecheck >>> __builtin_shuffle, allowing only two or three arguments. >>> Change the type of builtin depending on the arguments. >>> (digest_init): Warn when constructor has less elements than >>> vector type. >>> * gimplify.c (gimplify_exp): Adjusted to support VEC_SHUFFLE_EXPR. >>> * tree.def: New tree code VEC_SHUFFLE_EXPR. >>> * tree-vect-generic.c (vector_element): New function. Returns an >>> element of the vector at the given position. >>> (lower_builtin_shuffle): Change builtin_shuffle with VEC_SHUFLLE_EXPR >>> or expand an expression piecewise. >>> (expand_vector_operations_1): Adjusted. >>> (gate_expand_vector_operations_noop): New gate function. >>> * gimple.c (get_gimple_rhs_num_ops): Adjust. >>> * passes.c: Move veclower down. >>> * tree-pretty-print.c (dump_generic_node): Recognize >>> VEC_SHUFFLE_EXPR as valid expression. >>> * tree-ssa-operands: Adjust. >>> >>> gcc/config/i386 >>> * sse.md: (sseshuffint) New mode_attr. Correspondence between the >>> vector and the type of the mask when shuffling. >>> (vecshuffle<mode>): New expansion. >>> * i386-protos.h (ix86_expand_vshuffle): New prototype. >>> * i386.c (ix86_expand_vshuffle): Expand vshuffle using pshufb. >>> (ix86_vectorize_builtin_vec_perm_ok): Adjust. >>> >>> gcc/doc >>> * extend.texi: Adjust. >>> >>> gcc/testsuite >>> * gcc.c-torture/execute/vect-shuffle-2.c: New test. >>> * gcc.c-torture/execute/vect-shuffle-4.c: New test. >>> * gcc.c-torture/execute/vect-shuffle-1.c: New test. >>> * gcc.c-torture/execute/vect-shuffle-3.c: New test. >>> >>> bootstrapped on x86_64-unknown-linux-gnu. The AVX parts are not >>> tested, because I don't have actual hardware. It works with -mavx, the >>> assembler code looks fine to me. I'll test it on a real hardware in >>> couple of days. >>> >>> >>> >>> Thanks, >>> Artem Shinkarov. >>> >> >
On Wed, Aug 31, 2011 at 1:51 AM, Chris Lattner <clattner@apple.com> wrote: > On Aug 30, 2011, at 10:01 AM, Artem Shinkarov wrote: >>>> The patch at the moment lacks of some examples, but mainly it works >>>> fine for me. It would be nice if i386 gurus could look into the way I >>>> am doing the expansion. >>>> >>>> Middle-end parts seems to be more or less fine, they have not changed >>>> much from the previous time. >>> >>> +@code{__builtin_shuffle (vec, mask)} and >>> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >>> >>> the latter would be __builtin_shuffle2. >> >> Why?? >> That was the syntax we agreed on that elegantly handles both cases in one place. > > If you're going to add vector shuffling builtins, you might consider adding the same builtin that clang has for compatibility: > http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector > > It should be straight-forward to map it into the same IR. Sure. It doesn't support a vector argument for element selection though, which I think is required for a mapping to OpenCL shuffle/shuffle2. That's odd. Richard. > -Chris >
On Wed, Aug 31, 2011 at 12:51 AM, Chris Lattner <clattner@apple.com> wrote: > On Aug 30, 2011, at 10:01 AM, Artem Shinkarov wrote: >>>> The patch at the moment lacks of some examples, but mainly it works >>>> fine for me. It would be nice if i386 gurus could look into the way I >>>> am doing the expansion. >>>> >>>> Middle-end parts seems to be more or less fine, they have not changed >>>> much from the previous time. >>> >>> +@code{__builtin_shuffle (vec, mask)} and >>> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >>> >>> the latter would be __builtin_shuffle2. >> >> Why?? >> That was the syntax we agreed on that elegantly handles both cases in one place. > > If you're going to add vector shuffling builtins, you might consider adding the same builtin that clang has for compatibility: > http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector > > It should be straight-forward to map it into the same IR. > > -Chris > Chris I am trying to use OpenCL syntax here which says that the mask for shuffling is a vector. Also I didn't really get from the clang description if the indexes could be non-constnants? If not, then I have a problem here, because I want to support this. Artem.
Hi Artem, On 31/08/11 10:27, Artem Shinkarov wrote: > On Wed, Aug 31, 2011 at 12:51 AM, Chris Lattner<clattner@apple.com> wrote: >> On Aug 30, 2011, at 10:01 AM, Artem Shinkarov wrote: >>>>> The patch at the moment lacks of some examples, but mainly it works >>>>> fine for me. It would be nice if i386 gurus could look into the way I >>>>> am doing the expansion. >>>>> >>>>> Middle-end parts seems to be more or less fine, they have not changed >>>>> much from the previous time. >>>> >>>> +@code{__builtin_shuffle (vec, mask)} and >>>> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >>>> >>>> the latter would be __builtin_shuffle2. >>> >>> Why?? >>> That was the syntax we agreed on that elegantly handles both cases in one place. >> >> If you're going to add vector shuffling builtins, you might consider adding the same builtin that clang has for compatibility: >> http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector >> >> It should be straight-forward to map it into the same IR. >> >> -Chris >> > > Chris > > I am trying to use OpenCL syntax here which says that the mask for > shuffling is a vector. Also I didn't really get from the clang > description if the indexes could be non-constnants? If not, then I > have a problem here, because I want to support this. probably it maps directly to the LLVM shufflevector instruction, see http://llvm.org/docs/LangRef.html#i_shufflevector That requires the shuffle mask to be constant. Ciao, Duncan.
On Wed, Aug 31, 2011 at 10:35 AM, Duncan Sands <baldrick@free.fr> wrote: > Hi Artem, > > On 31/08/11 10:27, Artem Shinkarov wrote: >> >> On Wed, Aug 31, 2011 at 12:51 AM, Chris Lattner<clattner@apple.com> >> wrote: >>> >>> On Aug 30, 2011, at 10:01 AM, Artem Shinkarov wrote: >>>>>> >>>>>> The patch at the moment lacks of some examples, but mainly it works >>>>>> fine for me. It would be nice if i386 gurus could look into the way I >>>>>> am doing the expansion. >>>>>> >>>>>> Middle-end parts seems to be more or less fine, they have not changed >>>>>> much from the previous time. >>>>> >>>>> +@code{__builtin_shuffle (vec, mask)} and >>>>> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >>>>> >>>>> the latter would be __builtin_shuffle2. >>>> >>>> Why?? >>>> That was the syntax we agreed on that elegantly handles both cases in >>>> one place. >>> >>> If you're going to add vector shuffling builtins, you might consider >>> adding the same builtin that clang has for compatibility: >>> >>> http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector >>> >>> It should be straight-forward to map it into the same IR. >>> >>> -Chris >>> >> >> Chris >> >> I am trying to use OpenCL syntax here which says that the mask for >> shuffling is a vector. Also I didn't really get from the clang >> description if the indexes could be non-constnants? If not, then I >> have a problem here, because I want to support this. > > probably it maps directly to the LLVM shufflevector instruction, see > http://llvm.org/docs/LangRef.html#i_shufflevector > That requires the shuffle mask to be constant. I see. I think it's not worth copying LLVM builtins that merely map its internal IL. Richard. > Ciao, Duncan. >
On Aug 31, 2011, at 1:27 AM, Artem Shinkarov wrote: >> If you're going to add vector shuffling builtins, you might consider adding the same builtin that clang has for compatibility: >> http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector >> >> It should be straight-forward to map it into the same IR. >> >> -Chris >> > > Chris > > I am trying to use OpenCL syntax here which says that the mask for > shuffling is a vector. Also I didn't really get from the clang > description if the indexes could be non-constnants? If not, then I > have a problem here, because I want to support this. Yes, constant elements are required for this builtin. It is an implementation detail, but Clang doesn't implement the OpenCL shuffle operations with builtins. -Chris
Index: gcc/c-typeck.c =================================================================== --- gcc/c-typeck.c (revision 177758) +++ gcc/c-typeck.c (working copy) @@ -2815,6 +2815,68 @@ build_function_call_vec (location_t loc, && !check_builtin_function_arguments (fundecl, nargs, argarray)) return error_mark_node; + /* Typecheck a builtin function which is declared with variable + argument list. */ + if (fundecl && DECL_BUILT_IN (fundecl) + && DECL_BUILT_IN_CLASS (fundecl) == BUILT_IN_NORMAL) just add to check_builtin_function_arguments which is called right in front of your added code. + /* Here we change the return type of the builtin function + from int f(...) --> t f(...) where t is a type of the + first argument. */ + fundecl = copy_node (fundecl); + TREE_TYPE (fundecl) = build_function_type (TREE_TYPE (firstarg), + TYPE_ARG_TYPES (TREE_TYPE (fundecl))); + function = build_fold_addr_expr (fundecl); oh, hum - now I remember ;) Eventually the C frontend should handle this not via the function call mechanism but similar to how Joseph added __builtin_complex support with 2011-08-19 Joseph Myers <joseph@codesourcery.com> * c-parser.c (c_parser_postfix_expression): Handle RID_BUILTIN_COMPLEX. * doc/extend.texi (__builtin_complex): Document. and then emit VEC_SHUFFLE_EXPRs directly from the frontend. Joseph? FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (inside_init), ix, value) - if (!CONSTANT_CLASS_P (value)) + if (!CONSTANT_CLASS_P (value)) watch out for spurious whitespace changes. Index: gcc/gimplify.c =================================================================== --- gcc/gimplify.c (revision 177758) +++ gcc/gimplify.c (working copy) @@ -7050,6 +7050,7 @@ gimplify_expr (tree *expr_p, gimple_seq break; case BIT_FIELD_REF: + case VEC_SHUFFLE_EXPR: I don't think that's quite the right place given the is_gimple_lvalue predicate on the first operand. More like case VEC_SHUFFLE_EXPR: goto expr_3; +/* Vector shuffle expression. A = VEC_SHUFFLE_EXPR<v0, v1, maks> typo, mask + means + + freach i in length (mask): + A = mask[i] < length (v0) ? v0[mask[i]] : v1[mask[i]] +*/ +DEFTREECODE (VEC_SHUFFLE_EXPR, "vec_shuffle_expr", tcc_expression, 3) what is the (is there any?) constraint on the operand types, especially the mask type? Index: gcc/gimple.c =================================================================== --- gcc/gimple.c (revision 177758) +++ gcc/gimple.c (working copy) @@ -2623,6 +2623,7 @@ get_gimple_rhs_num_ops (enum tree_code c || (SYM) == ADDR_EXPR \ || (SYM) == WITH_SIZE_EXPR \ || (SYM) == SSA_NAME \ + || (SYM) == VEC_SHUFFLE_EXPR \ || (SYM) == VEC_COND_EXPR) ? GIMPLE_SINGLE_RHS \ : GIMPLE_INVALID_RHS), #define END_OF_BASE_TREE_CODES (unsigned char) GIMPLE_INVALID_RHS, please make it GIMPLE_TERNARY_RHS instead. which requires adjustment at least here: Index: gcc/tree-ssa-operands.c =================================================================== --- gcc/tree-ssa-operands.c (revision 177758) +++ gcc/tree-ssa-operands.c (working copy) @@ -943,6 +943,7 @@ get_expr_operands (gimple stmt, tree *ex case COND_EXPR: case VEC_COND_EXPR: + case VEC_SHUFFLE_EXPR: I think it would be nicer if the builtin would be handled by the frontend