From patchwork Mon Aug 22 21:11:11 2011
X-Patchwork-Submitter: Artem Shinkarov
X-Patchwork-Id: 110991
From: Artem Shinkarov
Date: Mon, 22 Aug 2011 22:11:11 +0100
Subject: Re: Vector Comparison patch
To: Richard Guenther
Cc: Richard Henderson, gcc-patches@gcc.gnu.org, "Joseph S. Myers", Uros Bizjak
References: <4E4D224F.1020206@redhat.com>

I'll just send you my current version.

I'll be a little bit more specific. The problem starts when you try to
lower the following expression:

x = a > b;
x1 = vcond <...>
vcond <...>

Now, you go from the beginning to the end of the block, and you cannot
leave a > b, because only vconds are valid expressions to expand.

Now, you meet a > b first. You try to transform it into
vcond <a > b, -1, 0>, you build this expression, then you try to
gimplify it, and you see that you have something like:

x' = a > b;
x = vcond <...>
x1 = vcond <...>
vcond <...>

and your gsi stands at x1 now, so the gimplification created a
comparison that the optab would not understand. And I am not really
sure that you would be able to solve this problem easily.

It would help if you could create vcond <...>, but you can't: x op y
is a single tree that must be gimplified, and I am not sure that you
can persuade the gimplifier to leave this expression untouched.

In the attachment is the current version of the patch.

Thanks,
Artem.
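For reference, a minimal standalone example of the C-level semantics the patch
implements (it mirrors the extend.texi examples further down and only compiles
with the patch applied): a vector comparison yields a signed integer vector of
0/-1 element masks, and a vector ?: selects element-wise, with a bare mask
treated as mask != {0}. The variable names and values are purely illustrative.

/* Illustrative test in the style of the new torture tests: a > b
   produces the element mask {0, 0, -1, 0}, which then drives the
   element-wise selection in the vector ?: operator.  */
typedef int v4si __attribute__ ((vector_size (16)));

int
main (void)
{
  v4si a = {1, 2, 3, 4};
  v4si b = {3, 2, 1, 4};
  v4si c = {10, 20, 30, 40};
  v4si d = {50, 60, 70, 80};

  v4si mask = a > b;          /* {0, 0, -1, 0} */
  v4si res  = mask ? c : d;   /* same as (mask != {0,0,0,0}) ? c : d, i.e. {50, 60, 30, 80} */
  v4si res2 = a > b ? c : d;  /* comparison embedded directly, also {50, 60, 30, 80} */

  if (res[2] != 30 || res2[0] != 50)
    __builtin_abort ();
  return 0;
}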
On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov > wrote: >> On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov >>> wrote: >>>> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther >>>> wrote: >>>>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov >>>>> wrote: >>>>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther >>>>>> wrote: >>>>>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov >>>>>>> wrote: >>>>>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther >>>>>>>> wrote: >>>>>>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov >>>>>>>>> wrote: >>>>>>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther >>>>>>>>>> wrote: >>>>>>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov >>>>>>>>>>> wrote: >>>>>>>>>>>> Richard >>>>>>>>>>>> >>>>>>>>>>>> I formalized an approach a little-bit, now it works without target >>>>>>>>>>>> hooks, but some polishing is still required. I want you to comment on >>>>>>>>>>>> the several important approaches that I use in the patch. >>>>>>>>>>>> >>>>>>>>>>>> So how does it work. >>>>>>>>>>>> 1) All the vector comparisons at the level of  type-checker are >>>>>>>>>>>> introduced using VEC_COND_EXPR with constant selection operands being >>>>>>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR>>>>>>>>>>>> v1, {-1}, {0}>. >>>>>>>>>>>> >>>>>>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >>>>>>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case nothing changes. >>>>>>>>>>>> 2.b) first operand is something else, in that case, we specially mark >>>>>>>>>>>> this case, recognize it in the backend, and do not create a >>>>>>>>>>>> comparison, but use the mask as it was a result of some comparison. >>>>>>>>>>>> >>>>>>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR is a >>>>>>>>>>>> vector comparison we use is_vector_comparison function, if it returns >>>>>>>>>>>> false, then we replace mask with mask != {0}. >>>>>>>>>>>> >>>>>>>>>>>> So we end-up with the following functionality: >>>>>>>>>>>> VEC_COND_EXPR -- if we know that mask is a result of >>>>>>>>>>>> comparison of two vectors, we leave it as it is, otherwise change with >>>>>>>>>>>> mask != {0}. >>>>>>>>>>>> >>>>>>>>>>>> Plain vector comparison a b is represented with VEC_COND_EXPR, >>>>>>>>>>>> which correctly expands, without creating useless masking. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Basically for me there are two questions: >>>>>>>>>>>> 1) Can we perform information passing in optabs in a nicer way? >>>>>>>>>>>> 2) How is_vector_comparison could be improved? I have several ideas, >>>>>>>>>>>> like checking if constant vector all consists of 0 and -1, and so on. >>>>>>>>>>>> But first is it conceptually fine. >>>>>>>>>>>> >>>>>>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in >>>>>>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with -On, >>>>>>>>>>>> which I find inappropriate, and the functionality gets more >>>>>>>>>>>> complicated. >>>>>>>>>>> >>>>>>>>>>> Why is it inappropriate to not optimize it at -O0?  If the user >>>>>>>>>>> separates comparison and ?: expression it's his own fault. >>>>>>>>>> >>>>>>>>>> Well, because all the information is there, and I perfectly envision >>>>>>>>>> the case when user expressed comparison separately, just to avoid code >>>>>>>>>> duplication. >>>>>>>>>> >>>>>>>>>> Like: >>>>>>>>>> mask = a > b; >>>>>>>>>> res1 = mask ? 
v0 : v1; >>>>>>>>>> res2 = mask ? v2 : v3; >>>>>>>>>> >>>>>>>>>> Which in this case would be different from >>>>>>>>>> res1 = a > b ? v0 : v1; >>>>>>>>>> res2 = a > b ? v2 : v3; >>>>>>>>>> >>>>>>>>>>> Btw, the new hook is still in the patch. >>>>>>>>>>> >>>>>>>>>>> I would simply always create != 0 if it isn't and let optimizers >>>>>>>>>>> (tree-ssa-forwprop.c) optimize this.  You still have to deal with >>>>>>>>>>> non-comparison operands during expansion though, but if >>>>>>>>>>> you always forced a != 0 from the start you can then simply >>>>>>>>>>> interpret it as a proper comparison result (in which case I'd >>>>>>>>>>> modify the backends to have an alternate pattern or directly >>>>>>>>>>> expand to masking operations - using the fake comparison >>>>>>>>>>> RTX is too much of a hack). >>>>>>>>>> >>>>>>>>>> Richard, I think you didn't get the problem. >>>>>>>>>> I really need the way, to pass the information, that the expression >>>>>>>>>> that is in the first operand of vcond is an appropriate mask. I though >>>>>>>>>> for quite a while and this hack is the only answer I found, is there a >>>>>>>>>> better way to do that. I could for example introduce another >>>>>>>>>> tree-node, but it would be overkill as well. >>>>>>>>>> >>>>>>>>>> Now why do I need it so much: >>>>>>>>>> I want to implement the comparison in a way that {1, 5, 0, -1} is >>>>>>>>>> actually {-1,-1,-1,-1}. So whenever I am not sure that mask of >>>>>>>>>> VEC_COND_EXPR is a real comparison I transform it to mask != {0} (not >>>>>>>>>> always). To check the stuff, I use is_vector_comparison in >>>>>>>>>> tree-vect-generic. >>>>>>>>>> >>>>>>>>>> So I really have the difference between mask ? x : y and mask != {0} ? >>>>>>>>>> x : y, otherwise I could treat mask != {0} in the backend as just >>>>>>>>>> mask. >>>>>>>>>> >>>>>>>>>> If this link between optabs and backend breaks, then the patch falls >>>>>>>>>> apart. Because every time the comparison is taken out VEC_COND_EXPR, I >>>>>>>>>> will have to put != {0}. Keep in mind, that I cannot always put the >>>>>>>>>> comparison inside the VEC_COND_EXPR, what if it is defined in a >>>>>>>>>> function-comparison, or somehow else? >>>>>>>>>> >>>>>>>>>> So what would be an appropriate way to connect optabs and the backend? >>>>>>>>> >>>>>>>>> Well, there is no problem in having the only valid mask operand for >>>>>>>>> VEC_COND_EXPRs being either a comparison or a {-1,...} / {0,....} mask. >>>>>>>>> Just the C parser has to transform mask ? vec1 : vec2 to mask != 0 ? >>>>>>>>> vec1 : vec2. >>>>>>>> >>>>>>>> This happens already in the new version of patch (not submitted yet). >>>>>>>> >>>>>>>>> This comparison can be eliminated by optimization passes >>>>>>>>> that >>>>>>>>> either replace it by the real comparison computing the mask or just >>>>>>>>> propagating the information this mask is already {-1,...} / {0,....} by simply >>>>>>>>> dropping the comparison against zero. >>>>>>>> >>>>>>>> This is not a problem, because the backend recognizes these patterns, >>>>>>>> so no optimization is needed in this part. >>>>>>> >>>>>>> I mean for >>>>>>> >>>>>>>  mask = v1 < v2 ? {-1,...} : {0,...}; >>>>>>>  val = VCOND_EXPR ; >>>>>>> >>>>>>> optimizers can see how mask is defined and drop the != 0 test or replace >>>>>>> it by v1 < v2. >>>>>> >>>>>> Yes, sure. >>>>>> >>>>>>>>> For the backends I'd have vcond patterns for both an embedded comparison >>>>>>>>> and for a mask.  
(Now we can rewind the discussion a bit and allow >>>>>>>>> arbitrary masks and define a vcond with a mask operand to do bitwise >>>>>>>>> selection - what matters is the C frontend semantics which we need to >>>>>>>>> translate to what the middle-end thinks of a VEC_COND_EXPR, they >>>>>>>>> do not have to agree). >>>>>>>> >>>>>>>> But it seems like another combinatorial explosion here. Considering >>>>>>>> what Richard said in his e-mail, in order to support "generic" vcond, >>>>>>>> we just need to enumerate all the possible cases. Or I didn't >>>>>>>> understand right? >>>>>>> >>>>>>> Well, the question is still what VCOND_EXPR and thus the vcond pattern >>>>>>> semantically does for a non-comparison operand.  I'd argue that using >>>>>>> the bitwise selection semantic gives us maximum flexibility and a native >>>>>>> instruction with AMD XOP.  This non-comparison VCOND_EXPR is >>>>>>> also easy to implement in the middle-end expansion code if there is >>>>>>> no native instruction for it - by simply emitting the bitwise operations. >>>>>>> >>>>>>> But I have the feeling we are talking past each other ...? >>>>>> >>>>>> I am all for the bitwise behaviour in the backend pattern, that is >>>>>> something that I rely on at the moment. What I don't want to have is >>>>>> the same behaviour in the frontend. So If we can guarantee, that we >>>>>> add != 0, when we don't know the "nature" of the mask, then I am >>>>>> perfectly fine with the back-end having bitwise-selection behaviour. >>>>> >>>>> Well, the C frontend would simply always add that != 0 (because it >>>>> doesn't know). >>>>> >>>>>>>> I mean, I don't mind of course, but it seems to me that it would be >>>>>>>> cleaner to have one generic enough pattern. >>>>>>>> >>>>>>>> Is there seriously no way to pass something from optab into the backend?? >>>>>>> >>>>>>> You can pass operands.  And information is implicitly encoded in the name. >>>>>> >>>>>> I didn't quite get that, could you give an example? >>>>> >>>>> It was a larger variant of "no, apart from what is obvious". >>>> >>>> Ha, joking again. :) >>>> >>>>>>>>> If the mask is computed by a function you are of course out of luck, >>>>>>>>> but I don't see how you'd manage to infer knowledge from nowhere either. >>>>>>>> >>>>>>>> Well, take simpler example >>>>>>>> >>>>>>>> a = {0}; >>>>>>>> for ( ; *p; p += 16) >>>>>>>>  a &= pattern > (vec)*p; >>>>>>>> >>>>>>>> res = a ? v0 : v1; >>>>>>>> >>>>>>>> In this case it is simple to analyse that a is a comparison, but you >>>>>>>> cannot embed the operations of a into VEC_COND_EXPR. >>>>>>> >>>>>>> Sure, but if the above is C source the frontend would generate >>>>>>> res = a != 0 ? v0 : v1; initially.  An optimization pass could still >>>>>>> track this information and replace VEC_COND_EXPR >>>>>>> with VEC_COND_EXPR (no existing one would track >>>>>>> vector contents though). >>>>>> >>>>>> Yeah, sure. My point is, that we must be able to pass this information >>>>>> in the backend, that we checked everything, and we are sure that a is >>>>>> a corerct mask, please don't add any != 0 to it. >>>>> >>>>> But all masks are correct as soon as they appear in a VEC_COND_EXPR. >>>>> That's the whole point of the bitwise semantics.  It's only the C frontend >>>>> that needs to be careful to impose its stricter semantics. >>>>> >>>>> Richard. >>>>> >>>> >>>> Ok, I see the last difference in the approaches we envision. 
>>>> I am assuming, that frontend does not put != 0, but the later >>>> optimisations (veclower in my case) check every mask in VEC_COND_EXPR >>>> and does the same functionality as you describe. So the philosophical >>>> question why it is better to first add and then remove, rather than >>>> just add if needed? >>> >>> Well, it's "better be right than sorry".  Thus, default to the >>> conservatively correct >>> way and let optimizers "optimize" it. >> >> How can we get sorry, it is impossible to skip the vcond during the >> optimisation, but whatever, it is not really so important when to add. >> Currently I have a bigger problem, see below. >>> >>>> In all the rest I think we agreed. >>> >>> Fine. >>> >>> Thanks, >>> Richard. >>> >>>> >>>> Artem. >>>> >>> >> >> I found out that I cannot really gimplify correctly the vcondb , >> c, d> expression when a > b is vcond b, -1, 0>. The problem is >> that gimplifier pulls a > b always as a separate expression during the >> gimplification, and I don't think that we can avoid it. So what >> happens is: >> >> vcond b , c , d> >> transformed to >> x = b > c; >> x1 = vcond >> vcond >> >> and so on, infinitely long. > > Sounds like a bug that is possible to fix. > >> In order to fix the problem we need whether to introduce a new code >> like VEC_COMP_LT, VEC_COMP_GT, and so on >> whether a builtin function which we would lower >> whether stick back to the idea of hook. >> >> Anyway, representing a >b using vcond does not work. > > Well, sure it will work, it just needs some work appearantly. > >> What would be your thinking here? > > Do you have a patch that exposes this problem?  I can have a look > tomorrow. > > Richard. > >> >> Thanks, >> Artem. >> > Index: gcc/doc/extend.texi =================================================================== --- gcc/doc/extend.texi (revision 177665) +++ gcc/doc/extend.texi (working copy) @@ -6553,6 +6553,97 @@ invoke undefined behavior at runtime. W accesses for vector subscription can be enabled with @option{-Warray-bounds}. +In C vector comparison is supported within standard comparison operators: +@code{==, !=, <, <=, >, >=}. Both integer-type and real-type vectors +can be compared but only of the same type. The result of the +comparison is a signed integer-type vector where the size of each +element must be the same as the size of compared vectors element. +Comparison is happening element by element. False value is 0, true +value is -1 (constant of the appropriate type where all bits are set). +Consider the following example. + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,2,3,4@}; +v4si b = @{3,2,1,4@}; +v4si c; + +c = a > b; /* The result would be @{0, 0,-1, 0@} */ +c = a == b; /* The result would be @{0,-1, 0,-1@} */ +@end smallexample + +In addition to the vector comparison C supports conditional expressions +where the condition is a vector of signed integers. In that case result +of the condition is used as a mask to select either from the first +operand or from the second. Consider the following example: + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,2,3,4@}; +v4si b = @{3,2,1,7@}; +v4si c = @{2,3,4,5@}; +v4si d = @{6,7,8,9@}; +v4si res; + +res = a >= b ? c : d; /* res would contain @{6, 3, 4, 9@} */ +@end smallexample + +The number of elements in the condition must be the same as number of +elements in the both operands. The same stands for the size of the type +of the elements. 
The type of the vector conditional is determined by +the types of the operands which must be the same. Consider an example: + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); +typedef float v4f __attribute__ ((vector_size (16))); + +v4si a = @{1,2,3,4@}; +v4si b = @{2,3,4,5@}; +v4f f = @{1., 5., 7., -8.@}; +v4f g = @{3., -2., 8., 1.@}; +v4si ires; +v4f fres; + +fres = a <= b ? f : g; /* fres would contain @{1., 5., 7., -8.@} */ +ires = f <= g ? a : b; /* fres would contain @{1, 3, 3, 4@} */ +@end smallexample + +For the convenience condition in the vector conditional can be just a +vector of signed integer type. In that case this vector is implicitly +compared with vectors of zeroes. Consider an example: + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,0,3,0@}; +v4si b = @{2,3,4,5@}; +v4si ires; + +ires = a ? b : a; /* synonym for ires = a != @{0,0,0,0@} ? a :b; */ +@end smallexample + +Pleas note that the conditional where the operands are vectors and the +condition is integer works in a standard way -- returns first operand +if the condition is true and second otherwise. Consider an example: + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,0,3,0@}; +v4si b = @{2,3,4,5@}; +v4si ires; +int x,y; + +/* standard conditional returning A or B */ +ires = x > y ? a : b; + +/* vector conditional where the condition is (x > y ? a : b) */ +ires = (x > y ? a : b) ? b : a; +@end smallexample + + You can declare variables and use them in function calls and returns, as well as in assignments and some casts. You can specify a vector type as a return type for a function. Vector types can also be used as function Index: gcc/doc/tm.texi =================================================================== --- gcc/doc/tm.texi (revision 177665) +++ gcc/doc/tm.texi (working copy) @@ -5738,6 +5738,10 @@ misalignment value (@var{misalign}). Return true if vector alignment is reachable (by peeling N iterations) for the given type. @end deftypefn +@deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VEC_COMPARE (gimple_stmt_iterator *@var{gsi}, tree @var{type}, tree @var{v0}, tree @var{v1}, enum tree_code @var{code}) +This hook should check whether it is possible to express vectorcomparison using the hardware-specific instructions and return resulttree. Hook should return NULL_TREE if expansion is impossible. +@end deftypefn + @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VEC_PERM (tree @var{type}, tree *@var{mask_element_type}) Target builtin that implements vector permute. @end deftypefn Index: gcc/doc/tm.texi.in =================================================================== --- gcc/doc/tm.texi.in (revision 177665) +++ gcc/doc/tm.texi.in (working copy) @@ -5676,6 +5676,8 @@ misalignment value (@var{misalign}). Return true if vector alignment is reachable (by peeling N iterations) for the given type. @end deftypefn +@hook TARGET_VECTORIZE_BUILTIN_VEC_COMPARE + @hook TARGET_VECTORIZE_BUILTIN_VEC_PERM Target builtin that implements vector permute. @end deftypefn Index: gcc/targhooks.c =================================================================== --- gcc/targhooks.c (revision 177665) +++ gcc/targhooks.c (working copy) @@ -969,6 +969,18 @@ default_builtin_vector_alignment_reachab return true; } +/* Replaces vector comparison with the target-specific instructions + and returns the resulting variable or NULL_TREE otherwise. 
*/ +tree +default_builtin_vec_compare (gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, + tree type ATTRIBUTE_UNUSED, + tree v0 ATTRIBUTE_UNUSED, + tree v1 ATTRIBUTE_UNUSED, + enum tree_code code ATTRIBUTE_UNUSED) +{ + return NULL_TREE; +} + /* By default, assume that a target supports any factor of misalignment memory access if it supports movmisalign patten. is_packed is true if the memory access is defined in a packed struct. */ Index: gcc/targhooks.h =================================================================== --- gcc/targhooks.h (revision 177665) +++ gcc/targhooks.h (working copy) @@ -86,6 +86,11 @@ extern int default_builtin_vectorization extern tree default_builtin_reciprocal (unsigned int, bool, bool); extern bool default_builtin_vector_alignment_reachable (const_tree, bool); + +extern tree default_builtin_vec_compare (gimple_stmt_iterator *gsi, + tree type, tree v0, tree v1, + enum tree_code code); + extern bool default_builtin_support_vector_misalignment (enum machine_mode mode, const_tree, Index: gcc/target.def =================================================================== --- gcc/target.def (revision 177665) +++ gcc/target.def (working copy) @@ -988,6 +988,15 @@ DEFHOOK bool, (tree vec_type, tree mask), hook_bool_tree_tree_true) +/* Implement hardware vector comparison or return false. */ +DEFHOOK +(builtin_vec_compare, + "This hook should check whether it is possible to express vector\ +comparison using the hardware-specific instructions and return result\ +tree. Hook should return NULL_TREE if expansion is impossible.", + tree, (gimple_stmt_iterator *gsi, tree type, tree v0, tree v1, enum tree_code code), + default_builtin_vec_compare) + /* Return true if the target supports misaligned store/load of a specific factor denoted in the third parameter. The last parameter is true if the access is defined in a packed struct. 
*/ Index: gcc/optabs.c =================================================================== --- gcc/optabs.c (revision 177665) +++ gcc/optabs.c (working copy) @@ -6572,16 +6572,37 @@ expand_vec_cond_expr (tree vec_cond_type if (icode == CODE_FOR_nothing) return 0; - comparison = vector_compare_rtx (op0, unsignedp, icode); rtx_op1 = expand_normal (op1); rtx_op2 = expand_normal (op2); + + if (COMPARISON_CLASS_P (op0)) + { + comparison = vector_compare_rtx (op0, unsignedp, icode); + create_output_operand (&ops[0], target, mode); + create_input_operand (&ops[1], rtx_op1, mode); + create_input_operand (&ops[2], rtx_op2, mode); + create_fixed_operand (&ops[3], comparison); + create_fixed_operand (&ops[4], XEXP (comparison, 0)); + create_fixed_operand (&ops[5], XEXP (comparison, 1)); + + } + else + { + rtx rtx_op0; + rtx vec; + + rtx_op0 = expand_normal (op0); + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); + vec = CONST0_RTX (mode); + + create_output_operand (&ops[0], target, mode); + create_input_operand (&ops[1], rtx_op1, mode); + create_input_operand (&ops[2], rtx_op2, mode); + create_input_operand (&ops[3], comparison, mode); + create_input_operand (&ops[4], rtx_op0, mode); + create_input_operand (&ops[5], vec, mode); + } - create_output_operand (&ops[0], target, mode); - create_input_operand (&ops[1], rtx_op1, mode); - create_input_operand (&ops[2], rtx_op2, mode); - create_fixed_operand (&ops[3], comparison); - create_fixed_operand (&ops[4], XEXP (comparison, 0)); - create_fixed_operand (&ops[5], XEXP (comparison, 1)); expand_insn (icode, 6, ops); return ops[0].value; } Index: gcc/target.h =================================================================== --- gcc/target.h (revision 177665) +++ gcc/target.h (working copy) @@ -51,6 +51,7 @@ #define GCC_TARGET_H #include "insn-modes.h" +#include "gimple.h" #ifdef ENABLE_CHECKING Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 177665) +++ gcc/fold-const.c (working copy) @@ -5930,12 +5930,21 @@ extract_muldiv_1 (tree t, tree c, enum t } /* Return a node which has the indicated constant VALUE (either 0 or - 1), and is of the indicated TYPE. */ + 1 for scalars and is either {-1,-1,..} or {0,0,...} for vectors), + and is of the indicated TYPE. */ tree constant_boolean_node (int value, tree type) { - if (type == integer_type_node) + if (TREE_CODE (type) == VECTOR_TYPE) + { + tree tval; + + gcc_assert (TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE); + tval = build_int_cst (TREE_TYPE (type), value ? -1 : 0); + return build_vector_from_val (type, tval); + } + else if (type == integer_type_node) return value ? integer_one_node : integer_zero_node; else if (type == boolean_type_node) return value ? boolean_true_node : boolean_false_node; @@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr floating-point, we can only do some of these simplifications.) */ if (operand_equal_p (arg0, arg1, 0)) { + tree arg0_type = TREE_TYPE (arg0); + switch (code) { case EQ_EXPR: - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0)))) + if (! FLOAT_TYPE_P (arg0_type) + || ! HONOR_NANS (TYPE_MODE (arg0_type))) return constant_boolean_node (1, type); break; case GE_EXPR: case LE_EXPR: - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0)))) + if (! FLOAT_TYPE_P (arg0_type) + || ! 
HONOR_NANS (TYPE_MODE (arg0_type))) return constant_boolean_node (1, type); return fold_build2_loc (loc, EQ_EXPR, type, arg0, arg1); case NE_EXPR: /* For NE, we can only do this simplification if integer or we don't honor IEEE floating point NaNs. */ - if (FLOAT_TYPE_P (TREE_TYPE (arg0)) - && HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0)))) + if (FLOAT_TYPE_P (arg0_type) + && HONOR_NANS (TYPE_MODE (arg0_type))) break; /* ... fall through ... */ case GT_EXPR: Index: gcc/testsuite/gcc.c-torture/execute/vector-vcond-2.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-vcond-2.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-vcond-2.c (revision 0) @@ -0,0 +1,78 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) + +#define check_compare(count, res, i0, i1, c0, c1, op, fmt0, fmt1) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i ++) { \ + if ((res)[__i] != \ + ((i0)[__i] op (i1)[__i] \ + ? (c0)[__i] : (c1)[__i])) \ + { \ + __builtin_printf (fmt0 " != (" fmt1 " " #op " " fmt1 " ? " \ + fmt0 " : " fmt0 ")", \ + (res)[__i], (i0)[__i], (i1)[__i],\ + (c0)[__i], (c1)[__i]); \ + __builtin_abort (); \ + } \ + } \ +} while (0) + +#define test(count, v0, v1, c0, c1, res, fmt0, fmt1); \ +do { \ + res = (v0 > v1) ? c0: c1; \ + check_compare (count, res, v0, v1, c0, c1, >, fmt0, fmt1); \ + res = (v0 >= v1) ? c0: c1; \ + check_compare (count, res, v0, v1, c0, c1, >=, fmt0, fmt1); \ + res = (v0 < v1) ? c0: c1; \ + check_compare (count, res, v0, v1, c0, c1, <, fmt0, fmt1); \ + res = (v0 <= v1) ? c0: c1; \ + check_compare (count, res, v0, v1, c0, c1, <=, fmt0, fmt1); \ + res = (v0 == v1) ? c0: c1; \ + check_compare (count, res, v0, v1, c0, c1, ==, fmt0, fmt1); \ + res = (v0 != v1) ? c0: c1; \ + check_compare (count, res, v0, v1, c0, c1, !=, fmt0, fmt1); \ +} while (0) + + +int main (int argc, char *argv[]) { + vector (4, int) i0 = {argc, 1, 2, 10}; + vector (4, int) i1 = {0, argc, 2, (int)-23}; + vector (4, int) ires; + vector (4, float) f0 = {1., 7., (float)argc, 4.}; + vector (4, float) f1 = {6., 2., 8., (float)argc}; + vector (4, float) fres; + + vector (2, double) d0 = {1., (double)argc}; + vector (2, double) d1 = {6., 2.}; + vector (2, double) dres; + vector (2, long) l0 = {argc, 3}; + vector (2, long) l1 = {5, 8}; + vector (2, long) lres; + + /* Thes tests work fine. */ + test (4, i0, i1, f0, f1, fres, "%f", "%i"); + test (4, f0, f1, i0, i1, ires, "%i", "%f"); + test (2, d0, d1, l0, l1, lres, "%i", "%f"); + test (2, l0, l1, d0, d1, dres, "%f", "%i"); + + /* Condition expressed with a single variable. */ + dres = l0 ? d0 : d1; + check_compare (2, dres, l0, ((vector (2, long)){-1,-1}), d0, d1, ==, "%f", "%i"); + + lres = l1 ? l0 : l1; + check_compare (2, lres, l1, ((vector (2, long)){-1,-1}), l0, l1, ==, "%i", "%i"); + + fres = i0 ? f0 : f1; + check_compare (4, fres, i0, ((vector (4, int)){-1,-1,-1,-1}), + f0, f1, ==, "%f", "%i"); + + ires = i1 ? 
i0 : i1; + check_compare (4, ires, i1, ((vector (4, int)){-1,-1,-1,-1}), + i0, i1, ==, "%i", "%i"); + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c (revision 0) @@ -0,0 +1,123 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define check_compare(count, res, i0, i1, op, fmt) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i ++) { \ + if ((res)[__i] != ((i0)[__i] op (i1)[__i] ? -1 : 0)) \ + { \ + __builtin_printf ("%i != ((" fmt " " #op " " fmt " ? -1 : 0) ", \ + (res)[__i], (i0)[__i], (i1)[__i]); \ + __builtin_abort (); \ + } \ + } \ +} while (0) + +#define test(count, v0, v1, res, fmt); \ +do { \ + res = (v0 > v1); \ + check_compare (count, res, v0, v1, >, fmt); \ + res = (v0 < v1); \ + check_compare (count, res, v0, v1, <, fmt); \ + res = (v0 >= v1); \ + check_compare (count, res, v0, v1, >=, fmt); \ + res = (v0 <= v1); \ + check_compare (count, res, v0, v1, <=, fmt); \ + res = (v0 == v1); \ + check_compare (count, res, v0, v1, ==, fmt); \ + res = (v0 != v1); \ + check_compare (count, res, v0, v1, !=, fmt); \ +} while (0) + + +int main (int argc, char *argv[]) { +#define INT int + vector (4, INT) i0; + vector (4, INT) i1; + vector (4, int) ires; + int i; + + i0 = (vector (4, INT)){argc, 1, 2, 10}; + i1 = (vector (4, INT)){0, 3, 2, (INT)-23}; + test (4, i0, i1, ires, "%i"); +#undef INT + +#define INT unsigned int + vector (4, int) ures; + vector (4, INT) u0; + vector (4, INT) u1; + + u0 = (vector (4, INT)){argc, 1, 2, 10}; + u1 = (vector (4, INT)){0, 3, 2, (INT)-23}; + test (4, u0, u1, ures, "%u"); +#undef INT + + +#define SHORT short + vector (8, SHORT) s0; + vector (8, SHORT) s1; + vector (8, short) sres; + + s0 = (vector (8, SHORT)){argc, 1, 2, 10, 6, 87, (SHORT)-5, 2}; + s1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; + test (8, s0, s1, sres, "%i"); +#undef SHORT + +#define SHORT unsigned short + vector (8, SHORT) us0; + vector (8, SHORT) us1; + vector (8, short) usres; + + us0 = (vector (8, SHORT)){argc, 1, 2, 10, 6, 87, (SHORT)-5, 2}; + us1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; + test (8, us0, us1, usres, "%u"); +#undef SHORT + +#define CHAR signed char + vector (16, CHAR) c0; + vector (16, CHAR) c1; + vector (16, signed char) cres; + + c0 = (vector (16, CHAR)){argc, 1, 2, 10, 6, 87, (CHAR)-5, 2, \ + argc, 1, 2, 10, 6, 87, (CHAR)-5, 2 }; + + c1 = (vector (16, CHAR)){0, 3, 2, (CHAR)-23, 12, 10, (CHAR)-2, 0, \ + 0, 3, 2, (CHAR)-23, 12, 10, (CHAR)-2, 0}; + test (16, c0, c1, cres, "%i"); +#undef CHAR + +#define CHAR unsigned char + vector (16, CHAR) uc0; + vector (16, CHAR) uc1; + vector (16, signed char) ucres; + + uc0 = (vector (16, CHAR)){argc, 1, 2, 10, 6, 87, (CHAR)-5, 2, \ + argc, 1, 2, 10, 6, 87, (CHAR)-5, 2 }; + + uc1 = (vector (16, CHAR)){0, 3, 2, (CHAR)-23, 12, 10, (CHAR)-2, 0, \ + 0, 3, 2, (CHAR)-23, 12, 10, (CHAR)-2, 0}; + test (16, uc0, uc1, ucres, "%u"); +#undef CHAR +/* Float comparison. */ + vector (4, float) f0; + vector (4, float) f1; + vector (4, int) ifres; + + f0 = (vector (4, float)){(float)argc, 1., 2., 10.}; + f1 = (vector (4, float)){0., 3., 2., (float)-23}; + test (4, f0, f1, ifres, "%f"); + +/* Double comparison. 
*/ + vector (2, double) d0; + vector (2, double) d1; + vector (2, long) idres; + + d0 = (vector (2, double)){(double)argc, 10.}; + d1 = (vector (2, double)){0., (double)-23}; + test (2, d0, d1, idres, "%f"); + + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vector-vcond-1.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-vcond-1.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-vcond-1.c (revision 0) @@ -0,0 +1,154 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) + +#define check_compare(type, count, res, i0, i1, c0, c1, op, fmt) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i ++) { \ + if (vidx (type, res, __i) != \ + ((vidx (type, i0, __i) op vidx (type, i1, __i)) \ + ? vidx (type, c0, __i) : vidx (type, c1, __i))) \ + { \ + __builtin_printf (fmt " != ((" fmt " " #op " " fmt ") ? " fmt " : " fmt ")", \ + vidx (type, res, __i), vidx (type, i0, __i), vidx (type, i1, __i),\ + vidx (type, c0, __i), vidx (type, c1, __i)); \ + __builtin_abort (); \ + } \ + } \ +} while (0) + +#define test(type, count, v0, v1, c0, c1, res, fmt); \ +do { \ + res = (v0 > v1) ? c0: c1; \ + check_compare (type, count, res, v0, v1, c0, c1, >, fmt); \ + res = (v0 >= v1) ? c0: c1; \ + check_compare (type, count, res, v0, v1, c0, c1, >=, fmt); \ + res = (v0 < v1) ? c0: c1; \ + check_compare (type, count, res, v0, v1, c0, c1, <, fmt); \ + res = (v0 <= v1) ? c0: c1; \ + check_compare (type, count, res, v0, v1, c0, c1, <=, fmt); \ + res = (v0 == v1) ? c0: c1; \ + check_compare (type, count, res, v0, v1, c0, c1, ==, fmt); \ + res = (v0 != v1) ? c0: c1; \ + check_compare (type, count, res, v0, v1, c0, c1, !=, fmt); \ +} while (0) + +int main (int argc, char *argv[]) { +#define INT int + vector (4, INT) i0; vector (4, INT) i1; + vector (4, INT) ic0; vector (4, INT) ic1; + vector (4, INT) ires; + + i0 = (vector (4, INT)){argc, 1, 2, 10}; + i1 = (vector (4, INT)){0, 3, 2, (INT)-23}; + + ic0 = (vector (4, INT)){1, argc, argc, 10}; + ic1 = (vector (4, INT)){2, 3, argc, (INT)-23}; + test (INT, 4, i0, i1, ic0, ic1, ires, "%i"); +#undef INT + +#define INT unsigned int + vector (4, INT) ui0; vector (4, INT) ui1; + vector (4, INT) uic0; vector (4, INT) uic1; + vector (4, INT) uires; + + ui0 = (vector (4, INT)){argc, 1, 2, 10}; + ui1 = (vector (4, INT)){0, 3, 2, (INT)-23}; + + uic0 = (vector (4, INT)){1, argc, argc, 10}; + uic1 = (vector (4, INT)){2, 3, argc, (INT)-23}; + test (INT, 4, ui0, ui1, uic0, uic1, uires, "%u"); +#undef INT + +#define SHORT short + vector (8, SHORT) s0; vector (8, SHORT) s1; + vector (8, SHORT) sc0; vector (8, SHORT) sc1; + vector (8, short) sres; + + s0 = (vector (8, SHORT)){argc, 1, 2, 10, 6, 87, (SHORT)-5, 2}; + s1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; + + sc0 = (vector (8, SHORT)){argc, 1, argc, 10, 6, 87, (SHORT)-5, argc}; + sc1= (vector (8, SHORT)){0, 5, 2, (SHORT)-23, 2, 10, (SHORT)-2, argc}; + + test (SHORT, 8, s0, s1, sc0, sc1, sres, "%i"); +#undef SHORT + +#define SHORT unsigned short + vector (8, SHORT) us0; vector (8, SHORT) us1; + vector (8, SHORT) usc0; vector (8, SHORT) usc1; + vector (8, SHORT) usres; + + us0 = (vector (8, SHORT)){argc, 1, 2, 10, 6, 87, (SHORT)-5, 2}; + us1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; + + usc0 = (vector (8, SHORT)){argc, 1, argc, 10, 6, 87, (SHORT)-5, argc}; + usc1= (vector (8, SHORT)){0, 5, 2, (SHORT)-23, 
2, 10, (SHORT)-2, argc}; + + test (SHORT, 8, us0, us1, usc0, usc1, usres, "%u"); +#undef SHORT + +#define CHAR signed char + vector (16, CHAR) c0; vector (16, CHAR) c1; + vector (16, CHAR) cc0; vector (16, CHAR) cc1; + vector (16, CHAR) cres; + + c0 = (vector (16, CHAR)){argc, 1, 2, 4, 7, 87, (CHAR)-5, 2, \ + argc, 1, 3, 18, 6, 87, (CHAR)-5, 2 }; + + c1 = (vector (16, CHAR)){0, 3, 2, (CHAR)-23, 12, 10, (CHAR)-2, 0, \ + 0, 3, 2, (CHAR)-5, 28, 10, (CHAR)-2, 0}; + + cc0 = (vector (16, CHAR)){argc, 1, argc, 4, 7, 87, (CHAR)-23, 2, \ + 33, 8, 3, 18, 6, 87, (CHAR)-5, 41 }; + + cc1 = (vector (16, CHAR)){0, 27, 2, (CHAR)-1, 12, 10, (CHAR)-2, 0, \ + 0, 3, 0x23, (CHAR)-5, 28, 1, (CHAR)-2, 0}; + + test (CHAR, 16, c0, c1, cc0, cc1, cres, "%i"); +#undef CHAR + +#define CHAR unsigned char + vector (16, CHAR) uc0; vector (16, CHAR) uc1; + vector (16, CHAR) ucc0; vector (16, CHAR) ucc1; + vector (16, CHAR) ucres; + + uc0 = (vector (16, CHAR)){argc, 1, 2, 4, 7, 87, (CHAR)-5, 2, \ + argc, 1, 3, 18, 6, 87, (CHAR)-5, 2 }; + + uc1 = (vector (16, CHAR)){0, 3, 2, (CHAR)-23, 12, 10, (CHAR)-2, 0, \ + 0, 3, 2, (CHAR)-5, 28, 10, (CHAR)-2, 0}; + + ucc0 = (vector (16, CHAR)){argc, 1, argc, 4, 7, 87, (CHAR)-23, 2, \ + 33, 8, 3, 18, 6, 87, (CHAR)-5, 41 }; + + ucc1 = (vector (16, CHAR)){0, 27, 2, (CHAR)-1, 12, 10, (CHAR)-2, 0, \ + 0, 3, 0x23, (CHAR)-5, 28, 1, (CHAR)-2, 0}; + + test (CHAR, 16, uc0, uc1, ucc0, ucc1, ucres, "%u"); +#undef CHAR + +/* Float version. */ + vector (4, float) f0 = {1., 7., (float)argc, 4.}; + vector (4, float) f1 = {6., 2., 8., (float)argc}; + vector (4, float) fc0 = {3., 12., 4., (float)argc}; + vector (4, float) fc1 = {7., 5., (float)argc, 6.}; + vector (4, float) fres; + + test (float, 4, f0, f1, fc0, fc1, fres, "%f"); + +/* Double version. */ + vector (2, double) d0 = {1., (double)argc}; + vector (2, double) d1 = {6., 2.}; + vector (2, double) dc0 = {(double)argc, 7.}; + vector (2, double) dc1 = {7., 5.}; + vector (2, double) dres; + + //test (double, 2, d0, d1, dc0, dc1, dres, "%f"); + + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vector-compare-2.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vector-compare-2.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-compare-2.c (revision 0) @@ -0,0 +1,27 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +/* Check that constant folding in + these simple cases works. 
*/ +vector (4, int) +foo (vector (4, int) x) +{ + return (x == x) + (x != x) + (x > x) + + (x < x) + (x >= x) + (x <= x); +} + +int +main (int argc, char *argv[]) +{ + vector (4, int) t = {argc, 2, argc, 42}; + vector (4, int) r; + int i; + + r = foo (t); + + for (i = 0; i < 4; i++) + if (r[i] != -3) + __builtin_abort (); + + return 0; +} Index: gcc/testsuite/gcc.dg/vector-compare-1.c =================================================================== --- gcc/testsuite/gcc.dg/vector-compare-1.c (revision 0) +++ gcc/testsuite/gcc.dg/vector-compare-1.c (revision 0) @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +void +foo (vector (4, int) x, vector (4, float) y) +{ + vector (4, int) p4; + vector (4, int) r4; + vector (4, unsigned int) q4; + vector (8, int) r8; + vector (4, float) f4; + + r4 = x > y; /* { dg-error "comparing vectors with different element types" } */ + r8 = (x != p4); /* { dg-error "incompatible types when assigning to type" } */ + r8 == r4; /* { dg-error "comparing vectors with different number of elements" } */ + + r4 ? y : p4; /* { dg-error "vectors of different types involved in vector comparison" } */ + r4 ? r4 : r8; /* { dg-error "vectors of different length found in vector comparison" } */ + y ? f4 : y; /* { dg-error "non-integer type in vector condition" } */ + + /* Do not trigger that */ + q4 ? p4 : r4; /* { "vector comparison must be of signed integer vector type" } */ +} Index: gcc/testsuite/gcc.dg/vector-compare-2.c =================================================================== --- gcc/testsuite/gcc.dg/vector-compare-2.c (revision 0) +++ gcc/testsuite/gcc.dg/vector-compare-2.c (revision 0) @@ -0,0 +1,27 @@ +/* { dg-do compile } */ + +/* Test if C_MAYBE_CONST are folded correctly when + creating VEC_COND_EXPR. */ + +typedef int vec __attribute__((vector_size(16))); + +vec i,j; +extern vec a, b, c; + +vec +foo (int x) +{ + return (x ? i : j) ? a : b; +} + +vec +bar (int x) +{ + return a ? (x ? i : j) : b; +} + +vec +baz (int x) +{ + return a ? b : (x ? i : j); +} Index: gcc/c-typeck.c =================================================================== --- gcc/c-typeck.c (revision 177665) +++ gcc/c-typeck.c (working copy) @@ -4009,6 +4009,52 @@ ep_convert_and_check (tree type, tree ex return convert (type, expr); } +static tree +fold_build_vec_cond_expr (tree ifexp, tree op1, tree op2) +{ + bool wrap = true; + bool maybe_const = false; + tree vcond, tmp; + + /* Avoid C_MAYBE_CONST in VEC_COND_EXPR. */ + tmp = c_fully_fold (ifexp, false, &maybe_const); + ifexp = save_expr (tmp); + wrap &= maybe_const; + + tmp = c_fully_fold (op1, false, &maybe_const); + op1 = save_expr (tmp); + wrap &= maybe_const; + + tmp = c_fully_fold (op2, false, &maybe_const); + op2 = save_expr (tmp); + wrap &= maybe_const; + + /* Currently the expansion of VEC_COND_EXPR does not allow + expessions where the type of vectors you compare differs + form the type of vectors you select from. For the time + being we insert implicit conversions. */ + if ((COMPARISON_CLASS_P (ifexp) + && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1)) + || TREE_TYPE (ifexp) != TREE_TYPE (op1)) + { + tree comp_type = COMPARISON_CLASS_P (ifexp) + ? 
TREE_TYPE (TREE_OPERAND (ifexp, 0)) + : TREE_TYPE (ifexp); + + op1 = convert (comp_type, op1); + op2 = convert (comp_type, op2); + vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2); + vcond = convert (TREE_TYPE (op1), vcond); + } + else + vcond = build3 (VEC_COND_EXPR, TREE_TYPE (op1), ifexp, op1, op2); + + /*if (!wrap) + vcond = c_wrap_maybe_const (vcond, true);*/ + + return vcond; +} + /* Build and return a conditional expression IFEXP ? OP1 : OP2. If IFEXP_BCP then the condition is a call to __builtin_constant_p, and if folded to an integer constant then the unselected half may @@ -4058,6 +4104,49 @@ build_conditional_expr (location_t colon type2 = TREE_TYPE (op2); code2 = TREE_CODE (type2); + if (TREE_CODE (TREE_TYPE (ifexp)) == VECTOR_TYPE) + { + if (TREE_CODE (type1) != VECTOR_TYPE + || TREE_CODE (type2) != VECTOR_TYPE) + { + error_at (colon_loc, "vector comparison arguments must be of " + "type vector"); + return error_mark_node; + } + + if (TREE_CODE (TREE_TYPE (TREE_TYPE (ifexp))) != INTEGER_TYPE) + { + error_at (colon_loc, "non-integer type in vector condition"); + return error_mark_node; + } + + if (TYPE_VECTOR_SUBPARTS (type1) != TYPE_VECTOR_SUBPARTS (type2) + || TYPE_VECTOR_SUBPARTS (TREE_TYPE (ifexp)) + != TYPE_VECTOR_SUBPARTS (type1)) + { + error_at (colon_loc, "vectors of different length found in " + "vector comparison"); + return error_mark_node; + } + + if (TREE_TYPE (type1) != TREE_TYPE (type2)) + { + error_at (colon_loc, "vectors of different types involved in " + "vector comparison"); + return error_mark_node; + } + + if (TYPE_SIZE (TREE_TYPE (TREE_TYPE (ifexp))) + != TYPE_SIZE (TREE_TYPE (type1))) + { + error_at (colon_loc, "vector-condition element type must be " + "the same as result vector element type"); + return error_mark_node; + } + + return fold_build_vec_cond_expr (ifexp, op1, op2); + } + /* C90 does not permit non-lvalue arrays in conditional expressions. In C99 they will be pointers by now. */ if (code1 == ARRAY_TYPE || code2 == ARRAY_TYPE) @@ -9906,6 +9995,37 @@ build_binary_op (location_t location, en case EQ_EXPR: case NE_EXPR: + if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE) + { + tree intt; + if (TREE_TYPE (type0) != TREE_TYPE (type1)) + { + error_at (location, "comparing vectors with different " + "element types"); + return error_mark_node; + } + + if (TYPE_VECTOR_SUBPARTS (type0) != TYPE_VECTOR_SUBPARTS (type1)) + { + error_at (location, "comparing vectors with different " + "number of elements"); + return error_mark_node; + } + + /* Always construct signed integer vector type. 
*/ + intt = c_common_type_for_size (TYPE_PRECISION (TREE_TYPE (type0)),0); + result_type = build_vector_type (intt, TYPE_VECTOR_SUBPARTS (type0)); + converted = 1; + /*break; */ + + ret = fold_build_vec_cond_expr + (build2 (code, result_type, op0, op1), + build_vector_from_val (result_type, + build_int_cst (intt, -1)), + build_vector_from_val (result_type, + build_int_cst (intt, 0))); + goto return_build_binary_op; + } if (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1)) warning_at (location, OPT_Wfloat_equal, @@ -10018,6 +10138,37 @@ build_binary_op (location_t location, en case GE_EXPR: case LT_EXPR: case GT_EXPR: + if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE) + { + tree intt; + if (TREE_TYPE (type0) != TREE_TYPE (type1)) + { + error_at (location, "comparing vectors with different " + "element types"); + return error_mark_node; + } + + if (TYPE_VECTOR_SUBPARTS (type0) != TYPE_VECTOR_SUBPARTS (type1)) + { + error_at (location, "comparing vectors with different " + "number of elements"); + return error_mark_node; + } + + /* Always construct signed integer vector type. */ + intt = c_common_type_for_size (TYPE_PRECISION (TREE_TYPE (type0)),0); + result_type = build_vector_type (intt, TYPE_VECTOR_SUBPARTS (type0)); + converted = 1; + /* break; */ + ret = fold_build_vec_cond_expr + (build2 (code, result_type, op0, op1), + build_vector_from_val (result_type, + build_int_cst (intt, -1)), + build_vector_from_val (result_type, + build_int_cst (intt, 0))); + goto return_build_binary_op; + + } build_type = integer_type_node; if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE || code0 == FIXED_POINT_TYPE) @@ -10425,6 +10576,10 @@ c_objc_common_truthvalue_conversion (loc case FUNCTION_TYPE: gcc_unreachable (); + case VECTOR_TYPE: + error_at (location, "used vector type where scalar is required"); + return error_mark_node; + default: break; } Index: gcc/gimplify.c =================================================================== --- gcc/gimplify.c (revision 177665) +++ gcc/gimplify.c (working copy) @@ -7064,6 +7064,22 @@ gimplify_expr (tree *expr_p, gimple_seq } break; + case VEC_COND_EXPR: + { + enum gimplify_status r0, r1, r2; + + r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, + post_p, is_gimple_condexpr, fb_rvalue); + r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p, + post_p, is_gimple_val, fb_rvalue); + r2 = gimplify_expr (&TREE_OPERAND (*expr_p, 2), pre_p, + post_p, is_gimple_val, fb_rvalue); + recalculate_side_effects (*expr_p); + + ret = MIN (r0, MIN (r1, r2)); + } + break; + case TARGET_MEM_REF: { enum gimplify_status r0 = GS_ALL_DONE, r1 = GS_ALL_DONE; @@ -7348,6 +7364,36 @@ gimplify_expr (tree *expr_p, gimple_seq { tree type = TREE_TYPE (TREE_OPERAND (*expr_p, 1)); + /* Vector comparisons is a valid gimple expression + which could be lowered down later. */ + if (TREE_CODE (type) == VECTOR_TYPE) + { + goto expr_2; + /* XXX my humble attempt to avoid comparisons. 
+ enum gimplify_status r0, r1; + tree t, f; + + debug_tree (*expr_p); + + r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, + post_p, is_gimple_condexpr, fb_rvalue); + r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p, + post_p, is_gimple_val, fb_rvalue); + + t = build_vector_from_val (TREE_TYPE (*expr_p), + build_int_cst (TREE_TYPE (TREE_TYPE (*expr_p)), -1)); + f = build_vector_from_val (TREE_TYPE (*expr_p), + build_int_cst (TREE_TYPE (TREE_TYPE (*expr_p)), 0)); + + recalculate_side_effects (*expr_p); + t = build3 (VEC_COND_EXPR, TREE_TYPE (*expr_p), *expr_p, t, f); + *expr_p = t; + + ret = MIN (r0, r1); + break;*/ + } + + if (!AGGREGATE_TYPE_P (type)) { tree org_type = TREE_TYPE (*expr_p); Index: gcc/tree.def =================================================================== --- gcc/tree.def (revision 177665) +++ gcc/tree.def (working copy) @@ -704,7 +704,10 @@ DEFTREECODE (TRUTH_NOT_EXPR, "truth_not_ The others are allowed only for integer (or pointer or enumeral) or real types. In all cases the operands will have the same type, - and the value is always the type used by the language for booleans. */ + and the value is either the type used by the language for booleans + or an integer vector type of the same size and with the same number + of elements as the comparison operands. True for a vector of + comparison results has all bits set while false is equal to zero. */ DEFTREECODE (LT_EXPR, "lt_expr", tcc_comparison, 2) DEFTREECODE (LE_EXPR, "le_expr", tcc_comparison, 2) DEFTREECODE (GT_EXPR, "gt_expr", tcc_comparison, 2) Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 177665) +++ gcc/emit-rtl.c (working copy) @@ -5474,6 +5474,11 @@ gen_const_vector (enum machine_mode mode return tem; } +rtx +gen_const_vector1 (enum machine_mode mode, int constant) +{ + return gen_const_vector (mode, constant); +} /* Generate a vector like gen_rtx_raw_CONST_VEC, but use the zero vector when all elements are zero, and the one vector when all elements are one. */ rtx Index: gcc/tree-ssa-forwprop.c =================================================================== --- gcc/tree-ssa-forwprop.c (revision 177665) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -585,6 +585,128 @@ forward_propagate_into_cond (gimple_stmt return 0; } + +static tree +combine_vec_cond_expr_cond (location_t loc, enum tree_code code, + tree type, tree op0, tree op1) +{ + tree t; + + if (op0 == NULL_TREE && op1 == NULL_TREE) + return NULL_TREE; + + if (op0 == NULL_TREE) + return op1; + + if (op1 == NULL_TREE) + return op0; + + gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison); + + t = fold_binary_loc (loc, code, type, op0, op1); + if (!t) + return NULL_TREE; + + /* Require that we got a boolean type out if we put one in. */ + gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type)); + + /* Canonicalize the combined condition for use in a COND_EXPR. */ + /* t = canonicalize_cond_expr_cond (t); */ + + /* Bail out if we required an invariant but didn't get one. */ + if (!t) + return NULL_TREE; + + return t; +} + + + +static tree +forward_propagate_into_vec_comp (location_t loc, tree expr) +{ + tree tmp = NULL_TREE; + tree rhs0 = NULL_TREE, rhs1 = NULL_TREE; + bool single_use0_p = false, single_use1_p = false; + + /* For comparisons use the first operand, that is likely to + simplify comparisons against constants. 
*/ + /* debug_tree (expr); */ + + if (TREE_CODE (expr) == VEC_COND_EXPR) + { + tree type = TREE_TYPE (expr); + tree lhs = forward_propagate_into_vec_comp (loc, TREE_OPERAND (expr, 0)); + tree rhs = forward_propagate_into_vec_comp (loc, TREE_OPERAND (expr, 1)); + + return combine_vec_cond_expr_cond (loc, TREE_CODE (expr), + type, lhs, rhs); + } + else if (TREE_CODE (expr) == SSA_NAME) + { + gimple def_stmt = get_prop_source_stmt (expr, false, &single_use0_p); + if (def_stmt && can_propagate_from (def_stmt)) + { + expr = rhs_to_tree (TREE_TYPE (expr), def_stmt); + return forward_propagate_into_vec_comp (loc, expr); + } + else + return tmp; + } + + return tmp; +} + + + + +/* The same as forward_propogate_into_cond only for vector conditions. */ +static int +forward_propagate_into_vec_cond (gimple_stmt_iterator *gsi_p) +{ + gimple stmt = gsi_stmt (*gsi_p); + location_t loc = gimple_location (stmt); + tree tmp = NULL_TREE; + tree cond = gimple_assign_rhs1 (stmt); + + /* We can do tree combining on SSA_NAME and comparison expressions. */ + if (TREE_CODE (cond) == VEC_COND_EXPR) + tmp = forward_propagate_into_vec_comp (loc, cond); + else if (TREE_CODE (cond) == SSA_NAME) + { + tree name = cond, rhs0; + gimple def_stmt = get_prop_source_stmt (name, true, NULL); + if (!def_stmt || !can_propagate_from (def_stmt)) + return 0; + + rhs0 = gimple_assign_rhs1 (def_stmt); + tmp = forward_propagate_into_vec_comp (loc, rhs0); + } + + /* XXX Don't change anything for the time being. */ + tmp = NULL_TREE; + + if (tmp) + { + if (tmp) + { + fprintf (dump_file, " Replaced '"); + print_generic_expr (dump_file, cond, 0); + fprintf (dump_file, "' with '"); + print_generic_expr (dump_file, tmp, 0); + fprintf (dump_file, "'\n"); + } + + gimple_assign_set_rhs_from_tree (gsi_p, unshare_expr (tmp)); + stmt = gsi_stmt (*gsi_p); + update_stmt (stmt); + + return is_gimple_min_invariant (tmp) ? 2 : 1; + } + + return 0; +} + /* We've just substituted an ADDR_EXPR into stmt. Update all the relevant data structures to match. */ @@ -2445,6 +2567,20 @@ ssa_forward_propagate_and_combine (void) stmt = gsi_stmt (gsi); if (did_something == 2) cfg_changed = true; + fold_undefer_overflow_warnings + (!TREE_NO_WARNING (rhs1) && did_something, stmt, + WARN_STRICT_OVERFLOW_CONDITIONAL); + changed = did_something != 0; + } + else if (code == VEC_COND_EXPR) + { + /* In this case the entire VEC_COND_EXPR is in rhs1. */ + int did_something; + fold_defer_overflow_warnings (); + did_something = forward_propagate_into_vec_cond (&gsi); + stmt = gsi_stmt (gsi); + if (did_something == 2) + cfg_changed = true; fold_undefer_overflow_warnings (!TREE_NO_WARNING (rhs1) && did_something, stmt, WARN_STRICT_OVERFLOW_CONDITIONAL); Index: gcc/tree-vect-generic.c =================================================================== --- gcc/tree-vect-generic.c (revision 177665) +++ gcc/tree-vect-generic.c (working copy) @@ -30,11 +30,16 @@ along with GCC; see the file COPYING3. #include "tree-pass.h" #include "flags.h" #include "ggc.h" +#include "target.h" /* Need to include rtl.h, expr.h, etc. for optabs. */ #include "expr.h" #include "optabs.h" + +static void expand_vector_operations_1 (gimple_stmt_iterator *); + + /* Build a constant of type TYPE, made of VALUE's bits replicated every TYPE_SIZE (INNER_TYPE) bits to fit TYPE's precision. */ static tree @@ -125,6 +130,31 @@ do_binop (gimple_stmt_iterator *gsi, tre return gimplify_build2 (gsi, code, inner_type, a, b); } + +/* Construct expression (A[BITPOS] code B[BITPOS]) ? 
-1 : 0 + + INNER_TYPE is the type of A and B elements + + returned expression is of signed integer type with the + size equal to the size of INNER_TYPE. */ +static tree +do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b, + tree bitpos, tree bitsize, enum tree_code code) +{ + tree cond; + tree comp_type; + + a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos); + b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos); + + comp_type = lang_hooks.types.type_for_size (TYPE_PRECISION (inner_type), 0); + + cond = gimplify_build2 (gsi, code, comp_type, a, b); + return gimplify_build3 (gsi, COND_EXPR, comp_type, cond, + build_int_cst (comp_type, -1), + build_int_cst (comp_type, 0)); +} + /* Expand vector addition to scalars. This does bit twiddling in order to increase parallelism: @@ -333,6 +363,49 @@ uniform_vector_p (tree vec) return NULL_TREE; } +/* Try to expand vector comparison expression OP0 CODE OP1 using + builtin_vec_compare hardware hook, in case target does not + support comparison of type TYPE, extract comparison piecewise. + GSI is used inside the target hook to create the code needed + for the given comparison. */ +static tree +expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0, + tree op1, enum tree_code code) +{ + tree t; + /*if (expand_vec_cond_expr_p (type, TYPE_MODE (type))) + { + tree arg_type = TREE_TYPE (op0); + tree if_true, if_false, ifexp; + tree el_type = TREE_TYPE (type); + + //el_type = lang_hooks.types.type_for_size (TYPE_PRECISION (el_type), 0); + + if_true = build_vector_from_val (type, build_int_cst (el_type, -1)); + if_false = build_vector_from_val (type, build_int_cst (el_type, 0)); + ifexp = gimplify_build2 (gsi, code, type, op0, op1); + + debug_tree (ifexp); + debug_tree (if_true); + debug_tree (if_false); + + if (arg_type != type) + { + if_true = convert (arg_type, if_true); + if_false = convert (arg_type, if_true); + t = build3 (VEC_COND_EXPR, arg_type, ifexp, if_true, if_false); + t = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, t); + } + else + t = gimplify_build3 (gsi, VEC_COND_EXPR, type, ifexp, if_true, if_false); + } + else + t = expand_vector_piecewise (gsi, do_compare, type, + TREE_TYPE (TREE_TYPE (op0)), op0, op1, code);*/ + return gimplify_build2 (gsi, code, type, op0, op1);; + +} + static tree expand_vector_operation (gimple_stmt_iterator *gsi, tree type, tree compute_type, gimple assign, enum tree_code code) @@ -375,8 +448,27 @@ expand_vector_operation (gimple_stmt_ite case BIT_NOT_EXPR: return expand_vector_parallel (gsi, do_unop, type, gimple_assign_rhs1 (assign), - NULL_TREE, code); + NULL_TREE, code); + case EQ_EXPR: + case NE_EXPR: + case GT_EXPR: + case LT_EXPR: + case GE_EXPR: + case LE_EXPR: + case UNEQ_EXPR: + case UNGT_EXPR: + case UNLT_EXPR: + case UNGE_EXPR: + case UNLE_EXPR: + case LTGT_EXPR: + case ORDERED_EXPR: + case UNORDERED_EXPR: + { + tree rhs1 = gimple_assign_rhs1 (assign); + tree rhs2 = gimple_assign_rhs2 (assign); + return expand_vector_comparison (gsi, type, rhs1, rhs2, code); + } default: break; } @@ -432,6 +524,126 @@ type_for_widest_vector_mode (enum machin } } + + +/* Expand vector condition EXP which should have the form + VEC_COND_EXPR into the following + vector: + {cond[i] != 0 ? vec0[i] : vec1[i], ... } + i changes from 0 to TYPE_VECTOR_SUBPARTS (TREE_TYPE (vec0)). 
+/* Expand the vector condition EXP, which should have the form
+   VEC_COND_EXPR <cond, vec0, vec1>, into the vector
+     { cond[i] != 0 ? vec0[i] : vec1[i], ... }
+   where i runs from 0 to TYPE_VECTOR_SUBPARTS (TREE_TYPE (vec0)).  */
+static tree
+expand_vec_cond_expr_piecewise (gimple_stmt_iterator *gsi, tree exp)
+{
+  tree cond = TREE_OPERAND (exp, 0);
+  tree vec0 = TREE_OPERAND (exp, 1);
+  tree vec1 = TREE_OPERAND (exp, 2);
+  tree type = TREE_TYPE (vec0);
+  tree lhs, rhs, notmask;
+  tree var, new_rhs;
+  optab op = NULL;
+  gimple new_stmt;
+  gimple_stmt_iterator gsi_tmp;
+  tree t;
+
+  if (COMPARISON_CLASS_P (cond))
+    {
+      /* Expand the vector comparison inside the VEC_COND_EXPR.  */
+      op = optab_for_tree_code (TREE_CODE (cond), type, optab_default);
+      if (!op || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+        {
+          tree op0 = TREE_OPERAND (cond, 0);
+          tree op1 = TREE_OPERAND (cond, 1);
+
+          var = create_tmp_reg (TREE_TYPE (cond), "cond");
+          new_rhs = expand_vector_piecewise (gsi, do_compare,
+                                             TREE_TYPE (cond),
+                                             TREE_TYPE (TREE_TYPE (op1)),
+                                             op0, op1, TREE_CODE (cond));
+
+          new_stmt = gimple_build_assign (var, new_rhs);
+          gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+          update_stmt (gsi_stmt (*gsi));
+        }
+      else
+        var = cond;
+    }
+  else
+    var = cond;
+
+  gsi_tmp = *gsi;
+  gsi_prev (&gsi_tmp);
+
+  /* Expand VCOND to ((v0 & mask) | (v1 & ~mask)).  */
+  lhs = gimplify_build2 (gsi, BIT_AND_EXPR, type, var, vec0);
+  notmask = gimplify_build1 (gsi, BIT_NOT_EXPR, type, var);
+  rhs = gimplify_build2 (gsi, BIT_AND_EXPR, type, notmask, vec1);
+  t = gimplify_build2 (gsi, BIT_IOR_EXPR, type, lhs, rhs);
+
+  /* Run veclower on the expressions we have introduced.  */
+  for (; gsi_tmp.ptr != gsi->ptr; gsi_next (&gsi_tmp))
+    expand_vector_operations_1 (&gsi_tmp);
+
+  return t;
+}
+
+/* Return true if EXPR computes a vector comparison mask (a vector of
+   -1/0 values): a vector comparison itself, a VEC_COND_EXPR, a bitwise
+   combination of such, or a variable or SSA name defined by one.  GSI
+   is used to walk backwards to the defining statements.  */
+static bool
+is_vector_comparison (gimple_stmt_iterator *gsi, tree expr)
+{
+  tree type = TREE_TYPE (expr);
+
+  if (TREE_CODE (expr) == VEC_COND_EXPR)
+    return true;
+
+  if (COMPARISON_CLASS_P (expr) && TREE_CODE (type) == VECTOR_TYPE)
+    return true;
+
+  if (TREE_CODE (expr) == BIT_IOR_EXPR || TREE_CODE (expr) == BIT_AND_EXPR
+      || TREE_CODE (expr) == BIT_XOR_EXPR)
+    return is_vector_comparison (gsi, TREE_OPERAND (expr, 0))
+           & is_vector_comparison (gsi, TREE_OPERAND (expr, 1));
+
+  if (TREE_CODE (expr) == VAR_DECL)
+    {
+      gimple_stmt_iterator gsi_tmp;
+      tree name = DECL_NAME (expr);
+      tree var = NULL_TREE;
+
+      gsi_tmp = *gsi;
+
+      for (; gsi_tmp.ptr; gsi_prev (&gsi_tmp))
+        {
+          gimple stmt = gsi_stmt (gsi_tmp);
+
+          if (gimple_code (stmt) != GIMPLE_ASSIGN)
+            continue;
+
+          if (TREE_CODE (gimple_assign_lhs (stmt)) == VAR_DECL
+              && DECL_NAME (gimple_assign_lhs (stmt)) == name)
+            return is_vector_comparison (&gsi_tmp,
+                                         gimple_assign_rhs_to_tree (stmt));
+        }
+    }
+
+  if (TREE_CODE (expr) == SSA_NAME)
+    {
+      enum tree_code code;
+      gimple exprdef = SSA_NAME_DEF_STMT (expr);
+
+      if (gimple_code (exprdef) != GIMPLE_ASSIGN)
+        return false;
+
+      if (TREE_CODE (gimple_expr_type (exprdef)) != VECTOR_TYPE)
+        return false;
+
+      return is_vector_comparison (gsi,
+                                   gimple_assign_rhs_to_tree (exprdef));
+    }
+
+  return false;
+}
+
 /* Process one statement.  If we identify a vector operation, expand it.  */
 static void
@@ -450,11 +662,34 @@ expand_vector_operations_1 (gimple_stmt_
   code = gimple_assign_rhs_code (stmt);
   rhs_class = get_gimple_rhs_class (code);
+  lhs = gimple_assign_lhs (stmt);
+
+  if (code == VEC_COND_EXPR)
+    {
+      tree exp = gimple_assign_rhs1 (stmt);
+      tree cond = TREE_OPERAND (exp, 0);
+
+      /* If we cannot prove that the mask is the result of a vector
+         comparison, normalize it to cond != {0, ...}.  */
+      if (!is_vector_comparison (gsi, cond))
+        TREE_OPERAND (exp, 0) =
+          build2 (NE_EXPR, TREE_TYPE (cond), cond,
+                  build_vector_from_val (TREE_TYPE (cond),
+                        build_int_cst (TREE_TYPE (TREE_TYPE (cond)), 0)));
+
+      if (expand_vec_cond_expr_p (TREE_TYPE (exp),
+                                  TYPE_MODE (TREE_TYPE (exp))))
+        {
+          update_stmt (gsi_stmt (*gsi));
+          return;
+        }
+
+      new_rhs = expand_vec_cond_expr_piecewise (gsi, exp);
+      gimple_assign_set_rhs_from_tree (gsi, new_rhs);
+      update_stmt (gsi_stmt (*gsi));
+    }
 
   if (rhs_class != GIMPLE_UNARY_RHS && rhs_class != GIMPLE_BINARY_RHS)
     return;
 
-  lhs = gimple_assign_lhs (stmt);
   rhs1 = gimple_assign_rhs1 (stmt);
   type = gimple_expr_type (stmt);
 
   if (rhs_class == GIMPLE_BINARY_RHS)
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revision 177665)
+++ gcc/Makefile.in	(working copy)
@@ -888,7 +888,7 @@ EXCEPT_H = except.h $(HASHTAB_H) vecprim
 TARGET_DEF = target.def target-hooks-macros.h
 C_TARGET_DEF = c-family/c-target.def target-hooks-macros.h
 COMMON_TARGET_DEF = common/common-target.def target-hooks-macros.h
-TARGET_H = $(TM_H) target.h $(TARGET_DEF) insn-modes.h
+TGT = $(TM_H) target.h $(TARGET_DEF) insn-modes.h
 C_TARGET_H = c-family/c-target.h $(C_TARGET_DEF)
 COMMON_TARGET_H = common/common-target.h $(INPUT_H) $(COMMON_TARGET_DEF)
 MACHMODE_H = machmode.h mode-classes.def insn-modes.h
@@ -919,8 +919,9 @@ TREE_H = tree.h all-tree.def tree.def c-
 REGSET_H = regset.h $(BITMAP_H) hard-reg-set.h
 BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) $(FUNCTION_H) cfghooks.h
 GIMPLE_H = gimple.h gimple.def gsstruct.def pointer-set.h $(VEC_H) \
-	vecir.h $(GGC_H) $(BASIC_BLOCK_H) $(TARGET_H) tree-ssa-operands.h \
+	vecir.h $(GGC_H) $(BASIC_BLOCK_H) $(TGT) tree-ssa-operands.h \
	tree-ssa-alias.h $(INTERNAL_FN_H)
+TARGET_H = $(TGT) gimple.h
 GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h
 COVERAGE_H = coverage.h $(GCOV_IO_H)
 DEMANGLE_H = $(srcdir)/../include/demangle.h
@@ -3185,7 +3186,7 @@ tree-vect-generic.o : tree-vect-generic.
    $(TM_H) $(TREE_FLOW_H) $(GIMPLE_H) tree-iterator.h $(TREE_PASS_H) \
    $(FLAGS_H) $(OPTABS_H) $(MACHMODE_H) $(EXPR_H) \
    langhooks.h $(FLAGS_H) $(DIAGNOSTIC_H) gt-tree-vect-generic.h $(GGC_H) \
-   coretypes.h insn-codes.h
+   coretypes.h insn-codes.h target.h
 df-core.o : df-core.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    insn-config.h $(RECOG_H) $(FUNCTION_H) $(REGS_H) alloc-pool.h \
    hard-reg-set.h $(BASIC_BLOCK_H) $(DF_H) $(BITMAP_H) sbitmap.h $(TIMEVAR_H) \
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	(revision 177665)
+++ gcc/tree-cfg.c	(working copy)
@@ -3191,6 +3191,38 @@ verify_gimple_comparison (tree type, tre
       return true;
     }
 
+  if (TREE_CODE (type) == VECTOR_TYPE)
+    {
+      if (TREE_CODE (op0_type) != VECTOR_TYPE
+          || TREE_CODE (op1_type) != VECTOR_TYPE)
+        {
+          error ("non-vector operands in vector comparison");
+          debug_generic_expr (op0_type);
+          debug_generic_expr (op1_type);
+          return true;
+        }
+
+      if (!useless_type_conversion_p (op0_type, op1_type)
+          && !useless_type_conversion_p (op1_type, op0_type))
+        {
+          error ("type mismatch in vector comparison");
+          debug_generic_expr (op0_type);
+          debug_generic_expr (op1_type);
+          return true;
+        }
+
+      if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
+          && TYPE_PRECISION (TREE_TYPE (op0_type))
+             != TYPE_PRECISION (TREE_TYPE (type)))
+        {
+          error ("invalid vector comparison resulting type");
+          debug_generic_expr (type);
+          return true;
+        }
+
+      return false;
+    }
+
   /* For comparisons we do not have the operations type as the
      effective type the comparison is carried out in.  Instead
      we require that either the first operand is trivially
Index: gcc/c-parser.c
===================================================================
--- gcc/c-parser.c	(revision 177665)
+++ gcc/c-parser.c	(working copy)
@@ -5339,6 +5339,15 @@ c_parser_conditional_expression (c_parse
       tree eptype = NULL_TREE;
 
       middle_loc = c_parser_peek_token (parser)->location;
+
+      if (TREE_CODE (TREE_TYPE (cond.value)) == VECTOR_TYPE)
+        {
+          error_at (middle_loc, "cannot omit middle operand in "
+                    "vector comparison");
+          ret.value = error_mark_node;
+          return ret;
+        }
+
       pedwarn (middle_loc, OPT_pedantic,
	       "ISO C forbids omitting the middle term of a ?: expression");
       warn_for_omitted_condop (middle_loc, cond.value);
@@ -5357,9 +5366,12 @@
     }
   else
     {
-      cond.value
-	= c_objc_common_truthvalue_conversion
-	(cond_loc, default_conversion (cond.value));
+      if (TREE_CODE (TREE_TYPE (cond.value)) != VECTOR_TYPE)
+        {
+          cond.value
+            = c_objc_common_truthvalue_conversion
+              (cond_loc, default_conversion (cond.value));
+        }
       c_inhibit_evaluation_warnings += cond.value == truthvalue_false_node;
       exp1 = c_parser_expression_conv (parser);
       mark_exp_read (exp1.value);
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 177665)
+++ gcc/config/i386/i386.c	(working copy)
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.
 #include "tm.h"
 #include "rtl.h"
 #include "tree.h"
+#include "tree-flow.h"
 #include "tm_p.h"
 #include "regs.h"
 #include "hard-reg-set.h"
@@ -18402,27 +18403,55 @@ ix86_expand_sse_fp_minmax (rtx dest, enu
   return true;
 }
 
+rtx rtx_build_vector_from_val (enum machine_mode, HOST_WIDE_INT);
+
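+/* rtx_build_vector_from_val is the RTL counterpart of the tree-level
+   build_vector_from_val; it is used by ix86_expand_sse_movcc below to
+   recognize an all-ones (mask) operand.  */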
+/* Returns a vector of mode MODE where all the elements are ARG.  */
+rtx
+rtx_build_vector_from_val (enum machine_mode mode, HOST_WIDE_INT arg)
+{
+  rtvec v;
+  int units, i;
+  enum machine_mode inner;
+
+  units = GET_MODE_NUNITS (mode);
+  inner = GET_MODE_INNER (mode);
+  v = rtvec_alloc (units);
+  for (i = 0; i < units; ++i)
+    RTVEC_ELT (v, i) = gen_rtx_CONST_INT (inner, arg);
+
+  return gen_rtx_raw_CONST_VECTOR (mode, v);
+}
+
 /* Expand an sse vector comparison.  Return the register with the result.  */
 
 static rtx
 ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
-		     rtx op_true, rtx op_false)
+		     rtx op_true, rtx op_false, bool no_comparison)
 {
   enum machine_mode mode = GET_MODE (dest);
   rtx x;
 
-  cmp_op0 = force_reg (mode, cmp_op0);
-  if (!nonimmediate_operand (cmp_op1, mode))
-    cmp_op1 = force_reg (mode, cmp_op1);
+  /* Avoid useless comparison.  */
+  if (no_comparison)
+    {
+      cmp_op0 = force_reg (mode, cmp_op0);
+      x = cmp_op0;
+    }
+  else
+    {
+      cmp_op0 = force_reg (mode, cmp_op0);
+      if (!nonimmediate_operand (cmp_op1, mode))
+        cmp_op1 = force_reg (mode, cmp_op1);
+
+      x = gen_rtx_fmt_ee (code, mode, cmp_op0, cmp_op1);
+    }
 
   if (optimize
       || reg_overlap_mentioned_p (dest, op_true)
       || reg_overlap_mentioned_p (dest, op_false))
     dest = gen_reg_rtx (mode);
 
-  x = gen_rtx_fmt_ee (code, mode, cmp_op0, cmp_op1);
   emit_insn (gen_rtx_SET (VOIDmode, dest, x));
-
   return dest;
 }
 
@@ -18434,8 +18463,14 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp
 {
   enum machine_mode mode = GET_MODE (dest);
   rtx t2, t3, x;
-
-  if (op_false == CONST0_RTX (mode))
+  rtx mask_true;
+
+  if (rtx_equal_p (op_true, rtx_build_vector_from_val (mode, -1))
+      && rtx_equal_p (op_false, CONST0_RTX (mode)))
+    {
+      emit_insn (gen_rtx_SET (VOIDmode, dest, cmp));
+    }
+  else if (op_false == CONST0_RTX (mode))
     {
       op_true = force_reg (mode, op_true);
       x = gen_rtx_AND (mode, cmp, op_true);
@@ -18512,7 +18547,7 @@ ix86_expand_fp_movcc (rtx operands[])
     return true;
 
   tmp = ix86_expand_sse_cmp (operands[0], code, op0, op1,
-			     operands[2], operands[3]);
+			     operands[2], operands[3], false);
   ix86_expand_sse_movcc (operands[0], tmp, operands[2], operands[3]);
   return true;
 }
@@ -18555,7 +18590,7 @@ ix86_expand_fp_vcond (rtx operands[])
     return true;
 
   cmp = ix86_expand_sse_cmp (operands[0], code, operands[4], operands[5],
-			     operands[1], operands[2]);
+			     operands[1], operands[2], false);
   ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
   return true;
 }
@@ -18569,7 +18604,9 @@ ix86_expand_int_vcond (rtx operands[])
   enum rtx_code code = GET_CODE (operands[3]);
   bool negate = false;
   rtx x, cop0, cop1;
+  rtx comp;
 
+  comp = operands[3];
   cop0 = operands[4];
   cop1 = operands[5];
 
@@ -18681,8 +18718,18 @@ ix86_expand_int_vcond (rtx operands[])
	}
     }
 
-  x = ix86_expand_sse_cmp (operands[0], code, cop0, cop1,
-			   operands[1+negate], operands[2-negate]);
+  if (GET_CODE (comp) == NE && XEXP (comp, 0) == NULL_RTX
+      && XEXP (comp, 1) == NULL_RTX)
+    {
+      rtx vec = CONST0_RTX (mode);
+      x = ix86_expand_sse_cmp (operands[0], code, cop0, vec,
+			       operands[1+negate], operands[2-negate], true);
+    }
+  else
+    {
+      x = ix86_expand_sse_cmp (operands[0], code, cop0, cop1,
+			       operands[1+negate], operands[2-negate], false);
+    }
 
   ix86_expand_sse_movcc (operands[0], x, operands[1+negate],
			 operands[2-negate]);
@@ -18774,7 +18821,7 @@ ix86_expand_sse_unpack (rtx operands[2],
     tmp = force_reg (imode, CONST0_RTX (imode));
   else
     tmp = ix86_expand_sse_cmp (gen_reg_rtx (imode), GT, CONST0_RTX (imode),
-			       operands[1], pc_rtx, pc_rtx);
+			       operands[1], pc_rtx, pc_rtx, false);
 
   emit_insn (unpack (dest, operands[1], tmp));
 }
@@ -32827,6 +32874,276 @@ ix86_vectorize_builtin_vec_perm (tree ve
   return ix86_builtins[(int) fcode];
 }
 
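+/* The helpers below implement ix86_vectorize_builtin_vec_compare
+   (registered as TARGET_VECTORIZE_BUILTIN_VEC_COMPARE at the end of this
+   file): they emit the SSE/AVX/MMX comparison builtins and view-convert
+   the result to the requested return type.  */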
+/* Find a target-specific sequence for comparing real-typed vectors
+   V0 and V1.  Returns a variable containing the result of the
+   comparison, or NULL_TREE if no such sequence is available.  */
+static tree
+vector_fp_compare (gimple_stmt_iterator *gsi, tree rettype,
+                   enum machine_mode mode, tree v0, tree v1,
+                   enum tree_code code)
+{
+  enum ix86_builtins fcode;
+  int arg = -1;
+  tree fdef, frtype, tmp, var, t;
+  gimple new_stmt;
+  bool reverse = false;
+
+/* Pick the comparison builtin FCODE for MODE; the AVX builtins take the
+   comparison predicate as the immediate VALUE instead.  Bail out with
+   NULL_TREE when the required ISA is not enabled.  */
+#define SWITCH_MODE(mode, fcode, code, value) \
+switch (mode) \
+  { \
+  case V2DFmode: \
+    if (!TARGET_SSE2) return NULL_TREE; \
+    fcode = IX86_BUILTIN_CMP ## code ## PD; \
+    break; \
+  case V4DFmode: \
+    if (!TARGET_AVX) return NULL_TREE; \
+    fcode = IX86_BUILTIN_CMPPD256; \
+    arg = value; \
+    break; \
+  case V4SFmode: \
+    if (!TARGET_SSE) return NULL_TREE; \
+    fcode = IX86_BUILTIN_CMP ## code ## PS; \
+    break; \
+  case V8SFmode: \
+    if (!TARGET_AVX) return NULL_TREE; \
+    fcode = IX86_BUILTIN_CMPPS256; \
+    arg = value; \
+    break; \
+  default: \
+    return NULL_TREE; \
+    /* FIXME: Similar instructions for MMX.  */ \
+  }
+
+  switch (code)
+    {
+    case EQ_EXPR:
+      SWITCH_MODE (mode, fcode, EQ, 0);
+      break;
+
+    case NE_EXPR:
+      SWITCH_MODE (mode, fcode, NEQ, 4);
+      break;
+
+    case GT_EXPR:
+      SWITCH_MODE (mode, fcode, LT, 1);
+      reverse = true;
+      break;
+
+    case LT_EXPR:
+      SWITCH_MODE (mode, fcode, LT, 1);
+      break;
+
+    case LE_EXPR:
+      SWITCH_MODE (mode, fcode, LE, 2);
+      break;
+
+    case GE_EXPR:
+      SWITCH_MODE (mode, fcode, LE, 2);
+      reverse = true;
+      break;
+
+    default:
+      return NULL_TREE;
+    }
+#undef SWITCH_MODE
+
+  fdef = ix86_builtins[(int)fcode];
+  frtype = TREE_TYPE (TREE_TYPE (fdef));
+
+  tmp = create_tmp_var (frtype, "tmp");
+  var = create_tmp_var (rettype, "tmp");
+
+  if (arg == -1)
+    if (reverse)
+      new_stmt = gimple_build_call (fdef, 2, v1, v0);
+    else
+      new_stmt = gimple_build_call (fdef, 2, v0, v1);
+  else
+    if (reverse)
+      new_stmt = gimple_build_call (fdef, 3, v1, v0,
+                                    build_int_cst (char_type_node, arg));
+    else
+      new_stmt = gimple_build_call (fdef, 3, v0, v1,
+                                    build_int_cst (char_type_node, arg));
+
+  gimple_call_set_lhs (new_stmt, tmp);
+  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+  t = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, rettype, tmp);
+  new_stmt = gimple_build_assign (var, t);
+  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+
+  return var;
+}
+
+/* Find a target-specific sequence for comparing integer-typed vectors
+   V0 and V1.  Returns a variable containing the result of the
+   comparison, or NULL_TREE if no such sequence is available.  */
+static tree
+vector_int_compare (gimple_stmt_iterator *gsi, tree rettype,
+                    enum machine_mode mode, tree v0, tree v1,
+                    enum tree_code code)
+{
+  enum ix86_builtins feq, fgt;
+  tree var, t, tmp, tmp1, tmp2, defeq, defgt, gtrtype, eqrtype;
+  gimple new_stmt;
+
+  switch (mode)
+    {
+    /* SSE integer-type vectors.  */
+    case V2DImode:
+      if (!TARGET_SSE4_2) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQQ;
+      fgt = IX86_BUILTIN_PCMPGTQ;
+      break;
+
+    case V4SImode:
+      if (!TARGET_SSE2) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQD128;
+      fgt = IX86_BUILTIN_PCMPGTD128;
+      break;
+
+    case V8HImode:
+      if (!TARGET_SSE2) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQW128;
+      fgt = IX86_BUILTIN_PCMPGTW128;
+      break;
+
+    case V16QImode:
+      if (!TARGET_SSE2) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQB128;
+      fgt = IX86_BUILTIN_PCMPGTB128;
+      break;
+
+    /* MMX integer-type vectors.  */
+    case V2SImode:
+      if (!TARGET_MMX) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQD;
+      fgt = IX86_BUILTIN_PCMPGTD;
+      break;
+
+    case V4HImode:
+      if (!TARGET_MMX) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQW;
+      fgt = IX86_BUILTIN_PCMPGTW;
+      break;
+
+    case V8QImode:
+      if (!TARGET_MMX) return NULL_TREE;
+      feq = IX86_BUILTIN_PCMPEQB;
+      fgt = IX86_BUILTIN_PCMPGTB;
+      break;
+
+    /* FIXME: Similar instructions for AVX.  */
+    default:
+      return NULL_TREE;
+    }
+
+
+  var = create_tmp_var (rettype, "ret");
+  defeq = ix86_builtins[(int)feq];
+  defgt = ix86_builtins[(int)fgt];
+  eqrtype = TREE_TYPE (TREE_TYPE (defeq));
+  gtrtype = TREE_TYPE (TREE_TYPE (defgt));
+
+/* Emit a call to the EQ or GT builtin (def ## gteq) with operands OP0 and
+   OP1, storing the result in a fresh temporary VAR.  */
+#define EQGT_CALL(gsi, stmt, var, op0, op1, gteq) \
+do { \
+  var = create_tmp_var (gteq ## rtype, "tmp"); \
+  stmt = gimple_build_call (def ## gteq, 2, op0, op1); \
+  gimple_call_set_lhs (stmt, var); \
+  gsi_insert_before (gsi, stmt, GSI_SAME_STMT); \
+} while (0)
+
+  switch (code)
+    {
+    case EQ_EXPR:
+      EQGT_CALL (gsi, new_stmt, tmp, v0, v1, eq);
+      break;
+
+    case NE_EXPR:
+      tmp = create_tmp_var (eqrtype, "tmp");
+
+      EQGT_CALL (gsi, new_stmt, tmp1, v0, v1, eq);
+      EQGT_CALL (gsi, new_stmt, tmp2, v0, v0, eq);
+
+      /* tmp2 = (v0 == v0) is a vector of all ones, so
+         t = tmp1 ^ tmp2 computes ~tmp1, i.e. v0 != v1.  */
+      t = gimplify_build2 (gsi, BIT_XOR_EXPR, eqrtype, tmp1, tmp2);
+      new_stmt = gimple_build_assign (tmp, t);
+      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+      break;
+
+    case GT_EXPR:
+      EQGT_CALL (gsi, new_stmt, tmp, v0, v1, gt);
+      break;
+
+    case LT_EXPR:
+      EQGT_CALL (gsi, new_stmt, tmp, v1, v0, gt);
+      break;
+
+    case GE_EXPR:
+      if (eqrtype != gtrtype)
+        return NULL_TREE;
+      tmp = create_tmp_var (eqrtype, "tmp");
+      EQGT_CALL (gsi, new_stmt, tmp1, v0, v1, gt);
+      EQGT_CALL (gsi, new_stmt, tmp2, v0, v1, eq);
+      t = gimplify_build2 (gsi, BIT_IOR_EXPR, eqrtype, tmp1, tmp2);
+      new_stmt = gimple_build_assign (tmp, t);
+      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+      break;
+
+    case LE_EXPR:
+      if (eqrtype != gtrtype)
+        return NULL_TREE;
+      tmp = create_tmp_var (eqrtype, "tmp");
+      EQGT_CALL (gsi, new_stmt, tmp1, v1, v0, gt);
+      EQGT_CALL (gsi, new_stmt, tmp2, v0, v1, eq);
+      t = gimplify_build2 (gsi, BIT_IOR_EXPR, eqrtype, tmp1, tmp2);
+      new_stmt = gimple_build_assign (tmp, t);
+      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+      break;
+
+    default:
+      return NULL_TREE;
+    }
+#undef EQGT_CALL
+
+  t = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, rettype, tmp);
+  new_stmt = gimple_build_assign (var, t);
+  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+  return var;
+}
+
+/* Lower a comparison of two vectors V0 and V1, returning a variable
+   with the result of the comparison.  Returns NULL_TREE when it is
+   impossible to find a target-specific sequence.  */
+static tree
+ix86_vectorize_builtin_vec_compare (gimple_stmt_iterator *gsi, tree rettype,
+                                    tree v0, tree v1, enum tree_code code)
+{
+  tree type;
+
+  /* Make sure we are comparing the same types.  */
+  if (TREE_TYPE (v0) != TREE_TYPE (v1)
+      || TREE_TYPE (TREE_TYPE (v0)) != TREE_TYPE (TREE_TYPE (v1)))
+    return NULL_TREE;
+
+  type = TREE_TYPE (v0);
+
+  /* Cannot compare packed unsigned integers unless
+     the operation is EQ or NE.  */
+  if (TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE
+      && TYPE_UNSIGNED (TREE_TYPE (type)))
+    if (code != EQ_EXPR && code != NE_EXPR)
+      return NULL_TREE;
+
+
+  if (TREE_CODE (TREE_TYPE (type)) == REAL_TYPE)
+    return vector_fp_compare (gsi, rettype, TYPE_MODE (type), v0, v1, code);
+  else if (TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
+    return vector_int_compare (gsi, rettype, TYPE_MODE (type), v0, v1, code);
+  else
+    return NULL_TREE;
+}
+
 
 /* Return a vector mode with twice as many elements as VMODE.  */
 /* ??? Consider moving this to a table generated by genmodes.c.  */
@@ -35270,6 +35587,11 @@ ix86_autovectorize_vector_sizes (void)
 #define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES \
   ix86_autovectorize_vector_sizes
 
+#undef TARGET_VECTORIZE_BUILTIN_VEC_COMPARE
+#define TARGET_VECTORIZE_BUILTIN_VEC_COMPARE \
+  ix86_vectorize_builtin_vec_compare
+
+
 #undef TARGET_SET_CURRENT_FUNCTION
 #define TARGET_SET_CURRENT_FUNCTION ix86_set_current_function