From patchwork Thu Dec 15 10:54:29 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 131561
Date: Thu, 15 Dec 2011 11:54:29 +0100
From: Jakub Jelinek
To: Richard Guenther
Cc: Ira Rosen , gcc-patches@gcc.gnu.org, Kirill Yukhin
Subject: [PATCH] Re: Vectorizer question: DIV to
 RSHIFT conversion (take 2)
Message-ID: <20111215105429.GM1957@tyan-ft48-01.lab.bos.redhat.com>
Reply-To: Jakub Jelinek
References: <20111213132128.GZ1957@tyan-ft48-01.lab.bos.redhat.com>
 <20111213134741.GA1957@tyan-ft48-01.lab.bos.redhat.com>
 <20111214122513.GD1957@tyan-ft48-01.lab.bos.redhat.com>
 <20111215070257.GL1957@tyan-ft48-01.lab.bos.redhat.com>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

On Thu, Dec 15, 2011 at 10:02:15AM +0100, Richard Guenther wrote:
> > But it's really ugly to insert part of pattern sequence, don't you think?
>
> It indeed is.  The issue in the past was ICEing with -fno-tree-dce
> when the pattern stmts did not have regular RTL expansion support
> and the vectorization didn't trigger in the end.

I remember those, but that shouldn't ICE here, because the patterns
don't contain any of the artificial GIMPLE stmts that can't be expanded.

Anyway, here is an updated patch (so far tested only with
make check-gcc RUNTESTFLAGS=vect.exp), which turns the pattern def from
a single gimple stmt into a gimple_seq.  Perhaps it would be even cleaner
to get rid of the pattern stmt and def stmt seq distinction and just have
the pattern as a whole be represented as a gimple_seq, but perhaps that
cleanup can be deferred for later.

This patch also fixes a problem where vect_determine_vectorization_factor
would iterate the same stmt twice - for some reason both the original stmt
and pattern stmt (and def stmt) are marked as relevant, and we iterate on
the same original stmt not just 3 times, but 4 times:

 - the first iteration is on the original stmt and sets
   analyze_pattern_stmt;
 - the second iteration starts with the pattern stmt but clears
   analyze_pattern_stmt, then sees it has a def stmt and thus runs on
   the def stmt;
 - the third iteration, with pattern_def set on entry, is again on the
   original stmt and sets analyze_pattern_stmt again;
 - the last one is on the pattern stmt (because pattern_def was already
   set on entry and it clears it).

It sounds like those two routines (vect_determine_vectorization_factor and
vect_transform_loop) would be better rewritten with a helper that would
handle a single stmt/stmt_info pair, and we would just call that helper on
all the original/pattern/def stmts we want to handle.

2011-12-15  Jakub Jelinek

	* tree-vectorizer.h (struct _stmt_vec_info): Remove pattern_def_stmt
	field, add pattern_def_seq.
	(STMT_VINFO_PATTERN_DEF_STMT): Remove.
	(STMT_VINFO_PATTERN_DEF_SEQ): Define.
	(NUM_PATTERNS): Bump to 10.
	* tree-vect-loop.c (vect_determine_vectorization_factor,
	vect_transform_loop): Adjust for pattern def changing from a single
	gimple stmt to gimple_seq.
	* tree-vect-stmts.c (vect_analyze_stmt, new_stmt_vec_info,
	free_stmt_vec_info): Likewise.
	* tree-vect-patterns.c (vect_recog_over_widening_pattern,
	vect_recog_vector_vector_shift_pattern,
	vect_recog_mixed_size_cond_pattern, adjust_bool_pattern_cast,
	adjust_bool_pattern, vect_mark_pattern_stmts): Likewise.
	(vect_recog_sdivmod_pow2_pattern): New function.
	(vect_vect_recog_func_ptrs): Add it.
	* config/i386/sse.md (vcond, vcond, vcondv2di): Use general_operand
	instead of nonimmediate_operand for operand 5 and no predicate for
	operands 1 and 2.
	* config/i386/i386.c (ix86_expand_int_vcond): Optimize
	x < 0 ? -1 : 0 and x < 0 ?
1 : 0 into vector arithmetic resp. logical shift. * gcc.dg/vect/vect-sdivmod-1.c: New test. Jakub --- gcc/tree-vectorizer.h.jj 2011-12-15 08:06:48.910107224 +0100 +++ gcc/tree-vectorizer.h 2011-12-15 10:01:52.354899874 +0100 @@ -487,8 +487,8 @@ typedef struct _stmt_vec_info { pattern). */ gimple related_stmt; - /* Used to keep a def stmt of a pattern stmt if such exists. */ - gimple pattern_def_stmt; + /* Used to keep a sequence of def stmts of a pattern stmt if such exists. */ + gimple_seq pattern_def_seq; /* List of datarefs that are known to have the same alignment as the dataref of this stmt. */ @@ -561,7 +561,7 @@ typedef struct _stmt_vec_info { #define STMT_VINFO_IN_PATTERN_P(S) (S)->in_pattern_p #define STMT_VINFO_RELATED_STMT(S) (S)->related_stmt -#define STMT_VINFO_PATTERN_DEF_STMT(S) (S)->pattern_def_stmt +#define STMT_VINFO_PATTERN_DEF_SEQ(S) (S)->pattern_def_seq #define STMT_VINFO_SAME_ALIGN_REFS(S) (S)->same_align_refs #define STMT_VINFO_DEF_TYPE(S) (S)->def_type #define STMT_VINFO_GROUP_FIRST_ELEMENT(S) (S)->first_element @@ -929,7 +929,7 @@ extern void vect_slp_transform_bb (basic Additional pattern recognition functions can (and will) be added in the future. */ typedef gimple (* vect_recog_func_ptr) (VEC (gimple, heap) **, tree *, tree *); -#define NUM_PATTERNS 9 +#define NUM_PATTERNS 10 void vect_pattern_recog (loop_vec_info); /* In tree-vectorizer.c. */ --- gcc/tree-vect-loop.c.jj 2011-12-05 09:23:53.000000000 +0100 +++ gcc/tree-vect-loop.c 2011-12-15 11:18:49.237029894 +0100 @@ -1,5 +1,5 @@ /* Loop Vectorization - Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 + Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. 
Contributed by Dorit Naishlos and Ira Rosen @@ -181,8 +181,10 @@ vect_determine_vectorization_factor (loo stmt_vec_info stmt_info; int i; HOST_WIDE_INT dummy; - gimple stmt, pattern_stmt = NULL, pattern_def_stmt = NULL; - bool analyze_pattern_stmt = false, pattern_def = false; + gimple stmt, pattern_stmt = NULL; + gimple_seq pattern_def_seq = NULL; + gimple_stmt_iterator pattern_def_si = gsi_start (NULL); + bool analyze_pattern_stmt = false; if (vect_print_dump_info (REPORT_DETAILS)) fprintf (vect_dump, "=== vect_determine_vectorization_factor ==="); @@ -248,10 +250,7 @@ vect_determine_vectorization_factor (loo tree vf_vectype; if (analyze_pattern_stmt) - { - stmt = pattern_stmt; - analyze_pattern_stmt = false; - } + stmt = pattern_stmt; else stmt = gsi_stmt (si); @@ -296,28 +295,54 @@ vect_determine_vectorization_factor (loo || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_stmt)))) analyze_pattern_stmt = true; - /* If a pattern statement has a def stmt, analyze it too. */ - if (is_pattern_stmt_p (stmt_info) - && (pattern_def_stmt = STMT_VINFO_PATTERN_DEF_STMT (stmt_info)) - && (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_def_stmt)) - || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_def_stmt)))) - { - if (pattern_def) - pattern_def = false; - else - { - if (vect_print_dump_info (REPORT_DETAILS)) - { - fprintf (vect_dump, "==> examining pattern def stmt: "); - print_gimple_stmt (vect_dump, pattern_def_stmt, 0, - TDF_SLIM); - } + /* If a pattern statement has def stmts, analyze them too. 
*/ + if (is_pattern_stmt_p (stmt_info)) + { + if (pattern_def_seq == NULL) + { + pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info); + pattern_def_si = gsi_start (pattern_def_seq); + } + else if (!gsi_end_p (pattern_def_si)) + gsi_next (&pattern_def_si); + if (pattern_def_seq != NULL) + { + gimple pattern_def_stmt = NULL; + stmt_vec_info pattern_def_stmt_info = NULL; - pattern_def = true; - stmt = pattern_def_stmt; - stmt_info = vinfo_for_stmt (stmt); - } - } + while (!gsi_end_p (pattern_def_si)) + { + pattern_def_stmt = gsi_stmt (pattern_def_si); + pattern_def_stmt_info + = vinfo_for_stmt (pattern_def_stmt); + if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info) + || STMT_VINFO_LIVE_P (pattern_def_stmt_info)) + break; + gsi_next (&pattern_def_si); + } + + if (!gsi_end_p (pattern_def_si)) + { + if (vect_print_dump_info (REPORT_DETAILS)) + { + fprintf (vect_dump, + "==> examining pattern def stmt: "); + print_gimple_stmt (vect_dump, pattern_def_stmt, 0, + TDF_SLIM); + } + + stmt = pattern_def_stmt; + stmt_info = pattern_def_stmt_info; + } + else + { + pattern_def_si = gsi_start (NULL); + analyze_pattern_stmt = false; + } + } + else + analyze_pattern_stmt = false; + } if (gimple_get_lhs (stmt) == NULL_TREE) { @@ -347,7 +372,7 @@ vect_determine_vectorization_factor (loo idiom). 
*/ gcc_assert (STMT_VINFO_DATA_REF (stmt_info) || is_pattern_stmt_p (stmt_info) - || pattern_def); + || !gsi_end_p (pattern_def_si)); vectype = STMT_VINFO_VECTYPE (stmt_info); } else @@ -425,8 +450,11 @@ vect_determine_vectorization_factor (loo || (nunits > vectorization_factor)) vectorization_factor = nunits; - if (!analyze_pattern_stmt && !pattern_def) - gsi_next (&si); + if (!analyze_pattern_stmt && gsi_end_p (pattern_def_si)) + { + pattern_def_seq = NULL; + gsi_next (&si); + } } } @@ -5150,8 +5178,10 @@ vect_transform_loop (loop_vec_info loop_ tree cond_expr = NULL_TREE; gimple_seq cond_expr_stmt_list = NULL; bool do_peeling_for_loop_bound; - gimple stmt, pattern_stmt, pattern_def_stmt; - bool transform_pattern_stmt = false, pattern_def = false; + gimple stmt, pattern_stmt; + gimple_seq pattern_def_seq = NULL; + gimple_stmt_iterator pattern_def_si = gsi_start (NULL); + bool transform_pattern_stmt = false; if (vect_print_dump_info (REPORT_DETAILS)) fprintf (vect_dump, "=== vec_transform_loop ==="); @@ -5245,10 +5275,7 @@ vect_transform_loop (loop_vec_info loop_ bool is_store; if (transform_pattern_stmt) - { - stmt = pattern_stmt; - transform_pattern_stmt = false; - } + stmt = pattern_stmt; else stmt = gsi_stmt (si); @@ -5295,28 +5322,53 @@ vect_transform_loop (loop_vec_info loop_ || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_stmt)))) transform_pattern_stmt = true; - /* If pattern statement has a def stmt, vectorize it too. */ - if (is_pattern_stmt_p (stmt_info) - && (pattern_def_stmt = STMT_VINFO_PATTERN_DEF_STMT (stmt_info)) - && (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_def_stmt)) - || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_def_stmt)))) - { - if (pattern_def) - pattern_def = false; - else - { - if (vect_print_dump_info (REPORT_DETAILS)) - { - fprintf (vect_dump, "==> vectorizing pattern def" - " stmt: "); - print_gimple_stmt (vect_dump, pattern_def_stmt, 0, - TDF_SLIM); - } + /* If pattern statement has def stmts, vectorize them too. 
*/ + if (is_pattern_stmt_p (stmt_info)) + { + if (pattern_def_seq == NULL) + { + pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info); + pattern_def_si = gsi_start (pattern_def_seq); + } + else if (!gsi_end_p (pattern_def_si)) + gsi_next (&pattern_def_si); + if (pattern_def_seq != NULL) + { + gimple pattern_def_stmt = NULL; + stmt_vec_info pattern_def_stmt_info = NULL; - pattern_def = true; - stmt = pattern_def_stmt; - stmt_info = vinfo_for_stmt (stmt); - } + while (!gsi_end_p (pattern_def_si)) + { + pattern_def_stmt = gsi_stmt (pattern_def_si); + pattern_def_stmt_info + = vinfo_for_stmt (pattern_def_stmt); + if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info) + || STMT_VINFO_LIVE_P (pattern_def_stmt_info)) + break; + gsi_next (&pattern_def_si); + } + + if (!gsi_end_p (pattern_def_si)) + { + if (vect_print_dump_info (REPORT_DETAILS)) + { + fprintf (vect_dump, "==> vectorizing pattern def" + " stmt: "); + print_gimple_stmt (vect_dump, pattern_def_stmt, 0, + TDF_SLIM); + } + + stmt = pattern_def_stmt; + stmt_info = pattern_def_stmt_info; + } + else + { + pattern_def_si = gsi_start (NULL); + transform_pattern_stmt = false; + } + } + else + transform_pattern_stmt = false; } gcc_assert (STMT_VINFO_VECTYPE (stmt_info)); @@ -5346,9 +5398,12 @@ vect_transform_loop (loop_vec_info loop_ /* Hybrid SLP stmts must be vectorized in addition to SLP. 
*/ if (!vinfo_for_stmt (stmt) || PURE_SLP_STMT (stmt_info)) { - if (!transform_pattern_stmt && !pattern_def) - gsi_next (&si); - continue; + if (!transform_pattern_stmt && gsi_end_p (pattern_def_si)) + { + pattern_def_seq = NULL; + gsi_next (&si); + } + continue; } } @@ -5378,8 +5433,11 @@ vect_transform_loop (loop_vec_info loop_ } } - if (!transform_pattern_stmt && !pattern_def) - gsi_next (&si); + if (!transform_pattern_stmt && gsi_end_p (pattern_def_si)) + { + pattern_def_seq = NULL; + gsi_next (&si); + } } /* stmts in BB */ } /* BBs in loop */ --- gcc/tree-vect-stmts.c.jj 2011-12-14 08:11:03.000000000 +0100 +++ gcc/tree-vect-stmts.c 2011-12-15 10:01:52.359899967 +0100 @@ -5203,7 +5203,8 @@ vect_analyze_stmt (gimple stmt, bool *ne enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_info); bool ok; tree scalar_type, vectype; - gimple pattern_stmt, pattern_def_stmt; + gimple pattern_stmt; + gimple_seq pattern_def_seq; if (vect_print_dump_info (REPORT_DETAILS)) { @@ -5274,21 +5275,29 @@ vect_analyze_stmt (gimple stmt, bool *ne } if (is_pattern_stmt_p (stmt_info) - && (pattern_def_stmt = STMT_VINFO_PATTERN_DEF_STMT (stmt_info)) - && (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_def_stmt)) - || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_def_stmt)))) + && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info))) { - /* Analyze def stmt of STMT if it's a pattern stmt. 
*/ - if (vect_print_dump_info (REPORT_DETAILS)) - { - fprintf (vect_dump, "==> examining pattern def statement: "); - print_gimple_stmt (vect_dump, pattern_def_stmt, 0, TDF_SLIM); - } - - if (!vect_analyze_stmt (pattern_def_stmt, need_to_vectorize, node)) - return false; - } + gimple_stmt_iterator si; + for (si = gsi_start (pattern_def_seq); !gsi_end_p (si); gsi_next (&si)) + { + gimple pattern_def_stmt = gsi_stmt (si); + if (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_def_stmt)) + || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_def_stmt))) + { + /* Analyze def stmt of STMT if it's a pattern stmt. */ + if (vect_print_dump_info (REPORT_DETAILS)) + { + fprintf (vect_dump, "==> examining pattern def statement: "); + print_gimple_stmt (vect_dump, pattern_def_stmt, 0, TDF_SLIM); + } + + if (!vect_analyze_stmt (pattern_def_stmt, + need_to_vectorize, node)) + return false; + } + } + } switch (STMT_VINFO_DEF_TYPE (stmt_info)) { @@ -5605,7 +5614,7 @@ new_stmt_vec_info (gimple stmt, loop_vec STMT_VINFO_VECTORIZABLE (res) = true; STMT_VINFO_IN_PATTERN_P (res) = false; STMT_VINFO_RELATED_STMT (res) = NULL; - STMT_VINFO_PATTERN_DEF_STMT (res) = NULL; + STMT_VINFO_PATTERN_DEF_SEQ (res) = NULL; STMT_VINFO_DATA_REF (res) = NULL; STMT_VINFO_DR_BASE_ADDRESS (res) = NULL; @@ -5676,8 +5685,13 @@ free_stmt_vec_info (gimple stmt) = vinfo_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)); if (patt_info) { - if (STMT_VINFO_PATTERN_DEF_STMT (patt_info)) - free_stmt_vec_info (STMT_VINFO_PATTERN_DEF_STMT (patt_info)); + gimple_seq seq = STMT_VINFO_PATTERN_DEF_SEQ (patt_info); + if (seq) + { + gimple_stmt_iterator si; + for (si = gsi_start (seq); !gsi_end_p (si); gsi_next (&si)) + free_stmt_vec_info (gsi_stmt (si)); + } free_stmt_vec_info (STMT_VINFO_RELATED_STMT (stmt_info)); } } --- gcc/tree-vect-patterns.c.jj 2011-12-15 08:06:48.938107050 +0100 +++ gcc/tree-vect-patterns.c 2011-12-15 10:01:52.357899935 +0100 @@ -53,6 +53,8 @@ static gimple vect_recog_widen_shift_pat tree *, tree *); static 
gimple vect_recog_vector_vector_shift_pattern (VEC (gimple, heap) **, tree *, tree *); +static gimple vect_recog_sdivmod_pow2_pattern (VEC (gimple, heap) **, + tree *, tree *); static gimple vect_recog_mixed_size_cond_pattern (VEC (gimple, heap) **, tree *, tree *); static gimple vect_recog_bool_pattern (VEC (gimple, heap) **, tree *, tree *); @@ -64,6 +66,7 @@ static vect_recog_func_ptr vect_vect_rec vect_recog_over_widening_pattern, vect_recog_widen_shift_pattern, vect_recog_vector_vector_shift_pattern, + vect_recog_sdivmod_pow2_pattern, vect_recog_mixed_size_cond_pattern, vect_recog_bool_pattern}; @@ -867,7 +870,7 @@ vect_recog_widen_sum_pattern (VEC (gimpl NEW_DEF_STMT - in case DEF has to be promoted, we create two pattern statements for STMT: the first one is a type promotion and the second one is the operation itself. We return the type promotion statement - in NEW_DEF_STMT and further store it in STMT_VINFO_PATTERN_DEF_STMT of + in NEW_DEF_STMT and further store it in STMT_VINFO_PATTERN_DEF_SEQ of the second pattern statement. */ static bool @@ -988,7 +991,7 @@ vect_operation_fits_smaller_type (gimple a. Its type is not sufficient for the operation, we create a new stmt: a type conversion for OPRND from HALF_TYPE to INTERM_TYPE. We store this statement in NEW_DEF_STMT, and it is later put in - STMT_VINFO_PATTERN_DEF_STMT of the pattern statement for STMT. + STMT_VINFO_PATTERN_DEF_SEQ of the pattern statement for STMT. b. OPRND is good to use in the new statement. 
*/ if (first) { @@ -1143,7 +1146,8 @@ vect_recog_over_widening_pattern (VEC (g = gimple_build_assign_with_ops (gimple_assign_rhs_code (stmt), var, op0, op1); STMT_VINFO_RELATED_STMT (vinfo_for_stmt (stmt)) = pattern_stmt; - STMT_VINFO_PATTERN_DEF_STMT (vinfo_for_stmt (stmt)) = new_def_stmt; + STMT_VINFO_PATTERN_DEF_SEQ (vinfo_for_stmt (stmt)) + = gimple_seq_alloc_with_stmt (new_def_stmt); if (vect_print_dump_info (REPORT_DETAILS)) { @@ -1198,8 +1202,8 @@ vect_recog_over_widening_pattern (VEC (g else { if (prev_stmt) - STMT_VINFO_PATTERN_DEF_STMT (vinfo_for_stmt (use_stmt)) - = STMT_VINFO_PATTERN_DEF_STMT (vinfo_for_stmt (prev_stmt)); + STMT_VINFO_PATTERN_DEF_SEQ (vinfo_for_stmt (use_stmt)) + = STMT_VINFO_PATTERN_DEF_SEQ (vinfo_for_stmt (prev_stmt)); *type_in = vectype; *type_out = NULL_TREE; @@ -1475,7 +1479,7 @@ vect_recog_widen_shift_pattern (VEC (gim i.e. the shift/rotate stmt. The original stmt (S3) is replaced with a shift/rotate which has same type on both operands, in the second case just b_T op c_T, in the first case with added cast - from a_t to c_T in STMT_VINFO_PATTERN_DEF_STMT. + from a_t to c_T in STMT_VINFO_PATTERN_DEF_SEQ. Output: @@ -1555,7 +1559,8 @@ vect_recog_vector_vector_shift_pattern ( def = vect_recog_temp_ssa_var (TREE_TYPE (oprnd0), NULL); def_stmt = gimple_build_assign_with_ops (NOP_EXPR, def, oprnd1, NULL_TREE); - STMT_VINFO_PATTERN_DEF_STMT (stmt_vinfo) = def_stmt; + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) + = gimple_seq_alloc_with_stmt (def_stmt); } /* Pattern detected. */ @@ -1573,6 +1578,217 @@ vect_recog_vector_vector_shift_pattern ( return pattern_stmt; } +/* Detect a signed division by power of two constant that wouldn't be + otherwise vectorized: + + type a_t, b_t; + + S1 a_t = b_t / N; + + where type 'type' is a signed integral type and N is a constant positive + power of two. 
+ + Similarly handle signed modulo by power of two constant: + + S4 a_t = b_t % N; + + Input/Output: + + * STMTS: Contains a stmt from which the pattern search begins, + i.e. the division stmt. S1 is replaced by: + S3 y_t = b_t < 0 ? N - 1 : 0; + S2 x_t = b_t + y_t; + S1' a_t = x_t >> log2 (N); + + S4 is replaced by (where *_T temporaries have unsigned type): + S9 y_T = b_t < 0 ? -1U : 0U; + S8 z_T = y_T >> (sizeof (type_t) * CHAR_BIT - log2 (N)); + S7 z_t = (type) z_T; + S6 w_t = b_t + z_t; + S5 x_t = w_t & (N - 1); + S4' a_t = x_t - z_t; + + Output: + + * TYPE_IN: The type of the input arguments to the pattern. + + * TYPE_OUT: The type of the output of this pattern. + + * Return value: A new stmt that will be used to replace the division + S1 or modulo S4 stmt. */ + +static gimple +vect_recog_sdivmod_pow2_pattern (VEC (gimple, heap) **stmts, + tree *type_in, tree *type_out) +{ + gimple last_stmt = VEC_pop (gimple, *stmts); + tree oprnd0, oprnd1, vectype, itype, cond; + gimple pattern_stmt, def_stmt; + enum tree_code rhs_code; + stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); + loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo); + optab optab; + + if (!is_gimple_assign (last_stmt)) + return NULL; + + rhs_code = gimple_assign_rhs_code (last_stmt); + switch (rhs_code) + { + case TRUNC_DIV_EXPR: + case TRUNC_MOD_EXPR: + break; + default: + return NULL; + } + + if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo)) + return NULL; + + oprnd0 = gimple_assign_rhs1 (last_stmt); + oprnd1 = gimple_assign_rhs2 (last_stmt); + itype = TREE_TYPE (oprnd0); + if (TREE_CODE (oprnd0) != SSA_NAME + || TREE_CODE (oprnd1) != INTEGER_CST + || TREE_CODE (itype) != INTEGER_TYPE + || TYPE_UNSIGNED (itype) + || TYPE_PRECISION (itype) != GET_MODE_PRECISION (TYPE_MODE (itype)) + || !integer_pow2p (oprnd1) + || tree_int_cst_sgn (oprnd1) != 1) + return NULL; + + vectype = get_vectype_for_scalar_type (itype); + if (vectype == NULL_TREE) + return NULL; + + /* If the target can handle 
vectorized division or modulo natively, + don't attempt to optimize this. */ + optab = optab_for_tree_code (rhs_code, vectype, optab_default); + if (optab != NULL) + { + enum machine_mode vec_mode = TYPE_MODE (vectype); + int icode = (int) optab_handler (optab, vec_mode); + if (icode != CODE_FOR_nothing + || GET_MODE_SIZE (vec_mode) == UNITS_PER_WORD) + return NULL; + } + + /* Pattern detected. */ + if (vect_print_dump_info (REPORT_DETAILS)) + fprintf (vect_dump, "vect_recog_sdivmod_pow2_pattern: detected: "); + + cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst (itype, 0)); + if (rhs_code == TRUNC_DIV_EXPR) + { + tree var = vect_recog_temp_ssa_var (itype, NULL); + def_stmt + = gimple_build_assign_with_ops3 (COND_EXPR, var, cond, + fold_build2 (MINUS_EXPR, itype, + oprnd1, + build_int_cst (itype, + 1)), + build_int_cst (itype, 0)); + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) + = gimple_seq_alloc_with_stmt (def_stmt); + var = vect_recog_temp_ssa_var (itype, NULL); + def_stmt + = gimple_build_assign_with_ops (PLUS_EXPR, var, oprnd0, + gimple_assign_lhs (def_stmt)); + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + + pattern_stmt + = gimple_build_assign_with_ops (RSHIFT_EXPR, + vect_recog_temp_ssa_var (itype, NULL), + var, + build_int_cst (itype, + tree_log2 (oprnd1))); + } + else + { + tree signmask; + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL; + if (compare_tree_int (oprnd1, 2) == 0) + { + signmask = vect_recog_temp_ssa_var (itype, NULL); + def_stmt + = gimple_build_assign_with_ops3 (COND_EXPR, signmask, cond, + build_int_cst (itype, 1), + build_int_cst (itype, 0)); + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + } + else + { + tree utype + = build_nonstandard_integer_type (TYPE_PRECISION (itype), 1); + tree vecutype = get_vectype_for_scalar_type (utype); + tree shift + = build_int_cst (utype, GET_MODE_BITSIZE (TYPE_MODE (itype)) + - tree_log2 (oprnd1)); + tree var = 
vect_recog_temp_ssa_var (utype, NULL); + stmt_vec_info def_stmt_vinfo; + + def_stmt + = gimple_build_assign_with_ops3 (COND_EXPR, var, cond, + build_int_cst (utype, -1), + build_int_cst (utype, 0)); + def_stmt_vinfo = new_stmt_vec_info (def_stmt, loop_vinfo, NULL); + set_vinfo_for_stmt (def_stmt, def_stmt_vinfo); + STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecutype; + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + var = vect_recog_temp_ssa_var (utype, NULL); + def_stmt + = gimple_build_assign_with_ops (RSHIFT_EXPR, var, + gimple_assign_lhs (def_stmt), + shift); + def_stmt_vinfo = new_stmt_vec_info (def_stmt, loop_vinfo, NULL); + set_vinfo_for_stmt (def_stmt, def_stmt_vinfo); + STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecutype; + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + signmask = vect_recog_temp_ssa_var (itype, NULL); + def_stmt + = gimple_build_assign_with_ops (NOP_EXPR, signmask, var, + NULL_TREE); + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + } + def_stmt + = gimple_build_assign_with_ops (PLUS_EXPR, + vect_recog_temp_ssa_var (itype, NULL), + oprnd0, signmask); + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + def_stmt + = gimple_build_assign_with_ops (BIT_AND_EXPR, + vect_recog_temp_ssa_var (itype, NULL), + gimple_assign_lhs (def_stmt), + fold_build2 (MINUS_EXPR, itype, + oprnd1, + build_int_cst (itype, + 1))); + gimplify_seq_add_stmt (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo), + def_stmt); + + pattern_stmt + = gimple_build_assign_with_ops (MINUS_EXPR, + vect_recog_temp_ssa_var (itype, NULL), + gimple_assign_lhs (def_stmt), + signmask); + } + + if (vect_print_dump_info (REPORT_DETAILS)) + print_gimple_stmt (vect_dump, pattern_stmt, 0, TDF_SLIM); + + VEC_safe_push (gimple, heap, *stmts, last_stmt); + + *type_in = vectype; + *type_out = vectype; + return pattern_stmt; +} + /* Function vect_recog_mixed_size_cond_pattern Try to find the 
following pattern: @@ -1680,7 +1896,8 @@ vect_recog_mixed_size_cond_pattern (VEC vect_recog_temp_ssa_var (type, NULL), gimple_assign_lhs (def_stmt), NULL_TREE); - STMT_VINFO_PATTERN_DEF_STMT (stmt_vinfo) = def_stmt; + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) + = gimple_seq_alloc_with_stmt (def_stmt); def_stmt_info = new_stmt_vec_info (def_stmt, loop_vinfo, NULL); set_vinfo_for_stmt (def_stmt, def_stmt_info); STMT_VINFO_VECTYPE (def_stmt_info) = vecitype; @@ -1767,7 +1984,7 @@ check_bool_pattern (tree var, loop_vec_i /* Helper function of adjust_bool_pattern. Add a cast to TYPE to a previous stmt (SSA_NAME_DEF_STMT of VAR) by moving the COND_EXPR from RELATED_STMT - to PATTERN_DEF_STMT and adding a cast as RELATED_STMT. */ + to PATTERN_DEF_SEQ and adding a cast as RELATED_STMT. */ static tree adjust_bool_pattern_cast (tree type, tree var) @@ -1775,9 +1992,10 @@ adjust_bool_pattern_cast (tree type, tre stmt_vec_info stmt_vinfo = vinfo_for_stmt (SSA_NAME_DEF_STMT (var)); gimple cast_stmt, pattern_stmt; - gcc_assert (!STMT_VINFO_PATTERN_DEF_STMT (stmt_vinfo)); + gcc_assert (!STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo)); pattern_stmt = STMT_VINFO_RELATED_STMT (stmt_vinfo); - STMT_VINFO_PATTERN_DEF_STMT (stmt_vinfo) = pattern_stmt; + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) + = gimple_seq_alloc_with_stmt (pattern_stmt); cast_stmt = gimple_build_assign_with_ops (NOP_EXPR, vect_recog_temp_ssa_var (type, NULL), @@ -1882,7 +2100,7 @@ adjust_bool_pattern (tree var, tree out_ VEC_quick_push (gimple, *stmts, stmt); STMT_VINFO_RELATED_STMT (vinfo_for_stmt (stmt)) = STMT_VINFO_RELATED_STMT (stmt_def_vinfo); - gcc_assert (!STMT_VINFO_PATTERN_DEF_STMT (stmt_def_vinfo)); + gcc_assert (!STMT_VINFO_PATTERN_DEF_SEQ (stmt_def_vinfo)); STMT_VINFO_RELATED_STMT (stmt_def_vinfo) = NULL; return irhs2; } @@ -1907,7 +2125,7 @@ adjust_bool_pattern (tree var, tree out_ VEC_quick_push (gimple, *stmts, stmt); STMT_VINFO_RELATED_STMT (vinfo_for_stmt (stmt)) = STMT_VINFO_RELATED_STMT (stmt_def_vinfo); - 
gcc_assert (!STMT_VINFO_PATTERN_DEF_STMT (stmt_def_vinfo)); + gcc_assert (!STMT_VINFO_PATTERN_DEF_SEQ (stmt_def_vinfo)); STMT_VINFO_RELATED_STMT (stmt_def_vinfo) = NULL; return irhs1; } @@ -2086,7 +2304,8 @@ vect_recog_bool_pattern (VEC (gimple, he tree rhs2 = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); gimple cast_stmt = gimple_build_assign_with_ops (NOP_EXPR, rhs2, rhs, NULL_TREE); - STMT_VINFO_PATTERN_DEF_STMT (stmt_vinfo) = cast_stmt; + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) + = gimple_seq_alloc_with_stmt (cast_stmt); rhs = rhs2; } pattern_stmt @@ -2139,23 +2358,28 @@ vect_mark_pattern_stmts (gimple orig_stm STMT_VINFO_VECTYPE (pattern_stmt_info) = pattern_vectype; STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true; STMT_VINFO_RELATED_STMT (orig_stmt_info) = pattern_stmt; - STMT_VINFO_PATTERN_DEF_STMT (pattern_stmt_info) - = STMT_VINFO_PATTERN_DEF_STMT (orig_stmt_info); - if (STMT_VINFO_PATTERN_DEF_STMT (pattern_stmt_info)) - { - def_stmt = STMT_VINFO_PATTERN_DEF_STMT (pattern_stmt_info); - def_stmt_info = vinfo_for_stmt (def_stmt); - if (def_stmt_info == NULL) + STMT_VINFO_PATTERN_DEF_SEQ (pattern_stmt_info) + = STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info); + if (STMT_VINFO_PATTERN_DEF_SEQ (pattern_stmt_info)) + { + gimple_stmt_iterator si; + for (si = gsi_start (STMT_VINFO_PATTERN_DEF_SEQ (pattern_stmt_info)); + !gsi_end_p (si); gsi_next (&si)) { - def_stmt_info = new_stmt_vec_info (def_stmt, loop_vinfo, NULL); - set_vinfo_for_stmt (def_stmt, def_stmt_info); + def_stmt = gsi_stmt (si); + def_stmt_info = vinfo_for_stmt (def_stmt); + if (def_stmt_info == NULL) + { + def_stmt_info = new_stmt_vec_info (def_stmt, loop_vinfo, NULL); + set_vinfo_for_stmt (def_stmt, def_stmt_info); + } + gimple_set_bb (def_stmt, gimple_bb (orig_stmt)); + STMT_VINFO_RELATED_STMT (def_stmt_info) = orig_stmt; + STMT_VINFO_DEF_TYPE (def_stmt_info) + = STMT_VINFO_DEF_TYPE (orig_stmt_info); + if (STMT_VINFO_VECTYPE (def_stmt_info) == NULL_TREE) + STMT_VINFO_VECTYPE (def_stmt_info) = 
pattern_vectype; } - gimple_set_bb (def_stmt, gimple_bb (orig_stmt)); - STMT_VINFO_RELATED_STMT (def_stmt_info) = orig_stmt; - STMT_VINFO_DEF_TYPE (def_stmt_info) - = STMT_VINFO_DEF_TYPE (orig_stmt_info); - if (STMT_VINFO_VECTYPE (def_stmt_info) == NULL_TREE) - STMT_VINFO_VECTYPE (def_stmt_info) = pattern_vectype; } } --- gcc/config/i386/sse.md.jj 2011-12-15 08:06:48.962106914 +0100 +++ gcc/config/i386/sse.md 2011-12-15 10:01:52.361899998 +0100 @@ -6340,9 +6340,9 @@ (define_expand "vcondmode) == GET_MODE_NUNITS (mode))" @@ -6357,9 +6357,9 @@ (define_expand "vcondmode) == GET_MODE_NUNITS (mode))" @@ -6374,9 +6374,9 @@ (define_expand "vcondv2di (if_then_else:VI8F_128 (match_operator 3 "" [(match_operand:V2DI 4 "nonimmediate_operand" "") - (match_operand:V2DI 5 "nonimmediate_operand" "")]) - (match_operand:VI8F_128 1 "general_operand" "") - (match_operand:VI8F_128 2 "general_operand" "")))] + (match_operand:V2DI 5 "general_operand" "")]) + (match_operand:VI8F_128 1 "" "") + (match_operand:VI8F_128 2 "" "")))] "TARGET_SSE4_2" { bool ok = ix86_expand_int_vcond (operands); --- gcc/config/i386/i386.c.jj 2011-12-15 08:06:49.176105645 +0100 +++ gcc/config/i386/i386.c 2011-12-15 10:01:52.368900116 +0100 @@ -19434,6 +19434,45 @@ ix86_expand_int_vcond (rtx operands[]) cop0 = operands[4]; cop1 = operands[5]; + /* Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 + and x < 0 ? 1 : 0 into (unsigned) x >> 31. 
*/ + if ((code == LT || code == GE) + && data_mode == mode + && cop1 == CONST0_RTX (mode) + && operands[1 + (code == LT)] == CONST0_RTX (data_mode) + && GET_MODE_SIZE (GET_MODE_INNER (data_mode)) > 1 + && GET_MODE_SIZE (GET_MODE_INNER (data_mode)) <= 8 + && (GET_MODE_SIZE (data_mode) == 16 + || (TARGET_AVX2 && GET_MODE_SIZE (data_mode) == 32))) + { + rtx negop = operands[2 - (code == LT)]; + int shift = GET_MODE_BITSIZE (GET_MODE_INNER (data_mode)) - 1; + if (negop == CONST1_RTX (data_mode)) + { + rtx res = expand_simple_binop (mode, LSHIFTRT, cop0, GEN_INT (shift), + operands[0], 1, OPTAB_DIRECT); + if (res != operands[0]) + emit_move_insn (operands[0], res); + return true; + } + else if (GET_MODE_INNER (data_mode) != DImode + && vector_all_ones_operand (negop, data_mode)) + { + rtx res = expand_simple_binop (mode, ASHIFTRT, cop0, GEN_INT (shift), + operands[0], 0, OPTAB_DIRECT); + if (res != operands[0]) + emit_move_insn (operands[0], res); + return true; + } + } + + if (!nonimmediate_operand (cop1, mode)) + cop1 = force_reg (mode, cop1); + if (!general_operand (operands[1], data_mode)) + operands[1] = force_reg (data_mode, operands[1]); + if (!general_operand (operands[2], data_mode)) + operands[2] = force_reg (data_mode, operands[2]); + /* XOP supports all of the comparisons on all 128-bit vector int types. 
*/ if (TARGET_XOP && (mode == V16QImode || mode == V8HImode --- gcc/testsuite/gcc.dg/vect/vect-sdivmod-1.c.jj 2011-12-15 10:01:52.372900152 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-sdivmod-1.c 2011-12-15 10:01:52.372900152 +0100 @@ -0,0 +1,98 @@ +#include "tree-vect.h" + +extern void abort (void); +int a[4096]; + +__attribute__((noinline, noclone)) void +f1 (int x) +{ + int i, j; + for (i = 1; i <= x; i++) + { + j = a[i] >> 8; + j = 1 + (j / 2); + a[i] = j << 8; + } +} + +__attribute__((noinline, noclone)) void +f2 (int x) +{ + int i, j; + for (i = 1; i <= x; i++) + { + j = a[i] >> 8; + j = 1 + (j / 16); + a[i] = j << 8; + } +} + +__attribute__((noinline, noclone)) void +f3 (int x) +{ + int i, j; + for (i = 1; i <= x; i++) + { + j = a[i] >> 8; + j = 1 + (j % 2); + a[i] = j << 8; + } +} + +__attribute__((noinline, noclone)) void +f4 (int x) +{ + int i, j; + for (i = 1; i <= x; i++) + { + j = a[i] >> 8; + j = 1 + (j % 16); + a[i] = j << 8; + } +} + +int +main () +{ + int i; + check_vect (); + for (i = 0; i < 4096; i++) + { + asm (""); + a[i] = (i - 2048) << 8; + } + f1 (4095); + if (a[0] != (-2048 << 8)) + abort (); + for (i = 1; i < 4096; i++) + if (a[i] != ((1 + ((i - 2048) / 2)) << 8)) + abort (); + else + a[i] = (i - 2048) << 8; + f2 (4095); + if (a[0] != (-2048 << 8)) + abort (); + for (i = 1; i < 4096; i++) + if (a[i] != ((1 + ((i - 2048) / 16)) << 8)) + abort (); + else + a[i] = (i - 2048) << 8; + f3 (4095); + if (a[0] != (-2048 << 8)) + abort (); + for (i = 1; i < 4096; i++) + if (a[i] != ((1 + ((i - 2048) % 2)) << 8)) + abort (); + else + a[i] = (i - 2048) << 8; + f4 (4095); + if (a[0] != (-2048 << 8)) + abort (); + for (i = 1; i < 4096; i++) + if (a[i] != ((1 + ((i - 2048) % 16)) << 8)) + abort (); + return 0; +} + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" { target vect_condition } } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */
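
For readers who want to check the scalar identities behind
vect_recog_sdivmod_pow2_pattern outside the compiler, here is a minimal
sketch in plain C.  It follows the S3/S2/S1' and S9..S4' sequences from the
comment above that function, with N written as 1 << k.  The helper names
sdiv_pow2 and smod_pow2 are made up for this illustration and do not exist
in GCC.  The sketch also assumes that >> on a negative int is an arithmetic
shift, which holds for GCC but is implementation-defined in ISO C, and that
1 <= k < the bit width of int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not part of the patch): computes b / (1 << k)
   with truncation toward zero, using the S3, S2, S1' sequence.
   Assumes >> on negative int is an arithmetic shift (true for GCC).  */
static int
sdiv_pow2 (int b, int k)
{
  int n = 1 << k;
  int y = b < 0 ? n - 1 : 0;	/* S3: bias applied only to negative b.  */
  int x = b + y;		/* S2: cannot overflow, y != 0 only if b < 0.  */
  return x >> k;		/* S1': floor division of the biased value.  */
}

/* Hypothetical helper (not part of the patch): computes b % (1 << k)
   using the S9 .. S4' sequence.  Requires k >= 1 so that the shift
   count in S8 stays below the width of unsigned.  */
static int
smod_pow2 (int b, int k)
{
  int n = 1 << k;
  unsigned y = b < 0 ? -1U : 0U;			/* S9: all-ones sign mask.  */
  unsigned zu = y >> (sizeof (int) * CHAR_BIT - k);	/* S8: N - 1 or 0.  */
  int z = (int) zu;					/* S7 */
  int w = b + z;					/* S6 */
  int x = w & (n - 1);					/* S5 */
  return x - z;						/* S4' */
}
```

Looping b over the same -2048 .. 2047 range that vect-sdivmod-1.c uses and
comparing against b / 2, b / 16, b % 2 and b % 16 is enough to exercise
both the negative and the non-negative paths.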