From patchwork Wed Jun 10 06:37:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 1306593 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=gcc.gnu.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=mq/N/VX4; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49hckW48lBz9sSF for ; Wed, 10 Jun 2020 16:38:10 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 17658388E824; Wed, 10 Jun 2020 06:38:08 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 17658388E824 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1591771088; bh=PBu8r/xyIfhzwa6gM+/r5ivd1rQCQoxt9jF/96M2Pwg=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=mq/N/VX41+f6dP8WsYT267DvlxkYAxJ5ehDSqdrjedlAOrczsbNOLkhkFT+1AsrCF 1cSLDGVQxa9AhVkMIkvglkt/U+GCZhwlNaJilTnxjBrLITcdSeS2FByw/aAfT7k63L j4NSa+7PoT+YciMQSzkUvl22KXx2/eJNfYIUoTts= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 93E60388B039 for ; Wed, 10 Jun 2020 06:38:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 93E60388B039 Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 05A6WUqW043594; Wed, 10 Jun 2020 02:38:01 -0400 Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com with ESMTP id 31jgu5q6kk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 10 Jun 2020 02:38:01 -0400 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 05A6ZKrT022185; Wed, 10 Jun 2020 06:37:59 GMT Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by ppma03ams.nl.ibm.com with ESMTP id 31g2s7y6bw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 10 Jun 2020 06:37:58 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 05A6buY131719648 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 10 Jun 2020 06:37:56 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 905B0AE04D; Wed, 10 Jun 2020 06:37:56 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7EADFAE051; Wed, 10 Jun 2020 06:37:53 +0000 (GMT) Received: from KewenLins-MacBook-Pro.local (unknown [9.200.53.22]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 10 Jun 2020 06:37:53 +0000 (GMT) Subject: [PATCH 4/4] vect: Factor out and rename some functions/macros To: GCC Patches References: <8107a42b-92e8-56f1-0721-8e594c18b8ed@linux.ibm.com> Message-ID: Date: Wed, 10 Jun 2020 14:37:51 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <8107a42b-92e8-56f1-0721-8e594c18b8ed@linux.ibm.com> Content-Language: en-US X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.216, 18.0.687 definitions=2020-06-10_02:2020-06-10, 2020-06-10 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 priorityscore=1501 adultscore=0 cotscore=-2147483648 mlxscore=0 phishscore=0 suspectscore=0 bulkscore=0 malwarescore=0 spamscore=0 impostorscore=0 mlxlogscore=999 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2006100046 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Kewen.Lin via Gcc-patches" From: "Kewen.Lin" Reply-To: "Kewen.Lin" Cc: Bill Schmidt , Segher Boessenkool Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/ChangeLog: * tree-vect-loop-manip.c (vect_set_loop_controls_directly): Rename LOOP_VINFO_MASK_COMPARE_TYPE to LOOP_VINFO_COMPARE_TYPE. Rename LOOP_VINFO_MASK_IV_TYPE to LOOP_VINFO_IV_TYPE. (vect_set_loop_condition_masked): Renamed to ... (vect_set_loop_condition_partial_vectors): ... this. Rename LOOP_VINFO_MASK_COMPARE_TYPE to LOOP_VINFO_COMPARE_TYPE. Rename vect_iv_limit_for_full_masking to vect_iv_limit_for_partial_vectors. (vect_set_loop_condition_unmasked): Renamed to ... (vect_set_loop_condition_normal): ... this. (vect_set_loop_condition): Rename vect_set_loop_condition_unmasked to vect_set_loop_condition_normal. Rename vect_set_loop_condition_masked to vect_set_loop_condition_partial_vectors. (vect_prepare_for_masked_peels): Rename LOOP_VINFO_MASK_COMPARE_TYPE to LOOP_VINFO_COMPARE_TYPE. * tree-vect-loop.c (known_niters_smaller_than_vf): New, factored out from ... (vect_analyze_loop_costing): ... this. (_loop_vec_info::_loop_vec_info): Rename mask_compare_type to compare_type. (min_prec_for_max_niters): New, factored out from ... (vect_verify_full_masking): ... this. Rename vect_iv_limit_for_full_masking to vect_iv_limit_for_partial_vectors. Rename LOOP_VINFO_MASK_COMPARE_TYPE to LOOP_VINFO_COMPARE_TYPE. Rename LOOP_VINFO_MASK_IV_TYPE to LOOP_VINFO_IV_TYPE. (vectorizable_reduction): Update some dumpings with partial vectorization instead of fully-masked. (vectorizable_live_operation): Likewise. (vect_iv_limit_for_full_masking): Renamed to ... (vect_iv_limit_for_partial_vectors): ... this. * tree-vect-stmts.c (check_load_store_masking): Renamed to ... (check_load_store_for_partial_vectors): ... this. Update some dumpings with partial vectorization instead of fully-masked. (vectorizable_store): Rename check_load_store_masking to check_load_store_for_partial_vectors. (vectorizable_load): Likewise. * tree-vectorizer.h (LOOP_VINFO_MASK_COMPARE_TYPE): Renamed to ... (LOOP_VINFO_COMPARE_TYPE): ... this. (LOOP_VINFO_MASK_IV_TYPE): Renamed to ... (LOOP_VINFO_IV_TYPE): ... this. (vect_iv_limit_for_full_masking): Renamed to ... (vect_iv_limit_for_partial_vectors): this. (_loop_vec_info): Rename mask_compare_type to compare_type. --- --- gcc/tree-vect-loop-manip.c | 40 ++++++------- gcc/tree-vect-loop.c | 117 +++++++++++++++++++++++-------------- gcc/tree-vect-stmts.c | 41 +++++++------ gcc/tree-vectorizer.h | 15 ++--- 4 files changed, 123 insertions(+), 90 deletions(-) -- diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c index ca68d04a919..1fac5898525 100644 --- a/gcc/tree-vect-loop-manip.c +++ b/gcc/tree-vect-loop-manip.c @@ -420,8 +420,8 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo, rgroup_controls *rgc, tree niters, tree niters_skip, bool might_wrap_p) { - tree compare_type = LOOP_VINFO_MASK_COMPARE_TYPE (loop_vinfo); - tree iv_type = LOOP_VINFO_MASK_IV_TYPE (loop_vinfo); + tree compare_type = LOOP_VINFO_COMPARE_TYPE (loop_vinfo); + tree iv_type = LOOP_VINFO_IV_TYPE (loop_vinfo); tree ctrl_type = rgc->type; unsigned int nscalars_per_iter = rgc->max_nscalars_per_iter; poly_uint64 nscalars_per_ctrl = TYPE_VECTOR_SUBPARTS (ctrl_type); @@ -643,15 +643,15 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo, final gcond. */ static gcond * -vect_set_loop_condition_masked (class loop *loop, loop_vec_info loop_vinfo, - tree niters, tree final_iv, - bool niters_maybe_zero, - gimple_stmt_iterator loop_cond_gsi) +vect_set_loop_condition_partial_vectors (class loop *loop, + loop_vec_info loop_vinfo, tree niters, + tree final_iv, bool niters_maybe_zero, + gimple_stmt_iterator loop_cond_gsi) { gimple_seq preheader_seq = NULL; gimple_seq header_seq = NULL; - tree compare_type = LOOP_VINFO_MASK_COMPARE_TYPE (loop_vinfo); + tree compare_type = LOOP_VINFO_COMPARE_TYPE (loop_vinfo); unsigned int compare_precision = TYPE_PRECISION (compare_type); tree orig_niters = niters; @@ -677,7 +677,7 @@ vect_set_loop_condition_masked (class loop *loop, loop_vec_info loop_vinfo, else niters = gimple_convert (&preheader_seq, compare_type, niters); - widest_int iv_limit = vect_iv_limit_for_full_masking (loop_vinfo); + widest_int iv_limit = vect_iv_limit_for_partial_vectors (loop_vinfo); /* Iterate over all the rgroups and fill in their controls. We could use the first control from any rgroup for the loop condition; here we @@ -748,13 +748,12 @@ vect_set_loop_condition_masked (class loop *loop, loop_vec_info loop_vinfo, } /* Like vect_set_loop_condition, but handle the case in which there - are no loop masks. */ + are no partial vectorization loops. */ static gcond * -vect_set_loop_condition_unmasked (class loop *loop, tree niters, - tree step, tree final_iv, - bool niters_maybe_zero, - gimple_stmt_iterator loop_cond_gsi) +vect_set_loop_condition_normal (class loop *loop, tree niters, tree step, + tree final_iv, bool niters_maybe_zero, + gimple_stmt_iterator loop_cond_gsi) { tree indx_before_incr, indx_after_incr; gcond *cond_stmt; @@ -913,13 +912,14 @@ vect_set_loop_condition (class loop *loop, loop_vec_info loop_vinfo, gimple_stmt_iterator loop_cond_gsi = gsi_for_stmt (orig_cond); if (loop_vinfo && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)) - cond_stmt = vect_set_loop_condition_masked (loop, loop_vinfo, niters, - final_iv, niters_maybe_zero, - loop_cond_gsi); + cond_stmt = vect_set_loop_condition_partial_vectors (loop, loop_vinfo, + niters, final_iv, + niters_maybe_zero, + loop_cond_gsi); else - cond_stmt = vect_set_loop_condition_unmasked (loop, niters, step, - final_iv, niters_maybe_zero, - loop_cond_gsi); + cond_stmt = vect_set_loop_condition_normal (loop, niters, step, final_iv, + niters_maybe_zero, + loop_cond_gsi); /* Remove old loop exit test. */ stmt_vec_info orig_cond_info; @@ -1774,7 +1774,7 @@ void vect_prepare_for_masked_peels (loop_vec_info loop_vinfo) { tree misalign_in_elems; - tree type = LOOP_VINFO_MASK_COMPARE_TYPE (loop_vinfo); + tree type = LOOP_VINFO_COMPARE_TYPE (loop_vinfo); gcc_assert (vect_use_loop_mask_for_alignment_p (loop_vinfo)); diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 7ea75d6d095..b6e96f77f69 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -155,6 +155,7 @@ along with GCC; see the file COPYING3. If not see static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *); static stmt_vec_info vect_is_simple_reduction (loop_vec_info, stmt_vec_info, bool *, bool *); +static bool known_niters_smaller_than_vf (loop_vec_info); /* Subroutine of vect_determine_vf_for_stmt that handles only one statement. VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE @@ -800,7 +801,7 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared) vectorization_factor (0), max_vectorization_factor (0), mask_skip_niters (NULL_TREE), - mask_compare_type (NULL_TREE), + compare_type (NULL_TREE), simd_if_cond (NULL_TREE), unaligned_dr (NULL), peeling_for_alignment (0), @@ -959,14 +960,41 @@ vect_get_max_nscalars_per_iter (loop_vec_info loop_vinfo) return res; } +/* Calculate the minimal bits necessary to represent the maximal iteration + count of loop with loop_vec_info LOOP_VINFO which is scaling with a given + factor FACTOR. */ + +static unsigned +min_prec_for_max_niters (loop_vec_info loop_vinfo, unsigned int factor) +{ + class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); + + /* Get the maximum number of iterations that is representable + in the counter type. */ + tree ni_type = TREE_TYPE (LOOP_VINFO_NITERSM1 (loop_vinfo)); + widest_int max_ni = wi::to_widest (TYPE_MAX_VALUE (ni_type)) + 1; + + /* Get a more refined estimate for the number of iterations. */ + widest_int max_back_edges; + if (max_loop_iterations (loop, &max_back_edges)) + max_ni = wi::smin (max_ni, max_back_edges + 1); + + /* Account for factor, in which each bit is replicated N times. */ + max_ni *= factor; + + /* Work out how many bits we need to represent the limit. */ + unsigned int min_ni_width = wi::min_precision (max_ni, UNSIGNED); + + return min_ni_width; +} + /* Each statement in LOOP_VINFO can be masked where necessary. Check whether we can actually generate the masks required. Return true if so, - storing the type of the scalar IV in LOOP_VINFO_MASK_COMPARE_TYPE. */ + storing the type of the scalar IV in LOOP_VINFO_COMPARE_TYPE. */ static bool vect_verify_full_masking (loop_vec_info loop_vinfo) { - class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); unsigned int min_ni_width; unsigned int max_nscalars_per_iter = vect_get_max_nscalars_per_iter (loop_vinfo); @@ -977,27 +1005,14 @@ vect_verify_full_masking (loop_vec_info loop_vinfo) if (LOOP_VINFO_MASKS (loop_vinfo).is_empty ()) return false; - /* Get the maximum number of iterations that is representable - in the counter type. */ - tree ni_type = TREE_TYPE (LOOP_VINFO_NITERSM1 (loop_vinfo)); - widest_int max_ni = wi::to_widest (TYPE_MAX_VALUE (ni_type)) + 1; - - /* Get a more refined estimate for the number of iterations. */ - widest_int max_back_edges; - if (max_loop_iterations (loop, &max_back_edges)) - max_ni = wi::smin (max_ni, max_back_edges + 1); - - /* Account for rgroup masks, in which each bit is replicated N times. */ - max_ni *= max_nscalars_per_iter; - /* Work out how many bits we need to represent the limit. */ - min_ni_width = wi::min_precision (max_ni, UNSIGNED); + min_ni_width = min_prec_for_max_niters (loop_vinfo, max_nscalars_per_iter); /* Find a scalar mode for which WHILE_ULT is supported. */ opt_scalar_int_mode cmp_mode_iter; tree cmp_type = NULL_TREE; tree iv_type = NULL_TREE; - widest_int iv_limit = vect_iv_limit_for_full_masking (loop_vinfo); + widest_int iv_limit = vect_iv_limit_for_partial_vectors (loop_vinfo); unsigned int iv_precision = UINT_MAX; if (iv_limit != -1) @@ -1050,8 +1065,8 @@ vect_verify_full_masking (loop_vec_info loop_vinfo) if (!cmp_type) return false; - LOOP_VINFO_MASK_COMPARE_TYPE (loop_vinfo) = cmp_type; - LOOP_VINFO_MASK_IV_TYPE (loop_vinfo) = iv_type; + LOOP_VINFO_COMPARE_TYPE (loop_vinfo) = cmp_type; + LOOP_VINFO_IV_TYPE (loop_vinfo) = iv_type; return true; } @@ -1631,15 +1646,7 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo) vectorization factor. */ if (!LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)) { - HOST_WIDE_INT max_niter; - - if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) - max_niter = LOOP_VINFO_INT_NITERS (loop_vinfo); - else - max_niter = max_stmt_executions_int (loop); - - if (max_niter != -1 - && (unsigned HOST_WIDE_INT) max_niter < assumed_vf) + if (known_niters_smaller_than_vf (loop_vinfo)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6801,8 +6808,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo, { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because no" - " conditional operation is available.\n"); + "can't use a partial vectorization loop because" + " no conditional operation is available.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; } else if (reduction_type == FOLD_LEFT_REDUCTION @@ -6813,8 +6820,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo, { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because no" - " conditional operation is available.\n"); + "can't use a partial vectorization loop because" + " no conditional operation is available.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; } else @@ -8021,25 +8028,26 @@ vectorizable_live_operation (loop_vec_info loop_vinfo, { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because " - "the target doesn't support extract last " - "reduction.\n"); + "can't use a partial vectorization loop " + "because the target doesn't support extract " + "last reduction.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; } else if (slp_node) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because an " - "SLP statement is live after the loop.\n"); + "can't use a partial vectorization loop " + "because an SLP statement is live after " + "the loop.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; } else if (ncopies > 1) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because" - " ncopies is greater than 1.\n"); + "can't use a partial vectorization loop " + "because ncopies is greater than 1.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; } else @@ -9194,12 +9202,13 @@ optimize_mask_stores (class loop *loop) } /* Decide whether it is possible to use a zero-based induction variable - when vectorizing LOOP_VINFO with a fully-masked loop. If it is, - return the value that the induction variable must be able to hold - in order to ensure that the loop ends with an all-false mask. + when vectorizing LOOP_VINFO with a partial vectorization loop. If + it is, return the value that the induction variable must be able to + hold in order to ensure that the loop ends with an all-false rgroup + control like mask. Return -1 otherwise. */ widest_int -vect_iv_limit_for_full_masking (loop_vec_info loop_vinfo) +vect_iv_limit_for_partial_vectors (loop_vec_info loop_vinfo) { tree niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo); class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); @@ -9234,3 +9243,23 @@ vect_iv_limit_for_full_masking (loop_vec_info loop_vinfo) return iv_limit; } +/* If we know the iteration count is smaller than vectorization factor, return + true, otherwise return false. */ + +static bool +known_niters_smaller_than_vf (loop_vec_info loop_vinfo) +{ + unsigned int assumed_vf = vect_vf_for_cost (loop_vinfo); + + HOST_WIDE_INT max_niter; + if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) + max_niter = LOOP_VINFO_INT_NITERS (loop_vinfo); + else + max_niter = max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo)); + + if (max_niter != -1 && (unsigned HOST_WIDE_INT) max_niter < assumed_vf) + return true; + + return false; +} + diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index fb82c8d940f..484470091a8 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -1771,9 +1771,9 @@ static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info, gimple_stmt_iterator *); /* Check whether a load or store statement in the loop described by - LOOP_VINFO is possible in a fully-masked loop. This is testing - whether the vectorizer pass has the appropriate support, as well as - whether the target does. + LOOP_VINFO is possible in a partial vectorization loop. This is + testing whether the vectorizer pass has the appropriate support, + as well as whether the target does. VLS_TYPE says whether the statement is a load or store and VECTYPE is the type of the vector being loaded or stored. MEMORY_ACCESS_TYPE @@ -1783,14 +1783,15 @@ static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info, its arguments. If the load or store is conditional, SCALAR_MASK is the condition under which it occurs. - Clear LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P if a fully-masked loop is not - supported, otherwise record the required mask types. */ + Clear LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P if a partial vectorization + loop is not supported, otherwise record the required rgroup control + types. */ static void -check_load_store_masking (loop_vec_info loop_vinfo, tree vectype, - vec_load_store_type vls_type, int group_size, - vect_memory_access_type memory_access_type, - gather_scatter_info *gs_info, tree scalar_mask) +check_load_store_for_partial_vectors ( + loop_vec_info loop_vinfo, tree vectype, vec_load_store_type vls_type, + int group_size, vect_memory_access_type memory_access_type, + gather_scatter_info *gs_info, tree scalar_mask) { /* Invariant loads need no special support. */ if (memory_access_type == VMAT_INVARIANT) @@ -1807,8 +1808,8 @@ check_load_store_masking (loop_vec_info loop_vinfo, tree vectype, { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because the" - " target doesn't have an appropriate masked" + "can't use a partial vectorization loop because" + " the target doesn't have an appropriate" " load/store-lanes instruction.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; return; @@ -1830,8 +1831,8 @@ check_load_store_masking (loop_vec_info loop_vinfo, tree vectype, { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because the" - " target doesn't have an appropriate masked" + "can't use a partial vectorization loop because" + " the target doesn't have an appropriate" " gather load or scatter store instruction.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; return; @@ -1848,8 +1849,8 @@ check_load_store_masking (loop_vec_info loop_vinfo, tree vectype, scalar loop. We need more work to support other mappings. */ if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "can't use a fully-masked loop because an access" - " isn't contiguous.\n"); + "can't use a partial vectorization loop because an" + " access isn't contiguous.\n"); LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; return; } @@ -7529,8 +7530,9 @@ vectorizable_store (vec_info *vinfo, if (loop_vinfo && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) - check_load_store_masking (loop_vinfo, vectype, vls_type, group_size, - memory_access_type, &gs_info, mask); + check_load_store_for_partial_vectors (loop_vinfo, vectype, vls_type, + group_size, memory_access_type, + &gs_info, mask); if (slp_node && !vect_maybe_update_slp_op_vectype (SLP_TREE_CHILDREN (slp_node)[0], @@ -8836,8 +8838,9 @@ vectorizable_load (vec_info *vinfo, if (loop_vinfo && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) - check_load_store_masking (loop_vinfo, vectype, VLS_LOAD, group_size, - memory_access_type, &gs_info, mask); + check_load_store_for_partial_vectors (loop_vinfo, vectype, VLS_LOAD, + group_size, memory_access_type, + &gs_info, mask); STMT_VINFO_TYPE (stmt_info) = load_vec_info_type; vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 8f3499cc811..857b4a9db15 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -534,9 +534,10 @@ public: elements that should be false in the first mask). */ tree mask_skip_niters; - /* Type of the variables to use in the WHILE_ULT call for fully-masked + /* Type of the variables to use in the loop closing comparison for + partial vectorization, like WHILE_ULT call for fully-masked loops. */ - tree mask_compare_type; + tree compare_type; /* For #pragma omp simd if (x) loops the x expression. If constant 0, the loop should not be vectorized, if constant non-zero, simd_if_cond @@ -545,8 +546,8 @@ public: is false and vectorized loop otherwise. */ tree simd_if_cond; - /* Type of the IV to use in the WHILE_ULT call for fully-masked - loops. */ + /* Type of the IV to use in the loop closing comparison for partial + vectorization, like WHILE_ULT call for fully-masked loops. */ tree iv_type; /* Unknown DRs according to which loop was peeled. */ @@ -696,8 +697,8 @@ public: #define LOOP_VINFO_MAX_VECT_FACTOR(L) (L)->max_vectorization_factor #define LOOP_VINFO_MASKS(L) (L)->masks #define LOOP_VINFO_MASK_SKIP_NITERS(L) (L)->mask_skip_niters -#define LOOP_VINFO_MASK_COMPARE_TYPE(L) (L)->mask_compare_type -#define LOOP_VINFO_MASK_IV_TYPE(L) (L)->iv_type +#define LOOP_VINFO_COMPARE_TYPE(L) (L)->compare_type +#define LOOP_VINFO_IV_TYPE(L) (L)->iv_type #define LOOP_VINFO_PTR_MASK(L) (L)->ptr_mask #define LOOP_VINFO_LOOP_NEST(L) (L)->shared->loop_nest #define LOOP_VINFO_DATAREFS(L) (L)->shared->datarefs @@ -1831,7 +1832,7 @@ extern tree vect_create_addr_base_for_vector_ref (vec_info *, tree, tree = NULL_TREE); /* In tree-vect-loop.c. */ -extern widest_int vect_iv_limit_for_full_masking (loop_vec_info loop_vinfo); +extern widest_int vect_iv_limit_for_partial_vectors (loop_vec_info loop_vinfo); /* Used in tree-vect-loop-manip.c */ extern void determine_peel_for_niter (loop_vec_info); /* Used in gimple-loop-interchange.c and tree-parloops.c. */