From patchwork Tue Jun 19 12:23:04 2018
X-Patchwork-Id: 931574
Date: Tue, 19 Jun 2018 14:23:04 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Refactor vectorize_loops

This refactors the main loop vectorization loop to avoid all the gotos
which stand in the way of me adding a local with a constructor.  On the
way I fixed the dbgcnt handling now that we support things like 2:5
which the vectorizer mishandles by simply aborting the loop walk.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-06-19  Richard Biener

	* tree-vectorizer.c (try_vectorize_loop_1): Split out of ...
	(vectorize_loops): ... here.  Fix dbgcnt handling.
	(try_vectorize_loop): Wrap try_vectorize_loop_1.
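To illustrate the dbgcnt point: with a ranged counter such as 2:5, bailing
out of the whole loop walk on the first counter miss skips exactly the loops
the range was meant to enable.  The standalone sketch below is not GCC code;
counter_hit and its fields are invented for illustration, and the precise
inclusive/exclusive semantics of -fdbg-cnt ranges are glossed over.  It only
contrasts aborting the walk (the old break) with skipping the current loop
(the early return that try_vectorize_loop_1 now does).

/* Sketch only: a mock ranged debug counter driving a loop walk.  */
#include <stdio.h>

struct dbg_counter { unsigned next; unsigned lo, hi; };

/* Bump the event index and report whether it falls in the enabled range,
   roughly what a "lo:hi" counter specification means.  */
static int
counter_hit (struct dbg_counter *c)
{
  unsigned idx = ++c->next;
  return idx >= c->lo && idx <= c->hi;
}

int
main (void)
{
  struct dbg_counter vect_loop = { 0, 2, 5 };	/* behaves like 2:5 */

  for (unsigned loop = 1; loop <= 6; loop++)
    {
      if (!counter_hit (&vect_loop))
	{
	  /* The old code did 'break;' here, aborting the walk at loop 1
	     and never reaching loops 2-5 even though the range enables
	     them.  Skipping just this loop keeps the walk going, which
	     is what the per-loop helper's early return achieves.  */
	  printf ("loop %u: skipped by debug counter\n", loop);
	  continue;
	}
      printf ("loop %u: vectorized\n", loop);
    }
  return 0;
}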
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	(revision 261732)
+++ gcc/tree-vectorizer.c	(working copy)
@@ -659,6 +659,158 @@ set_uid_loop_bbs (loop_vec_info loop_vin
   free (bbs);
 }
 
+/* Try to vectorize LOOP.  */
+
+static unsigned
+try_vectorize_loop_1 (hash_table<simduid_to_vf> *&simduid_to_vf_htab,
+		      unsigned *num_vectorized_loops,
+		      loop_p loop, loop_vec_info orig_loop_vinfo,
+		      gimple *loop_vectorized_call,
+		      gimple *loop_dist_alias_call)
+{
+  unsigned ret = 0;
+  vect_location = find_loop_location (loop);
+  if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
+      && dump_enabled_p ())
+    dump_printf (MSG_NOTE, "\nAnalyzing loop at %s:%d\n",
+		 LOCATION_FILE (vect_location),
+		 LOCATION_LINE (vect_location));
+
+  loop_vec_info loop_vinfo = vect_analyze_loop (loop, orig_loop_vinfo);
+  loop->aux = loop_vinfo;
+
+  if (!loop_vinfo || !LOOP_VINFO_VECTORIZABLE_P (loop_vinfo))
+    {
+      /* Free existing information if loop is analyzed with some
+	 assumptions.  */
+      if (loop_constraint_set_p (loop, LOOP_C_FINITE))
+	vect_free_loop_info_assumptions (loop);
+
+      /* If we applied if-conversion then try to vectorize the
+	 BB of innermost loops.
+	 ??? Ideally BB vectorization would learn to vectorize
+	 control flow by applying if-conversion on-the-fly, the
+	 following retains the if-converted loop body even when
+	 only non-if-converted parts took part in BB vectorization.  */
+      if (flag_tree_slp_vectorize != 0
+	  && loop_vectorized_call
+	  && ! loop->inner)
+	{
+	  basic_block bb = loop->header;
+	  bool has_mask_load_store = false;
+	  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	       !gsi_end_p (gsi); gsi_next (&gsi))
+	    {
+	      gimple *stmt = gsi_stmt (gsi);
+	      if (is_gimple_call (stmt)
+		  && gimple_call_internal_p (stmt)
+		  && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
+		      || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
+		{
+		  has_mask_load_store = true;
+		  break;
+		}
+	      gimple_set_uid (stmt, -1);
+	      gimple_set_visited (stmt, false);
+	    }
+	  if (! has_mask_load_store && vect_slp_bb (bb))
+	    {
+	      dump_printf_loc (MSG_NOTE, vect_location,
+			       "basic block vectorized\n");
+	      fold_loop_internal_call (loop_vectorized_call,
+				       boolean_true_node);
+	      loop_vectorized_call = NULL;
+	      ret |= TODO_cleanup_cfg;
+	    }
+	}
+      /* If outer loop vectorization fails for LOOP_VECTORIZED guarded
+	 loop, don't vectorize its inner loop; we'll attempt to
+	 vectorize LOOP_VECTORIZED guarded inner loop of the scalar
+	 loop version.  */
+      if (loop_vectorized_call && loop->inner)
+	loop->inner->dont_vectorize = true;
+      return ret;
+    }
+
+  if (!dbg_cnt (vect_loop))
+    {
+      /* Free existing information if loop is analyzed with some
+	 assumptions.  */
+      if (loop_constraint_set_p (loop, LOOP_C_FINITE))
+	vect_free_loop_info_assumptions (loop);
+      return ret;
+    }
+
+  if (loop_vectorized_call)
+    set_uid_loop_bbs (loop_vinfo, loop_vectorized_call);
+
+  unsigned HOST_WIDE_INT bytes;
+  if (current_vector_size.is_constant (&bytes))
+    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
+		     "loop vectorized vectorized using "
+		     HOST_WIDE_INT_PRINT_UNSIGNED " byte "
+		     "vectors\n", bytes);
+  else
+    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
+		     "loop vectorized using variable length vectors\n");
+
+  loop_p new_loop = vect_transform_loop (loop_vinfo);
+  (*num_vectorized_loops)++;
+  /* Now that the loop has been vectorized, allow it to be unrolled
+     etc.  */
+  loop->force_vectorize = false;
+
+  if (loop->simduid)
+    {
+      simduid_to_vf *simduid_to_vf_data = XNEW (simduid_to_vf);
+      if (!simduid_to_vf_htab)
+	simduid_to_vf_htab = new hash_table<simduid_to_vf> (15);
+      simduid_to_vf_data->simduid = DECL_UID (loop->simduid);
+      simduid_to_vf_data->vf = loop_vinfo->vectorization_factor;
+      *simduid_to_vf_htab->find_slot (simduid_to_vf_data, INSERT)
+	= simduid_to_vf_data;
+    }
+
+  if (loop_vectorized_call)
+    {
+      fold_loop_internal_call (loop_vectorized_call, boolean_true_node);
+      loop_vectorized_call = NULL;
+      ret |= TODO_cleanup_cfg;
+    }
+  if (loop_dist_alias_call)
+    {
+      tree value = gimple_call_arg (loop_dist_alias_call, 1);
+      fold_loop_internal_call (loop_dist_alias_call, value);
+      loop_dist_alias_call = NULL;
+      ret |= TODO_cleanup_cfg;
+    }
+
+  /* Epilogue of vectorized loop must be vectorized too.  */
+  if (new_loop)
+    ret |= try_vectorize_loop_1 (simduid_to_vf_htab, num_vectorized_loops,
+				 new_loop, loop_vinfo, NULL, NULL);
+
+  return ret;
+}
+
+/* Try to vectorize LOOP.  */
+
+static unsigned
+try_vectorize_loop (hash_table<simduid_to_vf> *&simduid_to_vf_htab,
+		    unsigned *num_vectorized_loops, loop_p loop)
+{
+  if (!((flag_tree_loop_vectorize
	 && optimize_loop_nest_for_speed_p (loop))
+	|| loop->force_vectorize))
+    return 0;
+
+  return try_vectorize_loop_1 (simduid_to_vf_htab, num_vectorized_loops,
+			       loop, NULL,
+			       vect_loop_vectorized_call (loop),
+			       vect_loop_dist_alias_call (loop));
+}
+
+
 /* Function vectorize_loops.
 
    Entry point to loop vectorization phase.  */
@@ -674,7 +826,6 @@ vectorize_loops (void)
   hash_table<simd_array_to_simduid> *simd_array_to_simduid_htab = NULL;
   bool any_ifcvt_loops = false;
   unsigned ret = 0;
-  struct loop *new_loop;
 
   vect_loops_num = number_of_loops (cfun);
 
@@ -727,157 +878,18 @@ vectorize_loops (void)
 		    = get_loop (cfun, tree_to_shwi (arg));
 		  if (vector_loop && vector_loop != loop)
 		    {
-		      loop = vector_loop;
 		      /* Make sure we don't vectorize it twice.  */
-		      loop->dont_vectorize = true;
-		      goto try_vectorize;
+		      vector_loop->dont_vectorize = true;
+		      ret |= try_vectorize_loop (simduid_to_vf_htab,
+						 &num_vectorized_loops,
+						 vector_loop);
 		    }
 		}
 	    }
 	}
       else
-	{
-	  loop_vec_info loop_vinfo, orig_loop_vinfo;
-	  gimple *loop_vectorized_call, *loop_dist_alias_call;
-       try_vectorize:
-	  if (!((flag_tree_loop_vectorize
-		 && optimize_loop_nest_for_speed_p (loop))
-		|| loop->force_vectorize))
-	    continue;
-	  orig_loop_vinfo = NULL;
-	  loop_vectorized_call = vect_loop_vectorized_call (loop);
-	  loop_dist_alias_call = vect_loop_dist_alias_call (loop);
-       vectorize_epilogue:
-	  vect_location = find_loop_location (loop);
-	  if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
-	      && dump_enabled_p ())
-	    dump_printf (MSG_NOTE, "\nAnalyzing loop at %s:%d\n",
-			 LOCATION_FILE (vect_location),
-			 LOCATION_LINE (vect_location));
-
-	  loop_vinfo = vect_analyze_loop (loop, orig_loop_vinfo);
-	  loop->aux = loop_vinfo;
-
-	  if (!loop_vinfo || !LOOP_VINFO_VECTORIZABLE_P (loop_vinfo))
-	    {
-	      /* Free existing information if loop is analyzed with some
-		 assumptions.  */
-	      if (loop_constraint_set_p (loop, LOOP_C_FINITE))
-		vect_free_loop_info_assumptions (loop);
-
-	      /* If we applied if-conversion then try to vectorize the
-		 BB of innermost loops.
-		 ??? Ideally BB vectorization would learn to vectorize
-		 control flow by applying if-conversion on-the-fly, the
-		 following retains the if-converted loop body even when
-		 only non-if-converted parts took part in BB vectorization.  */
-	      if (flag_tree_slp_vectorize != 0
-		  && loop_vectorized_call
-		  && ! loop->inner)
-		{
-		  basic_block bb = loop->header;
-		  bool has_mask_load_store = false;
-		  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
-		       !gsi_end_p (gsi); gsi_next (&gsi))
-		    {
-		      gimple *stmt = gsi_stmt (gsi);
-		      if (is_gimple_call (stmt)
-			  && gimple_call_internal_p (stmt)
-			  && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
-			      || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
-			{
-			  has_mask_load_store = true;
-			  break;
-			}
-		      gimple_set_uid (stmt, -1);
-		      gimple_set_visited (stmt, false);
-		    }
-		  if (! has_mask_load_store && vect_slp_bb (bb))
-		    {
-		      dump_printf_loc (MSG_NOTE, vect_location,
-				       "basic block vectorized\n");
-		      fold_loop_internal_call (loop_vectorized_call,
-					       boolean_true_node);
-		      loop_vectorized_call = NULL;
-		      ret |= TODO_cleanup_cfg;
-		    }
-		}
-	      /* If outer loop vectorization fails for LOOP_VECTORIZED guarded
-		 loop, don't vectorize its inner loop; we'll attempt to
-		 vectorize LOOP_VECTORIZED guarded inner loop of the scalar
-		 loop version.  */
-	      if (loop_vectorized_call && loop->inner)
-		loop->inner->dont_vectorize = true;
-	      continue;
-	    }
-
-	  if (!dbg_cnt (vect_loop))
-	    {
-	      /* We may miss some if-converted loops due to
-		 debug counter.  Set any_ifcvt_loops to visit
-		 them at finalization.  */
-	      any_ifcvt_loops = true;
-	      /* Free existing information if loop is analyzed with some
-		 assumptions.  */
-	      if (loop_constraint_set_p (loop, LOOP_C_FINITE))
-		vect_free_loop_info_assumptions (loop);
-
-	      break;
-	    }
-
-	  if (loop_vectorized_call)
-	    set_uid_loop_bbs (loop_vinfo, loop_vectorized_call);
-
-	  unsigned HOST_WIDE_INT bytes;
-	  if (current_vector_size.is_constant (&bytes))
-	    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
-			     "loop vectorized vectorized using "
-			     HOST_WIDE_INT_PRINT_UNSIGNED " byte "
-			     "vectors\n", bytes);
-	  else
-	    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
-			     "loop vectorized using variable length vectors\n");
-
-	  new_loop = vect_transform_loop (loop_vinfo);
-	  num_vectorized_loops++;
-	  /* Now that the loop has been vectorized, allow it to be unrolled
-	     etc.  */
-	  loop->force_vectorize = false;
-
-	  if (loop->simduid)
-	    {
-	      simduid_to_vf *simduid_to_vf_data = XNEW (simduid_to_vf);
-	      if (!simduid_to_vf_htab)
-		simduid_to_vf_htab = new hash_table<simduid_to_vf> (15);
-	      simduid_to_vf_data->simduid = DECL_UID (loop->simduid);
-	      simduid_to_vf_data->vf = loop_vinfo->vectorization_factor;
-	      *simduid_to_vf_htab->find_slot (simduid_to_vf_data, INSERT)
-		= simduid_to_vf_data;
-	    }
-
-	  if (loop_vectorized_call)
-	    {
-	      fold_loop_internal_call (loop_vectorized_call, boolean_true_node);
-	      loop_vectorized_call = NULL;
-	      ret |= TODO_cleanup_cfg;
-	    }
-	  if (loop_dist_alias_call)
-	    {
-	      tree value = gimple_call_arg (loop_dist_alias_call, 1);
-	      fold_loop_internal_call (loop_dist_alias_call, value);
-	      loop_dist_alias_call = NULL;
-	      ret |= TODO_cleanup_cfg;
-	    }
-
-	  if (new_loop)
-	    {
-	      /* Epilogue of vectorized loop must be vectorized too.  */
-	      vect_loops_num = number_of_loops (cfun);
-	      loop = new_loop;
-	      orig_loop_vinfo = loop_vinfo; /* To pass vect_analyze_loop.  */
-	      goto vectorize_epilogue;
-	    }
-	}
+	ret |= try_vectorize_loop (simduid_to_vf_htab, &num_vectorized_loops,
+				   loop);
 
       vect_location = UNKNOWN_LOCATION;
 
@@ -914,18 +926,16 @@ vectorize_loops (void)
 	  }
       }
 
-  for (i = 1; i < vect_loops_num; i++)
+  for (i = 1; i < number_of_loops (cfun); i++)
     {
       loop_vec_info loop_vinfo;
      bool has_mask_store;
       loop = get_loop (cfun, i);
-      if (!loop)
+      if (!loop || !loop->aux)
 	continue;
       loop_vinfo = (loop_vec_info) loop->aux;
-      has_mask_store = false;
-      if (loop_vinfo)
-	has_mask_store = LOOP_VINFO_HAS_MASK_STORE (loop_vinfo);
+      has_mask_store = LOOP_VINFO_HAS_MASK_STORE (loop_vinfo);
       delete loop_vinfo;
       if (has_mask_store
 	  && targetm.vectorize.empty_mask_is_expensive (IFN_MASK_STORE))