From patchwork Fri Oct 25 12:43:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1184128 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-511759-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="knSIq+l5"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4703hJ39cdz9sS3 for ; Fri, 25 Oct 2019 23:43:59 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=default; b=Tty1VQWyllefgkgTT+uSkn99iQvAK UoDJgoMeb/YsbfQoFqlYPu9FzpzzCJQo5+T6sT6Etx6SwUIZvnDagbqkBV6NgNo0 hF8Wj7KUurAY7PuGo5FRcEXhiT0LtigMrSKmscRk/k0jkFzMy33IPdpyLO2Gr7ow mxW/Olu3/UsZXk= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:references:date:in-reply-to:message-id:mime-version :content-type; s=default; bh=adKKLCQR2YIEsO26/0RxYUl8LT8=; b=knS Iq+l51nGFH7KGw5ltD8uZ4ikspENYEFNUMdy3zWTBnCuumWfE8Hlc7k4Kl9Od1DU KHEH/9vIVGlyvXmUAhUut7UPQBD+kDrJ0BQD7LILN74+lhix4+pynOFekDLyTIuQ 2dDONnEX+zGGiJYs89aLpiguIc694ZRup/KoILU8= Received: (qmail 22273 invoked by alias); 25 Oct 2019 12:43:51 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 22261 invoked by uid 89); 25 Oct 2019 12:43:51 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-9.2 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS autolearn=ham version=3.3.1 spammy=acts, machmode.h, machmodeh, UD:machmode.h X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 25 Oct 2019 12:43:49 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E740C28 for ; Fri, 25 Oct 2019 05:43:47 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8CFCA3F718 for ; Fri, 25 Oct 2019 05:43:47 -0700 (PDT) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [11/n] Support vectorisation with mixed vector sizes References: Date: Fri, 25 Oct 2019 13:43:46 +0100 In-Reply-To: (Richard Sandiford's message of "Fri, 25 Oct 2019 13:30:08 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-IsSubscribed: yes After previous patches, it's now possible to make the vectoriser support multiple vector sizes in the same vector region, using related_vector_mode to pick the right vector mode for a given element mode. No port yet takes advantage of this, but I have a follow-on patch for AArch64. This patch also seemed like a good opportunity to add some more dump messages: one to make it clear which vector size/mode was being used when analysis passed or failed, and another to say when we've decided to skip a redundant vector size/mode. 2019-10-24 Richard Sandiford gcc/ * machmode.h (opt_machine_mode::operator==): New function. (opt_machine_mode::operator!=): Likewise. * tree-vectorizer.h (vec_info::vector_mode): Update comment. (get_related_vectype_for_scalar_type): Delete. (get_vectype_for_scalar_type_and_size): Declare. * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say whether analysis passed or failed, and with what vector modes. Use related_vector_mode to check whether trying a particular vector mode would be redundant with the autodetected mode, and print a dump message if we decide to skip it. * tree-vect-loop.c (vect_analyze_loop): Likewise. (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type_and_size. * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace with... (get_related_vectype_for_scalar_type): ...this new function. Take a starting/"prevailing" vector mode rather than a vector size. Take an optional nunits argument, with the same meaning as for related_vector_mode. Use related_vector_mode when not auto-detecting a mode, falling back to mode_for_vector if no target mode exists. (get_vectype_for_scalar_type): Update accordingly. (get_same_sized_vectype): Likewise. * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. Index: gcc/machmode.h =================================================================== --- gcc/machmode.h 2019-10-25 13:26:59.053879364 +0100 +++ gcc/machmode.h 2019-10-25 13:27:26.201687539 +0100 @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS) bool exists () const; template bool exists (U *) const; + bool operator== (const T &m) const { return m_mode == m; } + bool operator!= (const T &m) const { return m_mode != m; } + private: machine_mode m_mode; }; Index: gcc/tree-vectorizer.h =================================================================== --- gcc/tree-vectorizer.h 2019-10-25 13:27:19.317736181 +0100 +++ gcc/tree-vectorizer.h 2019-10-25 13:27:26.209687483 +0100 @@ -329,8 +329,9 @@ typedef std::pair vec_object /* Cost data used by the target cost model. */ void *target_cost_data; - /* If we've chosen a vector size for this vectorization region, - this is one mode that has such a size, otherwise it is VOIDmode. */ + /* The argument we should pass to related_vector_mode when looking up + the vector mode for a scalar mode, or VOIDmode if we haven't yet + made any decisions about which vector modes to use. */ machine_mode vector_mode; private: @@ -1595,8 +1596,9 @@ extern dump_user_location_t find_loop_lo extern bool vect_can_advance_ivs_p (loop_vec_info); /* In tree-vect-stmts.c. */ +extern tree get_related_vectype_for_scalar_type (machine_mode, tree, + poly_uint64 = 0); extern tree get_vectype_for_scalar_type (vec_info *, tree); -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64); extern tree get_mask_type_for_scalar_type (vec_info *, tree); extern tree get_same_sized_vectype (tree, tree); extern bool vect_get_loop_mask_type (loop_vec_info); Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c 2019-10-25 13:27:19.313736209 +0100 +++ gcc/tree-vect-slp.c 2019-10-25 13:27:26.205687511 +0100 @@ -3118,7 +3118,12 @@ vect_slp_bb_region (gimple_stmt_iterator && dbg_cnt (vect_slp)) { if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); + { + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis succeeded with vector mode" + " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode)); + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); + } bb_vinfo->shared->check_datarefs (); vect_schedule_slp (bb_vinfo); @@ -3138,6 +3143,13 @@ vect_slp_bb_region (gimple_stmt_iterator vectorized = true; } + else + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis failed with vector mode %s\n", + GET_MODE_NAME (bb_vinfo->vector_mode)); + } if (mode_i == 0) autodetected_vector_mode = bb_vinfo->vector_mode; @@ -3145,9 +3157,22 @@ vect_slp_bb_region (gimple_stmt_iterator delete bb_vinfo; if (mode_i < vector_modes.length () - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), - GET_MODE_SIZE (autodetected_vector_mode))) - mode_i += 1; + && VECTOR_MODE_P (autodetected_vector_mode) + && (related_vector_mode (vector_modes[mode_i], + GET_MODE_INNER (autodetected_vector_mode)) + == autodetected_vector_mode) + && (related_vector_mode (autodetected_vector_mode, + GET_MODE_INNER (vector_modes[mode_i])) + == vector_modes[mode_i])) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Skipping vector mode %s, which would" + " repeat the analysis for %s\n", + GET_MODE_NAME (vector_modes[mode_i]), + GET_MODE_NAME (autodetected_vector_mode)); + mode_i += 1; + } if (vectorized || mode_i == vector_modes.length () Index: gcc/tree-vect-loop.c =================================================================== --- gcc/tree-vect-loop.c 2019-10-25 13:27:19.309736237 +0100 +++ gcc/tree-vect-loop.c 2019-10-25 13:27:26.201687539 +0100 @@ -2367,6 +2367,17 @@ vect_analyze_loop (class loop *loop, loo opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts); if (mode_i == 0) autodetected_vector_mode = loop_vinfo->vector_mode; + if (dump_enabled_p ()) + { + if (res) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis succeeded with vector mode %s\n", + GET_MODE_NAME (loop_vinfo->vector_mode)); + else + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis failed with vector mode %s\n", + GET_MODE_NAME (loop_vinfo->vector_mode)); + } if (res) { @@ -2400,9 +2411,22 @@ vect_analyze_loop (class loop *loop, loo } if (mode_i < vector_modes.length () - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), - GET_MODE_SIZE (autodetected_vector_mode))) - mode_i += 1; + && VECTOR_MODE_P (autodetected_vector_mode) + && (related_vector_mode (vector_modes[mode_i], + GET_MODE_INNER (autodetected_vector_mode)) + == autodetected_vector_mode) + && (related_vector_mode (autodetected_vector_mode, + GET_MODE_INNER (vector_modes[mode_i])) + == vector_modes[mode_i])) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Skipping vector mode %s, which would" + " repeat the analysis for %s\n", + GET_MODE_NAME (vector_modes[mode_i]), + GET_MODE_NAME (autodetected_vector_mode)); + mode_i += 1; + } if (mode_i == vector_modes.length () || autodetected_vector_mode == VOIDmode) @@ -4763,7 +4787,10 @@ vect_create_epilog_for_reduction (stmt_v && (mode1 = targetm.vectorize.split_reduction (mode)) != mode) sz1 = GET_MODE_SIZE (mode1).to_constant (); - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1); + unsigned int scalar_bytes = tree_to_uhwi (TYPE_SIZE_UNIT (scalar_type)); + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), + scalar_type, + sz1 / scalar_bytes); reduce_with_shift = have_whole_vector_shift (mode1); if (!VECTOR_MODE_P (mode1)) reduce_with_shift = false; @@ -4781,7 +4808,9 @@ vect_create_epilog_for_reduction (stmt_v { gcc_assert (!slp_reduc); sz /= 2; - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), + scalar_type, + sz / scalar_bytes); /* The target has to make sure we support lowpart/highpart extraction, either via direct vector extract or through Index: gcc/tree-vect-stmts.c =================================================================== --- gcc/tree-vect-stmts.c 2019-10-25 13:27:22.985710263 +0100 +++ gcc/tree-vect-stmts.c 2019-10-25 13:27:26.205687511 +0100 @@ -11111,18 +11111,28 @@ vect_remove_stores (stmt_vec_info first_ } } -/* Function get_vectype_for_scalar_type_and_size. - - Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported - by the target. */ +/* If NUNITS is nonzero, return a vector type that contains NUNITS + elements of type SCALAR_TYPE, or null if the target doesn't support + such a type. + + If NUNITS is zero, return a vector type that contains elements of + type SCALAR_TYPE, choosing whichever vector size the target prefers. + + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode + for this vectorization region and want to "autodetect" the best choice. + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE + and we want the new type to be interoperable with it. PREVAILING_MODE + in this case can be a scalar integer mode or a vector mode; when it + is a vector mode, the function acts like a tree-level version of + related_vector_mode. */ tree -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size) +get_related_vectype_for_scalar_type (machine_mode prevailing_mode, + tree scalar_type, poly_uint64 nunits) { tree orig_scalar_type = scalar_type; scalar_mode inner_mode; machine_mode simd_mode; - poly_uint64 nunits; tree vectype; if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode) @@ -11162,10 +11172,11 @@ get_vectype_for_scalar_type_and_size (tr if (scalar_type == NULL_TREE) return NULL_TREE; - /* If no size was supplied use the mode the target prefers. Otherwise - lookup a vector mode of the specified size. */ - if (known_eq (size, 0U)) + /* If no prevailing mode was supplied, use the mode the target prefers. + Otherwise lookup a vector mode based on the prevailing mode. */ + if (prevailing_mode == VOIDmode) { + gcc_assert (known_eq (nunits, 0U)); simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode); if (SCALAR_INT_MODE_P (simd_mode)) { @@ -11181,9 +11192,19 @@ get_vectype_for_scalar_type_and_size (tr return NULL_TREE; } } - else if (!multiple_p (size, nbytes, &nunits) - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode)) - return NULL_TREE; + else if (SCALAR_INT_MODE_P (prevailing_mode) + || !related_vector_mode (prevailing_mode, + inner_mode, nunits).exists (&simd_mode)) + { + /* Fall back to using mode_for_vector, mostly in the hope of being + able to use an integer mode. */ + if (known_eq (nunits, 0U) + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits)) + return NULL_TREE; + + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)) + return NULL_TREE; + } vectype = build_vector_type_for_mode (scalar_type, simd_mode); @@ -11211,9 +11232,8 @@ get_vectype_for_scalar_type_and_size (tr tree get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type) { - tree vectype; - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode); - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size); + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode, + scalar_type); if (vectype && vinfo->vector_mode == VOIDmode) vinfo->vector_mode = TYPE_MODE (vectype); return vectype; @@ -11246,8 +11266,13 @@ get_same_sized_vectype (tree scalar_type if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) return truth_type_for (vector_type); - return get_vectype_for_scalar_type_and_size - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type))); + poly_uint64 nunits; + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)), + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits)) + return NULL_TREE; + + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type), + scalar_type, nunits); } /* Function vect_is_simple_use. Index: gcc/tree-vectorizer.c =================================================================== --- gcc/tree-vectorizer.c 2019-10-25 13:27:19.317736181 +0100 +++ gcc/tree-vectorizer.c 2019-10-25 13:27:26.209687483 +0100 @@ -1348,7 +1348,7 @@ get_vec_alignment_for_array_type (tree t poly_uint64 array_size, vector_size; tree scalar_type = strip_array_types (type); - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0); + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type); if (!vectype || !poly_int_tree_p (TYPE_SIZE (type), &array_size) || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)