From patchwork Fri Mar 26 16:15:46 2021
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1458873
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 05/13] aarch64: Add costs for one element of a scatter store
Date: Fri, 26 Mar 2021 16:15:46 +0000
In-Reply-To: (Richard Sandiford's message of "Fri, 26 Mar 2021 16:12:42 +0000")
From: Richard Sandiford
Reply-To: Richard Sandiford

Currently each element in a gather load is costed as a scalar_load
and each element in a scatter store is costed as a scalar_store.
The load side seems to work pretty well in practice, since many
CPU-specific costs give loads quite a high cost relative to
arithmetic operations.  However, stores usually have a cost
of just 1, which means that scatters tend to appear too cheap.

This patch adds a separate cost for one element in a scatter store.

Like with the previous patches, this one only becomes active if
a CPU selects use_new_vector_costs.
It should therefore have a very low impact on other CPUs.

gcc/
	* config/aarch64/aarch64-protos.h
	(sve_vec_cost::scatter_store_elt_cost): New member variable.
	* config/aarch64/aarch64.c (generic_sve_vector_cost): Update
	accordingly, taking the cost from the cost of a scalar_store.
	(a64fx_sve_vector_cost): Likewise.
	(aarch64_detect_vector_stmt_subtype): Detect scatter stores.
---
 gcc/config/aarch64/aarch64-protos.h |  9 +++++++--
 gcc/config/aarch64/aarch64.c        | 13 +++++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index fabe3df7071..2ffa96ec24b 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -256,12 +256,14 @@ struct sve_vec_cost : simd_vec_cost
                           unsigned int clast_cost,
                           unsigned int fadda_f16_cost,
                           unsigned int fadda_f32_cost,
-                          unsigned int fadda_f64_cost)
+                          unsigned int fadda_f64_cost,
+                          unsigned int scatter_store_elt_cost)
     : simd_vec_cost (base),
       clast_cost (clast_cost),
       fadda_f16_cost (fadda_f16_cost),
       fadda_f32_cost (fadda_f32_cost),
-      fadda_f64_cost (fadda_f64_cost)
+      fadda_f64_cost (fadda_f64_cost),
+      scatter_store_elt_cost (scatter_store_elt_cost)
   {}
 
   /* The cost of a vector-to-scalar CLASTA or CLASTB instruction,
@@ -274,6 +276,9 @@ struct sve_vec_cost : simd_vec_cost
   const int fadda_f16_cost;
   const int fadda_f32_cost;
   const int fadda_f64_cost;
+
+  /* The per-element cost of a scatter store.  */
+  const int scatter_store_elt_cost;
 };
 
 /* Cost for vector insn classes.  */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 20bb75bd56c..7f727413d01 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -638,7 +638,8 @@ static const sve_vec_cost generic_sve_vector_cost =
   2, /* clast_cost  */
   2, /* fadda_f16_cost  */
   2, /* fadda_f32_cost  */
-  2 /* fadda_f64_cost  */
+  2, /* fadda_f64_cost  */
+  1 /* scatter_store_elt_cost  */
 };
 
 /* Generic costs for vector insn classes.  */
@@ -705,7 +706,8 @@ static const sve_vec_cost a64fx_sve_vector_cost =
   13, /* clast_cost  */
   13, /* fadda_f16_cost  */
   13, /* fadda_f32_cost  */
-  13 /* fadda_f64_cost  */
+  13, /* fadda_f64_cost  */
+  1 /* scatter_store_elt_cost  */
 };
 
 static const struct cpu_vector_cost a64fx_vector_cost =
@@ -14279,6 +14281,13 @@ aarch64_detect_vector_stmt_subtype (vec_info *vinfo, vect_cost_for_stmt kind,
       && DR_IS_WRITE (STMT_VINFO_DATA_REF (stmt_info)))
     return simd_costs->store_elt_extra_cost;
 
+  /* Detect cases in which a scalar_store is really storing one element
+     in a scatter operation.  */
+  if (kind == scalar_store
+      && sve_costs
+      && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+    return sve_costs->scatter_store_elt_cost;
+
   /* Detect cases in which vec_to_scalar represents an in-loop reduction.  */
   if (kind == vec_to_scalar
       && where == vect_body
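
Illustration (not part of the patch): the new check in
aarch64_detect_vector_stmt_subtype can be sketched as a standalone
function.  This is a minimal sketch under stated assumptions; StmtKind,
AccessType, SveCosts and detect_subtype_cost are hypothetical stand-ins
for GCC's vect_cost_for_stmt, vect_memory_access_type, sve_vec_cost and
the real hook, and the cost value used in the usage note is invented
for the example rather than taken from any CPU tuning.

```cpp
// Hypothetical, simplified stand-ins for GCC-internal types (assumptions,
// not the real declarations from the vectorizer or aarch64-protos.h).
enum class StmtKind { ScalarLoad, ScalarStore, VectorStmt };
enum class AccessType { Contiguous, GatherScatter };

// Only the member added by the patch is modelled here.
struct SveCosts {
  int scatter_store_elt_cost;  // per-element cost of a scatter store
};

// Mirrors the shape of the new check: when a scalar_store is really one
// element of a scatter operation and SVE costs are available, return the
// dedicated per-element cost.  Otherwise keep the cost that was passed in.
int detect_subtype_cost(StmtKind kind, AccessType access,
                        const SveCosts *sve_costs, int stmt_cost) {
  if (kind == StmtKind::ScalarStore
      && sve_costs != nullptr
      && access == AccessType::GatherScatter)
    return sve_costs->scatter_store_elt_cost;
  return stmt_cost;
}
```

With a hypothetical tuning of SveCosts{4}, a scatter element that would
otherwise be costed as a scalar_store of 1 is costed as 4, while
contiguous stores, loads, and targets without SVE costs (a null pointer
here) keep their incoming cost.  The tables in the patch set the new
field to 1, matching the existing scalar_store cost, so behaviour
changes only once a CPU supplies a different value.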