From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Subject: [committed][AArch64] Add truncation for partial SVE modes
Date: Sat, 16 Nov 2019 11:16:10 +0000
X-Patchwork-Id: 1196087

This patch adds support for "truncating" to a partial SVE vector from
either a full SVE vector or a wider partial vector.  This truncation is
actually a no-op and so should have zero cost in the vector cost model.

Tested on aarch64-linux-gnu and applied as r278344.

Richard


2019-11-16  Richard Sandiford

gcc/
	* config/aarch64/aarch64-sve.md (trunc2): New pattern.
	* config/aarch64/aarch64.c (aarch64_integer_truncation_p): New
	function.
	(aarch64_sve_adjust_stmt_cost): Call it.

gcc/testsuite/
	* gcc.target/aarch64/sve/mask_struct_load_1.c: Add
	--param aarch64-sve-compare-costs=0.
	* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
	* gcc.target/aarch64/sve/pack_1.c: Likewise.
	* gcc.target/aarch64/sve/truncate_1.c: New test.

Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md	2019-11-16 11:11:42.929267513 +0000
+++ gcc/config/aarch64/aarch64-sve.md	2019-11-16 11:13:24.236550470 +0000
@@ -72,6 +72,7 @@
 ;; ---- [INT] General unary arithmetic corresponding to rtx codes
 ;; ---- [INT] General unary arithmetic corresponding to unspecs
 ;; ---- [INT] Sign and zero extension
+;; ---- [INT] Truncation
 ;; ---- [INT] Logical inverse
 ;; ---- [FP<-INT] General unary arithmetic that maps to unspecs
 ;; ---- [FP] General unary arithmetic corresponding to unspecs
@@ -2889,6 +2890,29 @@ (define_insn "*cond_uxt_any"
 )
 
 ;; -------------------------------------------------------------------------
+;; ---- [INT] Truncation
+;; -------------------------------------------------------------------------
+;; The patterns in this section are synthetic.
+;; -------------------------------------------------------------------------
+
+;; Truncate to a partial SVE vector from either a full vector or a
+;; wider partial vector.  This is a no-op, because we can just ignore
+;; the unused upper bits of the source.
+(define_insn_and_split "trunc2"
+  [(set (match_operand:SVE_PARTIAL_I 0 "register_operand" "=w")
+        (truncate:SVE_PARTIAL_I
+          (match_operand:SVE_HSDI 1 "register_operand" "w")))]
+  "TARGET_SVE && (~ & ) == 0"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 1))]
+  {
+    operands[1] = aarch64_replace_reg_mode (operands[1],
+                                            mode);
+  }
+)
+
+;; -------------------------------------------------------------------------
 ;; ---- [INT] Logical inverse
 ;; -------------------------------------------------------------------------
 ;; Includes:
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-11-16 11:11:42.933267485 +0000
+++ gcc/config/aarch64/aarch64.c	2019-11-16 11:13:24.236550470 +0000
@@ -12901,6 +12901,21 @@ aarch64_extending_load_p (stmt_vec_info
           && DR_IS_READ (STMT_VINFO_DATA_REF (def_stmt_info)));
 }
 
+/* Return true if STMT_INFO is an integer truncation.  */
+static bool
+aarch64_integer_truncation_p (stmt_vec_info stmt_info)
+{
+  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
+  if (!assign || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (assign)))
+    return false;
+
+  tree lhs_type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
+  return (INTEGRAL_TYPE_P (lhs_type)
+          && INTEGRAL_TYPE_P (rhs_type)
+          && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type));
+}
+
 /* STMT_COST is the cost calculated by aarch64_builtin_vectorization_cost
    for STMT_INFO, which has cost kind KIND.  Adjust the cost as necessary
    for SVE targets.  */
@@ -12919,6 +12934,11 @@ aarch64_sve_adjust_stmt_cost (vect_cost_
   if (kind == vector_stmt && aarch64_extending_load_p (stmt_info))
     stmt_cost = 0;
 
+  /* For similar reasons, vector_stmt integer truncations are a no-op,
+     because we can just ignore the unused upper bits of the source.  */
+  if (kind == vector_stmt && aarch64_integer_truncation_p (stmt_info))
+    stmt_cost = 0;
+
   return stmt_cost;
 }
 
Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_1.c	2019-03-08 18:14:29.768994780 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_1.c	2019-11-16 11:13:24.236550470 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
 
 #include
 
Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_2.c	2019-03-08 18:14:29.772994767 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_2.c	2019-11-16 11:13:24.236550470 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
 
 #include
 
Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3.c	2019-03-08 18:14:29.772994767 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3.c	2019-11-16 11:13:24.236550470 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
 
 #include
 
Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_4.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_4.c	2019-03-08 18:14:29.776994751 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_4.c	2019-11-16 11:13:24.240550442 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
 
 #include
 
Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c	2019-03-08 18:14:29.784994721 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c	2019-11-16 11:13:24.240550442 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
 
 #include
 
Index: gcc/testsuite/gcc.target/aarch64/sve/pack_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/pack_1.c	2019-03-08 18:14:29.768994780 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/pack_1.c	2019-11-16 11:13:24.240550442 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize" } */
+/* { dg-options "-O2 -ftree-vectorize --param aarch64-sve-compare-costs=0" } */
 
 #include
 
Index: gcc/testsuite/gcc.target/aarch64/sve/truncate_1.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/truncate_1.c	2019-11-16 11:13:24.240550442 +0000
@@ -0,0 +1,44 @@
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define TEST_LOOP(TYPE1, TYPE2, SHIFT)                                 \
+  void                                                                 \
+  f_##TYPE1##_##TYPE2 (TYPE2 *restrict dst, TYPE1 *restrict src1,      \
+                       TYPE1 *restrict src2, int n)                    \
+  {                                                                    \
+    for (int i = 0; i < n; ++i)                                        \
+      dst[i] = (TYPE1) (src1[i] + src2[i]) >> SHIFT;                   \
+  }
+
+#define TEST_ALL(T) \
+  T (uint16_t, uint8_t, 2) \
+  T (uint32_t, uint8_t, 18) \
+  T (uint64_t, uint8_t, 34) \
+  T (uint32_t, uint16_t, 3) \
+  T (uint64_t, uint16_t, 19) \
+  T (uint64_t, uint32_t, 4)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h,} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s,} 4 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d,} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.h, z[0-9]+\.h, #2\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, z[0-9]+\.s, #18\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, z[0-9]+\.d, #34\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, z[0-9]+\.s, #3\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, z[0-9]+\.d, #19\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, z[0-9]+\.d, #4\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.h,} 1 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s,} 1 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.d,} 1 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s,} 1 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.d,} 1 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.d,} 1 } } */
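
As an aside for readers, here is a rough scalar model of why the
truncation described above can be treated as a no-op.  It is only a
sketch and not part of the patch: the names NUM_LANES, model_truncate,
model_store_low_bytes and truncation_is_noop are hypothetical, and the
model assumes a partial vector keeps each 8-bit element in the low bits
of a 32-bit container, with a narrowing store (in the spirit of st1b on
.s containers) writing only the low byte of each container.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: four 32-bit containers, each holding one element.
   For the "partial" view only the low 8 bits of a container matter.  */
enum { NUM_LANES = 4 };

/* Model of the truncation: the register contents are copied unchanged;
   only the interpretation of the containers differs afterwards.  */
static void
model_truncate (uint32_t dst[NUM_LANES], const uint32_t src[NUM_LANES])
{
  memcpy (dst, src, NUM_LANES * sizeof (uint32_t));
}

/* Model of a narrowing store: write only the low byte of each
   container and ignore the upper bits.  */
static void
model_store_low_bytes (uint8_t *out, const uint32_t reg[NUM_LANES])
{
  assert (out != NULL);
  for (int i = 0; i < NUM_LANES; ++i)
    out[i] = (uint8_t) (reg[i] & 0xff);
}

/* Return 1 if ignoring the upper bits yields the same bytes as an
   explicit per-element C truncation of SRC, 0 otherwise.  */
static int
truncation_is_noop (const uint32_t src[NUM_LANES])
{
  uint32_t reg[NUM_LANES];
  uint8_t stored[NUM_LANES];

  model_truncate (reg, src);
  model_store_low_bytes (stored, reg);
  for (int i = 0; i < NUM_LANES; ++i)
    if (stored[i] != (uint8_t) src[i])
      return 0;
  return 1;
}
```

In this model the "truncate" step moves no bits at all, which is the
intuition behind giving such statements zero cost; the real pattern and
cost hook are the ones in the diff above.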