From patchwork Thu Oct 26 02:18:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1855583
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com,
 kito.cheng@gmail.com, hongtao.liu@intel.com, richard.guenther@gmail.com
Subject: [PATCH v2] VECT: Remove the type size restriction of vectorizer
Date: Thu, 26 Oct 2023 10:18:16 +0800
Message-Id: <20231026021816.1633907-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231018012009.849697-1-pan2.li@intel.com>
References: <20231018012009.849697-1-pan2.li@intel.com>
List-Id: Gcc-patches mailing list

From: Pan Li

Update in v2:

* Fix one ICE from a type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

vectorizable_call has a restriction on the size of the data types:
DF to DI is allowed, but SF to DI is not. You may see the message
below when trying to vectorize a function call like lrintf.

  void
  test_lrintf (long *out, float *in, unsigned count)
  {
    for (unsigned i = 0; i < count; i++)
      out[i] = __builtin_lrintf (in[i]);
  }

  lrintf.c:5:26: missed: couldn't vectorize loop
  lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work when
the data type sizes differ, as in SF => DI. This patch removes the
data type size check and unblocks standard names like lrintmn2.

The tests below pass with this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.

gcc/ChangeLog:

	* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
	* tree-vect-stmts.cc (vectorizable_call): Remove size check.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
	* gcc.target/aarch64/sve/clz_1.c: Ditto.
	* gcc.target/aarch64/sve/popcount_1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.

Signed-off-by: Pan Li
---
 gcc/internal-fn.cc                                  |  3 ++-
 gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c      |  3 +--
 gcc/testsuite/gcc.target/aarch64/sve/clz_1.c        |  3 +--
 gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
 .../gcc.target/riscv/rvv/autovec/unop/popcount.c    |  2 +-
 gcc/tree-vect-stmts.cc                              | 13 -------------
 6 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 61d5a9e4772..17c0f4c3805 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
         emit_move_insn (lhs_rtx, ops[0].value);
       else
         {
-          gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
+          gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+                               || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));
           convert_move (lhs_rtx, ops[0].value, 0);
         }
     }
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
index bdc9856faaf..940d08bbc7b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
@@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
 }
 
 /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
index 0c7a4e6d768..58b8ff406d2 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
@@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
 }
 
 /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
index dfb6f4ac7a5..0eba898307c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
@@ -18,5 +18,4 @@ popcount_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
 }
 
 /* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
index 585a522aa81..e6e3c70f927 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
@@ -1461,4 +1461,4 @@ main ()
   RUN_ALL ()
 }
 
-/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 229 "vect" } } */
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 384 "vect" } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..fa4ca0634e8 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3361,19 +3361,6 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
-     just mixtures of nunits. E.g. DI->SI versions of __builtin_ctz*
-     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
-     by a pack of the two vectors into an SI vector. We would need
-     separate code to handle direct VnDI->VnSI IFN_CTZs. */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mismatched vector sizes %T and %T\n",
-                         vectype_in, vectype_out);
-      return false;
-    }
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
       != VECTOR_BOOLEAN_TYPE_P (vectype_in))
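
For reference, the DI->SI shape described by the removed FORNOW comment
(64-bit inputs, 32-bit results) can be sketched in plain C. This is a
minimal illustration and not part of the patch. Only the popcount_64
signature is taken from the popcount_1.c hunk header above; the loop
body and the helpers popcount_ref and check_popcount_64 are assumptions
made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Narrowing popcount loop: 64-bit inputs, 32-bit outputs (DI -> SI).
   The signature matches the hunk header of popcount_1.c; the body is
   an assumption for illustration, since the patch does not show it.  */
void
popcount_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
{
  for (int i = 0; i < size; ++i)
    dst[i] = __builtin_popcountll (src[i]);
}

/* Hypothetical reference helper: count set bits one at a time.  */
static unsigned int
popcount_ref (uint64_t x)
{
  unsigned int n = 0;
  for (; x != 0; x >>= 1)
    n += (unsigned int) (x & 1);
  return n;
}

/* Hypothetical checker: whatever code the vectorizer emits for
   popcount_64, its results must agree with the scalar reference.  */
void
check_popcount_64 (void)
{
  uint64_t src[4] = { 0, 1, 0xffffffffffffffffULL, 0x8000000000000001ULL };
  unsigned int dst[4];

  popcount_64 (dst, src, 4);
  for (int i = 0; i < 4; ++i)
    assert (dst[i] == popcount_ref (src[i]));
}
```

Per the removed comment, a loop of this shape was traditionally
vectorized as two VnDI->VnDI operations followed by a pack into an SI
vector. The adjusted aarch64 checkers above now expect a single .d
instruction and no uzp1.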