From patchwork Fri Aug 6 07:04:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1514197
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] [rtl-optimization] Simplify vector shift/rotate with
 const_vec_duplicate to vector shift/rotate with const_int element.
Date: Fri, 6 Aug 2021 15:04:50 +0800
Message-Id: <20210806070450.1168329-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.27.0
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: liuhongt via Gcc-patches
From: "Liu, Hongtao"
Reply-To: liuhongt
Cc: richard.sandiford@arm.com, segher@kernel.crashing.org

Hi:
  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for trunk?

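For reviewers who want the intent without reading RTL, the rewrite can be modelled in plain C. This is only a sketch under stated assumptions, not the simplify-rtx.c code: all_lanes_equal, lshiftrt_by_vector, lshiftrt_by_scalar and simplified_lshiftrt are hypothetical names, the model covers only a logical right shift on sixteen 32-bit lanes, and every count is assumed to be smaller than the element width.

```c
#include <stddef.h>
#include <stdint.h>

#define NELTS 16   /* lanes in a 512-bit vector of 32-bit elements */

/* Hypothetical stand-in for const_vec_duplicate_p followed by
   unwrap_const_vec_duplicate: return 1 and store the common element in
   *ELT when every lane of COUNTS holds the same value, else return 0.  */
static int
all_lanes_equal (const uint32_t counts[NELTS], uint32_t *elt)
{
  for (size_t i = 1; i < NELTS; i++)
    if (counts[i] != counts[0])
      return 0;
  *elt = counts[0];
  return 1;
}

/* Logical right shift with one count per lane (the vpsrlvd shape).
   Counts are assumed to be below 32.  */
static void
lshiftrt_by_vector (uint32_t out[NELTS], const uint32_t in[NELTS],
                    const uint32_t counts[NELTS])
{
  for (size_t i = 0; i < NELTS; i++)
    out[i] = in[i] >> counts[i];
}

/* Logical right shift with a single count for all lanes (the vpsrld
   shape).  COUNT is assumed to be below 32.  */
static void
lshiftrt_by_scalar (uint32_t out[NELTS], const uint32_t in[NELTS],
                    uint32_t count)
{
  for (size_t i = 0; i < NELTS; i++)
    out[i] = in[i] >> count;
}

/* Model of the rewrite: use the single-count form when COUNTS is a
   duplicate, otherwise keep the per-lane form.  Return 1 if the
   single-count form was used.  */
static int
simplified_lshiftrt (uint32_t out[NELTS], const uint32_t in[NELTS],
                     const uint32_t counts[NELTS])
{
  uint32_t elt;

  if (all_lanes_equal (counts, &elt))
    {
      lshiftrt_by_scalar (out, in, elt);
      return 1;
    }
  lshiftrt_by_vector (out, in, counts);
  return 0;
}
```

With a count array holding sixteen copies of 3, simplified_lshiftrt takes the single-count path and yields the same lanes as lshiftrt_by_vector. That equivalence is what lets the new test expect vpsrld rather than vpbroadcast followed by vpsrlvd.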
gcc/ChangeLog:

        PR rtl-optimization/101796
        * simplify-rtx.c (simplify_context::simplify_binary_operation_1):
        Simplify vector shift/rotate with const_vec_duplicate to vector
        shift/rotate with const_int element.

gcc/testsuite/ChangeLog:

        PR rtl-optimization/101796
        * gcc.target/i386/pr101796.c: New test.
---
 gcc/simplify-rtx.c                       | 15 ++++++
 gcc/testsuite/gcc.target/i386/pr101796.c | 65 ++++++++++++++++++++++++
 2 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101796.c

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index a719f57870f..75f3e455562 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3970,6 +3970,21 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
             return simplify_gen_binary (code, mode, op0,
                                         gen_int_shift_amount (mode, val));
         }
+
+      /* Optimize vector shift/rotate with const_vec_duplicate
+         to vector shift/rotate with const_int element.  */
+      /* TODO: vec_duplicate with a variable can also be simplified,
+         but GCC only requires operand 2 of shift/rotate to be a scalar
+         type, which can have different modes in different backends.
+         That makes it difficult for the simplification to decide which
+         mode should be chosen for the shift/rotate count.  */
+      if ((code == ASHIFTRT || code == LSHIFTRT
+           || code == ASHIFT || code == ROTATERT
+           || code == ROTATE)
+          && const_vec_duplicate_p (op1))
+        return simplify_gen_binary (code, mode, op0,
+                                    unwrap_const_vec_duplicate (op1));
+
       break;
 
     case ASHIFT:
diff --git a/gcc/testsuite/gcc.target/i386/pr101796.c b/gcc/testsuite/gcc.target/i386/pr101796.c
new file mode 100644
index 00000000000..c22d6267fe5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101796.c
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2 " } */
+/* { dg-final { scan-assembler-not "vpbroadcast" } } */
+/* { dg-final { scan-assembler-not "vpsrlv\[dwq\]" } } */
+/* { dg-final { scan-assembler-not "vpsllv\[dwq\]" } } */
+/* { dg-final { scan-assembler-not "vpsrav\[dwq\]" } } */
+/* { dg-final { scan-assembler-times "vpsrl\[dwq\]" 3 } } */
+/* { dg-final { scan-assembler-times "vpsll\[dwq\]" 3 } } */
+/* { dg-final { scan-assembler-times "vpsra\[dwq\]" 3 } } */
+
+#include <immintrin.h>
+
+__m512i
+foo (__m512i a)
+{
+  return _mm512_srlv_epi16 (a, _mm512_set1_epi16 (3));
+}
+
+__m512i
+foo1 (__m512i a)
+{
+  return _mm512_srlv_epi32 (a, _mm512_set1_epi32 (3));
+}
+
+__m512i
+foo2 (__m512i a, long long b)
+{
+  return _mm512_srlv_epi64 (a, _mm512_set1_epi64 (3));
+}
+
+__m512i
+foo3 (__m512i a)
+{
+  return _mm512_srav_epi16 (a, _mm512_set1_epi16 (3));
+}
+
+__m512i
+foo4 (__m512i a)
+{
+  return _mm512_srav_epi32 (a, _mm512_set1_epi32 (3));
+}
+
+__m512i
+foo5 (__m512i a, long long b)
+{
+  return _mm512_srav_epi64 (a, _mm512_set1_epi64 (3));
+}
+
+__m512i
+foo6 (__m512i a)
+{
+  return _mm512_sllv_epi16 (a, _mm512_set1_epi16 (3));
+}
+
+__m512i
+foo7 (__m512i a)
+{
+  return _mm512_sllv_epi32 (a, _mm512_set1_epi32 (3));
+}
+
+__m512i
+foo8 (__m512i a, long long b)
+{
+  return _mm512_sllv_epi64 (a, _mm512_set1_epi64 (3));
+}