From patchwork Thu Aug 31 08:20:19 2023
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1828152
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 08/13] [APX EGPR] Handle GPR16 only vector move insns
Date: Thu, 31 Aug 2023 16:20:19 +0800
Message-Id: <20230831082024.314097-9-hongyu.wang@intel.com>
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>
References: <20230831082024.314097-1-hongyu.wang@intel.com>
MIME-Version: 1.0
From: Hongyu Wang
Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz

For vector move insns like vmovdqa/vmovdqu, their EVEX counterparts
require an explicit suffix (64/32/16/8).  These suffixed forms are
unavailable unless AVX512F (or AVX10.1) is enabled, so for AVX2+APX_F
we select vmovaps/vmovups for vector load/store insns that contain an
EGPR.

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_get_ssemov): Check if egpr is used,
	adjust mnemonic for vmovdqu/vmovdqa.
	* config/i386/sse.md
	(*<extract_type>_vinsert<shuffletype><extract_suf>_0): Check if
	egpr is used, adjust mnemonic for vmovdqu/vmovdqa.
	(avx_vec_concat<mode>): Likewise, and separate alternative 0 to
	avx_noavx512f.
---
 gcc/config/i386/i386.cc | 31 ++++++++++++++++++++++++++++++-
 gcc/config/i386/sse.md  | 34 ++++++++++++++++++++++++----------
 2 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 412f3aefc43..f5d642948bc 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -5469,6 +5469,11 @@ ix86_get_ssemov (rtx *operands, unsigned size,
   bool evex_reg_p = (size == 64
 		     || EXT_REX_SSE_REG_P (operands[0])
 		     || EXT_REX_SSE_REG_P (operands[1]));
+
+  bool egpr_p = (TARGET_APX_EGPR
+		 && (x86_extended_rex2reg_mentioned_p (operands[0])
+		     || x86_extended_rex2reg_mentioned_p (operands[1])));
+
   machine_mode scalar_mode;
 
   const char *opcode = NULL;
@@ -5547,6 +5552,12 @@ ix86_get_ssemov (rtx *operands, unsigned size,
 			  ? "vmovdqu16"
 			  : "vmovdqu64")
 		      : "vmovdqa64");
+      else if (egpr_p)
+	opcode = (misaligned_p
+		  ? (TARGET_AVX512BW
+		     ? "vmovdqu16"
+		     : "%vmovups")
+		  : "%vmovaps");
       else
 	opcode = (misaligned_p
 		  ? (TARGET_AVX512BW
@@ -5563,6 +5574,8 @@ ix86_get_ssemov (rtx *operands, unsigned size,
     case E_TFmode:
       if (evex_reg_p)
 	opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+      else if (egpr_p)
+	opcode = misaligned_p ? "%vmovups" : "%vmovaps";
       else
 	opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
       break;
@@ -5581,6 +5594,12 @@ ix86_get_ssemov (rtx *operands, unsigned size,
 			  ? "vmovdqu8"
 			  : "vmovdqu64")
 		      : "vmovdqa64");
+      else if (egpr_p)
+	opcode = (misaligned_p
+		  ? (TARGET_AVX512BW
+		     ? "vmovdqu8"
+		     : "%vmovups")
+		  : "%vmovaps");
       else
 	opcode = (misaligned_p
 		  ? (TARGET_AVX512BW
@@ -5589,12 +5608,18 @@ ix86_get_ssemov (rtx *operands, unsigned size,
 		     : "%vmovdqa");
       break;
     case E_HImode:
       if (evex_reg_p)
 	opcode = (misaligned_p
 		  ? (TARGET_AVX512BW
 		     ? "vmovdqu16"
 		     : "vmovdqu64")
 		  : "vmovdqa64");
+      else if (egpr_p)
+	opcode = (misaligned_p
+		  ? (TARGET_AVX512BW
+		     ? "vmovdqu16"
+		     : "%vmovups")
+		  : "%vmovaps");
       else
 	opcode = (misaligned_p
 		  ? (TARGET_AVX512BW
@@ -5605,6 +5630,8 @@ ix86_get_ssemov (rtx *operands, unsigned size,
     case E_SImode:
       if (evex_reg_p)
 	opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
+      else if (egpr_p)
+	opcode = misaligned_p ? "%vmovups" : "%vmovaps";
       else
 	opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
       break;
@@ -5613,6 +5640,8 @@ ix86_get_ssemov (rtx *operands, unsigned size,
     case E_OImode:
       if (evex_reg_p)
 	opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+      else if (egpr_p)
+	opcode = misaligned_p ? "%vmovups" : "%vmovaps";
       else
 	opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
       break;
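(For reference: the egpr_p check above relies on the REX2 scan helper
introduced earlier in this series.  A minimal sketch of its shape,
assuming the REX2_INT_REGNO_P macro from the series; the real
implementation in i386.cc walks the rtx the same way:

  /* Return true if X mentions a GPR that needs REX2 encoding,
     i.e. one of the APX extended GPRs r16-r31.  */
  bool
  x86_extended_rex2reg_mentioned_p (rtx x)
  {
    subrtx_iterator::array_type array;
    FOR_EACH_SUBRTX (iter, array, x, NONCONST)
      if (REG_P (*iter) && REX2_INT_REGNO_P (REGNO (*iter)))
	return true;
    return false;
  }

so an EGPR anywhere in either operand, including inside a memory
address, flips the mnemonic selection above.)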
"%vmovups" : "%vmovaps"; else opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa"; break; diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 192e746fda3..bd6674d34f9 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -18918,6 +18918,12 @@ (define_insn "*_vinsert_0" { if (which_alternative == 0) return "vinsert\t{$0, %2, %1, %0|%0, %1, %2, 0}"; + bool egpr_used = (TARGET_APX_EGPR + && x86_extended_rex2reg_mentioned_p (operands[2])); + const char *align_templ = egpr_used ? "vmovdqa\t{%2, %x0|%x0, %2}" + : "vmovaps\t{%2, %x0|%x0, %2}"; + const char *unalign_templ = egpr_used ? "vmovdqu\t{%2, %x0|%x0, %2}" + : "vmovups\t{%2, %x0|%x0, %2}"; switch (mode) { case E_V8DFmode: @@ -18933,17 +18939,17 @@ (define_insn "*_vinsert_0" case E_V8DImode: if (misaligned_operand (operands[2], mode)) return which_alternative == 2 ? "vmovdqu64\t{%2, %x0|%x0, %2}" - : "vmovdqu\t{%2, %x0|%x0, %2}"; + : unalign_templ; else return which_alternative == 2 ? "vmovdqa64\t{%2, %x0|%x0, %2}" - : "vmovdqa\t{%2, %x0|%x0, %2}"; + : align_templ; case E_V16SImode: if (misaligned_operand (operands[2], mode)) return which_alternative == 2 ? "vmovdqu32\t{%2, %x0|%x0, %2}" - : "vmovdqu\t{%2, %x0|%x0, %2}"; + : unalign_templ; else return which_alternative == 2 ? "vmovdqa32\t{%2, %x0|%x0, %2}" - : "vmovdqa\t{%2, %x0|%x0, %2}"; + : align_templ; default: gcc_unreachable (); } @@ -27652,11 +27658,13 @@ (define_insn "avx_vec_concat" [(set (match_operand:V_256_512 0 "register_operand" "=x,v,x,Yv") (vec_concat:V_256_512 (match_operand: 1 "nonimmediate_operand" "x,v,xm,vm") - (match_operand: 2 "nonimm_or_0_operand" "xm,vm,C,C")))] + (match_operand: 2 "nonimm_or_0_operand" "xBt,vm,C,C")))] "TARGET_AVX && (operands[2] == CONST0_RTX (mode) || !MEM_P (operands[1]))" { + bool egpr_used = (TARGET_APX_EGPR + && x86_extended_rex2reg_mentioned_p (operands[1])); switch (which_alternative) { case 0: @@ -27704,7 +27712,8 @@ (define_insn "avx_vec_concat" if (misaligned_operand (operands[1], mode)) { if (which_alternative == 2) - return "vmovdqu\t{%1, %t0|%t0, %1}"; + return egpr_used ? "vmovups\t{%1, %t0|%t0, %1}" + : "vmovdqu\t{%1, %t0|%t0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqu64\t{%1, %t0|%t0, %1}"; else @@ -27713,7 +27722,8 @@ (define_insn "avx_vec_concat" else { if (which_alternative == 2) - return "vmovdqa\t{%1, %t0|%t0, %1}"; + return egpr_used ? "vmovaps\t{%1, %t0|%t0, %1}" + : "vmovdqa\t{%1, %t0|%t0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqa64\t{%1, %t0|%t0, %1}"; else @@ -27723,7 +27733,8 @@ (define_insn "avx_vec_concat" if (misaligned_operand (operands[1], mode)) { if (which_alternative == 2) - return "vmovdqu\t{%1, %x0|%x0, %1}"; + return egpr_used ? "vmovups\t{%1, %x0|%x0, %1}" + : "vmovdqu\t{%1, %x0|%x0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqu64\t{%1, %x0|%x0, %1}"; else @@ -27732,7 +27743,8 @@ (define_insn "avx_vec_concat" else { if (which_alternative == 2) - return "vmovdqa\t{%1, %x0|%x0, %1}"; + return egpr_used ? "vmovaps\t{%1, %x0|%x0, %1}" + : "vmovdqa\t{%1, %x0|%x0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqa64\t{%1, %x0|%x0, %1}"; else @@ -27745,7 +27757,9 @@ (define_insn "avx_vec_concat" gcc_unreachable (); } } - [(set_attr "type" "sselog,sselog,ssemov,ssemov") + [(set_attr "isa" "noavx512f,avx512f,*,*") + (set_attr "gpr32" "0,1,1,1") + (set_attr "type" "sselog,sselog,ssemov,ssemov") (set_attr "prefix_extra" "1,1,*,*") (set_attr "length_immediate" "1,1,*,*") (set_attr "prefix" "maybe_evex")