From patchwork Wed Nov 15 09:47:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 1864135 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=PUXZSmPm; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SVdkN42Hnz1yR8 for ; Wed, 15 Nov 2023 20:52:28 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E77E838582A1 for ; Wed, 15 Nov 2023 09:52:25 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 8FDB63857C44 for ; Wed, 15 Nov 2023 09:47:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8FDB63857C44 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8FDB63857C44 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041646; cv=none; b=JRwFwi++gqLkWaEu744EDX/6ya+4MttchLTEnnaB26KVEaSn9uKXfVacD4eIxqUPR6ZSeJQ+/cS7uCbPhdOpKUasx8uQeIyLFkKQHW9iDLBe+8jnlApyOWBF9nRVuEVtM9TsB33etaeTXK/YthshK3UVLsHHWsELW+wvLvEbWcY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041646; c=relaxed/simple; bh=VpNCWGB9NMslHvCUVLHNlyvvNT5JOpyi/vLRAZ7ryZQ=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=SLaOPFo8xGYMY2iN/VQe9f9Lb61o65NiFPmCqyvwK2h68UqFcshMtCBfWwS2jEtwDLnSM+PSLcQczV96r8HXmt9ucKvQ4fD5XkPLXtnlqqNnk4VU/mtZ+QUlr4uusUWvfhM1AiptfI9+Pr8l4SB4ncG2wiXuUuP5zk8HO/wPb+0= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041643; x=1731577643; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VpNCWGB9NMslHvCUVLHNlyvvNT5JOpyi/vLRAZ7ryZQ=; b=PUXZSmPmQLaGX65XitX0JyONtsH7joGrFDtOZenoc5lAk0xqqAcIZIzh RC9v7PRbDtDG2Z9mgopR6h2Xvq/wXhp+n1RwN8K2dqpP1PEctmj6goYna ORDIKO09JL/XEAv7HRPbo10Om2hDqvYGa+XJ6Pp007m/OS1o1gVIy4AA0 k7NB011mvTKBL8eYvD3xaZhSiD+2NhPm7fhGCaQqian4LzEOLbARtdiTg MrmWGwxT20YJ3WQu3MKlhsqV/mRCXSLmjdXEOEHKcMOYBBk9MDGf32Oir hbgAvFKvEkkRYi6zc+eGf/YVbakJ2WRUkm6htoAPjT72pElZjiflp3HC8 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342622" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342622" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431725" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431725" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:12 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id C7A5D1005682; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 13/16] [APX NDD] Support APX NDD for right shift insns Date: Wed, 15 Nov 2023 17:47:02 +0800 Message-Id: <20231115094705.3976553-14-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Similar to LSHIFT, rshift should also emit $1 for NDD form with CX_REG as operands[1]. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add LSHIFTRT and RSHIFTRT. * config/i386/i386.md (ashr3_cvt): Extend with new alternatives to support NDD, and adjust output templates. (*ashrsi3_cvt_zext): Likewise. (*ashr3_1): Likewise for SI/DI mode. (*highpartdisi2): Likewise. (*lshr3_1): Likewise. (*si3_1_zext): Likewise. (*ashr3_1): Likewise for QI/HI mode. (*lshrqi3_1): Likewise. (*lshrhi3_1): Likewise. (3_cmp): Likewise. (*si3_cmp_zext): Likewise. (*3_cconly): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add l/ashiftrt tests. --- gcc/config/i386/i386-expand.cc | 2 + gcc/config/i386/i386.md | 265 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 24 +++ 3 files changed, 191 insertions(+), 100 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 7e3080482a6..8e040346fbb 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1277,6 +1277,8 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case IOR: case XOR: case ASHIFT: + case ASHIFTRT: + case LSHIFTRT: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index a0e81545f17..3ff333d4a41 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -15490,39 +15490,45 @@ (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) (define_insn "ashr3_cvt" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "*a,0") + (match_operand:SWI48 1 "nonimmediate_operand" "*a,0,rm") (match_operand:QI 2 "const_int_operand"))) (clobber (reg:CC FLAGS_REG))] "INTVAL (operands[2]) == GET_MODE_BITSIZE (mode)-1 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, mode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" "@ - sar{}\t{%2, %0|%0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{}\t{%2, %0|%0, %2} + sar{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "")]) (define_insn "*ashrsi3_cvt_zext" - [(set (match_operand:DI 0 "register_operand" "=*d,r") + [(set (match_operand:DI 0 "register_operand" "=*d,r,r") (zero_extend:DI - (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0") + (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0,r") (match_operand:QI 2 "const_int_operand")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && INTVAL (operands[2]) == 31 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, SImode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, SImode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" "@ {cltd|cdq} - sar{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{l}\t{%2, %k0|%k0, %2} + sar{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "SI")]) (define_expand "@x86_shift_adj_3" @@ -15564,13 +15570,15 @@ (define_insn "*bmi2_3_1" (set_attr "mode" "")]) (define_insn "*ashr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,r"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15578,14 +15586,18 @@ (define_insn "*ashr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sar{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sar{}\t{%1, %0|%0, %1}" + : "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15598,8 +15610,8 @@ (define_insn "*ashr3_1" ;; Specialization of *lshr3_1 below, extracting the SImode ;; highpart of a DI to be extracted, but allowing it to be clobbered. (define_insn_and_split "*highpartdisi2" - [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k") 0) - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k") + [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k,r") 0) + (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k,r") (const_int 32))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" @@ -15618,16 +15630,20 @@ (define_insn_and_split "*highpartdisi2" DONE; } operands[0] = gen_rtx_REG (DImode, REGNO (operands[0])); -}) +} +[(set_attr "isa" "*,*,*,apx_ndd")]) + (define_insn "*lshr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k,r") (lshiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,r,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, mode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, mode, operands, + ix86_can_use_ndd_p (LSHIFTRT))" { + bool use_ndd = (which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15636,14 +15652,18 @@ (define_insn "*lshr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "shr{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "shr{}\t{%1, %0|%0, %1}" + : "shr{}\t%0"; else - return "shr{}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{}\t{%2, %1, %0|%0, %1, %2}" + : "shr{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2,") - (set_attr "type" "ishift,ishiftx,msklog") + [(set_attr "isa" "*,bmi2,,apx_ndd") + (set_attr "type" "ishift,ishiftx,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15676,13 +15696,15 @@ (define_insn "*bmi2_si3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,r")))) + (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15690,14 +15712,18 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "{l}\t{%1, %k0|%k0, %1}" + : "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15720,20 +15746,26 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashr3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m, r") (ashiftrt:SWI12 - (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + (match_operand:SWI12 1 "nonimmediate_operand" "0, rm") + (match_operand:QI 2 "nonmemory_operand" "c, c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sar{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "sar{}\t{%1, %0|%0, %1}" + : "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return which_alternative ? "sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15744,29 +15776,35 @@ (define_insn "*ashr3_1" (set_attr "mode" "")]) (define_insn "*lshrqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k,r") (lshiftrt:QI - (match_operand:QI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI,Wb"))) + (match_operand:QI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, QImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, QImode, operands, + ix86_can_use_ndd_p (LSHIFTRT))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "shr{b}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "shr{b}\t{%1, %0|%0, %1}" + : "shr{b}\t%0"; else - return "shr{b}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{b}\t{%2, %1, %0|%0, %1, %2}" + : "shr{b}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*,avx512dq,apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15778,29 +15816,35 @@ (define_insn "*lshrqi3_1" (set_attr "mode" "QI")]) (define_insn "*lshrhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k") + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k, r") (lshiftrt:HI - (match_operand:HI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI, Ww"))) + (match_operand:HI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI, Ww, cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, HImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, HImode, operands, + ix86_can_use_ndd_p (LSHIFTRT))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "shr{w}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "shr{w}\t{%1, %0|%0, %1}" + : "shr{w}\t%0"; else - return "shr{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{w}\t{%2, %1, %0|%0, %1, %2}" + : "shr{w}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*, avx512f") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*, avx512f, apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15853,25 +15897,31 @@ (define_insn "*3_cmp" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (any_shiftrt:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, mode, operands)" + && ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return which_alternative ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15884,10 +15934,10 @@ (define_insn "*3_cmp" (define_insn "*si3_cmp_zext" [(set (reg FLAGS_REG) (compare - (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0") + (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0,r") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (any_shiftrt:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -15895,15 +15945,20 @@ (define_insn "*si3_cmp_zext" || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, SImode, operands)" + && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "{l}\t{%1, %k0|%k0, %1}" + : "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return which_alternative ? "{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15917,10 +15972,10 @@ (define_insn "*3_cconly" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "register_operand" "0,r") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx @@ -15928,12 +15983,18 @@ (define_insn "*3_cconly" && ix86_match_ccmode (insn, CCGOCmode)" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REGNO (operands[1]) == CX_REG)) + return which_alternative + ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return which_alternative + ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16537,18 +16598,22 @@ (define_insn "rcrdi2" ;; Versions of sar and shr that set the carry flag. (define_insn "3_carry" [(set (reg:CCC FLAGS_REG) - (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0") + (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0,r") (const_int 1)) (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI48 0 "register_operand" "=r") + (set (match_operand:SWI48 0 "register_operand" "=r,r") (any_shiftrt:SWI48 (match_dup 1) (const_int 1)))] "" { - if (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) - return "{}\t%0"; - return "{}\t{1, %0|%0, 1}"; + if ((TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; + return which_alternative ? "{}\t{$1, %1, %0|%0, %1, 1}" + : "{}\t{$1, %0|%0, 1}"; } - [(set_attr "type" "ishift1") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift1") (set (attr "length_immediate") (if_then_else (ior (match_test "TARGET_SHIFT1") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 481ec8b00a8..28c0df72988 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,6 +2,8 @@ /* { dg-options "-mapxf -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ +#include + #define FOO(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -132,6 +134,24 @@ FOO3 (int, shl, <<, 7) FOO (long, shl, <<) FOO3 (long, shl, <<, 7) +FOO (char, sar, >>) +FOO3 (char, sar, >>, 7) +FOO (short, sar, >>) +FOO3 (short, sar, >>, 7) +FOO (int, sar, >>) +FOO3 (int, sar, >>, 7) +FOO (long, sar, >>) +FOO3 (long, sar, >>, 7) + +FOO (uint8_t, shr, >>) +FOO3 (uint8_t, shr, >>, 7) +FOO (uint16_t, shr, >>) +FOO3 (uint16_t, shr, >>, 7) +FOO (uint32_t, shr, >>) +FOO3 (uint32_t, shr, >>, 7) +FOO (uint64_t, shr, >>) +FOO3 (uint64_t, shr, >>, 7) + /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -156,3 +176,7 @@ FOO3 (long, shl, <<, 7) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */