From patchwork Thu Jan 24 23:17:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 1030714
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Steve Ellcey
To: "gcc-patches@gcc.gnu.org"
Subject: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64
Date: Thu, 24 Jan 2019 23:17:45 +0000
Message-ID: <73a131005a775d06e344d035031005dc765403c4.camel@marvell.com>

Here is my attempt at creating a couple of new instruction patterns to
generate more bfi instructions on aarch64.  I haven't finished testing
this, but it helps with gcc.target/aarch64/combine_bfi_1.c.

Before I went any further with it, I wanted to see if anyone else was
working on something like this and whether this seems like a reasonable
approach.

Steve Ellcey
sellcey@marvell.com


2019-01-24  Steve Ellcey

        PR rtl-optimization/87763
        * config/aarch64/aarch64-protos.h
        (aarch64_masks_and_shift_for_aarch64_bfi_p): New prototype.
        * config/aarch64/aarch64.c
        (aarch64_masks_and_shift_for_aarch64_bfi_p): New function.
        * config/aarch64/aarch64.md (*aarch64_bfi4_shift): New instruction.
        (*aarch64_bfi4_noshift): Ditto.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index b035e35..ec90053 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -429,6 +429,7 @@ bool aarch64_label_mentioned_p (rtx);
 void aarch64_declare_function_name (FILE *, const char*, tree);
 bool aarch64_legitimate_pic_operand_p (rtx);
 bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx);
+bool aarch64_masks_and_shift_for_aarch64_bfi_p (scalar_int_mode, rtx, rtx, rtx);
 bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
 bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
 opt_machine_mode aarch64_sve_pred_mode (unsigned int);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5df5a8b..69cc69f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9294,6 +9294,44 @@ aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode mode, rtx mask,
          & ((HOST_WIDE_INT_1U << INTVAL (shft_amnt)) - 1)) == 0;
 }
 
+/* Return true if the masks and a shift amount from an RTX of the form
+   ((x & MASK1) | ((y << SHIFT_AMNT) & MASK2)) are valid to combine into
+   a BFI instruction of mode MODE.  See *aarch64_bfi patterns.  */
+
+bool
+aarch64_masks_and_shift_for_aarch64_bfi_p (scalar_int_mode mode, rtx mask1,
+                                           rtx shft_amnt, rtx mask2)
+{
+  unsigned HOST_WIDE_INT m1, m2, s, t;
+
+  if (!CONST_INT_P (mask1) || !CONST_INT_P (mask2) || !CONST_INT_P (shft_amnt))
+    return false;
+
+  m1 = UINTVAL (mask1);
+  m2 = UINTVAL (mask2);
+  s = UINTVAL (shft_amnt);
+
+  /* Verify that there is no overlap in what bits are set in the two masks.  */
+  if ((m1 + m2 + 1) != 0)
+    return false;
+
+  /* Verify that the shift amount is less than the mode size.  */
+  if (s >= GET_MODE_BITSIZE (mode))
+    return false;
+
+  /* Verify that the mask being shifted is contiguous and would be in the
+     least significant bits after shifting by creating a mask 't' based on
+     the number of bits set in mask2 and the shift amount for mask2 and
+     comparing that to the actual mask2.  */
+  t = popcount_hwi (m2);
+  t = (HOST_WIDE_INT_1U << t) - 1;
+  t = t << s;
+  if (t != m2)
+    return false;
+
+  return true;
+}
+
 /* Calculate the cost of calculating X, storing it in *COST.  Result
    is true if the total cost of the operation has now been calculated.  */
 static bool
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b7f6fe0..e1f526b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5476,6 +5476,41 @@
   [(set_attr "type" "bfm")]
 )
 
+;; Match a bfi instruction where the shift of OP3 means that we are
+;; actually copying the least significant bits of OP3 into OP0 by way
+;; of the AND masks and the IOR instruction.
+
+(define_insn "*aarch64_bfi4_shift"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (ashift:GPI
+                           (match_operand:GPI 3 "register_operand" "r")
+                           (match_operand:GPI 4 "aarch64_simd_shift_imm_<mode>" "n"))
+                          (match_operand:GPI 5 "const_int_operand" "n"))))]
+  "aarch64_masks_and_shift_for_aarch64_bfi_p (<MODE>mode, operands[2], operands[4], operands[5])"
+{
+  return "bfi\t%0, %3, %4, %P5";
+}
+  [(set_attr "type" "bfm")]
+)
+
+;; Like the above instruction but with no shifting, we are just copying the
+;; least significant bits of OP3 to OP0.
+
+(define_insn "*aarch64_bfi4_noshift"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (match_operand:GPI 3 "register_operand" "r")
+                          (match_operand:GPI 4 "const_int_operand" "n"))))]
+  "aarch64_masks_and_shift_for_aarch64_bfi_p (<MODE>mode, operands[2], const0_rtx, operands[4])"
+{
+  return "bfi\t%0, %3, 0, %P4";
+}
+  [(set_attr "type" "bfm")]
+)
+
 (define_insn "*extr_insv_lower_reg"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
                           (match_operand 1 "const_int_operand" "n")