From patchwork Mon Feb 3 15:01:27 2020
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 1232868
From: Wilco Dijkstra
To: GCC Patches
CC: Kyrylo Tkachov, Richard Sandiford, Richard Earnshaw
Subject: [PATCH][AArch64] Improve popcount expansion
Date: Mon, 3 Feb 2020 15:01:27 +0000

The popcount expansion uses umov to extend the result and move it back to the
integer register file.  If we model ADDV as a zero-extending operation, fmov
can be used to move back to the integer side.  This results in a ~0.5% speedup
on deepsjeng on Cortex-A57.

A typical __builtin_popcount expansion is now:

	fmov	s0, w0
	cnt	v0.8b, v0.8b
	addv	b0, v0.8b
	fmov	w0, s0

Bootstrap OK, passes regress.

ChangeLog
2020-02-02  Wilco Dijkstra

gcc/
	* config/aarch64/aarch64.md (popcount2): Improve expansion.
	* config/aarch64/aarch64-simd.md (aarch64_zero_extend_reduc_plus_):
	New pattern.
	* config/aarch64/iterators.md (VDQV_E): New iterator.
testsuite/
	* gcc.target/aarch64/popcnt2.c: New test.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 97f46f96968a6bc2f93bbc812931537b819b3b19..34765ff43c1a090a31e2aed64ce95510317ab8c3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2460,6 +2460,17 @@ (define_insn "aarch64_reduc_plus_internal"
   [(set_attr "type" "neon_reduc_add")]
 )
 
+;; ADDV with result zero-extended to SI/DImode (for popcount).
+(define_insn "aarch64_zero_extend_reduc_plus_"
+ [(set (match_operand:GPI 0 "register_operand" "=w")
+       (zero_extend:GPI
+        (unspec: [(match_operand:VDQV_E 1 "register_operand" "w")]
+                 UNSPEC_ADDV)))]
+ "TARGET_SIMD"
+ "add\\t%0, %1."
+  [(set_attr "type" "neon_reduc_add")]
+)
+
 (define_insn "aarch64_reduc_plus_internalv2si"
  [(set (match_operand:V2SI 0 "register_operand" "=w")
        (unspec:V2SI [(match_operand:V2SI 1 "register_operand" "w")]
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 86c2cdfc7973f4b964ba233cfbbe369b24e0ac10..5edc76ee14b55b2b4323530e10bd22b3ffca483e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4829,7 +4829,6 @@ (define_expand "popcount2"
 {
   rtx v = gen_reg_rtx (V8QImode);
   rtx v1 = gen_reg_rtx (V8QImode);
-  rtx r = gen_reg_rtx (QImode);
   rtx in = operands[1];
   rtx out = operands[0];
   if(mode == SImode)
@@ -4843,8 +4842,7 @@ (define_expand "popcount2"
   }
   emit_move_insn (v, gen_lowpart (V8QImode, in));
   emit_insn (gen_popcountv8qi2 (v1, v));
-  emit_insn (gen_reduc_plus_scal_v8qi (r, v1));
-  emit_insn (gen_zero_extendqi2 (out, r));
+  emit_insn (gen_aarch64_zero_extend_reduc_plus_v8qi (out, v1));
   DONE;
 })
 
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index fc973086cb91ae0dc54eeeb0b832d522539d7982..926779bf2442fa60d184ef17308f91996d6e8d1b 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -208,6 +208,9 @@ (define_mode_iterator VDQV [V8QI V16QI V4HI V8HI V4SI V2DI])
 ;; Advanced SIMD modes (except V2DI) for Integer reduction across lanes.
 (define_mode_iterator VDQV_S [V8QI V16QI V4HI V8HI V4SI])
 
+;; Advanced SIMD modes for Integer reduction across lanes (zero/sign extended).
+(define_mode_iterator VDQV_E [V8QI V16QI V4HI V8HI])
+
 ;; All double integer narrow-able modes.
 (define_mode_iterator VDN [V4HI V2SI DI])
 
diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt2.c b/gcc/testsuite/gcc.target/aarch64/popcnt2.c
new file mode 100644
index 0000000000000000000000000000000000000000..e321858afa4d6ecb6fc7348f39f6e5c6c0c46147
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/popcnt2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned
+foo (int x)
+{
+  return __builtin_popcount (x);
+}
+
+unsigned long
+foo1 (int x)
+{
+  return __builtin_popcount (x);
+}
+
+/* { dg-final { scan-assembler-not {popcount} } } */
+/* { dg-final { scan-assembler-times {cnt\t} 2 } } */
+/* { dg-final { scan-assembler-times {fmov} 4 } } */
+/* { dg-final { scan-assembler-not {umov} } } */
+/* { dg-final { scan-assembler-not {uxtw} } } */
+/* { dg-final { scan-assembler-not {sxtw} } } */
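
As an illustration only (not part of the patch), here is a rough scalar C model of the cnt/addv sequence from the description above. The helper names cnt_bytes, addv_bytes and popcount_model are made up for this sketch. It assumes the input occupies the low 32 bits of a 64-bit vector whose upper half is zero. Each byte lane holds a count of at most 8, so the sum across the eight lanes is at most 64 and always fits in the single byte that ADDV writes. That is why treating the result as already zero-extended looks safe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count the set bits in each of the eight byte
   lanes of a 64-bit value, roughly what "cnt v0.8b, v0.8b" computes.  */
static void
cnt_bytes (uint64_t x, uint8_t lanes[8])
{
  for (int i = 0; i < 8; i++)
    {
      uint8_t b = (uint8_t) (x >> (8 * i));
      uint8_t n = 0;
      while (b)
        {
          n += b & 1;
          b >>= 1;
        }
      lanes[i] = n;		/* At most 8 per lane.  */
    }
}

/* Hypothetical helper: add all lanes into a single byte, roughly what
   "addv b0, v0.8b" computes.  The sum is at most 8 * 8 = 64.  */
static uint8_t
addv_bytes (const uint8_t lanes[8])
{
  uint8_t sum = 0;
  for (int i = 0; i < 8; i++)
    sum += lanes[i];
  return sum;
}

/* Scalar model of the whole expansion for a 32-bit input.  Widening the
   byte result to unsigned stands in for the modelled zero extension.  */
static unsigned
popcount_model (uint32_t x)
{
  uint8_t lanes[8];
  cnt_bytes ((uint64_t) x, lanes);	/* Upper four lanes are zero.  */
  return addv_bytes (lanes);
}
```

For example, popcount_model (0x80000001u) gives 2, and an all-ones 32-bit input gives 32.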