From patchwork Tue Jun 20 14:21:47 2017
X-Patchwork-Submitter: James Greenhalgh
X-Patchwork-Id: 778362
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: James Greenhalgh
Subject: [AArch64] Improve HFA code generation
Date: Tue, 20 Jun 2017 15:21:47 +0100
Message-ID: <1497968507-24709-1-git-send-email-james.greenhalgh@arm.com>

Hi,

For this code:

  struct y {
    float x[4];
  };

  float
  bar3 (struct y x)
  {
    return x.x[3];
  }

GCC generates:

  bar3:
        fmov    x1, d2
        mov     x0, 0
        bfi     x0, x1, 0, 32
        fmov    x1, d3
        bfi     x0, x1, 32, 32
        sbfx    x0, x0, 32, 32
        fmov    s0, w0
        ret

If you can wrap your head around that, you'll spot that it could be
simplified to:

  bar3:
        fmov    s0, s3
        ret

Looking at it, I think the issue is the mode that we assign to the
PARALLEL we build for an HFA in registers.

When we get into aarch64_layout_arg with a composite, MODE is set to the
smallest integer mode that would contain the size of the composite type.
That is to say, in the example above, MODE will be TImode.

Looking at the expansion path through assign_parms, we're going to go:

  assign_parms
    assign_parm_setup_reg
      assign_parm_remove_parallels
        emit_group_store

assign_parm_remove_parallels is going to try to create a REG in MODE,
then construct that REG using the values in the HFA PARALLEL we created.
So, for the example above, we're going to try to create a packed TImode
value built up from each of the four "S" registers we've assigned for
the arguments. Using one of the struct elements is then a 32-bit extract
from the TImode value (then a move back to FP/SIMD registers). This
explains the code-gen in the example. Note that an extract from the
TImode value makes the whole TImode value live, so we can't optimize
away the construction in registers.

If instead we make the PARALLEL that we create in aarch64_layout_arg
BLKmode, then our expansion path is through:

  assign_parms
    assign_parm_setup_block

which handles creating a stack slot of the right size for our HFA, and
copying it to there. We could then trust the usual optimisers to deal
with the object construction and eliminate it where possible.

However, we can't just return a BLKmode PARALLEL, as the mid-end was
explicitly asking us to return in MODE, and will eventually ICE given
the inconsistency.
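
To make the two expansion paths above concrete, here is a rough model in
plain C. It is only a sketch, not what GCC does internally: the names
packed_hi, bar3_via_packed and bar3_via_stack_slot are made up for
illustration. The first function mimics building the live half of the
packed integer with shifts (the bfi instructions) and extracting 32 bits
from it (the sbfx); the second mimics copying the HFA to a stack slot
and loading one element.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct y { float x[4]; };

/* Reinterpret a float as its 32-bit pattern (models fmov from FP to GP).  */
uint32_t float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Reinterpret a 32-bit pattern as a float (models fmov from GP to FP).  */
float bits_float (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Hypothetical model of the upper 64 bits of the packed value:
   x[2] goes into bits 0-31 and x[3] into bits 32-63, as the two
   bfi instructions in the generated code do.  */
uint64_t packed_hi (struct y v)
{
  return (uint64_t) float_bits (v.x[2])
         | ((uint64_t) float_bits (v.x[3]) << 32);
}

/* Integer-mode path: build the packed value, then extract 32 bits.  */
float bar3_via_packed (struct y v)
{
  return bits_float ((uint32_t) (packed_hi (v) >> 32));
}

/* BLKmode path: copy the HFA to a stack slot, then load one element.  */
float bar3_via_stack_slot (struct y v)
{
  unsigned char slot[sizeof (struct y)];
  float f;
  memcpy (slot, &v, sizeof v);
  memcpy (&f, slot + 3 * sizeof (float), sizeof f);
  return f;
}

/* Both models must agree with the direct member access.  */
void check_models_agree (struct y v)
{
  assert (bar3_via_packed (v) == v.x[3]);
  assert (bar3_via_stack_slot (v) == v.x[3]);
}
```

Both functions return the same value as x.x[3]. The difference is only
in how much work sits between the argument registers and the result,
which is what the compiler has to see through.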

One other way we can force these structures to be given BLKmode is
through TARGET_MEMBER_TYPE_FORCES_BLK, which is what we do in this
patch. We're going to tell the mid-end that any structure of more than
one element which contains either floating-point or vector data should
be set out in BLKmode rather than a large-enough integer mode. In doing
so, we implicitly fix the issue with HFA layout above.

But at what cost! A long-running deficiency in GCC's code-gen (it
doesn't clean up stack allocations after stack uses have been
eliminated) prevents us from getting what we really wanted, but:

  bar3:
        sub     sp, sp, #16
        fmov    s0, s3
        add     sp, sp, 16
        ret

is pretty close, and a huge improvement over where we are today.

Note that we can still get some pretty bad code-generation out of the
compiler when passing and returning structs. I particularly like this
one:

  struct y {
    float x[4];
  };

  struct y
  bar (struct y x)
  {
    return x;
  }

  bar:
        sub     sp, sp, #48
        stp     s0, s1, [sp, 16]
        stp     s2, s3, [sp, 24]
        ldp     x0, x1, [sp, 16]
        stp     x0, x1, [sp, 32]
        ldp     s0, s1, [sp, 32]
        ldp     s2, s3, [sp, 40]
        add     sp, sp, 48
        ret

But that looks to be a separate issue, and is not substantially worse
than current trunk:

  bar:
        fmov    x2, d0
        mov     x1, 0
        mov     x0, 0
        bfi     x1, x2, 0, 32
        fmov    x2, d2
        bfi     x0, x2, 0, 32
        fmov    x2, d1
        bfi     x1, x2, 32, 32
        fmov    x2, d3
        bfi     x0, x2, 32, 32
        ubfx    x2, x1, 0, 32
        ubfx    x1, x1, 32, 32
        fmov    s0, w2
        ubfx    x3, x0, 0, 32
        fmov    s1, w1
        ubfx    x0, x0, 32, 32
        fmov    s2, w3
        fmov    s3, w0
        ret

I've benchmarked this with SPEC2000 and found no performance
differences, and bootstrapped on aarch64-none-linux-gnu with no issues.

Does this look like a sensible approach and if so, is it OK for trunk?

Thanks,
James

---
gcc/

2017-06-20  James Greenhalgh

        * config/aarch64/aarch64.c (aarch64_layout_arg): Construct HFA
        PARALLELs in BLKmode.

gcc/testsuite/

2017-06-20  James Greenhalgh

        * gcc.target/aarch64/hfa_1.c: New.

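
As a rough illustration of the rule described above, the decision can be
modelled outside GCC like this. Everything here is hypothetical: the
model_mode enum and the function names are stand-ins and not GCC
interfaces. It assumes, as I read the GCC internals manual, that the
hook is passed VOIDmode as MODE when the field is not the only member of
the aggregate. The model covers scalar floating-point members only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for machine modes.  */
enum model_mode { MODEL_VOID, MODEL_SI, MODEL_DI, MODEL_SF, MODEL_DF };

/* Stand-in for a scalar floating-point mode check.  */
bool model_scalar_float_p (enum model_mode m)
{
  return m == MODEL_SF || m == MODEL_DF;
}

/* Model of the decision: force BLKmode when the aggregate has more
   than one element (signalled here by MODEL_VOID) and the member
   being examined is a scalar float.  */
bool model_member_type_forces_blk (enum model_mode field_mode,
                                   enum model_mode containing_mode)
{
  return containing_mode == MODEL_VOID
         && model_scalar_float_p (field_mode);
}

/* A few sanity checks on the model.  */
void model_selfcheck (void)
{
  /* Several floats: laid out in BLKmode.  */
  assert (model_member_type_forces_blk (MODEL_SF, MODEL_VOID));
  /* Several ints: left in an integer mode.  */
  assert (!model_member_type_forces_blk (MODEL_SI, MODEL_VOID));
  /* A lone float keeps its own mode.  */
  assert (!model_member_type_forces_blk (MODEL_SF, MODEL_SF));
}
```

The point of the VOIDmode test is that a structure wrapping a single
float can keep SFmode, so only multi-element aggregates change layout.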
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 04417dc..a147068 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14925,6 +14925,32 @@ aarch64_sched_can_speculate_insn (rtx_insn *insn)
     }
 }
 
+/* We're a composite type, so MODE is the smallest integer mode
+   that can fit the total size of our aggregate.  However,
+   we're going to build a parallel that contains each of our
+   registers, and GCC is going to emit code to move them in
+   to a packed value in MODE.  As an example, for an HFA of
+   two floats, this means creating a packed DImode value from
+   which we can then extract the SFmode elements.  This adds
+   considerable inefficiency.  We can avoid that issue by
+   creating the parallel with BLKmode, which forces
+   the structure to be constructed on the stack.
+   Optimisation passes may then elide the stack stores if
+   possible.  */
+
+static bool
+aarch64_member_type_forces_blk (const_tree field, machine_mode mode)
+{
+  machine_mode field_mode = TYPE_MODE (TREE_TYPE (field));
+  /* If we are a multi-element structure, and any of those elements are
+     floating-point or vector types then move the whole thing to BLKmode
+     to prevent the silly compiler from deciding to build great big stonking
+     integers full of HFA data.  Though presumably this is going to blow
+     copies of those structures, which is annoying.  */
+  return (mode == VOIDmode
+          && (SCALAR_FLOAT_MODE_P (field_mode)));
+}
+
 /* Target-specific selftests.  */
 
 #if CHECKING_P
@@ -15115,6 +15141,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE aarch64_mangle_type
 
+#undef TARGET_MEMBER_TYPE_FORCES_BLK
+#define TARGET_MEMBER_TYPE_FORCES_BLK aarch64_member_type_forces_blk
+
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST aarch64_memory_move_cost
 
diff --git a/gcc/testsuite/gcc.target/aarch64/hfa_1.c b/gcc/testsuite/gcc.target/aarch64/hfa_1.c
new file mode 100644
index 0000000..55c22cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/hfa_1.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+struct x {
+  float x[4];
+};
+
+float
+foo0 (struct x x)
+{
+  return x.x[0];
+}
+
+float
+foo1 (struct x x)
+{
+  return x.x[1];
+}
+
+float
+foo2 (struct x x)
+{
+  return x.x[2];
+}
+
+
+float
+foo3 (struct x x)
+{
+  return x.x[3];
+}
+
+/* We don't want to see any moves to the integer register bank.  */
+/* { dg-final { scan-assembler-not "fmov\\t\[wx\]\[0-9\]+"} } */
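
For anyone wanting to try the scan-assembler-not pattern outside
DejaGnu, the check can be approximated with POSIX regcomp/regexec. This
is only a sketch under assumptions: moves_to_integer_bank is a made-up
helper name, and the Tcl escape for a tab is written here as a literal
tab character in the C string.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if LINE contains an fmov whose destination is a
   w or x register, i.e. a move into the integer register bank.  */
bool moves_to_integer_bank (const char *line)
{
  regex_t re;
  bool hit;

  /* Same shape as the dg-final pattern: "fmov", a tab, then w or x
     followed by at least one digit.  */
  int rc = regcomp (&re, "fmov\t[wx][0-9]+", REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);
  (void) rc;

  hit = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return hit;
}
```

With this helper, a line such as "fmov x1, d2" (tab-separated, as the
compiler emits it) is flagged, while "fmov s0, s3" is not. The pattern
only looks at the destination operand, so a move from the integer bank
such as "fmov s0, w0" would not be caught by it.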