From patchwork Mon Feb 1 17:54:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Joel Hutton
X-Patchwork-Id: 1434260
To: GCC Patches
Subject: [RFC] Feedback on approach for adding support for V8QI->V8HI widening patterns
Date: Mon, 1 Feb 2021 17:54:04 +0000
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Joel Hutton via Gcc-patches
From: Joel Hutton
Reply-To: Joel Hutton
Cc: Richard Sandiford

Hi Richard(s),

I'm just looking to see if I'm going about this the right way, based on the discussion we had on IRC. I've managed to hack something together: attached is a (very) WIP patch which gives the correct codegen for the testcase in question (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98772). It would obviously need to support other widening patterns and differentiate between big/little endian, among other things.

I added a backend pattern because I wasn't quite clear which changes to make in order to allow the existing backend patterns to be used with a V8QI, or how to represent V16QI where we don't care about the top/bottom 8. I made an attempt in optabs.c, which is commented out in the patch, but I'm not sure if I'm going about this the right way.
Joel

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index be2a5a865172bdd7848be4082abb0fdfb0b35937..c66b8a367623c8daf4423677d292e292feee3606 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3498,6 +3498,14 @@
   DONE;
 })

+(define_insn "vec_widen_usubl_half_v8qi"
+  [(match_operand:V8HI 0 "register_operand")
+   (match_operand:V8QI 1 "register_operand")
+   (match_operand:V8QI 2 "register_operand")]
+  "TARGET_SIMD"
+  "usubl\t%0., %1., %2."
+)
+
 (define_expand "vec_widen_subl_hi_"
   [(match_operand: 0 "register_operand")
    (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))
diff --git a/gcc/expr.c b/gcc/expr.c
index 04ef5ad114d0662948c896cdbf58e67737b39c7e..0939a156deef63f1cf2fa7e29c2c94925820f2ba 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9785,6 +9785,7 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,

     case VEC_WIDEN_PLUS_HI_EXPR:
     case VEC_WIDEN_PLUS_LO_EXPR:
+    case VEC_WIDEN_MINUS_HALF_EXPR:
     case VEC_WIDEN_MINUS_HI_EXPR:
     case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 876a3a6f348de122e5a52e6dd70d7946bc810162..10aa21d07595325fd8ef3057444853fc946385de 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -186,6 +186,9 @@ bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &,
 enum insn_code find_widening_optab_handler_and_mode (optab, machine_mode,
                                                      machine_mode,
                                                      machine_mode *);
+enum insn_code find_half_mode_optab_and_mode (optab, machine_mode,
+                                              machine_mode,
+                                              machine_mode *);
 int can_mult_highpart_p (machine_mode, bool);
 bool can_vec_mask_load_store_p (machine_mode, machine_mode, bool);
 opt_machine_mode get_len_load_store_mode (machine_mode, bool);
diff --git a/gcc/optabs-query.c b/gcc/optabs-query.c
index 3248ce2c06e65c9c0366757907ab057407f7c594..7abfc04aa18b7ee5b734a1b1f4378b4615ee31fd 100644
--- a/gcc/optabs-query.c
+++ b/gcc/optabs-query.c
@@ -462,6 +462,17 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel,
   return false;
 }

+enum insn_code
+find_half_mode_optab_and_mode (optab op, machine_mode to_mode,
+                               machine_mode from_mode,
+                               machine_mode *found_mode)
+{
+  insn_code icode = CODE_FOR_nothing;
+  if (GET_MODE_2XWIDER_MODE(from_mode).exists(found_mode))
+    icode = optab_handler (op, *found_mode);
+  return icode;
+}
+
 /* Find a widening optab even if it doesn't widen as much as we want.
    E.g. if from_mode is HImode, and to_mode is DImode, and there is no
    direct HI->SI insn, then return SI->DI, if that exists.  */
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index c94073e3ed98f8c4cab65891f65dedebdb1ec274..eb52dc15f8094594c4aa22d5fc1c442886e4ebf6 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -185,6 +185,9 @@ optab_for_tree_code (enum tree_code code, const_tree type,
     case VEC_WIDEN_MINUS_HI_EXPR:
       return (TYPE_UNSIGNED (type)
               ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
+
+    case VEC_WIDEN_MINUS_HALF_EXPR:
+      return vec_widen_usubl_half_optab;

     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
@@ -308,6 +311,16 @@ supportable_convert_operation (enum tree_code code,
   if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
     return false;

+  /* The case where vectype_in is half the vector width, as opposed to the
+     normal case for widening patterns of vector width input, with output in
+     multiple registers. */
+  if (code == WIDEN_MINUS_EXPR &&
+      known_eq(TYPE_VECTOR_SUBPARTS(vectype_in),TYPE_VECTOR_SUBPARTS(vectype_out)) )
+    {
+      *code1 = VEC_WIDEN_MINUS_HALF_EXPR;
+      return true;
+    }
+
   /* First check if we can done conversion directly.  */
   if ((code == FIX_TRUNC_EXPR
        && can_fix_p (m1,m2,TYPE_UNSIGNED (vectype_out), &truncp)
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f4614a394587787293dc8b680a38901f7906f61c..1252097be9893d7d65ea844fc0eda9bad70b9256 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -293,6 +293,13 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
     icode = find_widening_optab_handler (widen_pattern_optab,
                                          TYPE_MODE (TREE_TYPE (ops->op2)),
                                          tmode0);
+  // Perhaps something like this can eliminate the need for an additional backend pattern?
+  //else if (ops->code == VEC_WIDEN_MINUS_HI_EXPR)
+  //{
+  //  icode = find_half_mode_optab_and_mode (widen_pattern_optab, tmode0,
+  //                                         tmode0,
+  //                                         &tmode1);
+  //}
   else
     icode = optab_handler (widen_pattern_optab, tmode0);
   gcc_assert (icode != CODE_FOR_nothing);
diff --git a/gcc/optabs.def b/gcc/optabs.def
index b192a9d070b8aa72e5676b2eaa020b5bdd7ffcc8..43fccfa29127d99ce0131a21c2dc58fcb247bd25 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -403,6 +403,7 @@ OPTAB_D (vec_widen_umult_lo_optab, "vec_widen_umult_lo_$a")
 OPTAB_D (vec_widen_umult_odd_optab, "vec_widen_umult_odd_$a")
 OPTAB_D (vec_widen_ushiftl_hi_optab, "vec_widen_ushiftl_hi_$a")
 OPTAB_D (vec_widen_ushiftl_lo_optab, "vec_widen_ushiftl_lo_$a")
+OPTAB_D (vec_widen_usubl_half_optab, "vec_widen_usubl_half_$a")
 OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 75d814bd121f40c6a430f33f4c7d6395642f6c33..0e2313009d39c17d998c2285b9a9938e616dc35c 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -4007,6 +4007,7 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }

+    case VEC_WIDEN_MINUS_HALF_EXPR:
     case VEC_WIDEN_MINUS_HI_EXPR:
     case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_PLUS_HI_EXPR:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index a710fa590279234e5e8062a87bac68eb324df3cb..2c415abaf6091693c31d636644e54e18a90650b1 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4242,6 +4242,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,

     case VEC_WIDEN_PLUS_HI_EXPR:
     case VEC_WIDEN_PLUS_LO_EXPR:
+    case VEC_WIDEN_MINUS_HALF_EXPR:
     case VEC_WIDEN_MINUS_HI_EXPR:
     case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index c8d8493e6eaefca589ff73bcae4dc014140a1c5c..1911d2b0d637e058affadb21dac93e6880376eae 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -2121,6 +2121,7 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
       || code == VEC_WIDEN_PLUS_HI_EXPR
       || code == VEC_WIDEN_PLUS_LO_EXPR
       || code == VEC_WIDEN_MINUS_HI_EXPR
+      || code == VEC_WIDEN_MINUS_HALF_EXPR
       || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f180ced312443ba1e698932d5e8362208690b3fc..0a31c7b004eaa6fba7bbbaca0ef39a265774093f 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4545,6 +4545,51 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
   *vec_oprnds0 = vec_tmp;
 }

+/* Create vectorized promotion stmts for widening stmts using only half the
+   potential vector size for input */
+static void
+vect_create_vectorized_promotion_stmts (vec_info *vinfo,
+                                        vec<tree> *vec_oprnds0,
+                                        vec<tree> *vec_oprnds1,
+                                        stmt_vec_info stmt_info, tree vec_dest,
+                                        gimple_stmt_iterator *gsi,
+                                        enum tree_code code1,
+                                        int op_type)
+{
+  int i;
+  tree vop0, vop1, new_tmp;
+  gimple *new_stmt;
+  vec<tree> vec_tmp = vNULL;
+
+  vec_tmp.create (vec_oprnds0->length () * 2);
+  FOR_EACH_VEC_ELT (*vec_oprnds0, i, vop0)
+    {
+      if (op_type == binary_op)
+        vop1 = (*vec_oprnds1)[i];
+      else
+        vop1 = NULL_TREE;
+
+      /* Generate the two halves of promotion operation.  */
+      new_stmt = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+                                                op_type, vec_dest, gsi,
+                                                stmt_info);
+      if (is_gimple_call (new_stmt))
+        {
+          new_tmp = gimple_call_lhs (new_stmt);
+        }
+      else
+        {
+          new_tmp = gimple_assign_lhs (new_stmt);
+        }
+
+      /* Store the results for the next step.  */
+      vec_tmp.quick_push (new_tmp);
+    }
+
+  vec_oprnds0->release ();
+  *vec_oprnds0 = vec_tmp;
+}
+

 /* Check if STMT_INFO performs a conversion operation that can be vectorized.
    If VEC_STMT is also passed, vectorize STMT_INFO: create a vectorized
@@ -4731,7 +4776,8 @@ vectorizable_conversion (vec_info *vinfo,
     case NONE:
       if (code != FIX_TRUNC_EXPR
           && code != FLOAT_EXPR
-          && !CONVERT_EXPR_CODE_P (code))
+          && !CONVERT_EXPR_CODE_P (code)
+          && code != WIDEN_MINUS_EXPR)
         return false;
       if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
         break;
@@ -4937,22 +4983,55 @@ vectorizable_conversion (vec_info *vinfo,
   switch (modifier)
     {
     case NONE:
-      vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
-                         op0, &vec_oprnds0);
-      FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
-        {
-          /* Arguments are ready, create the new vector stmt.  */
-          gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-          gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
-          new_temp = make_ssa_name (vec_dest, new_stmt);
-          gimple_assign_set_lhs (new_stmt, new_temp);
-          vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+      if (code == WIDEN_MINUS_EXPR)
+        {
+          vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies * ninputs,
+                             op0, &vec_oprnds0,
+                             op1,
+                             &vec_oprnds1);
+          vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
+                                                  &vec_oprnds1, stmt_info,
+                                                  vec_dsts[0], gsi,
+                                                  code1, op_type);

-          if (slp_node)
-            SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
-          else
-            STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
-        }
+          FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
+            {
+              gimple *new_stmt;
+              if (cvt_type)
+                {
+                  gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+                  new_temp = make_ssa_name (vec_dest);
+                  new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+                  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+                }
+              else
+                new_stmt = SSA_NAME_DEF_STMT (vop0);
+
+              if (slp_node)
+                SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
+              else
+                STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
+            }
+        }
+      else
+        {
+          vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
+                             op0, &vec_oprnds0);
+          FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
+            {
+              /* Arguments are ready, create the new vector stmt.  */
+              gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
+              gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+              new_temp = make_ssa_name (vec_dest, new_stmt);
+              gimple_assign_set_lhs (new_stmt, new_temp);
+              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+
+              if (slp_node)
+                SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
+              else
+                STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
+            }
+        }
       break;

     case WIDEN:
diff --git a/gcc/tree.def b/gcc/tree.def
index eda050bdc55c68fa11ac5526e3a3f618aad0df4b..5b2c4e74a85be18738eb6fc36bbaedd036acf89a 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1433,6 +1433,7 @@ DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
+DEFTREECODE (VEC_WIDEN_MINUS_HALF_EXPR, "widen_minus_half_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
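
For anyone following the thread, here is a minimal scalar sketch in C of what the proposed "half" pattern is meant to compute, next to the existing lo/hi pair. It is only an illustration: it is not the testcase from the bug report and not part of the patch. The function names are hypothetical, and treating lanes 0-7 as "lo" is an assumption (the real lane mapping depends on endianness, as noted in the message).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the "half" widening subtract: eight unsigned
   8-bit lanes in (V8QI), eight 16-bit lanes out (V8HI).  Each input lane is
   zero-extended before subtracting, and the result wraps modulo 2^16.  */
void
widen_usub_half (uint16_t out[8], const uint8_t a[8], const uint8_t b[8])
{
  for (size_t i = 0; i < 8; i++)
    out[i] = (uint16_t) ((uint16_t) a[i] - (uint16_t) b[i]);
}

/* Model of the existing lo/hi pair: sixteen 8-bit lanes in (V16QI), with
   each operation widening one eight-lane half.  ASSUMPTION: lanes 0-7 are
   "lo" and lanes 8-15 are "hi"; a real target picks this by endianness.  */
void
widen_usub_lo (uint16_t out[8], const uint8_t a[16], const uint8_t b[16])
{
  widen_usub_half (out, a, b);
}

void
widen_usub_hi (uint16_t out[8], const uint8_t a[16], const uint8_t b[16])
{
  widen_usub_half (out, a + 8, b + 8);
}
```

The point of the sketch is that the half form needs a single operation on an eight-lane input, whereas the lo/hi form assumes a full sixteen-lane input and produces its result in two registers.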