From patchwork Thu Jun 15 06:03:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 1795200
Date: Thu, 15 Jun 2023 08:03:11 +0200
To: "gcc-patches@gcc.gnu.org"
Cc: Hongtao Liu, Kirill Yukhin
Subject: [PATCH] x86: correct and improve "*vec_dupv2di"
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Jan Beulich via Gcc-patches
From: Jan Beulich
Reply-To: Jan Beulich

The input constraint for the %vmovddup alternative was wrong, as the
upper 16 XMM registers require AVX512VL to be used with this insn. To
compensate, introduce a new alternative permitting all 32 registers, by
broadcasting to the full 512 bits in that case if AVX512VL is not
available.

gcc/

	* config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
	constraint. Add new AVX512F alternative.
---
Strictly speaking the new alternative could be enabled from AVX2
onwards, but vmovddup can frequently be a shorter encoding (VEX2 vs
VEX3).
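For readers less familiar with the RTL, here is a minimal sketch in C of
the operation this pattern describes: duplicating one 64-bit value into
both lanes of a 128-bit vector. It assumes the GNU C vector extensions;
the names v2di, dup_v2di, dup_v2di_mem and lane are illustrative only and
do not appear in the patch. Whether the compiler actually matches
"*vec_dupv2di" for this source, and which alternative it then selects,
depends on the compiler version and the -m options in effect.

```c
#include <stdint.h>

/* Two 64-bit integers in 128 bits; intended to correspond to V2DI.  */
typedef int64_t v2di __attribute__ ((vector_size (16)));

/* Register source: both lanes receive X, i.e. the effect of
   (vec_duplicate:V2DI x).  */
v2di
dup_v2di (int64_t x)
{
  return (v2di) { x, x };
}

/* Memory source: the scalar is loaded through P, which is the case
   the "m" part of the input constraints is meant to cover.  */
v2di
dup_v2di_mem (const int64_t *p)
{
  return (v2di) { *p, *p };
}

/* Hypothetical helper for inspecting a single lane.  */
int64_t
lane (v2di v, int i)
{
  return v[i];
}
```

Compiling such a function with and without -mavx512vl (plus -mavx512f)
and comparing the generated assembly is one way to observe which
instruction ends up being used.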
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -25851,19 +25851,39 @@
 	   (symbol_ref "true")))])
 
 (define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,x")
+  [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,v,x")
 	(vec_duplicate:V2DI
-	  (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
+	  (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
   "TARGET_SSE"
-  "@
-   punpcklqdq\t%0, %0
-   vpunpcklqdq\t{%d1, %0|%0, %d1}
-   %vmovddup\t{%1, %0|%0, %1}
-   movlhps\t%0, %0"
-  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
-   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
-   (set_attr "mode" "TI,TI,DF,V4SF")])
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "punpcklqdq\t%0, %0";
+    case 1:
+      return "vpunpcklqdq\t{%d1, %0|%0, %d1}";
+    case 2:
+      if (TARGET_AVX512VL)
+	return "vpbroadcastq\t{%1, %0|%0, %1}";
+      return "vpbroadcastq\t{%1, %g0|%g0, %1}";
+    case 3:
+      return "%vmovddup\t{%1, %0|%0, %1}";
+    case 4:
+      return "movlhps\t%0, %0";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
+   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
+   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
+   (set_attr "mode" "TI,TI,TI,DF,V4SF")
+   (set (attr "enabled")
+	(if_then_else
+	  (eq_attr "alternative" "2")
+	  (symbol_ref "TARGET_AVX512VL
+		       || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
+	  (const_string "*")))])
 
 (define_insn "avx2_vbroadcasti128_"
   [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")