From patchwork Fri Jun 16 06:19:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 1795710
Date: Fri, 16 Jun 2023 08:19:59 +0200
Subject: [PATCH v2] x86: correct and improve "*vec_dupv2di"
To: "gcc-patches@gcc.gnu.org"
Cc: Hongtao Liu, Kirill Yukhin
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Jan Beulich via Gcc-patches
From: Jan Beulich
Reply-To: Jan Beulich

The input constraint for the %vmovddup alternative was wrong, as the
upper 16 XMM registers require AVX512VL to be used with this insn. To
compensate, introduce a new alternative permitting all 32 registers, by
broadcasting to the full 512 bits in that case if AVX512VL is not
available.

gcc/

	* config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
	constraint. Add new AVX512F alternative.
---
Strictly speaking the new alternative could be enabled from AVX2
onwards, but vmovddup can frequently be a shorter encoding (VEX2 vs
VEX3).

It was suggested that the previously flawed %vmovddup alternative could
use "xm" as source constraint.
But then its destination would better also use "x", I think?
---
v2: Use "* return ..." form. Set "mode" to XI for new alternative
    without AVX512VL.

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -26033,19 +26033,35 @@
               (symbol_ref "true")))])
 
 (define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x")
+  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,v,x")
         (vec_duplicate:V2DI
-          (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
+          (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
   "TARGET_SSE"
   "@
    punpcklqdq\t%0, %0
    vpunpcklqdq\t{%d1, %0|%0, %d1}
+   * return TARGET_AVX512VL ? \"vpbroadcastq\t{%1, %0|%0, %1}\" : \"vpbroadcastq\t{%1, %g0|%g0, %1}\";
    %vmovddup\t{%1, %0|%0, %1}
    movlhps\t%0, %0"
-  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
-   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
-   (set_attr "mode" "TI,TI,DF,V4SF")])
+  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
+   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
+   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
+   (set (attr "mode")
+        (cond [(and (eq_attr "alternative" "2")
+                    (match_test "!TARGET_AVX512VL"))
+                 (const_string "XI")
+               (eq_attr "alternative" "3")
+                 (const_string "DF")
+               (eq_attr "alternative" "4")
+                 (const_string "V4SF")
+              ]
+              (const_string "TI")))
+   (set (attr "enabled")
+        (if_then_else
+          (eq_attr "alternative" "2")
+          (symbol_ref "TARGET_AVX512VL
+                       || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
+          (const_string "*")))])
 
 (define_insn "avx2_vbroadcasti128_"
   [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")
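
Reviewer note (not part of the patch): the "enabled" and "mode"
attributes added above can be read as a small decision table. Below is
a minimal sketch of that table in plain C, assuming nothing beyond the
expressions visible in the hunk. The struct target_flags and the
functions alt2_enabled, alt_mode and alt2_template are hypothetical
stand-ins for illustration; they are not GCC interfaces, and the
"isa"-based default that "*" falls back to for the other alternatives
is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>   /* for callers comparing the returned names */

/* Hypothetical stand-ins for TARGET_AVX512F, TARGET_AVX512VL and
   TARGET_PREFER_AVX256; assumptions for illustration, not GCC API.  */
struct target_flags
{
  bool avx512f;
  bool avx512vl;
  bool prefer_avx256;
};

/* Mirrors the symbol_ref in the "enabled" attribute for the new
   alternative (index 2).  */
static bool
alt2_enabled (const struct target_flags *t)
{
  return t->avx512vl || (t->avx512f && !t->prefer_avx256);
}

/* Mirrors the cond in the "mode" attribute: alternative 2 is XI only
   when AVX512VL is unavailable, 3 is DF, 4 is V4SF, all else TI.  */
static const char *
alt_mode (int alternative, const struct target_flags *t)
{
  assert (alternative >= 0 && alternative <= 4);
  if (alternative == 2 && !t->avx512vl)
    return "XI";
  if (alternative == 3)
    return "DF";
  if (alternative == 4)
    return "V4SF";
  return "TI";
}

/* Mirrors the "* return ..." output statement of alternative 2: with
   AVX512VL the destination is printed as is, otherwise via %g0, i.e.
   the broadcast is widened to the full 512 bits.  */
static const char *
alt2_template (const struct target_flags *t)
{
  return t->avx512vl
         ? "vpbroadcastq\t{%1, %0|%0, %1}"
         : "vpbroadcastq\t{%1, %g0|%g0, %1}";
}
```

With only AVX512F set, alt2_enabled holds and alt_mode (2, ...) yields
"XI", matching the v2 change note. Setting prefer_avx256 without
AVX512VL disables the alternative, so the 512-bit form is not chosen
when 256-bit vectors are preferred.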