From patchwork Sat Feb 15 15:26:19 2020
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 1238557
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
Subject: [PATCH 01/10] i386: Properly encode vector registers in vector move
Date: Sat, 15 Feb 2020 07:26:19 -0800
Message-Id: <20200215152628.32068-2-hjl.tools@gmail.com>
In-Reply-To: <20200215152628.32068-1-hjl.tools@gmail.com>
References: <20200215152628.32068-1-hjl.tools@gmail.com>
MIME-Version: 1.0
X-IsSubscribed: yes

On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):

   0:	c5 f9 6f d1          	vmovdqa %xmm1,%xmm2
   4:	62 f1 fd 08 6f d1    	vmovdqa64 %xmm1,%xmm2

We prefer VEX encoding over EVEX since VEX is shorter.

Also AVX512F only supports 512-bit vector moves.  AVX512F + AVX512VL
supports 128-bit and 256-bit vector moves.  Mode attributes on x86
vector move patterns indicate target preferences of vector move
encoding.  For vector register to vector register move, we can use
512-bit vector move instructions to move 128-bit/256-bit vector if
AVX512VL isn't available.  With AVX512F and AVX512VL, we should use
VEX encoding for 128-bit/256-bit vector moves if upper 16 vector
registers aren't used.

This patch adds a function, ix86_output_ssemov, to generate vector
moves:

1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX
   encoding will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
   a. With AVX512VL, AVX512VL vector moves will be generated.
   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
      move will be done with zmm register move.
Tested on AVX2 and AVX512 with and without --with-arch=native.

gcc/

	PR target/89229
	PR target/89346
	* config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
	* config/i386/i386.c (ix86_get_ssemov): New function.
	(ix86_output_ssemov): Likewise.
	* config/i386/sse.md (VMOVE:mov<mode>_internal): Call
	ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
	check.

gcc/testsuite/

	PR target/89229
	PR target/89346
	* gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
	* gcc.target/i386/pr89346.c: New test.
---
 gcc/config/i386/i386-protos.h                 |   2 +
 gcc/config/i386/i386.c                        | 274 ++++++++++++++++++
 gcc/config/i386/sse.md                        |  98 +------
 .../gcc.target/i386/avx512vl-vmovdqa64-1.c    |   7 +-
 gcc/testsuite/gcc.target/i386/pr89346.c       |  15 +
 5 files changed, 296 insertions(+), 100 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 266381ca5a6..39fcaa0ad5f 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -38,6 +38,8 @@ extern void ix86_expand_split_stack_prologue (void);
 extern void ix86_output_addr_vec_elt (FILE *, int);
 extern void ix86_output_addr_diff_elt (FILE *, int, int);
 
+extern const char *ix86_output_ssemov (rtx_insn *, rtx *);
+
 extern enum calling_abi ix86_cfun_abi (void);
 extern enum calling_abi ix86_function_type_abi (const_tree);
 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index dac7a3fc5fd..26f8c9494b9 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4915,6 +4915,280 @@ ix86_pre_reload_split (void)
 	  && !(cfun->curr_properties & PROP_rtl_split_insns));
 }
 
+/* Return the opcode of the TYPE_SSEMOV instruction.  To move from
+   or to xmm16-xmm31/ymm16-ymm31 registers, we either require
+   TARGET_AVX512VL or it is a register to register move which can
+   be done with zmm register move.  */
+
+static const char *
+ix86_get_ssemov (rtx *operands, unsigned size,
+		 enum attr_mode insn_mode, machine_mode mode)
+{
+  char buf[128];
+  bool misaligned_p = (misaligned_operand (operands[0], mode)
+		       || misaligned_operand (operands[1], mode));
+  bool evex_reg_p = (EXT_REX_SSE_REG_P (operands[0])
+		     || EXT_REX_SSE_REG_P (operands[1]));
+  machine_mode scalar_mode;
+
+  const char *opcode = NULL;
+  enum
+    {
+      opcode_int,
+      opcode_float,
+      opcode_double
+    } type = opcode_int;
+
+  switch (insn_mode)
+    {
+    case MODE_V16SF:
+    case MODE_V8SF:
+    case MODE_V4SF:
+      scalar_mode = E_SFmode;
+      break;
+    case MODE_V8DF:
+    case MODE_V4DF:
+    case MODE_V2DF:
+      scalar_mode = E_DFmode;
+      break;
+    case MODE_XI:
+    case MODE_OI:
+    case MODE_TI:
+      scalar_mode = GET_MODE_INNER (mode);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  if (SCALAR_FLOAT_MODE_P (scalar_mode))
+    {
+      switch (scalar_mode)
+	{
+	case E_SFmode:
+	  if (size == 64 || !evex_reg_p || TARGET_AVX512VL)
+	    opcode = misaligned_p ? "%vmovups" : "%vmovaps";
+	  else
+	    type = opcode_float;
+	  break;
+	case E_DFmode:
+	  if (size == 64 || !evex_reg_p || TARGET_AVX512VL)
+	    opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
+	  else
+	    type = opcode_double;
+	  break;
+	case E_TFmode:
+	  if (size == 64)
+	    opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+	  else if (evex_reg_p)
+	    {
+	      if (TARGET_AVX512VL)
+		opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+	    }
+	  else
+	    opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+  else if (SCALAR_INT_MODE_P (scalar_mode))
+    {
+      switch (scalar_mode)
+	{
+	case E_QImode:
+	  if (size == 64)
+	    opcode = (misaligned_p
+		      ? (TARGET_AVX512BW
+			 ? "vmovdqu8"
+			 : "vmovdqu64")
+		      : "vmovdqa64");
+	  else if (evex_reg_p)
+	    {
+	      if (TARGET_AVX512VL)
+		opcode = (misaligned_p
+			  ? (TARGET_AVX512BW
+			     ? "vmovdqu8"
+			     : "vmovdqu64")
+			  : "vmovdqa64");
+	    }
+	  else
+	    opcode = (misaligned_p
+		      ? (TARGET_AVX512BW
+			 ? "vmovdqu8"
+			 : "%vmovdqu")
+		      : "%vmovdqa");
+	  break;
+	case E_HImode:
+	  if (size == 64)
+	    opcode = (misaligned_p
+		      ? (TARGET_AVX512BW
+			 ? "vmovdqu16"
+			 : "vmovdqu64")
+		      : "vmovdqa64");
+	  else if (evex_reg_p)
+	    {
+	      if (TARGET_AVX512VL)
+		opcode = (misaligned_p
+			  ? (TARGET_AVX512BW
+			     ? "vmovdqu16"
+			     : "vmovdqu64")
+			  : "vmovdqa64");
+	    }
+	  else
+	    opcode = (misaligned_p
+		      ? (TARGET_AVX512BW
+			 ? "vmovdqu16"
+			 : "%vmovdqu")
+		      : "%vmovdqa");
+	  break;
+	case E_SImode:
+	  if (size == 64)
+	    opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
+	  else if (evex_reg_p)
+	    {
+	      if (TARGET_AVX512VL)
+		opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
+	    }
+	  else
+	    opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+	  break;
+	case E_DImode:
+	case E_TImode:
+	case E_OImode:
+	  if (size == 64)
+	    opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+	  else if (evex_reg_p)
+	    {
+	      if (TARGET_AVX512VL)
+		opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+	    }
+	  else
+	    opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+	  break;
+	case E_XImode:
+	  opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+  else
+    gcc_unreachable ();
+
+  if (!opcode)
+    {
+      /* NB: We get here only because we move xmm16-xmm31/ymm16-ymm31
+	 registers without AVX512VL by using zmm register move.  */
+      if (!evex_reg_p
+	  || TARGET_AVX512VL
+	  || memory_operand (operands[0], mode)
+	  || memory_operand (operands[1], mode))
+	gcc_unreachable ();
+      size = 64;
+      switch (type)
+	{
+	case opcode_int:
+	  opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
+	  break;
+	case opcode_float:
+	  opcode = misaligned_p ? "%vmovups" : "%vmovaps";
+	  break;
+	case opcode_double:
+	  opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
+	  break;
+	}
+    }
+
+  switch (size)
+    {
+    case 64:
+      snprintf (buf, sizeof (buf), "%s\t{%%g1, %%g0|%%g0, %%g1}",
+		opcode);
+      break;
+    case 32:
+      snprintf (buf, sizeof (buf), "%s\t{%%t1, %%t0|%%t0, %%t1}",
+		opcode);
+      break;
+    case 16:
+      snprintf (buf, sizeof (buf), "%s\t{%%x1, %%x0|%%x0, %%x1}",
+		opcode);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  output_asm_insn (buf, operands);
+  return "";
+}
+
+/* Return the template of the TYPE_SSEMOV instruction to move
+   operands[1] into operands[0].  */
+
+const char *
+ix86_output_ssemov (rtx_insn *insn, rtx *operands)
+{
+  machine_mode mode = GET_MODE (operands[0]);
+  if (get_attr_type (insn) != TYPE_SSEMOV
+      || mode != GET_MODE (operands[1]))
+    gcc_unreachable ();
+
+  enum attr_mode insn_mode = get_attr_mode (insn);
+
+  switch (insn_mode)
+    {
+    case MODE_XI:
+    case MODE_V8DF:
+    case MODE_V16SF:
+      return ix86_get_ssemov (operands, 64, insn_mode, mode);
+
+    case MODE_OI:
+    case MODE_V4DF:
+    case MODE_V8SF:
+      return ix86_get_ssemov (operands, 32, insn_mode, mode);
+
+    case MODE_TI:
+    case MODE_V2DF:
+    case MODE_V4SF:
+      return ix86_get_ssemov (operands, 16, insn_mode, mode);
+
+    case MODE_DI:
+      /* Handle broken assemblers that require movd instead of movq.  */
+      if (!HAVE_AS_IX86_INTERUNIT_MOVQ
+	  && (GENERAL_REG_P (operands[0])
+	      || GENERAL_REG_P (operands[1])))
+	return "%vmovd\t{%1, %0|%0, %1}";
+      else
+	return "%vmovq\t{%1, %0|%0, %1}";
+
+    case MODE_V2SF:
+      if (TARGET_AVX && REG_P (operands[0]))
+	return "vmovlps\t{%1, %d0|%d0, %1}";
+      else
+	return "%vmovlps\t{%1, %0|%0, %1}";
+
+    case MODE_DF:
+      if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+	return "vmovsd\t{%d1, %0|%0, %d1}";
+      else
+	return "%vmovsd\t{%1, %0|%0, %1}";
+
+    case MODE_V1DF:
+      gcc_assert (!TARGET_AVX);
+      return "movlpd\t{%1, %0|%0, %1}";
+
+    case MODE_SI:
+      return "%vmovd\t{%1, %0|%0, %1}";
+
+    case MODE_SF:
+      if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+	return "vmovss\t{%d1, %0|%0, %d1}";
+      else
+	return "%vmovss\t{%1, %0|%0, %1}";
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Returns true if OP contains a symbol reference */
 
 bool
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index ee1f138d1af..8f5902292c6 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1013,98 +1013,7 @@ (define_insn "mov<mode>_internal"
       return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
-      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
-	 in avx512f, so we need to use workarounds, to access sse registers
-	 16-31, which are evex-only.  In avx512vl we don't need workarounds.  */
-      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
-	  && (EXT_REX_SSE_REG_P (operands[0])
-	      || EXT_REX_SSE_REG_P (operands[1])))
-	{
-	  if (memory_operand (operands[0], <MODE>mode))
-	    {
-	      if (<MODE_SIZE> == 32)
-		return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
-	      else if (<MODE_SIZE> == 16)
-		return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
-	      else
-		gcc_unreachable ();
-	    }
-	  else if (memory_operand (operands[1], <MODE>mode))
-	    {
-	      if (<MODE_SIZE> == 32)
-		return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
-	      else if (<MODE_SIZE> == 16)
-		return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
-	      else
-		gcc_unreachable ();
-	    }
-	  else
-	    /* Reg -> reg move is always aligned.  Just use wider move.  */
-	    switch (get_attr_mode (insn))
-	      {
-	      case MODE_V8SF:
-	      case MODE_V4SF:
-		return "vmovaps\t{%g1, %g0|%g0, %g1}";
-	      case MODE_V4DF:
-	      case MODE_V2DF:
-		return "vmovapd\t{%g1, %g0|%g0, %g1}";
-	      case MODE_OI:
-	      case MODE_TI:
-		return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
-	      default:
-		gcc_unreachable ();
-	      }
-	}
-
-      switch (get_attr_mode (insn))
-	{
-	case MODE_V16SF:
-	case MODE_V8SF:
-	case MODE_V4SF:
-	  if (misaligned_operand (operands[0], <MODE>mode)
-	      || misaligned_operand (operands[1], <MODE>mode))
-	    return "%vmovups\t{%1, %0|%0, %1}";
-	  else
-	    return "%vmovaps\t{%1, %0|%0, %1}";
-
-	case MODE_V8DF:
-	case MODE_V4DF:
-	case MODE_V2DF:
-	  if (misaligned_operand (operands[0], <MODE>mode)
-	      || misaligned_operand (operands[1], <MODE>mode))
-	    return "%vmovupd\t{%1, %0|%0, %1}";
-	  else
-	    return "%vmovapd\t{%1, %0|%0, %1}";
-
-	case MODE_OI:
-	case MODE_TI:
-	  if (misaligned_operand (operands[0], <MODE>mode)
-	      || misaligned_operand (operands[1], <MODE>mode))
-	    return TARGET_AVX512VL
-		   && (<MODE>mode == V4SImode
-		       || <MODE>mode == V2DImode
-		       || <MODE>mode == V8SImode
-		       || <MODE>mode == V4DImode
-		       || TARGET_AVX512BW)
-		   ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
-		   : "%vmovdqu\t{%1, %0|%0, %1}";
-	  else
-	    return TARGET_AVX512VL ? "vmovdqa64\t{%1, %0|%0, %1}"
-				   : "%vmovdqa\t{%1, %0|%0, %1}";
-	case MODE_XI:
-	  if (misaligned_operand (operands[0], <MODE>mode)
-	      || misaligned_operand (operands[1], <MODE>mode))
-	    return (<MODE>mode == V16SImode
-		    || <MODE>mode == V8DImode
-		    || TARGET_AVX512BW)
-		   ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
-		   : "vmovdqu64\t{%1, %0|%0, %1}";
-	  else
-	    return "vmovdqa64\t{%1, %0|%0, %1}";
-
-	default:
-	  gcc_unreachable ();
-	}
+      return ix86_output_ssemov (insn, operands);
 
     default:
       gcc_unreachable ();
@@ -1113,10 +1022,7 @@ (define_insn "mov<mode>_internal"
   [(set_attr "type" "sselog1,sselog1,ssemov,ssemov")
    (set_attr "prefix" "maybe_vex")
    (set (attr "mode")
-	(cond [(and (eq_attr "alternative" "1")
-		    (match_test "TARGET_AVX512VL"))
-		 (const_string "<sseinsnmode>")
-	       (match_test "TARGET_AVX")
+	(cond [(match_test "TARGET_AVX")
 		 (const_string "<sseinsnmode>")
 	       (ior (not (match_test "TARGET_SSE2"))
 		    (match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
index 14fe4b84544..db4d9d14875 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
@@ -4,14 +4,13 @@
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\\(\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\\(\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
+/* { dg-final { scan-assembler-times "vmovdqa\[ \\t\]+\\(\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
+/* { dg-final { scan-assembler-times "vmovdqa\[ \\t\]+\\(\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\nxy\]*\\(.{5,6}(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\nxy\]*\\((?:\n|\[ \\t\]+#)" 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times "vmovdqa\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\nxy\]*\\(.{5,6}(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*\\)\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*\\)\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89346.c b/gcc/testsuite/gcc.target/i386/pr89346.c
new file mode 100644
index 00000000000..cdc9accf521
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89346.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+#include <immintrin.h>
+
+long long *p;
+volatile __m256i y;
+
+void
+foo (void)
+{
+  _mm256_store_epi64 (p, y);
+}
+
+/* { dg-final { scan-assembler-not "vmovdqa64" } } */