From patchwork Fri Dec 1 13:54:28 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 843502
Date: Fri, 1 Dec 2017 14:54:28 +0100
From: Jakub Jelinek
To: Jan Beulich, Uros Bizjak, Kirill Yukhin
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Re: loading of zeros into {x,y,z}mm registers
Message-ID: <20171201135428.GX2353@tucnak>
Reply-To: Jakub Jelinek
References: <5A1EE7890200007800193391@prv-mh.provo.novell.com>
 <20171201054550.GA22657@titus>
 <5A2154580200007800193C57@prv-mh.provo.novell.com>
 <20171201121843.GW2353@tucnak>
In-Reply-To: <20171201121843.GW2353@tucnak>
User-Agent: Mutt/1.7.1 (2016-10-04)

On Fri,
Dec 01, 2017 at 01:18:43PM +0100, Jakub Jelinek wrote:
> > Furthermore this
> >
> > typedef double __attribute__((vector_size(16))) v2df_t;
> > typedef double __attribute__((vector_size(32))) v4df_t;
> >
> > void test1(void) {
> >   register v2df_t x asm("xmm31") = {};
> >   asm volatile("" :: "v" (x));
> > }
> >
> > void test2(void) {
> >   register v4df_t x asm("ymm31") = {};
> >   asm volatile("" :: "v" (x));
> > }
> >
> > translates to "vxorpd %xmm31, %xmm31, %xmm31" for both
> > functions with -mavx512vl, yet afaict the instructions would #UD
> > without AVX-512DQ, which suggests to me that the original
> > intention wasn't fully met.
>
> This indeed is a bug, please file a PR; we should IMHO just use
> vpxorq instead in that case, which is just AVX512VL and doesn't need
> DQ.  Of course if DQ is available, we should use vxorpd.
> Working on a fix.

Will try this:

2017-12-01  Jakub Jelinek

	* config/i386/i386-protos.h (standard_sse_constant_opcode): Change
	last argument to rtx pointer.
	* config/i386/i386.c (standard_sse_constant_opcode): Replace X argument
	with OPERANDS.  For AVX+ prefer 128-bit VEX encoded instructions over
	256-bit or 512-bit.  If setting EXT_REX_SSE_REG_P, use EVEX encoded
	insn depending on the chosen ISAs.
	* config/i386/i386.md (*movxi_internal_avx512f, *movoi_internal_avx,
	*movti_internal, *movdi_internal, *movsi_internal, *movtf_internal,
	*movdf_internal, *movsf_internal): Adjust standard_sse_constant_opcode
	callers.
	* config/i386/sse.md (mov<mode>_internal): Likewise.
	* config/i386/mmx.md (*mov<mode>_internal): Likewise.
	Jakub

--- gcc/config/i386/i386-protos.h.jj	2017-10-28 09:00:44.000000000 +0200
+++ gcc/config/i386/i386-protos.h	2017-12-01 14:39:36.498608799 +0100
@@ -52,7 +52,7 @@ extern int standard_80387_constant_p (rt
 extern const char *standard_80387_constant_opcode (rtx);
 extern rtx standard_80387_constant_rtx (int);
 extern int standard_sse_constant_p (rtx, machine_mode);
-extern const char *standard_sse_constant_opcode (rtx_insn *, rtx);
+extern const char *standard_sse_constant_opcode (rtx_insn *, rtx *);
 extern bool ix86_standard_x87sse_constant_load_p (const rtx_insn *, rtx);
 extern bool symbolic_reference_mentioned_p (rtx);
 extern bool extended_reg_mentioned_p (rtx);
--- gcc/config/i386/i386.c.jj	2017-12-01 09:19:07.000000000 +0100
+++ gcc/config/i386/i386.c	2017-12-01 14:36:38.884847618 +0100
@@ -10380,12 +10380,13 @@ standard_sse_constant_p (rtx x, machine_
 }
 
 /* Return the opcode of the special instruction to be used to load
-   the constant X.  */
+   the constant operands[1] into operands[0].  */
 
 const char *
-standard_sse_constant_opcode (rtx_insn *insn, rtx x)
+standard_sse_constant_opcode (rtx_insn *insn, rtx *operands)
 {
   machine_mode mode;
+  rtx x = operands[1];
 
   gcc_assert (TARGET_SSE);
 
@@ -10395,34 +10396,51 @@ standard_sse_constant_opcode (rtx_insn *
     {
       switch (get_attr_mode (insn))
         {
+        case MODE_TI:
+          if (!EXT_REX_SSE_REG_P (operands[0]))
+            return "%vpxor\t%0, %d0";
+          /* FALLTHRU */
         case MODE_XI:
-          return "vpxord\t%g0, %g0, %g0";
         case MODE_OI:
-          return (TARGET_AVX512VL
-                  ? "vpxord\t%x0, %x0, %x0"
-                  : "vpxor\t%x0, %x0, %x0");
-        case MODE_TI:
-          return (TARGET_AVX512VL
-                  ? "vpxord\t%x0, %x0, %x0"
-                  : "%vpxor\t%0, %d0");
+          if (EXT_REX_SSE_REG_P (operands[0]))
+            return (TARGET_AVX512VL
+                    ? "vpxord\t%x0, %x0, %x0"
+                    : "vpxord\t%g0, %g0, %g0");
+          return "vpxor\t%x0, %x0, %x0";
 
+        case MODE_V2DF:
+          if (!EXT_REX_SSE_REG_P (operands[0]))
+            return "%vxorpd\t%0, %d0";
+          /* FALLTHRU */
         case MODE_V8DF:
-          return (TARGET_AVX512DQ
-                  ? "vxorpd\t%g0, %g0, %g0"
-                  : "vpxorq\t%g0, %g0, %g0");
         case MODE_V4DF:
-          return "vxorpd\t%x0, %x0, %x0";
-        case MODE_V2DF:
-          return "%vxorpd\t%0, %d0";
+          if (EXT_REX_SSE_REG_P (operands[0]))
+            if (TARGET_AVX512DQ)
+              return (TARGET_AVX512VL
+                      ? "vxorpd\t%x0, %x0, %x0"
+                      : "vxorpd\t%g0, %g0, %g0");
+            else
+              return (TARGET_AVX512VL
+                      ? "vpxorq\t%x0, %x0, %x0"
+                      : "vpxorq\t%g0, %g0, %g0");
+          return "vxorpd\t%x0, %x0, %x0";
 
+        case MODE_V4SF:
+          if (!EXT_REX_SSE_REG_P (operands[0]))
+            return "%vxorps\t%0, %d0";
+          /* FALLTHRU */
         case MODE_V16SF:
-          return (TARGET_AVX512DQ
-                  ? "vxorps\t%g0, %g0, %g0"
-                  : "vpxord\t%g0, %g0, %g0");
         case MODE_V8SF:
-          return "vxorps\t%x0, %x0, %x0";
-        case MODE_V4SF:
-          return "%vxorps\t%0, %d0";
+          if (EXT_REX_SSE_REG_P (operands[0]))
+            if (TARGET_AVX512DQ)
+              return (TARGET_AVX512VL
+                      ? "vxorps\t%x0, %x0, %x0"
+                      : "vxorps\t%g0, %g0, %g0");
+            else
+              return (TARGET_AVX512VL
+                      ? "vpxord\t%x0, %x0, %x0"
+                      : "vpxord\t%g0, %g0, %g0");
+          return "vxorps\t%x0, %x0, %x0";
 
         default:
           gcc_unreachable ();
@@ -10449,11 +10467,14 @@ standard_sse_constant_opcode (rtx_insn *
         case MODE_V2DF:
         case MODE_V4SF:
           gcc_assert (TARGET_SSE2);
-          return (TARGET_AVX512F
-                  ? "vpternlogd\t{$0xFF, %0, %0, %0|%0, %0, %0, 0xFF}"
-                  : TARGET_AVX
+          if (!EXT_REX_SSE_REG_P (operands[0]))
+            return (TARGET_AVX
                     ? "vpcmpeqd\t%0, %0, %0"
                     : "pcmpeqd\t%0, %0");
+          if (TARGET_AVX512VL)
+            return "vpternlogd\t{$0xFF, %0, %0, %0|%0, %0, %0, 0xFF}";
+          else
+            return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";
 
         default:
           gcc_unreachable ();
--- gcc/config/i386/i386.md.jj	2017-12-01 09:06:14.000000000 +0100
+++ gcc/config/i386/i386.md	2017-12-01 14:39:25.359749204 +0100
@@ -2044,7 +2044,7 @@ (define_insn "*movxi_internal_avx512f"
   switch (get_attr_type (insn))
     {
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       if (misaligned_operand (operands[0], XImode)
@@ -2071,7 +2071,7 @@ (define_insn "*movoi_internal_avx"
   switch (get_attr_type (insn))
     {
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       if (misaligned_operand (operands[0], OImode)
@@ -2131,7 +2131,7 @@ (define_insn "*movti_internal"
       return "#";
 
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       /* TDmode values are passed as TImode on the stack.  Moving them
@@ -2243,7 +2243,7 @@ (define_insn "*movdi_internal"
       return "movq\t{%1, %0|%0, %1}";
 
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       switch (get_attr_mode (insn))
@@ -2456,7 +2456,7 @@ (define_insn "*movsi_internal"
   switch (get_attr_type (insn))
     {
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_MSKMOV:
       return "kmovd\t{%1, %0|%0, %1}";
@@ -3327,7 +3327,7 @@ (define_insn "*movtf_internal"
   switch (get_attr_type (insn))
     {
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       /* Handle misaligned load/store since we
@@ -3504,7 +3504,7 @@ (define_insn "*movdf_internal"
       return "mov{q}\t{%1, %0|%0, %1}";
 
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       switch (get_attr_mode (insn))
@@ -3698,7 +3698,7 @@ (define_insn "*movsf_internal"
       return "mov{l}\t{%1, %0|%0, %1}";
 
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       switch (get_attr_mode (insn))
--- gcc/config/i386/sse.md.jj	2017-11-30 09:42:46.000000000 +0100
+++ gcc/config/i386/sse.md	2017-12-01 13:29:09.064964872 +0100
@@ -923,7 +923,7 @@ (define_insn "mov<mode>_internal"
   switch (get_attr_type (insn))
     {
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
--- gcc/config/i386/mmx.md.jj	2017-08-01 10:25:42.000000000 +0200
+++ gcc/config/i386/mmx.md	2017-12-01 13:28:44.541274286 +0100
@@ -112,7 +112,7 @@ (define_insn "*mov<mode>_internal"
       return "movdq2q\t{%1, %0|%0, %1}";
 
     case TYPE_SSELOG1:
-      return standard_sse_constant_opcode (insn, operands[1]);
+      return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
       switch (get_attr_mode (insn))
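
For anyone skimming the i386.c hunks, the all-zeros selection can be
summarized as a small decision function.  This is only a standalone sketch,
not GCC code: choose_zero_insn, enum zero_kind and struct zero_choice are
hypothetical names, the three flags stand in for
EXT_REX_SSE_REG_P (operands[0]), TARGET_AVX512VL and TARGET_AVX512DQ, and it
assumes AVX is enabled (the patch itself also covers plain SSE through the
%v template prefix).

```c
#include <stdbool.h>

/* Hypothetical grouping of the insn mode attribute: integer modes
   (TI/OI/XI), double vectors (V2DF/V4DF/V8DF), float vectors
   (V4SF/V8SF/V16SF).  */
enum zero_kind { ZERO_INT, ZERO_DF, ZERO_SF };

struct zero_choice
{
  const char *mnemonic;  /* instruction used to clear the register */
  int width;             /* operand width in bits (128 = %x0, 512 = %g0) */
};

/* Sketch of the all-zeros case: pick the zeroing idiom from the
   destination register class and the enabled ISA extensions.  */
struct zero_choice
choose_zero_insn (enum zero_kind kind, bool ext_reg,
                  bool avx512vl, bool avx512dq)
{
  struct zero_choice c;

  if (!ext_reg)
    {
      /* Registers 0-15: the 128-bit VEX form is enough, since VEX
         encoded insns clear the upper bits of the full register.  */
      c.width = 128;
      c.mnemonic = (kind == ZERO_INT ? "vpxor"
                    : kind == ZERO_DF ? "vxorpd" : "vxorps");
      return c;
    }

  /* Registers 16-31 need an EVEX encoding.  Without AVX512VL only the
     512-bit form exists, so the whole zmm register is cleared.  */
  c.width = avx512vl ? 128 : 512;

  if (kind == ZERO_INT)
    c.mnemonic = "vpxord";
  else if (kind == ZERO_DF)
    /* EVEX vxorpd requires AVX512DQ; vpxorq does not.  */
    c.mnemonic = avx512dq ? "vxorpd" : "vpxorq";
  else
    /* Likewise EVEX vxorps requires AVX512DQ; vpxord does not.  */
    c.mnemonic = avx512dq ? "vxorps" : "vpxord";

  return c;
}
```

For the quoted test1/test2 case (xmm31 or ymm31, -mavx512vl, no AVX-512DQ),
choose_zero_insn (ZERO_DF, true, true, false) yields vpxorq with 128-bit
operands, i.e. the instruction proposed earlier in the thread.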