From patchwork Fri Nov 1 01:12:14 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongtao Liu
X-Patchwork-Id: 1187722
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
From: Hongtao Liu
Date: Fri, 1 Nov 2019 09:12:14 +0800
Subject: [PATCH target/92295] Fix inefficient vector constructor
To: Uros Bizjak, GCC Patches
Cc: "H. J. Lu"

Hi Uros,

This patch fixes an inefficient vector constructor expansion.
Currently, ix86_expand_vector_init_concat initializes vectors two
elements at a time, which can miss optimization opportunities such as
the one in PR 92295.  With this patch, the function instead splits the
operands into two halves, expands each half with
ix86_expand_vector_init, and concatenates the two results.

Bootstrap and i386 regression tests are OK.
Ok for trunk?

Changelog
gcc/
        PR target/92295
        * config/i386/i386-expand.c (ix86_expand_vector_init_concat)
        Enhance ix86_expand_vector_init_concat.

gcc/testsuite
        * gcc.target/i386/pr92295.c: New test.

From 408fb093993f9df4da42d8daf2e6996f087c4618 Mon Sep 17 00:00:00 2001
From: liuhongt
Date: Thu, 31 Oct 2019 15:14:00 +0000
Subject: [PATCH] Enhance ix86_expand_vector_init_concat.

Changelog
gcc/
        PR target/92295
        * config/i386/i386-expand.c (ix86_expand_vector_init_concat)
        Enhance ix86_expand_vector_init_concat.

gcc/testsuite
        * gcc.target/i386/pr92295.c: New test.
---
 gcc/config/i386/i386-expand.c           | 130 ++++++++++--------------
 gcc/testsuite/gcc.target/i386/pr92295.c |  13 +++
 2 files changed, 65 insertions(+), 78 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr92295.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 6d3d14c37dd..be040a1bc3e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -13654,8 +13654,8 @@ static void
 ix86_expand_vector_init_concat (machine_mode mode,
                                 rtx target, rtx *ops, int n)
 {
-  machine_mode cmode, hmode = VOIDmode, gmode = VOIDmode;
-  rtx first[16], second[8], third[4];
+  machine_mode half_mode = VOIDmode;
+  rtx half[2];
   rtvec v;
   int i, j;
 
@@ -13665,55 +13665,55 @@ ix86_expand_vector_init_concat (machine_mode mode,
       switch (mode)
         {
         case E_V16SImode:
-          cmode = V8SImode;
+          half_mode = V8SImode;
           break;
         case E_V16SFmode:
-          cmode = V8SFmode;
+          half_mode = V8SFmode;
           break;
         case E_V8DImode:
-          cmode = V4DImode;
+          half_mode = V4DImode;
           break;
         case E_V8DFmode:
-          cmode = V4DFmode;
+          half_mode = V4DFmode;
           break;
         case E_V8SImode:
-          cmode = V4SImode;
+          half_mode = V4SImode;
           break;
         case E_V8SFmode:
-          cmode = V4SFmode;
+          half_mode = V4SFmode;
           break;
         case E_V4DImode:
-          cmode = V2DImode;
+          half_mode = V2DImode;
           break;
         case E_V4DFmode:
-          cmode = V2DFmode;
+          half_mode = V2DFmode;
           break;
         case E_V4SImode:
-          cmode = V2SImode;
+          half_mode = V2SImode;
           break;
         case E_V4SFmode:
-          cmode = V2SFmode;
+          half_mode = V2SFmode;
           break;
         case E_V2DImode:
-          cmode = DImode;
+          half_mode = DImode;
           break;
         case E_V2SImode:
-          cmode = SImode;
+          half_mode = SImode;
           break;
         case E_V2DFmode:
-          cmode = DFmode;
+          half_mode = DFmode;
           break;
         case E_V2SFmode:
-          cmode = SFmode;
+          half_mode = SFmode;
           break;
         default:
           gcc_unreachable ();
         }
 
-      if (!register_operand (ops[1], cmode))
-        ops[1] = force_reg (cmode, ops[1]);
-      if (!register_operand (ops[0], cmode))
-        ops[0] = force_reg (cmode, ops[0]);
+      if (!register_operand (ops[1], half_mode))
+        ops[1] = force_reg (half_mode, ops[1]);
+      if (!register_operand (ops[0], half_mode))
+        ops[0] = force_reg (half_mode, ops[0]);
       emit_insn (gen_rtx_SET (target, gen_rtx_VEC_CONCAT (mode, ops[0],
                                                           ops[1])));
       break;
@@ -13722,16 +13722,16 @@ ix86_expand_vector_init_concat (machine_mode mode,
       switch (mode)
         {
         case E_V4DImode:
-          cmode = V2DImode;
+          half_mode = V2DImode;
           break;
         case E_V4DFmode:
-          cmode = V2DFmode;
+          half_mode = V2DFmode;
           break;
         case E_V4SImode:
-          cmode = V2SImode;
+          half_mode = V2SImode;
           break;
         case E_V4SFmode:
-          cmode = V2SFmode;
+          half_mode = V2SFmode;
           break;
         default:
           gcc_unreachable ();
@@ -13742,20 +13742,16 @@ ix86_expand_vector_init_concat (machine_mode mode,
       switch (mode)
         {
         case E_V8DImode:
-          cmode = V2DImode;
-          hmode = V4DImode;
+          half_mode = V4DImode;
           break;
         case E_V8DFmode:
-          cmode = V2DFmode;
-          hmode = V4DFmode;
+          half_mode = V4DFmode;
           break;
         case E_V8SImode:
-          cmode = V2SImode;
-          hmode = V4SImode;
+          half_mode = V4SImode;
           break;
         case E_V8SFmode:
-          cmode = V2SFmode;
-          hmode = V4SFmode;
+          half_mode = V4SFmode;
           break;
         default:
           gcc_unreachable ();
@@ -13766,14 +13762,10 @@ ix86_expand_vector_init_concat (machine_mode mode,
       switch (mode)
         {
         case E_V16SImode:
-          cmode = V2SImode;
-          hmode = V4SImode;
-          gmode = V8SImode;
+          half_mode = V8SImode;
           break;
         case E_V16SFmode:
-          cmode = V2SFmode;
-          hmode = V4SFmode;
-          gmode = V8SFmode;
+          half_mode = V8SFmode;
           break;
         default:
           gcc_unreachable ();
@@ -13783,50 +13775,32 @@ ix86_expand_vector_init_concat (machine_mode mode,
 half:
       /* FIXME: We process inputs backward to help RA.  PR 36222.  */
       i = n - 1;
-      j = (n >> 1) - 1;
-      for (; i > 0; i -= 2, j--)
-        {
-          first[j] = gen_reg_rtx (cmode);
-          v = gen_rtvec (2, ops[i - 1], ops[i]);
-          ix86_expand_vector_init (false, first[j],
-                                   gen_rtx_PARALLEL (cmode, v));
-        }
-
-      n >>= 1;
-      if (n > 4)
-        {
-          gcc_assert (hmode != VOIDmode);
-          gcc_assert (gmode != VOIDmode);
-          for (i = j = 0; i < n; i += 2, j++)
-            {
-              second[j] = gen_reg_rtx (hmode);
-              ix86_expand_vector_init_concat (hmode, second [j],
-                                              &first [i], 2);
-            }
-          n >>= 1;
-          for (i = j = 0; i < n; i += 2, j++)
-            {
-              third[j] = gen_reg_rtx (gmode);
-              ix86_expand_vector_init_concat (gmode, third[j],
-                                              &second[i], 2);
-            }
-          n >>= 1;
-          ix86_expand_vector_init_concat (mode, target, third, n);
-        }
-      else if (n > 2)
+      for (j = 1; j != -1; j--)
         {
-          gcc_assert (hmode != VOIDmode);
-          for (i = j = 0; i < n; i += 2, j++)
+          half[j] = gen_reg_rtx (half_mode);
+          switch (n >> 1)
             {
-              second[j] = gen_reg_rtx (hmode);
-              ix86_expand_vector_init_concat (hmode, second [j],
-                                              &first [i], 2);
+            case 2:
+              v = gen_rtvec (2, ops[i-1], ops[i]);
+              i -= 2;
+              break;
+            case 4:
+              v = gen_rtvec (4, ops[i-3], ops[i-2], ops[i-1], ops[i]);
+              i -= 4;
+              break;
+            case 8:
+              v = gen_rtvec (8, ops[i-7], ops[i-6], ops[i-5], ops[i-4],
+                             ops[i-3], ops[i-2], ops[i-1], ops[i]);
+              i -= 8;
+              break;
+            default:
+              gcc_unreachable ();
             }
-          n >>= 1;
-          ix86_expand_vector_init_concat (mode, target, second, n);
+          ix86_expand_vector_init (false, half[j],
+                                   gen_rtx_PARALLEL (half_mode, v));
         }
-      else
-        ix86_expand_vector_init_concat (mode, target, first, n);
+
+      ix86_expand_vector_init_concat (mode, target, half, 2);
       break;
 
     default:
diff --git a/gcc/testsuite/gcc.target/i386/pr92295.c b/gcc/testsuite/gcc.target/i386/pr92295.c
new file mode 100644
index 00000000000..179dc487b98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr92295.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+typedef int X __attribute__((vector_size (32)));
+
+X
+foo (int x, int z)
+{
+  X y = { x, x, x, x, z, z, z, z };
+  return y;
+}
+
+/* { dg-final { scan-assembler-times "vpbroadcast" "2" } } */
-- 
2.19.1