From patchwork Tue Jun 8 17:59:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 1489527
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v3 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast
Date: Tue, 8 Jun 2021 10:59:16 -0700
Message-Id: <20210608175918.61759-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: "H.J. Lu via Gcc-patches"
From: "H.J. Lu"
Reply-To: "H.J. Lu"
Cc: Jakub Jelinek, Richard Sandiford

1. Update the move expanders to convert CONST_WIDE_INT and CONST_VECTOR
   operands to a vector broadcast from an integer when AVX2 is enabled.
2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
   won't increase the stack alignment requirement and which blocks
   transformation by the combine pass.
3. Update the PR 87767 tests to expect an integer broadcast instead of a
   broadcast from memory.
4. Update avx512f_cond_move.c to expect an integer broadcast.
5. Allow vec_duplicate to fail, so that a backend can permit broadcasting
   an integer constant to a vector only when a broadcast instruction is
   available.  The memset expander can use this to avoid vec_duplicate
   when loading from the constant pool is more efficient.
6. Add a vec_duplicate expander and enable vec_duplicate from a
   non-standard SSE constant integer only if vector broadcast is
   available.

A small benchmark:

https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/memset/broadcast

shows that broadcast is a little faster than loading from memory on an
Intel Core i7-8559U (121213 vs. 147215 in the output below, about 18%
lower):

$ make
gcc -g -I. -O2 -c -o test.o test.c
gcc -g -c -o memory.o memory.S
gcc -g -c -o broadcast.o broadcast.S
gcc -g -c -o vec_dup_sse2.o vec_dup_sse2.S
gcc -o test test.o memory.o broadcast.o vec_dup_sse2.o
./test
memory      : 147215
broadcast   : 121213
vec_dup_sse2: 171366
$

broadcast is also smaller (122 vs. 132 bytes of text):

$ size memory.o broadcast.o
   text    data     bss     dec     hex filename
    132       0       0     132      84 memory.o
    122       0       0     122      7a broadcast.o
$

H.J. Lu (2):
  x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast
  x86: Add vec_duplicate expander

 gcc/config/i386/i386-expand.c                 | 213 +++++++++++++++++-
 gcc/config/i386/i386-protos.h                 |   3 +
 gcc/config/i386/i386.c                        |  31 +++
 gcc/config/i386/sse.md                        |  20 ++
 gcc/doc/md.texi                               |   2 -
 .../i386/avx512f-broadcast-pr87767-1.c        |   7 +-
 .../i386/avx512f-broadcast-pr87767-5.c        |   5 +-
 .../gcc.target/i386/avx512f_cond_move.c       |   4 +-
 .../i386/avx512vl-broadcast-pr87767-1.c       |  12 +-
 .../i386/avx512vl-broadcast-pr87767-5.c       |   9 +-
 gcc/testsuite/gcc.target/i386/pr100865-1.c    |  13 ++
 gcc/testsuite/gcc.target/i386/pr100865-10a.c  |  33 +++
 gcc/testsuite/gcc.target/i386/pr100865-10b.c  |   7 +
 gcc/testsuite/gcc.target/i386/pr100865-2.c    |  14 ++
 gcc/testsuite/gcc.target/i386/pr100865-3.c    |  15 ++
 gcc/testsuite/gcc.target/i386/pr100865-4a.c   |  16 ++
 gcc/testsuite/gcc.target/i386/pr100865-4b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-5a.c   |  16 ++
 gcc/testsuite/gcc.target/i386/pr100865-5b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-6a.c   |  16 ++
 gcc/testsuite/gcc.target/i386/pr100865-6b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-7a.c   |  17 ++
 gcc/testsuite/gcc.target/i386/pr100865-7b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-8a.c   |  24 ++
 gcc/testsuite/gcc.target/i386/pr100865-8b.c   |   7 +
 gcc/testsuite/gcc.target/i386/pr100865-9a.c   |  25 ++
 gcc/testsuite/gcc.target/i386/pr100865-9b.c   |   7 +
 27 files changed, 526 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-10a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-10b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-6b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-7b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-8a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-8b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-9a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-9b.c
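
For readers unfamiliar with the idea behind converting a wide integer or
vector constant to a broadcast, the following standalone C sketch may
help. It is not the code from this series (the patch bodies are not part
of this cover letter); broadcast_element_bits is a hypothetical helper
that only illustrates the general check: a constant can be produced by
broadcasting one integer when all of its 64-bit words are identical, and
the element may be narrower when the word itself is a repeated pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: return the narrowest element width
   in bits (8, 16, 32 or 64) whose repetition reproduces every 64-bit
   word of the constant, or 0 if the words differ, in which case no
   single integer broadcast can build the constant.  */
unsigned
broadcast_element_bits (const uint64_t *words, size_t nwords)
{
  assert (words != NULL && nwords > 0);

  /* A broadcast needs every 64-bit chunk of the constant to match.  */
  for (size_t i = 1; i < nwords; i++)
    if (words[i] != words[0])
      return 0;

  uint64_t w = words[0];
  unsigned bits = 64;

  /* Halve the element width while both halves of the element match.  */
  while (bits > 8)
    {
      unsigned half = bits / 2;
      uint64_t mask = (UINT64_C (1) << half) - 1;
      if ((w & mask) != ((w >> half) & mask))
        break;
      bits = half;
    }
  return bits;
}
```

Under these assumptions, a 256-bit constant made of four copies of
0x0101010101010101 would be reported as a byte broadcast, while a constant
whose words differ returns 0 and would presumably still be loaded from
the constant pool.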