From patchwork Mon Jan 4 09:47:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1422031
Date: Mon, 4 Jan 2021 10:47:34 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/98291 - allow SLP more vectorization of reductions
User-Agent: Alpine 2.21 (LSU 202 2017-01-01)

When the vectorization factor (VF) is one, an SLP reduction is performed
in order, so we can vectorize even when the reduction operation is not
associative.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-01-04  Richard Biener

	PR tree-optimization/98291
	* tree-vect-loop.c (vectorizable_reduction): Bypass
	associativity check for SLP reductions with VF 1.

	* gcc.dg/vect/slp-reduc-11.c: New testcase.
	* gcc.dg/vect/vect-reduc-in-order-4.c: Adjust.
---
 gcc/testsuite/gcc.dg/vect/slp-reduc-11.c      | 20 +++++++++++++++++++
 .../gcc.dg/vect/vect-reduc-in-order-4.c       |  2 --
 gcc/tree-vect-loop.c                          | 10 ++++++++--
 3 files changed, 28 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/slp-reduc-11.c

diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-11.c b/gcc/testsuite/gcc.dg/vect/slp-reduc-11.c
new file mode 100644
index 00000000000..a2f86fb8d66
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-11.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+
+double dotprod(const double *a, const double *b, unsigned long long n)
+{
+  double d1 = 0.0;
+  double d2 = 0.0;
+
+  for (unsigned long long i = 0; i < n; i += 2) {
+    d1 += a[i] * b[i];
+    d2 += a[i + 1] * b[i + 1];
+  }
+
+  return (d1 + d2);
+}
+
+/* We should use a SLP reduction even without -ffast-math by using a
+   VF of one.  */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c
index 7706a2dc5b2..eff3994a335 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c
@@ -41,6 +41,4 @@ main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump {in-order unchained SLP reductions not supported} "vect" } } */
-/* { dg-final { scan-tree-dump-not {vectorizing stmts using SLP} "vect" } } */
 /* { dg-final { scan-tree-dump-times "VECT_PERM_EXPR" 0 "vect" } } */
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index d6f1ffcd386..4f5e3fe20cb 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6868,8 +6868,14 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
      cases, so we need to check that this is ok.  One exception is when
      vectorizing an outer-loop: the inner-loop is executed sequentially,
      and therefore vectorizing reductions in the inner-loop during
-     outer-loop vectorization is safe.  */
-  if (needs_fold_left_reduction_p (scalar_type, orig_code))
+     outer-loop vectorization is safe.  Likewise when we are vectorizing
+     a series of reductions using SLP and the VF is one the reductions
+     are performed in scalar order.  */
+  if (slp_node
+      && !REDUC_GROUP_FIRST_ELEMENT (stmt_info)
+      && known_eq (LOOP_VINFO_VECT_FACTOR (loop_vinfo), 1u))
+    ;
+  else if (needs_fold_left_reduction_p (scalar_type, orig_code))
     {
       /* When vectorizing a reduction chain w/o SLP the reduction PHI
 	 is not directy used in stmt.  */
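
For illustration only, and not part of the patch: the sketch below models
by hand why a VF of one keeps an SLP reduction in scalar order. It assumes
n is even, as the testcase does. The names dotprod_scalar, dotprod_slp_vf1,
dotprod_vf2_reassoc, example_a and example_b are hypothetical; the lane
arrays only approximate what the vectorizer might emit and are not GCC
output. With VF 1, each lane of the two-element accumulator sees exactly
the additions that d1 or d2 sees in the scalar loop, in the same order.
With VF 2, each scalar accumulator is split into two partial sums, which
is only valid if addition may be reassociated.

```c
#include <stddef.h>

/* Scalar reference: the same loop as in slp-reduc-11.c. */
double dotprod_scalar(const double *a, const double *b, size_t n)
{
    double d1 = 0.0, d2 = 0.0;
    for (size_t i = 0; i < n; i += 2) {
        d1 += a[i] * b[i];
        d2 += a[i + 1] * b[i + 1];
    }
    return d1 + d2;
}

/* Hand-written model of SLP with VF 1: one two-lane accumulator,
   lane 0 plays the role of d1 and lane 1 the role of d2.  One
   "vector" iteration covers exactly one scalar iteration, so each
   lane adds its terms in the original order. */
double dotprod_slp_vf1(const double *a, const double *b, size_t n)
{
    double acc[2] = {0.0, 0.0};
    for (size_t i = 0; i < n; i += 2)
        for (int lane = 0; lane < 2; lane++)
            acc[lane] += a[i + lane] * b[i + lane];
    return acc[0] + acc[1];
}

/* Hand-written model of VF 2: four lanes, so d1 is split across
   lanes 0 and 2 and d2 across lanes 1 and 3.  Combining the partial
   sums afterwards reassociates the additions. */
double dotprod_vf2_reassoc(const double *a, const double *b, size_t n)
{
    double acc[4] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (int lane = 0; lane < 4; lane++)
            acc[lane] += a[i + lane] * b[i + lane];
    /* Scalar epilogue for a leftover pair of elements. */
    for (; i < n; i += 2) {
        acc[0] += a[i] * b[i];
        acc[1] += a[i + 1] * b[i + 1];
    }
    return (acc[0] + acc[2]) + (acc[1] + acc[3]);
}

/* Hypothetical input chosen so that the order of additions matters:
   0.5 is absorbed when added to 1e30 but survives once 1e30 has been
   cancelled.  All products are exact because b is 1.0 everywhere. */
const double example_a[8] = {1e30, 0.0, 0.5, 0.0, -1e30, 0.0, 0.5, 0.0};
const double example_b[8] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
```

Under IEEE double arithmetic without -ffast-math, calling
dotprod_scalar and dotprod_slp_vf1 on example_a and example_b with n = 8
should give the same value, while dotprod_vf2_reassoc gives a different
one. That difference is why the associativity check has to stay in place
for larger vectorization factors.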