From patchwork Fri Jul 28 12:56:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 1814271 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=UxYq0laz; dkim-atps=neutral Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RC72M2jGsz1ydh for ; Fri, 28 Jul 2023 22:57:13 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 74BC5385C017 for ; Fri, 28 Jul 2023 12:57:11 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 74BC5385C017 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1690549031; bh=FH41TUx2ITrKr6SqOITiYgnLZfXHbSNE/3Y5r6FQaRo=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=UxYq0lazfnBkYZ3nRvcrWNSfheIFgYYTwEp4HY0eCHtaHxkwTqPPZcJiQxk6X1iUC Wy5PikdzL8jGrroVrNia6YG799iFNBKmvAuObEnlQRec7UN4FM7MfFN0DYIvh3tVJe IgE0TxufaWA4bkDWViClZEdVMoiHRQqaGQOM2WGk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 2C1613858D20 for ; Fri, 28 Jul 2023 12:56:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2C1613858D20 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 1C1CA281ACE; Fri, 28 Jul 2023 14:56:51 +0200 (CEST) Date: Fri, 28 Jul 2023 14:56:51 +0200 To: gcc-patches@gcc.gnu.org Subject: Loop-split improvements, part 3 Message-ID: MIME-Version: 1.0 Content-Disposition: inline X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, JMQ_SPF_NEUTRAL, KAM_NUMSUBJECT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Hubicka via Gcc-patches From: Jan Hubicka Reply-To: Jan Hubicka Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Hi, This patch extends tree-ssa-loop-split to understand test of the form if (i==0) and if (i!=0) which triggers only during the first iteration. Naturally we should also be able to trigger last iteration or split into 3 cases if the test indeed can fire in the middle of the loop. Last iteration is bit trickier pattern matching so I want to do it incrementally, but I implemented easy case using value range that handled loops with constant iterations. The testcase gets misupdated profile, I will also fix that incrementally. Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: PR middle-end/77689 * tree-ssa-loop-split.cc: Include value-query.h. (split_at_bb_p): Analyze cases where EQ/NE can be turned into LT/LE/GT/GE; return updated guard code. (split_loop): Use guard code. gcc/testsuite/ChangeLog: PR middle-end/77689 * g++.dg/tree-ssa/loop-split-1.C: New test. diff --git a/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C b/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C new file mode 100644 index 00000000000..9581438b536 --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-tree-lsplit-details -std=c++11" } */ +#include +#include + +constexpr unsigned s = 100000000; + +int main() +{ + std::vector a, b, c; + a.reserve(s); + b.reserve(s); + c.reserve(s); + + for(unsigned i = 0; i < s; ++i) + { + if(i == 0) + a[i] = b[i] * c[i]; + else + a[i] = (b[i] + c[i]) * c[i-1] * std::log(i); + } +} +/* { dg-final { scan-tree-dump-times "loop split" 1 "lsplit" } } */ diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc index 70cd0aaefa7..641346cba70 100644 --- a/gcc/tree-ssa-loop-split.cc +++ b/gcc/tree-ssa-loop-split.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-fold.h" #include "gimplify-me.h" #include "print-tree.h" +#include "value-query.h" /* This file implements two kinds of loop splitting. @@ -75,7 +76,8 @@ along with GCC; see the file COPYING3. If not see point in *BORDER and the comparison induction variable in IV. */ static tree -split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv) +split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv, + enum tree_code *guard_code) { gcond *stmt; affine_iv iv2; @@ -87,19 +89,6 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv) enum tree_code code = gimple_cond_code (stmt); - /* Only handle relational comparisons, for equality and non-equality - we'd have to split the loop into two loops and a middle statement. */ - switch (code) - { - case LT_EXPR: - case LE_EXPR: - case GT_EXPR: - case GE_EXPR: - break; - default: - return NULL_TREE; - } - if (loop_exits_from_bb_p (loop, bb)) return NULL_TREE; @@ -129,6 +118,56 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv) if (!iv->no_overflow) return NULL_TREE; + /* Only handle relational comparisons, for equality and non-equality + we'd have to split the loop into two loops and a middle statement. */ + switch (code) + { + case LT_EXPR: + case LE_EXPR: + case GT_EXPR: + case GE_EXPR: + break; + case NE_EXPR: + case EQ_EXPR: + /* If the test check for first iteration, we can handle NE/EQ + with only one split loop. */ + if (operand_equal_p (iv->base, iv2.base, 0)) + { + if (code == EQ_EXPR) + code = !tree_int_cst_sign_bit (iv->step) ? LE_EXPR : GE_EXPR; + else + code = !tree_int_cst_sign_bit (iv->step) ? GT_EXPR : LT_EXPR; + break; + } + /* Similarly when the test checks for minimal or maximal + value range. */ + else + { + int_range<2> r; + get_global_range_query ()->range_of_expr (r, op0, stmt); + if (!r.varying_p () && !r.undefined_p () + && TREE_CODE (op1) == INTEGER_CST) + { + wide_int val = wi::to_wide (op1); + if (known_eq (val, r.lower_bound ())) + { + code = (code == EQ_EXPR) ? LE_EXPR : GT_EXPR; + break; + } + else if (known_eq (val, r.upper_bound ())) + { + code = (code == EQ_EXPR) ? GE_EXPR : LT_EXPR; + break; + } + } + } + /* TODO: We can compare with exit condition; it seems that testing for + last iteration is common case. */ + return NULL_TREE; + default: + return NULL_TREE; + } + if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "Found potential split point: "); @@ -143,6 +182,7 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv) } *border = iv2.base; + *guard_code = code; return op0; } @@ -566,8 +606,9 @@ split_loop (class loop *loop1) } /* Find a splitting opportunity. */ + enum tree_code guard_code; for (i = 0; i < loop1->num_nodes; i++) - if ((guard_iv = split_at_bb_p (loop1, bbs[i], &border, &iv))) + if ((guard_iv = split_at_bb_p (loop1, bbs[i], &border, &iv, &guard_code))) { profile_count entry_count = loop_preheader_edge (loop1)->count (); /* Handling opposite steps is not implemented yet. Neither @@ -585,7 +626,6 @@ split_loop (class loop *loop1) gcond *guard_stmt = as_a (*gsi_last_bb (bbs[i])); tree guard_init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop1)); - enum tree_code guard_code = gimple_cond_code (guard_stmt); /* Loop splitting is implemented by versioning the loop, placing the new loop after the old loop, make the first loop iterate