From patchwork Fri Jul 28 12:56:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Hubicka <hubicka@ucw.cz>
X-Patchwork-Id: 1814271
Return-Path: <gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=8.43.85.97; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=<UNKNOWN>)
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (1024-bit key;
 unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256
 header.s=default header.b=UxYq0laz;
	dkim-atps=neutral
Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4RC72M2jGsz1ydh
	for <incoming@patchwork.ozlabs.org>; Fri, 28 Jul 2023 22:57:13 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 74BC5385C017
	for <incoming@patchwork.ozlabs.org>; Fri, 28 Jul 2023 12:57:11 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 74BC5385C017
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1690549031;
	bh=FH41TUx2ITrKr6SqOITiYgnLZfXHbSNE/3Y5r6FQaRo=;
	h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post:
	 List-Help:List-Subscribe:From:Reply-To:From;
	b=UxYq0lazfnBkYZ3nRvcrWNSfheIFgYYTwEp4HY0eCHtaHxkwTqPPZcJiQxk6X1iUC
	 Wy5PikdzL8jGrroVrNia6YG799iFNBKmvAuObEnlQRec7UN4FM7MfFN0DYIvh3tVJe
	 IgE0TxufaWA4bkDWViClZEdVMoiHRQqaGQOM2WGk=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16])
 by sourceware.org (Postfix) with ESMTPS id 2C1613858D20
 for <gcc-patches@gcc.gnu.org>; Fri, 28 Jul 2023 12:56:52 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2C1613858D20
Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202)
 id 1C1CA281ACE; Fri, 28 Jul 2023 14:56:51 +0200 (CEST)
Date: Fri, 28 Jul 2023 14:56:51 +0200
To: gcc-patches@gcc.gnu.org
Subject: Loop-split improvements, part 3
Message-ID: <ZMO7E9C9KBfbyJHH@kam.mff.cuni.cz>
MIME-Version: 1.0
Content-Disposition: inline
X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS,
 JMQ_SPF_NEUTRAL, KAM_NUMSUBJECT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,
 SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Jan Hubicka via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Jan Hubicka <hubicka@ucw.cz>
Reply-To: Jan Hubicka <hubicka@ucw.cz>
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>

Hi,
This patch extends tree-ssa-loop-split to understand test of the form
 if (i==0)
and
 if (i!=0)
which triggers only during the first iteration.  Naturally we should
also be able to trigger last iteration or split into 3 cases if
the test indeed can fire in the middle of the loop.

Last iteration is bit trickier pattern matching so I want to do it
incrementally, but I implemented easy case using value range that handled
loops with constant iterations.

The testcase gets misupdated profile, I will also fix that incrementally.

Bootstrapped/regtested x86_64-linux, OK?

gcc/ChangeLog:

	PR middle-end/77689
	* tree-ssa-loop-split.cc: Include value-query.h.
	(split_at_bb_p): Analyze cases where EQ/NE can be turned
	into LT/LE/GT/GE; return updated guard code.
	(split_loop): Use guard code.

gcc/testsuite/ChangeLog:

	PR middle-end/77689
	* g++.dg/tree-ssa/loop-split-1.C: New test.
diff --git a/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C b/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C
new file mode 100644
index 00000000000..9581438b536
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-lsplit-details -std=c++11" } */
+#include <vector>
+#include <cmath>
+
+constexpr unsigned s = 100000000;
+
+int main()
+{
+    std::vector<float> a, b, c;
+    a.reserve(s);
+    b.reserve(s);
+    c.reserve(s);
+
+    for(unsigned i = 0; i < s; ++i)
+    {
+        if(i == 0)
+            a[i] = b[i] * c[i];
+        else
+            a[i] = (b[i] + c[i]) * c[i-1] * std::log(i);
+    }
+}
+/* { dg-final { scan-tree-dump-times "loop split" 1 "lsplit" } } */
diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
index 70cd0aaefa7..641346cba70 100644
--- a/gcc/tree-ssa-loop-split.cc
+++ b/gcc/tree-ssa-loop-split.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-fold.h"
 #include "gimplify-me.h"
 #include "print-tree.h"
+#include "value-query.h"
 
 /* This file implements two kinds of loop splitting.
 
@@ -75,7 +76,8 @@ along with GCC; see the file COPYING3.  If not see
    point in *BORDER and the comparison induction variable in IV.  */
 
 static tree
-split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv)
+split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv,
+	       enum tree_code *guard_code)
 {
   gcond *stmt;
   affine_iv iv2;
@@ -87,19 +89,6 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv)
 
   enum tree_code code = gimple_cond_code (stmt);
 
-  /* Only handle relational comparisons, for equality and non-equality
-     we'd have to split the loop into two loops and a middle statement.  */
-  switch (code)
-    {
-      case LT_EXPR:
-      case LE_EXPR:
-      case GT_EXPR:
-      case GE_EXPR:
-	break;
-      default:
-	return NULL_TREE;
-    }
-
   if (loop_exits_from_bb_p (loop, bb))
     return NULL_TREE;
 
@@ -129,6 +118,56 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv)
   if (!iv->no_overflow)
     return NULL_TREE;
 
+  /* Only handle relational comparisons, for equality and non-equality
+     we'd have to split the loop into two loops and a middle statement.  */
+  switch (code)
+    {
+      case LT_EXPR:
+      case LE_EXPR:
+      case GT_EXPR:
+      case GE_EXPR:
+	break;
+      case NE_EXPR:
+      case EQ_EXPR:
+	/* If the test check for first iteration, we can handle NE/EQ
+	   with only one split loop.  */
+	if (operand_equal_p (iv->base, iv2.base, 0))
+	  {
+	    if (code == EQ_EXPR)
+	      code = !tree_int_cst_sign_bit (iv->step) ? LE_EXPR : GE_EXPR;
+	    else
+	      code = !tree_int_cst_sign_bit (iv->step) ? GT_EXPR : LT_EXPR;
+	    break;
+	  }
+	/* Similarly when the test checks for minimal or maximal
+	   value range.  */
+	else
+	  {
+	    int_range<2> r;
+	    get_global_range_query ()->range_of_expr (r, op0, stmt);
+	    if (!r.varying_p () && !r.undefined_p ()
+		&& TREE_CODE (op1) == INTEGER_CST)
+	      {
+		wide_int val = wi::to_wide (op1);
+		if (known_eq (val, r.lower_bound ()))
+		  {
+		    code = (code == EQ_EXPR) ? LE_EXPR : GT_EXPR;
+		    break;
+		  }
+		else if (known_eq (val, r.upper_bound ()))
+		  {
+		    code = (code == EQ_EXPR) ? GE_EXPR : LT_EXPR;
+		    break;
+		  }
+	      }
+	  }
+	/* TODO: We can compare with exit condition; it seems that testing for
+	   last iteration is common case.  */
+	return NULL_TREE;
+      default:
+	return NULL_TREE;
+    }
+
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
       fprintf (dump_file, "Found potential split point: ");
@@ -143,6 +182,7 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv)
     }
 
   *border = iv2.base;
+  *guard_code = code;
   return op0;
 }
 
@@ -566,8 +606,9 @@ split_loop (class loop *loop1)
     }
 
   /* Find a splitting opportunity.  */
+  enum tree_code guard_code;
   for (i = 0; i < loop1->num_nodes; i++)
-    if ((guard_iv = split_at_bb_p (loop1, bbs[i], &border, &iv)))
+    if ((guard_iv = split_at_bb_p (loop1, bbs[i], &border, &iv, &guard_code)))
       {
 	profile_count entry_count = loop_preheader_edge (loop1)->count ();
 	/* Handling opposite steps is not implemented yet.  Neither
@@ -585,7 +626,6 @@ split_loop (class loop *loop1)
 	gcond *guard_stmt = as_a<gcond *> (*gsi_last_bb (bbs[i]));
 	tree guard_init = PHI_ARG_DEF_FROM_EDGE (phi,
 						 loop_preheader_edge (loop1));
-	enum tree_code guard_code = gimple_cond_code (guard_stmt);
 
 	/* Loop splitting is implemented by versioning the loop, placing
 	   the new loop after the old loop, make the first loop iterate