From patchwork Sat Jul 17 02:31:28 2021
Subject: [COMMITTED] tree-optimization/96542 - Add wi_fold_in_parts.
To: gcc-patches
From: Andrew MacLeod
Date: Fri, 16 Jul 2021 22:31:28 -0400
Message-ID: <618a0878-d150-d82b-4181-14ea236d337e@redhat.com>

Range-ops uses wi_fold (implemented by each opcode) to fold subranges
individually, one at a time, and then combines the results.  This patch
first calls wi_fold_in_parts, which checks whether one of the subranges
is small and, if so, splits that subrange further into constants.
Currently, if a subrange covers 4 or fewer values, wi_fold is invoked
separately for each value.  4 was chosen as a reasonable tradeoff
between extra work and benefit; we could consider increasing it for
-O3.

Under some circumstances this lets us generate much more precise
ranges.  For instance, given:

    tmp_5 = x_4(D) != 0;
    _1 = (int) tmp_5;
    _2 = 255 >> _1;
    _3 = (unsigned char) _2;
    _6 = _3 * 2;
    return _6;

    _1 : int [0, 1]
    _2 : int [127, 255]
    _3 : unsigned char [127, +INF]

we currently produce a range of [127, 255] for _2.  The RHS of the
shift (_1) has only 2 values, so by further splitting that subrange
into [0,0] and [1,1] we can produce a more precise range and do better:

    y_5 = x_4(D) != 0;
    _1 = (int) y_5;
    _2 = 255 >> _1;
    _3 = (unsigned char) _2;
    return 254;

    _1 : int [0, 1]
    _2 : int [127, 127][255, 255]
    _3 : unsigned char [127, 127][+INF, +INF]

Now _2 is exactly 127 or 255, and the whole function folds down to its
return value.  This works for many different opcodes without having to
rework their range-ops implementations.

Bootstraps on x86_64-pc-linux-gnu with no regressions and fixes all 3
testcases in the PR.  Pushed.

Andrew

commit 704e8a825c78b9a8424c291509413bbb48e602c7
Author: Andrew MacLeod
Date:   Fri Jul 16 11:42:14 2021 -0400

    Add wi_fold_in_parts.

    range-ops uses wi_fold to fold subranges individually, one at a
    time, and then combines the results.  This patch first calls
    wi_fold_in_parts, which checks whether one of the subranges is
    small and, if so, further splits that subrange into constants.

    gcc/
            PR tree-optimization/96542
            * range-op.cc (range_operator::wi_fold_in_parts): New.
            (range_operator::fold_range): Call wi_fold_in_parts.
            (operator_lshift::wi_fold): Fix broken lshift by [0,0].
            * range-op.h (wi_fold_in_parts): Add prototype.

    gcc/testsuite
            * gcc.dg/pr96542.c: New.
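As an illustration of the idea (not part of the patch), here is a
self-contained C++ sketch that mimics the splitting heuristic for an
unsigned right shift, using plain 64-bit bounds in place of GCC's
wide_int/irange machinery.  The Range type and the
fold_rshift/fold_in_parts helpers are hypothetical stand-ins for this
sketch, not the actual range-op API:

    #include <cstdint>
    #include <iostream>
    #include <vector>

    // One [lb, ub] subrange of an unsigned value.
    struct Range { uint64_t lb, ub; };

    // Naive whole-subrange fold of x >> y: for an unsigned right shift
    // the extremes are (lh.lb >> rh.ub) and (lh.ub >> rh.lb).
    static Range fold_rshift (Range lh, Range rh)
    {
      return Range { lh.lb >> rh.ub, lh.ub >> rh.lb };
    }

    // Mimic wi_fold_in_parts: if the RH subrange covers 2, 3, or 4
    // values (the patch's threshold), fold each value separately.
    static std::vector<Range> fold_in_parts (Range lh, Range rh)
    {
      std::vector<Range> result;
      uint64_t n = rh.ub - rh.lb;
      if (n >= 1 && n <= 3)
        for (uint64_t y = rh.lb; y <= rh.ub; y++)
          result.push_back (fold_rshift (lh, Range { y, y }));
      else
        result.push_back (fold_rshift (lh, rh));
      return result;
    }

    int main ()
    {
      Range naive = fold_rshift (Range { 255, 255 }, Range { 0, 1 });
      std::cout << "naive: [" << naive.lb << ", " << naive.ub << "]\n";
      for (Range r : fold_in_parts (Range { 255, 255 }, Range { 0, 1 }))
        std::cout << "split: [" << r.lb << ", " << r.ub << "]\n";
    }

Folding 255 >> [0,1] as a single subrange yields [127, 255]; splitting
the shift count into the constants 0 and 1 yields the exact subranges
[255,255] and [127,127], mirroring the GIMPLE example above.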
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 08000465fd9..e0be51dbc90 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -133,6 +133,65 @@ range_operator::wi_fold (irange &r, tree type,
   r.set_varying (type);
 }
 
+// Call wi_fold, except further split small subranges into constants.
+// This can provide better precision.  For something like 8 << [0,1],
+// instead of [8, 16] we will produce [8,8][16,16].
+
+void
+range_operator::wi_fold_in_parts (irange &r, tree type,
+				  const wide_int &lh_lb,
+				  const wide_int &lh_ub,
+				  const wide_int &rh_lb,
+				  const wide_int &rh_ub) const
+{
+  wi::overflow_type ov_rh, ov_lh;
+  int_range_max tmp;
+  wide_int rh_range = wi::sub (rh_ub, rh_lb, TYPE_SIGN (type), &ov_rh);
+  wide_int lh_range = wi::sub (lh_ub, lh_lb, TYPE_SIGN (type), &ov_lh);
+  signop sign = TYPE_SIGN (type);
+  // If there are 2, 3, or 4 values in the RH range, do them separately.
+  // Recurse via wi_fold_in_parts so the LH side can also be split.
+  if (wi::gt_p (rh_range, 0, sign) && wi::lt_p (rh_range, 4, sign)
+      && ov_rh == wi::OVF_NONE)
+    {
+      wi_fold_in_parts (r, type, lh_lb, lh_ub, rh_lb, rh_lb);
+      if (wi::gt_p (rh_range, 1, sign))
+	{
+	  wi_fold_in_parts (tmp, type, lh_lb, lh_ub, rh_lb + 1, rh_lb + 1);
+	  r.union_ (tmp);
+	  if (wi::eq_p (rh_range, 3))
+	    {
+	      wi_fold_in_parts (tmp, type, lh_lb, lh_ub, rh_lb + 2, rh_lb + 2);
+	      r.union_ (tmp);
+	    }
+	}
+      wi_fold_in_parts (tmp, type, lh_lb, lh_ub, rh_ub, rh_ub);
+      r.union_ (tmp);
+    }
+  // Otherwise check for 2, 3, or 4 values in the LH range and split them up.
+  // The RH side has already been checked, so no recursion is needed.
+  else if (wi::gt_p (lh_range, 0, sign) && wi::lt_p (lh_range, 4, sign)
+	   && ov_lh == wi::OVF_NONE)
+    {
+      wi_fold (r, type, lh_lb, lh_lb, rh_lb, rh_ub);
+      if (wi::gt_p (lh_range, 1, sign))
+	{
+	  wi_fold (tmp, type, lh_lb + 1, lh_lb + 1, rh_lb, rh_ub);
+	  r.union_ (tmp);
+	  if (wi::eq_p (lh_range, 3))
+	    {
+	      wi_fold (tmp, type, lh_lb + 2, lh_lb + 2, rh_lb, rh_ub);
+	      r.union_ (tmp);
+	    }
+	}
+      wi_fold (tmp, type, lh_ub, lh_ub, rh_lb, rh_ub);
+      r.union_ (tmp);
+    }
+  // Otherwise just call wi_fold.
+  else
+    wi_fold (r, type, lh_lb, lh_ub, rh_lb, rh_ub);
+}
+
 // The default for fold is to break all ranges into sub-ranges and
 // invoke the wi_fold method on each sub-range pair.
 
@@ -152,8 +211,8 @@ range_operator::fold_range (irange &r, tree type,
   // If both ranges are single pairs, fold directly into the result range.
   if (num_lh == 1 && num_rh == 1)
     {
-      wi_fold (r, type, lh.lower_bound (0), lh.upper_bound (0),
-	       rh.lower_bound (0), rh.upper_bound (0));
+      wi_fold_in_parts (r, type, lh.lower_bound (0), lh.upper_bound (0),
+			rh.lower_bound (0), rh.upper_bound (0));
       op1_op2_relation_effect (r, type, lh, rh, rel);
       return true;
     }
@@ -167,7 +226,7 @@ range_operator::fold_range (irange &r, tree type,
 	wide_int lh_ub = lh.upper_bound (x);
 	wide_int rh_lb = rh.lower_bound (y);
 	wide_int rh_ub = rh.upper_bound (y);
-	wi_fold (tmp, type, lh_lb, lh_ub, rh_lb, rh_ub);
+	wi_fold_in_parts (tmp, type, lh_lb, lh_ub, rh_lb, rh_ub);
 	r.union_ (tmp);
 	if (r.varying_p ())
 	  {
@@ -1915,8 +1974,14 @@ operator_lshift::wi_fold (irange &r, tree type,
   int bound_shift = overflow_pos - rh_ub.to_shwi ();
   // If bound_shift == HOST_BITS_PER_WIDE_INT, the llshift can
   // overflow.  However, for that to happen, rh.max needs to be zero,
-  // which means rh is a singleton range of zero, which means it
-  // should be handled by the lshift fold_range above.
+  // which means rh is a singleton range of zero, which means we simply return
+  // [lh_lb, lh_ub] as the range.
+  if (wi::eq_p (rh_ub, rh_lb) && wi::eq_p (rh_ub, 0))
+    {
+      r = int_range<2> (type, lh_lb, lh_ub);
+      return;
+    }
+
   wide_int bound = wi::set_bit_in_zero (bound_shift, prec);
   wide_int complement = ~(bound - 1);
   wide_int low_bound, high_bound;
diff --git a/gcc/range-op.h b/gcc/range-op.h
index 2b5db64bb98..17be9e00c08 100644
--- a/gcc/range-op.h
+++ b/gcc/range-op.h
@@ -97,6 +97,12 @@ protected:
 			   const irange &op1_range,
 			   const irange &op2_range,
 			   relation_kind rel) const;
+  // Called by fold_range to split small subranges into parts.
+  void wi_fold_in_parts (irange &r, tree type,
+			 const wide_int &lh_lb,
+			 const wide_int &lh_ub,
+			 const wide_int &rh_lb,
+			 const wide_int &rh_ub) const;
 };
 
 extern range_operator *range_op_handler (enum tree_code code, tree type);
diff --git a/gcc/testsuite/gcc.dg/pr96542.c b/gcc/testsuite/gcc.dg/pr96542.c
new file mode 100644
index 00000000000..5014f2acad8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr96542.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+
+unsigned char
+foo (unsigned int x)
+{
+  _Bool y = x;
+  return (((unsigned char) ~0) >> y) * 2;
+}
+
+unsigned char
+bar (unsigned int x)
+{
+  return (((unsigned char) ~0) >> (_Bool) x) * 2;
+}
+
+unsigned
+baz (unsigned int x)
+{
+  if (x >= 4)
+    return 32;
+  return (-1U >> x) * 16;
+}
+
+/* { dg-final { scan-tree-dump-times "254" 2 "evrp" } } */
+/* { dg-final { scan-tree-dump "= PHI <32.*, 4294967280" "evrp" } } */
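For reference, the constants the scan patterns look for can be
double-checked with a small standalone C++ program (hypothetical, not
part of the testsuite) that evaluates the same expressions at runtime:

    #include <iostream>

    int main ()
    {
      // foo/bar: ((unsigned char) ~0 >> y) * 2 truncated to unsigned char
      // is 254 for both y = 0 (255 * 2 = 510, low byte 254) and y = 1
      // (127 * 2 = 254), which is why the dump should contain "254" twice.
      for (unsigned y = 0; y <= 1; y++)
        std::cout << "foo/bar, shift " << y << ": "
                  << (unsigned) (unsigned char) ((((unsigned char) ~0) >> y) * 2)
                  << "\n";

      // baz: (-1U >> x) * 16 wraps to 0xFFFFFFF0 == 4294967280 for every
      // x in [0,3], matching the "PHI <32.*, 4294967280" scan pattern.
      for (unsigned x = 0; x <= 3; x++)
        std::cout << "baz, shift " << x << ": " << ((-1U >> x) * 16) << "\n";
    }

Every feasible shift count produces the same value in each function,
which is exactly what lets EVRP collapse foo and bar to return 254 and
narrow the non-constant arm of baz's PHI to 4294967280 once the small
shift ranges are split into constants.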