From patchwork Thu May 21 15:58:21 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Aditya K <hiraditya@msn.com>
X-Patchwork-Id: 475076
Return-Path: 
 <gcc-patches-return-398555-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 1D68114016A
	for <incoming@patchwork.ozlabs.org>;
	Fri, 22 May 2015 01:59:08 +1000 (AEST)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org
	header.b=KiJ+ToOK; dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:from:to:subject:date:in-reply-to:references
	:content-type:content-transfer-encoding:mime-version; q=dns; s=
	default; b=Pk48akal7tkFEn/4as7mznwoaZWSOohh+Ea+2RR96pfX02s5N3HVn
	YGgAKPPCkiwwTlQCR9KQduS9vVh4ftmuoPnbFouMnUOK3S2AgyM0qK9xzul39NBM
	F2AszmflnbgRg6HXQ5fOmekFWtCsws9TjVKK7C3UG5RrckaI84vZZ8=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:from:to:subject:date:in-reply-to:references
	:content-type:content-transfer-encoding:mime-version; s=default;
	bh=Ulp/5TkppsQq+XSBhzklWlIW7ps=; b=KiJ+ToOK5cZRBww3P5NsrfMd+3iy
	L43BP8gRepWoFWvNAA4zN7YBleo4UOmsK/snWFYhjSTa3yLRj4R4vC8KlVPVDqBt
	b2erF1J9YqxSASsF5NS29WIvjheCta+NcqrAm4bZDVBvcRbT4+UmxYHQd1/m3X5O
	JEuEHbavfZTVsMs=
Received: (qmail 45948 invoked by alias); 21 May 2015 15:59:01 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 45937 invoked by uid 89); 21 May 2015 15:59:00 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.7 required=5.0 tests=AWL, BAYES_00,
	FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,
	SPF_PASS autolearn=ham version=3.3.2
X-HELO: BLU004-OMC3S2.hotmail.com
Received: from blu004-omc3s2.hotmail.com (HELO BLU004-OMC3S2.hotmail.com)
	(65.55.116.77) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA256
	encrypted) ESMTPS; Thu, 21 May 2015 15:58:58 +0000
Received: from BLU179-W31 ([65.55.116.73]) by BLU004-OMC3S2.hotmail.com over
	TLS secured channel with Microsoft SMTPSVC(7.5.7601.22751);
	Thu, 21 May 2015 08:58:21 -0700
X-TMN: [jc0cr57awZvRWyie7wXA/tUrU9hNbwis]
Message-ID: <BLU179-W3124F25CCB02DBD86C6DD6B6C10@phx.gbl>
From: Aditya K <hiraditya@msn.com>
To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	"a.zaafrani@samsung.com" <a.zaafrani@samsung.com>,
	"sebpop@gmail.com"	<sebpop@gmail.com>, "law@redhat.com" <law@redhat.com>,
	"richard.guenther@gmail.com" <richard.guenther@gmail.com>
Subject: RE: Fix PR48052: loop not vectorized if index is "unsigned int"
Date: Thu, 21 May 2015 15:58:21 +0000
In-Reply-To: <BLU179-W94C54A046BE24828966E26B6C30@phx.gbl>
References: <BLU179-W94C54A046BE24828966E26B6C30@phx.gbl>
MIME-Version: 1.0

I tested this patch and it passes bootstrap and no extra failures.

Thanks
-Aditya


Symbolically evaluate conditionals, and subtraction when additional constraints are provided.

Adding this evaluation mechanism helps vectorize some loops on 64bit machines because on 64bit, a typecast appears
which causes scev to bail out.

gcc/ChangeLog:

2015-05-21  hiraditya  <hiraditya@msn.com>
2015-05-21 Sebastian Pop  <s.pop@samsung.com>
2015-05-21 Abderrazek Zaafrani <a.zaafrani@samsung.com>

        * gcc.dg/vect/pr48052.c: New test.
        * tree-ssa-loop-niter.c (fold_binary_cond_p): Fold a conditional operation when additional constraints are
        available.
        (fold_binary_minus_p): Fold a subtraction operations of the form (A - B -1) when additional constraints are
        available.
        (scev_probably_wraps_p): Use the above two functions to find whether valid_niter>= loop->nb_iterations.

diff --git a/gcc/testsuite/gcc.dg/vect/pr48052.c b/gcc/testsuite/gcc.dg/vect/pr48052.c
new file mode 100644
index 0000000..8e406d7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr48052.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
+int foo(int* A, int* B,  unsigned start, unsigned BS)
+{
+  int s;
+  for (unsigned k = start;  k < start + BS; k++)
+    {
+      s += A[k] * B[k];
+    }
+
+  return s;
+}
+
+int bar(int* A, int* B, unsigned BS)
+{
+  int s;
+  for (unsigned k = 0;  k < BS; k++)
+    {
+      s += A[k] * B[k];
+    }
+
+  return s;
+}
+


----------------------------------------
> From: hiraditya@msn.com
> To: gcc-patches@gcc.gnu.org; a.zaafrani@samsung.com; sebpop@gmail.com; law@redhat.com; richard.guenther@gmail.com
> Subject: Fix PR48052: loop not vectorized if index is "unsigned int"
> Date: Tue, 19 May 2015 16:12:26 +0000
>
> w.r.t. the PR48052, here is the patch which finds out if scev would wrap or not.
> The patch symbolically evaluates if valid_niter>= loop->nb_iterations is true. In that case the scev would not wrap (??).
> Currently, we only look for two special 'patterns', which are sufficient to analyze the simple test cases.
>
> valid_niter = ~s (= UNIT_MAX - s)
> We have to prove that valid_niter>= loop->nb_iterations
>
> Pattern1 loop->nb_iterations: s>= e ? s - e : 0
> Pattern2 loop->nb_iterations: (e - s) -1
>
> In the first case we prove that valid_niter>= loop->nb_iterations in both the cases i.e., when s>=e and when not.
> In the second case we prove valid_niter>= loop->nb_iterations, by simple analysis that  UINT_MAX>= e is true in all cases.
>
> I haven't tested this patch completely. I'm looking for feedback and any scope for improvement.
>
>
> hth,
> -Aditya
>
>
>
> Vectorize loops which has typecast.
>
> 2015-05-19  hiraditya  <hiraditya@msn.com>
> 2015-05-19 Sebastian Pop  <s.pop@samsung.com>
> 2015-05-19 Abderrazek Zaafrani <a.zaafrani@samsung.com>
>
>         * gcc.dg/vect/pr48052.c: New test.
>
> gcc/ChangeLog:
>
> 2015-05-19  hiraditya  <hiraditya@msn.com>
>
>         * tree-ssa-loop-niter.c (fold_binary_cond_p): Fold a conditional operation when additional constraints are
>         available.
>         (fold_binary_minus_p): Fold a subtraction operations of the form (A - B -1) when additional constraints are
>         available.
>         (scev_probably_wraps_p): Use the above two functions to find whether valid_niter>= loop->nb_iterations.
>
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr48052.c b/gcc/testsuite/gcc.dg/vect/pr48052.c
> new file mode 100644
> index 0000000..8e406d7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr48052.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> +int foo(int* A, int* B,  unsigned start, unsigned BS)
> +{
> +  int s;
> +  for (unsigned k = start;  k < start + BS; k++)
> +    {
> +      s += A[k] * B[k];
> +    }
> +
> +  return s;
> +}
> +
> +int bar(int* A, int* B, unsigned BS)
> +{
> +  int s;
> +  for (unsigned k = 0;  k < BS; k++)
> +    {
> +      s += A[k] * B[k];
> +    }
> +
> +  return s;
> +}
> +
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 042f8df..ddc00cc 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -3773,6 +3773,117 @@ nowrap_type_p (tree type)
>    return false;
>  }
>
> +/* Return true when op0>= op1.
> +   For example:
> +   Where, op0 = ~start_3(D);
> +   op1 = start_3(D) <= stop_6(D) ? stop_6(D) - start_3(D) : 0;
> +   In this case op0 = UINT_MAX - start_3(D);
> +   So, op0>= op1 in all cases because UINT_MAX>= stop_6(D),
> +   when TREE_TYPE(stop_6(D)) == unsigned int;  */
> +bool
> +fold_binary_cond_p (enum tree_code code, tree type, tree op0, tree op1)
> +{
> +  gcc_assert (type == boolean_type_node);
> +
> +  if (TREE_TYPE (op0) != TREE_TYPE (op1))
> +    return false;
> +
> +  /* TODO: Handle other operations.  */
> +  if (code != GE_EXPR)
> +    return false;
> +  // The type of op0 and op1 should be unsigned.
> +  if (!TYPE_UNSIGNED (TREE_TYPE(op0)))
> +    return false;
> +  if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != COND_EXPR))
> +    return false;
> +
> +  /* We have to show that in both the cases,
> +     (when cond is true and when cond is false) op (op0, op1) is true.  */
> +   tree neg_op0 = TREE_OPERAND (op0, 0);
> +   tree cond_op1 = TREE_OPERAND (op1, 0);
> +   tree true_op1 = TREE_OPERAND (op1, 1);
> +   tree false_op1 = TREE_OPERAND (op1, 2);
> +   gcc_assert(neg_op0 && cond_op1 && true_op1 && false_op1);
> +
> +  /* When cond is false. Evaluate op (op0, false_op1).  */
> +  tree running_exp = fold_binary (code, boolean_type_node, op0, false_op1);
> +  if (running_exp == NULL || integer_zerop (running_exp))
> +    /* TODO: Handle more cases here. */
> +    return false;
> +
> +  /* When cond is true. Evaluate op (op0, true_op1).  */
> +  running_exp = fold_binary (code, boolean_type_node, op0, true_op1);
> +  if (running_exp != NULL && integer_nonzerop (running_exp))
> +    return true;
> +
> +  tree smaller, bigger;
> +  if (TREE_CODE (cond_op1) == LE_EXPR)
> +    {
> +      smaller = TREE_OPERAND (cond_op1, 0);
> +      bigger = TREE_OPERAND (cond_op1, 1);
> +    } else return false;
> +
> +  if (TREE_CODE (true_op1) == MINUS_EXPR)
> +    {
> +      tree minuend = TREE_OPERAND (true_op1, 0);
> +      tree subtrahend = TREE_OPERAND (true_op1, 1);
> +      if (subtrahend == neg_op0 && subtrahend == smaller && minuend == bigger)
> +        {
> +          tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0),
> +                                              TREE_TYPE (neg_op0));
> +          running_exp = fold_binary (code, boolean_type_node, extreme, minuend);
> +          return running_exp != NULL && integer_nonzerop(running_exp);
> +        } else return false;
> +    } else return false;
> +}
> +
> +/* Return true when op0>= op1 and
> +   op0 is ~start3(D) or, UINT_MAX - start3(D)
> +   op1 is (_21 - start_3(D)) - 1; */
> +bool
> +fold_binary_minus_p (enum tree_code code, tree type, tree op0, tree op1)
> +{
> +  gcc_assert (type == boolean_type_node);
> +
> +  if (TREE_TYPE (op0) != TREE_TYPE (op1))
> +    return false;
> +  /* TODO: Handle other operations.  */
> +  if (code != GE_EXPR)
> +    return false;
> +
> +  // The type of op0 and op1 should be unsigned.
> +  if (!TYPE_UNSIGNED (TREE_TYPE(op0)))
> +    return false;
> +  if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != MINUS_EXPR))
> +    return false;
> +
> +  /* We have to show that op (op0, op1) is true.  */
> +  tree neg_op0 = TREE_OPERAND (op0, 0);
> +  tree minuend_op1 = TREE_OPERAND (op1, 0);
> +  tree subtrahend_op1 = TREE_OPERAND (op1, 1);
> +  gcc_assert(neg_op0 && subtrahend_op1 && minuend_op1);
> +
> +  /* TODO: Also check that the integer_cst is positive.  */
> +  if (TREE_CODE (minuend_op1) != MINUS_EXPR ||
> +      TREE_CODE (subtrahend_op1) != INTEGER_CST)
> +    return false;
> +
> +  tree minuend_minuend_op1 = TREE_OPERAND (minuend_op1, 0);
> +  tree subtrahend_minuend_op1 = TREE_OPERAND (minuend_op1, 1);
> +
> +  /* TODO: Extend this to evaluate the subtrahends.
> +     i.e., when there are complicated operations in the subtrahend.  */
> +  if (subtrahend_minuend_op1 != neg_op0)
> +    return false;
> +
> +  tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0), TREE_TYPE (neg_op0));
> +  tree compare_minuend = fold_binary (GE_EXPR, boolean_type_node,
> +                                      extreme, minuend_minuend_op1);
> +  if (compare_minuend != NULL && integer_nonzerop (compare_minuend))
> +    return true;
> +  return false;
> +}
> +
>  /* Return false only when the induction variable BASE + STEP * I is
>     known to not overflow: i.e. when the number of iterations is small
>     enough with respect to the step and initial condition in order to
> @@ -3867,6 +3978,17 @@ scev_probably_wraps_p (tree base, tree step,
>        fold_undefer_and_ignore_overflow_warnings ();
>        return false;
>      }
> +
> +  if (loop->nb_iterations && at_stmt
> +      && (fold_binary_cond_p (GE_EXPR, boolean_type_node,
> +                            valid_niter, loop->nb_iterations) ||
> +          fold_binary_minus_p (GE_EXPR, boolean_type_node,
> +                             valid_niter, loop->nb_iterations)))
> +    {
> +      fold_undefer_and_ignore_overflow_warnings ();
> +      return false;
> +    }
> +
>    if (at_stmt)
>      for (bound = loop->bounds; bound; bound = bound->next)
>        {
>
>diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 042f8df..ddc00cc 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3773,6 +3773,117 @@ nowrap_type_p (tree type)
   return false;
 }
 
+/* Return true when op0>= op1.
+   For example:
+   Where, op0 = ~start_3(D);
+   op1 = start_3(D) <= stop_6(D) ? stop_6(D) - start_3(D) : 0;
+   In this case op0 = UINT_MAX - start_3(D);
+   So, op0>= op1 in all cases because UINT_MAX>= stop_6(D),
+   when TREE_TYPE(stop_6(D)) == unsigned int;  */
+bool
+fold_binary_cond_p (enum tree_code code, tree type, tree op0, tree op1)
+{
+  gcc_assert (type == boolean_type_node);
+
+  if (TREE_TYPE (op0) != TREE_TYPE (op1))
+    return false;
+
+  /* TODO: Handle other operations.  */
+  if (code != GE_EXPR)
+    return false;
+  // The type of op0 and op1 should be unsigned.
+  if (!TYPE_UNSIGNED (TREE_TYPE(op0)))
+    return false;
+  if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != COND_EXPR))
+    return false;
+
+  /* We have to show that in both the cases,
+     (when cond is true and when cond is false) op (op0, op1) is true.  */
+   tree neg_op0 = TREE_OPERAND (op0, 0);
+   tree cond_op1 = TREE_OPERAND (op1, 0);
+   tree true_op1 = TREE_OPERAND (op1, 1);
+   tree false_op1 = TREE_OPERAND (op1, 2);
+   gcc_assert(neg_op0 && cond_op1 && true_op1 && false_op1);
+
+  /* When cond is false. Evaluate op (op0, false_op1).  */
+  tree running_exp = fold_binary (code, boolean_type_node, op0, false_op1);
+  if (running_exp == NULL || integer_zerop (running_exp))
+    /* TODO: Handle more cases here. */
+    return false;
+
+  /* When cond is true. Evaluate op (op0, true_op1).  */
+  running_exp = fold_binary (code, boolean_type_node, op0, true_op1);
+  if (running_exp != NULL && integer_nonzerop (running_exp))
+    return true;
+
+  tree smaller, bigger;
+  if (TREE_CODE (cond_op1) == LE_EXPR)
+    {
+      smaller = TREE_OPERAND (cond_op1, 0);
+      bigger = TREE_OPERAND (cond_op1, 1);
+    } else return false;
+
+  if (TREE_CODE (true_op1) == MINUS_EXPR)
+    {
+      tree minuend = TREE_OPERAND (true_op1, 0);
+      tree subtrahend = TREE_OPERAND (true_op1, 1);
+      if (subtrahend == neg_op0 && subtrahend == smaller && minuend == bigger)
+        {
+          tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0),
+                                              TREE_TYPE (neg_op0));
+          running_exp = fold_binary (code, boolean_type_node, extreme, minuend);
+          return running_exp != NULL && integer_nonzerop(running_exp);
+        } else return false;
+    } else return false;
+}
+
+/* Return true when op0>= op1 and
+   op0 is ~start3(D) or, UINT_MAX - start3(D)
+   op1 is (_21 - start_3(D)) - 1; */
+bool
+fold_binary_minus_p (enum tree_code code, tree type, tree op0, tree op1)
+{
+  gcc_assert (type == boolean_type_node);
+
+  if (TREE_TYPE (op0) != TREE_TYPE (op1))
+    return false;
+  /* TODO: Handle other operations.  */
+  if (code != GE_EXPR)
+    return false;
+
+  // The type of op0 and op1 should be unsigned.
+  if (!TYPE_UNSIGNED (TREE_TYPE(op0)))
+    return false;
+  if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != MINUS_EXPR))
+    return false;
+
+  /* We have to show that op (op0, op1) is true.  */
+  tree neg_op0 = TREE_OPERAND (op0, 0);
+  tree minuend_op1 = TREE_OPERAND (op1, 0);
+  tree subtrahend_op1 = TREE_OPERAND (op1, 1);
+  gcc_assert(neg_op0 && subtrahend_op1 && minuend_op1);
+
+  /* TODO: Also check that the integer_cst is positive.  */
+  if (TREE_CODE (minuend_op1) != MINUS_EXPR ||
+      TREE_CODE (subtrahend_op1) != INTEGER_CST)
+    return false;
+
+  tree minuend_minuend_op1 = TREE_OPERAND (minuend_op1, 0);
+  tree subtrahend_minuend_op1 = TREE_OPERAND (minuend_op1, 1);
+
+  /* TODO: Extend this to evaluate the subtrahends.
+     i.e., when there are complicated operations in the subtrahend.  */
+  if (subtrahend_minuend_op1 != neg_op0)
+    return false;
+
+  tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0), TREE_TYPE (neg_op0));
+  tree compare_minuend = fold_binary (GE_EXPR, boolean_type_node,
+                                      extreme, minuend_minuend_op1);
+  if (compare_minuend != NULL && integer_nonzerop (compare_minuend))
+    return true;
+  return false;
+}
+
 /* Return false only when the induction variable BASE + STEP * I is
    known to not overflow: i.e. when the number of iterations is small
    enough with respect to the step and initial condition in order to
@@ -3867,6 +3978,17 @@ scev_probably_wraps_p (tree base, tree step,
       fold_undefer_and_ignore_overflow_warnings ();
       return false;
     }
+
+  if (loop->nb_iterations && at_stmt
+      && (fold_binary_cond_p (GE_EXPR, boolean_type_node,
+                            valid_niter, loop->nb_iterations) ||
+          fold_binary_minus_p (GE_EXPR, boolean_type_node,
+                             valid_niter, loop->nb_iterations)))
+    {
+      fold_undefer_and_ignore_overflow_warnings ();
+      return false;
+    }
+
   if (at_stmt)
     for (bound = loop->bounds; bound; bound = bound->next)
       {