From patchwork Thu May 21 15:58:21 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Aditya K X-Patchwork-Id: 475076 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 1D68114016A for ; Fri, 22 May 2015 01:59:08 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=KiJ+ToOK; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:from:to:subject:date:in-reply-to:references :content-type:content-transfer-encoding:mime-version; q=dns; s= default; b=Pk48akal7tkFEn/4as7mznwoaZWSOohh+Ea+2RR96pfX02s5N3HVn YGgAKPPCkiwwTlQCR9KQduS9vVh4ftmuoPnbFouMnUOK3S2AgyM0qK9xzul39NBM F2AszmflnbgRg6HXQ5fOmekFWtCsws9TjVKK7C3UG5RrckaI84vZZ8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:from:to:subject:date:in-reply-to:references :content-type:content-transfer-encoding:mime-version; s=default; bh=Ulp/5TkppsQq+XSBhzklWlIW7ps=; b=KiJ+ToOK5cZRBww3P5NsrfMd+3iy L43BP8gRepWoFWvNAA4zN7YBleo4UOmsK/snWFYhjSTa3yLRj4R4vC8KlVPVDqBt b2erF1J9YqxSASsF5NS29WIvjheCta+NcqrAm4bZDVBvcRbT4+UmxYHQd1/m3X5O JEuEHbavfZTVsMs= Received: (qmail 45948 invoked by alias); 21 May 2015 15:59:01 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 45937 invoked by uid 89); 21 May 2015 15:59:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.7 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 X-HELO: BLU004-OMC3S2.hotmail.com Received: from blu004-omc3s2.hotmail.com (HELO BLU004-OMC3S2.hotmail.com) (65.55.116.77) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA256 encrypted) ESMTPS; Thu, 21 May 2015 15:58:58 +0000 Received: from BLU179-W31 ([65.55.116.73]) by BLU004-OMC3S2.hotmail.com over TLS secured channel with Microsoft SMTPSVC(7.5.7601.22751); Thu, 21 May 2015 08:58:21 -0700 X-TMN: [jc0cr57awZvRWyie7wXA/tUrU9hNbwis] Message-ID: From: Aditya K To: "gcc-patches@gcc.gnu.org" , "a.zaafrani@samsung.com" , "sebpop@gmail.com" , "law@redhat.com" , "richard.guenther@gmail.com" Subject: RE: Fix PR48052: loop not vectorized if index is "unsigned int" Date: Thu, 21 May 2015 15:58:21 +0000 In-Reply-To: References: MIME-Version: 1.0 I tested this patch and it passes bootstrap and no extra failures. Thanks -Aditya Symbolically evaluate conditionals, and subtraction when additional constraints are provided. Adding this evaluation mechanism helps vectorize some loops on 64bit machines because on 64bit, a typecast appears which causes scev to bail out. gcc/ChangeLog: 2015-05-21  hiraditya  2015-05-21 Sebastian Pop  2015-05-21 Abderrazek Zaafrani         * gcc.dg/vect/pr48052.c: New test.         * tree-ssa-loop-niter.c (fold_binary_cond_p): Fold a conditional operation when additional constraints are         available.         (fold_binary_minus_p): Fold a subtraction operations of the form (A - B -1) when additional constraints are         available.         (scev_probably_wraps_p): Use the above two functions to find whether valid_niter>= loop->nb_iterations. diff --git a/gcc/testsuite/gcc.dg/vect/pr48052.c b/gcc/testsuite/gcc.dg/vect/pr48052.c new file mode 100644 index 0000000..8e406d7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr48052.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3" } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */ + +int foo(int* A, int* B,  unsigned start, unsigned BS) +{ +  int s; +  for (unsigned k = start;  k < start + BS; k++) +    { +      s += A[k] * B[k]; +    } + +  return s; +} + +int bar(int* A, int* B, unsigned BS) +{ +  int s; +  for (unsigned k = 0;  k < BS; k++) +    { +      s += A[k] * B[k]; +    } + +  return s; +} + ---------------------------------------- > From: hiraditya@msn.com > To: gcc-patches@gcc.gnu.org; a.zaafrani@samsung.com; sebpop@gmail.com; law@redhat.com; richard.guenther@gmail.com > Subject: Fix PR48052: loop not vectorized if index is "unsigned int" > Date: Tue, 19 May 2015 16:12:26 +0000 > > w.r.t. the PR48052, here is the patch which finds out if scev would wrap or not. > The patch symbolically evaluates if valid_niter>= loop->nb_iterations is true. In that case the scev would not wrap (??). > Currently, we only look for two special 'patterns', which are sufficient to analyze the simple test cases. > > valid_niter = ~s (= UNIT_MAX - s) > We have to prove that valid_niter>= loop->nb_iterations > > Pattern1 loop->nb_iterations: s>= e ? s - e : 0 > Pattern2 loop->nb_iterations: (e - s) -1 > > In the first case we prove that valid_niter>= loop->nb_iterations in both the cases i.e., when s>=e and when not. > In the second case we prove valid_niter>= loop->nb_iterations, by simple analysis that UINT_MAX>= e is true in all cases. > > I haven't tested this patch completely. I'm looking for feedback and any scope for improvement. > > > hth, > -Aditya > > > > Vectorize loops which has typecast. > > 2015-05-19 hiraditya > 2015-05-19 Sebastian Pop > 2015-05-19 Abderrazek Zaafrani > > * gcc.dg/vect/pr48052.c: New test. > > gcc/ChangeLog: > > 2015-05-19 hiraditya > > * tree-ssa-loop-niter.c (fold_binary_cond_p): Fold a conditional operation when additional constraints are > available. > (fold_binary_minus_p): Fold a subtraction operations of the form (A - B -1) when additional constraints are > available. > (scev_probably_wraps_p): Use the above two functions to find whether valid_niter>= loop->nb_iterations. > > > diff --git a/gcc/testsuite/gcc.dg/vect/pr48052.c b/gcc/testsuite/gcc.dg/vect/pr48052.c > new file mode 100644 > index 0000000..8e406d7 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/pr48052.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-O3" } */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */ > +/* { dg-final { cleanup-tree-dump "vect" } } */ > + > +int foo(int* A, int* B, unsigned start, unsigned BS) > +{ > + int s; > + for (unsigned k = start; k < start + BS; k++) > + { > + s += A[k] * B[k]; > + } > + > + return s; > +} > + > +int bar(int* A, int* B, unsigned BS) > +{ > + int s; > + for (unsigned k = 0; k < BS; k++) > + { > + s += A[k] * B[k]; > + } > + > + return s; > +} > + > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > index 042f8df..ddc00cc 100644 > --- a/gcc/tree-ssa-loop-niter.c > +++ b/gcc/tree-ssa-loop-niter.c > @@ -3773,6 +3773,117 @@ nowrap_type_p (tree type) > return false; > } > > +/* Return true when op0>= op1. > + For example: > + Where, op0 = ~start_3(D); > + op1 = start_3(D) <= stop_6(D) ? stop_6(D) - start_3(D) : 0; > + In this case op0 = UINT_MAX - start_3(D); > + So, op0>= op1 in all cases because UINT_MAX>= stop_6(D), > + when TREE_TYPE(stop_6(D)) == unsigned int; */ > +bool > +fold_binary_cond_p (enum tree_code code, tree type, tree op0, tree op1) > +{ > + gcc_assert (type == boolean_type_node); > + > + if (TREE_TYPE (op0) != TREE_TYPE (op1)) > + return false; > + > + /* TODO: Handle other operations. */ > + if (code != GE_EXPR) > + return false; > + // The type of op0 and op1 should be unsigned. > + if (!TYPE_UNSIGNED (TREE_TYPE(op0))) > + return false; > + if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != COND_EXPR)) > + return false; > + > + /* We have to show that in both the cases, > + (when cond is true and when cond is false) op (op0, op1) is true. */ > + tree neg_op0 = TREE_OPERAND (op0, 0); > + tree cond_op1 = TREE_OPERAND (op1, 0); > + tree true_op1 = TREE_OPERAND (op1, 1); > + tree false_op1 = TREE_OPERAND (op1, 2); > + gcc_assert(neg_op0 && cond_op1 && true_op1 && false_op1); > + > + /* When cond is false. Evaluate op (op0, false_op1). */ > + tree running_exp = fold_binary (code, boolean_type_node, op0, false_op1); > + if (running_exp == NULL || integer_zerop (running_exp)) > + /* TODO: Handle more cases here. */ > + return false; > + > + /* When cond is true. Evaluate op (op0, true_op1). */ > + running_exp = fold_binary (code, boolean_type_node, op0, true_op1); > + if (running_exp != NULL && integer_nonzerop (running_exp)) > + return true; > + > + tree smaller, bigger; > + if (TREE_CODE (cond_op1) == LE_EXPR) > + { > + smaller = TREE_OPERAND (cond_op1, 0); > + bigger = TREE_OPERAND (cond_op1, 1); > + } else return false; > + > + if (TREE_CODE (true_op1) == MINUS_EXPR) > + { > + tree minuend = TREE_OPERAND (true_op1, 0); > + tree subtrahend = TREE_OPERAND (true_op1, 1); > + if (subtrahend == neg_op0 && subtrahend == smaller && minuend == bigger) > + { > + tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0), > + TREE_TYPE (neg_op0)); > + running_exp = fold_binary (code, boolean_type_node, extreme, minuend); > + return running_exp != NULL && integer_nonzerop(running_exp); > + } else return false; > + } else return false; > +} > + > +/* Return true when op0>= op1 and > + op0 is ~start3(D) or, UINT_MAX - start3(D) > + op1 is (_21 - start_3(D)) - 1; */ > +bool > +fold_binary_minus_p (enum tree_code code, tree type, tree op0, tree op1) > +{ > + gcc_assert (type == boolean_type_node); > + > + if (TREE_TYPE (op0) != TREE_TYPE (op1)) > + return false; > + /* TODO: Handle other operations. */ > + if (code != GE_EXPR) > + return false; > + > + // The type of op0 and op1 should be unsigned. > + if (!TYPE_UNSIGNED (TREE_TYPE(op0))) > + return false; > + if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != MINUS_EXPR)) > + return false; > + > + /* We have to show that op (op0, op1) is true. */ > + tree neg_op0 = TREE_OPERAND (op0, 0); > + tree minuend_op1 = TREE_OPERAND (op1, 0); > + tree subtrahend_op1 = TREE_OPERAND (op1, 1); > + gcc_assert(neg_op0 && subtrahend_op1 && minuend_op1); > + > + /* TODO: Also check that the integer_cst is positive. */ > + if (TREE_CODE (minuend_op1) != MINUS_EXPR || > + TREE_CODE (subtrahend_op1) != INTEGER_CST) > + return false; > + > + tree minuend_minuend_op1 = TREE_OPERAND (minuend_op1, 0); > + tree subtrahend_minuend_op1 = TREE_OPERAND (minuend_op1, 1); > + > + /* TODO: Extend this to evaluate the subtrahends. > + i.e., when there are complicated operations in the subtrahend. */ > + if (subtrahend_minuend_op1 != neg_op0) > + return false; > + > + tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0), TREE_TYPE (neg_op0)); > + tree compare_minuend = fold_binary (GE_EXPR, boolean_type_node, > + extreme, minuend_minuend_op1); > + if (compare_minuend != NULL && integer_nonzerop (compare_minuend)) > + return true; > + return false; > +} > + > /* Return false only when the induction variable BASE + STEP * I is > known to not overflow: i.e. when the number of iterations is small > enough with respect to the step and initial condition in order to > @@ -3867,6 +3978,17 @@ scev_probably_wraps_p (tree base, tree step, > fold_undefer_and_ignore_overflow_warnings (); > return false; > } > + > + if (loop->nb_iterations && at_stmt > + && (fold_binary_cond_p (GE_EXPR, boolean_type_node, > + valid_niter, loop->nb_iterations) || > + fold_binary_minus_p (GE_EXPR, boolean_type_node, > + valid_niter, loop->nb_iterations))) > + { > + fold_undefer_and_ignore_overflow_warnings (); > + return false; > + } > + > if (at_stmt) > for (bound = loop->bounds; bound; bound = bound->next) > { > >diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 042f8df..ddc00cc 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -3773,6 +3773,117 @@ nowrap_type_p (tree type)    return false;  }   +/* Return true when op0>= op1. +   For example: +   Where, op0 = ~start_3(D); +   op1 = start_3(D) <= stop_6(D) ? stop_6(D) - start_3(D) : 0; +   In this case op0 = UINT_MAX - start_3(D); +   So, op0>= op1 in all cases because UINT_MAX>= stop_6(D), +   when TREE_TYPE(stop_6(D)) == unsigned int;  */ +bool +fold_binary_cond_p (enum tree_code code, tree type, tree op0, tree op1) +{ +  gcc_assert (type == boolean_type_node); + +  if (TREE_TYPE (op0) != TREE_TYPE (op1)) +    return false; + +  /* TODO: Handle other operations.  */ +  if (code != GE_EXPR) +    return false; +  // The type of op0 and op1 should be unsigned. +  if (!TYPE_UNSIGNED (TREE_TYPE(op0))) +    return false; +  if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != COND_EXPR)) +    return false; + +  /* We have to show that in both the cases, +     (when cond is true and when cond is false) op (op0, op1) is true.  */ +   tree neg_op0 = TREE_OPERAND (op0, 0); +   tree cond_op1 = TREE_OPERAND (op1, 0); +   tree true_op1 = TREE_OPERAND (op1, 1); +   tree false_op1 = TREE_OPERAND (op1, 2); +   gcc_assert(neg_op0 && cond_op1 && true_op1 && false_op1); + +  /* When cond is false. Evaluate op (op0, false_op1).  */ +  tree running_exp = fold_binary (code, boolean_type_node, op0, false_op1); +  if (running_exp == NULL || integer_zerop (running_exp)) +    /* TODO: Handle more cases here. */ +    return false; + +  /* When cond is true. Evaluate op (op0, true_op1).  */ +  running_exp = fold_binary (code, boolean_type_node, op0, true_op1); +  if (running_exp != NULL && integer_nonzerop (running_exp)) +    return true; + +  tree smaller, bigger; +  if (TREE_CODE (cond_op1) == LE_EXPR) +    { +      smaller = TREE_OPERAND (cond_op1, 0); +      bigger = TREE_OPERAND (cond_op1, 1); +    } else return false; + +  if (TREE_CODE (true_op1) == MINUS_EXPR) +    { +      tree minuend = TREE_OPERAND (true_op1, 0); +      tree subtrahend = TREE_OPERAND (true_op1, 1); +      if (subtrahend == neg_op0 && subtrahend == smaller && minuend == bigger) +        { +          tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0), +                                              TREE_TYPE (neg_op0)); +          running_exp = fold_binary (code, boolean_type_node, extreme, minuend); +          return running_exp != NULL && integer_nonzerop(running_exp); +        } else return false; +    } else return false; +} + +/* Return true when op0>= op1 and +   op0 is ~start3(D) or, UINT_MAX - start3(D) +   op1 is (_21 - start_3(D)) - 1; */ +bool +fold_binary_minus_p (enum tree_code code, tree type, tree op0, tree op1) +{ +  gcc_assert (type == boolean_type_node); + +  if (TREE_TYPE (op0) != TREE_TYPE (op1)) +    return false; +  /* TODO: Handle other operations.  */ +  if (code != GE_EXPR) +    return false; + +  // The type of op0 and op1 should be unsigned. +  if (!TYPE_UNSIGNED (TREE_TYPE(op0))) +    return false; +  if ((TREE_CODE (op0) != BIT_NOT_EXPR) || (TREE_CODE (op1) != MINUS_EXPR)) +    return false; + +  /* We have to show that op (op0, op1) is true.  */ +  tree neg_op0 = TREE_OPERAND (op0, 0); +  tree minuend_op1 = TREE_OPERAND (op1, 0); +  tree subtrahend_op1 = TREE_OPERAND (op1, 1); +  gcc_assert(neg_op0 && subtrahend_op1 && minuend_op1); + +  /* TODO: Also check that the integer_cst is positive.  */ +  if (TREE_CODE (minuend_op1) != MINUS_EXPR || +      TREE_CODE (subtrahend_op1) != INTEGER_CST) +    return false; + +  tree minuend_minuend_op1 = TREE_OPERAND (minuend_op1, 0); +  tree subtrahend_minuend_op1 = TREE_OPERAND (minuend_op1, 1); + +  /* TODO: Extend this to evaluate the subtrahends. +     i.e., when there are complicated operations in the subtrahend.  */ +  if (subtrahend_minuend_op1 != neg_op0) +    return false; + +  tree extreme = upper_bound_in_type (TREE_TYPE (neg_op0), TREE_TYPE (neg_op0)); +  tree compare_minuend = fold_binary (GE_EXPR, boolean_type_node, +                                      extreme, minuend_minuend_op1); +  if (compare_minuend != NULL && integer_nonzerop (compare_minuend)) +    return true; +  return false; +} +  /* Return false only when the induction variable BASE + STEP * I is     known to not overflow: i.e. when the number of iterations is small     enough with respect to the step and initial condition in order to @@ -3867,6 +3978,17 @@ scev_probably_wraps_p (tree base, tree step,        fold_undefer_and_ignore_overflow_warnings ();        return false;      } + +  if (loop->nb_iterations && at_stmt +      && (fold_binary_cond_p (GE_EXPR, boolean_type_node, +                            valid_niter, loop->nb_iterations) || +          fold_binary_minus_p (GE_EXPR, boolean_type_node, +                             valid_niter, loop->nb_iterations))) +    { +      fold_undefer_and_ignore_overflow_warnings (); +      return false; +    } +    if (at_stmt)      for (bound = loop->bounds; bound; bound = bound->next)        {