From patchwork Fri May 6 09:40:26 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Bin.Cheng" X-Patchwork-Id: 619234 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3r1Rbn2pflz9s9c for ; Fri, 6 May 2016 19:40:53 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=puIIH/CC; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; q=dns; s=default; b=Akv+gg+8xc0tMtET6x ks3CuAHShewEqvNq/xDHj7h3S5w6nA/4i6q1VY3lCyifbZd51jMpJuZMxE8X5p7K Zt+qWMN+QgN6v96WIgsCYX6/tWAvMGmhP4Cw/iYPgoEmA4nx64nVZjga//NCkrb7 9aQw34UxdfXDYmqa0k4iNT7MY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; s=default; bh=86GaJlkEeLsxLE68gud5t0Y2 Jg8=; b=puIIH/CCbGxKNnA+p0/HWLPCYIqwp+nQovBH6h4HjJ+Y7O/jjkrpNIfP MDkQ/wTIRQ+EH4DZhhfM9rY+p88WO2HH8s9wtXysbiv2d+JcS3ZBY7ciHnxekyC4 TRFNciBvpCcSaEZaNugduEBk4LeSj7cQuc5YBAM319KT4CKr/Yw= Received: (qmail 119533 invoked by alias); 6 May 2016 09:40:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 119483 invoked by uid 89); 6 May 2016 09:40:39 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=threading, SAVE_EXPR, save_expr, m2 X-HELO: mail-yw0-f176.google.com Received: from mail-yw0-f176.google.com (HELO mail-yw0-f176.google.com) (209.85.161.176) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 06 May 2016 09:40:28 +0000 Received: by mail-yw0-f176.google.com with SMTP id o66so194647099ywc.3 for ; Fri, 06 May 2016 02:40:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=qlOcErYDIwlbZDwNiS7terr8Cu9myHMM8R7eDthwIvw=; b=CgqpDTYo39B+BQt9CCr0aPio5gvj4eQhub7a+OeVmUjiYEqrPCIr5jbU650+XWomxy KQ5qahntd2q1hX2eSzzBAJWwejKJX6gVW4nb/LpETTJTjw78N8AJyo2LDhbqF12T4rco vrJgRjz2kgZBv1LZvok+33+TjjokvtOZW3f7ZfWOOVIzImTWsMe8ujxDMw9Mfr8lvopL 4abpRtzOGP7+s2qA+Cg9tddrdYTUO/gTmOKQhTi9oggXyP2bdrFi7zoaMmIxrMkJY2Jm poM1ua+7VUYEGGirtY5sdN/jTAtV5W9w2HTvq/6/LellxNF0WwpaTPKgp+tBmcr+gHX4 Zemg== X-Gm-Message-State: AOPr4FU54dYeoolNls/sheBFfTcj8IYHfQggn+NrN1oFsVpO+QgZVt0tKdI7ps2OREq6mZCBLjPUsZVok3BFdQ== MIME-Version: 1.0 X-Received: by 10.159.39.231 with SMTP id b94mr12814162uab.64.1462527626572; Fri, 06 May 2016 02:40:26 -0700 (PDT) Received: by 10.103.44.148 with HTTP; Fri, 6 May 2016 02:40:26 -0700 (PDT) In-Reply-To: References: Date: Fri, 6 May 2016 10:40:26 +0100 Message-ID: Subject: Re: [PATCH GCC]Proving no-trappness for array ref in tree if-conv using loop niter information. From: "Bin.Cheng" To: Richard Biener Cc: "gcc-patches@gcc.gnu.org" X-IsSubscribed: yes On Tue, May 3, 2016 at 11:08 AM, Richard Biener wrote: > On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng wrote: >> On Mon, May 2, 2016 at 10:00 AM, Richard Biener >> wrote: >>> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng wrote: >>>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener >>>> wrote: >>>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng wrote: >>>>>> Hi, >>>>>> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one. Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping. This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information. To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap. This patch also factors out base_object_writable checking if the base object is writable or not. >>>>>> Bootstrap and test on x86_64 and aarch64, is it OK? >>>>> >>>>> I think you miss to handle the case optimally where the only >>>>> non-ARRAY_REF idx is the dereference of the >>>>> base-pointer for, say, p->a[i]. In this case we can use >>>>> base_master_dr to see if p is unconditionally dereferenced >>>> Yes, will pick up this case. >>>> >>>>> in the loop. You also fail to handle the case where we have >>>>> MEM_REF[&x].a[i] that is, you see a decl base. >>>> I am having difficulty in creating this case for ifcvt, any advices? Thanks. >>> >>> Sth like >>> >>> float a[128]; >>> float foo (int n, int i) >>> { >>> return (*((float(*)[n])a))[i]; >>> } >>> >>> should do the trick (w/o the component-ref). Any other type-punning >>> would do it, too. >>> >>>>> I suppose for_each_index should be fixed for this particular case (to >>>>> return true), same for TARGET_MEM_REF TMR_BASE. >>>>> >>>>> + /* The case of nonconstant bounds could be handled, but it would be >>>>> + complicated. */ >>>>> + if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low) >>>>> + || !high || TREE_CODE (high) != INTEGER_CST) >>>>> + return false; >>>>> + >>>>> >>>>> handling of a non-zero but constant low bound is important - otherwise >>>>> all this is a no-op for Fortran. It >>>>> shouldn't be too difficult to handle after all. In fact I think your >>>>> code does handle it correctly already. >>>>> >>>>> + if (!init || TREE_CODE (init) != INTEGER_CST >>>>> + || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) >>>>> + return false; >>>>> >>>>> step == 0 should be easy to handle as well, no? The index will simply >>>>> always be 'init' ... >>>>> >>>>> + /* In case the relevant bound of the array does not fit in type, or >>>>> + it does, but bound + step (in type) still belongs into the range of the >>>>> + array, the index may wrap and still stay within the range of the array >>>>> + (consider e.g. if the array is indexed by the full range of >>>>> + unsigned char). >>>>> + >>>>> + To make things simpler, we require both bounds to fit into type, although >>>>> + there are cases where this would not be strictly necessary. */ >>>>> + if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type)) >>>>> + return false; >>>>> + >>>>> + low = fold_convert (type, low); >>>>> >>>>> please use wide_int for all of this. >>>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not >>>> sure what's the meaning by "handle "low = fold_convert (type, low);" >>>> related code in wide_int". Do you mean to use tree_int_cst_compare >>>> instead of tree_int_cst_compare in the following code? >>> >>> I don't think you need any kind of fits-to-type check here. You'd simply >>> use to_widest () when operating on / comparing with high/low. >> But what would happen if low/high and init/step are different in type >> sign-ness? Anything special I need to do before using wi::ltu_p or >> wi::lts_p directly? > > You want to use to_widest (min) which extends according to sign to > an "infinite" precision signed integer. So you can then use the new > operator< overloads as well. > Hi, Here is the updated patch. It includes below changes according to review comments: 1) It uses widest_int for all INTEGER_CST tree computations, which simplifies the patch a lot. 2) It covers array with non-zero low bound, which is important for Fortran. 3) It picks up a boundary case so that ifc-11.c/vect-23.c/vect-24.c can be handled. 4) It also checks within bound array reference inside a structure like p->a[i] by using base_master_dr in tree-if-conv.c so that ifc-12.c can be handled. It leaves two other review comments not addressed: 1) It doesn't handle array reference whose idx is a wrapping SCEV. Because only read is safe and vectorizer itself may be confused by it now. 2) It doesn't handle array reference in the form of "MEM[(float[0:(sizetype) ((long int) SAVE_EXPR + -1)] *)&b][i_1];". To handle this case, existing code in array_at_struct_end_p as well as this patch need to be improved. I tend to handle it in an independent patch. With these changes, now cases pr61194.c and vect-23.c can be vectorized, I removed XFAIL from them. Also vect-mask-store-move-1.c is affected and not vectorized now, this is tricky and it exposes a known bug PR65206 in vectorizer's dependence analyzer. This should be handled independently too. Also gcc.dg/vect/vect-24.c now is XPASS on AArch64, but not x86_64. Root cause is dom2 jump threads first iteration of loop thus idx_within_array_bound is failed. I didn't check if jump threading is good in this case, either I remove the XFAIL mark. I tend to improve idx_within_array_bound by using VRP information in a following patch. Otherwise bootstrap and test on x86_64 and AArch64 are fine. Is this version OK? Thanks, bin 2016-05-04 Bin Cheng * tree-if-conv.c (tree-ssa-loop.h): Include header file. (tree-ssa-loop-niter.h): Ditto. (idx_within_array_bound, ref_within_array_bound): New functions. (ifcvt_memrefs_wont_trap): Check if array ref is within bound. Factor out check on writable base object to ... (base_object_writable): ... here. gcc/testsuite/ChangeLog 2016-05-04 Bin Cheng * gcc.dg/tree-ssa/ifc-9.c: New test. * gcc.dg/tree-ssa/ifc-10.c: New test. * gcc.dg/tree-ssa/ifc-11.c: New test. * gcc.dg/tree-ssa/ifc-12.c: New test. * gcc.dg/vect/pr61194.c: Remove XFAIL. * gcc.dg/vect/vect-23.c: Remove XFAIL. * gcc.dg/vect/vect-mask-store-move-1.c: Revise test check. diff --git a/gcc/testsuite/gcc.dg/vect/pr61194.c b/gcc/testsuite/gcc.dg/vect/pr61194.c index 8d74e00..f7c71b9 100644 --- a/gcc/testsuite/gcc.dg/vect/pr61194.c +++ b/gcc/testsuite/gcc.dg/vect/pr61194.c @@ -38,4 +38,4 @@ int main() return 0; } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-23.c b/gcc/testsuite/gcc.dg/vect/vect-23.c index 44bed75..e463f1b 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-23.c +++ b/gcc/testsuite/gcc.dg/vect/vect-23.c @@ -123,5 +123,5 @@ int main (void) return main1 (); } -/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-24.c b/gcc/testsuite/gcc.dg/vect/vect-24.c index 09a6d7e..d27410b 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-24.c +++ b/gcc/testsuite/gcc.dg/vect/vect-24.c @@ -123,5 +123,5 @@ int main (void) return main1 (); } -/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c b/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c index f5cae4f..f928dbf 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c +++ b/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c @@ -15,4 +15,4 @@ void foo (int n) } } -/* { dg-final { scan-tree-dump-times "Move stmt to created bb" 6 "vect" { target { i?86-*-* x86_64-*-* } } } } */ +/* { dg-final { scan-tree-dump-times "Move stmt to created bb" 4 "vect" { target { i?86-*-* x86_64-*-* } } } } */ diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c index 32ced16..5527a7c 100644 --- a/gcc/tree-if-conv.c +++ b/gcc/tree-if-conv.c @@ -106,6 +106,8 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "tree-data-ref.h" #include "tree-scalar-evolution.h" +#include "tree-ssa-loop.h" +#include "tree-ssa-loop-niter.h" #include "tree-ssa-loop-ivopts.h" #include "tree-ssa-address.h" #include "dbgcnt.h" @@ -740,6 +742,105 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a) } } +/* Return TRUE if can prove the index IDX of an array reference REF is + within array bound. Return false otherwise. */ + +static bool +idx_within_array_bound (tree ref, tree *idx, void *dta) +{ + bool overflow; + widest_int niter, valid_niter, delta, wi_step; + tree ev, init, step; + tree low, high; + struct loop *loop = (struct loop*) dta; + + /* Only support within-bound access for array references. */ + if (TREE_CODE (ref) != ARRAY_REF) + return false; + + /* For arrays at the end of the structure, we are not guaranteed that they + do not really extend over their declared size. However, for arrays of + size greater than one, this is unlikely to be intended. */ + if (array_at_struct_end_p (ref)) + return false; + + ev = analyze_scalar_evolution (loop, *idx); + ev = instantiate_parameters (loop, ev); + init = initial_condition (ev); + step = evolution_part_in_loop_num (ev, loop->num); + + if (!init || TREE_CODE (init) != INTEGER_CST + || (step && TREE_CODE (step) != INTEGER_CST)) + return false; + + low = array_ref_low_bound (ref); + high = array_ref_up_bound (ref); + + /* The case of nonconstant bounds could be handled, but it would be + complicated. */ + if (TREE_CODE (low) != INTEGER_CST + || !high || TREE_CODE (high) != INTEGER_CST) + return false; + + /* Check if the intial idx is within bound. */ + if (wi::to_widest (init) < wi::to_widest (low) + || wi::to_widest (init) > wi::to_widest (high)) + return false; + + /* The idx is always within bound. */ + if (!step || integer_zerop (step)) + return true; + + if (!max_loop_iterations (loop, &niter)) + return false; + + if (wi::to_widest (step) < 0) + { + delta = wi::to_widest (init) - wi::to_widest (low); + wi_step = -wi::to_widest (step); + } + else + { + delta = wi::to_widest (high) - wi::to_widest (init); + wi_step = wi::to_widest (step); + } + + valid_niter = wi::div_floor (delta, wi_step, SIGNED, &overflow); + /* The iteration space of idx is within array bound. */ + if (!overflow && niter <= valid_niter) + return true; + + return false; +} + +/* Return TRUE if ref is a within bound array reference. */ + +static bool +ref_within_array_bound (gimple *stmt, tree ref) +{ + struct loop *loop = loop_containing_stmt (stmt); + + gcc_assert (loop != NULL); + return for_each_index (&ref, idx_within_array_bound, loop); +} + + +/* Given a memory reference expression T, return TRUE if base object + it refers to is writable. The base object of a memory reference + is the main object being referenced, which is returned by function + get_base_address. */ + +static bool +base_object_writable (tree ref) +{ + tree base_tree = get_base_address (ref); + + return (base_tree + && DECL_P (base_tree) + && decl_binds_to_current_def_p (base_tree) + && !TREE_READONLY (base_tree)); +} + /* Return true when the memory references of STMT won't trap in the if-converted code. There are two things that we have to check for: @@ -787,8 +888,13 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec drs) if (DR_W_UNCONDITIONALLY (*master_dr)) return true; - /* If a is unconditionally accessed then ... */ - if (DR_RW_UNCONDITIONALLY (*master_dr)) + /* If a is unconditionally accessed then ... + + Even a is conditional access, we can treat it as an unconditional + one if it's an array reference and all its index are within array + bound. */ + if (DR_RW_UNCONDITIONALLY (*master_dr) + || ref_within_array_bound (stmt, DR_REF (a))) { /* an unconditional read won't trap. */ if (DR_IS_READ (a)) @@ -799,16 +905,11 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec drs) if (base_master_dr && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); - else - { - /* or the base is know to be not readonly. */ - tree base_tree = get_base_address (DR_REF (a)); - if (DECL_P (base_tree) - && decl_binds_to_current_def_p (base_tree) - && ! TREE_READONLY (base_tree)) - return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); - } + /* or the base is known to be not readonly. */ + else if (base_object_writable (DR_REF (a))) + return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); } + return false; } diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c new file mode 100644 index 0000000..70b7422 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */ +/* { dg-require-visibility "" } */ + +int b[256] = {0}, y; +void bar (int *); +int foo (int x, int n) +{ + int i; + int a[128]; + + for (i = 0; i < n; i++) + { + a[i] = i; + if (x > i) + b[i] = y; + } + bar (a); + return 0; +} + +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c new file mode 100644 index 0000000..bacf428 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */ +/* { dg-require-visibility "" } */ + +int a[1024] = {0.0}; +int b[1024] = {0.0}; +int c[1024] = {0.0}; +int foo (float *x) +{ + int i = 0; + + for (i = 0; i < 1024; i++) + { + c[i] = (x[i] > 0.0) ? a[i] : b[i]; + } + + return 0; +} + +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c new file mode 100644 index 0000000..89d42b4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */ +/* { dg-require-visibility "" } */ + +struct st +{ + int a[1024]; + int b[1024]; +}; + +struct st s = {0}; +int foo (int x) +{ + int i; + struct st *p = &s; + + for (i = 0; i < 1024; i++) + { + if (x > i) + p->a[i] = p->b[i]; + } + + return 0; +} +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c new file mode 100644 index 0000000..24c19c0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */ +/* { dg-require-visibility "" } */ + +extern int b[256], y; +void bar (int *, int); +int foo (int x, int n) +{ + int i; + int a[128]; + + for (i = 0; i < n; i++) + { + a[i] = i; + if (x > i) + y = b[i]; + } + bar (a, y); + return 0; +} + +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */