From patchwork Sat Jun 11 09:13:19 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 99993 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 0E4B2B709A for ; Sat, 11 Jun 2011 19:13:31 +1000 (EST) Received: (qmail 29039 invoked by alias); 11 Jun 2011 09:13:29 -0000 Received: (qmail 29027 invoked by uid 22791); 11 Jun 2011 09:13:26 -0000 X-SWARE-Spam-Status: No, hits=-0.8 required=5.0 tests=AWL, BAYES_40, RCVD_IN_DNSWL_NONE, SPF_FAIL, TW_TM, TW_XV, T_FRT_SLUT X-Spam-Check-By: sourceware.org Received: from smtp-vbr6.xs4all.nl (HELO smtp-vbr6.xs4all.nl) (194.109.24.26) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sat, 11 Jun 2011 09:13:09 +0000 Received: from [192.168.1.68] (teejay.xs4all.nl [213.84.119.160]) (authenticated bits=0) by smtp-vbr6.xs4all.nl (8.13.8/8.13.8) with ESMTP id p5B9D2On064019 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 11 Jun 2011 11:13:07 +0200 (CEST) (envelope-from vries@codesourcery.com) Message-ID: <4DF331AF.7030406@codesourcery.com> Date: Sat, 11 Jun 2011 11:13:19 +0200 From: Tom de Vries User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110424 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: Zdenek Dvorak CC: gcc-patches@gcc.gnu.org Subject: Re: [PATCH PR45098, 7/10] Nowrap limits iterations References: <4DD21F6E.4050308@codesourcery.com> <4DD221CF.4040002@codesourcery.com> <4DD3FD79.2020804@codesourcery.com> <20110518211157.GA19788@kam.mff.cuni.cz> <4DD63AE1.7070600@codesourcery.com> <20110521122407.GA22860@kam.mff.cuni.cz> <4DD7FD8F.20909@codesourcery.com> <4DE110D3.8080904@codesourcery.com> <20110530123851.GA29240@kam.mff.cuni.cz> <4DE49E91.60502@codesourcery.com> <20110531080442.GA19683@kam.mff.cuni.cz> In-Reply-To: <20110531080442.GA19683@kam.mff.cuni.cz> Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi Zdenek, On 05/31/2011 10:04 AM, Zdenek Dvorak wrote: > Hi, > >> As far as I can tell, what is current calculated in i_bound (and assigned to >> nb_iterations_upper_bound), is the maximum amount of times any statement in the >> loop is executed, where any includes exit tests. Differently put, the maximum >> amount of times the loop header is executed. > > hmm... this is rather confusing, I don't really recall why I gave > nb_iterations_upper_bound a different semantics from any other instance > of what # of iterations of a loop means. > >> This is confirmed by this comment in tree-vrp.c: >> >> /* Try to use estimated number of iterations for the loop to constrain the >> final value in the evolution. >> We are interested in the number of executions of the latch, while >> nb_iterations_upper_bound includes the last execution of the exit test. */ >> >> I modified the patch to improved the comment. > > I think a better fix would be to make the nb_iterations_upper_bound semantics > consistent with that of nb_iterations. Let me try to do it, hopefully this should > be mostly mechanical, > This patch changes the semantics of nb_iterations_upper_bound and nb_iterations_estimate, to mean: the amount of latch executions. That change is countered at all use sites, except for tree-ssa-loop-ivopts.c:may_eliminate_iv. Passed x86_64 bootstrapping and reg-testing. OK for trunk? 2011-06-10 Zdenek Dvorak Tom de Vries PR target/45098 * cfgloop.h (nb_iterations_upper_bound, nb_iterations_estimate): Document changed semantics. (max_stmt_executions, max_stmt_executions_int): Declare. * tree-data-ref.c (estimated_loop_iterations) (estimated_loop_iterations_int): Move functions... * tree-ssa-loop-niter.c (estimated_loop_iterations) (estimated_loop_iterations_int): here. (record_estimate): Change nb_iterations_upper_bound and nb_iterations_estimate semantics. (max_stmt_executions, max_stmt_executions_int): New function. * tree-data-ref.c (estimated_loop_iterations_tree): Rename to ... (max_stmt_executions_tree): this. (analyze_miv_subscript): Use max_stmt_executions_tree instead of estimated_loop_iterations_tree. tree-ssa-loop-ivopts.c (avg_loop_niter): Use max_stmt_executions_int instead of estimated_loop_iterations_int. * predict.c (predict_loops): Idem. * tree-parloops.c (parallelize_loops): Idem. * tree-data-ref.c (analyze_siv_subscript_cst_affine) (compute_overlap_steps_for_affine_1_2, analyze_subscript_affine_affine) (init_omega_for_ddr_1): Idem. * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse) (loop_prefetch_arrays): Idem * graphite-sese-to-poly.c (build_loop_iteration_domains): Use max_stmt_executions instead of estimated_loop_iterations. * tree-data-ref.c (estimated_loop_iterations_tree): Idem. * tree-vrp.c (adjust_range_with_scev): Use estimated_loop_iterations instead of nb_iterations_upper_bound. Index: gcc/tree-vrp.c =================================================================== --- gcc/tree-vrp.c (revision 174810) +++ gcc/tree-vrp.c (working copy) @@ -3403,44 +3403,42 @@ adjust_range_with_scev (value_range_t *v tmax = TYPE_MAX_VALUE (type); /* Try to use estimated number of iterations for the loop to constrain the - final value in the evolution. - We are interested in the number of executions of the latch, while - nb_iterations_upper_bound includes the last execution of the exit test. */ + final value in the evolution. */ if (TREE_CODE (step) == INTEGER_CST - && loop->any_upper_bound - && !double_int_zero_p (loop->nb_iterations_upper_bound) && is_gimple_val (init) && (TREE_CODE (init) != SSA_NAME || get_value_range (init)->type == VR_RANGE)) { - value_range_t maxvr = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }; - double_int dtmp; - bool unsigned_p = TYPE_UNSIGNED (TREE_TYPE (step)); - int overflow = 0; + double_int nit; - dtmp = double_int_mul_with_sign (tree_to_double_int (step), - double_int_sub ( - loop->nb_iterations_upper_bound, - double_int_one), - unsigned_p, &overflow); - /* If the multiplication overflowed we can't do a meaningful - adjustment. Likewise if the result doesn't fit in the type - of the induction variable. For a signed type we have to - check whether the result has the expected signedness which - is that of the step as nb_iterations_upper_bound is unsigned. */ - if (!overflow - && double_int_fits_to_tree_p (TREE_TYPE (init), dtmp) - && (unsigned_p - || ((dtmp.high ^ TREE_INT_CST_HIGH (step)) >= 0))) + if (estimated_loop_iterations (loop, true, &nit)) { - tem = double_int_to_tree (TREE_TYPE (init), dtmp); - extract_range_from_binary_expr (&maxvr, PLUS_EXPR, - TREE_TYPE (init), init, tem); - /* Likewise if the addition did. */ - if (maxvr.type == VR_RANGE) + value_range_t maxvr = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }; + double_int dtmp; + bool unsigned_p = TYPE_UNSIGNED (TREE_TYPE (step)); + int overflow = 0; + + dtmp = double_int_mul_with_sign (tree_to_double_int (step), nit, + unsigned_p, &overflow); + /* If the multiplication overflowed we can't do a meaningful + adjustment. Likewise if the result doesn't fit in the type + of the induction variable. For a signed type we have to + check whether the result has the expected signedness which + is that of the step as number of iterations is unsigned. */ + if (!overflow + && double_int_fits_to_tree_p (TREE_TYPE (init), dtmp) + && (unsigned_p + || ((dtmp.high ^ TREE_INT_CST_HIGH (step)) >= 0))) { - tmin = maxvr.min; - tmax = maxvr.max; + tem = double_int_to_tree (TREE_TYPE (init), dtmp); + extract_range_from_binary_expr (&maxvr, PLUS_EXPR, + TREE_TYPE (init), init, tem); + /* Likewise if the addition did. */ + if (maxvr.type == VR_RANGE) + { + tmin = maxvr.min; + tmax = maxvr.max; + } } } } Index: gcc/tree-ssa-loop-niter.c =================================================================== --- gcc/tree-ssa-loop-niter.c (revision 174810) +++ gcc/tree-ssa-loop-niter.c (working copy) @@ -2568,18 +2568,17 @@ record_estimate (struct loop *loop, tree } /* Update the number of iteration estimates according to the bound. - If at_stmt is an exit, then every statement in the loop is - executed at most BOUND + 1 times. If it is not an exit, then - some of the statements before it could be executed BOUND + 2 - times, if an exit of LOOP is before stmt. */ + If at_stmt is an exit or dominates the single exit from the loop, + then the loop latch is executed at most BOUND times, otherwise + it can be executed BOUND + 1 times. */ exit = single_exit (loop); if (is_exit || (exit != NULL && dominated_by_p (CDI_DOMINATORS, exit->src, gimple_bb (at_stmt)))) - delta = double_int_one; + delta = double_int_zero; else - delta = double_int_two; + delta = double_int_one; i_bound = double_int_add (i_bound, delta); /* If an overflow occurred, ignore the result. */ @@ -3042,6 +3041,93 @@ estimate_numbers_of_iterations_loop (str loop->nb_iterations_estimate = loop->nb_iterations_upper_bound; } +/* Sets NIT to the estimated number of executions of the latch of the + LOOP. If CONSERVATIVE is true, we must be sure that NIT is at least as + large as the number of iterations. If we have no reliable estimate, + the function returns false, otherwise returns true. */ + +bool +estimated_loop_iterations (struct loop *loop, bool conservative, + double_int *nit) +{ + estimate_numbers_of_iterations_loop (loop, true); + if (conservative) + { + if (!loop->any_upper_bound) + return false; + + *nit = loop->nb_iterations_upper_bound; + } + else + { + if (!loop->any_estimate) + return false; + + *nit = loop->nb_iterations_estimate; + } + + return true; +} + +/* Similar to estimated_loop_iterations, but returns the estimate only + if it fits to HOST_WIDE_INT. If this is not the case, or the estimate + on the number of iterations of LOOP could not be derived, returns -1. */ + +HOST_WIDE_INT +estimated_loop_iterations_int (struct loop *loop, bool conservative) +{ + double_int nit; + HOST_WIDE_INT hwi_nit; + + if (!estimated_loop_iterations (loop, conservative, &nit)) + return -1; + + if (!double_int_fits_in_shwi_p (nit)) + return -1; + hwi_nit = double_int_to_shwi (nit); + + return hwi_nit < 0 ? -1 : hwi_nit; +} + +/* Returns an upper bound on the number of executions of statements + in the LOOP. For statements before the loop exit, this exceeds + the number of execution of the latch by one. */ + +HOST_WIDE_INT +max_stmt_executions_int (struct loop *loop, bool conservative) +{ + HOST_WIDE_INT nit = estimated_loop_iterations_int (loop, conservative); + HOST_WIDE_INT snit; + + if (nit == -1) + return -1; + + snit = (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT) nit + 1); + + /* If the computation overflows, return -1. */ + return snit < 0 ? -1 : snit; +} + +/* Sets NIT to the estimated number of executions of the latch of the + LOOP, plus one. If CONSERVATIVE is true, we must be sure that NIT is at + least as large as the number of iterations. If we have no reliable + estimate, the function returns false, otherwise returns true. */ + +bool +max_stmt_executions (struct loop *loop, bool conservative, double_int *nit) +{ + double_int nit_minus_one; + + if (!estimated_loop_iterations (loop, conservative, nit)) + return false; + + nit_minus_one = *nit; + + *nit = double_int_add (*nit, double_int_one); + + return double_int_ucmp (*nit, nit_minus_one) > 0; +} + /* Records estimates on numbers of iterations of loops. */ void Index: gcc/tree-ssa-loop-ivopts.c =================================================================== --- gcc/tree-ssa-loop-ivopts.c (revision 174810) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -115,7 +115,7 @@ along with GCC; see the file COPYING3. static inline HOST_WIDE_INT avg_loop_niter (struct loop *loop) { - HOST_WIDE_INT niter = estimated_loop_iterations_int (loop, false); + HOST_WIDE_INT niter = max_stmt_executions_int (loop, false); if (niter == -1) return AVG_LOOP_NITER (loop); Index: gcc/predict.c =================================================================== --- gcc/predict.c (revision 174810) +++ gcc/predict.c (working copy) @@ -994,7 +994,7 @@ predict_loops (void) the loop, use it to predict this exit. */ else if (n_exits == 1) { - nitercst = estimated_loop_iterations_int (loop, false); + nitercst = max_stmt_executions_int (loop, false); if (nitercst < 0) continue; if (nitercst > max) Index: gcc/tree-parloops.c =================================================================== --- gcc/tree-parloops.c (revision 174810) +++ gcc/tree-parloops.c (working copy) @@ -2134,7 +2134,7 @@ parallelize_loops (void) /* FIXME: the check for vector phi nodes could be removed. */ || loop_has_vector_phi_nodes (loop)) continue; - estimated = estimated_loop_iterations_int (loop, false); + estimated = max_stmt_executions_int (loop, false); /* FIXME: Bypass this check as graphite doesn't update the count and frequency correctly now. */ if (!flag_loop_parallelize_all Index: gcc/tree-data-ref.c =================================================================== --- gcc/tree-data-ref.c (revision 174810) +++ gcc/tree-data-ref.c (working copy) @@ -1621,66 +1621,18 @@ analyze_ziv_subscript (tree chrec_a, fprintf (dump_file, ")\n"); } -/* Sets NIT to the estimated number of executions of the statements in - LOOP. If CONSERVATIVE is true, we must be sure that NIT is at least as - large as the number of iterations. If we have no reliable estimate, - the function returns false, otherwise returns true. */ - -bool -estimated_loop_iterations (struct loop *loop, bool conservative, - double_int *nit) -{ - estimate_numbers_of_iterations_loop (loop, true); - if (conservative) - { - if (!loop->any_upper_bound) - return false; - - *nit = loop->nb_iterations_upper_bound; - } - else - { - if (!loop->any_estimate) - return false; - - *nit = loop->nb_iterations_estimate; - } - - return true; -} - -/* Similar to estimated_loop_iterations, but returns the estimate only - if it fits to HOST_WIDE_INT. If this is not the case, or the estimate - on the number of iterations of LOOP could not be derived, returns -1. */ - -HOST_WIDE_INT -estimated_loop_iterations_int (struct loop *loop, bool conservative) -{ - double_int nit; - HOST_WIDE_INT hwi_nit; - - if (!estimated_loop_iterations (loop, conservative, &nit)) - return -1; - - if (!double_int_fits_in_shwi_p (nit)) - return -1; - hwi_nit = double_int_to_shwi (nit); - - return hwi_nit < 0 ? -1 : hwi_nit; -} - -/* Similar to estimated_loop_iterations, but returns the estimate as a tree, +/* Similar to max_stmt_executions_int, but returns the bound as a tree, and only if it fits to the int type. If this is not the case, or the - estimate on the number of iterations of LOOP could not be derived, returns + bound on the number of iterations of LOOP could not be derived, returns chrec_dont_know. */ static tree -estimated_loop_iterations_tree (struct loop *loop, bool conservative) +max_stmt_executions_tree (struct loop *loop) { double_int nit; tree type; - if (!estimated_loop_iterations (loop, conservative, &nit)) + if (!max_stmt_executions (loop, true, &nit)) return chrec_dont_know; type = lang_hooks.types.type_for_size (INT_TYPE_SIZE, true); @@ -1763,7 +1715,7 @@ analyze_siv_subscript_cst_affine (tree c /* Perform weak-zero siv test to see if overlap is outside the loop bounds. */ - numiter = estimated_loop_iterations_int (loop, false); + numiter = max_stmt_executions_int (loop, true); if (numiter >= 0 && compare_tree_int (tmp, numiter) > 0) @@ -1841,7 +1793,7 @@ analyze_siv_subscript_cst_affine (tree c /* Perform weak-zero siv test to see if overlap is outside the loop bounds. */ - numiter = estimated_loop_iterations_int (loop, false); + numiter = max_stmt_executions_int (loop, true); if (numiter >= 0 && compare_tree_int (tmp, numiter) > 0) @@ -2022,10 +1974,9 @@ compute_overlap_steps_for_affine_1_2 (tr step_z = int_cst_value (CHREC_RIGHT (chrec_b)); niter_x = - estimated_loop_iterations_int (get_chrec_loop (CHREC_LEFT (chrec_a)), - false); - niter_y = estimated_loop_iterations_int (get_chrec_loop (chrec_a), false); - niter_z = estimated_loop_iterations_int (get_chrec_loop (chrec_b), false); + max_stmt_executions_int (get_chrec_loop (CHREC_LEFT (chrec_a)), true); + niter_y = max_stmt_executions_int (get_chrec_loop (chrec_a), true); + niter_z = max_stmt_executions_int (get_chrec_loop (chrec_b), true); if (niter_x < 0 || niter_y < 0 || niter_z < 0) { @@ -2350,10 +2301,8 @@ analyze_subscript_affine_affine (tree ch HOST_WIDE_INT niter, niter_a, niter_b; affine_fn ova, ovb; - niter_a = estimated_loop_iterations_int (get_chrec_loop (chrec_a), - false); - niter_b = estimated_loop_iterations_int (get_chrec_loop (chrec_b), - false); + niter_a = max_stmt_executions_int (get_chrec_loop (chrec_a), true); + niter_b = max_stmt_executions_int (get_chrec_loop (chrec_b), true); niter = MIN (niter_a, niter_b); step_a = int_cst_value (CHREC_RIGHT (chrec_a)); step_b = int_cst_value (CHREC_RIGHT (chrec_b)); @@ -2460,10 +2409,10 @@ analyze_subscript_affine_affine (tree ch if (i1 > 0 && j1 > 0) { - HOST_WIDE_INT niter_a = estimated_loop_iterations_int - (get_chrec_loop (chrec_a), false); - HOST_WIDE_INT niter_b = estimated_loop_iterations_int - (get_chrec_loop (chrec_b), false); + HOST_WIDE_INT niter_a = max_stmt_executions_int + (get_chrec_loop (chrec_a), true); + HOST_WIDE_INT niter_b = max_stmt_executions_int + (get_chrec_loop (chrec_b), true); HOST_WIDE_INT niter = MIN (niter_a, niter_b); /* (X0, Y0) is a solution of the Diophantine equation: @@ -2740,8 +2689,7 @@ analyze_miv_subscript (tree chrec_a, in the same order. */ *overlaps_a = conflict_fn (1, affine_fn_cst (integer_zero_node)); *overlaps_b = conflict_fn (1, affine_fn_cst (integer_zero_node)); - *last_conflicts = estimated_loop_iterations_tree - (get_chrec_loop (chrec_a), true); + *last_conflicts = max_stmt_executions_tree (get_chrec_loop (chrec_a)); dependence_stats.num_miv_dependent++; } @@ -3754,7 +3702,7 @@ init_omega_for_ddr_1 (struct data_refere for (i = 0; i <= DDR_INNER_LOOP (ddr) && VEC_iterate (loop_p, DDR_LOOP_NEST (ddr), i, loopi); i++) { - HOST_WIDE_INT nbi = estimated_loop_iterations_int (loopi, false); + HOST_WIDE_INT nbi = max_stmt_executions_int (loopi, true); /* 0 <= loop_x */ ineq = omega_add_zero_geq (pb, omega_black); Index: gcc/tree-ssa-loop-prefetch.c =================================================================== --- gcc/tree-ssa-loop-prefetch.c (revision 174810) +++ gcc/tree-ssa-loop-prefetch.c (working copy) @@ -1549,7 +1549,7 @@ determine_loop_nest_reuse (struct loop * continue; aloop = VEC_index (loop_p, vloops, i); - vol = estimated_loop_iterations_int (aloop, false); + vol = max_stmt_executions_int (aloop, false); if (vol < 0) vol = expected_loop_iterations (aloop); volume *= vol; @@ -1801,7 +1801,7 @@ loop_prefetch_arrays (struct loop *loop) return false; ahead = (PREFETCH_LATENCY + time - 1) / time; - est_niter = estimated_loop_iterations_int (loop, false); + est_niter = max_stmt_executions_int (loop, false); /* Prefetching is not likely to be profitable if the trip count to ahead ratio is too small. */ Index: gcc/graphite-sese-to-poly.c =================================================================== --- gcc/graphite-sese-to-poly.c (revision 174810) +++ gcc/graphite-sese-to-poly.c (working copy) @@ -1092,7 +1092,7 @@ build_loop_iteration_domains (scop_p sco scan_tree_for_params (SCOP_REGION (scop), nb_iters, ub_expr, one); mpz_clear (one); - if (estimated_loop_iterations (loop, true, &nit)) + if (max_stmt_executions (loop, true, &nit)) add_upper_bounds_from_estimated_nit (scop, nit, dim, ub_expr); /* loop_i <= expr_nb_iters */ Index: gcc/cfgloop.h =================================================================== --- gcc/cfgloop.h (revision 174810) +++ gcc/cfgloop.h (working copy) @@ -143,11 +143,13 @@ struct GTY ((chain_next ("%h.next"))) lo computes and caches the computed information in this field. */ tree nb_iterations; - /* An integer guaranteed to bound the number of iterations of the loop - from above. */ + /* An integer guaranteed to be greater or equal to nb_iterations. Only + valid if any_upper_bound is true. */ double_int nb_iterations_upper_bound; - /* An integer giving the expected number of iterations of the loop. */ + /* An integer giving an estimate on nb_iterations. Unlike + nb_iterations_upper_bound, there is no guarantee that it is at least + nb_iterations. */ double_int nb_iterations_estimate; bool any_upper_bound; @@ -278,7 +280,9 @@ extern rtx doloop_condition_get (rtx); void estimate_numbers_of_iterations_loop (struct loop *, bool); HOST_WIDE_INT estimated_loop_iterations_int (struct loop *, bool); +HOST_WIDE_INT max_stmt_executions_int (struct loop *, bool); bool estimated_loop_iterations (struct loop *, bool, double_int *); +bool max_stmt_executions (struct loop *, bool, double_int *); /* Loop manipulation. */ extern bool can_duplicate_loop_p (const struct loop *loop);