Message ID | VI1PR0802MB2176C2A33BF825A1B550BFA1E7190@VI1PR0802MB2176.eurprd08.prod.outlook.com |
---|---|
State | New |
Headers | show |
On Tue, Apr 18, 2017 at 12:50 PM, Bin Cheng <Bin.Cheng@arm.com> wrote: > Hi, > IVOPTs still have difficulty for outer loop (especially for large loop nest), and tend to select too many candidates. > It's generally bad because of unavoidable register spilling. In this case, we probably want to compute iv_uses with > small number of bivs. Though this results in more computation inside of loop, it could improve spilling. > This patch adds new parameter bound on number of selected candidates, it simply gives up if too many candidates are > selected. So far it works loop by loop, I am not sure if we want to by pass whole loop nest once this bound is hit. > Is it OK? Hmm, I don't like such kind of caps. We should simply add less candidates in such cases? Like cap this in find_optimal_iv_set instead, always using the original set of candidates just not trying harder? IIRC much of the problem is because the inner loop has been processed already and thus there may appear to be a very large number of "original" IVs already? I may misremeber though. I think this patch doesn't really belong in the series? Richard. > > Thanks, > bin > 2017-04-11 Bin Cheng <bin.cheng@arm.com> > > * doc/invoke.texi (iv-max-selected-candidates): New. > * params.def (PARAM_IV_MAX_SELECTED_CANDIDATES): New. > * tree-ssa-loop-ivopts.c (MAX_SELECTED_CANDIDATES): New. > (tree_ssa_iv_optimize_loop): Skip if too many cands are selected.
On Wed, Apr 26, 2017 at 2:22 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Tue, Apr 18, 2017 at 12:50 PM, Bin Cheng <Bin.Cheng@arm.com> wrote: >> Hi, >> IVOPTs still have difficulty for outer loop (especially for large loop nest), and tend to select too many candidates. >> It's generally bad because of unavoidable register spilling. In this case, we probably want to compute iv_uses with >> small number of bivs. Though this results in more computation inside of loop, it could improve spilling. >> This patch adds new parameter bound on number of selected candidates, it simply gives up if too many candidates are >> selected. So far it works loop by loop, I am not sure if we want to by pass whole loop nest once this bound is hit. >> Is it OK? > > Hmm, I don't like such kind of caps. We should simply add less > candidates in such cases? Like cap this > in find_optimal_iv_set instead, always using the original set of > candidates just not trying harder? > > IIRC much of the problem is because the inner loop has been processed > already and thus there may appear > to be a very large number of "original" IVs already? I may misremeber though. Only loop header phis contribute to BIVs, IVs created by inner loop is GIVs. Yes, inner loop sometime creates too many non-linear type GIVs for outer loop, which again causes too many candidates selected. I haven't found a good fix yet. The coming register pressure estimate should be helpful. Note when register pressure is within range, we do want inner loop to create as many GIVs for outer loop as possible. > > I think this patch doesn't really belong in the series? Okay, I will create a standalone patch if necessary. Thanks, bin > > Richard. > >> >> Thanks, >> bin >> 2017-04-11 Bin Cheng <bin.cheng@arm.com> >> >> * doc/invoke.texi (iv-max-selected-candidates): New. >> * params.def (PARAM_IV_MAX_SELECTED_CANDIDATES): New. >> * tree-ssa-loop-ivopts.c (MAX_SELECTED_CANDIDATES): New. >> (tree_ssa_iv_optimize_loop): Skip if too many cands are selected.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 19a85b6..f9cbdbb 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9922,6 +9922,10 @@ If the number of candidates in the set is smaller than this value, always try to remove unnecessary ivs from the set when adding a new one. +@item iv-max-selected-candidates +The induction variable optimizations give up on loops that more induction +variable candidates are selected. + @item avg-loop-niter Average number of iterations of a loop. diff --git a/gcc/params.def b/gcc/params.def index 1b058e4..7daab14 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -527,6 +527,14 @@ DEFPARAM(PARAM_IV_ALWAYS_PRUNE_CAND_SET_BOUND, "If number of candidates in the set is smaller, we always try to remove unused ivs during its optimization.", 10, 0, 0) +/* The induction variable optimizations give up on loops that more induction + variable candidates are selected. */ + +DEFPARAM(PARAM_IV_MAX_SELECTED_CANDIDATES, + "iv-max-selected-candidates", + "Bound on number of selected iv candidates for loops in iv optimizations.", + 48, 0, 0) + DEFPARAM(PARAM_AVG_LOOP_NITER, "avg-loop-niter", "Average number of iterations of a loop.", diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c index dcc4618..8469782 100644 --- a/gcc/tree-ssa-loop-ivopts.c +++ b/gcc/tree-ssa-loop-ivopts.c @@ -667,6 +667,12 @@ struct iv_ca_delta #define ALWAYS_PRUNE_CAND_SET_BOUND \ ((unsigned) PARAM_VALUE (PARAM_IV_ALWAYS_PRUNE_CAND_SET_BOUND)) +/* If there are more candidates slected, we just give up because it usually + causes high register pressure issue. */ + +#define MAX_SELECTED_CANDIDATES \ + ((unsigned) PARAM_VALUE (PARAM_IV_MAX_SELECTED_CANDIDATES)) + /* The list of trees for that the decl_rtl field must be reset is stored here. */ @@ -7382,7 +7388,7 @@ tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop) /* Find the optimal set of induction variables (item 3, part 2). */ iv_ca = find_optimal_iv_set (data); - if (!iv_ca) + if (!iv_ca || iv_ca->n_cands > MAX_SELECTED_CANDIDATES) goto finish; changed = true;
Hi, IVOPTs still have difficulty for outer loop (especially for large loop nest), and tend to select too many candidates. It's generally bad because of unavoidable register spilling. In this case, we probably want to compute iv_uses with small number of bivs. Though this results in more computation inside of loop, it could improve spilling. This patch adds new parameter bound on number of selected candidates, it simply gives up if too many candidates are selected. So far it works loop by loop, I am not sure if we want to by pass whole loop nest once this bound is hit. Is it OK? Thanks, bin 2017-04-11 Bin Cheng <bin.cheng@arm.com> * doc/invoke.texi (iv-max-selected-candidates): New. * params.def (PARAM_IV_MAX_SELECTED_CANDIDATES): New. * tree-ssa-loop-ivopts.c (MAX_SELECTED_CANDIDATES): New. (tree_ssa_iv_optimize_loop): Skip if too many cands are selected. From 40517ca836f868b8bd79bde56aa7c053ffef4fc2 Mon Sep 17 00:00:00 2001 From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com> Date: Tue, 7 Mar 2017 13:53:04 +0000 Subject: [PATCH 24/33] add-bound-on-selected-cands-20170221.txt --- gcc/doc/invoke.texi | 4 ++++ gcc/params.def | 8 ++++++++ gcc/tree-ssa-loop-ivopts.c | 8 +++++++- 3 files changed, 19 insertions(+), 1 deletion(-)