[24/33] New parameter bound on number of selected candidates

Submitted by Bin Cheng on April 18, 2017, 10:50 a.m.

Details

Message ID VI1PR0802MB2176C2A33BF825A1B550BFA1E7190@VI1PR0802MB2176.eurprd08.prod.outlook.com
State New

Commit Message

Bin Cheng April 18, 2017, 10:50 a.m.
Hi,
IVOPTs still has difficulty with outer loops (especially in large loop nests) and tends to select too many candidates.
This is generally bad because it leads to unavoidable register spilling.  In this case, we probably want to compute iv_uses with
a small number of bivs.  Though this results in more computation inside the loop, it could reduce spilling.
This patch adds a new parameter that bounds the number of selected candidates; the pass simply gives up on a loop if too many candidates are
selected.  So far it works loop by loop; I am not sure whether we want to bypass the whole loop nest once this bound is hit.
Is it OK?

Thanks,
bin
2017-04-11  Bin Cheng  <bin.cheng@arm.com>

	* doc/invoke.texi (iv-max-selected-candidates): New.
	* params.def (PARAM_IV_MAX_SELECTED_CANDIDATES): New.
	* tree-ssa-loop-ivopts.c (MAX_SELECTED_CANDIDATES): New.
	(tree_ssa_iv_optimize_loop): Skip if too many cands are selected.
From 40517ca836f868b8bd79bde56aa7c053ffef4fc2 Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Tue, 7 Mar 2017 13:53:04 +0000
Subject: [PATCH 24/33] add-bound-on-selected-cands-20170221.txt

---
 gcc/doc/invoke.texi        | 4 ++++
 gcc/params.def             | 8 ++++++++
 gcc/tree-ssa-loop-ivopts.c | 8 +++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

Comments

Richard Guenther April 26, 2017, 1:22 p.m.
On Tue, Apr 18, 2017 at 12:50 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> IVOPTs still has difficulty with outer loops (especially in large loop nests) and tends to select too many candidates.
> This is generally bad because it leads to unavoidable register spilling.  In this case, we probably want to compute iv_uses with
> a small number of bivs.  Though this results in more computation inside the loop, it could reduce spilling.
> This patch adds a new parameter that bounds the number of selected candidates; the pass simply gives up on a loop if too many candidates are
> selected.  So far it works loop by loop; I am not sure whether we want to bypass the whole loop nest once this bound is hit.
> Is it OK?

Hmm, I don't like this kind of cap.  Shouldn't we simply add fewer
candidates in such cases?  For example, cap this in find_optimal_iv_set
instead, always using the original set of candidates and just not
trying harder?

IIRC much of the problem is because the inner loop has been processed
already and thus there may appear to be a very large number of
"original" IVs already?  I may misremember though.

I think this patch doesn't really belong in the series?

Richard.
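
One possible reading of the suggestion above, as a standalone C sketch rather than GCC code: instead of abandoning the loop, the selection step falls back to the set built from the original IVs whenever the searched set exceeds the bound.  `struct set_model`, `model_pick_set` and the `cost` field are hypothetical names introduced only for illustration; nothing here is taken from find_optimal_iv_set itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for a selected candidate set: only its size and
   an abstract cost matter for this sketch.  */
struct set_model
{
  unsigned n_cands;
  unsigned cost;
};

/* Choose between the set built from the original IVs (ORIG) and a set
   found by searching harder (SEARCHED).  The searched set is only
   eligible when it stays within BOUND; otherwise the original set is
   returned, so the loop is still optimized, just less aggressively.  */
static const struct set_model *
model_pick_set (const struct set_model *orig,
		const struct set_model *searched, unsigned bound)
{
  if (searched == NULL || searched->n_cands > bound)
    return orig;
  if (orig == NULL || searched->cost < orig->cost)
    return searched;
  return orig;
}
```

Under this assumption the bound limits how hard the search tries, rather than acting as a switch that disables the optimization for the loop.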

Bin.Cheng April 26, 2017, 1:51 p.m.
On Wed, Apr 26, 2017 at 2:22 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 12:50 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> IVOPTs still has difficulty with outer loops (especially in large loop nests) and tends to select too many candidates.
>> This is generally bad because it leads to unavoidable register spilling.  In this case, we probably want to compute iv_uses with
>> a small number of bivs.  Though this results in more computation inside the loop, it could reduce spilling.
>> This patch adds a new parameter that bounds the number of selected candidates; the pass simply gives up on a loop if too many candidates are
>> selected.  So far it works loop by loop; I am not sure whether we want to bypass the whole loop nest once this bound is hit.
>> Is it OK?
>
> Hmm, I don't like this kind of cap.  Shouldn't we simply add fewer
> candidates in such cases?  For example, cap this in find_optimal_iv_set
> instead, always using the original set of candidates and just not
> trying harder?
>
> IIRC much of the problem is because the inner loop has been processed
> already and thus there may appear to be a very large number of
> "original" IVs already?  I may misremember though.
Only loop header PHIs contribute to BIVs; IVs created by the inner loop
are GIVs.  Yes, the inner loop sometimes creates too many non-linear
type GIVs for the outer loop, which again causes too many candidates to
be selected.  I haven't found a good fix yet.  The upcoming register
pressure estimate should help.  Note that when register pressure is
within range, we do want the inner loop to create as many GIVs for the
outer loop as possible.

>
> I think this patch doesn't really belong in the series?
Okay, I will create a standalone patch if necessary.

Thanks,
bin
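
To illustrate the BIV/GIV distinction in the reply above, here is a small source-level C example.  It is only a sketch: the pass works on GIMPLE in SSA form, not on C source, and `sum_matrix` is a made-up function.  `i` and `j` correspond to loop header PHIs (BIVs of their loops), while `row` is computed from `i` and is therefore a GIV of the outer loop that also serves as the start address for the inner loop's accesses.

```c
#include <stddef.h>

/* Sum an N x M row-major matrix stored in A.  */
static long
sum_matrix (const int *a, size_t n, size_t m)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)	/* i: BIV of the outer loop.  */
    {
      /* row: derived from i (a + i * m), i.e. a GIV of the outer loop
	 that exists only to feed the inner loop.  */
      const int *row = a + i * m;
      for (size_t j = 0; j < m; j++)	/* j: BIV of the inner loop.  */
	total += row[j];
    }
  return total;
}
```

Every additional start value like `row` that the inner loop needs is another value the outer loop has to keep live, which is where the register pressure concern comes from.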

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 19a85b6..f9cbdbb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9922,6 +9922,10 @@  If the number of candidates in the set is smaller than this value,
 always try to remove unnecessary ivs from the set
 when adding a new one.
 
+@item iv-max-selected-candidates
+The induction variable optimizations give up on loops for which more than
+this number of induction variable candidates are selected.
+
 @item avg-loop-niter
 Average number of iterations of a loop.
 
diff --git a/gcc/params.def b/gcc/params.def
index 1b058e4..7daab14 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -527,6 +527,14 @@  DEFPARAM(PARAM_IV_ALWAYS_PRUNE_CAND_SET_BOUND,
 	 "If number of candidates in the set is smaller, we always try to remove unused ivs during its optimization.",
 	 10, 0, 0)
 
+/* The induction variable optimizations give up on loops for which more than
+   this number of induction variable candidates are selected.  */
+
+DEFPARAM(PARAM_IV_MAX_SELECTED_CANDIDATES,
+	 "iv-max-selected-candidates",
+	 "Bound on number of selected iv candidates for loops in iv optimizations.",
+	 48, 0, 0)
+
 DEFPARAM(PARAM_AVG_LOOP_NITER,
 	 "avg-loop-niter",
 	 "Average number of iterations of a loop.",
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index dcc4618..8469782 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -667,6 +667,12 @@  struct iv_ca_delta
 #define ALWAYS_PRUNE_CAND_SET_BOUND \
   ((unsigned) PARAM_VALUE (PARAM_IV_ALWAYS_PRUNE_CAND_SET_BOUND))
 
+/* If more candidates than this are selected, we just give up because that
+   usually causes high register pressure.  */
+
+#define MAX_SELECTED_CANDIDATES \
+  ((unsigned) PARAM_VALUE (PARAM_IV_MAX_SELECTED_CANDIDATES))
+
 /* The list of trees for that the decl_rtl field must be reset is stored
    here.  */
 
@@ -7382,7 +7388,7 @@  tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop)
 
   /* Find the optimal set of induction variables (item 3, part 2).  */
   iv_ca = find_optimal_iv_set (data);
-  if (!iv_ca)
+  if (!iv_ca || iv_ca->n_cands > MAX_SELECTED_CANDIDATES)
     goto finish;
   changed = true;
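
The effect of the last hunk can be modelled outside GCC.  The sketch below is not the pass's code: `struct iv_set_model`, `model_accept_set` and `MODEL_DEFAULT_BOUND` are hypothetical names, and only the `n_cands` count and the default value of 48 come from the patch.  It shows that the comparison is strict, so a set with exactly 48 candidates is still accepted and only 49 or more makes the pass give up on the loop.

```c
#include <stddef.h>

/* Hypothetical stand-in for the selected candidate set; only the number
   of selected candidates matters here.  */
struct iv_set_model
{
  unsigned n_cands;
};

/* Default bound, mirroring the value given in the params.def hunk.  */
#define MODEL_DEFAULT_BOUND 48u

/* Return nonzero if the loop would be rewritten using SET, zero if the
   pass would give up: either no set was found, or more than BOUND
   candidates were selected.  */
static int
model_accept_set (const struct iv_set_model *set, unsigned bound)
{
  return set != NULL && set->n_cands <= bound;
}
```

Assuming the patch is applied, the bound would presumably be adjusted with GCC's usual `--param name=value` syntax, for example `--param iv-max-selected-candidates=32`.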