diff mbox series

Ping agian: [PATCH V2] Loop split upon semi-invariant condition (PR tree-optimization/89134)

Message ID BYAPR01MB4869403E98A1528A9219419CF7CF0@BYAPR01MB4869.prod.exchangelabs.com
State New
Headers show
Series Ping agian: [PATCH V2] Loop split upon semi-invariant condition (PR tree-optimization/89134) | expand

Commit Message

Feng Xue OS July 15, 2019, 2:19 a.m. UTC
Some time passed, so ping again. I made this patch, because it can reward us with 7%

performance benefit in some real application. For convenience, the optimization to be

implemented was listed in the following again. And hope your comments on the patch, or

design suggestions. Thanks!


Suppose a loop as:

    void f (std::map<int, int> m)
    {
        for (auto it = m.begin (); it != m.end (); ++it) {
            /* if (b) is semi-invariant. */
            if (b) {
                b = do_something();    /* Has effect on b */
            } else {
                                                        /* No effect on b */
            }
            statements;                      /* Also no effect on b */
        }
    }

A transformation, kind of loop split, could be:

    void f (std::map<int, int> m)
    {
        for (auto it = m.begin (); it != m.end (); ++it) {
            if (b) {
                b = do_something();
            } else {
                ++it;
                statements;
                break;
            }
            statements;
        }

        for (; it != m.end (); ++it) {
            statements;
        }
    }

If "statements" contains nothing, the second loop becomes an empty one, which can be removed.
And if "statements" are straight line instructions, we get an opportunity to vectorize the second loop.


Feng

Comments

Michael Matz July 29, 2019, 5:59 p.m. UTC | #1
Hello Feng,

first, sorry for the terrible delay in reviewing, but here is one now :)

Generally I do like the idea of the transformation, and the basic building 
blocks seem to be sound.  But I dislike it being a separate pass, so 
please integrate the code you have written into the existing loop split 
pass.  Most building blocks can be used as is, except the main driver.

The existing loop-split code uses loop->aux as binary marker and analyses 
and transforms loops in one go, you're doing it separately.  That 
separation makes sense for you, so the existing code should be changed to 
also do that separately.  Some info for the existing loop-split analysis 
needs to be stored in the info struct then as well, which can be done if 
you add some fields in yours.  Some splitting-out of code from the 
existing main driver is probably needed (basically the parts that 
determine eligibility and which cond statement to split).

The two routines that actually split the loops (those that call 
loop_version) also have an overlap, so maybe something more can be 
commonized between the two (ultimately the way of splitting in both 
variants is different, so somewhere they'll do something else, but still 
some parts are common).

So, with these general remarks, some more concrete ones about the patch:

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index eaef4cd63d2..0427fede3d6 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11352,6 +11352,14 @@ The maximum number of branches unswitched in a single loop.
>  @item lim-expensive
>  The minimum cost of an expensive expression in the loop invariant motion.
> 
> +@item max-cond-loop-split-insns
> +The maximum number of insns to be increased due to loop split on
> +semi-invariant condition statement.

"to be increased" --> "to be generated" (or "added")

> +@item min-cond-loop-split-prob
> +The minimum threshold for probability of semi-invaraint condition
> +statement to trigger loop split.

typo, semi-invariant
I think somewhere in the docs your definition of semi-invariant needs
to be stated in some form (can be short, doesn't need to reproduce the 
diagram or such), so don't just replicate the short info from the 
params.def file.

> diff --git a/gcc/params.def b/gcc/params.def
> index 0db60951413..5384f7d1c4d 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -386,6 +386,20 @@ DEFPARAM(PARAM_MAX_UNSWITCH_LEVEL,
>          "The maximum number of unswitchings in a single loop.",
>          3, 0, 0)
> 
> +/* The maximum number of increased insns due to loop split on semi-invariant
> +   condition statement.  */
> +DEFPARAM(PARAM_MAX_COND_LOOP_SPLIT_INSNS,
> +       "max-cond-loop-split-insns",
> +       "The maximum number of insns to be increased due to loop split on "
> +       "semi-invariant condition statement.",
> +       100, 0, 0)
> +
> +DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB,
> +       "min-cond-loop-split-prob",
> +       "The minimum threshold for probability of semi-invaraint condition "
> +       "statement to trigger loop split.",

Same typo: "semi-invariant".

> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> index 999c9a30366..7239d0cfb00 100644
> --- a/gcc/tree-ssa-loop-split.c
> +++ b/gcc/tree-ssa-loop-split.c
> -/* This file implements loop splitting, i.e. transformation of loops like
> +/* This file implements two kind of loop splitting.

kind_s_, plural

> +   One transformation of loops like:
> 
>     for (i = 0; i < 100; i++)
>       {
> @@ -670,6 +674,782 @@ tree_ssa_split_loops (void)
>    return 0;
>  }
> 
> +
> +/* Another transformation of loops like:
> +
> +   for (i = INIT (); CHECK (i); i = NEXT ())
> +     {
> +       if (expr (a_1, a_2, ..., a_n))
> +         a_j = ...;  // change at least one a_j
> +       else
> +         S;          // not change any a_j
> +     }

You should mention that 'expr' needs to be pure, i.e. once it
becomes false and the inputs don't change, that it remains false.

> +
> +   into:
> +
> +   for (i = INIT (); CHECK (i); i = NEXT ())
> +     {
> +       if (expr (a_1, a_2, ..., a_n))
> +         a_j = ...;
> +       else
> +         {
> +           S;
> +           i = NEXT ();
> +           break;
> +         }
> +     }
> +
> +   for (; CHECK (i); i = NEXT ())
> +     {
> +       S;
> +     }
> +
> +   */
> +
...
> +/* Determine when conditional statement never transfers execution to one of its
> +   branch, whether we can remove the branch's leading basic block (BRANCH_BB)
> +   and those basic blocks dominated by BRANCH_BB.  */
> +
> +static bool
> +branch_removable_p (basic_block branch_bb)
> +{
> +  if (single_pred_p (branch_bb))
> +    return true;
> +
> +  edge e;
> +  edge_iterator ei;
> +
> +  FOR_EACH_EDGE (e, ei, branch_bb->preds)
> +    {
> +      if (dominated_by_p (CDI_DOMINATORS, e->src, branch_bb))
> +       continue;
> +
> +      if (dominated_by_p (CDI_DOMINATORS, branch_bb, e->src))
> +       continue;

My gut feeling is surprised by this.  So one of the predecessors of
branch_bb dominates it.  Why should that indicate that branch_bb
can be safely removed?

Think about something like this:

   esrc --> cond_bb --> branch_bb
    '-------------------^

(cond_bb is the callers bb of the cond statement in question).  Now esrc
dominates branch_bb but still you can't simply remove it, even if
the cond_bb->branch_bb edge becomes unexecutable.

> +       /* The branch can be reached from opposite branch, or from some
> +         statement not dominated by the conditional statement.  */
> +      return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Find out which branch of a conditional statement (COND) is invariant in the
> +   execution context of LOOP.  That is: once the branch is selected in certain
> +   iteration of the loop, any operand that contributes to computation of the
> +   conditional statement remains unchanged in all following iterations.  */
> +
> +static int
> +get_cond_invariant_branch (struct loop *loop, gcond *cond)

Please return an edge here, not an edge index (which removes the using of 
-1).  I think the name (and comment) should consistently talk about 
semi-invariant, not invariant.  For when you really need an edge index 
later, you could use "EDGE_SUCC(bb, 0) != edge".  But you probably don't 
really need it, e.g. instead of using the gimple pass-local-flag on a 
statement you can just as well also store the edge in your info structure.

> +/* Given a conditional statement in LOOP, whose basic block is COND_BB,
> +   suppose its execution only goes through one of its branch, whose index is
> +   specified by BRANCH.  Return TRUE if this statement still executes multiple
> +   times in one iteration of LOOP, in that the statement belongs a nested
> +   unrecognized loop.  */
> +
> +static bool
> +is_cond_in_hidden_loop (const struct loop *loop, basic_block cond_bb,
> +                       int branch)

With above change in get_cond_invariant_branch, this also should
take an edge, not a bb+edge-index.

> +/* Calculate increased code size measured by estimated insn number if applying
> +   loop split upon certain branch (BRANCH) of a conditional statement whose
> +   basic block is COND_BB.  */
> +
> +static int
> +compute_added_num_insns (struct loop *loop, const_basic_block cond_bb,
> +                        int branch)

This should take an edge as well.

> +{
> +  const_basic_block targ_bb_var = EDGE_SUCC (cond_bb, !branch)->dest;
> +  basic_block *bbs = ((split_info *) loop->aux)->bbs;
> +  int num = 0;
> +
> +  for (unsigned i = 0; i < loop->num_nodes; i++)
> +    {
> +      /* Do no count basic blocks only in opposite branch.  */
> +      if (dominated_by_p (CDI_DOMINATORS, bbs[i], targ_bb_var))
> +       continue;
> +
> +      for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);
> +          gsi_next (&gsi))
> +       num += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);

Replace the loop by
  estimate_num_insn_seq (bb_seq (bbs[i]), &eni_size_weights);

> +/* Return true if it is eligible and profitable to perform loop split upon
> +   a conditional statement COND in LOOP.  */
> +
> +static bool
> +can_split_loop_on_cond (struct loop *loop, gcond *cond)
> +{
> +  int branch = get_cond_invariant_branch (loop, cond);
> +
> +  if (branch < 0)
> +    return false;
> +
> +  basic_block cond_bb = gimple_bb (cond);
> +  profile_probability prob = EDGE_SUCC (cond_bb, branch)->probability;
> +
> +  /* When accurate profile information is available, and execution
> +     frequency of the branch is too low, just let it go.  */
> +  if (prob.reliable_p ())
> +    {
> +      int thres = PARAM_VALUE (PARAM_MIN_COND_LOOP_SPLIT_PROB);
> +
> +      if (prob < profile_probability::always ().apply_scale (thres, 100))
> +       return false;
> +    }
> +
> +  /* Add a threshold for increased code size to disable loop split.  */
> +  if (compute_added_num_insns (loop, cond_bb, branch) >
> +      PARAM_VALUE (PARAM_MAX_COND_LOOP_SPLIT_INSNS))

Operator should be first on next line, not last on previous line.

> +    return false;
> +
> +  /* Skip conditional statement that is inside a nested unrecognized loop.  */
> +  if (is_cond_in_hidden_loop (loop, cond_bb, branch))
> +    return false;

This part (as well as is_cond_in_hidden_loop implementation) had me 
confused for a while, because "unrecognized" loops seems strange.  I think 
I now know what you try to do here, but I wonder if there's an easier way, 
or at least about which situations you stumbled into that made you write 
this code.

> +
> +  /* Temporarily keep branch index in conditional statement.  */
> +  gimple_set_plf (cond, GF_PLF_1, branch);

i.e. here, store the edge in your info structure.

> +/* Main entry point to perform loop splitting for suitable if-conditions
> +   in all loops.  */
> +
> +static unsigned int
> +tree_ssa_split_loops_for_cond (void)

So, from here on the code should be integrated into the existing code
of the file (which might need changes as well for this integration to look 
good).

That's it for now.


Ciao,
Michael.
Feng Xue OS July 31, 2019, 7:24 a.m. UTC | #2
Thanks for these comments.

Feng
Feng Xue OS Sept. 12, 2019, 10:21 a.m. UTC | #3
Hi, Michael,

  Since I was involved in other tasks, it is a little bit late to reply you. Sorry
for that. I composed a new one with your suggestions. Please review that
when you are in convenience. 

> Generally I do like the idea of the transformation, and the basic building
> blocks seem to be sound.  But I dislike it being a separate pass, so
> please integrate the code you have written into the existing loop split
> pass.  Most building blocks can be used as is, except the main driver.
This new transformation was integrated into the pass of original loop split.

>> +@item max-cond-loop-split-insns
>> +The maximum number of insns to be increased due to loop split on
>> +semi-invariant condition statement.

> "to be increased" --> "to be generated" (or "added")
Done.

>> +@item min-cond-loop-split-prob
>> +The minimum threshold for probability of semi-invaraint condition
>> +statement to trigger loop split.

> typo, semi-invariant
Done.

> I think somewhere in the docs your definition of semi-invariant needs
> to be stated in some form (can be short, doesn't need to reproduce the
> diagram or such), so don't just replicate the short info from the
> params.def file.
Done.

>> +DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB,
>> +       "min-cond-loop-split-prob",
>> +       "The minimum threshold for probability of semi-invaraint condition "
>> +       "statement to trigger loop split.",

> Same typo: "semi-invariant".
Done.

>> -/* This file implements loop splitting, i.e. transformation of loops like
>> +/* This file implements two kind of loop splitting.

> kind_s_, plural
Done.

>> +/* Another transformation of loops like:
>> +
>> +   for (i = INIT (); CHECK (i); i = NEXT ())
>> +     {
>> +       if (expr (a_1, a_2, ..., a_n))
>> +         a_j = ...;  // change at least one a_j
>> +       else
>> +         S;          // not change any a_j
>> +     }

> You should mention that 'expr' needs to be pure, i.e. once it
> becomes false and the inputs don't change, that it remains false.
Done.

>> +static bool
>> +branch_removable_p (basic_block branch_bb)
>> +{
>> +  if (single_pred_p (branch_bb))
>> +    return true;
>> +
>> +  edge e;
>> +  edge_iterator ei;
>> +
>> +  FOR_EACH_EDGE (e, ei, branch_bb->preds)
>> +    {
>> +      if (dominated_by_p (CDI_DOMINATORS, e->src, branch_bb))
>> +       continue;
>> +
>> +      if (dominated_by_p (CDI_DOMINATORS, branch_bb, e->src))
>> +       continue;

> My gut feeling is surprised by this.  So one of the predecessors of
> branch_bb dominates it.  Why should that indicate that branch_bb
> can be safely removed?
>
> Think about something like this:
>
>   esrc --> cond_bb --> branch_bb
>   '-------------------^

If all predecessors of branch_bb dominate it, these predecessors should also
be in dominating relationship among them, and the conditional statement must
be branch_bb's immediate dominator, and branch_bb is removable. In your example.

For "esrc", loop is continued, nothing is impacted. But in the next iteration, we
encounter "cond_bb", it does not dominate "branch_bb", so the function return
false in the following return statement.

> (cond_bb is the callers bb of the cond statement in question).  Now esrc
> dominates branch_bb but still you can't simply remove it, even if
> the cond_bb->branch_bb edge becomes unexecutable.


>> +static int
>> +get_cond_invariant_branch (struct loop *loop, gcond *cond)

> Please return an edge here, not an edge index (which removes the using of
> -1).  I think the name (and comment) should consistently talk about
> semi-invariant, not invariant.  For when you really need an edge index
> later, you could use "EDGE_SUCC(bb, 0) != edge".  But you probably don't
> really need it, e.g. instead of using the gimple pass-local-flag on a
> statement you can just as well also store the edge in your info structure.
Done.

>> +static bool
>> +is_cond_in_hidden_loop (const struct loop *loop, basic_block cond_bb,
>> +                       int branch)

> With above change in get_cond_invariant_branch, this also should
> take an edge, not a bb+edge-index.
Done.

>> +static int
>> +compute_added_num_insns (struct loop *loop, const_basic_block cond_bb,
>> +                        int branch)

> This should take an edge as well.
Done.

>> +  for (unsigned i = 0; i < loop->num_nodes; i++)
>> +    {
>> +      /* Do no count basic blocks only in opposite branch.  */
>> +      if (dominated_by_p (CDI_DOMINATORS, bbs[i], targ_bb_var))
>> +       continue;
>> +
>> +      for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);
>> +          gsi_next (&gsi))
>> +       num += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);

> Replace the loop by
>  estimate_num_insn_seq (bb_seq (bbs[i]), &eni_size_weights);
Done.


>> +  /* Add a threshold for increased code size to disable loop split.  */
>> +  if (compute_added_num_insns (loop, cond_bb, branch) >
>> +      PARAM_VALUE (PARAM_MAX_COND_LOOP_SPLIT_INSNS))

> Operator should be first on next line, not last on previous line.
Done.

>> +  /* Skip conditional statement that is inside a nested unrecognized loop.  */
>> +  if (is_cond_in_hidden_loop (loop, cond_bb, branch))
>> +    return false;

> This part (as well as is_cond_in_hidden_loop implementation) had me
> confused for a while, because "unrecognized" loops seems strange.  I think
> I now know what you try to do here, but I wonder if there's an easier way,
> or at least about which situations you stumbled into that made you write
> this code.
Use BB_IRREDUCIBLE_LOOP flag to check that, for tree-loop-init pass 
requires LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS which marks
irreducible loops.

>> +
>> +  /* Temporarily keep branch index in conditional statement.  */
>> +  gimple_set_plf (cond, GF_PLF_1, branch);

> i.e. here, store the edge in your info structure.
Done.

>> +/* Main entry point to perform loop splitting for suitable if-conditions
>> +   in all loops.  */
>> +
>> +static unsigned int
>> +tree_ssa_split_loops_for_cond (void)

> So, from here on the code should be integrated into the existing code
> of the file (which might need changes as well for this integration to look
> good).
Done.

Thanks,
Feng
Richard Biener Sept. 12, 2019, 11:09 a.m. UTC | #4
On Mon, Jul 15, 2019 at 4:20 AM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> Some time passed, so ping again. I made this patch, because it can reward us with 7%
>
> performance benefit in some real application. For convenience, the optimization to be
>
> implemented was listed in the following again. And hope your comments on the patch, or
>
> design suggestions. Thanks!

Replying again to the very first post since it contains the figure below.

> Suppose a loop as:
>
>     void f (std::map<int, int> m)
>     {
>         for (auto it = m.begin (); it != m.end (); ++it) {
>             /* if (b) is semi-invariant. */
>             if (b) {
>                 b = do_something();    /* Has effect on b */
>             } else {
>                                                         /* No effect on b */
>             }
>             statements;                      /* Also no effect on b */
>         }
>     }
>
> A transformation, kind of loop split, could be:
>
>     void f (std::map<int, int> m)
>     {
>         for (auto it = m.begin (); it != m.end (); ++it) {
>             if (b) {
>                 b = do_something();
>             } else {
>                 ++it;
>                 statements;
>                 break;
>             }
>             statements;
>         }
>
>         for (; it != m.end (); ++it) {
>             statements;
>         }
>     }

So if you consider approaching this from unswitching instead we'd
unswitch it on if (b) but
treat the condition as constant only in the 'false' path, thus the
transformed code would
look like the following.  I believe implementing this in the existing
unswitching pass
involves a lot less code than putting it into the splitting pass but
it would catch
exactly the same cases?

  if (b)
   {
        for (auto it = m.begin (); it != m.end (); ++it) {
             /* if (b) is non-invariant. */
            if (b) {
                b = do_something();    /* Has effect on b */
             } else {
                                                        /* No effect on b */
            }
            statements;                      /* Also no effect on b */
        }
    }
  else
    {
          for (auto it = m.begin (); it != m.end (); ++it) {
             /* if (b) is invariant. */
             if (false) {
                 b = do_something();    /* Has effect on b */
             } else {
                                                         /* No effect on b */
             }
             statements;                      /* Also no effect on b */
         }
    }


> If "statements" contains nothing, the second loop becomes an empty one, which can be removed.
> And if "statements" are straight line instructions, we get an opportunity to vectorize the second loop.
>
> Feng
>
>
> ________________________________
> From: Feng Xue OS
> Sent: Tuesday, June 18, 2019 3:00 PM
> To: Richard Biener; Michael Matz
> Cc: gcc-patches@gcc.gnu.org
> Subject: Ping: [PATCH V2] Loop split upon semi-invariant condition (PR tree-optimization/89134)
>
> Richard & Michael,
>
>    I made some adjustments on coding style and added test cases for this version.
>
>    Would you please take a look at the patch? It is long a little bit and might steal some
>    of your time.
>
> Thanks a lot.
>
> ----
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 9a46f93d89d..2334b184945 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,23 @@
> +2019-06-18  Feng Xue <fxue@os.amperecomputing.com>
> +
> +       PR tree-optimization/89134
> +       * doc/invoke.texi (max-cond-loop-split-insns): Document new --params.
> +       (min-cond-loop-split-prob): Likewise.
> +       * params.def: Add max-cond-loop-split-insns, min-cond-loop-split-prob.
> +       * passes.def (pass_cond_loop_split) : New pass.
> +       * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
> +       * tree-pass.h (make_pass_cond_loop_split): New declaration.
> +       * tree-ssa-loop-split.c (split_info): New class.
> +       (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
> +       (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
> +       (branch_removable_p, get_cond_invariant_branch): Likewise.
> +       (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
> +       (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
> +       (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
> +       (pass_data_cond_loop_split): New variable.
> +       (pass_cond_loop_split): New class.
> +       (make_pass_cond_loop_split): New function.
> +
>  2019-06-18  Kewen Lin  <linkw@gcc.gnu.org>
>
>          PR middle-end/80791
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index eaef4cd63d2..0427fede3d6 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11352,6 +11352,14 @@ The maximum number of branches unswitched in a single loop.
>  @item lim-expensive
>  The minimum cost of an expensive expression in the loop invariant motion.
>
> +@item max-cond-loop-split-insns
> +The maximum number of insns to be increased due to loop split on
> +semi-invariant condition statement.
> +
> +@item min-cond-loop-split-prob
> +The minimum threshold for probability of semi-invaraint condition
> +statement to trigger loop split.
> +
>  @item iv-consider-all-candidates-bound
>  Bound on number of candidates for induction variables, below which
>  all candidates are considered for each use in induction variable
> diff --git a/gcc/params.def b/gcc/params.def
> index 0db60951413..5384f7d1c4d 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -386,6 +386,20 @@ DEFPARAM(PARAM_MAX_UNSWITCH_LEVEL,
>          "The maximum number of unswitchings in a single loop.",
>          3, 0, 0)
>
> +/* The maximum number of increased insns due to loop split on semi-invariant
> +   condition statement.  */
> +DEFPARAM(PARAM_MAX_COND_LOOP_SPLIT_INSNS,
> +       "max-cond-loop-split-insns",
> +       "The maximum number of insns to be increased due to loop split on "
> +       "semi-invariant condition statement.",
> +       100, 0, 0)
> +
> +DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB,
> +       "min-cond-loop-split-prob",
> +       "The minimum threshold for probability of semi-invaraint condition "
> +       "statement to trigger loop split.",
> +       30, 0, 100)
> +
>  /* The maximum number of insns in loop header duplicated by the copy loop
>     headers pass.  */
>  DEFPARAM(PARAM_MAX_LOOP_HEADER_INSNS,
> diff --git a/gcc/passes.def b/gcc/passes.def
> index ad2efabd385..bb32b88738e 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -261,6 +261,7 @@ along with GCC; see the file COPYING3.  If not see
>            NEXT_PASS (pass_tree_unswitch);
>            NEXT_PASS (pass_scev_cprop);
>            NEXT_PASS (pass_loop_split);
> +         NEXT_PASS (pass_cond_loop_split);
>            NEXT_PASS (pass_loop_versioning);
>            NEXT_PASS (pass_loop_jam);
>            /* All unswitching, final value replacement and splitting can expose
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 27a522e0140..9aa069e5c29 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2019-06-18  Feng Xue  <fxue@os.amperecomputing.com>
> +
> +       * gcc.dg/tree-ssa/loop-cond-split-1.c: New test.
> +       * g++.dg/tree-ssa/loop-cond-split-1.C: New test.
> +
>  2019-06-17  Jakub Jelinek  <jakub@redhat.com>
>
>          * gcc.dg/vect/vect-simd-8.c: New test.
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/loop-cond-split-1.C b/gcc/testsuite/g++.dg/tree-ssa/loop-cond-split-1.C
> new file mode 100644
> index 00000000000..df269c5ee44
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/loop-cond-split-1.C
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-cond_lsplit-details" } */
> +
> +#include <string>
> +#include <map>
> +
> +using namespace std;
> +
> +class  A
> +{
> +public:
> +  bool empty;
> +  void set (string s);
> +};
> +
> +class  B
> +{
> +  map<int, string> m;
> +  void f ();
> +};
> +
> +extern A *ga;
> +
> +void B::f ()
> +{
> +  for (map<int, string>::iterator iter = m.begin (); iter != m.end (); ++iter)
> +    {
> +      if (ga->empty)
> +        ga->set (iter->second);
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "split loop 1 at branch" 1 "cond_lsplit" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-cond-split-1.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-cond-split-1.c
> new file mode 100644
> index 00000000000..a0eb7a26ad5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-cond-split-1.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-cond_lsplit-details" } */
> +
> +__attribute__((pure)) __attribute__((noinline)) int inc (int i)
> +{
> +  return i + 1;
> +}
> +
> +extern int do_something (void);
> +extern int b;
> +
> +void test(int n)
> +{
> +  int i;
> +
> +  for (i = 0; i < n; i = inc (i))
> +    {
> +      if (b)
> +        b = do_something();
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "split loop 1 at branch" 1 "cond_lsplit" } } */
> diff --git a/gcc/timevar.def b/gcc/timevar.def
> index 13cb470b688..5a2a80a29f7 100644
> --- a/gcc/timevar.def
> +++ b/gcc/timevar.def
> @@ -188,6 +188,7 @@ DEFTIMEVAR (TV_TREE_LOOP_IVCANON     , "tree canonical iv")
>  DEFTIMEVAR (TV_SCEV_CONST            , "scev constant prop")
>  DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH    , "tree loop unswitching")
>  DEFTIMEVAR (TV_LOOP_SPLIT            , "loop splitting")
> +DEFTIMEVAR (TV_COND_LOOP_SPLIT       , "loop splitting for conditions")
>  DEFTIMEVAR (TV_LOOP_JAM              , "unroll and jam")
>  DEFTIMEVAR (TV_COMPLETE_UNROLL       , "complete unrolling")
>  DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 3a0b3805d24..cdb7ef3c9f2 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -367,6 +367,7 @@ extern gimple_opt_pass *make_pass_lim (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_linterchange (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_tree_unswitch (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_loop_split (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_cond_loop_split (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_loop_jam (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_predcom (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_iv_canon (gcc::context *ctxt);
> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> index 999c9a30366..7239d0cfb00 100644
> --- a/gcc/tree-ssa-loop-split.c
> +++ b/gcc/tree-ssa-loop-split.c
> @@ -32,7 +32,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-loop.h"
>  #include "tree-ssa-loop-manip.h"
>  #include "tree-into-ssa.h"
> +#include "tree-inline.h"
>  #include "cfgloop.h"
> +#include "params.h"
>  #include "tree-scalar-evolution.h"
>  #include "gimple-iterator.h"
>  #include "gimple-pretty-print.h"
> @@ -40,7 +42,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-fold.h"
>  #include "gimplify-me.h"
>
> -/* This file implements loop splitting, i.e. transformation of loops like
> +/* This file implements two kind of loop splitting.
> +
> +   One transformation of loops like:
>
>     for (i = 0; i < 100; i++)
>       {
> @@ -670,6 +674,782 @@ tree_ssa_split_loops (void)
>    return 0;
>  }
>
> +
> +/* Another transformation of loops like:
> +
> +   for (i = INIT (); CHECK (i); i = NEXT ())
> +     {
> +       if (expr (a_1, a_2, ..., a_n))
> +         a_j = ...;  // change at least one a_j
> +       else
> +         S;          // not change any a_j
> +     }
> +
> +   into:
> +
> +   for (i = INIT (); CHECK (i); i = NEXT ())
> +     {
> +       if (expr (a_1, a_2, ..., a_n))
> +         a_j = ...;
> +       else
> +         {
> +           S;
> +           i = NEXT ();
> +           break;
> +         }
> +     }
> +
> +   for (; CHECK (i); i = NEXT ())
> +     {
> +       S;
> +     }
> +
> +   */
> +
> +/* Data structure to hold temporary information during loop split upon
> +   semi-invariant conditional statement.  */
> +class split_info {
> +public:
> +  /* Array of all basic blocks in a loop, returned by get_loop_body().  */
> +  basic_block *bbs;
> +
> +  /* All memory store/clobber statements in a loop.  */
> +  auto_vec<gimple *> stores;
> +
> +  /* Whether above memory stores vector has been filled.  */
> +  bool set_stores;
> +
> +  /* Semi-invariant conditional statement, upon which to split loop.  */
> +  gcond *cond;
> +
> +  split_info () : bbs (NULL),  set_stores (false), cond (NULL) { }
> +
> +  ~split_info ()
> +    {
> +      if (bbs)
> +       free (bbs);
> +    }
> +};
> +
> +/* Find all statements with memory-write effect in LOOP, including memory
> +   store and non-pure function call, and keep those in a vector.  This work
> +   is only done one time, for the vector should be constant during analysis
> +   stage of semi-invariant condition.  */
> +
> +static void
> +find_vdef_in_loop (struct loop *loop)
> +{
> +  split_info *info = (split_info *) loop->aux;
> +  gphi *vphi = get_virtual_phi (loop->header);
> +
> +  /* Indicate memory store vector has been filled.  */
> +  info->set_stores = true;
> +
> +  /* If loop contains memory operation, there must be a virtual PHI node in
> +     loop header basic block.  */
> +  if (vphi == NULL)
> +    return;
> +
> +  /* All virtual SSA names inside the loop are connected to be a cyclic
> +     graph via virtual PHI nodes.  The virtual PHI node in loop header just
> +     links the first and the last virtual SSA names, by using the last as
> +     PHI operand to define the first.  */
> +  const edge latch = loop_latch_edge (loop);
> +  const tree first = gimple_phi_result (vphi);
> +  const tree last = PHI_ARG_DEF_FROM_EDGE (vphi, latch);
> +
> +  /* The virtual SSA cyclic graph might consist of only one SSA name, who
> +     is defined by itself.
> +
> +       .MEM_1 = PHI <.MEM_2(loop entry edge), .MEM_1(latch edge)>
> +
> +     This means the loop contains only memory loads, so we can skip it.  */
> +  if (first == last)
> +    return;
> +
> +  auto_vec<gimple *> others;
> +  auto_vec<tree> worklist;
> +  auto_bitmap visited;
> +
> +  bitmap_set_bit (visited, SSA_NAME_VERSION (first));
> +  bitmap_set_bit (visited, SSA_NAME_VERSION (last));
> +  worklist.safe_push (last);
> +
> +  do
> +    {
> +      tree vuse = worklist.pop ();
> +      gimple *stmt = SSA_NAME_DEF_STMT (vuse);
> +
> +      /* We mark the first and last SSA names as visited at the beginning,
> +        and reversely start the process from the last SSA name towards the
> +        first, which ensures that this do-while will not touch SSA names
> +        defined outside of the loop.  */
> +      gcc_assert (gimple_bb (stmt)
> +                 && flow_bb_inside_loop_p (loop, gimple_bb (stmt)));
> +
> +      if (gimple_code (stmt) == GIMPLE_PHI)
> +       {
> +         gphi *phi = as_a <gphi *> (stmt);
> +
> +         for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
> +           {
> +             tree arg = gimple_phi_arg_def (stmt, i);
> +
> +             if (bitmap_set_bit (visited, SSA_NAME_VERSION (arg)))
> +               worklist.safe_push (arg);
> +           }
> +       }
> +      else
> +       {
> +         tree prev = gimple_vuse (stmt);
> +
> +         /* Non-pure call statement is conservatively assumed to impact all
> +            memory locations.  So place call statements ahead of other memory
> +            stores in the vector with an idea of of using them as shortcut
> +            terminators to memory alias analysis.  */
> +         if (gimple_code (stmt) == GIMPLE_CALL)
> +           info->stores.safe_push (stmt);
> +         else
> +           others.safe_push (stmt);
> +
> +         if (bitmap_set_bit (visited, SSA_NAME_VERSION (prev)))
> +           worklist.safe_push (prev);
> +       }
> +    } while (!worklist.is_empty ());
> +
> +    info->stores.safe_splice (others);
> +}
> +
> +
> +/* Given STMT, memory load or pure call statement, check whether it is impacted
> +   by some memory store in LOOP, excluding trace starting from SKIP_HEAD (the
> +   trace is composed of SKIP_HEAD and those basic block dominated by it, always
> +   corresponds to one branch of a conditional statement).  If SKIP_HEAD is
> +   NULL, all basic blocks of LOOP are checked.  */
> +
> +static bool
> +vuse_semi_invariant_p (struct loop *loop, gimple *stmt,
> +                      const_basic_block skip_head)
> +{
> +  split_info *info = (split_info *) loop->aux;
> +
> +  /* Collect memory store/clobber statements if have not do that.  */
> +  if (!info->set_stores)
> +    find_vdef_in_loop (loop);
> +
> +  tree rhs = is_gimple_assign (stmt) ? gimple_assign_rhs1 (stmt) : NULL_TREE;
> +  ao_ref ref;
> +  gimple *store;
> +  unsigned i;
> +
> +  ao_ref_init (&ref, rhs);
> +
> +  FOR_EACH_VEC_ELT (info->stores, i, store)
> +    {
> +      /* Skip basic blocks dominated by SKIP_HEAD, if non-NULL.  */
> +      if (skip_head
> +         && dominated_by_p (CDI_DOMINATORS, gimple_bb (store), skip_head))
> +       continue;
> +
> +      if (!ref.ref || stmt_may_clobber_ref_p_1 (store, &ref))
> +       return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Forward declaration.  */
> +
> +static bool
> +stmt_semi_invariant_p (struct loop *loop, gimple *stmt,
> +                      const_basic_block skip_head);
> +
> +/* Suppose one condition branch, led by SKIP_HEAD, is not executed since
> +   certain iteration of LOOP, check whether an SSA name (NAME) remains
> +   unchanged in next interation.  We call this characterisic as semi-
> +   invariantness.  SKIP_HEAD might be NULL, if so, nothing excluded, all
> +   basic blocks and control flows in the loop will be considered.  If non-
> +   NULL, SSA name to check is supposed to be defined before SKIP_HEAD.  */
> +
> +static bool
> +ssa_semi_invariant_p (struct loop *loop, const tree name,
> +                     const_basic_block skip_head)
> +{
> +  gimple *def = SSA_NAME_DEF_STMT (name);
> +  const_basic_block def_bb = gimple_bb (def);
> +
> +  /* An SSA name defined outside a loop is definitely semi-invariant.  */
> +  if (!def_bb || !flow_bb_inside_loop_p (loop, def_bb))
> +    return true;
> +
> +  if (gimple_code (def) == GIMPLE_PHI)
> +    {
> +      /* For PHI node that is not in loop header, its source operands should
> +        be defined inside the loop, which are seen as loop variant.  */
> +      if (def_bb != loop->header || !skip_head)
> +       return false;
> +
> +      const_edge latch = loop_latch_edge (loop);
> +      tree from = PHI_ARG_DEF_FROM_EDGE (as_a <gphi *> (def), latch);
> +
> +      /* A PHI node in loop header contains two source operands, one is
> +        initial value, the other is the copy of last iteration through loop
> +        latch, we call it latch value.  From the PHI node to definition
> +        of latch value, if excluding branch trace from SKIP_HEAD, there
> +        is no definition of other version of same variable, SSA name defined
> +        by the PHI node is semi-invariant.
> +
> +                         loop entry
> +                              |     .--- latch ---.
> +                              |     |             |
> +                              v     v             |
> +                  x_1 = PHI <x_0,  x_3>           |
> +                           |                      |
> +                           v                      |
> +              .------- if (cond) -------.         |
> +              |                         |         |
> +              |                     [ SKIP ]      |
> +              |                         |         |
> +              |                     x_2 = ...     |
> +              |                         |         |
> +              '---- T ---->.<---- F ----'         |
> +                           |                      |
> +                           v                      |
> +                  x_3 = PHI <x_1, x_2>            |
> +                           |                      |
> +                           '----------------------'
> +
> +       Suppose in certain iteration, execution flow in above graph goes
> +       through true branch, which means that one source value to define
> +       x_3 in false branch (x2) is skipped, x_3 only comes from x_1, and
> +       x_1 in next iterations is defined by x_3, we know that x_1 will
> +       never changed if COND always chooses true branch from then on.  */
> +
> +      while (from != name)
> +       {
> +         /* A new value comes from a CONSTANT.  */
> +         if (TREE_CODE (from) != SSA_NAME)
> +           return false;
> +
> +         gimple *stmt = SSA_NAME_DEF_STMT (from);
> +         const_basic_block bb = gimple_bb (stmt);
> +
> +         /* A new value comes from outside of loop.  */
> +         if (!bb || !flow_bb_inside_loop_p (loop, bb))
> +           return false;
> +
> +         from = NULL_TREE;
> +
> +         if (gimple_code (stmt) == GIMPLE_PHI)
> +           {
> +             gphi *phi = as_a <gphi *> (stmt);
> +
> +             for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
> +               {
> +                 const_edge e = gimple_phi_arg_edge (phi, i);
> +
> +                 /* Not consider redefinitions in excluded basic blocks.  */
> +                 if (!dominated_by_p (CDI_DOMINATORS, e->src, skip_head))
> +                   {
> +                     /* There are more than one source operands that can
> +                        provide value to the SSA name, it is variant.  */
> +                     if (from)
> +                       return false;
> +
> +                     from = gimple_phi_arg_def (phi, i);
> +                   }
> +               }
> +           }
> +         else if (gimple_code (stmt) == GIMPLE_ASSIGN)
> +           {
> +             /* For simple value copy, check its rhs instead.  */
> +             if (gimple_assign_ssa_name_copy_p (stmt))
> +               from = gimple_assign_rhs1 (stmt);
> +           }
> +
> +         /* Any other kind of definition is deemed to introduce a new value
> +            to the SSA name.  */
> +         if (!from)
> +           return false;
> +       }
> +       return true;
> +    }
> +
> +  /* Value originated from volatile memory load or return of normal (non-
> +     const/pure) call should not be treated as constant in each iteration.  */
> +  if (gimple_has_side_effects (def))
> +    return false;
> +
> +  /* Check if any memory store may kill memory load at this place.  */
> +  if (gimple_vuse (def) && !vuse_semi_invariant_p (loop, def, skip_head))
> +    return false;
> +
> +  /* Check operands of definition statement of the SSA name.  */
> +  return stmt_semi_invariant_p (loop, def, skip_head);
> +}
> +
> +/* Check whether STMT is semi-invariant in LOOP, iff all its operands are
> +   semi-invariant.  Trace composed of basic block SKIP_HEAD and basic blocks
> +   dominated by it are excluded from the loop.  */
> +
> +static bool
> +stmt_semi_invariant_p (struct loop *loop, gimple *stmt,
> +                      const_basic_block skip_head)
> +{
> +  ssa_op_iter iter;
> +  tree use;
> +
> +  /* Although operand of a statement might be SSA name, CONSTANT or VARDECL,
> +     here we only need to check SSA name operands.  This is because check on
> +     VARDECL operands, which involve memory loads, must have been done
> +     prior to invocation of this function in vuse_semi_invariant_p.  */
> +  FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
> +    {
> +      if (!ssa_semi_invariant_p (loop, use, skip_head))
> +       return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Determine when conditional statement never transfers execution to one of its
> +   branch, whether we can remove the branch's leading basic block (BRANCH_BB)
> +   and those basic blocks dominated by BRANCH_BB.  */
> +
> +static bool
> +branch_removable_p (basic_block branch_bb)
> +{
> +  if (single_pred_p (branch_bb))
> +    return true;
> +
> +  edge e;
> +  edge_iterator ei;
> +
> +  FOR_EACH_EDGE (e, ei, branch_bb->preds)
> +    {
> +      if (dominated_by_p (CDI_DOMINATORS, e->src, branch_bb))
> +       continue;
> +
> +      if (dominated_by_p (CDI_DOMINATORS, branch_bb, e->src))
> +       continue;
> +
> +       /* The branch can be reached from opposite branch, or from some
> +         statement not dominated by the conditional statement.  */
> +      return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Find out which branch of a conditional statement (COND) is invariant in the
> +   execution context of LOOP.  That is: once the branch is selected in certain
> +   iteration of the loop, any operand that contributes to computation of the
> +   conditional statement remains unchanged in all following iterations.  */
> +
> +static int
> +get_cond_invariant_branch (struct loop *loop, gcond *cond)
> +{
> +  basic_block cond_bb = gimple_bb (cond);
> +  basic_block targ_bb[2];
> +  bool invar[2];
> +  unsigned invar_checks;
> +
> +  for (unsigned i = 0; i < 2; i++)
> +    {
> +      targ_bb[i] = EDGE_SUCC (cond_bb, i)->dest;
> +
> +      /* One branch directs to loop exit, no need to perform loop split upon
> +        this conditional statement.  Firstly, it is trivial if the exit branch
> +        is semi-invariant, for the statement is just to break loop.  Secondly,
> +        if the opposite branch is semi-invariant, it means that the statement
> +        is real loop-invariant, which is covered by loop unswitch.  */
> +      if (!flow_bb_inside_loop_p (loop, targ_bb[i]))
> +       return -1;
> +    }
> +
> +  invar_checks = 0;
> +
> +  for (unsigned i = 0; i < 2; i++)
> +    {
> +      invar[!i] = false;
> +
> +      if (!branch_removable_p (targ_bb[i]))
> +       continue;
> +
> +      /* Given a semi-invariant branch, if its opposite branch dominates
> +        loop latch, it and its following trace will only be executed in
> +        final iteration of loop, namely it is not part of repeated body
> +        of the loop.  Similar to the above case that the branch is loop
> +        exit, no need to split loop.  */
> +      if (dominated_by_p (CDI_DOMINATORS, loop->latch, targ_bb[i]))
> +       continue;
> +
> +      invar[!i] = stmt_semi_invariant_p (loop, cond, targ_bb[i]);
> +      invar_checks++;
> +    }
> +
> +  /* With both branches being invariant (handled by loop unswitch) or
> +     variant is not what we want.  */
> +  if (invar[0] ^ !invar[1])
> +    return -1;
> +
> +  /* Found a real loop-invariant condition, do nothing.  */
> +  if (invar_checks < 2 && stmt_semi_invariant_p (loop, cond, NULL))
> +    return -1;
> +
> +  return invar[1];
> +}
> +
> +/* Given a conditional statement in LOOP, whose basic block is COND_BB,
> +   suppose its execution only goes through one of its branch, whose index is
> +   specified by BRANCH.  Return TRUE if this statement still executes multiple
> +   times in one iteration of LOOP, in that the statement belongs a nested
> +   unrecognized loop.  */
> +
> +static bool
> +is_cond_in_hidden_loop (const struct loop *loop, basic_block cond_bb,
> +                       int branch)
> +{
> +  basic_block branch_bb = EDGE_SUCC (cond_bb, branch)->dest;
> +
> +  if (cond_bb == loop->header || branch_bb == loop->latch)
> +    return false;
> +
> +  basic_block *bbs = ((split_info *) loop->aux)->bbs;
> +  auto_vec<basic_block> worklist;
> +
> +  for (unsigned i = 0; i < loop->num_nodes; i++)
> +    bbs[i]->flags &= ~BB_REACHABLE;
> +
> +  /* Mark latch basic block as visited so as to terminate reachablility
> +     traversal.  */
> +  loop->latch->flags |= BB_REACHABLE;
> +
> +  gcc_assert (flow_bb_inside_loop_p (loop, branch_bb));
> +
> +  /* Start from the specified branch, the opposite branch is ignored for it
> +     will not be executed.  */
> +  branch_bb->flags |= BB_REACHABLE;
> +  worklist.safe_push (branch_bb);
> +
> +  do
> +    {
> +      basic_block bb = worklist.pop ();
> +      edge e;
> +      edge_iterator ei;
> +
> +      FOR_EACH_EDGE (e, ei, bb->succs)
> +       {
> +         basic_block succ_bb = e->dest;
> +
> +         if (succ_bb == cond_bb)
> +           return true;
> +
> +         if (!flow_bb_inside_loop_p (loop, succ_bb))
> +           continue;
> +
> +         if (succ_bb->flags & BB_REACHABLE)
> +           continue;
> +
> +         succ_bb->flags |= BB_REACHABLE;
> +         worklist.safe_push (succ_bb);
> +       }
> +    } while (!worklist.is_empty ());
> +
> +  return false;
> +}
> +
> +
> +/* Calculate increased code size measured by estimated insn number if applying
> +   loop split upon certain branch (BRANCH) of a conditional statement whose
> +   basic block is COND_BB.  */
> +
> +static int
> +compute_added_num_insns (struct loop *loop, const_basic_block cond_bb,
> +                        int branch)
> +{
> +  const_basic_block targ_bb_var = EDGE_SUCC (cond_bb, !branch)->dest;
> +  basic_block *bbs = ((split_info *) loop->aux)->bbs;
> +  int num = 0;
> +
> +  for (unsigned i = 0; i < loop->num_nodes; i++)
> +    {
> +      /* Do no count basic blocks only in opposite branch.  */
> +      if (dominated_by_p (CDI_DOMINATORS, bbs[i], targ_bb_var))
> +       continue;
> +
> +      for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);
> +          gsi_next (&gsi))
> +       num += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);
> +    }
> +
> +  return num;
> +}
> +
> +/* Return true if it is eligible and profitable to perform loop split upon
> +   a conditional statement COND in LOOP.  */
> +
> +static bool
> +can_split_loop_on_cond (struct loop *loop, gcond *cond)
> +{
> +  int branch = get_cond_invariant_branch (loop, cond);
> +
> +  if (branch < 0)
> +    return false;
> +
> +  basic_block cond_bb = gimple_bb (cond);
> +  profile_probability prob = EDGE_SUCC (cond_bb, branch)->probability;
> +
> +  /* When accurate profile information is available, and execution
> +     frequency of the branch is too low, just let it go.  */
> +  if (prob.reliable_p ())
> +    {
> +      int thres = PARAM_VALUE (PARAM_MIN_COND_LOOP_SPLIT_PROB);
> +
> +      if (prob < profile_probability::always ().apply_scale (thres, 100))
> +       return false;
> +    }
> +
> +  /* Add a threshold for increased code size to disable loop split.  */
> +  if (compute_added_num_insns (loop, cond_bb, branch) >
> +      PARAM_VALUE (PARAM_MAX_COND_LOOP_SPLIT_INSNS))
> +    return false;
> +
> +  /* Skip conditional statement that is inside a nested unrecognized loop.  */
> +  if (is_cond_in_hidden_loop (loop, cond_bb, branch))
> +    return false;
> +
> +  /* Temporarily keep branch index in conditional statement.  */
> +  gimple_set_plf (cond, GF_PLF_1, branch);
> +  return true;
> +}
> +
> +/* Traverse all conditional statements in LOOP, to find out a good candidate
> +   upon which we can do loop split.  */
> +
> +static bool
> +mark_cond_to_split_loop (struct loop *loop)
> +{
> +  split_info *info = new split_info ();
> +  basic_block *bbs = info->bbs = get_loop_body (loop);
> +
> +  /* Allocate an area to keep temporary info, and associate its address
> +     with loop aux field.  */
> +  loop->aux = info;
> +
> +  for (unsigned i = 0; i < loop->num_nodes; i++)
> +    {
> +      basic_block bb = bbs[i];
> +
> +      /* We only consider conditional statement, which be executed at most once
> +        in each iteration of the loop.  So skip statements in inner loops.  */
> +      if (bb->loop_father != loop)
> +       continue;
> +
> +      /* Actually this check is not a must constraint. With it, we can ensure
> +        conditional statement will always be executed in each iteration. */
> +      if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> +       continue;
> +
> +      gimple *last = last_stmt (bb);
> +
> +      if (!last || gimple_code (last) != GIMPLE_COND)
> +       continue;
> +
> +      gcond *cond = as_a <gcond *> (last);
> +
> +      if (can_split_loop_on_cond (loop, cond))
> +       {
> +         info->cond = cond;
> +         return true;
> +       }
> +    }
> +
> +  delete info;
> +  loop->aux = NULL;
> +
> +  return false;
> +}
> +
> +/* Given a loop (LOOP1) with a chosen conditional statement candidate, perform
> +   loop split transformation illustrated as the following graph.
> +
> +               .-------T------ if (true) ------F------.
> +               |                    .---------------. |
> +               |                    |               | |
> +               v                    |               v v
> +          pre-header                |            pre-header
> +               | .------------.     |                 | .------------.
> +               | |            |     |                 | |            |
> +               | v            |     |                 | v            |
> +             header           |     |               header           |
> +               |              |     |                 |              |
> +       [ bool r = cond; ]     |     |                 |              |
> +               |              |     |                 |              |
> +      .---- if (r) -----.     |     |        .--- if (true) ---.     |
> +      |                 |     |     |        |                 |     |
> +  invariant             |     |     |    invariant             |     |
> +      |                 |     |     |        |                 |     |
> +      '---T--->.<---F---'     |     |        '---T--->.<---F---'     |
> +               |              |    /                  |              |
> +             stmts            |   /                 stmts            |
> +               |              |  /                    |              |
> +              / \             | /                    / \             |
> +     .-------*   *       [ if (!r) ]        .-------*   *            |
> +     |           |            |             |           |            |
> +     |         latch          |             |         latch          |
> +     |           |            |             |           |            |
> +     |           '------------'             |           '------------'
> +     '------------------------. .-----------'
> +             loop1            | |                   loop2
> +                              v v
> +                             exits
> +
> +   In the graph, loop1 represents the part derived from original one, and
> +   loop2 is duplicated using loop_version (), which corresponds to the part
> +   of original one being splitted out.  In loop1, a new bool temporary (r)
> +   is introduced to keep value of the condition result.  In original latch
> +   edge of loop1, we insert a new conditional statement whose value comes
> +   from previous temporary (r), one of its branch goes back to loop1 header
> +   as a latch edge, and the other branch goes to loop2 pre-header as an entry
> +   edge.  And also in loop2, we abandon the variant branch of the conditional
> +   statement candidate by setting a constant bool condition, based on which
> +   branch is semi-invariant.  */
> +
> +static bool
> +split_loop_for_cond (struct loop *loop1)
> +{
> +  split_info *info = (split_info *) loop1->aux;
> +  gcond *cond = info->cond;
> +  basic_block cond_bb = gimple_bb (cond);
> +  int branch = gimple_plf (cond, GF_PLF_1);
> +  bool true_invar = !!(EDGE_SUCC (cond_bb, branch)->flags & EDGE_TRUE_VALUE);
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +   {
> +     fprintf (dump_file, "In %s(), split loop %d at branch<%s>, BB %d\n",
> +             current_function_name (), loop1->num,
> +             true_invar ? "T" : "F", cond_bb->index);
> +     print_gimple_stmt (dump_file, cond, 0, TDF_SLIM | TDF_VOPS);
> +   }
> +
> +  initialize_original_copy_tables ();
> +
> +  struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
> +                                    profile_probability::always (),
> +                                    profile_probability::never (),
> +                                    profile_probability::always (),
> +                                    profile_probability::always (),
> +                                    true);
> +  if (!loop2)
> +    {
> +      free_original_copy_tables ();
> +      return false;
> +    }
> +
> +  /* Generate a bool type temporary to hold result of the condition.  */
> +  tree tmp = make_ssa_name (boolean_type_node);
> +  gimple_stmt_iterator gsi = gsi_last_bb (cond_bb);
> +  gimple *stmt = gimple_build_assign (tmp,
> +                                     gimple_cond_code (cond),
> +                                     gimple_cond_lhs (cond),
> +                                     gimple_cond_rhs (cond));
> +
> +  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
> +  gimple_cond_set_condition (cond, EQ_EXPR, tmp, boolean_true_node);
> +  update_stmt (cond);
> +
> +  basic_block cond_bb_copy = get_bb_copy (cond_bb);
> +  gcond *cond_copy = as_a<gcond *> (last_stmt (cond_bb_copy));
> +
> +  /* Replace the condition in loop2 with a bool constant to let PassManager
> +     remove the variant branch after current pass completes.  */
> +  if (true_invar)
> +    gimple_cond_make_true (cond_copy);
> +  else
> +    gimple_cond_make_false (cond_copy);
> +
> +  update_stmt (cond_copy);
> +
> +  /* Insert a new conditional statement on latch edge of loop1.  This
> +     statement acts as a switch to transfer execution from loop1 to loop2,
> +     when loop1 enters into invariant state.  */
> +  basic_block latch_bb = split_edge (loop_latch_edge (loop1));
> +  basic_block break_bb = split_edge (single_pred_edge (latch_bb));
> +  gimple *break_cond = gimple_build_cond (EQ_EXPR, tmp, boolean_true_node,
> +                                         NULL_TREE, NULL_TREE);
> +
> +  gsi = gsi_last_bb (break_bb);
> +  gsi_insert_after (&gsi, break_cond, GSI_NEW_STMT);
> +
> +  edge to_loop1 = single_succ_edge (break_bb);
> +  edge to_loop2 = make_edge (break_bb, loop_preheader_edge (loop2)->src, 0);
> +
> +  to_loop1->flags &= ~EDGE_FALLTHRU;
> +  to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE;
> +  to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE;
> +
> +  update_ssa (TODO_update_ssa);
> +
> +  /* Due to introduction of a control flow edge from loop1 latch to loop2
> +     pre-header, we should update PHIs in loop2 to reflect this connection
> +     between loop1 and loop2.  */
> +  connect_loop_phis (loop1, loop2, to_loop2);
> +
> +  free_original_copy_tables ();
> +
> +  rewrite_into_loop_closed_ssa_1 (NULL, 0, SSA_OP_USE, loop1);
> +
> +  return true;
> +}
> +
> +/* Main entry point to perform loop splitting for suitable if-conditions
> +   in all loops.  */
> +
> +static unsigned int
> +tree_ssa_split_loops_for_cond (void)
> +{
> +  struct loop *loop;
> +  auto_vec<struct loop *> loop_list;
> +  bool changed = false;
> +  unsigned i;
> +
> +  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
> +    loop->aux = NULL;
> +
> +  /* Go through all loops starting from innermost.  */
> +  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
> +    {
> +      /* Put loop in a list if found a conditional statement candidate in it.
> +        This is stage for analysis, not change anything of the function.  */
> +      if (!loop->aux
> +         && !optimize_loop_for_size_p (loop)
> +         && mark_cond_to_split_loop (loop))
> +       loop_list.safe_push (loop);
> +
> +      /* If any of our inner loops was split, don't split us,
> +        and mark our containing loop as having had splits as well.  */
> +      loop_outer (loop)->aux = loop->aux;
> +    }
> +
> +  FOR_EACH_VEC_ELT (loop_list, i, loop)
> +    {
> +      /* Extract selected loop and perform loop split.  This is stage for
> +        transformation.  */
> +      changed |= split_loop_for_cond (loop);
> +
> +      delete (split_info *) loop->aux;
> +    }
> +
> +  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
> +    loop->aux = NULL;
> +
> +  if (changed)
> +    return TODO_cleanup_cfg;
> +  return 0;
> +}
> +
> +
>  /* Loop splitting pass.  */
>
>  namespace {
> @@ -716,3 +1496,48 @@ make_pass_loop_split (gcc::context *ctxt)
>  {
>    return new pass_loop_split (ctxt);
>  }
> +
> +namespace {
> +
> +const pass_data pass_data_cond_loop_split =
> +{
> +  GIMPLE_PASS, /* type */
> +  "cond_lsplit", /* name */
> +  OPTGROUP_LOOP, /* optinfo_flags */
> +  TV_COND_LOOP_SPLIT, /* tv_id */
> +  PROP_cfg, /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  0, /* todo_flags_finish */
> +};
> +
> +class pass_cond_loop_split : public gimple_opt_pass
> +{
> +public:
> +  pass_cond_loop_split (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_cond_loop_split, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *) { return flag_split_loops != 0; }
> +  virtual unsigned int execute (function *);
> +
> +}; // class pass_cond_loop_split
> +
> +unsigned int
> +pass_cond_loop_split::execute (function *fun)
> +{
> +  if (number_of_loops (fun) <= 1)
> +    return 0;
> +
> +  return tree_ssa_split_loops_for_cond ();
> +}
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_cond_loop_split (gcc::context *ctxt)
> +{
> +  return new pass_cond_loop_split (ctxt);
> +}
> --
> 2.17.1
Feng Xue OS Sept. 12, 2019, 1:52 p.m. UTC | #5
>> Suppose a loop as:
>>
>>     void f (std::map<int, int> m)
>>     {
>>         for (auto it = m.begin (); it != m.end (); ++it) {
>>             /* if (b) is semi-invariant. */
>>             if (b) {
>>                 b = do_something();    /* Has effect on b */
>>             } else {
>>                                                         /* No effect on b */
>>             }
>>             statements;                      /* Also no effect on b */
>>         }
>>     }
>>
>> A transformation, kind of loop split, could be:
>>
>>     void f (std::map<int, int> m)
>>     {
>>         for (auto it = m.begin (); it != m.end (); ++it) {
>>             if (b) {
>>                 b = do_something();
>>             } else {
>>                 ++it;
>>                 statements;
>>                 break;
>>             }
>>             statements;
>>         }
>>
>>         for (; it != m.end (); ++it) {
>>             statements;
>>         }
>>     }

> So if you consider approaching this from unswitching instead we'd
> unswitch it on if (b) but
> treat the condition as constant only in the 'false' path, thus the
> transformed code would
> look like the following.  I believe implementing this in the existing
> unswitching pass
> involves a lot less code than putting it into the splitting pass but
> it would catch
> exactly the same cases?

May not.

Firstly, the following transformation is legal only when "b" is
semi-invariant, which means once a branch of "if (b)" is selected since
certain iteration, execution will always go to that branch in all following
iterations. Most of code in this loop-split patch was composed to check 
semi-invariantness of a conditional expression, which must also be needed
in loop-unswitch solution.

Secondly, to duplicate/move an invariant expression out of loop is simple,
for all intermediate computations should already be outside, only need to 
handle its result SSA that is inside the loop. But for semi-invariant, we
have to duplicate a tree of statements that contributes to computation of
the condition expression, similar to code hoisting, which make us write 
extra code. 

And loop-unswitch solution is weaker than loop-split. Suppose initial
value of "b" is true, and is changed to false after some iterations, 
not only unswitch does not help that, but also it introduces extra cost
due to enclosing "if (b)".

>   if (b)
>    {
>         for (auto it = m.begin (); it != m.end (); ++it) {
>              /* if (b) is non-invariant. */
>             if (b) {
>                 b = do_something();    /* Has effect on b */
>              } else {
>                                                         /* No effect on b */
>             }
>             statements;                      /* Also no effect on b */
>         }
>     }
>   else
>     {
>           for (auto it = m.begin (); it != m.end (); ++it) {
>              /* if (b) is invariant. */
>              if (false) {
>                  b = do_something();    /* Has effect on b */
>              } else {
>                                                          /* No effect on b */
>              }
>              statements;                      /* Also no effect on b */
>          }
>     }
Feng Xue OS Oct. 9, 2019, 3:23 a.m. UTC | #6
Hi, Michael,

   Would you please take a look at this modified version?

Thanks,
Feng
diff mbox series

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9a46f93d89d..2334b184945 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,23 @@ 
+2019-06-18  Feng Xue <fxue@os.amperecomputing.com>
+
+       PR tree-optimization/89134
+       * doc/invoke.texi (max-cond-loop-split-insns): Document new --params.
+       (min-cond-loop-split-prob): Likewise.
+       * params.def: Add max-cond-loop-split-insns, min-cond-loop-split-prob.
+       * passes.def (pass_cond_loop_split) : New pass.
+       * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
+       * tree-pass.h (make_pass_cond_loop_split): New declaration.
+       * tree-ssa-loop-split.c (split_info): New class.
+       (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
+       (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
+       (branch_removable_p, get_cond_invariant_branch): Likewise.
+       (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
+       (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
+       (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
+       (pass_data_cond_loop_split): New variable.
+       (pass_cond_loop_split): New class.
+       (make_pass_cond_loop_split): New function.
+
 2019-06-18  Kewen Lin  <linkw@gcc.gnu.org>

         PR middle-end/80791
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index eaef4cd63d2..0427fede3d6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11352,6 +11352,14 @@  The maximum number of branches unswitched in a single loop.
 @item lim-expensive
 The minimum cost of an expensive expression in the loop invariant motion.

+@item max-cond-loop-split-insns
+The maximum number of insns to be increased due to loop split on
+semi-invariant condition statement.
+
+@item min-cond-loop-split-prob
+The minimum threshold for probability of semi-invaraint condition
+statement to trigger loop split.
+
 @item iv-consider-all-candidates-bound
 Bound on number of candidates for induction variables, below which
 all candidates are considered for each use in induction variable
diff --git a/gcc/params.def b/gcc/params.def
index 0db60951413..5384f7d1c4d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -386,6 +386,20 @@  DEFPARAM(PARAM_MAX_UNSWITCH_LEVEL,
         "The maximum number of unswitchings in a single loop.",
         3, 0, 0)

+/* The maximum number of increased insns due to loop split on semi-invariant
+   condition statement.  */
+DEFPARAM(PARAM_MAX_COND_LOOP_SPLIT_INSNS,
+       "max-cond-loop-split-insns",
+       "The maximum number of insns to be increased due to loop split on "
+       "semi-invariant condition statement.",
+       100, 0, 0)
+
+DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB,
+       "min-cond-loop-split-prob",
+       "The minimum threshold for probability of semi-invaraint condition "
+       "statement to trigger loop split.",
+       30, 0, 100)
+
 /* The maximum number of insns in loop header duplicated by the copy loop
    headers pass.  */
 DEFPARAM(PARAM_MAX_LOOP_HEADER_INSNS,
diff --git a/gcc/passes.def b/gcc/passes.def
index ad2efabd385..bb32b88738e 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -261,6 +261,7 @@  along with GCC; see the file COPYING3.  If not see
           NEXT_PASS (pass_tree_unswitch);
           NEXT_PASS (pass_scev_cprop);
           NEXT_PASS (pass_loop_split);
+         NEXT_PASS (pass_cond_loop_split);
           NEXT_PASS (pass_loop_versioning);
           NEXT_PASS (pass_loop_jam);
           /* All unswitching, final value replacement and splitting can expose
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 27a522e0140..9aa069e5c29 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@ 
+2019-06-18  Feng Xue  <fxue@os.amperecomputing.com>
+
+       * gcc.dg/tree-ssa/loop-cond-split-1.c: New test.
+       * g++.dg/tree-ssa/loop-cond-split-1.C: New test.
+
 2019-06-17  Jakub Jelinek  <jakub@redhat.com>

         * gcc.dg/vect/vect-simd-8.c: New test.
diff --git a/gcc/testsuite/g++.dg/tree-ssa/loop-cond-split-1.C b/gcc/testsuite/g++.dg/tree-ssa/loop-cond-split-1.C
new file mode 100644
index 00000000000..df269c5ee44
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/loop-cond-split-1.C
@@ -0,0 +1,33 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-cond_lsplit-details" } */
+
+#include <string>
+#include <map>
+
+using namespace std;
+
+class  A
+{
+public:
+  bool empty;
+  void set (string s);
+};
+
+class  B
+{
+  map<int, string> m;
+  void f ();
+};
+
+extern A *ga;
+
+void B::f ()
+{
+  for (map<int, string>::iterator iter = m.begin (); iter != m.end (); ++iter)
+    {
+      if (ga->empty)
+        ga->set (iter->second);
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "split loop 1 at branch" 1 "cond_lsplit" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-cond-split-1.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-cond-split-1.c
new file mode 100644
index 00000000000..a0eb7a26ad5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-cond-split-1.c
@@ -0,0 +1,23 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-cond_lsplit-details" } */
+
+__attribute__((pure)) __attribute__((noinline)) int inc (int i)
+{
+  return i + 1;
+}
+
+extern int do_something (void);
+extern int b;
+
+void test(int n)
+{
+  int i;
+
+  for (i = 0; i < n; i = inc (i))
+    {
+      if (b)
+        b = do_something();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "split loop 1 at branch" 1 "cond_lsplit" } } */
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 13cb470b688..5a2a80a29f7 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -188,6 +188,7 @@  DEFTIMEVAR (TV_TREE_LOOP_IVCANON     , "tree canonical iv")
 DEFTIMEVAR (TV_SCEV_CONST            , "scev constant prop")
 DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH    , "tree loop unswitching")
 DEFTIMEVAR (TV_LOOP_SPLIT            , "loop splitting")
+DEFTIMEVAR (TV_COND_LOOP_SPLIT       , "loop splitting for conditions")
 DEFTIMEVAR (TV_LOOP_JAM              , "unroll and jam")
 DEFTIMEVAR (TV_COMPLETE_UNROLL       , "complete unrolling")
 DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 3a0b3805d24..cdb7ef3c9f2 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -367,6 +367,7 @@  extern gimple_opt_pass *make_pass_lim (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_linterchange (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_unswitch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_loop_split (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_cond_loop_split (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_loop_jam (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_predcom (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_iv_canon (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index 999c9a30366..7239d0cfb00 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -32,7 +32,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-loop.h"
 #include "tree-ssa-loop-manip.h"
 #include "tree-into-ssa.h"
+#include "tree-inline.h"
 #include "cfgloop.h"
+#include "params.h"
 #include "tree-scalar-evolution.h"
 #include "gimple-iterator.h"
 #include "gimple-pretty-print.h"
@@ -40,7 +42,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-fold.h"
 #include "gimplify-me.h"

-/* This file implements loop splitting, i.e. transformation of loops like
+/* This file implements two kind of loop splitting.
+
+   One transformation of loops like:

    for (i = 0; i < 100; i++)
      {
@@ -670,6 +674,782 @@  tree_ssa_split_loops (void)
   return 0;
 }

+
+/* Another transformation of loops like:
+
+   for (i = INIT (); CHECK (i); i = NEXT ())
+     {
+       if (expr (a_1, a_2, ..., a_n))
+         a_j = ...;  // change at least one a_j
+       else
+         S;          // not change any a_j
+     }
+
+   into:
+
+   for (i = INIT (); CHECK (i); i = NEXT ())
+     {
+       if (expr (a_1, a_2, ..., a_n))
+         a_j = ...;
+       else
+         {
+           S;
+           i = NEXT ();
+           break;
+         }
+     }
+
+   for (; CHECK (i); i = NEXT ())
+     {
+       S;
+     }
+
+   */
+
+/* Data structure to hold temporary information during loop split upon
+   semi-invariant conditional statement.  */
+class split_info {
+public:
+  /* Array of all basic blocks in a loop, returned by get_loop_body().  */
+  basic_block *bbs;
+
+  /* All memory store/clobber statements in a loop.  */
+  auto_vec<gimple *> stores;
+
+  /* Whether above memory stores vector has been filled.  */
+  bool set_stores;
+
+  /* Semi-invariant conditional statement, upon which to split loop.  */
+  gcond *cond;
+
+  split_info () : bbs (NULL),  set_stores (false), cond (NULL) { }
+
+  ~split_info ()
+    {
+      if (bbs)
+       free (bbs);
+    }
+};
+
+/* Find all statements with memory-write effect in LOOP, including memory
+   store and non-pure function call, and keep those in a vector.  This work
+   is only done one time, for the vector should be constant during analysis
+   stage of semi-invariant condition.  */
+
+static void
+find_vdef_in_loop (struct loop *loop)
+{
+  split_info *info = (split_info *) loop->aux;
+  gphi *vphi = get_virtual_phi (loop->header);
+
+  /* Indicate memory store vector has been filled.  */
+  info->set_stores = true;
+
+  /* If loop contains memory operation, there must be a virtual PHI node in
+     loop header basic block.  */
+  if (vphi == NULL)
+    return;
+
+  /* All virtual SSA names inside the loop are connected to be a cyclic
+     graph via virtual PHI nodes.  The virtual PHI node in loop header just
+     links the first and the last virtual SSA names, by using the last as
+     PHI operand to define the first.  */
+  const edge latch = loop_latch_edge (loop);
+  const tree first = gimple_phi_result (vphi);
+  const tree last = PHI_ARG_DEF_FROM_EDGE (vphi, latch);
+
+  /* The virtual SSA cyclic graph might consist of only one SSA name, who
+     is defined by itself.
+
+       .MEM_1 = PHI <.MEM_2(loop entry edge), .MEM_1(latch edge)>
+
+     This means the loop contains only memory loads, so we can skip it.  */
+  if (first == last)
+    return;
+
+  auto_vec<gimple *> others;
+  auto_vec<tree> worklist;
+  auto_bitmap visited;
+
+  bitmap_set_bit (visited, SSA_NAME_VERSION (first));
+  bitmap_set_bit (visited, SSA_NAME_VERSION (last));
+  worklist.safe_push (last);
+
+  do
+    {
+      tree vuse = worklist.pop ();
+      gimple *stmt = SSA_NAME_DEF_STMT (vuse);
+
+      /* We mark the first and last SSA names as visited at the beginning,
+        and reversely start the process from the last SSA name towards the
+        first, which ensures that this do-while will not touch SSA names
+        defined outside of the loop.  */
+      gcc_assert (gimple_bb (stmt)
+                 && flow_bb_inside_loop_p (loop, gimple_bb (stmt)));
+
+      if (gimple_code (stmt) == GIMPLE_PHI)
+       {
+         gphi *phi = as_a <gphi *> (stmt);
+
+         for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+           {
+             tree arg = gimple_phi_arg_def (stmt, i);
+
+             if (bitmap_set_bit (visited, SSA_NAME_VERSION (arg)))
+               worklist.safe_push (arg);
+           }
+       }
+      else
+       {
+         tree prev = gimple_vuse (stmt);
+
+         /* Non-pure call statement is conservatively assumed to impact all
+            memory locations.  So place call statements ahead of other memory
+            stores in the vector with an idea of of using them as shortcut
+            terminators to memory alias analysis.  */
+         if (gimple_code (stmt) == GIMPLE_CALL)
+           info->stores.safe_push (stmt);
+         else
+           others.safe_push (stmt);
+
+         if (bitmap_set_bit (visited, SSA_NAME_VERSION (prev)))
+           worklist.safe_push (prev);
+       }
+    } while (!worklist.is_empty ());
+
+    info->stores.safe_splice (others);
+}
+
+
+/* Given STMT, memory load or pure call statement, check whether it is impacted
+   by some memory store in LOOP, excluding trace starting from SKIP_HEAD (the
+   trace is composed of SKIP_HEAD and those basic block dominated by it, always
+   corresponds to one branch of a conditional statement).  If SKIP_HEAD is
+   NULL, all basic blocks of LOOP are checked.  */
+
+static bool
+vuse_semi_invariant_p (struct loop *loop, gimple *stmt,
+                      const_basic_block skip_head)
+{
+  split_info *info = (split_info *) loop->aux;
+
+  /* Collect memory store/clobber statements if have not do that.  */
+  if (!info->set_stores)
+    find_vdef_in_loop (loop);
+
+  tree rhs = is_gimple_assign (stmt) ? gimple_assign_rhs1 (stmt) : NULL_TREE;
+  ao_ref ref;
+  gimple *store;
+  unsigned i;
+
+  ao_ref_init (&ref, rhs);
+
+  FOR_EACH_VEC_ELT (info->stores, i, store)
+    {
+      /* Skip basic blocks dominated by SKIP_HEAD, if non-NULL.  */
+      if (skip_head
+         && dominated_by_p (CDI_DOMINATORS, gimple_bb (store), skip_head))
+       continue;
+
+      if (!ref.ref || stmt_may_clobber_ref_p_1 (store, &ref))
+       return false;
+    }
+
+  return true;
+}
+
+/* Forward declaration.  */
+
+static bool
+stmt_semi_invariant_p (struct loop *loop, gimple *stmt,
+                      const_basic_block skip_head);
+
+/* Suppose one condition branch, led by SKIP_HEAD, is not executed since
+   certain iteration of LOOP, check whether an SSA name (NAME) remains
+   unchanged in next interation.  We call this characterisic as semi-
+   invariantness.  SKIP_HEAD might be NULL, if so, nothing excluded, all
+   basic blocks and control flows in the loop will be considered.  If non-
+   NULL, SSA name to check is supposed to be defined before SKIP_HEAD.  */
+
+static bool
+ssa_semi_invariant_p (struct loop *loop, const tree name,
+                     const_basic_block skip_head)
+{
+  gimple *def = SSA_NAME_DEF_STMT (name);
+  const_basic_block def_bb = gimple_bb (def);
+
+  /* An SSA name defined outside a loop is definitely semi-invariant.  */
+  if (!def_bb || !flow_bb_inside_loop_p (loop, def_bb))
+    return true;
+
+  if (gimple_code (def) == GIMPLE_PHI)
+    {
+      /* For PHI node that is not in loop header, its source operands should
+        be defined inside the loop, which are seen as loop variant.  */
+      if (def_bb != loop->header || !skip_head)
+       return false;
+
+      const_edge latch = loop_latch_edge (loop);
+      tree from = PHI_ARG_DEF_FROM_EDGE (as_a <gphi *> (def), latch);
+
+      /* A PHI node in loop header contains two source operands, one is
+        initial value, the other is the copy of last iteration through loop
+        latch, we call it latch value.  From the PHI node to definition
+        of latch value, if excluding branch trace from SKIP_HEAD, there
+        is no definition of other version of same variable, SSA name defined
+        by the PHI node is semi-invariant.
+
+                         loop entry
+                              |     .--- latch ---.
+                              |     |             |
+                              v     v             |
+                  x_1 = PHI <x_0,  x_3>           |
+                           |                      |
+                           v                      |
+              .------- if (cond) -------.         |
+              |                         |         |
+              |                     [ SKIP ]      |
+              |                         |         |
+              |                     x_2 = ...     |
+              |                         |         |
+              '---- T ---->.<---- F ----'         |
+                           |                      |
+                           v                      |
+                  x_3 = PHI <x_1, x_2>            |
+                           |                      |
+                           '----------------------'
+
+       Suppose in certain iteration, execution flow in above graph goes
+       through true branch, which means that one source value to define
+       x_3 in false branch (x2) is skipped, x_3 only comes from x_1, and
+       x_1 in next iterations is defined by x_3, we know that x_1 will
+       never changed if COND always chooses true branch from then on.  */
+
+      while (from != name)
+       {
+         /* A new value comes from a CONSTANT.  */
+         if (TREE_CODE (from) != SSA_NAME)
+           return false;
+
+         gimple *stmt = SSA_NAME_DEF_STMT (from);
+         const_basic_block bb = gimple_bb (stmt);
+
+         /* A new value comes from outside of loop.  */
+         if (!bb || !flow_bb_inside_loop_p (loop, bb))
+           return false;
+
+         from = NULL_TREE;
+
+         if (gimple_code (stmt) == GIMPLE_PHI)
+           {
+             gphi *phi = as_a <gphi *> (stmt);
+
+             for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+               {
+                 const_edge e = gimple_phi_arg_edge (phi, i);
+
+                 /* Not consider redefinitions in excluded basic blocks.  */
+                 if (!dominated_by_p (CDI_DOMINATORS, e->src, skip_head))
+                   {
+                     /* There are more than one source operands that can
+                        provide value to the SSA name, it is variant.  */
+                     if (from)
+                       return false;
+
+                     from = gimple_phi_arg_def (phi, i);
+                   }
+               }
+           }
+         else if (gimple_code (stmt) == GIMPLE_ASSIGN)
+           {
+             /* For simple value copy, check its rhs instead.  */
+             if (gimple_assign_ssa_name_copy_p (stmt))
+               from = gimple_assign_rhs1 (stmt);
+           }
+
+         /* Any other kind of definition is deemed to introduce a new value
+            to the SSA name.  */
+         if (!from)
+           return false;
+       }
+       return true;
+    }
+
+  /* Value originated from volatile memory load or return of normal (non-
+     const/pure) call should not be treated as constant in each iteration.  */
+  if (gimple_has_side_effects (def))
+    return false;
+
+  /* Check if any memory store may kill memory load at this place.  */
+  if (gimple_vuse (def) && !vuse_semi_invariant_p (loop, def, skip_head))
+    return false;
+
+  /* Check operands of definition statement of the SSA name.  */
+  return stmt_semi_invariant_p (loop, def, skip_head);
+}
+
+/* Check whether STMT is semi-invariant in LOOP, iff all its operands are
+   semi-invariant.  Trace composed of basic block SKIP_HEAD and basic blocks
+   dominated by it are excluded from the loop.  */
+
+static bool
+stmt_semi_invariant_p (struct loop *loop, gimple *stmt,
+                      const_basic_block skip_head)
+{
+  ssa_op_iter iter;
+  tree use;
+
+  /* Although operand of a statement might be SSA name, CONSTANT or VARDECL,
+     here we only need to check SSA name operands.  This is because check on
+     VARDECL operands, which involve memory loads, must have been done
+     prior to invocation of this function in vuse_semi_invariant_p.  */
+  FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
+    {
+      if (!ssa_semi_invariant_p (loop, use, skip_head))
+       return false;
+    }
+
+  return true;
+}
+
+/* Determine when conditional statement never transfers execution to one of its
+   branch, whether we can remove the branch's leading basic block (BRANCH_BB)
+   and those basic blocks dominated by BRANCH_BB.  */
+
+static bool
+branch_removable_p (basic_block branch_bb)
+{
+  if (single_pred_p (branch_bb))
+    return true;
+
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, branch_bb->preds)
+    {
+      if (dominated_by_p (CDI_DOMINATORS, e->src, branch_bb))
+       continue;
+
+      if (dominated_by_p (CDI_DOMINATORS, branch_bb, e->src))
+       continue;
+
+       /* The branch can be reached from opposite branch, or from some
+         statement not dominated by the conditional statement.  */
+      return false;
+    }
+
+  return true;
+}
+
+/* Find out which branch of a conditional statement (COND) is invariant in the
+   execution context of LOOP.  That is: once the branch is selected in certain
+   iteration of the loop, any operand that contributes to computation of the
+   conditional statement remains unchanged in all following iterations.  */
+
+static int
+get_cond_invariant_branch (struct loop *loop, gcond *cond)
+{
+  basic_block cond_bb = gimple_bb (cond);
+  basic_block targ_bb[2];
+  bool invar[2];
+  unsigned invar_checks;
+
+  for (unsigned i = 0; i < 2; i++)
+    {
+      targ_bb[i] = EDGE_SUCC (cond_bb, i)->dest;
+
+      /* One branch directs to loop exit, no need to perform loop split upon
+        this conditional statement.  Firstly, it is trivial if the exit branch
+        is semi-invariant, for the statement is just to break loop.  Secondly,
+        if the opposite branch is semi-invariant, it means that the statement
+        is real loop-invariant, which is covered by loop unswitch.  */
+      if (!flow_bb_inside_loop_p (loop, targ_bb[i]))
+       return -1;
+    }
+
+  invar_checks = 0;
+
+  for (unsigned i = 0; i < 2; i++)
+    {
+      invar[!i] = false;
+
+      if (!branch_removable_p (targ_bb[i]))
+       continue;
+
+      /* Given a semi-invariant branch, if its opposite branch dominates
+        loop latch, it and its following trace will only be executed in
+        final iteration of loop, namely it is not part of repeated body
+        of the loop.  Similar to the above case that the branch is loop
+        exit, no need to split loop.  */
+      if (dominated_by_p (CDI_DOMINATORS, loop->latch, targ_bb[i]))
+       continue;
+
+      invar[!i] = stmt_semi_invariant_p (loop, cond, targ_bb[i]);
+      invar_checks++;
+    }
+
+  /* With both branches being invariant (handled by loop unswitch) or
+     variant is not what we want.  */
+  if (invar[0] ^ !invar[1])
+    return -1;
+
+  /* Found a real loop-invariant condition, do nothing.  */
+  if (invar_checks < 2 && stmt_semi_invariant_p (loop, cond, NULL))
+    return -1;
+
+  return invar[1];
+}
+
+/* Given a conditional statement in LOOP, whose basic block is COND_BB,
+   suppose its execution only goes through one of its branch, whose index is
+   specified by BRANCH.  Return TRUE if this statement still executes multiple
+   times in one iteration of LOOP, in that the statement belongs a nested
+   unrecognized loop.  */
+
+static bool
+is_cond_in_hidden_loop (const struct loop *loop, basic_block cond_bb,
+                       int branch)
+{
+  basic_block branch_bb = EDGE_SUCC (cond_bb, branch)->dest;
+
+  if (cond_bb == loop->header || branch_bb == loop->latch)
+    return false;
+
+  basic_block *bbs = ((split_info *) loop->aux)->bbs;
+  auto_vec<basic_block> worklist;
+
+  for (unsigned i = 0; i < loop->num_nodes; i++)
+    bbs[i]->flags &= ~BB_REACHABLE;
+
+  /* Mark latch basic block as visited so as to terminate reachablility
+     traversal.  */
+  loop->latch->flags |= BB_REACHABLE;
+
+  gcc_assert (flow_bb_inside_loop_p (loop, branch_bb));
+
+  /* Start from the specified branch, the opposite branch is ignored for it
+     will not be executed.  */
+  branch_bb->flags |= BB_REACHABLE;
+  worklist.safe_push (branch_bb);
+
+  do
+    {
+      basic_block bb = worklist.pop ();
+      edge e;
+      edge_iterator ei;
+
+      FOR_EACH_EDGE (e, ei, bb->succs)
+       {
+         basic_block succ_bb = e->dest;
+
+         if (succ_bb == cond_bb)
+           return true;
+
+         if (!flow_bb_inside_loop_p (loop, succ_bb))
+           continue;
+
+         if (succ_bb->flags & BB_REACHABLE)
+           continue;
+
+         succ_bb->flags |= BB_REACHABLE;
+         worklist.safe_push (succ_bb);
+       }
+    } while (!worklist.is_empty ());
+
+  return false;
+}
+
+
+/* Calculate increased code size measured by estimated insn number if applying
+   loop split upon certain branch (BRANCH) of a conditional statement whose
+   basic block is COND_BB.  */
+
+static int
+compute_added_num_insns (struct loop *loop, const_basic_block cond_bb,
+                        int branch)
+{
+  const_basic_block targ_bb_var = EDGE_SUCC (cond_bb, !branch)->dest;
+  basic_block *bbs = ((split_info *) loop->aux)->bbs;
+  int num = 0;
+
+  for (unsigned i = 0; i < loop->num_nodes; i++)
+    {
+      /* Do no count basic blocks only in opposite branch.  */
+      if (dominated_by_p (CDI_DOMINATORS, bbs[i], targ_bb_var))
+       continue;
+
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);
+          gsi_next (&gsi))
+       num += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);
+    }
+
+  return num;
+}
+
+/* Return true if it is eligible and profitable to perform loop split upon
+   a conditional statement COND in LOOP.  */
+
+static bool
+can_split_loop_on_cond (struct loop *loop, gcond *cond)
+{
+  int branch = get_cond_invariant_branch (loop, cond);
+
+  if (branch < 0)
+    return false;
+
+  basic_block cond_bb = gimple_bb (cond);
+  profile_probability prob = EDGE_SUCC (cond_bb, branch)->probability;
+
+  /* When accurate profile information is available, and execution
+     frequency of the branch is too low, just let it go.  */
+  if (prob.reliable_p ())
+    {
+      int thres = PARAM_VALUE (PARAM_MIN_COND_LOOP_SPLIT_PROB);
+
+      if (prob < profile_probability::always ().apply_scale (thres, 100))
+       return false;
+    }
+
+  /* Add a threshold for increased code size to disable loop split.  */
+  if (compute_added_num_insns (loop, cond_bb, branch) >
+      PARAM_VALUE (PARAM_MAX_COND_LOOP_SPLIT_INSNS))
+    return false;
+
+  /* Skip conditional statement that is inside a nested unrecognized loop.  */
+  if (is_cond_in_hidden_loop (loop, cond_bb, branch))
+    return false;
+
+  /* Temporarily keep branch index in conditional statement.  */
+  gimple_set_plf (cond, GF_PLF_1, branch);
+  return true;
+}
+
+/* Traverse all conditional statements in LOOP, to find out a good candidate
+   upon which we can do loop split.  */
+
+static bool
+mark_cond_to_split_loop (struct loop *loop)
+{
+  split_info *info = new split_info ();
+  basic_block *bbs = info->bbs = get_loop_body (loop);
+
+  /* Allocate an area to keep temporary info, and associate its address
+     with loop aux field.  */
+  loop->aux = info;
+
+  for (unsigned i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = bbs[i];
+
+      /* We only consider conditional statement, which be executed at most once
+        in each iteration of the loop.  So skip statements in inner loops.  */
+      if (bb->loop_father != loop)
+       continue;
+
+      /* Actually this check is not a must constraint. With it, we can ensure
+        conditional statement will always be executed in each iteration. */
+      if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
+       continue;
+
+      gimple *last = last_stmt (bb);
+
+      if (!last || gimple_code (last) != GIMPLE_COND)
+       continue;
+
+      gcond *cond = as_a <gcond *> (last);
+
+      if (can_split_loop_on_cond (loop, cond))
+       {
+         info->cond = cond;
+         return true;
+       }
+    }
+
+  delete info;
+  loop->aux = NULL;
+
+  return false;
+}
+
+/* Given a loop (LOOP1) with a chosen conditional statement candidate, perform
+   loop split transformation illustrated as the following graph.
+
+               .-------T------ if (true) ------F------.
+               |                    .---------------. |
+               |                    |               | |
+               v                    |               v v
+          pre-header                |            pre-header
+               | .------------.     |                 | .------------.
+               | |            |     |                 | |            |
+               | v            |     |                 | v            |
+             header           |     |               header           |
+               |              |     |                 |              |
+       [ bool r = cond; ]     |     |                 |              |
+               |              |     |                 |              |
+      .---- if (r) -----.     |     |        .--- if (true) ---.     |
+      |                 |     |     |        |                 |     |
+  invariant             |     |     |    invariant             |     |
+      |                 |     |     |        |                 |     |
+      '---T--->.<---F---'     |     |        '---T--->.<---F---'     |
+               |              |    /                  |              |
+             stmts            |   /                 stmts            |
+               |              |  /                    |              |
+              / \             | /                    / \             |
+     .-------*   *       [ if (!r) ]        .-------*   *            |
+     |           |            |             |           |            |
+     |         latch          |             |         latch          |
+     |           |            |             |           |            |
+     |           '------------'             |           '------------'
+     '------------------------. .-----------'
+             loop1            | |                   loop2
+                              v v
+                             exits
+
+   In the graph, loop1 represents the part derived from original one, and
+   loop2 is duplicated using loop_version (), which corresponds to the part
+   of original one being splitted out.  In loop1, a new bool temporary (r)
+   is introduced to keep value of the condition result.  In original latch
+   edge of loop1, we insert a new conditional statement whose value comes
+   from previous temporary (r), one of its branch goes back to loop1 header
+   as a latch edge, and the other branch goes to loop2 pre-header as an entry
+   edge.  And also in loop2, we abandon the variant branch of the conditional
+   statement candidate by setting a constant bool condition, based on which
+   branch is semi-invariant.  */
+
+static bool
+split_loop_for_cond (struct loop *loop1)
+{
+  split_info *info = (split_info *) loop1->aux;
+  gcond *cond = info->cond;
+  basic_block cond_bb = gimple_bb (cond);
+  int branch = gimple_plf (cond, GF_PLF_1);
+  bool true_invar = !!(EDGE_SUCC (cond_bb, branch)->flags & EDGE_TRUE_VALUE);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+     fprintf (dump_file, "In %s(), split loop %d at branch<%s>, BB %d\n",
+             current_function_name (), loop1->num,
+             true_invar ? "T" : "F", cond_bb->index);
+     print_gimple_stmt (dump_file, cond, 0, TDF_SLIM | TDF_VOPS);
+   }
+
+  initialize_original_copy_tables ();
+
+  struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
+                                    profile_probability::always (),
+                                    profile_probability::never (),
+                                    profile_probability::always (),
+                                    profile_probability::always (),
+                                    true);
+  if (!loop2)
+    {
+      free_original_copy_tables ();
+      return false;
+    }
+
+  /* Generate a bool type temporary to hold result of the condition.  */
+  tree tmp = make_ssa_name (boolean_type_node);
+  gimple_stmt_iterator gsi = gsi_last_bb (cond_bb);
+  gimple *stmt = gimple_build_assign (tmp,
+                                     gimple_cond_code (cond),
+                                     gimple_cond_lhs (cond),
+                                     gimple_cond_rhs (cond));
+
+  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
+  gimple_cond_set_condition (cond, EQ_EXPR, tmp, boolean_true_node);
+  update_stmt (cond);
+
+  basic_block cond_bb_copy = get_bb_copy (cond_bb);
+  gcond *cond_copy = as_a<gcond *> (last_stmt (cond_bb_copy));
+
+  /* Replace the condition in loop2 with a bool constant to let PassManager
+     remove the variant branch after current pass completes.  */
+  if (true_invar)
+    gimple_cond_make_true (cond_copy);
+  else
+    gimple_cond_make_false (cond_copy);
+
+  update_stmt (cond_copy);
+
+  /* Insert a new conditional statement on latch edge of loop1.  This
+     statement acts as a switch to transfer execution from loop1 to loop2,
+     when loop1 enters into invariant state.  */
+  basic_block latch_bb = split_edge (loop_latch_edge (loop1));
+  basic_block break_bb = split_edge (single_pred_edge (latch_bb));
+  gimple *break_cond = gimple_build_cond (EQ_EXPR, tmp, boolean_true_node,
+                                         NULL_TREE, NULL_TREE);
+
+  gsi = gsi_last_bb (break_bb);
+  gsi_insert_after (&gsi, break_cond, GSI_NEW_STMT);
+
+  edge to_loop1 = single_succ_edge (break_bb);
+  edge to_loop2 = make_edge (break_bb, loop_preheader_edge (loop2)->src, 0);
+
+  to_loop1->flags &= ~EDGE_FALLTHRU;
+  to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE;
+  to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE;
+
+  update_ssa (TODO_update_ssa);
+
+  /* Due to introduction of a control flow edge from loop1 latch to loop2
+     pre-header, we should update PHIs in loop2 to reflect this connection
+     between loop1 and loop2.  */
+  connect_loop_phis (loop1, loop2, to_loop2);
+
+  free_original_copy_tables ();
+
+  rewrite_into_loop_closed_ssa_1 (NULL, 0, SSA_OP_USE, loop1);
+
+  return true;
+}
+
+/* Main entry point to perform loop splitting for suitable if-conditions
+   in all loops.  */
+
+static unsigned int
+tree_ssa_split_loops_for_cond (void)
+{
+  struct loop *loop;
+  auto_vec<struct loop *> loop_list;
+  bool changed = false;
+  unsigned i;
+
+  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+    loop->aux = NULL;
+
+  /* Go through all loops starting from innermost.  */
+  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+    {
+      /* Put loop in a list if found a conditional statement candidate in it.
+        This is stage for analysis, not change anything of the function.  */
+      if (!loop->aux
+         && !optimize_loop_for_size_p (loop)
+         && mark_cond_to_split_loop (loop))
+       loop_list.safe_push (loop);
+
+      /* If any of our inner loops was split, don't split us,
+        and mark our containing loop as having had splits as well.  */
+      loop_outer (loop)->aux = loop->aux;
+    }
+
+  FOR_EACH_VEC_ELT (loop_list, i, loop)
+    {
+      /* Extract selected loop and perform loop split.  This is stage for
+        transformation.  */
+      changed |= split_loop_for_cond (loop);
+
+      delete (split_info *) loop->aux;
+    }
+
+  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+    loop->aux = NULL;
+
+  if (changed)
+    return TODO_cleanup_cfg;
+  return 0;
+}
+
+
 /* Loop splitting pass.  */

 namespace {
@@ -716,3 +1496,48 @@  make_pass_loop_split (gcc::context *ctxt)
 {
   return new pass_loop_split (ctxt);
 }
+
+namespace {
+
+const pass_data pass_data_cond_loop_split =
+{
+  GIMPLE_PASS, /* type */
+  "cond_lsplit", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_COND_LOOP_SPLIT, /* tv_id */
+  PROP_cfg, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_cond_loop_split : public gimple_opt_pass
+{
+public:
+  pass_cond_loop_split (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_cond_loop_split, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *) { return flag_split_loops != 0; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_cond_loop_split
+
+unsigned int
+pass_cond_loop_split::execute (function *fun)
+{
+  if (number_of_loops (fun) <= 1)
+    return 0;
+
+  return tree_ssa_split_loops_for_cond ();
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_cond_loop_split (gcc::context *ctxt)
+{
+  return new pass_cond_loop_split (ctxt);
+}