[GRAPHITE] Fix PR82451 (and PR82355 in a different way)

Message ID alpine.LSU.2.20.1710111617280.6597@zhemvz.fhfr.qr
State New

Commit Message

Richard Biener Oct. 11, 2017, 2:43 p.m. UTC
For PR82355 I introduced a fake dimension to ISL to allow CHRECs
having an evolution in a loop that isn't fully part of the SESE
region we are processing.  That was easier than fending off those
CHRECs (without simply giving up on SESE regions with those).

But that didn't fully solve the issue, as PR82451 shows: we eventually
have to code-gen those evolutions and thus in theory need a canonical
IV of that containing loop.

So I decided (after Micha pressuring me a bit...) to revisit the
original issue and make SCEV analysis "properly" handle SE regions.
It turns out that it is mostly instantiate_scev lacking proper support
plus the necessary interfacing change (really just cosmetic in some sense)
from an instantiate_below basic-block to an instantiate_below edge.
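
To make the edge-based interface concrete, here is a minimal standalone
sketch.  It is a toy model, not GCC code: Block, Edge, dominated_by and
defined_in_region are hypothetical stand-ins for basic_block, edge and a
dominated_by_p (CDI_DOMINATORS, ...) query.  The assumption illustrated is
that a definition counts as inside the region when its block is dominated
by the destination of the region entry edge.

```cpp
#include <cassert>

// Toy CFG block: only the immediate dominator is modelled.
struct Block {
  int index;
  const Block *idom;  // immediate dominator, nullptr for the function entry
};

// Toy edge: the region is entered through src -> dest.
struct Edge {
  const Block *src;
  const Block *dest;
};

// True if A is dominated by B (every block dominates itself).
bool dominated_by(const Block *a, const Block *b) {
  for (const Block *p = a; p != nullptr; p = p->idom)
    if (p == b)
      return true;
  return false;
}

// A definition in DEF_BB is treated as inside the region entered through
// ENTRY when DEF_BB is dominated by the edge destination.  Anything else
// is defined before the region and can stay symbolic.
bool defined_in_region(const Edge &entry, const Block *def_bb) {
  return dominated_by(def_bb, entry.dest);
}
```

With a straight-line chain bb0 -> bb1 -> bb2 -> bb3 and the region entered
through bb1 -> bb2, definitions in bb2 and bb3 are inside the region while
a definition in bb1 is not.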

data-ref interfaces have been similarly adjusted, here changing
the "loop nest" loop parameter to the entry edge for the SE region
and passing that down accordingly.

For now I've tried to keep other high-level loop-based interfaces the
same by simply using the loop preheader edge as entry where appropriate
(which needs loop_preheader_edge to cope with the loop tree root, for
simplicity).
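
A rough sketch of that special case, as a standalone toy model rather than
the actual cfgloop.c code: Block, Edge, Loop, Function and preheader_edge
are hypothetical names, and the model assumes a real loop header has
exactly one non-latch predecessor edge.  For the loop tree root, the
sketch returns the single successor edge of the function entry block, as
the ChangeLog entry for loop_preheader_edge describes.

```cpp
#include <cassert>
#include <vector>

struct Block;

// Toy CFG edge.
struct Edge {
  Block *src;
  Block *dest;
};

// Toy basic block with explicit successor and predecessor edges.
struct Block {
  std::vector<Edge *> succs;
  std::vector<Edge *> preds;
};

// Toy loop: OUTER is nullptr for the loop tree root (the whole function).
struct Loop {
  Loop *outer;
  Block *header;
  Block *latch;
};

struct Function {
  Block *entry;
};

// Return the edge through which LOOP is entered.
Edge *preheader_edge(const Function &fn, const Loop &loop) {
  if (loop.outer == nullptr) {
    // Loop tree root: use the single successor edge of the entry block.
    assert(fn.entry->succs.size() == 1);
    return fn.entry->succs[0];
  }
  // Real loop: the entry edge is the header predecessor that is not the
  // latch edge (assumes a single such edge exists).
  for (Edge *e : loop.header->preds)
    if (e->src != loop.latch)
      return e;
  return nullptr;
}
```

The point of the design is that callers which only have a loop can always
obtain an entry edge, including when the "loop" is the whole function.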

In the process I ran into issues with us instantiating random trees
too aggressively, and thus I cut those down.  That part doesn't
successfully test separately (when I remove the strange
ARRAY_REF instantiation), so it's part of this patch.  I've also
run into an SSA verification failure (the id-27.f90 testcase), which
shows we _do_ need to clear the SCEV cache after introducing
the versioned CFG (I added a comment before it).

On the previously failing testcases I've verified we produce
sensible instantiations for those pesky refs residing in "no" loop
in the SCOP and that we get away with the result in terms of
optimizing.

SPEC 2k6 testing shows

loop nest optimized: 311
loop nest not optimized, code generation error: 0
loop nest not optimized, optimized schedule is identical to original schedule: 173
loop nest not optimized, optimization timed out: 59
loop nest not optimized, ISL signalled an error: 9
loop nest: 552

for SPEC 2k6 and -floop-nest-optimize, while adding -fgraphite-identity
still reveals some codegen errors:

loop nest optimized: 437
loop nest not optimized, code generation error: 25
loop nest not optimized, optimized schedule is identical to original schedule: 169
loop nest not optimized, optimization timed out: 60
loop nest not optimized, ISL signalled an error: 9
loop nest: 700
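
As a quick consistency check on the two tables above, the five categories
should add up to the reported "loop nest" totals.  The sketch below only
re-adds the numbers quoted in this message; the struct and function names
are made up for illustration.

```cpp
#include <cassert>

// Per-category loop nest counts as quoted in the message.
struct NestStats {
  int optimized;
  int codegen_error;
  int identical_schedule;
  int timed_out;
  int isl_error;
  int total;  // the reported "loop nest" line
};

// Sum of the five categories, to be compared against the reported total.
constexpr int sum_categories(const NestStats &s) {
  return s.optimized + s.codegen_error + s.identical_schedule
         + s.timed_out + s.isl_error;
}

// -floop-nest-optimize only.
constexpr NestStats nest_optimize_only{311, 0, 173, 59, 9, 552};
// -floop-nest-optimize plus -fgraphite-identity.
constexpr NestStats with_identity{437, 25, 169, 60, 9, 700};

static_assert(sum_categories(nest_optimize_only) == nest_optimize_only.total,
              "categories add up without -fgraphite-identity");
static_assert(sum_categories(with_identity) == with_identity.total,
              "categories add up with -fgraphite-identity");
```

Both totals check out: 311 + 0 + 173 + 59 + 9 = 552 and
437 + 25 + 169 + 60 + 9 = 700.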

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu
(with and without -fgraphite-identity -floop-nest-optimize).

Ok?

Thanks,
Richard.

2017-10-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/82451
	Revert
	2017-10-02  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/82355
	* graphite-isl-ast-to-gimple.c (build_iv_mapping): Also build
	a mapping for the enclosing loop but avoid generating one for
	the loop tree root.
	(copy_bb_and_scalar_dependences): Remove premature codegen
	error on PHIs in blocks duplicated into multiple places.
	* graphite-scop-detection.c
	(scop_detection::stmt_has_simple_data_refs_p): For a loop not
	in the region use it as loop and nest to analyze the DR in.
	(try_generate_gimple_bb): Likewise.
	* graphite-sese-to-poly.c (extract_affine_chrec): Adjust.
	(add_loop_constraints): For blocks in a loop not in the region
	create a dimension with a single iteration.
	* sese.h (gbb_loop_at_index): Remove assert.

	* cfgloop.c (loop_preheader_edge): For the loop tree root
	return the single successor of the entry block.
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl):
	Reset the SCEV hashtable and niters.
	* graphite-scop-detection.c
	(scop_detection::graphite_can_represent_scev): Add SCOP parameter,
	assert that we only have POLYNOMIAL_CHREC that vary in loops
	contained in the region.
	(scop_detection::graphite_can_represent_expr): Adjust.
	(scop_detection::stmt_has_simple_data_refs_p): For loops
	not in the region set loop to NULL.  The nest is now the
	entry edge to the region.
	(try_generate_gimple_bb): Likewise.
	* sese.c (scalar_evolution_in_region): Adjust for
	instantiate_scev change.
	* tree-data-ref.h (graphite_find_data_references_in_stmt):
	Make nest parameter the edge into the region.
	(create_data_ref): Likewise.
	* tree-data-ref.c (dr_analyze_indices): Make nest parameter an
	entry edge into a region and adjust instantiate_scev calls.
	(create_data_ref): Likewise.
	(graphite_find_data_references_in_stmt): Likewise.
	(find_data_references_in_stmt): Pass the loop preheader edge
	from the nest argument.
	* tree-scalar-evolution.h (instantiate_scev): Make instantiate_below
	parameter the edge into the region.
	(instantiate_parameters): Use the loop preheader edge as entry.
	* tree-scalar-evolution.c (analyze_scalar_evolution): Handle
	NULL loop.
	(get_instantiated_value_entry): Make instantiate_below parameter
	the edge into the region.
	(instantiate_scev_name): Likewise.  Adjust dominance checks,
	when we cannot use loop-based instantiation instantiate by
	walking use-def chains.
	(instantiate_scev_poly): Adjust.
	(instantiate_scev_binary): Likewise.
	(instantiate_scev_convert): Likewise.
	(instantiate_scev_not): Likewise.
	(instantiate_array_ref): Remove.
	(instantiate_scev_3): Likewise.
	(instantiate_scev_2): Likewise.
	(instantiate_scev_1): Likewise.
	(instantiate_scev_r): Do not blindly handle N-operand trees.
	Do not instantiate array-refs.  Handle all constants and invariants.
	(instantiate_scev): Make instantiate_below parameter
	the edge into the region.
	(resolve_mixers): Use the loop preheader edge for the region
	parameter to instantiate_scev_r.
	* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Adjust.

	* gcc.dg/graphite/pr82451.c: New testcase.
	* gfortran.dg/graphite/id-27.f90: Likewise.
	* gfortran.dg/graphite/pr82451.f: Likewise.

Comments

Bin.Cheng Oct. 12, 2017, 10:05 a.m. UTC | #1
On Wed, Oct 11, 2017 at 3:43 PM, Richard Biener <rguenther@suse.de> wrote:

> Index: gcc/tree-scalar-evolution.c
> ===================================================================
> --- gcc/tree-scalar-evolution.c (revision 253645)
> +++ gcc/tree-scalar-evolution.c (working copy)
> @@ -2344,7 +2348,7 @@ static tree instantiate_scev_r (basic_bl
>     instantiated, and to stop if it exceeds some limit.  */
>
>  static tree
> -instantiate_scev_name (basic_block instantiate_below,
> +instantiate_scev_name (edge instantiate_below,
>                        struct loop *evolution_loop, struct loop *inner_loop,
>                        tree chrec,
>                        bool *fold_conversions,
> @@ -2358,7 +2362,7 @@ instantiate_scev_name (basic_block insta
>       evolutions in outer loops), nothing to do.  */
>    if (!def_bb
>        || loop_depth (def_bb->loop_father) == 0
> -      || dominated_by_p (CDI_DOMINATORS, instantiate_below, def_bb))
> +      || ! dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest))
>      return chrec;
>
>    /* We cache the value of instantiated variable to avoid exponential
> @@ -2380,6 +2384,51 @@ instantiate_scev_name (basic_block insta
>
>    def_loop = find_common_loop (evolution_loop, def_bb->loop_father);
>
> +  if (! dominated_by_p (CDI_DOMINATORS,
> +                       def_loop->header, instantiate_below->dest))
> +    {
> +      gimple *def = SSA_NAME_DEF_STMT (chrec);
> +      if (gassign *ass = dyn_cast <gassign *> (def))
> +       {
> +         switch (gimple_assign_rhs_class (ass))
> +           {
> +           case GIMPLE_UNARY_RHS:
> +             {
> +               tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
> +                                              inner_loop, gimple_assign_rhs1 (ass),
> +                                              fold_conversions, size_expr);
> +               if (op0 == chrec_dont_know)
> +                 return chrec_dont_know;
> +               res = fold_build1 (gimple_assign_rhs_code (ass),
> +                                  TREE_TYPE (chrec), op0);
> +               break;
> +             }
> +           case GIMPLE_BINARY_RHS:
> +             {
> +               tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
> +                                              inner_loop, gimple_assign_rhs1 (ass),
> +                                              fold_conversions, size_expr);
> +               if (op0 == chrec_dont_know)
> +                 return chrec_dont_know;
> +               tree op1 = instantiate_scev_r (instantiate_below, evolution_loop,
> +                                              inner_loop, gimple_assign_rhs2 (ass),
> +                                              fold_conversions, size_expr);
> +               if (op1 == chrec_dont_know)
> +                 return chrec_dont_know;
> +               res = fold_build2 (gimple_assign_rhs_code (ass),
> +                                  TREE_TYPE (chrec), op0, op1);
> +               break;
> +             }
> +           default:
> +             res = chrec_dont_know;
> +           }
> +       }
> +      else
> +       res = chrec_dont_know;
> +      global_cache->set (si, res);
> +      return res;
> +    }
> +
>    /* If the analysis yields a parametric chrec, instantiate the
>       result again.  */
>    res = analyze_scalar_evolution (def_loop, chrec);

IIUC, after changing instantiate_scev_r from loop-based to region-based,
there are two cases.  In one case, def_loop is dominated by
instantiate_edge, and we'd like to analyze/instantiate the scev w.r.t.
the new def_loop.  In the other case, def_loop is not fully part of the
sese region, and we'd like to expand the ssa_name w.r.t. the basic block
instantiate_edge->dest.  It's simply ssa expansion and no loop is involved.
So how about factoring out the above big if-statement into a function with
a name like expand_scev_name (other, better names?).  Code like:

  /* Some comment explaining the two cases in region based instantiation.  */
  if (dominated_by_p (CDI_DOMINATORS, def_loop->header,
                      instantiate_below->dest))
    res = analyze_scalar_evolution (def_loop, chrec);
  else
    res = expand_scev_name (instantiate_below, chrec);

could be easier to read?
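
For illustration, the case split can be modelled outside of GCC roughly as
below.  This is a hedged toy sketch: Block, dominated_by, Strategy and
choose_strategy are hypothetical names, the dominator tree is reduced to
immediate-dominator links, and the loop_depth check from the patch is left
out.  The order of the tests follows the two dominated_by_p checks in the
quoted hunk.

```cpp
#include <cassert>

// Toy basic block; only the immediate dominator is modelled.
struct Block {
  const Block *idom;  // nullptr for the function entry
};

// True if A is dominated by B (a block dominates itself).
bool dominated_by(const Block *a, const Block *b) {
  for (const Block *p = a; p != nullptr; p = p->idom)
    if (p == b)
      return true;
  return false;
}

enum class Strategy {
  KeepSymbolic,   // defined before the region: leave the name alone
  AnalyzeInLoop,  // loop header inside the region: loop-based analysis
  ExpandUseDef    // loop only partly in the region: walk use-def chains
};

// ENTRY_DEST models instantiate_below->dest, DEF_BB the defining block and
// DEF_LOOP_HEADER the header of the common loop.
Strategy choose_strategy(const Block *entry_dest, const Block *def_bb,
                         const Block *def_loop_header) {
  if (!dominated_by(def_bb, entry_dest))
    return Strategy::KeepSymbolic;
  if (dominated_by(def_loop_header, entry_dest))
    return Strategy::AnalyzeInLoop;
  return Strategy::ExpandUseDef;
}
```

A definition that sits inside the region but whose loop header lies
outside of it ends up in the use-def expansion case.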

Thanks,
bin
Richard Biener Oct. 12, 2017, 11:13 a.m. UTC | #2
On Thu, 12 Oct 2017, Bin.Cheng wrote:

> IIUC, after changing instantiate_scev_r from loop-based to region-based,
> there are two cases.  In one case, def_loop is dominated by
> instantiate_edge, and we'd like to analyze/instantiate the scev w.r.t.
> the new def_loop.  In the other case, def_loop is not fully part of the
> sese region, and we'd like to expand the ssa_name w.r.t. the basic block
> instantiate_edge->dest.  It's simply ssa expansion and no loop is involved.

Note there can still be a loop involved if the SSA chain arrives at
a DEF that is defined within a loop in between the current def and
instantiate_edge->dest.  In that case we need to process the
compute_overall_effect_of_inner_loop case.

> So how about factoring out the above big if-statement into a function with
> a name like expand_scev_name (other, better names?).  Code like:
>
>   /* Some comment explaining the two cases in region based instantiation.  */
>   if (dominated_by_p (CDI_DOMINATORS, def_loop->header,
>                       instantiate_below->dest))
>     res = analyze_scalar_evolution (def_loop, chrec);
>   else
>     res = expand_scev_name (instantiate_below, chrec);
>
> could be easier to read?

Note it would be

   else
     {
       res = expand_scev_name (instantiate_below, chrec);
       global_cache->set (si, res);
       return res;
     }

and expand_scev_name would still need to recurse via instantiate_scev.
It isn't merely gathering all expressions up to instantiate_edge->dest
from the stmts (see above).

But yes, factoring the above might be a good idea.  The above is also
not 100% equivalent in capabilities to the rest of the instantiation
machinery -- it lacks condition PHI handling and it fails to capture
the loop-closed PHI handling (so the above mentioned case wouldn't
be handled right now).
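
To make the shape of such an expansion concrete, here is a small
standalone sketch.  It is a toy model and not the patch: SSA names are
plain integers, Def, Expr and Expander are hypothetical types, and
Expr::known == false plays the role of chrec_dont_know.  Like the hunk
under discussion it recurses through unary and binary assignments,
memoizes results per name, and gives up on anything else (PHIs included).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

enum class Kind { Param, Unary, Binary, Phi };

// Toy definition of one SSA name.
struct Def {
  Kind kind;
  char op;   // '-' for unary negate, '+' or '*' for binary, 0 otherwise
  int rhs1;  // operand SSA names, -1 if unused
  int rhs2;
};

// Result of an expansion; known == false means "don't know".
struct Expr {
  bool known;
  std::string text;
};

class Expander {
 public:
  explicit Expander(std::map<int, Def> defs) : defs_(std::move(defs)) {}

  // Expand NAME into an expression over parameters, with memoization.
  Expr expand(int name) {
    auto hit = cache_.find(name);
    if (hit != cache_.end())
      return hit->second;
    Expr res = expand_uncached(name);
    cache_[name] = res;
    return res;
  }

 private:
  Expr expand_uncached(int name) {
    const Def &d = defs_.at(name);
    switch (d.kind) {
      case Kind::Param:
        // Defined outside the region: stays symbolic.
        return {true, "p" + std::to_string(name)};
      case Kind::Unary: {
        Expr a = expand(d.rhs1);
        if (!a.known)
          return {false, ""};
        return {true, std::string("(") + d.op + a.text + ")"};
      }
      case Kind::Binary: {
        Expr a = expand(d.rhs1);
        if (!a.known)
          return {false, ""};
        Expr b = expand(d.rhs2);
        if (!b.known)
          return {false, ""};
        return {true, "(" + a.text + " " + d.op + " " + b.text + ")"};
      }
      default:
        // PHIs and everything else: conservatively give up.
        return {false, ""};
    }
  }

  std::map<int, Def> defs_;
  std::map<int, Expr> cache_;
};
```

Any chain that reaches a PHI collapses to "don't know", which is the
conservative behaviour described above; handling condition and loop-closed
PHIs would mean adding cases to the switch.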

I sent the patch out for comments early before fleshing out all
those pesky details ;)  It's at least conservatively correct
right now.

Richard.
Bin.Cheng Oct. 12, 2017, 11:38 a.m. UTC | #3
On Thu, Oct 12, 2017 at 12:13 PM, Richard Biener <rguenther@suse.de> wrote:
> Note it would be
>
>    else
>      {
>        res = expand_scev_name (instantiate_below, chrec);
>        global_cache->set (si, res);
>        return res;
>      }
>
> and expand_scev_name would still need to recurse via instantiate_scev.
> It isn't merely gathering all expressions up to instantiate_edge->dest
> from the stmts (see above).
I was thinking of only doing the expansion here and having the last piece
of code in instantiate_scev_name do the recursive call:

  else if (res != chrec_dont_know)
    {
      if (inner_loop
          && def_bb->loop_father != inner_loop
          && !flow_loop_nested_p (def_bb->loop_father, inner_loop))
        /* ???  We could try to compute the overall effect of the loop here.  */
        res = chrec_dont_know;
      else
        res = instantiate_scev_r (instantiate_below, evolution_loop,
                                  inner_loop, res,
                                  fold_conversions, size_expr);
    }

Not sure if it's feasible.  We might need to stop expansion at either
instantiate_edge or at an in-between loop.

Thanks,
bin
Sebastian Pop Oct. 12, 2017, 2:37 p.m. UTC | #4
On Oct 11, 2017 9:43 AM, "Richard Biener" <rguenther@suse.de> wrote:


For PR82355 I introduced a fake dimension to ISL to allow CHRECs
having an evolution in a loop that isn't fully part of the SESE
region we are processing.  That was easier than fending off those
CHRECs (without simply giving up on SESE regions with those).

But it didn't fully solve the issue as PR82451 shows where we run
into the issue that we eventually have to code-gen those
evolutions and thus in theory need a canonical IV of that containing loop.

So I decided (after Micha pressuring me a bit...) to revisit the
original issue and make SCEV analysis "properly" handle SE regions.
It turns out that it is mostly instantiate_scev lacking proper support
plus the necessary interfacing change (really just cosmetic in some sense)
from an instantiate_below basic-block to an instantiate_below edge.


Very nice.


data-ref interfaces have been similarly adjusted, here changing
the "loop nest" loop parameter to the entry edge for the SE region
and passing that down accordingly.

I've for now tried to keep other high-level loop-based interfaces the
same by simply using the loop preheader edge as entry where appropriate
(which needs loop_preheader_edge to cope with the loop tree root, for
simplicity).
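
A minimal sketch of the preheader-edge fallback described above, using toy
stand-ins rather than GCC's real CFG types.  All toy_* names are hypothetical
and exist only for illustration; the assumption is that the loop tree root is
headed by the function's entry block, which has no predecessors, so the scan
over the header's predecessors finds nothing and the function's entry edge is
returned instead.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for CFG types; every name here is hypothetical.  */
struct toy_block;
struct toy_edge { struct toy_block *src, *dest; };
struct toy_block { struct toy_edge *preds[4]; size_t n_preds; };
struct toy_loop
{
  struct toy_loop *outer;     /* NULL for the loop tree root.  */
  struct toy_block *header;   /* Entry block for the loop tree root.  */
  struct toy_block *latch;
};

/* Return the edge entering LOOP from outside.  The loop tree root has no
   real preheader, so fall back to FN_ENTRY_EDGE, the single edge leaving
   the function's entry block.  */
struct toy_edge *
toy_preheader_edge (const struct toy_loop *loop,
                    struct toy_edge *fn_entry_edge)
{
  for (size_t i = 0; i < loop->header->n_preds; i++)
    if (loop->header->preds[i]->src != loop->latch)
      return loop->header->preds[i];

  /* Only the root may lack a preheader edge.  */
  assert (loop->outer == NULL);
  return fn_entry_edge;
}

/* Build a tiny CFG (entry -> pre -> header <- latch) and check both the
   ordinary case and the root fallback.  Returns 1 when both behave.  */
int
toy_demo (void)
{
  struct toy_block entry = { { NULL }, 0 };
  struct toy_block pre = { { NULL }, 0 };
  struct toy_block header = { { NULL }, 0 };
  struct toy_block latch = { { NULL }, 0 };
  struct toy_edge fn_entry = { &entry, &pre };
  struct toy_edge pre_to_header = { &pre, &header };
  struct toy_edge latch_to_header = { &latch, &header };
  header.preds[0] = &latch_to_header;
  header.preds[1] = &pre_to_header;
  header.n_preds = 2;

  struct toy_loop root = { NULL, &entry, NULL };
  struct toy_loop inner = { &root, &header, &latch };

  return toy_preheader_edge (&inner, &fn_entry) == &pre_to_header
         && toy_preheader_edge (&root, &fn_entry) == &fn_entry;
}
```

With this shape, callers that still think in terms of "a loop" can pass the
root and get a usable region entry edge without special-casing it themselves.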

In the process I ran into issues with us being overly aggressive in
instantiating random trees, and thus I cut those down.  That part
doesn't successfully test separately (when I remove the strange
ARRAY_REF instantiation), so it's part of this patch.  I've also
run into an SSA verification fail (the id-27.f90 testcase) which
shows we _do_ need to clear the SCEV cache after introducing
the versioned CFG (and added a comment before it).

On the previously failing testcases I've verified we produce
sensible instantiations for those pesky refs residing in "no" loop
in the SCOP and that we get away with the result in terms of
optimizing.
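
As a rough illustration of instantiating such a ref by walking use-def
chains, here is a toy sketch.  It is not the code from the patch: it assumes
a simplified model in which every name is either defined outside the region
(and kept as a symbolic parameter), a constant, a negation, or an addition,
and it folds results into an affine form.  Anything else yields an "unknown"
result, playing the role of chrec_dont_know.  All toy_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NPARAMS 2

/* How a toy SSA name is defined.  TOY_OUTSIDE means "defined before the
   region entry", so the walk stops there and keeps the name symbolic.  */
enum toy_kind { TOY_OUTSIDE, TOY_CONST, TOY_NEGATE, TOY_PLUS, TOY_OTHER };

struct toy_def
{
  enum toy_kind kind;
  int op0;   /* Parameter number, constant value, or operand name.  */
  int op1;   /* Second operand name for TOY_PLUS.  */
};

/* cst + sum (coef[i] * param_i), or unknown when KNOWN is false.  */
struct toy_affine { bool known; long cst; long coef[TOY_NPARAMS]; };

/* Instantiate NAME by recursing through its defining statement.  The toy
   definitions are acyclic, so no cycle guard or cache is needed here.  */
struct toy_affine
toy_instantiate (const struct toy_def *defs, int name)
{
  struct toy_affine res = { true, 0, { 0 } };
  const struct toy_def *d = &defs[name];
  switch (d->kind)
    {
    case TOY_OUTSIDE:
      res.coef[d->op0] = 1;
      break;
    case TOY_CONST:
      res.cst = d->op0;
      break;
    case TOY_NEGATE:
      {
        struct toy_affine a = toy_instantiate (defs, d->op0);
        if (!a.known)
          return a;
        res.cst = -a.cst;
        for (int i = 0; i < TOY_NPARAMS; i++)
          res.coef[i] = -a.coef[i];
        break;
      }
    case TOY_PLUS:
      {
        struct toy_affine a = toy_instantiate (defs, d->op0);
        if (!a.known)
          return a;
        struct toy_affine b = toy_instantiate (defs, d->op1);
        if (!b.known)
          return b;
        res.cst = a.cst + b.cst;
        for (int i = 0; i < TOY_NPARAMS; i++)
          res.coef[i] = a.coef[i] + b.coef[i];
        break;
      }
    default:
      /* Statement kinds the walk does not understand give up.  */
      res.known = false;
    }
  return res;
}

/* Names 0 and 1 are parameters n and m defined outside the region.  */
const struct toy_def toy_example_defs[] = {
  { TOY_OUTSIDE, 0, 0 },   /* _0 = n            */
  { TOY_OUTSIDE, 1, 0 },   /* _1 = m            */
  { TOY_CONST, 4, 0 },     /* _2 = 4            */
  { TOY_PLUS, 0, 2 },      /* _3 = _0 + _2      */
  { TOY_NEGATE, 1, 0 },    /* _4 = -_1          */
  { TOY_PLUS, 3, 4 },      /* _5 = _3 + _4      */
  { TOY_OTHER, 0, 1 },     /* _6 = <unhandled>  */
  { TOY_PLUS, 5, 6 },      /* _7 = _5 + _6      */
};
```

In this toy model, instantiating _5 gives n - m + 4, while _7 is unknown
because one of its operands comes from a statement the walk cannot handle.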

SPEC 2k6 testing shows

loop nest optimized: 311
loop nest not optimized, code generation error: 0
loop nest not optimized, optimized schedule is identical to original schedule: 173
loop nest not optimized, optimization timed out: 59
loop nest not optimized, ISL signalled an error: 9
loop nest: 552

for SPEC 2k6 with -floop-nest-optimize, while adding -fgraphite-identity
still reveals some codegen errors:

loop nest optimized: 437
loop nest not optimized, code generation error: 25
loop nest not optimized, optimized schedule is identical to original schedule: 169
loop nest not optimized, optimization timed out: 60
loop nest not optimized, ISL signalled an error: 9
loop nest: 700
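
As a quick sanity check on the two tables above, the per-category counts
should add up to the reported number of loop nests.  The sketch below only
restates the figures from this mail; the struct and function names are
hypothetical.

```c
#include <assert.h>

/* Counters as printed in the two statistics dumps above.  */
struct toy_stats
{
  int optimized;
  int codegen_error;
  int identical_schedule;
  int timed_out;
  int isl_error;
  int total;   /* The reported "loop nest" count.  */
};

/* Sum of all outcome categories, to be compared against TOTAL.  */
int
toy_stats_sum (const struct toy_stats *s)
{
  return s->optimized + s->codegen_error + s->identical_schedule
         + s->timed_out + s->isl_error;
}

/* -floop-nest-optimize only.  */
const struct toy_stats toy_nest_optimize = { 311, 0, 173, 59, 9, 552 };
/* -floop-nest-optimize -fgraphite-identity.  */
const struct toy_stats toy_with_identity = { 437, 25, 169, 60, 9, 700 };
```

Both runs are internally consistent: every loop nest falls into exactly one
of the five outcome categories.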

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu
(with and without -fgraphite-identity -floop-nest-optimize).

Ok?


Looks good to me.
Thanks.


Thanks,
Richard.

2017-10-11  Richard Biener  <rguenther@suse.de>

        PR tree-optimization/82451
        Revert
        2017-10-02  Richard Biener  <rguenther@suse.de>

        PR tree-optimization/82355
        * graphite-isl-ast-to-gimple.c (build_iv_mapping): Also build
        a mapping for the enclosing loop but avoid generating one for
        the loop tree root.
        (copy_bb_and_scalar_dependences): Remove premature codegen
        error on PHIs in blocks duplicated into multiple places.
        * graphite-scop-detection.c
        (scop_detection::stmt_has_simple_data_refs_p): For a loop not
        in the region use it as loop and nest to analyze the DR in.
        (try_generate_gimple_bb): Likewise.
        * graphite-sese-to-poly.c (extract_affine_chrec): Adjust.
        (add_loop_constraints): For blocks in a loop not in the region
        create a dimension with a single iteration.
        * sese.h (gbb_loop_at_index): Remove assert.

        * cfgloop.c (loop_preheader_edge): For the loop tree root
        return the single successor of the entry block.
        * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl):
        Reset the SCEV hashtable and niters.
        * graphite-scop-detection.c
        (scop_detection::graphite_can_represent_scev): Add SCOP parameter,
        assert that we only have POLYNOMIAL_CHREC that vary in loops
        contained in the region.
        (scop_detection::graphite_can_represent_expr): Adjust.
        (scop_detection::stmt_has_simple_data_refs_p): For loops
        not in the region set loop to NULL.  The nest is now the
        entry edge to the region.
        (try_generate_gimple_bb): Likewise.
        * sese.c (scalar_evolution_in_region): Adjust for
        instantiate_scev change.
        * tree-data-ref.h (graphite_find_data_references_in_stmt):
        Make nest parameter the edge into the region.
        (create_data_ref): Likewise.
        * tree-data-ref.c (dr_analyze_indices): Make nest parameter an
        entry edge into a region and adjust instantiate_scev calls.
        (create_data_ref): Likewise.
        (graphite_find_data_references_in_stmt): Likewise.
        (find_data_references_in_stmt): Pass the loop preheader edge
        from the nest argument.
        * tree-scalar-evolution.h (instantiate_scev): Make instantiate_below
        parameter the edge into the region.
        (instantiate_parameters): Use the loop preheader edge as entry.
        * tree-scalar-evolution.c (analyze_scalar_evolution): Handle
        NULL loop.
        (get_instantiated_value_entry): Make instantiate_below parameter
        the edge into the region.
        (instantiate_scev_name): Likewise.  Adjust dominance checks,
        when we cannot use loop-based instantiation instantiate by
        walking use-def chains.
        (instantiate_scev_poly): Adjust.
        (instantiate_scev_binary): Likewise.
        (instantiate_scev_convert): Likewise.
        (instantiate_scev_not): Likewise.
        (instantiate_array_ref): Remove.
        (instantiate_scev_3): Likewise.
        (instantiate_scev_2): Likewise.
        (instantiate_scev_1): Likewise.
        (instantiate_scev_r): Do not blindly handle N-operand trees.
        Do not instantiate array-refs.  Handle all constants and invariants.
        (instantiate_scev): Make instantiate_below parameter
        the edge into the region.
        (resolve_mixers): Use the loop preheader edge for the region
        parameter to instantiate_scev_r.
        * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Adjust.

        * gcc.dg/graphite/pr82451.c: New testcase.
        * gfortran.dg/graphite/id-27.f90: Likewise.
        * gfortran.dg/graphite/pr82451.f: Likewise.

Index: gcc/cfgloop.c
===================================================================
--- gcc/cfgloop.c       (revision 253645)
+++ gcc/cfgloop.c       (working copy)
@@ -1713,12 +1713,19 @@ loop_preheader_edge (const struct loop *
   edge e;
   edge_iterator ei;

-  gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS));
+  gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
+             && ! loops_state_satisfies_p (LOOPS_MAY_HAVE_MULTIPLE_LATCHES));

   FOR_EACH_EDGE (e, ei, loop->header->preds)
     if (e->src != loop->latch)
       break;

+  if (! e)
+    {
+      gcc_assert (! loop_outer (loop));
+      return single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+    }
+
   return e;
 }

Index: gcc/graphite-isl-ast-to-gimple.c
===================================================================
--- gcc/graphite-isl-ast-to-gimple.c    (revision 253645)
+++ gcc/graphite-isl-ast-to-gimple.c    (working copy)
@@ -749,10 +749,8 @@ build_iv_mapping (vec<tree> iv_map, gimp
       if (codegen_error_p ())
        t = integer_zero_node;

-      loop_p old_loop = gbb_loop_at_index (gbb, region, i - 2);
-      /* Record sth only for real loops.  */
-      if (loop_in_sese_p (old_loop, region))
-       iv_map[old_loop->num] = t;
+      loop_p old_loop = gbb_loop_at_index (gbb, region, i - 1);
+      iv_map[old_loop->num] = t;
     }
 }

@@ -1570,6 +1568,12 @@ graphite_regenerate_ast_isl (scop_p scop
       update_ssa (TODO_update_ssa);
       checking_verify_ssa (true, true);
       rewrite_into_loop_closed_ssa (NULL, 0);
+      /* We analyzed evolutions of all SCOPs during SCOP detection
+         which cached evolutions.  Now we've introduced PHIs for
+        liveouts which causes those cached solutions to be invalid
+        for code-generation purposes given we'd insert references
+        to SSA names not dominating their new use.  */
+      scev_reset ();
     }

   if (t.codegen_error_p ())
Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c       (revision 253645)
+++ gcc/graphite-scop-detection.c       (working copy)
@@ -403,7 +403,7 @@ public:

      Something like "i * n" or "n * m" is not allowed.  */

-  static bool graphite_can_represent_scev (tree scev);
+  static bool graphite_can_represent_scev (sese_l scop, tree scev);

   /* Return true when EXPR can be represented in the polyhedral model.

@@ -963,7 +963,7 @@ scop_detection::graphite_can_represent_i
    Something like "i * n" or "n * m" is not allowed.  */

 bool
-scop_detection::graphite_can_represent_scev (tree scev)
+scop_detection::graphite_can_represent_scev (sese_l scop, tree scev)
 {
   if (chrec_contains_undetermined (scev))
     return false;
@@ -982,13 +982,13 @@ scop_detection::graphite_can_represent_s
     case BIT_NOT_EXPR:
     CASE_CONVERT:
     case NON_LVALUE_EXPR:
-      return graphite_can_represent_scev (TREE_OPERAND (scev, 0));
+      return graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0));

     case PLUS_EXPR:
     case POINTER_PLUS_EXPR:
     case MINUS_EXPR:
-      return graphite_can_represent_scev (TREE_OPERAND (scev, 0))
-       && graphite_can_represent_scev (TREE_OPERAND (scev, 1));
+      return graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0))
+       && graphite_can_represent_scev (scop, TREE_OPERAND (scev, 1));

     case MULT_EXPR:
       return !CONVERT_EXPR_CODE_P (TREE_CODE (TREE_OPERAND (scev, 0)))
@@ -996,18 +996,20 @@ scop_detection::graphite_can_represent_s
        && !(chrec_contains_symbols (TREE_OPERAND (scev, 0))
             && chrec_contains_symbols (TREE_OPERAND (scev, 1)))
        && graphite_can_represent_init (scev)
-       && graphite_can_represent_scev (TREE_OPERAND (scev, 0))
-       && graphite_can_represent_scev (TREE_OPERAND (scev, 1));
+       && graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0))
+       && graphite_can_represent_scev (scop, TREE_OPERAND (scev, 1));

     case POLYNOMIAL_CHREC:
       /* Check for constant strides.  With a non constant stride of
         'n' we would have a value of 'iv * n'.  Also check that the
         initial value can represented: for example 'n * m' cannot be
         represented.  */
+      gcc_assert (loop_in_sese_p (get_loop (cfun,
+                                           CHREC_VARIABLE (scev)), scop));
       if (!evolution_function_right_is_integer_cst (scev)
          || !graphite_can_represent_init (scev))
        return false;
-      return graphite_can_represent_scev (CHREC_LEFT (scev));
+      return graphite_can_represent_scev (scop, CHREC_LEFT (scev));

     default:
       break;
@@ -1031,7 +1033,7 @@ scop_detection::graphite_can_represent_e
                                             tree expr)
 {
   tree scev = scalar_evolution_in_region (scop, loop, expr);
-  return graphite_can_represent_scev (scev);
+  return graphite_can_represent_scev (scop, scev);
 }

 /* Return true if the data references of STMT can be represented by Graphite.
@@ -1040,12 +1042,15 @@ scop_detection::graphite_can_represent_e
 bool
 scop_detection::stmt_has_simple_data_refs_p (sese_l scop, gimple *stmt)
 {
-  loop_p nest;
+  edge nest;
   loop_p loop = loop_containing_stmt (stmt);
   if (!loop_in_sese_p (loop, scop))
-    nest = loop;
+    {
+      nest = scop.entry;
+      loop = NULL;
+    }
   else
-    nest = outermost_loop_in_sese (scop, gimple_bb (stmt));
+    nest = loop_preheader_edge (outermost_loop_in_sese (scop, gimple_bb (stmt)));

   auto_vec<data_reference_p> drs;
   if (! graphite_find_data_references_in_stmt (nest, loop, stmt, &drs))
@@ -1056,7 +1061,7 @@ scop_detection::stmt_has_simple_data_ref
   FOR_EACH_VEC_ELT (drs, j, dr)
     {
       for (unsigned i = 0; i < DR_NUM_DIMENSIONS (dr); ++i)
-       if (! graphite_can_represent_scev (DR_ACCESS_FN (dr, i)))
+       if (! graphite_can_represent_scev (scop, DR_ACCESS_FN (dr, i)))
          return false;
     }

@@ -1413,12 +1418,15 @@ try_generate_gimple_bb (scop_p scop, bas
   vec<scalar_use> reads = vNULL;

   sese_l region = scop->scop_info->region;
-  loop_p nest;
+  edge nest;
   loop_p loop = bb->loop_father;
   if (!loop_in_sese_p (loop, region))
-    nest = loop;
+    {
+      nest = region.entry;
+      loop = NULL;
+    }
   else
-    nest = outermost_loop_in_sese (region, bb);
+    nest = loop_preheader_edge (outermost_loop_in_sese (region, bb));

   for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
        gsi_next (&gsi))
Index: gcc/graphite-sese-to-poly.c
===================================================================
--- gcc/graphite-sese-to-poly.c (revision 253645)
+++ gcc/graphite-sese-to-poly.c (working copy)
@@ -86,7 +86,7 @@ extract_affine_chrec (scop_p s, tree e,
   isl_pw_aff *lhs = extract_affine (s, CHREC_LEFT (e), isl_space_copy (space));
   isl_pw_aff *rhs = extract_affine (s, CHREC_RIGHT (e), isl_space_copy (space));
   isl_local_space *ls = isl_local_space_from_space (space);
-  unsigned pos = sese_loop_depth (s->scop_info->region, get_chrec_loop (e));
+  unsigned pos = sese_loop_depth (s->scop_info->region, get_chrec_loop (e)) - 1;
   isl_aff *loop = isl_aff_set_coefficient_si
     (isl_aff_zero_on_domain (ls), isl_dim_in, pos, 1);
   isl_pw_aff *l = isl_pw_aff_from_aff (loop);
@@ -763,10 +763,10 @@ add_loop_constraints (scop_p scop, __isl
     return domain;
   const sese_l &region = scop->scop_info->region;
   if (!loop_in_sese_p (loop, region))
-    ;
-  else
-    /* Recursion all the way up to the context loop.  */
-    domain = add_loop_constraints (scop, domain, loop_outer (loop), context);
+    return domain;
+
+  /* Recursion all the way up to the context loop.  */
+  domain = add_loop_constraints (scop, domain, loop_outer (loop), context);

   /* Then, build constraints over the loop in post-order: outer to inner.  */

@@ -777,21 +777,6 @@ add_loop_constraints (scop_p scop, __isl
   domain = add_iter_domain_dimension (domain, loop, scop);
   isl_space *space = isl_set_get_space (domain);

-  if (!loop_in_sese_p (loop, region))
-    {
-      /* 0 == loop_i */
-      isl_local_space *ls = isl_local_space_from_space (space);
-      isl_constraint *c = isl_equality_alloc (ls);
-      c = isl_constraint_set_coefficient_si (c, isl_dim_set, loop_index, 1);
-      if (dump_file)
-       {
-         fprintf (dump_file, "[sese-to-poly] adding constraint to the domain: ");
-         print_isl_constraint (dump_file, c);
-       }
-      domain = isl_set_add_constraint (domain, c);
-      return domain;
-    }
-
   /* 0 <= loop_i */
   isl_local_space *ls = isl_local_space_from_space (isl_space_copy (space));
   isl_constraint *c = isl_inequality_alloc (ls);
Index: gcc/sese.c
===================================================================
--- gcc/sese.c  (revision 253645)
+++ gcc/sese.c  (working copy)
@@ -461,7 +461,6 @@ scalar_evolution_in_region (const sese_l
 {
   gimple *def;
   struct loop *def_loop;
-  basic_block before = region.entry->src;

   /* SCOP parameters.  */
   if (TREE_CODE (t) == SSA_NAME
@@ -472,7 +471,7 @@ scalar_evolution_in_region (const sese_l
       || loop_in_sese_p (loop, region))
     /* FIXME: we would need instantiate SCEV to work on a region, and be more
        flexible wrt. memory loads that may be invariant in the region.  */
-    return instantiate_scev (before, loop,
+    return instantiate_scev (region.entry, loop,
                             analyze_scalar_evolution (loop, t));

   def = SSA_NAME_DEF_STMT (t);
@@ -494,7 +493,7 @@ scalar_evolution_in_region (const sese_l
   if (has_vdefs)
     return chrec_dont_know;

-  return instantiate_scev (before, loop, t);
+  return instantiate_scev (region.entry, loop, t);
 }

 /* Return true if BB is empty, contains only DEBUG_INSNs.  */
Index: gcc/sese.h
===================================================================
--- gcc/sese.h  (revision 253645)
+++ gcc/sese.h  (working copy)
@@ -334,6 +334,8 @@ gbb_loop_at_index (gimple_poly_bb_p gbb,
   while (--depth > index)
     loop = loop_outer (loop);

+  gcc_assert (loop_in_sese_p (loop, region));
+
   return loop;
 }

Index: gcc/testsuite/gcc.dg/graphite/fuse-1.c
===================================================================
--- gcc/testsuite/gcc.dg/graphite/fuse-1.c      (revision 253645)
+++ gcc/testsuite/gcc.dg/graphite/fuse-1.c      (working copy)
@@ -1,15 +1,15 @@
 /* Check that the two loops are fused and that we manage to fold the two xor
    operations.  */
-/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop4-all -fdump-tree-graphite-all" } */
+/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop-all -fdump-tree-graphite-all" } */

 /* Make sure we fuse the loops like this:
 AST generated by isl:
 for (int c0 = 0; c0 <= 99; c0 += 1) {
-  S_3(0, c0);
-  S_6(0, c0);
-  S_9(0, c0);
+  S_3(c0);
+  S_6(c0);
+  S_9(c0);
 } */
-/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*\\}" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */

 /* Check that after fusing the loops, the scalar computation is also fused.  */
 /* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop4" } } */
Index: gcc/testsuite/gcc.dg/graphite/fuse-2.c
===================================================================
--- gcc/testsuite/gcc.dg/graphite/fuse-2.c      (revision 253645)
+++ gcc/testsuite/gcc.dg/graphite/fuse-2.c      (working copy)
@@ -3,13 +3,13 @@
 /* Make sure we fuse the loops like this:
 AST generated by isl:
 for (int c0 = 0; c0 <= 99; c0 += 1) {
-  S_3(0, c0);
-  S_6(0, c0);
-  S_9(0, c0);
+  S_3(c0);
+  S_6(c0);
+  S_9(c0);
 }
 */

-/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*\\}" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */

 #define MAX 100
 int A[MAX], B[MAX], C[MAX];
Index: gcc/testsuite/gcc.dg/graphite/pr82451.c
===================================================================
--- gcc/testsuite/gcc.dg/graphite/pr82451.c     (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr82451.c     (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -floop-parallelize-all" } */
+
+static int a[];
+int b[1];
+int c;
+static void
+d (int *f, int *g)
+{
+  int e;
+  for (e = 0; e < 2; e++)
+    g[e] = 1;
+  for (e = 0; e < 2; e++)
+    g[e] = f[e] + f[e + 1];
+}
+void
+h ()
+{
+  for (;; c += 8)
+    d (&a[c], b);
+}
Index: gcc/testsuite/gfortran.dg/graphite/id-27.f90
===================================================================
--- gcc/testsuite/gfortran.dg/graphite/id-27.f90        (nonexistent)
+++ gcc/testsuite/gfortran.dg/graphite/id-27.f90        (working copy)
@@ -0,0 +1,40 @@
+! { dg-additional-options "-Ofast" }
+MODULE module_ra_gfdleta
+      INTEGER, PARAMETER              :: NBLY=15
+      REAL   , SAVE :: EM1(28,180),EM1WDE(28,180),TABLE1(28,180),     &
+                           TABLE2(28,180),TABLE3(28,180),EM3(28,180), &
+                           SOURCE(28,NBLY), DSRCE(28,NBLY)
+CONTAINS
+      SUBROUTINE TABLE
+ INTEGER, PARAMETER :: NBLX=47
+ INTEGER , PARAMETER:: NBLW = 163
+      REAL ::  &
+               SUM(28,180),PERTSM(28,180),SUM3(28,180),       &
+               SUMWDE(28,180),SRCWD(28,NBLX),SRC1NB(28,NBLW), &
+               DBDTNB(28,NBLW)
+      REAL ::  &
+               ZMASS(181),ZROOT(181),SC(28),DSC(28),XTEMV(28), &
+               TFOUR(28),FORTCU(28),X(28),X1(28),X2(180),SRCS(28), &
+               R1T(28),R2(28),S2(28),T3(28),R1WD(28)
+      REAL ::  EXPO(180),FAC(180)
+      I = 0
+      DO 417 J=121,180
+      FAC(J)=ZMASS(J)*(ONE-(ONE+X2(J))*EXPO(J))/(X2(J)*X2(J))
+417   CONTINUE
+      DO 421 J=121,180
+      SUM3(I,J)=SUM3(I,J)+DBDTNB(I,N)*FAC(J)
+421   CONTINUE
+      IF (CENT.GT.160. .AND. CENT.LT.560.) THEN
+         DO 420 J=1,180
+         DO 420 I=1,28
+         SUMWDE(I,J)=SUMWDE(I,J)+SRC1NB(I,N)*EXPO(J)
+420      CONTINUE
+      ENDIF
+      DO 433 J=121,180
+      EM3(I,J)=SUM3(I,J)/FORTCU(I)
+433   CONTINUE
+      DO 501 I=1,28
+      EM1WDE(I,J)=SUMWDE(I,J)/TFOUR(I)
+501   CONTINUE
+      END SUBROUTINE TABLE
+      END MODULE module_RA_GFDLETA
Index: gcc/testsuite/gfortran.dg/graphite/pr82451.f
===================================================================
--- gcc/testsuite/gfortran.dg/graphite/pr82451.f        (nonexistent)
+++ gcc/testsuite/gfortran.dg/graphite/pr82451.f        (working copy)
@@ -0,0 +1,39 @@
+! { dg-do compile }
+! { dg-options "-O2 -floop-nest-optimize" }
+      MODULE LES3D_DATA
+      PARAMETER ( NSCHEME = 4, ICHEM = 0, ISGSK = 0, IVISC = 1 )
+      DOUBLE PRECISION DT, TIME, STATTIME, CFL, RELNO, TSTND, ALREF
+      INTEGER IDYN, IMAX, JMAX, KMAX
+      PARAMETER( RUNIV =  8.3145D3,
+     >        TPRANDLT =    0.91D0)
+      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:) ::
+     >             U, V, W, P, T, H, EK,
+     >         UAV, VAV, WAV, PAV, TAV, HAV, EKAV
+      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) ::
+     >             CONC, HF, QAV, COAV, HFAV, DU
+      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:,:) ::
+     >             Q
+      END MODULE LES3D_DATA
+      SUBROUTINE FLUXJ()
+      USE LES3D_DATA
+      ALLOCATABLE QS(:), FSJ(:,:,:)
+      ALLOCATABLE DWDX(:),DWDY(:),DWDZ(:)
+      ALLOCATABLE DHDY(:), DKDY(:)
+      PARAMETER (  R12I = 1.0D0 / 12.0D0,
+     >             TWO3 = 2.0D0 / 3.0D0 )
+      ALLOCATE( QS(IMAX-1), FSJ(IMAX-1,0:JMAX-1,ND))
+      ALLOCATE( DWDX(IMAX-1),DWDY(IMAX-1),DWDZ(IMAX-1))
+      I1 = 1
+      DO K = K1,K2
+         DO J = J1,J2
+            DO I = I1, I2
+               FSJ(I,J,5) = FSJ(I,J,5) + PAV(I,J,K) * QS(I)
+            END DO
+            DO I = I1, I2
+               DWDX(I) = DXI * R12I * (WAV(I-2,J,K) - WAV(I+2,J,K) +
+     >                        8.0D0 * (WAV(I+1,J,K) - WAV(I-1,J,K)))
+            END DO
+         END DO
+      END DO
+      DEALLOCATE( QS, FSJ, DHDY, DKDY)
+      END
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c (revision 253645)
+++ gcc/tree-data-ref.c (working copy)
@@ -957,15 +957,14 @@ access_fn_component_p (tree op)
 }

 /* Determines the base object and the list of indices of memory reference
-   DR, analyzed in LOOP and instantiated in loop nest NEST.  */
+   DR, analyzed in LOOP and instantiated before NEST.  */

 static void
-dr_analyze_indices (struct data_reference *dr, loop_p nest, loop_p loop)
+dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
 {
   vec<tree> access_fns = vNULL;
   tree ref, op;
   tree base, off, access_fn;
-  basic_block before_loop;

   /* If analyzing a basic-block there are no indices to analyze
      and thus no access functions.  */
@@ -977,7 +976,6 @@ dr_analyze_indices (struct data_referenc
     }

   ref = DR_REF (dr);
-  before_loop = block_before_loop (nest);

   /* REALPART_EXPR and IMAGPART_EXPR can be handled like accesses
      into a two element array with a constant index.  The base is
@@ -1002,7 +1000,7 @@ dr_analyze_indices (struct data_referenc
        {
          op = TREE_OPERAND (ref, 1);
          access_fn = analyze_scalar_evolution (loop, op);
-         access_fn = instantiate_scev (before_loop, loop, access_fn);
+         access_fn = instantiate_scev (nest, loop, access_fn);
          access_fns.safe_push (access_fn);
        }
       else if (TREE_CODE (ref) == COMPONENT_REF
@@ -1034,7 +1032,7 @@ dr_analyze_indices (struct data_referenc
     {
       op = TREE_OPERAND (ref, 0);
       access_fn = analyze_scalar_evolution (loop, op);
-      access_fn = instantiate_scev (before_loop, loop, access_fn);
+      access_fn = instantiate_scev (nest, loop, access_fn);
       if (TREE_CODE (access_fn) == POLYNOMIAL_CHREC)
        {
          tree orig_type;
@@ -1139,7 +1137,7 @@ free_data_ref (data_reference_p dr)
    in which the data reference should be analyzed.  */

 struct data_reference *
-create_data_ref (loop_p nest, loop_p loop, tree memref, gimple *stmt,
+create_data_ref (edge nest, loop_p loop, tree memref, gimple *stmt,
                 bool is_read, bool is_conditional_in_stmt)
 {
   struct data_reference *dr;
@@ -4970,7 +4968,8 @@ find_data_references_in_stmt (struct loo

   FOR_EACH_VEC_ELT (references, i, ref)
     {
-      dr = create_data_ref (nest, loop_containing_stmt (stmt), ref->ref,
+      dr = create_data_ref (nest ? loop_preheader_edge (nest) : NULL,
+                           loop_containing_stmt (stmt), ref->ref,
                            stmt, ref->is_read, ref->is_conditional_in_stmt);
       gcc_assert (dr != NULL);
       datarefs->safe_push (dr);
@@ -4986,7 +4985,7 @@ find_data_references_in_stmt (struct loo
    should be analyzed.  */

 bool
-graphite_find_data_references_in_stmt (loop_p nest, loop_p loop, gimple *stmt,
+graphite_find_data_references_in_stmt (edge nest, loop_p loop, gimple *stmt,
                                       vec<data_reference_p> *datarefs)
 {
   unsigned i;
Index: gcc/tree-data-ref.h
===================================================================
--- gcc/tree-data-ref.h (revision 253645)
+++ gcc/tree-data-ref.h (working copy)
@@ -436,11 +436,11 @@ extern void free_data_ref (data_referenc
 extern void free_data_refs (vec<data_reference_p> );
 extern bool find_data_references_in_stmt (struct loop *, gimple *,
                                          vec<data_reference_p> *);
-extern bool graphite_find_data_references_in_stmt (loop_p, loop_p, gimple *,
+extern bool graphite_find_data_references_in_stmt (edge, loop_p, gimple *,
                                                   vec<data_reference_p> *);
 tree find_data_references_in_loop (struct loop *, vec<data_reference_p> *);
 bool loop_nest_has_data_refs (loop_p loop);
-struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool,
+struct data_reference *create_data_ref (edge, loop_p, tree, gimple *, bool,
                                        bool);
 extern bool find_loop_nest (struct loop *, vec<loop_p> *);
 extern struct data_dependence_relation *initialize_data_dependence_relation
Index: gcc/tree-scalar-evolution.c
===================================================================
--- gcc/tree-scalar-evolution.c (revision 253645)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -2095,6 +2095,10 @@ analyze_scalar_evolution (struct loop *l
 {
   tree res;

+  /* ???  Fix callers.  */
+  if (! loop)
+    return var;
+
   if (dump_file && (dump_flags & TDF_SCEV))
     {
       fprintf (dump_file, "(analyze_scalar_evolution \n");
@@ -2271,7 +2275,7 @@ eq_idx_scev_info (const void *e1, const

 static unsigned
 get_instantiated_value_entry (instantiate_cache_type &cache,
-                             tree name, basic_block instantiate_below)
+                             tree name, edge instantiate_below)
 {
   if (!cache.map)
     {
@@ -2281,7 +2285,7 @@ get_instantiated_value_entry (instantiat

   scev_info_str e;
   e.name_version = SSA_NAME_VERSION (name);
-  e.instantiated_below = instantiate_below->index;
+  e.instantiated_below = instantiate_below->dest->index;
   void **slot = htab_find_slot_with_hash (cache.map, &e,
                                          scev_info_hasher::hash (&e), INSERT);
   if (!*slot)
@@ -2325,7 +2329,7 @@ loop_closed_phi_def (tree var)
   return NULL_TREE;
 }

-static tree instantiate_scev_r (basic_block, struct loop *, struct loop *,
+static tree instantiate_scev_r (edge, struct loop *, struct loop *,
                                tree, bool *, int);

 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
@@ -2344,7 +2348,7 @@ static tree instantiate_scev_r (basic_bl
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_name (basic_block instantiate_below,
+instantiate_scev_name (edge instantiate_below,
                       struct loop *evolution_loop, struct loop *inner_loop,
                       tree chrec,
                       bool *fold_conversions,
@@ -2358,7 +2362,7 @@ instantiate_scev_name (basic_block insta
      evolutions in outer loops), nothing to do.  */
   if (!def_bb
       || loop_depth (def_bb->loop_father) == 0
-      || dominated_by_p (CDI_DOMINATORS, instantiate_below, def_bb))
+      || ! dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest))
     return chrec;

   /* We cache the value of instantiated variable to avoid exponential
@@ -2380,6 +2384,51 @@ instantiate_scev_name (basic_block insta

   def_loop = find_common_loop (evolution_loop, def_bb->loop_father);

+  if (! dominated_by_p (CDI_DOMINATORS,
+                       def_loop->header, instantiate_below->dest))
+    {
+      gimple *def = SSA_NAME_DEF_STMT (chrec);
+      if (gassign *ass = dyn_cast <gassign *> (def))
+       {
+         switch (gimple_assign_rhs_class (ass))
+           {
+           case GIMPLE_UNARY_RHS:
+             {
+               tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
+                                              inner_loop, gimple_assign_rhs1 (ass),
+                                              fold_conversions, size_expr);
+               if (op0 == chrec_dont_know)
+                 return chrec_dont_know;
+               res = fold_build1 (gimple_assign_rhs_code (ass),
+                                  TREE_TYPE (chrec), op0);
+               break;
+             }
+           case GIMPLE_BINARY_RHS:
+             {
+               tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
+                                              inner_loop, gimple_assign_rhs1 (ass),
+                                              fold_conversions, size_expr);
+               if (op0 == chrec_dont_know)
+                 return chrec_dont_know;
+               tree op1 = instantiate_scev_r (instantiate_below, evolution_loop,
+                                              inner_loop, gimple_assign_rhs2 (ass),
+                                              fold_conversions, size_expr);
+               if (op1 == chrec_dont_know)
+                 return chrec_dont_know;
+               res = fold_build2 (gimple_assign_rhs_code (ass),
+                                  TREE_TYPE (chrec), op0, op1);
+               break;
+             }
+           default:
+             res = chrec_dont_know;
+           }
+       }
+      else
+       res = chrec_dont_know;
+      global_cache->set (si, res);
+      return res;
+    }
+
   /* If the analysis yields a parametric chrec, instantiate the
      result again.  */
   res = analyze_scalar_evolution (def_loop, chrec);
@@ -2411,8 +2460,9 @@ instantiate_scev_name (basic_block insta
                                    inner_loop, res,
                                    fold_conversions, size_expr);
        }
-      else if (!dominated_by_p (CDI_DOMINATORS, instantiate_below,
-                               gimple_bb (SSA_NAME_DEF_STMT (res))))
+      else if (dominated_by_p (CDI_DOMINATORS,
+                               gimple_bb (SSA_NAME_DEF_STMT (res)),
+                               instantiate_below->dest))
        res = chrec_dont_know;
     }

@@ -2450,7 +2500,7 @@ instantiate_scev_name (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_poly (basic_block instantiate_below,
+instantiate_scev_poly (edge instantiate_below,
                       struct loop *evolution_loop, struct loop *,
                       tree chrec, bool *fold_conversions, int size_expr)
 {
@@ -2495,7 +2545,7 @@ instantiate_scev_poly (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_binary (basic_block instantiate_below,
+instantiate_scev_binary (edge instantiate_below,
                         struct loop *evolution_loop, struct loop *inner_loop,
                         tree chrec, enum tree_code code,
                         tree type, tree c0, tree c1,
@@ -2541,43 +2591,6 @@ instantiate_scev_binary (basic_block ins
 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
    and EVOLUTION_LOOP, that were left under a symbolic form.

-   "CHREC" is an array reference to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_array_ref (basic_block instantiate_below,
-                      struct loop *evolution_loop, struct loop *inner_loop,
-                      tree chrec, bool *fold_conversions, int size_expr)
-{
-  tree res;
-  tree index = TREE_OPERAND (chrec, 1);
-  tree op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                inner_loop, index,
-                                fold_conversions, size_expr);
-
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (chrec && op1 == index)
-    return chrec;
-
-  res = unshare_expr (chrec);
-  TREE_OPERAND (res, 1) = op1;
-  return res;
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
    "CHREC" that stands for a convert expression "(TYPE) OP" is to be
    instantiated.

@@ -2592,7 +2605,7 @@ instantiate_array_ref (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_convert (basic_block instantiate_below,
+instantiate_scev_convert (edge instantiate_below,
                          struct loop *evolution_loop, struct loop *inner_loop,
                          tree chrec, tree type, tree op,
                          bool *fold_conversions, int size_expr)
@@ -2643,7 +2656,7 @@ instantiate_scev_convert (basic_block in
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_not (basic_block instantiate_below,
+instantiate_scev_not (edge instantiate_below,
                      struct loop *evolution_loop, struct loop *inner_loop,
                      tree chrec,
                      enum tree_code code, tree type, tree op,
@@ -2681,130 +2694,6 @@ instantiate_scev_not (basic_block instan
 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
    and EVOLUTION_LOOP, that were left under a symbolic form.

-   CHREC is an expression with 3 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_3 (basic_block instantiate_below,
-                   struct loop *evolution_loop, struct loop *inner_loop,
-                   tree chrec,
-                   bool *fold_conversions, int size_expr)
-{
-  tree op1, op2;
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                inner_loop, TREE_OPERAND (chrec, 0),
-                                fold_conversions, size_expr);
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-                           inner_loop, TREE_OPERAND (chrec, 1),
-                           fold_conversions, size_expr);
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op2 = instantiate_scev_r (instantiate_below, evolution_loop,
-                           inner_loop, TREE_OPERAND (chrec, 2),
-                           fold_conversions, size_expr);
-  if (op2 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0)
-      && op1 == TREE_OPERAND (chrec, 1)
-      && op2 == TREE_OPERAND (chrec, 2))
-    return chrec;
-
-  return fold_build3 (TREE_CODE (chrec),
-                     TREE_TYPE (chrec), op0, op1, op2);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
-   CHREC is an expression with 2 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_2 (basic_block instantiate_below,
-                   struct loop *evolution_loop, struct loop *inner_loop,
-                   tree chrec,
-                   bool *fold_conversions, int size_expr)
-{
-  tree op1;
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                inner_loop, TREE_OPERAND (chrec, 0),
-                                fold_conversions, size_expr);
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-                           inner_loop, TREE_OPERAND (chrec, 1),
-                           fold_conversions, size_expr);
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0)
-      && op1 == TREE_OPERAND (chrec, 1))
-    return chrec;
-
-  return fold_build2 (TREE_CODE (chrec), TREE_TYPE (chrec), op0, op1);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
-   CHREC is an expression with 2 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_1 (basic_block instantiate_below,
-                   struct loop *evolution_loop, struct loop *inner_loop,
-                   tree chrec,
-                   bool *fold_conversions, int size_expr)
-{
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                inner_loop, TREE_OPERAND (chrec, 0),
-                                fold_conversions, size_expr);
-
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0))
-    return chrec;
-
-  return fold_build1 (TREE_CODE (chrec), TREE_TYPE (chrec), op0);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
    CHREC is the scalar evolution to instantiate.

    CACHE is the cache of already instantiated values.
@@ -2818,7 +2707,7 @@ instantiate_scev_1 (basic_block instanti
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_r (basic_block instantiate_below,
+instantiate_scev_r (edge instantiate_below,
                    struct loop *evolution_loop, struct loop *inner_loop,
                    tree chrec,
                    bool *fold_conversions, int size_expr)
@@ -2870,50 +2759,20 @@ instantiate_scev_r (basic_block instanti
                                   fold_conversions, size_expr);

     case ADDR_EXPR:
+      if (is_gimple_min_invariant (chrec))
+       return chrec;
+      /* Fallthru.  */
     case SCEV_NOT_KNOWN:
       return chrec_dont_know;

     case SCEV_KNOWN:
       return chrec_known;

-    case ARRAY_REF:
-      return instantiate_array_ref (instantiate_below, evolution_loop,
-                                   inner_loop, chrec,
-                                   fold_conversions, size_expr);
-
-    default:
-      break;
-    }
-
-  if (VL_EXP_CLASS_P (chrec))
-    return chrec_dont_know;
-
-  switch (TREE_CODE_LENGTH (TREE_CODE (chrec)))
-    {
-    case 3:
-      return instantiate_scev_3 (instantiate_below, evolution_loop,
-                                inner_loop, chrec,
-                                fold_conversions, size_expr);
-
-    case 2:
-      return instantiate_scev_2 (instantiate_below, evolution_loop,
-                                inner_loop, chrec,
-                                fold_conversions, size_expr);
-
-    case 1:
-      return instantiate_scev_1 (instantiate_below, evolution_loop,
-                                inner_loop, chrec,
-                                fold_conversions, size_expr);
-
-    case 0:
-      return chrec;
-
     default:
-      break;
+      if (CONSTANT_CLASS_P (chrec))
+       return chrec;
+      return chrec_dont_know;
     }
-
-  /* Too complicated to handle.  */
-  return chrec_dont_know;
 }

 /* Analyze all the parameters of the chrec that were left under a
@@ -2923,7 +2782,7 @@ instantiate_scev_r (basic_block instanti
    a function parameter.  */

 tree
-instantiate_scev (basic_block instantiate_below, struct loop *evolution_loop,
+instantiate_scev (edge instantiate_below, struct loop *evolution_loop,
                  tree chrec)
 {
   tree res;
@@ -2931,8 +2790,10 @@ instantiate_scev (basic_block instantiat
   if (dump_file && (dump_flags & TDF_SCEV))
     {
       fprintf (dump_file, "(instantiate_scev \n");
-      fprintf (dump_file, "  (instantiate_below = %d)\n", instantiate_below->index);
-      fprintf (dump_file, "  (evolution_loop = %d)\n", evolution_loop->num);
+      fprintf (dump_file, "  (instantiate_below = %d -> %d)\n",
+              instantiate_below->src->index, instantiate_below->dest->index);
+      if (evolution_loop)
+       fprintf (dump_file, "  (evolution_loop = %d)\n", evolution_loop->num);
       fprintf (dump_file, "  (chrec = ");
       print_generic_expr (dump_file, chrec);
       fprintf (dump_file, ")\n");
@@ -2980,7 +2841,7 @@ resolve_mixers (struct loop *loop, tree
       destr = true;
     }

-  tree ret = instantiate_scev_r (block_before_loop (loop), loop, NULL,
+  tree ret = instantiate_scev_r (loop_preheader_edge (loop), loop, NULL,
                                 chrec, &fold_conversions, 0);

   if (folded_casts && !*folded_casts)
Index: gcc/tree-scalar-evolution.h
===================================================================
--- gcc/tree-scalar-evolution.h (revision 253645)
+++ gcc/tree-scalar-evolution.h (working copy)
@@ -30,7 +30,7 @@ extern void scev_reset (void);
 extern void scev_reset_htab (void);
 extern void scev_finalize (void);
 extern tree analyze_scalar_evolution (struct loop *, tree);
-extern tree instantiate_scev (basic_block, struct loop *, tree);
+extern tree instantiate_scev (edge, struct loop *, tree);
 extern tree resolve_mixers (struct loop *, tree, bool *);
 extern void gather_stats_on_scev_database (void);
 extern void final_value_replacement_loop (struct loop *);
@@ -60,7 +60,7 @@ block_before_loop (loop_p loop)
 static inline tree
 instantiate_parameters (struct loop *loop, tree chrec)
 {
-  return instantiate_scev (block_before_loop (loop), loop, chrec);
+  return instantiate_scev (loop_preheader_edge (loop), loop, chrec);
 }

 /* Returns the loop of the polynomial chrec CHREC.  */
Index: gcc/tree-ssa-loop-prefetch.c
===================================================================
--- gcc/tree-ssa-loop-prefetch.c        (revision 253645)
+++ gcc/tree-ssa-loop-prefetch.c        (working copy)
@@ -1632,7 +1632,8 @@ determine_loop_nest_reuse (struct loop *
   for (gr = refs; gr; gr = gr->next)
     for (ref = gr->refs; ref; ref = ref->next)
       {
-       dr = create_data_ref (nest, loop_containing_stmt (ref->stmt),
+       dr = create_data_ref (loop_preheader_edge (nest),
+                             loop_containing_stmt (ref->stmt),
                              ref->mem, ref->stmt, !ref->write_p, false);

        if (dr)

Patch

Index: gcc/cfgloop.c
===================================================================
--- gcc/cfgloop.c	(revision 253645)
+++ gcc/cfgloop.c	(working copy)
@@ -1713,12 +1713,19 @@  loop_preheader_edge (const struct loop *
   edge e;
   edge_iterator ei;
 
-  gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS));
+  gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
+	      && ! loops_state_satisfies_p (LOOPS_MAY_HAVE_MULTIPLE_LATCHES));
 
   FOR_EACH_EDGE (e, ei, loop->header->preds)
     if (e->src != loop->latch)
       break;
 
+  if (! e)
+    {
+      gcc_assert (! loop_outer (loop));
+      return single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+    }
+
   return e;
 }
 
Index: gcc/graphite-isl-ast-to-gimple.c
===================================================================
--- gcc/graphite-isl-ast-to-gimple.c	(revision 253645)
+++ gcc/graphite-isl-ast-to-gimple.c	(working copy)
@@ -749,10 +749,8 @@  build_iv_mapping (vec<tree> iv_map, gimp
       if (codegen_error_p ())
 	t = integer_zero_node;
 
-      loop_p old_loop = gbb_loop_at_index (gbb, region, i - 2);
-      /* Record sth only for real loops.  */
-      if (loop_in_sese_p (old_loop, region))
-	iv_map[old_loop->num] = t;
+      loop_p old_loop = gbb_loop_at_index (gbb, region, i - 1);
+      iv_map[old_loop->num] = t;
     }
 }
 
@@ -1570,6 +1568,12 @@  graphite_regenerate_ast_isl (scop_p scop
       update_ssa (TODO_update_ssa);
       checking_verify_ssa (true, true);
       rewrite_into_loop_closed_ssa (NULL, 0);
+      /* We analyzed evolutions of all SCOPs during SCOP detection
+         which cached evolutions.  Now we've introduced PHIs for
+	 liveouts which causes those cached solutions to be invalid
+	 for code-generation purposes given we'd insert references
+	 to SSA names not dominating their new use.  */
+      scev_reset ();
     }
 
   if (t.codegen_error_p ())
Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c	(revision 253645)
+++ gcc/graphite-scop-detection.c	(working copy)
@@ -403,7 +403,7 @@  public:
 
      Something like "i * n" or "n * m" is not allowed.  */
 
-  static bool graphite_can_represent_scev (tree scev);
+  static bool graphite_can_represent_scev (sese_l scop, tree scev);
 
   /* Return true when EXPR can be represented in the polyhedral model.
 
@@ -963,7 +963,7 @@  scop_detection::graphite_can_represent_i
    Something like "i * n" or "n * m" is not allowed.  */
 
 bool
-scop_detection::graphite_can_represent_scev (tree scev)
+scop_detection::graphite_can_represent_scev (sese_l scop, tree scev)
 {
   if (chrec_contains_undetermined (scev))
     return false;
@@ -982,13 +982,13 @@  scop_detection::graphite_can_represent_s
     case BIT_NOT_EXPR:
     CASE_CONVERT:
     case NON_LVALUE_EXPR:
-      return graphite_can_represent_scev (TREE_OPERAND (scev, 0));
+      return graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0));
 
     case PLUS_EXPR:
     case POINTER_PLUS_EXPR:
     case MINUS_EXPR:
-      return graphite_can_represent_scev (TREE_OPERAND (scev, 0))
-	&& graphite_can_represent_scev (TREE_OPERAND (scev, 1));
+      return graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0))
+	&& graphite_can_represent_scev (scop, TREE_OPERAND (scev, 1));
 
     case MULT_EXPR:
       return !CONVERT_EXPR_CODE_P (TREE_CODE (TREE_OPERAND (scev, 0)))
@@ -996,18 +996,20 @@  scop_detection::graphite_can_represent_s
 	&& !(chrec_contains_symbols (TREE_OPERAND (scev, 0))
 	     && chrec_contains_symbols (TREE_OPERAND (scev, 1)))
 	&& graphite_can_represent_init (scev)
-	&& graphite_can_represent_scev (TREE_OPERAND (scev, 0))
-	&& graphite_can_represent_scev (TREE_OPERAND (scev, 1));
+	&& graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0))
+	&& graphite_can_represent_scev (scop, TREE_OPERAND (scev, 1));
 
     case POLYNOMIAL_CHREC:
       /* Check for constant strides.  With a non constant stride of
 	 'n' we would have a value of 'iv * n'.  Also check that the
 	 initial value can represented: for example 'n * m' cannot be
 	 represented.  */
+      gcc_assert (loop_in_sese_p (get_loop (cfun,
+					    CHREC_VARIABLE (scev)), scop));
       if (!evolution_function_right_is_integer_cst (scev)
 	  || !graphite_can_represent_init (scev))
 	return false;
-      return graphite_can_represent_scev (CHREC_LEFT (scev));
+      return graphite_can_represent_scev (scop, CHREC_LEFT (scev));
 
     default:
       break;
@@ -1031,7 +1033,7 @@  scop_detection::graphite_can_represent_e
 					     tree expr)
 {
   tree scev = scalar_evolution_in_region (scop, loop, expr);
-  return graphite_can_represent_scev (scev);
+  return graphite_can_represent_scev (scop, scev);
 }
 
 /* Return true if the data references of STMT can be represented by Graphite.
@@ -1040,12 +1042,15 @@  scop_detection::graphite_can_represent_e
 bool
 scop_detection::stmt_has_simple_data_refs_p (sese_l scop, gimple *stmt)
 {
-  loop_p nest;
+  edge nest;
   loop_p loop = loop_containing_stmt (stmt);
   if (!loop_in_sese_p (loop, scop))
-    nest = loop;
+    {
+      nest = scop.entry;
+      loop = NULL;
+    }
   else
-    nest = outermost_loop_in_sese (scop, gimple_bb (stmt));
+    nest = loop_preheader_edge (outermost_loop_in_sese (scop, gimple_bb (stmt)));
 
   auto_vec<data_reference_p> drs;
   if (! graphite_find_data_references_in_stmt (nest, loop, stmt, &drs))
@@ -1056,7 +1061,7 @@  scop_detection::stmt_has_simple_data_ref
   FOR_EACH_VEC_ELT (drs, j, dr)
     {
       for (unsigned i = 0; i < DR_NUM_DIMENSIONS (dr); ++i)
-	if (! graphite_can_represent_scev (DR_ACCESS_FN (dr, i)))
+	if (! graphite_can_represent_scev (scop, DR_ACCESS_FN (dr, i)))
 	  return false;
     }
 
@@ -1413,12 +1418,15 @@  try_generate_gimple_bb (scop_p scop, bas
   vec<scalar_use> reads = vNULL;
 
   sese_l region = scop->scop_info->region;
-  loop_p nest;
+  edge nest;
   loop_p loop = bb->loop_father;
   if (!loop_in_sese_p (loop, region))
-    nest = loop;
+    {
+      nest = region.entry;
+      loop = NULL;
+    }
   else
-    nest = outermost_loop_in_sese (region, bb);
+    nest = loop_preheader_edge (outermost_loop_in_sese (region, bb));
 
   for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
        gsi_next (&gsi))
Index: gcc/graphite-sese-to-poly.c
===================================================================
--- gcc/graphite-sese-to-poly.c	(revision 253645)
+++ gcc/graphite-sese-to-poly.c	(working copy)
@@ -86,7 +86,7 @@  extract_affine_chrec (scop_p s, tree e,
   isl_pw_aff *lhs = extract_affine (s, CHREC_LEFT (e), isl_space_copy (space));
   isl_pw_aff *rhs = extract_affine (s, CHREC_RIGHT (e), isl_space_copy (space));
   isl_local_space *ls = isl_local_space_from_space (space);
-  unsigned pos = sese_loop_depth (s->scop_info->region, get_chrec_loop (e));
+  unsigned pos = sese_loop_depth (s->scop_info->region, get_chrec_loop (e)) - 1;
   isl_aff *loop = isl_aff_set_coefficient_si
     (isl_aff_zero_on_domain (ls), isl_dim_in, pos, 1);
   isl_pw_aff *l = isl_pw_aff_from_aff (loop);
@@ -763,10 +763,10 @@  add_loop_constraints (scop_p scop, __isl
     return domain;
   const sese_l &region = scop->scop_info->region;
   if (!loop_in_sese_p (loop, region))
-    ;
-  else
-    /* Recursion all the way up to the context loop.  */
-    domain = add_loop_constraints (scop, domain, loop_outer (loop), context);
+    return domain;
+
+  /* Recursion all the way up to the context loop.  */
+  domain = add_loop_constraints (scop, domain, loop_outer (loop), context);
 
   /* Then, build constraints over the loop in post-order: outer to inner.  */
 
@@ -777,21 +777,6 @@  add_loop_constraints (scop_p scop, __isl
   domain = add_iter_domain_dimension (domain, loop, scop);
   isl_space *space = isl_set_get_space (domain);
 
-  if (!loop_in_sese_p (loop, region))
-    {
-      /* 0 == loop_i */
-      isl_local_space *ls = isl_local_space_from_space (space);
-      isl_constraint *c = isl_equality_alloc (ls);
-      c = isl_constraint_set_coefficient_si (c, isl_dim_set, loop_index, 1);
-      if (dump_file)
-	{
-	  fprintf (dump_file, "[sese-to-poly] adding constraint to the domain: ");
-	  print_isl_constraint (dump_file, c);
-	}
-      domain = isl_set_add_constraint (domain, c);
-      return domain;
-    }
-
   /* 0 <= loop_i */
   isl_local_space *ls = isl_local_space_from_space (isl_space_copy (space));
   isl_constraint *c = isl_inequality_alloc (ls);
Index: gcc/sese.c
===================================================================
--- gcc/sese.c	(revision 253645)
+++ gcc/sese.c	(working copy)
@@ -461,7 +461,6 @@  scalar_evolution_in_region (const sese_l
 {
   gimple *def;
   struct loop *def_loop;
-  basic_block before = region.entry->src;
 
   /* SCOP parameters.  */
   if (TREE_CODE (t) == SSA_NAME
@@ -472,7 +471,7 @@  scalar_evolution_in_region (const sese_l
       || loop_in_sese_p (loop, region))
     /* FIXME: we would need instantiate SCEV to work on a region, and be more
        flexible wrt. memory loads that may be invariant in the region.  */
-    return instantiate_scev (before, loop,
+    return instantiate_scev (region.entry, loop,
 			     analyze_scalar_evolution (loop, t));
 
   def = SSA_NAME_DEF_STMT (t);
@@ -494,7 +493,7 @@  scalar_evolution_in_region (const sese_l
   if (has_vdefs)
     return chrec_dont_know;
 
-  return instantiate_scev (before, loop, t);
+  return instantiate_scev (region.entry, loop, t);
 }
 
 /* Return true if BB is empty, contains only DEBUG_INSNs.  */
Index: gcc/sese.h
===================================================================
--- gcc/sese.h	(revision 253645)
+++ gcc/sese.h	(working copy)
@@ -334,6 +334,8 @@  gbb_loop_at_index (gimple_poly_bb_p gbb,
   while (--depth > index)
     loop = loop_outer (loop);
 
+  gcc_assert (loop_in_sese_p (loop, region));
+
   return loop;
 }
 
Index: gcc/testsuite/gcc.dg/graphite/fuse-1.c
===================================================================
--- gcc/testsuite/gcc.dg/graphite/fuse-1.c	(revision 253645)
+++ gcc/testsuite/gcc.dg/graphite/fuse-1.c	(working copy)
@@ -1,15 +1,15 @@ 
 /* Check that the two loops are fused and that we manage to fold the two xor
    operations.  */
-/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop4-all -fdump-tree-graphite-all" } */
+/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop-all -fdump-tree-graphite-all" } */
 
 /* Make sure we fuse the loops like this:
 AST generated by isl:
 for (int c0 = 0; c0 <= 99; c0 += 1) {
-  S_3(0, c0);
-  S_6(0, c0);
-  S_9(0, c0);
+  S_3(c0);
+  S_6(c0);
+  S_9(c0);
 } */
-/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*\\}" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */
 
 /* Check that after fusing the loops, the scalar computation is also fused.  */
 /* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop4" } } */
Index: gcc/testsuite/gcc.dg/graphite/fuse-2.c
===================================================================
--- gcc/testsuite/gcc.dg/graphite/fuse-2.c	(revision 253645)
+++ gcc/testsuite/gcc.dg/graphite/fuse-2.c	(working copy)
@@ -3,13 +3,13 @@ 
 /* Make sure we fuse the loops like this:
 AST generated by isl:
 for (int c0 = 0; c0 <= 99; c0 += 1) {
-  S_3(0, c0);
-  S_6(0, c0);
-  S_9(0, c0);
+  S_3(c0);
+  S_6(c0);
+  S_9(c0);
 }
 */
 
-/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*\\}" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */
 
 #define MAX 100
 int A[MAX], B[MAX], C[MAX];
Index: gcc/testsuite/gcc.dg/graphite/pr82451.c
===================================================================
--- gcc/testsuite/gcc.dg/graphite/pr82451.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr82451.c	(working copy)
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O -floop-parallelize-all" } */
+
+static int a[];
+int b[1];
+int c;
+static void
+d (int *f, int *g)
+{
+  int e;
+  for (e = 0; e < 2; e++)
+    g[e] = 1;
+  for (e = 0; e < 2; e++)
+    g[e] = f[e] + f[e + 1];
+}
+void
+h ()
+{
+  for (;; c += 8)
+    d (&a[c], b);
+}
Index: gcc/testsuite/gfortran.dg/graphite/id-27.f90
===================================================================
--- gcc/testsuite/gfortran.dg/graphite/id-27.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/graphite/id-27.f90	(working copy)
@@ -0,0 +1,40 @@ 
+! { dg-additional-options "-Ofast" }
+MODULE module_ra_gfdleta
+      INTEGER, PARAMETER              :: NBLY=15
+      REAL   , SAVE :: EM1(28,180),EM1WDE(28,180),TABLE1(28,180),     &
+                           TABLE2(28,180),TABLE3(28,180),EM3(28,180), &
+                           SOURCE(28,NBLY), DSRCE(28,NBLY)
+CONTAINS
+      SUBROUTINE TABLE 
+ INTEGER, PARAMETER :: NBLX=47
+ INTEGER , PARAMETER:: NBLW = 163
+      REAL ::  &
+               SUM(28,180),PERTSM(28,180),SUM3(28,180),       &
+               SUMWDE(28,180),SRCWD(28,NBLX),SRC1NB(28,NBLW), &
+               DBDTNB(28,NBLW)
+      REAL ::  &
+               ZMASS(181),ZROOT(181),SC(28),DSC(28),XTEMV(28), &
+               TFOUR(28),FORTCU(28),X(28),X1(28),X2(180),SRCS(28), &
+               R1T(28),R2(28),S2(28),T3(28),R1WD(28)
+      REAL ::  EXPO(180),FAC(180)
+      I = 0
+      DO 417 J=121,180
+      FAC(J)=ZMASS(J)*(ONE-(ONE+X2(J))*EXPO(J))/(X2(J)*X2(J))
+417   CONTINUE
+      DO 421 J=121,180
+      SUM3(I,J)=SUM3(I,J)+DBDTNB(I,N)*FAC(J)
+421   CONTINUE
+      IF (CENT.GT.160. .AND. CENT.LT.560.) THEN
+         DO 420 J=1,180
+         DO 420 I=1,28
+         SUMWDE(I,J)=SUMWDE(I,J)+SRC1NB(I,N)*EXPO(J)
+420      CONTINUE
+      ENDIF
+      DO 433 J=121,180
+      EM3(I,J)=SUM3(I,J)/FORTCU(I)
+433   CONTINUE
+      DO 501 I=1,28
+      EM1WDE(I,J)=SUMWDE(I,J)/TFOUR(I)
+501   CONTINUE
+      END SUBROUTINE TABLE
+      END MODULE module_RA_GFDLETA
Index: gcc/testsuite/gfortran.dg/graphite/pr82451.f
===================================================================
--- gcc/testsuite/gfortran.dg/graphite/pr82451.f	(nonexistent)
+++ gcc/testsuite/gfortran.dg/graphite/pr82451.f	(working copy)
@@ -0,0 +1,39 @@ 
+! { dg-do compile }
+! { dg-options "-O2 -floop-nest-optimize" }
+      MODULE LES3D_DATA
+      PARAMETER ( NSCHEME = 4, ICHEM = 0, ISGSK = 0, IVISC = 1 )
+      DOUBLE PRECISION DT, TIME, STATTIME, CFL, RELNO, TSTND, ALREF
+      INTEGER IDYN, IMAX, JMAX, KMAX
+      PARAMETER( RUNIV =  8.3145D3,
+     >        TPRANDLT =    0.91D0)
+      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:) ::
+     >             U, V, W, P, T, H, EK,
+     >         UAV, VAV, WAV, PAV, TAV, HAV, EKAV
+      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) ::
+     >             CONC, HF, QAV, COAV, HFAV, DU
+      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:,:) ::
+     >             Q
+      END MODULE LES3D_DATA
+      SUBROUTINE FLUXJ()
+      USE LES3D_DATA
+      ALLOCATABLE QS(:), FSJ(:,:,:)
+      ALLOCATABLE DWDX(:),DWDY(:),DWDZ(:)
+      ALLOCATABLE DHDY(:), DKDY(:)
+      PARAMETER (  R12I = 1.0D0 / 12.0D0,
+     >             TWO3 = 2.0D0 / 3.0D0 )
+      ALLOCATE( QS(IMAX-1), FSJ(IMAX-1,0:JMAX-1,ND))
+      ALLOCATE( DWDX(IMAX-1),DWDY(IMAX-1),DWDZ(IMAX-1))
+      I1 = 1
+      DO K = K1,K2
+         DO J = J1,J2
+            DO I = I1, I2
+               FSJ(I,J,5) = FSJ(I,J,5) + PAV(I,J,K) * QS(I)
+            END DO
+            DO I = I1, I2
+               DWDX(I) = DXI * R12I * (WAV(I-2,J,K) - WAV(I+2,J,K) +
+     >                        8.0D0 * (WAV(I+1,J,K) - WAV(I-1,J,K)))
+            END DO
+         END DO
+      END DO
+      DEALLOCATE( QS, FSJ, DHDY, DKDY)
+      END
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	(revision 253645)
+++ gcc/tree-data-ref.c	(working copy)
@@ -957,15 +957,14 @@  access_fn_component_p (tree op)
 }
 
 /* Determines the base object and the list of indices of memory reference
-   DR, analyzed in LOOP and instantiated in loop nest NEST.  */
+   DR, analyzed in LOOP and instantiated before NEST.  */
 
 static void
-dr_analyze_indices (struct data_reference *dr, loop_p nest, loop_p loop)
+dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop)
 {
   vec<tree> access_fns = vNULL;
   tree ref, op;
   tree base, off, access_fn;
-  basic_block before_loop;
 
   /* If analyzing a basic-block there are no indices to analyze
      and thus no access functions.  */
@@ -977,7 +976,6 @@  dr_analyze_indices (struct data_referenc
     }
 
   ref = DR_REF (dr);
-  before_loop = block_before_loop (nest);
 
   /* REALPART_EXPR and IMAGPART_EXPR can be handled like accesses
      into a two element array with a constant index.  The base is
@@ -1002,7 +1000,7 @@  dr_analyze_indices (struct data_referenc
 	{
 	  op = TREE_OPERAND (ref, 1);
 	  access_fn = analyze_scalar_evolution (loop, op);
-	  access_fn = instantiate_scev (before_loop, loop, access_fn);
+	  access_fn = instantiate_scev (nest, loop, access_fn);
 	  access_fns.safe_push (access_fn);
 	}
       else if (TREE_CODE (ref) == COMPONENT_REF
@@ -1034,7 +1032,7 @@  dr_analyze_indices (struct data_referenc
     {
       op = TREE_OPERAND (ref, 0);
       access_fn = analyze_scalar_evolution (loop, op);
-      access_fn = instantiate_scev (before_loop, loop, access_fn);
+      access_fn = instantiate_scev (nest, loop, access_fn);
       if (TREE_CODE (access_fn) == POLYNOMIAL_CHREC)
 	{
 	  tree orig_type;
@@ -1139,7 +1137,7 @@  free_data_ref (data_reference_p dr)
    in which the data reference should be analyzed.  */
 
 struct data_reference *
-create_data_ref (loop_p nest, loop_p loop, tree memref, gimple *stmt,
+create_data_ref (edge nest, loop_p loop, tree memref, gimple *stmt,
 		 bool is_read, bool is_conditional_in_stmt)
 {
   struct data_reference *dr;
@@ -4970,7 +4968,8 @@  find_data_references_in_stmt (struct loo
 
   FOR_EACH_VEC_ELT (references, i, ref)
     {
-      dr = create_data_ref (nest, loop_containing_stmt (stmt), ref->ref,
+      dr = create_data_ref (nest ? loop_preheader_edge (nest) : NULL,
+			    loop_containing_stmt (stmt), ref->ref,
 			    stmt, ref->is_read, ref->is_conditional_in_stmt);
       gcc_assert (dr != NULL);
       datarefs->safe_push (dr);
@@ -4986,7 +4985,7 @@  find_data_references_in_stmt (struct loo
    should be analyzed.  */
 
 bool
-graphite_find_data_references_in_stmt (loop_p nest, loop_p loop, gimple *stmt,
+graphite_find_data_references_in_stmt (edge nest, loop_p loop, gimple *stmt,
 				       vec<data_reference_p> *datarefs)
 {
   unsigned i;
Index: gcc/tree-data-ref.h
===================================================================
--- gcc/tree-data-ref.h	(revision 253645)
+++ gcc/tree-data-ref.h	(working copy)
@@ -436,11 +436,11 @@  extern void free_data_ref (data_referenc
 extern void free_data_refs (vec<data_reference_p> );
 extern bool find_data_references_in_stmt (struct loop *, gimple *,
 					  vec<data_reference_p> *);
-extern bool graphite_find_data_references_in_stmt (loop_p, loop_p, gimple *,
+extern bool graphite_find_data_references_in_stmt (edge, loop_p, gimple *,
 						   vec<data_reference_p> *);
 tree find_data_references_in_loop (struct loop *, vec<data_reference_p> *);
 bool loop_nest_has_data_refs (loop_p loop);
-struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool,
+struct data_reference *create_data_ref (edge, loop_p, tree, gimple *, bool,
 					bool);
 extern bool find_loop_nest (struct loop *, vec<loop_p> *);
 extern struct data_dependence_relation *initialize_data_dependence_relation
Index: gcc/tree-scalar-evolution.c
===================================================================
--- gcc/tree-scalar-evolution.c	(revision 253645)
+++ gcc/tree-scalar-evolution.c	(working copy)
@@ -2095,6 +2095,10 @@  analyze_scalar_evolution (struct loop *l
 {
   tree res;
 
+  /* ???  Fix callers.  */
+  if (! loop)
+    return var;
+
   if (dump_file && (dump_flags & TDF_SCEV))
     {
       fprintf (dump_file, "(analyze_scalar_evolution \n");
@@ -2271,7 +2275,7 @@  eq_idx_scev_info (const void *e1, const
 
 static unsigned
 get_instantiated_value_entry (instantiate_cache_type &cache,
-			      tree name, basic_block instantiate_below)
+			      tree name, edge instantiate_below)
 {
   if (!cache.map)
     {
@@ -2281,7 +2285,7 @@  get_instantiated_value_entry (instantiat
 
   scev_info_str e;
   e.name_version = SSA_NAME_VERSION (name);
-  e.instantiated_below = instantiate_below->index;
+  e.instantiated_below = instantiate_below->dest->index;
   void **slot = htab_find_slot_with_hash (cache.map, &e,
 					  scev_info_hasher::hash (&e), INSERT);
   if (!*slot)
@@ -2325,7 +2329,7 @@  loop_closed_phi_def (tree var)
   return NULL_TREE;
 }
 
-static tree instantiate_scev_r (basic_block, struct loop *, struct loop *,
+static tree instantiate_scev_r (edge, struct loop *, struct loop *,
 				tree, bool *, int);
 
 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
@@ -2344,7 +2348,7 @@  static tree instantiate_scev_r (basic_bl
    instantiated, and to stop if it exceeds some limit.  */
 
 static tree
-instantiate_scev_name (basic_block instantiate_below,
+instantiate_scev_name (edge instantiate_below,
 		       struct loop *evolution_loop, struct loop *inner_loop,
 		       tree chrec,
 		       bool *fold_conversions,
@@ -2358,7 +2362,7 @@  instantiate_scev_name (basic_block insta
      evolutions in outer loops), nothing to do.  */
   if (!def_bb
       || loop_depth (def_bb->loop_father) == 0
-      || dominated_by_p (CDI_DOMINATORS, instantiate_below, def_bb))
+      || ! dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest))
     return chrec;
 
   /* We cache the value of instantiated variable to avoid exponential
@@ -2380,6 +2384,51 @@  instantiate_scev_name (basic_block insta
 
   def_loop = find_common_loop (evolution_loop, def_bb->loop_father);
 
+  if (! dominated_by_p (CDI_DOMINATORS,
+			def_loop->header, instantiate_below->dest))
+    {
+      gimple *def = SSA_NAME_DEF_STMT (chrec);
+      if (gassign *ass = dyn_cast <gassign *> (def))
+	{
+	  switch (gimple_assign_rhs_class (ass))
+	    {
+	    case GIMPLE_UNARY_RHS:
+	      {
+		tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
+					       inner_loop, gimple_assign_rhs1 (ass),
+					       fold_conversions, size_expr);
+		if (op0 == chrec_dont_know)
+		  return chrec_dont_know;
+		res = fold_build1 (gimple_assign_rhs_code (ass),
+				   TREE_TYPE (chrec), op0);
+		break;
+	      }
+	    case GIMPLE_BINARY_RHS:
+	      {
+		tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
+					       inner_loop, gimple_assign_rhs1 (ass),
+					       fold_conversions, size_expr);
+		if (op0 == chrec_dont_know)
+		  return chrec_dont_know;
+		tree op1 = instantiate_scev_r (instantiate_below, evolution_loop,
+					       inner_loop, gimple_assign_rhs2 (ass),
+					       fold_conversions, size_expr);
+		if (op1 == chrec_dont_know)
+		  return chrec_dont_know;
+		res = fold_build2 (gimple_assign_rhs_code (ass),
+				   TREE_TYPE (chrec), op0, op1);
+		break;
+	      }
+	    default:
+	      res = chrec_dont_know;
+	    }
+	}
+      else
+	res = chrec_dont_know;
+      global_cache->set (si, res);
+      return res;
+    }
+
   /* If the analysis yields a parametric chrec, instantiate the
      result again.  */
   res = analyze_scalar_evolution (def_loop, chrec);
@@ -2411,8 +2460,9 @@  instantiate_scev_name (basic_block insta
 				    inner_loop, res,
 				    fold_conversions, size_expr);
 	}
-      else if (!dominated_by_p (CDI_DOMINATORS, instantiate_below,
-				gimple_bb (SSA_NAME_DEF_STMT (res))))
+      else if (dominated_by_p (CDI_DOMINATORS,
+				gimple_bb (SSA_NAME_DEF_STMT (res)),
+				instantiate_below->dest))
 	res = chrec_dont_know;
     }
 
@@ -2450,7 +2500,7 @@  instantiate_scev_name (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */
 
 static tree
-instantiate_scev_poly (basic_block instantiate_below,
+instantiate_scev_poly (edge instantiate_below,
 		       struct loop *evolution_loop, struct loop *,
 		       tree chrec, bool *fold_conversions, int size_expr)
 {
@@ -2495,7 +2545,7 @@  instantiate_scev_poly (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */
 
 static tree
-instantiate_scev_binary (basic_block instantiate_below,
+instantiate_scev_binary (edge instantiate_below,
 			 struct loop *evolution_loop, struct loop *inner_loop,
 			 tree chrec, enum tree_code code,
 			 tree type, tree c0, tree c1,
@@ -2541,43 +2591,6 @@  instantiate_scev_binary (basic_block ins
 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
    and EVOLUTION_LOOP, that were left under a symbolic form.
 
-   "CHREC" is an array reference to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_array_ref (basic_block instantiate_below,
-		       struct loop *evolution_loop, struct loop *inner_loop,
-		       tree chrec, bool *fold_conversions, int size_expr)
-{
-  tree res;
-  tree index = TREE_OPERAND (chrec, 1);
-  tree op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-				 inner_loop, index,
-				 fold_conversions, size_expr);
-
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (chrec && op1 == index)
-    return chrec;
-
-  res = unshare_expr (chrec);
-  TREE_OPERAND (res, 1) = op1;
-  return res;
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
    "CHREC" that stands for a convert expression "(TYPE) OP" is to be
    instantiated.
 
@@ -2592,7 +2605,7 @@  instantiate_array_ref (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */
 
 static tree
-instantiate_scev_convert (basic_block instantiate_below,
+instantiate_scev_convert (edge instantiate_below,
 			  struct loop *evolution_loop, struct loop *inner_loop,
 			  tree chrec, tree type, tree op,
 			  bool *fold_conversions, int size_expr)
@@ -2643,7 +2656,7 @@  instantiate_scev_convert (basic_block in
    instantiated, and to stop if it exceeds some limit.  */
 
 static tree
-instantiate_scev_not (basic_block instantiate_below,
+instantiate_scev_not (edge instantiate_below,
 		      struct loop *evolution_loop, struct loop *inner_loop,
 		      tree chrec,
 		      enum tree_code code, tree type, tree op,
@@ -2681,130 +2694,6 @@  instantiate_scev_not (basic_block instan
 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
    and EVOLUTION_LOOP, that were left under a symbolic form.
 
-   CHREC is an expression with 3 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_3 (basic_block instantiate_below,
-		    struct loop *evolution_loop, struct loop *inner_loop,
-		    tree chrec,
-		    bool *fold_conversions, int size_expr)
-{
-  tree op1, op2;
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-				 inner_loop, TREE_OPERAND (chrec, 0),
-				 fold_conversions, size_expr);
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-			    inner_loop, TREE_OPERAND (chrec, 1),
-			    fold_conversions, size_expr);
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op2 = instantiate_scev_r (instantiate_below, evolution_loop,
-			    inner_loop, TREE_OPERAND (chrec, 2),
-			    fold_conversions, size_expr);
-  if (op2 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0)
-      && op1 == TREE_OPERAND (chrec, 1)
-      && op2 == TREE_OPERAND (chrec, 2))
-    return chrec;
-
-  return fold_build3 (TREE_CODE (chrec),
-		      TREE_TYPE (chrec), op0, op1, op2);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
-   CHREC is an expression with 2 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_2 (basic_block instantiate_below,
-		    struct loop *evolution_loop, struct loop *inner_loop,
-		    tree chrec,
-		    bool *fold_conversions, int size_expr)
-{
-  tree op1;
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-				 inner_loop, TREE_OPERAND (chrec, 0),
-				 fold_conversions, size_expr);
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-			    inner_loop, TREE_OPERAND (chrec, 1),
-			    fold_conversions, size_expr);
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0)
-      && op1 == TREE_OPERAND (chrec, 1))
-    return chrec;
-
-  return fold_build2 (TREE_CODE (chrec), TREE_TYPE (chrec), op0, op1);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
-   CHREC is an expression with 2 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_1 (basic_block instantiate_below,
-		    struct loop *evolution_loop, struct loop *inner_loop,
-		    tree chrec,
-		    bool *fold_conversions, int size_expr)
-{
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-				 inner_loop, TREE_OPERAND (chrec, 0),
-				 fold_conversions, size_expr);
-
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0))
-    return chrec;
-
-  return fold_build1 (TREE_CODE (chrec), TREE_TYPE (chrec), op0);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
    CHREC is the scalar evolution to instantiate.
 
    CACHE is the cache of already instantiated values.
@@ -2818,7 +2707,7 @@  instantiate_scev_1 (basic_block instanti
    instantiated, and to stop if it exceeds some limit.  */
 
 static tree
-instantiate_scev_r (basic_block instantiate_below,
+instantiate_scev_r (edge instantiate_below,
 		    struct loop *evolution_loop, struct loop *inner_loop,
 		    tree chrec,
 		    bool *fold_conversions, int size_expr)
@@ -2870,50 +2759,20 @@  instantiate_scev_r (basic_block instanti
 				   fold_conversions, size_expr);
 
     case ADDR_EXPR:
+      if (is_gimple_min_invariant (chrec))
+	return chrec;
+      /* Fallthru.  */
     case SCEV_NOT_KNOWN:
       return chrec_dont_know;
 
     case SCEV_KNOWN:
       return chrec_known;
 
-    case ARRAY_REF:
-      return instantiate_array_ref (instantiate_below, evolution_loop,
-				    inner_loop, chrec,
-				    fold_conversions, size_expr);
-
-    default:
-      break;
-    }
-
-  if (VL_EXP_CLASS_P (chrec))
-    return chrec_dont_know;
-
-  switch (TREE_CODE_LENGTH (TREE_CODE (chrec)))
-    {
-    case 3:
-      return instantiate_scev_3 (instantiate_below, evolution_loop,
-				 inner_loop, chrec,
-				 fold_conversions, size_expr);
-
-    case 2:
-      return instantiate_scev_2 (instantiate_below, evolution_loop,
-				 inner_loop, chrec,
-				 fold_conversions, size_expr);
-
-    case 1:
-      return instantiate_scev_1 (instantiate_below, evolution_loop,
-				 inner_loop, chrec,
-				 fold_conversions, size_expr);
-
-    case 0:
-      return chrec;
-
     default:
-      break;
+      if (CONSTANT_CLASS_P (chrec))
+	return chrec;
+      return chrec_dont_know;
     }
-
-  /* Too complicated to handle.  */
-  return chrec_dont_know;
 }
 
 /* Analyze all the parameters of the chrec that were left under a
@@ -2923,7 +2782,7 @@  instantiate_scev_r (basic_block instanti
    a function parameter.  */
 
 tree
-instantiate_scev (basic_block instantiate_below, struct loop *evolution_loop,
+instantiate_scev (edge instantiate_below, struct loop *evolution_loop,
 		  tree chrec)
 {
   tree res;
@@ -2931,8 +2790,10 @@  instantiate_scev (basic_block instantiat
   if (dump_file && (dump_flags & TDF_SCEV))
     {
       fprintf (dump_file, "(instantiate_scev \n");
-      fprintf (dump_file, "  (instantiate_below = %d)\n", instantiate_below->index);
-      fprintf (dump_file, "  (evolution_loop = %d)\n", evolution_loop->num);
+      fprintf (dump_file, "  (instantiate_below = %d -> %d)\n",
+	       instantiate_below->src->index, instantiate_below->dest->index);
+      if (evolution_loop)
+	fprintf (dump_file, "  (evolution_loop = %d)\n", evolution_loop->num);
       fprintf (dump_file, "  (chrec = ");
       print_generic_expr (dump_file, chrec);
       fprintf (dump_file, ")\n");
@@ -2980,7 +2841,7 @@  resolve_mixers (struct loop *loop, tree
       destr = true;
     }
 
-  tree ret = instantiate_scev_r (block_before_loop (loop), loop, NULL,
+  tree ret = instantiate_scev_r (loop_preheader_edge (loop), loop, NULL,
 				 chrec, &fold_conversions, 0);
 
   if (folded_casts && !*folded_casts)
Index: gcc/tree-scalar-evolution.h
===================================================================
--- gcc/tree-scalar-evolution.h	(revision 253645)
+++ gcc/tree-scalar-evolution.h	(working copy)
@@ -30,7 +30,7 @@  extern void scev_reset (void);
 extern void scev_reset_htab (void);
 extern void scev_finalize (void);
 extern tree analyze_scalar_evolution (struct loop *, tree);
-extern tree instantiate_scev (basic_block, struct loop *, tree);
+extern tree instantiate_scev (edge, struct loop *, tree);
 extern tree resolve_mixers (struct loop *, tree, bool *);
 extern void gather_stats_on_scev_database (void);
 extern void final_value_replacement_loop (struct loop *);
@@ -60,7 +60,7 @@  block_before_loop (loop_p loop)
 static inline tree
 instantiate_parameters (struct loop *loop, tree chrec)
 {
-  return instantiate_scev (block_before_loop (loop), loop, chrec);
+  return instantiate_scev (loop_preheader_edge (loop), loop, chrec);
 }
 
 /* Returns the loop of the polynomial chrec CHREC.  */
Index: gcc/tree-ssa-loop-prefetch.c
===================================================================
--- gcc/tree-ssa-loop-prefetch.c	(revision 253645)
+++ gcc/tree-ssa-loop-prefetch.c	(working copy)
@@ -1632,7 +1632,8 @@  determine_loop_nest_reuse (struct loop *
   for (gr = refs; gr; gr = gr->next)
     for (ref = gr->refs; ref; ref = ref->next)
       {
-	dr = create_data_ref (nest, loop_containing_stmt (ref->stmt),
+	dr = create_data_ref (loop_preheader_edge (nest),
+			      loop_containing_stmt (ref->stmt),
 			      ref->mem, ref->stmt, !ref->write_p, false);
 
 	if (dr)