diff mbox

[graphite] RFC: Add ISL variants of remaining PPL things

Message ID alpine.LNX.2.00.1206261203020.25982@wotan.suse.de
State New
Headers show

Commit Message

Michael Matz June 26, 2012, 10:21 a.m. UTC
Hi,

so, to make progress on the graphite front we want to get rid of the ppl 
dependency.  We start from Tobis move-to-isl-and-isl-scheduler branch at 
github, merged current trunk into that (see also Richis mails about 
cloog/isl configury), and add ISL implementations of the essential parts 
that still are ppl-only.  These are the number of iteration analysis for 
potentially transformed iteration spaces (this wasn't mentioned in Tobis 
mail from February) and the interchange heuristic (basically stride 
calculation for transformed spaces).

So I learned PPL and ISL and rewrote that part :)  Like in the patch 
below.  I'm not asking for any proper review for this, but rather if the 
isl code makes sort of sense.  The testsuite works, except for pr43012.c, 
which exposes the fact that the current PPL nr-iter implementation is 
buggy (it computes the nr-iter of a 77<=i<=88 loop to be 65), so that 
testcase runs into the nr-iter-ppl==nr-iter-isl assert.

In a private branch I have already removed all PPL code whatsoever (and 
cleaned up the ISL code I added) so we make good progress there.


Ciao,
Michael.

Comments

Tobias Grosser June 29, 2012, 1:35 p.m. UTC | #1
On 06/26/2012 12:21 PM, Michael Matz wrote:
> Hi,
>
> so, to make progress on the graphite front we want to get rid of the ppl
> dependency.  We start from Tobis move-to-isl-and-isl-scheduler branch at
> github, merged current trunk into that (see also Richis mails about
> cloog/isl configury), and add ISL implementations of the essential parts
> that still are ppl-only.  These are the number of iteration analysis for
> potentially transformed iteration spaces (this wasn't mentioned in Tobis
> mail from February) and the interchange heuristic (basically stride
> calculation for transformed spaces).
>
> So I learned PPL and ISL and rewrote that part :)  Like in the patch
> below.  I'm not asking for any proper review for this, but rather if the
> isl code makes sort of sense.  The testsuite works, except for pr43012.c,
> which exposes the fact that the current PPL nr-iter implementation is
> buggy (it computes the nr-iter of a 77<=i<=88 loop to be 65), so that
> testcase runs into the nr-iter-ppl==nr-iter-isl assert.
>
> In a private branch I have already removed all PPL code whatsoever (and
> cleaned up the ISL code I added) so we make good progress there.

Hi Michael,

I am very impressed with how little time it took you to implement this 
patch. As it is working for several cases, you got a lot of stuff right. ;-)

I will still add some comments to the patch.


> +static isl_constraint *
> +build_linearized_memory_access (isl_map *map, poly_dr_p pdr)
> +{
> +  isl_constraint *res;
> +  isl_local_space *ls = isl_local_space_from_space (isl_map_get_space (map));
> +  unsigned offset, nsubs;
> +  int i;
> +  isl_int size, subsize;
> +
> +  res = isl_equality_alloc (ls);
> +  isl_int_init (size);
> +  isl_int_set_ui (size, 1);
> +  isl_int_init (subsize);
> +  isl_int_set_ui (subsize, 1);
> +
> +  nsubs = isl_set_dim (pdr->extent, isl_dim_set);
> +  /* -1 for the already included L dimension.  */
> +  offset = isl_map_dim (map, isl_dim_out) - 1 - nsubs;
> +  res = isl_constraint_set_coefficient_si (res, isl_dim_out, offset + nsubs, -1);
> +  /* Go through all subscripts from last to first.  First dimension
> +     is the alias set, ignore it.  */
> +  for (i = nsubs - 1; i>= 1; i--)
> +    {
> +      isl_space *dc;
> +      isl_aff *aff;
> +      enum isl_lp_result lpres;
> +
> +      res = isl_constraint_set_coefficient (res, isl_dim_out, offset + i, size);
> +
> +      dc = isl_set_get_space (pdr->extent);
> +      aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
> +      aff = isl_aff_set_coefficient_si (aff, isl_dim_in, i, 1);
> +      lpres = isl_set_max (pdr->extent, aff,&subsize);
> +      isl_aff_free (aff);
> +      isl_int_mul (size, size, subsize);
> +    }
> +
> +  isl_int_clear (subsize);
> +  isl_int_clear (size);
> +
> +  return res;
> +}

The function itself looks correct. However, I would personally have 
returned an isl_map instead of an isl_constraint. A map
{A[i,j] -> [200 * i]} is probably a nicer way to represent a 
transformation from the higher level array type to an actual memory 
reference.


> +static void
> +pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
> +{
> +  poly_bb_p pbb = PDR_PBB (pdr);
> +  isl_map *map;
> +  isl_set *set;
> +  isl_aff *aff;
> +  isl_space *dc;
> +  isl_constraint *lma, *c;
> +  isl_int islstride;
> +  enum isl_lp_result lpres;
> +  graphite_dim_t time_depth;
> +  unsigned offset, nt;
> +  unsigned i;
> +  /* pdr->accesses:    [P1..nb_param,I1..nb_domain]->[a,S1..nb_subscript]
> +          ??? [P] not used for PDRs?
> +     pdr->extent:      [a,S1..nb_subscript]
> +     pbb->domain:      [P1..nb_param,I1..nb_domain]
> +     pbb->transformed: [P1..nb_param,I1..nb_domain]->[T1..Tnb_sctr]
> +          [T] includes local vars (currently unused)
> +
> +     First we create [P,I] ->  [T,a,S].  */
> +
> +  map = isl_map_flat_range_product (isl_map_copy (pbb->transformed),
> +				    isl_map_copy (pdr->accesses));
> +  /* Add a dimension for L: [P,I] ->  [T,a,S,L].*/
> +  map = isl_map_add_dims (map, isl_dim_out, 1);
> +  /* Build a constraint for "lma[S] - L == 0", effectively calculating
> +     L in terms of subscripts.  */
> +  lma = build_linearized_memory_access (map, pdr);
> +  /* And add it to the map, so we now have:
> +     [P,I] ->  [T,a,S,L] : lma([S]) == L.  */
> +  map = isl_map_add_constraint (map, lma);
> +
> +  /* Then we create  [P,I,P',I'] ->  [T,a,S,L,T',a',S',L'].  */
> +  map = isl_map_flat_product (map, isl_map_copy (map));

Your analysis is here a 1:1 mapping from PPL to isl. That works, but is 
probably not the nicest/conceptually cleanest way to implement this. In 
general, in isl we try to avoid adding/removing individual dimensions 
explicitly. The canonical way to express such transformations is to 
create a new map and to "apply" this map on the set/map you want to 
modify. For this algorithm, I would start with the pdr->accesses map and 
'apply' pbb->transformed to map the original iteration space into the 
scattering space. I would than apply the isl_map calculated by 
build_linearized_memory_access to map the array to the actual data 
location accessed. Now you create a map 'scattering->scattering' that 
maps from one scattering time point to the next. And apply the map 
'scattering->memory' on both the range and the domain. You will end up 
with a map memory -> memory. Here you can use the deltas function to 
give you the distances. ;-)

You can have a look in Polly, where I implement a similar calculation:

http://repo.or.cz/w/polly-mirror.git/blob/HEAD:/lib/Analysis/ScopInfo.cpp#l390

I wrote the stride detection code. You are allowed to use a copy of it 
for GCC under my copyright assignment.


>   /* Sets STRIDES to the sum of all the strides of the data references
> diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
> index 4520bbb..bf1854c 100644
> --- a/gcc/graphite-poly.c
> +++ b/gcc/graphite-poly.c
> @@ -26,6 +26,8 @@ along with GCC; see the file COPYING3.  If not see
>   #include<isl/map.h>
>   #include<isl/union_map.h>
>   #include<isl/constraint.h>
> +#include<isl/ilp.h>
> +#include<isl/aff.h>
>   #include<cloog/cloog.h>
>   #include<cloog/isl/domain.h>
>   #endif
> @@ -1573,6 +1575,25 @@ debug_scop_params (scop_p scop, int verbosity)
>     print_scop_params (stderr, scop, verbosity);
>   }
>
> +extern isl_ctx *the_isl_ctx;
> +void print_isl_set (FILE *f, isl_set *set);
> +void print_isl_map (FILE *f, isl_map *map);
> +void
> +print_isl_set (FILE *f, isl_set *set)
> +{
> +  isl_printer *p = isl_printer_to_file (the_isl_ctx, f);
> +  p = isl_printer_print_set (p, set);
> +  isl_printer_free (p);
> +}
> +
> +void
> +print_isl_map (FILE *f, isl_map *map)
> +{
> +  isl_printer *p = isl_printer_to_file (the_isl_ctx, f);
> +  p = isl_printer_print_map (p, map);
> +  isl_printer_free (p);
> +}

Are you introducing these functions for debugging purposes. If this is 
the case, you can either use isl_set_dump(set), isl_map_dump(map), ..
or you can use the gdb islprint command that should be made available by 
the gdb python script, installed by isl.

Also, you can get the_isl_ctx, by calling isl_map_get_ctx(map).

> @@ -1704,6 +1753,7 @@ pbb_number_of_iterations_at_time (poly_bb_p pbb,
>        it to the scattering.  */
>     ppl_new_Pointset_Powerset_C_Polyhedron_from_C_Polyhedron
>       (&sctr_lb, PBB_TRANSFORMED_SCATTERING (pbb));
> +  lbmap = isl_map_copy (pbb->transformed);
>     for (i = 0; i<  (int) dim_iter_domain; i++)
>       {
>         ppl_Linear_Expression_t eq;
> @@ -1732,6 +1782,23 @@ pbb_number_of_iterations_at_time (poly_bb_p pbb,
>         ppl_delete_Pointset_Powerset_C_Polyhedron (pph);
>         ppl_delete_Constraint (pc);
>         ppl_delete_Constraint_System (cs);
> +	{
> +	  isl_space *dc = isl_set_get_space (pbb->domain);
> +	  isl_aff *aff;
> +	  isl_int isllb;
> +	  enum isl_lp_result res;
> +	  isl_constraint *c;
> +	  isl_int_init (isllb);
> +	  aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
> +	  aff = isl_aff_set_coefficient_si (aff, isl_dim_in, i, 1);
> +	  res = isl_set_min (pbb->domain, aff,&isllb);
> +	  c = isl_equality_alloc (isl_local_space_from_space (isl_map_get_space (pbb->transformed)));
> +	  c = isl_constraint_set_coefficient_si (c, isl_dim_in, i, -1);
> +	  c = isl_constraint_set_constant (c, isllb);
> +	  lbmap = isl_map_add_constraint (lbmap, c);
> +	  isl_int_clear (isllb);
> +	  isl_aff_free (aff);
> +	}
>       }
>
>     /* Extract the number of iterations.  */
> @@ -1741,6 +1808,56 @@ pbb_number_of_iterations_at_time (poly_bb_p pbb,
>     ppl_max_for_le_pointset (sctr_ub, le, ub);
>     mpz_sub (res, ub, lb);
>
> +    {
> +      isl_space *dc;
> +      isl_aff *aff;
> +      isl_int isllb, islub, islres;
> +      enum isl_lp_result lpres;
> +
> +      isl_int_init (isllb);
> +      isl_int_init (islub);
> +      isl_int_init (islres);
> +
> +      lbset = isl_map_range (lbmap);
> +      ubset = isl_map_range (ubmap);
> +
> +      dc = isl_set_get_space (lbset);
> +      aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
> +      aff = isl_aff_set_coefficient_si (aff, isl_dim_in, time_depth, 1);
> +      lpres = isl_set_min (lbset, aff,&isllb);
> +      lpres = isl_set_max (ubset, aff,&islub);
> +
> +      isl_int_sub (islres, islub, isllb);
> +
> +      isl_int_set_gmp (isllb, res);
> +
> +      /* This would ideally hold, but the PPL code actually is buggy
> +         (returns upper bound 0 for unanalyzable domains), and incomplete,
> +	 pdr_add_data_dimensions can constrain the ISL domain more than
> +	 the PPL domain.  So, with ISL we get better results on pr37485.c.  */
> +      /*gcc_assert (isl_int_eq (isllb, islres));*/
> +
> +      /* Even shorter and more correct method: map the iteration domain
> +         through the current scatter, and work on the resulting set.  */
> +      isl_set_free (lbset);
> +      lbset = isl_set_apply (isl_set_copy (pbb->domain),
> +			     isl_map_copy (pbb->transformed));
> +      lpres = isl_set_min (lbset, aff,&isllb);
> +      lpres = isl_set_max (lbset, aff,&islub);
> +
> +      isl_int_sub (islub, islub, isllb);
> +      isl_int_add_ui (islub, islub, 1);
> +      gcc_assert (isl_int_eq (islub, islres));
> +
> +      isl_int_clear (isllb);
> +      isl_int_clear (islub);
> +      isl_int_clear (islres);
> +      isl_aff_free (aff);
> +
> +      isl_set_free (lbset);
> +      isl_set_free (ubset);
> +    }

This is again modeling the PPL code. There is nothing wrong with this, 
however you may be able to simplify the code by using isl_*_deltas.

> +
>     mpz_clear (one);
>     mpz_clear (diff);
>     mpz_clear (lb);
> diff --git a/gcc/graphite.c b/gcc/graphite.c
> index dd98f5e..821efb1 100644
> --- a/gcc/graphite.c
> +++ b/gcc/graphite.c
> @@ -253,6 +253,8 @@ graphite_finalize (bool need_cfg_cleanup_p)
>       print_loops (dump_file, 3);
>   }
>
> +isl_ctx *the_isl_ctx;

Why are you creating a global ctx here?

Thanks a lot Michael for working on this.

Tobi
Michael Matz July 2, 2012, 11:32 a.m. UTC | #2
Hi Tobi,

On Fri, 29 Jun 2012, Tobias Grosser wrote:

> > +static isl_constraint *
> > +build_linearized_memory_access (isl_map *map, poly_dr_p pdr)
> > +{
> > +}
> 
> The function itself looks correct. However, I would personally have returned
> an isl_map instead of an isl_constraint.

As you noticed I modelled all my isl code after the existing PPL code 
(which returned an expression here, turned into a conflict in the caller).  
I couldn't learn the polyhedral basics, PPL basics, isl basics _and_ the 
correct isl idioms on one weekend :)  So many thanks for the hints.

> A map {A[i,j] -> [200 * i]} is probably a nicer way to represent a 
> transformation from the higher level array type to an actual memory 
> reference.

Hmm.  This ignores the 'j' input dimension.  Can I also have a map ala
{A[i,j] -> [200 * i + j]}?  I don't directly see how, as for that I'd need 
at least another dimension (the 'L' in the callers map).

> > +pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
> > +{
> > +  poly_bb_p pbb = PDR_PBB (pdr);
> > +  isl_map *map;
> > +  isl_set *set;
> > +  isl_aff *aff;
> > +  isl_space *dc;
> > +  isl_constraint *lma, *c;
> > +  isl_int islstride;
> > +  enum isl_lp_result lpres;
> > +  graphite_dim_t time_depth;
> > +  unsigned offset, nt;
> > +  unsigned i;
> > +  /* pdr->accesses:    [P1..nb_param,I1..nb_domain]->[a,S1..nb_subscript]
> > +          ??? [P] not used for PDRs?
> > +     pdr->extent:      [a,S1..nb_subscript]
> > +     pbb->domain:      [P1..nb_param,I1..nb_domain]
> > +     pbb->transformed: [P1..nb_param,I1..nb_domain]->[T1..Tnb_sctr]
> > +          [T] includes local vars (currently unused)
> > +
> > +     First we create [P,I] ->  [T,a,S].  */
> > +
> > +  map = isl_map_flat_range_product (isl_map_copy (pbb->transformed),
> > +				    isl_map_copy (pdr->accesses));
> > +  /* Add a dimension for L: [P,I] ->  [T,a,S,L].*/
> > +  map = isl_map_add_dims (map, isl_dim_out, 1);
> > +  /* Build a constraint for "lma[S] - L == 0", effectively calculating
> > +     L in terms of subscripts.  */
> > +  lma = build_linearized_memory_access (map, pdr);
> > +  /* And add it to the map, so we now have:
> > +     [P,I] ->  [T,a,S,L] : lma([S]) == L.  */
> > +  map = isl_map_add_constraint (map, lma);
> > +
> > +  /* Then we create  [P,I,P',I'] ->  [T,a,S,L,T',a',S',L'].  */
> > +  map = isl_map_flat_product (map, isl_map_copy (map));
> 
> Your analysis is here a 1:1 mapping from PPL to isl. That works, but is
> probably not the nicest/conceptually cleanest way to implement this. In
> general, in isl we try to avoid adding/removing individual dimensions
> explicitly.

Yeah, I noticed that, but didn't know how to get around adding the L 
dimension (I at least used flat_product for creating the X' dimensions 
instead of explicit dimension insertions and moving).

> The canonical way to express such transformations is to create a
> new map and to "apply" this map on the set/map you want to modify. For this
> algorithm, I would start with the pdr->accesses map and 'apply'
> pbb->transformed to map the original iteration space into the scattering
> space.

Let's see if I understand:  So, we have [PI]->[aS] (Params, Iteration 
domain, alias set, Subscript) as accesses, and we have [PI]->[T] 
(Transformed) as the scattering.  Indeed what I ideally wanted to have is 
a [T]->[aS] mapping, but I couldn't see how to construct such.  So you're 
saying that I can apply the PI->T to the domain of PI->aS, and that gives 
me the T->aS?  Probably I was confused by the functional syntax (the ->), 
as in normal functions you can't simply apply a function to a functions 
domain and expect to get anything sensible out of that.  I see that in 
your code you reverse the scattering first (T->PI), IIRC that's one thing 
I also tried, but I got stuck somewhere.

> I would than apply the isl_map calculated by 
> build_linearized_memory_access to map the array to the actual data 
> location accessed. Now you create a map 'scattering->scattering' that 
> maps from one scattering time point to the next. And apply the map 
> 'scattering->memory' on both the range and the domain. You will end up 
> with a map memory -> memory.

Yeah.  If my above understanding is correct the path is clear.

> Here you can use the deltas function to give you the distances. ;-)

I couldn't figure out from isls user manual what exactly the delta 
functions do :)  From the description above I think I understand now.  Let 
me experiment a bit more.

> You can have a look in Polly, where I implement a similar calculation:
> 
> http://repo.or.cz/w/polly-mirror.git/blob/HEAD:/lib/Analysis/ScopInfo.cpp#l390

The games you play with projecting out some dimensions are related to 
which dimension you want to calculate the stride, right?  (As you always 
calculate the next scatterting to have n+1 in the last dimension, not an 
arbitrary one).

> > +print_isl_map (FILE *f, isl_map *map)
> 
> If this is the
> case, you can either use isl_set_dump(set), isl_map_dump(map), ..

There functions weren't documented.  Thanks.  But I need a file argument 
anyway as the plan is to use them also for debug dumps.

> Also, you can get the_isl_ctx, by calling isl_map_get_ctx(map).

Indeed, that occurred to me only later.

> > +      /* Even shorter and more correct method: map the iteration domain
> > +         through the current scatter, and work on the resulting set.  */
> > +      isl_set_free (lbset);
> > +      lbset = isl_set_apply (isl_set_copy (pbb->domain),
> > +			     isl_map_copy (pbb->transformed));
> > +      lpres = isl_set_min (lbset, aff,&isllb);
> > +      lpres = isl_set_max (lbset, aff,&islub);
> > +
> > +      isl_int_sub (islub, islub, isllb);
> > +      isl_int_add_ui (islub, islub, 1);
> 
> This is again modeling the PPL code.

Actually the PPL code translation is the one I removed from my reply here.  
The above is what made that code dead and what is in the final version 
(with renamed variables and such).  Is using _deltas really better?  I 
would have to construct a map from 
lexmin(transformed-domain)->lexmax(transformed-domain), then apply 
_delts and the still extract the number.

> > +++ b/gcc/graphite.c
> > @@ -253,6 +253,8 @@ graphite_finalize (bool need_cfg_cleanup_p)
> >       print_loops (dump_file, 3);
> >   }
> > 
> > +isl_ctx *the_isl_ctx;
> 
> Why are you creating a global ctx here?

For the printer, before I was aware of the fact that of course all isl 
objects know about their context.


Ciao,
Michael.
Tobias Grosser July 2, 2012, 2:37 p.m. UTC | #3
On 07/02/2012 01:32 PM, Michael Matz wrote:
> Hi Tobi,
>
> On Fri, 29 Jun 2012, Tobias Grosser wrote:
>
>>> +static isl_constraint *
>>> +build_linearized_memory_access (isl_map *map, poly_dr_p pdr)
>>> +{
>>> +}
>>
>> The function itself looks correct. However, I would personally have returned
>> an isl_map instead of an isl_constraint.
>
> As you noticed I modelled all my isl code after the existing PPL code
> (which returned an expression here, turned into a conflict in the caller).
> I couldn't learn the polyhedral basics, PPL basics, isl basics _and_ the
> correct isl idioms on one weekend :)  So many thanks for the hints.

No worries, I believe a direct translation is a good start. we can 
improve later on that.

>> A map {A[i,j] ->  [200 * i]} is probably a nicer way to represent a
>> transformation from the higher level array type to an actual memory
>> reference.
>
> Hmm.  This ignores the 'j' input dimension.  Can I also have a map ala
> {A[i,j] ->  [200 * i + j]}?  I don't directly see how, as for that I'd need
> at least another dimension (the 'L' in the callers map).

Yes, you can have this: A[i,j] ->  [200 * i + j]}

It is basically a pretty printing of:

{A[i,j] ->  [o]: o = 200 * i + j}

>> The canonical way to express such transformations is to create a
>> new map and to "apply" this map on the set/map you want to modify. For this
>> algorithm, I would start with the pdr->accesses map and 'apply'
>> pbb->transformed to map the original iteration space into the scattering
>> space.
>
> Let's see if I understand:  So, we have [PI]->[aS] (Params, Iteration
> domain, alias set, Subscript) as accesses, and we have [PI]->[T]
> (Transformed) as the scattering.  Indeed what I ideally wanted to have is
> a [T]->[aS] mapping, but I couldn't see how to construct such.  So you're
> saying that I can apply the PI->T to the domain of PI->aS, and that gives
> me the T->aS?

You start with a [P] -> {[I] -> [aS]} as accesses and a [P] -> {[I] -> 
[T]}. To get [P] -> {[T] -> [aS]} you do the following:

isl_map *Accesses = "[P] -> {[I] -> [aS]}"
isl_map *Scattering = " [P] -> {[I] -> [T]}"
isl_map *TimeToAccess = isl_map_apply_domain(Accesses, Scattering)
isl_map_dump(TimeToAccess);

$ [P] { [T]->[aS] }

> Probably I was confused by the functional syntax (the ->),
> as in normal functions you can't simply apply a function to a functions
> domain and expect to get anything sensible out of that.

No, all those are relations. Reverse and combine them works in general 
very well.

 > I see that in
> your code you reverse the scattering first (T->PI), IIRC that's one thing
> I also tried, but I got stuck somewhere.

I don't think reversing is necessary in your case. As the input space of 
the scattering, which is "[I]", matches the domain of the access map. So 
you can directly use isl_map_apply_domain.

>> I would than apply the isl_map calculated by
>> build_linearized_memory_access to map the array to the actual data
>> location accessed. Now you create a map 'scattering->scattering' that
>> maps from one scattering time point to the next. And apply the map
>> 'scattering->memory' on both the range and the domain. You will end up
>> with a map memory ->  memory.
>
> Yeah.  If my above understanding is correct the path is clear.

I believe it is.

>> Here you can use the deltas function to give you the distances. ;-)
>
> I couldn't figure out from isls user manual what exactly the delta
> functions do :)  From the description above I think I understand now.  Let
> me experiment a bit more.

The delta function calculates the following:

Input: {[a1, a2, a3] ->  [b1, b2, b3]: <constraints between a and b> }
Output {[x1, x2, x3] exists a1, a2, a3, b1, b2, b3 : x1 = b1 - a1 and x2 
= b2 - a2 and x3 = b3 - a3 and <constraints between a and b> }

So basically it calculates the distances between each possible pair of 
input and output values.

>> You can have a look in Polly, where I implement a similar calculation:
>>
>> http://repo.or.cz/w/polly-mirror.git/blob/HEAD:/lib/Analysis/ScopInfo.cpp#l390
>
> The games you play with projecting out some dimensions are related to
> which dimension you want to calculate the stride, right?  (As you always
> calculate the next scatterting to have n+1 in the last dimension, not an
> arbitrary one).

I always calculate the stride in the last dimension. This is less 
general as what you are trying to calculate. The projecting out of 
dimensions is just needed due to the ugly CLooG interface, which gives 
me a mixture or scattering and domain dimensions. You should need it in gcc.

>>> +print_isl_map (FILE *f, isl_map *map)
>>
>> If this is the
>> case, you can either use isl_set_dump(set), isl_map_dump(map), ..
>
> There functions weren't documented.  Thanks.  But I need a file argument
> anyway as the plan is to use them also for debug dumps.

OK.

>>> +      /* Even shorter and more correct method: map the iteration domain
>>> +         through the current scatter, and work on the resulting set.  */
>>> +      isl_set_free (lbset);
>>> +      lbset = isl_set_apply (isl_set_copy (pbb->domain),
>>> +			     isl_map_copy (pbb->transformed));
>>> +      lpres = isl_set_min (lbset, aff,&isllb);
>>> +      lpres = isl_set_max (lbset, aff,&islub);
>>> +
>>> +      isl_int_sub (islub, islub, isllb);
>>> +      isl_int_add_ui (islub, islub, 1);
>>
>> This is again modeling the PPL code.
>
> Actually the PPL code translation is the one I removed from my reply here.
> The above is what made that code dead and what is in the final version
> (with renamed variables and such).  Is using _deltas really better?  I
> would have to construct a map from
> lexmin(transformed-domain)->lexmax(transformed-domain), then apply
> _delts and the still extract the number.

Yes, you are right. For this specific case, the benefits are less clear.

Cheers
Tobi
Tobias Grosser July 2, 2012, 2:39 p.m. UTC | #4
On 07/02/2012 01:32 PM, Michael Matz wrote:
>>> +++ b/gcc/graphite.c
>>> @@ -253,6 +253,8 @@ graphite_finalize (bool need_cfg_cleanup_p)
>>>        print_loops (dump_file, 3);
>>>    }
>>>
>>> +isl_ctx *the_isl_ctx;
>>
>> Why are you creating a global ctx here?
>
> For the printer, before I was aware of the fact that of course all isl
> objects know about their context.

Ah, I forgot. Can you commit a patch, that removes the_isl_ctx from 
graphite?

Tobi
Michael Matz July 3, 2012, 2:05 p.m. UTC | #5
Hi,

On Mon, 2 Jul 2012, Tobias Grosser wrote:

> > Yeah.  If my above understanding is correct the path is clear.
> 
> I believe it is.

It somewhat works.  It has problems in those cases where there are 
dependencies between different input dimensions in the scattering, which 
happens for strip-mining.  interchange-8.c for instance.

The core is:

  isl_map *subs = isl_map_copy (pdr->accesses);
  subs = isl_map_intersect_range (subs, isl_set_copy (pdr->extent));
  subs = isl_map_apply_domain (subs, isl_map_copy (pbb->transformed));
  subs = isl_map_apply_range (subs, build_linearized_memory_access (pdr));

    So, subs is now the [T]->address mapping.

  space = isl_space_range (isl_map_get_space (pbb->transformed));
  t_to_t1 = isl_map_universe (isl_space_map_from_set (space));
  space = isl_map_get_space (t_to_t1);
  c = isl_equality_alloc (isl_local_space_from_space (space));
  c = isl_constraint_set_coefficient_si (c, isl_dim_in, time_depth, 1);
  c = isl_constraint_set_coefficient_si (c, isl_dim_out, time_depth, -1);
  c = isl_constraint_set_constant_si (c, 1);
  t_to_t1 = isl_map_add_constraint (t_to_t1, c);

    t_to_t1 is now the T_depth->T_depth+1 mapping, e.g.:
      { [i0, i1, i2, i3, i4] -> [o0, 1 + i1, o2, o3, o4] }

  t_to_t1 = isl_map_apply_domain (t_to_t1, isl_map_copy (subs));
  t_to_t1 = isl_map_apply_range (t_to_t1, subs);

    Now it's mem_addr -> mem_addr' mapping.

  deltas = isl_map_deltas (t_to_t1);

    And deltas is the difference, mem_addr' - mem_addr

  isl_int_init (islstride);
  if (!isl_set_plain_is_fixed (deltas, isl_dim_set, 0, &islstride))
    gcc_unreachable ();

So, I want to extract the difference, and that is what doesn't work.  In 
my initial tries I had this additional loop to construct t_to_t1:

  for (i = 0; i < nt; i++)
    if (i != time_depth)
      t_to_t1 = isl_map_equate (t_to_t1, isl_dim_in, i, isl_dim_out, i);

That is, I enforced also equality on all dimensions except for the 
T_depth's one.  For interchange-0.c I got these sets:

T->addr:   { [0, i1, 0, i3, 0] -> [i1 + 999i3]
             : i1 >= 0 and i3 >= 0 and i1 <= 999 and i3 <= 999 }
t_to_t1:   { [i0, i1, i2, i3, i4] -> [i0, 1 + i1, i2, i3, i4] }
mem->mem': { [i0] -> [1 + i0] : i0 <= 998999 and i0 >= 0 }

And hence deltas indeed contains a single fixed number that I could 
extract.  Of course forcing all transformed domains the same breaks 
exactly where there are interdependencies, so e.g. interchange-8.c breaks.  
So I thought I simply don't equate those.  Now the sets for 
interchange-0.c look like so:

T->addr:   as above
t_to_t1:   { [i0, i1, i2, i3, i4] -> [o0, 1 + i1, o2, o3, o4] }

  Note how only dimension 1 is constrained.

mem->mem': { [i0] -> [o0]
             : exists (e0, e1 = [(998 - i0 + o0 + 999e0)/999]
                       : 999e1 = 998 - i0 + o0 + 999e0
                         and 999e0 <= -999 + i0
                         and e0 >= -1 and 999e0 >= -1997 + i0
                         and e0 <= 998 and 999e0 <= 997003 + i0 - o0
                         and 999e0 >= -998 + i0 - o0) }

Doing deltas on that one gives:

{ [i0] :
  exists (e0, e1 = [(998 + i0 + 999e0)/999]
          : 999e1 = 998 + i0 + 999e0
            and 999e0 >= -998 - i0 and e0 >= -1
            and 999e0 <= 997003 - i0 and e0 <= 998) } > 

So, no fixed number.  Doing isl_set_max/min on that doesn't seem to give 
sensible results (max=998002, min=-998000).  For this specific instance 
it's supposed to give '1'.

Hmm.  Any thoughts? :)


Ciao,
Michael.
diff mbox

Patch

diff --git a/gcc/graphite-interchange.c b/gcc/graphite-interchange.c
index fa38f5c..96a7967 100644
--- a/gcc/graphite-interchange.c
+++ b/gcc/graphite-interchange.c
@@ -24,9 +24,11 @@  along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 
 #ifdef HAVE_cloog
+#include <isl/aff.h>
 #include <isl/set.h>
 #include <isl/map.h>
 #include <isl/union_map.h>
+#include <isl/ilp.h>
 #include <cloog/cloog.h>
 #include <cloog/isl/domain.h>
 #endif
@@ -64,7 +66,7 @@  along with GCC; see the file COPYING3.  If not see
    c_0 * s_0 + c_1 * s_1 + ... c_n * s_n.  */
 
 static ppl_Linear_Expression_t
-build_linearized_memory_access (ppl_dimension_type offset, poly_dr_p pdr)
+ppl_build_linearized_memory_access (ppl_dimension_type offset, poly_dr_p pdr)
 {
   ppl_Linear_Expression_t res;
   ppl_Linear_Expression_t le;
@@ -189,7 +191,7 @@  build_partial_difference (ppl_Pointset_Powerset_C_Polyhedron_t *p,
    the loop at DEPTH.  */
 
 static void
-pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
+ppl_pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
 {
   ppl_dimension_type time_depth;
   ppl_Linear_Expression_t le, lma;
@@ -245,7 +247,7 @@  pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
 
   /* Construct the 0|0|0|0|0|S|0|l1|0 part.  */
   {
-    lma = build_linearized_memory_access (offset + dim_sctr, pdr);
+    lma = ppl_build_linearized_memory_access (offset + dim_sctr, pdr);
     ppl_set_coef (lma, dim_L1, -1);
     ppl_new_Constraint (&new_cstr, lma, PPL_CONSTRAINT_TYPE_EQUAL);
     ppl_Pointset_Powerset_C_Polyhedron_add_constraint (p1, new_cstr);
@@ -310,6 +312,7 @@  pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
     ppl_max_for_le_pointset (p1, le, stride);
   }
 
+#if 0
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
       char *str;
@@ -322,12 +325,159 @@  pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
       mp_get_memory_functions (NULL, NULL, &gmp_free);
       (*gmp_free) (str, strlen (str) + 1);
     }
+#endif
 
   ppl_delete_Pointset_Powerset_C_Polyhedron (p1);
   ppl_delete_Pointset_Powerset_C_Polyhedron (p2);
   ppl_delete_Linear_Expression (le);
 }
 
+static isl_constraint *
+build_linearized_memory_access (isl_map *map, poly_dr_p pdr)
+{
+  isl_constraint *res;
+  isl_local_space *ls = isl_local_space_from_space (isl_map_get_space (map));
+  unsigned offset, nsubs;
+  int i;
+  isl_int size, subsize;
+
+  res = isl_equality_alloc (ls);
+  isl_int_init (size);
+  isl_int_set_ui (size, 1);
+  isl_int_init (subsize);
+  isl_int_set_ui (subsize, 1);
+
+  nsubs = isl_set_dim (pdr->extent, isl_dim_set);
+  /* -1 for the already included L dimension.  */
+  offset = isl_map_dim (map, isl_dim_out) - 1 - nsubs;
+  res = isl_constraint_set_coefficient_si (res, isl_dim_out, offset + nsubs, -1);
+  /* Go through all subscripts from last to first.  First dimension
+     is the alias set, ignore it.  */
+  for (i = nsubs - 1; i >= 1; i--)
+    {
+      isl_space *dc;
+      isl_aff *aff;
+      enum isl_lp_result lpres;
+
+      res = isl_constraint_set_coefficient (res, isl_dim_out, offset + i, size);
+
+      dc = isl_set_get_space (pdr->extent);
+      aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
+      aff = isl_aff_set_coefficient_si (aff, isl_dim_in, i, 1);
+      lpres = isl_set_max (pdr->extent, aff, &subsize);
+      isl_aff_free (aff);
+      isl_int_mul (size, size, subsize);
+    }
+
+  isl_int_clear (subsize);
+  isl_int_clear (size);
+
+  return res;
+}
+
+static void
+pdr_stride_in_loop (mpz_t stride, graphite_dim_t depth, poly_dr_p pdr)
+{
+  poly_bb_p pbb = PDR_PBB (pdr);
+  isl_map *map;
+  isl_set *set;
+  isl_aff *aff;
+  isl_space *dc;
+  isl_constraint *lma, *c;
+  isl_int islstride;
+  enum isl_lp_result lpres;
+  graphite_dim_t time_depth;
+  unsigned offset, nt;
+  unsigned i;
+  /* pdr->accesses:    [P1..nb_param,I1..nb_domain]->[a,S1..nb_subscript]
+          ??? [P] not used for PDRs?
+     pdr->extent:      [a,S1..nb_subscript]
+     pbb->domain:      [P1..nb_param,I1..nb_domain]
+     pbb->transformed: [P1..nb_param,I1..nb_domain]->[T1..Tnb_sctr]
+          [T] includes local vars (currently unused)
+     
+     First we create [P,I] -> [T,a,S].  */
+  
+  map = isl_map_flat_range_product (isl_map_copy (pbb->transformed),
+				    isl_map_copy (pdr->accesses));
+  /* Add a dimension for L: [P,I] -> [T,a,S,L].*/
+  map = isl_map_add_dims (map, isl_dim_out, 1);
+  /* Build a constraint for "lma[S] - L == 0", effectively calculating
+     L in terms of subscripts.  */
+  lma = build_linearized_memory_access (map, pdr);
+  /* And add it to the map, so we now have:
+     [P,I] -> [T,a,S,L] : lma([S]) == L.  */
+  map = isl_map_add_constraint (map, lma);
+
+  /* Then we create  [P,I,P',I'] -> [T,a,S,L,T',a',S',L'].  */
+  map = isl_map_flat_product (map, isl_map_copy (map));
+
+  /* Now add the equality T[time_depth] == T'[time_depth]+1.  This will
+     force L' to be the linear address at T[time_depth] + 1. */
+  time_depth = psct_dynamic_dim (pbb, depth);
+  /* Length of [a,S] plus [L] ...  */
+  offset = 1 + isl_map_dim (pdr->accesses, isl_dim_out);
+  /* ... plus [T].  */
+  offset += isl_map_dim (pbb->transformed, isl_dim_out);
+
+  c = isl_equality_alloc (isl_local_space_from_space (isl_map_get_space (map)));
+  c = isl_constraint_set_coefficient_si (c, isl_dim_out, time_depth, 1);
+  c = isl_constraint_set_coefficient_si (c, isl_dim_out,
+					 offset + time_depth, -1);
+  c = isl_constraint_set_constant_si (c, 1);
+  map = isl_map_add_constraint (map, c);
+
+  /* Now we equate most of the T/T' elements (making PITaSL nearly
+     the same is (PITaSL)', except for one dimension, namely for 'depth'
+     (an index into [I]), after translating to index into [T].  Take care
+     to not produce an empty map, which indicates we wanted to equate
+     two dimensions that are already coupled via the above time_depth
+     dimension.  Happens with strip mining where several scatter dimension
+     are interdependend.  */
+  /* Length of [T].  */
+  nt = pbb_nb_scattering_transform (pbb) + pbb_nb_local_vars (pbb);
+  for (i = 0; i < nt; i++)
+    if (i != time_depth)
+      {
+	isl_map *temp = isl_map_equate (isl_map_copy (map),
+					isl_dim_out, i,
+					isl_dim_out, offset + i);
+	if (isl_map_is_empty (temp))
+	  isl_map_free (temp);
+	else
+	  {
+	    isl_map_free (map);
+	    map = temp;
+	  }
+      }
+
+  /* Now maximize the expression L' - L.  */
+  set = isl_map_range (map);
+  dc = isl_set_get_space (set);
+  aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
+  aff = isl_aff_set_coefficient_si (aff, isl_dim_in, offset - 1, -1);
+  aff = isl_aff_set_coefficient_si (aff, isl_dim_in, offset + offset - 1, 1);
+  isl_int_init (islstride);
+  lpres = isl_set_max (set, aff, &islstride);
+  isl_int_get_gmp (islstride, stride);
+  isl_int_clear (islstride);
+  isl_aff_free (aff);
+  isl_set_free (set);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      char *str;
+      void (*gmp_free) (void *, size_t);
+
+      fprintf (dump_file, "\nStride in BB_%d, DR_%d, depth %d:",
+	       pbb_index (pbb), PDR_ID (pdr), (int) depth);
+      str = mpz_get_str (0, 10, stride);
+      fprintf (dump_file, "  %s ", str);
+      mp_get_memory_functions (NULL, NULL, &gmp_free);
+      (*gmp_free) (str, strlen (str) + 1);
+    }
+}
+
 
 /* Sets STRIDES to the sum of all the strides of the data references
    accessed in LOOP at DEPTH.  */
@@ -339,9 +489,11 @@  memory_strides_in_loop_1 (lst_p loop, graphite_dim_t depth, mpz_t strides)
   lst_p l;
   poly_dr_p pdr;
   mpz_t s, n;
+  mpz_t sold;
 
   mpz_init (s);
   mpz_init (n);
+  mpz_init (sold);
 
   FOR_EACH_VEC_ELT (lst_p, LST_SEQ (loop), j, l)
     if (LST_LOOP_P (l))
@@ -350,6 +502,11 @@  memory_strides_in_loop_1 (lst_p loop, graphite_dim_t depth, mpz_t strides)
       FOR_EACH_VEC_ELT (poly_dr_p, PBB_DRS (LST_PBB (l)), i, pdr)
 	{
 	  pdr_stride_in_loop (s, depth, pdr);
+	  ppl_pdr_stride_in_loop (sold, depth, pdr);
+	  /* Ideally this would hold, but it doesn't on e.g.
+	     interchange-8.c.  ??? Not analyzed why.  */
+	  if (0 != mpz_cmp (s, sold))
+	    {}
 	  mpz_set_si (n, PDR_NB_REFS (pdr));
 	  mpz_mul (s, s, n);
 	  mpz_add (strides, strides, s);
@@ -357,6 +514,7 @@  memory_strides_in_loop_1 (lst_p loop, graphite_dim_t depth, mpz_t strides)
 
   mpz_clear (s);
   mpz_clear (n);
+  mpz_clear (sold);
 }
 
 /* Sets STRIDES to the sum of all the strides of the data references
diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
index 4520bbb..bf1854c 100644
--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -26,6 +26,8 @@  along with GCC; see the file COPYING3.  If not see
 #include <isl/map.h>
 #include <isl/union_map.h>
 #include <isl/constraint.h>
+#include <isl/ilp.h>
+#include <isl/aff.h>
 #include <cloog/cloog.h>
 #include <cloog/isl/domain.h>
 #endif
@@ -1573,6 +1575,25 @@  debug_scop_params (scop_p scop, int verbosity)
   print_scop_params (stderr, scop, verbosity);
 }
 
+extern isl_ctx *the_isl_ctx;
+void print_isl_set (FILE *f, isl_set *set);
+void print_isl_map (FILE *f, isl_map *map);
+void
+print_isl_set (FILE *f, isl_set *set)
+{
+  isl_printer *p = isl_printer_to_file (the_isl_ctx, f);
+  p = isl_printer_print_set (p, set);
+  isl_printer_free (p);
+}
+
+void
+print_isl_map (FILE *f, isl_map *map)
+{
+  isl_printer *p = isl_printer_to_file (the_isl_ctx, f);
+  p = isl_printer_print_map (p, map);
+  isl_printer_free (p);
+}
+
 
 /* The dimension in the transformed scattering polyhedron of PBB
    containing the scattering iterator for the loop at depth LOOP_DEPTH.  */
@@ -1643,6 +1664,9 @@  pbb_number_of_iterations_at_time (poly_bb_p pbb,
 				  graphite_dim_t time_depth,
 				  mpz_t res)
 {
+  /* XXX rewrite with isl, used from lst_niter_for_loop.  */
+  isl_map *lbmap, *ubmap;
+  isl_set *lbset, *ubset;
   ppl_Pointset_Powerset_C_Polyhedron_t domain, sctr_lb, sctr_ub;
   ppl_dimension_type domain_dim, sctr_dim;
   graphite_dim_t dim_iter_domain = pbb_dim_iter_domain (pbb);
@@ -1667,6 +1691,7 @@  pbb_number_of_iterations_at_time (poly_bb_p pbb,
      that upper bound to the scattering.  */
   ppl_new_Pointset_Powerset_C_Polyhedron_from_C_Polyhedron
     (&sctr_ub, PBB_TRANSFORMED_SCATTERING (pbb));
+  ubmap = isl_map_copy (pbb->transformed);
   for (i = 0; i < (int) dim_iter_domain; i++)
     {
       ppl_Linear_Expression_t eq;
@@ -1679,6 +1704,30 @@  pbb_number_of_iterations_at_time (poly_bb_p pbb,
       ppl_set_coef (le, i, 1);
       ppl_min_for_le_pointset (domain, le, lb);
       ppl_max_for_le_pointset (domain, le, ub);
+	{
+	  isl_space *dc = isl_set_get_space (pbb->domain);
+	  isl_aff *aff;
+	  isl_int isllb, islub, isldiff;
+	  enum isl_lp_result res;
+	  isl_constraint *c;
+	  isl_int_init (isllb);
+	  isl_int_init (islub);
+	  isl_int_init (isldiff);
+	  aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
+	  aff = isl_aff_set_coefficient_si (aff, isl_dim_in, i, 1);
+	  res = isl_set_min (pbb->domain, aff, &isllb);
+	  res = isl_set_max (pbb->domain, aff, &islub);
+	  isl_int_sub (isldiff, islub, isllb);
+	  isl_int_add_ui (isldiff, isldiff, 1);
+	  c = isl_equality_alloc (isl_local_space_from_space (isl_map_get_space (pbb->transformed)));
+	  c = isl_constraint_set_coefficient_si (c, isl_dim_in, i, -1);
+	  c = isl_constraint_set_constant (c, isldiff);
+	  ubmap = isl_map_add_constraint (ubmap, c);
+	  isl_int_clear (isllb);
+	  isl_int_clear (islub);
+	  isl_int_clear (isldiff);
+	  isl_aff_free (aff);
+	}
       mpz_sub (diff, ub, lb);
       mpz_add (diff, diff, one);
 
@@ -1704,6 +1753,7 @@  pbb_number_of_iterations_at_time (poly_bb_p pbb,
      it to the scattering.  */
   ppl_new_Pointset_Powerset_C_Polyhedron_from_C_Polyhedron
     (&sctr_lb, PBB_TRANSFORMED_SCATTERING (pbb));
+  lbmap = isl_map_copy (pbb->transformed);
   for (i = 0; i < (int) dim_iter_domain; i++)
     {
       ppl_Linear_Expression_t eq;
@@ -1732,6 +1782,23 @@  pbb_number_of_iterations_at_time (poly_bb_p pbb,
       ppl_delete_Pointset_Powerset_C_Polyhedron (pph);
       ppl_delete_Constraint (pc);
       ppl_delete_Constraint_System (cs);
+	{
+	  isl_space *dc = isl_set_get_space (pbb->domain);
+	  isl_aff *aff;
+	  isl_int isllb;
+	  enum isl_lp_result res;
+	  isl_constraint *c;
+	  isl_int_init (isllb);
+	  aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
+	  aff = isl_aff_set_coefficient_si (aff, isl_dim_in, i, 1);
+	  res = isl_set_min (pbb->domain, aff, &isllb);
+	  c = isl_equality_alloc (isl_local_space_from_space (isl_map_get_space (pbb->transformed)));
+	  c = isl_constraint_set_coefficient_si (c, isl_dim_in, i, -1);
+	  c = isl_constraint_set_constant (c, isllb);
+	  lbmap = isl_map_add_constraint (lbmap, c);
+	  isl_int_clear (isllb);
+	  isl_aff_free (aff);
+	}
     }
 
   /* Extract the number of iterations.  */
@@ -1741,6 +1808,56 @@  pbb_number_of_iterations_at_time (poly_bb_p pbb,
   ppl_max_for_le_pointset (sctr_ub, le, ub);
   mpz_sub (res, ub, lb);
 
+    {
+      isl_space *dc;
+      isl_aff *aff;
+      isl_int isllb, islub, islres;
+      enum isl_lp_result lpres;
+
+      isl_int_init (isllb);
+      isl_int_init (islub);
+      isl_int_init (islres);
+
+      lbset = isl_map_range (lbmap);
+      ubset = isl_map_range (ubmap);
+
+      dc = isl_set_get_space (lbset);
+      aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
+      aff = isl_aff_set_coefficient_si (aff, isl_dim_in, time_depth, 1);
+      lpres = isl_set_min (lbset, aff, &isllb);
+      lpres = isl_set_max (ubset, aff, &islub);
+
+      isl_int_sub (islres, islub, isllb);
+
+      isl_int_set_gmp (isllb, res);
+
+      /* This would ideally hold, but the PPL code actually is buggy
+         (returns upper bound 0 for unanalyzable domains), and incomplete,
+	 pdr_add_data_dimensions can constrain the ISL domain more than
+	 the PPL domain.  So, with ISL we get better results on pr37485.c.  */
+      /*gcc_assert (isl_int_eq (isllb, islres));*/
+
+      /* Even shorter and more correct method: map the iteration domain
+         through the current scatter, and work on the resulting set.  */
+      isl_set_free (lbset);
+      lbset = isl_set_apply (isl_set_copy (pbb->domain),
+			     isl_map_copy (pbb->transformed));
+      lpres = isl_set_min (lbset, aff, &isllb);
+      lpres = isl_set_max (lbset, aff, &islub);
+
+      isl_int_sub (islub, islub, isllb);
+      isl_int_add_ui (islub, islub, 1);
+      gcc_assert (isl_int_eq (islub, islres));
+
+      isl_int_clear (isllb);
+      isl_int_clear (islub);
+      isl_int_clear (islres);
+      isl_aff_free (aff);
+
+      isl_set_free (lbset);
+      isl_set_free (ubset);
+    }
+
   mpz_clear (one);
   mpz_clear (diff);
   mpz_clear (lb);
diff --git a/gcc/graphite.c b/gcc/graphite.c
index dd98f5e..821efb1 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -253,6 +253,8 @@  graphite_finalize (bool need_cfg_cleanup_p)
     print_loops (dump_file, 3);
 }
 
+isl_ctx *the_isl_ctx;
+
 /* Perform a set of linear transforms on the loops of the current
    function.  */
 
@@ -276,6 +278,7 @@  graphite_transform_loops (void)
   if (!graphite_initialize (ctx))
     return;
 
+  the_isl_ctx = ctx;
   build_scops (&scops);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -301,6 +304,7 @@  graphite_transform_loops (void)
   htab_delete (bb_pbb_mapping);
   free_scops (scops);
   graphite_finalize (need_cfg_cleanup_p);
+  the_isl_ctx = NULL;
   isl_ctx_free (ctx);
 }