diff mbox

[PR68953] Fix pdr accesses order

Message ID 570757AE.7070306@mentor.com
State New
Headers show

Commit Message

Tom de Vries April 8, 2016, 7:03 a.m. UTC
Hi,

this patch fixes wrong-code PR68953, a graphite 6 regression.


I.

Consider test.c:
...
int yu[4][1] = { { 1 }, { 2 }, { 3 }, { 4 } };

int
main (void)
{
   int zh, ro;

   for (zh = 0; zh < 2; ++zh)
     for (ro = 0; ro < 3; ++ro)
       yu[ro][0] = yu[zh + 1][0];

   return yu[0][0];
}
...

The program is supposed to return 2, but when compiling with -O1 
-floop-nest-optimize, the program returns 3 instead.

The problem is that graphite rewrites this loop nest:
...
   for (zh = 0; zh < 2; ++zh)
     for (ro = 0; ro < 3; ++ro)
       yu[ro][0] = yu[zh + 1][0];
...
into this loop nest, which has different semantics:
...
   for (ro = 0; ro < 3; ++ro)
     for (zh = 0; zh < 2; ++zh)
       yu[ro][0] = yu[zh + 1][0];
...


II.

Now, consider test2.c, the equivalent of test.c without the extra array 
dimension on yu:
...
int yu[4] = { 1, 2, 3, 4 };

int
main (void)
{
   int zh, ro;

   for (zh = 0; zh < 2; ++zh)
     for (ro = 0; ro < 3; ++ro)
       yu[ro] = yu[zh + 1];

   return yu[0];
}
...

In this case, when compiling with -O1 -floop-nest-optimize, the graphite 
transformation doesn't happen, and the program returns 2.


III.

So, what is the difference in graphite analysis that causes different 
conclusions?

For test2.c, we find the data references:
...
data references (
   reads: { S_4[i1, i2] -> [1, 1 + i1] : i1 >= 0 and i1 <= 1 and i2 <= 2 
and i2 >= 0 }
   must_writes: { S_4[i1, i2] -> [1, i2] : i2 >= 0 and i2 <= 2 and i1 >= 
0 and i1 <= 1 }
   may_writes: {  }
)
...

For test.c, we find these data references:
...
data references (
   reads: {  }
   must_writes: { S_4[i1, 0] -> [1, 0, 0] : i1 >= 0 and i1 <= 1 }
   may_writes: {  }
)
...

In other words, there are no reads in the program, which is of course 
incorrect.

Focusing on the reads, we find for test2.c:
...
Adding read to depedence graph: pdr_0 (read
in gimple stmt: _9 = yu[_8];
data accesses: { S_4[i1, i2] -> [1, 1 + i1] }
subscript sizes: { [1, i1] : i1 >= 0 and i1 <= 3 }
)
Reads depedence graph: { S_4[i1, i2] -> [1, 1 + i1] : i1 >= 0 and i1 <= 
1 and i2 <= 2 and i2 >= 0 }
...

And for test.c:
...
Adding read to depedence graph: pdr_0 (read
in gimple stmt: _9 = yu[_8][0];
data accesses: { S_4[i1, i2] -> [1, 0, 1 + i1] }
subscript sizes: { [1, i1, 0] : i1 >= 0 and i1 <= 3 }
)
Reads depedence graph: {  }
...


IV.

AFAICT, the data accesses and subscript sizes are parsed as follows:
...
data accesses:
{
   S_4[i1, i2]
      /* [alias set, last subscript, first subscript] */
   -> [1, 0, 1 + i1]
}
subscript sizes:
{
   /* [alias set range, first subscript range, last subscript range] */
   [1, i1, 0]
   : i1 >= 0 and i1 <= 3
}
...

In add_pdr_constraints, we do an intersection of the data accesses and 
the subscript_sizes:
...
/* Constrain pdr->accesses with pdr->subscript_sizes and pbb->domain.*/

static isl_map *
add_pdr_constraints (poly_dr_p pdr, poly_bb_p pbb)
{
   isl_map *x
     = isl_map_intersect_range (isl_map_copy (pdr->accesses),
                                isl_set_copy (pdr->subscript_sizes));
   x = isl_map_coalesce (x);
   return constrain_domain (x, isl_set_copy (pbb->domain));
}

...

The output of the intersection is:
...
x: { S_4[-1, i2] -> [1, 0, 0] }
...

AFAICT, we're intersecting
   '[1, 0, 1 + i1]'
with
   '[1, i1, 0]'
and end up with
   '[1, 0, 0]'.

In other words, we're intersecting:
- the last subscript access function with the first subscript range,
   and
- the first subscript access function with the last subscript range.

This is the point from where things go wrong.


V.

I'm not really sure how this is supposed to be fixed. I imagine that we 
should do one of 3:
1. we change the order in the access functions
2. we change the order in the subscript_sizes
3. we keep the orders as they are, but don't intersect them directly
    but do an order inversion before.

I've picked 1, since that was the easiest for me to implement (but I'm 
not sure if by doing so, I've broken any hardcoded graphite assumptions).

Bootstrapped and regtested on x86_64.

OK for stage4?

Thanks,
- Tom

Comments

Sebastian Pop April 9, 2016, 4:07 a.m. UTC | #1
On Fri, Apr 8, 2016 at 2:03 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> pdr_0 (read
> in gimple stmt: _9 = yu[_8][0];
> data accesses: { S_4[i1, i2] -> [1, 0, 1 + i1] }

data access should be { S_4[i1, i2] -> [1, 1 + i1, 0] }

> subscript sizes: { [1, i1, 0] : i1 >= 0 and i1 <= 3 }
> )
[...]
> I'm not really sure how this is supposed to be fixed. I imagine that we should do one of 3:
> 1. we change the order in the access functions
> 2. we change the order in the subscript_sizes
> 3. we keep the orders as they are, but don't intersect them directly
>    but do an order inversion before.
>
> I've picked 1, since that was the easiest for me to implement (but I'm not sure if by doing so, I've broken any hardcoded graphite assumptions).

1 is the right fix: both access functions and subscript sizes should
be in the same order.
If Richi agrees, ok to commit.

Thanks,
Sebastian
Richard Biener April 9, 2016, 11:26 a.m. UTC | #2
On April 9, 2016 6:07:19 AM GMT+02:00, Sebastian Pop <sebpop@gmail.com> wrote:
>On Fri, Apr 8, 2016 at 2:03 AM, Tom de Vries <Tom_deVries@mentor.com>
>wrote:
>> pdr_0 (read
>> in gimple stmt: _9 = yu[_8][0];
>> data accesses: { S_4[i1, i2] -> [1, 0, 1 + i1] }
>
>data access should be { S_4[i1, i2] -> [1, 1 + i1, 0] }
>
>> subscript sizes: { [1, i1, 0] : i1 >= 0 and i1 <= 3 }
>> )
>[...]
>> I'm not really sure how this is supposed to be fixed. I imagine that
>we should do one of 3:
>> 1. we change the order in the access functions
>> 2. we change the order in the subscript_sizes
>> 3. we keep the orders as they are, but don't intersect them directly
>>    but do an order inversion before.
>>
>> I've picked 1, since that was the easiest for me to implement (but
>I'm not sure if by doing so, I've broken any hardcoded graphite
>assumptions).
>
>1 is the right fix: both access functions and subscript sizes should
>be in the same order.
>If Richi agrees, ok to commit.

OK.

Richard.

>Thanks,
>Sebastian
diff mbox

Patch

Fix pdr accesses order

2016-04-07  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/68953
	* graphite-sese-to-poly.c (pdr_add_memory_accesses): Order accesses from
	first to last subscript.

	* gcc.dg/graphite/pr68953.c: New test.

---
 gcc/graphite-sese-to-poly.c             |  2 +-
 gcc/testsuite/gcc.dg/graphite/pr68953.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index b62789f8..22a09a1 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -672,7 +672,7 @@  pdr_add_memory_accesses (isl_map *acc, dr_info &dri)
 
       aff = extract_affine (scop, afn,
 			    isl_space_domain (isl_map_get_space (acc)));
-      acc = set_index (acc, i + 1, aff);
+      acc = set_index (acc, nb_subscripts - i , aff);
     }
 
   return isl_map_coalesce (acc);
diff --git a/gcc/testsuite/gcc.dg/graphite/pr68953.c b/gcc/testsuite/gcc.dg/graphite/pr68953.c
new file mode 100644
index 0000000..12c632d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr68953.c
@@ -0,0 +1,30 @@ 
+/* { dg-do run } */
+/* { dg-options "-O1 -floop-nest-optimize" } */
+
+extern void abort (void);
+
+int yu[4][1] = { { 1 }, { 2 }, { 3 }, { 4 } };
+
+static void __attribute__((noinline,noclone))
+foo (void)
+{
+  int zh, ro;
+
+  for (zh = 0; zh < 2; ++zh)
+    for (ro = 0; ro < 3; ++ro)
+      yu[ro][0] = yu[zh + 1][0];
+}
+
+int
+main (void)
+{
+  foo ();
+
+  if (yu[0][0] != 2
+      || yu[1][0] != 2
+      || yu[2][0] != 2
+      || yu[3][0] != 4)
+    abort ();
+
+  return 0;
+}