diff mbox

Don't allow unsafe reductions in graphite

Message ID 55AFBE25.5020508@mentor.com
State New
Headers show

Commit Message

Tom de Vries July 22, 2015, 4 p.m. UTC
[ was: Re: [RFC, PR66873] Use graphite for parloops ]

On 22/07/15 13:02, Richard Biener wrote:
> On Wed, Jul 22, 2015 at 1:01 PM, Richard Biener
> <richard.guenther@gmail.com>  wrote:
>> >On Tue, Jul 21, 2015 at 8:42 PM, Sebastian Pop<sebpop@gmail.com>  wrote:
>>> >>Tom de Vries wrote:
>>>> >>>Fix reduction safety checks
>>>> >>>
>>>> >>>       * graphite-sese-to-poly.c (is_reduction_operation_p): Limit
>>>> >>>       flag_associative_math to SCALAR_FLOAT_TYPE_P.  Honour
>>>> >>>       TYPE_OVERFLOW_TRAPS and TYPE_OVERFLOW_WRAPS for INTEGRAL_TYPE_P.
>>>> >>>       Only allow wrapping fixed-point otherwise.
>>>> >>>       (build_poly_scop): Always call
>>>> >>>       rewrite_commutative_reductions_out_of_ssa.
>>> >>
>>> >>The changes to graphite look good to me.
>> >
>> >+  if (SCALAR_FLOAT_TYPE_P (type))
>> >+    return flag_associative_math;
>> >+
>> >
>> >why only scalar floats?

Copied from the conditions in vect_is_simple_reduction_1.

 >> >Please use FLOAT_TYPE_P.

Done.

>> >
>> >+  if (INTEGRAL_TYPE_P (type))
>> >+    return (!TYPE_OVERFLOW_TRAPS (type)
>> >+           && TYPE_OVERFLOW_WRAPS (type));
>> >
>> >it cannot both wrap and trap thus TYPE_OVERFLOW_WRAPS is enough.
>> >

Done.

>> >I'm sure you'll disable quite some parallelization this way... (the
>> >routine is modeled after
>> >the vectorizers IIRC, so it would be affected as well).  Yeah - I see
>> >you modify autopar
>> >testcases.

I now split up the patch, this bit only relates to graphite, so no 
autopar testcases are affected.

>> >Please instead XFAIL the existing ones and add variants
>> >with unsigned
>> >reductions.  Adding -fwrapv isn't a good solution either.

Done.

>> >
>> >Can you think of a testcase that breaks btw?
>> >

If you mean a testcase that fails to execute properly with the fix, and 
executes correctly with the fix, then no.  The problem this patch is 
trying to fix, is that we assume wrapping overflow without fwrapv. In 
order to run into a runtime failure, we need a target that does not do 
wrapping overflow without fwrapv.

>> >The "proper" solution (see other passes) is to rewrite the reduction
>> >to a wrapping
>> >one (cast to unsigned for the reduction op).
>> >

Right.

>> >+  return (FIXED_POINT_TYPE_P (type)
>> >+         && FIXED_POINT_TYPE_OVERFLOW_WRAPS_P (type));
>> >
>> >why?

Again, copied from the conditions in vect_is_simple_reduction_1.

 >> >  Simply return false here instead?

Done.


[ Btw, looking at associative_tree_code, I realized that the
   overflow checking is only necessary for PLUS_EXPR and MULT_EXPR:
...
   switch (code)
     {
     case BIT_IOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_XOR_EXPR:
     case PLUS_EXPR:
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
       return true;
...

The other operators cannot overflow to begin with. My guess is that it's 
better to leave this for a trunk-only follow-up patch.
]

Currently bootstrapping and reg-testing on x86_64.

OK for trunk?

OK 5 and 4.9 release branches?

Thanks,
- Tom

Comments

Richard Biener July 23, 2015, 10:42 a.m. UTC | #1
On Wed, Jul 22, 2015 at 6:00 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> [ was: Re: [RFC, PR66873] Use graphite for parloops ]
>
> On 22/07/15 13:02, Richard Biener wrote:
>>
>> On Wed, Jul 22, 2015 at 1:01 PM, Richard Biener
>> <richard.guenther@gmail.com>  wrote:
>>>
>>> >On Tue, Jul 21, 2015 at 8:42 PM, Sebastian Pop<sebpop@gmail.com>  wrote:
>>>>
>>>> >>Tom de Vries wrote:
>>>>>
>>>>> >>>Fix reduction safety checks
>>>>> >>>
>>>>> >>>       * graphite-sese-to-poly.c (is_reduction_operation_p): Limit
>>>>> >>>       flag_associative_math to SCALAR_FLOAT_TYPE_P.  Honour
>>>>> >>>       TYPE_OVERFLOW_TRAPS and TYPE_OVERFLOW_WRAPS for
>>>>> >>> INTEGRAL_TYPE_P.
>>>>> >>>       Only allow wrapping fixed-point otherwise.
>>>>> >>>       (build_poly_scop): Always call
>>>>> >>>       rewrite_commutative_reductions_out_of_ssa.
>>>>
>>>> >>
>>>> >>The changes to graphite look good to me.
>>>
>>> >
>>> >+  if (SCALAR_FLOAT_TYPE_P (type))
>>> >+    return flag_associative_math;
>>> >+
>>> >
>>> >why only scalar floats?
>
>
> Copied from the conditions in vect_is_simple_reduction_1.
>
>>> >Please use FLOAT_TYPE_P.
>
> Done.
>
>>> >
>>> >+  if (INTEGRAL_TYPE_P (type))
>>> >+    return (!TYPE_OVERFLOW_TRAPS (type)
>>> >+           && TYPE_OVERFLOW_WRAPS (type));
>>> >
>>> >it cannot both wrap and trap thus TYPE_OVERFLOW_WRAPS is enough.
>>> >
>
>
> Done.
>
>>> >I'm sure you'll disable quite some parallelization this way... (the
>>> >routine is modeled after
>>> >the vectorizers IIRC, so it would be affected as well).  Yeah - I see
>>> >you modify autopar
>>> >testcases.
>
>
> I now split up the patch, this bit only relates to graphite, so no autopar
> testcases are affected.
>
>>> >Please instead XFAIL the existing ones and add variants
>>> >with unsigned
>>> >reductions.  Adding -fwrapv isn't a good solution either.
>
>
> Done.
>
>>> >
>>> >Can you think of a testcase that breaks btw?
>>> >
>
>
> If you mean a testcase that fails to execute properly with the fix, and
> executes correctly with the fix, then no.  The problem this patch is trying
> to fix, is that we assume wrapping overflow without fwrapv. In order to run
> into a runtime failure, we need a target that does not do wrapping overflow
> without fwrapv.
>
>>> >The "proper" solution (see other passes) is to rewrite the reduction
>>> >to a wrapping
>>> >one (cast to unsigned for the reduction op).
>>> >
>
>
> Right.
>
>>> >+  return (FIXED_POINT_TYPE_P (type)
>>> >+         && FIXED_POINT_TYPE_OVERFLOW_WRAPS_P (type));
>>> >
>>> >why?
>
>
> Again, copied from the conditions in vect_is_simple_reduction_1.
>
>>> >  Simply return false here instead?
>
> Done.
>
>
> [ Btw, looking at associative_tree_code, I realized that the
>   overflow checking is only necessary for PLUS_EXPR and MULT_EXPR:
> ...
>   switch (code)
>     {
>     case BIT_IOR_EXPR:
>     case BIT_AND_EXPR:
>     case BIT_XOR_EXPR:
>     case PLUS_EXPR:
>     case MULT_EXPR:
>     case MIN_EXPR:
>     case MAX_EXPR:
>       return true;
> ...
>
> The other operators cannot overflow to begin with. My guess is that it's
> better to leave this for a trunk-only follow-up patch.
> ]
>
> Currently bootstrapping and reg-testing on x86_64.
>
> OK for trunk?
>
> OK 5 and 4.9 release branches?

Ok if Sebastian is fine with it.

Richard.

> Thanks,
> - Tom
>
Sebastian Pop July 24, 2015, 8:20 p.m. UTC | #2
Richard Biener wrote:
> On Wed, Jul 22, 2015 at 6:00 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> > Currently bootstrapping and reg-testing on x86_64.
> >
> > OK for trunk?
> >
> > OK 5 and 4.9 release branches?
> 
> Ok if Sebastian is fine with it.

Ok to backport as well.
Thanks Tom for the patches.
diff mbox

Patch

Don't allow unsafe reductions in graphite

2015-07-21  Tom de Vries  <tom@codesourcery.com>

	* graphite-sese-to-poly.c (is_reduction_operation_p): Limit
	flag_associative_math to FLOAT_TYPE_P.  Honour
	TYPE_OVERFLOW_WRAPS for INTEGRAL_TYPE_P. Don't allow any other types.

	* gcc.dg/graphite/block-1.c: Xfail scan.
	* gcc.dg/graphite/interchange-12.c: Same.
	* gcc.dg/graphite/interchange-14.c: Same.
	* gcc.dg/graphite/interchange-15.c: Same.
	* gcc.dg/graphite/interchange-9.c: Same.
	* gcc.dg/graphite/interchange-mvt.c: Same.
	* gcc.dg/graphite/uns-block-1.c: New test.
	* gcc.dg/graphite/uns-interchange-12.c: New test.
	* gcc.dg/graphite/uns-interchange-14.c: New test.
	* gcc.dg/graphite/uns-interchange-15.c: New test.
	* gcc.dg/graphite/uns-interchange-9.c: New test.
	* gcc.dg/graphite/uns-interchange-mvt.c: New test.
---
 gcc/graphite-sese-to-poly.c                        | 14 +++--
 gcc/testsuite/gcc.dg/graphite/block-1.c            |  2 +-
 gcc/testsuite/gcc.dg/graphite/interchange-12.c     |  2 +-
 gcc/testsuite/gcc.dg/graphite/interchange-14.c     |  2 +-
 gcc/testsuite/gcc.dg/graphite/interchange-15.c     |  2 +-
 gcc/testsuite/gcc.dg/graphite/interchange-9.c      |  2 +-
 gcc/testsuite/gcc.dg/graphite/interchange-mvt.c    |  2 +-
 gcc/testsuite/gcc.dg/graphite/uns-block-1.c        | 48 +++++++++++++++++
 gcc/testsuite/gcc.dg/graphite/uns-interchange-12.c | 56 +++++++++++++++++++
 gcc/testsuite/gcc.dg/graphite/uns-interchange-14.c | 58 ++++++++++++++++++++
 gcc/testsuite/gcc.dg/graphite/uns-interchange-15.c | 53 ++++++++++++++++++
 gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c  | 47 ++++++++++++++++
 .../gcc.dg/graphite/uns-interchange-mvt.c          | 63 ++++++++++++++++++++++
 13 files changed, 342 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/uns-block-1.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/uns-interchange-12.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/uns-interchange-14.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/uns-interchange-15.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/uns-interchange-mvt.c

diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 8960c3f..68f7df1 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -2604,9 +2604,17 @@  is_reduction_operation_p (gimple stmt)
   gcc_assert (is_gimple_assign (stmt));
   code = gimple_assign_rhs_code (stmt);
 
-  return flag_associative_math
-    && commutative_tree_code (code)
-    && associative_tree_code (code);
+  if (!commutative_tree_code (code)
+      || !associative_tree_code (code))
+    return false;
+
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+
+  if (FLOAT_TYPE_P (type))
+    return flag_associative_math;
+
+  return (INTEGRAL_TYPE_P (type)
+	  && TYPE_OVERFLOW_WRAPS (type));
 }
 
 /* Returns true when PHI contains an argument ARG.  */
diff --git a/gcc/testsuite/gcc.dg/graphite/block-1.c b/gcc/testsuite/gcc.dg/graphite/block-1.c
index a73c20f..2208eb9 100644
--- a/gcc/testsuite/gcc.dg/graphite/block-1.c
+++ b/gcc/testsuite/gcc.dg/graphite/block-1.c
@@ -45,4 +45,4 @@  main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "will be loop blocked" 3 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "will be loop blocked" 3 "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-12.c b/gcc/testsuite/gcc.dg/graphite/interchange-12.c
index 41a8882..bf95fdd 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-12.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-12.c
@@ -53,4 +53,4 @@  main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-14.c b/gcc/testsuite/gcc.dg/graphite/interchange-14.c
index 36990ab..46f6a6d 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-14.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-14.c
@@ -55,4 +55,4 @@  main (void)
 }
 
 /* PRE destroys the perfect nest and we can't cope with that yet.  */
-/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-15.c b/gcc/testsuite/gcc.dg/graphite/interchange-15.c
index 3ddb74f..9f6b7ae 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-15.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-15.c
@@ -49,5 +49,5 @@  main (void)
 }
 
 /* PRE destroys the perfect nest and we can't cope with that yet.  */
-/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" { xfail *-*-* } } } */
 
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-9.c b/gcc/testsuite/gcc.dg/graphite/interchange-9.c
index cfec110..b023ea8 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-9.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-9.c
@@ -44,4 +44,4 @@  main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-mvt.c b/gcc/testsuite/gcc.dg/graphite/interchange-mvt.c
index 4b8f264..8c00f80 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-mvt.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-mvt.c
@@ -59,5 +59,5 @@  main (void)
 }
 
 /* PRE destroys the perfect nest and we can't cope with that yet.  */
-/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" { xfail *-*-* } } } */
 
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-block-1.c b/gcc/testsuite/gcc.dg/graphite/uns-block-1.c
new file mode 100644
index 0000000..57d522b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/uns-block-1.c
@@ -0,0 +1,48 @@ 
+/* { dg-require-effective-target size32plus } */
+
+#define DEBUG 0
+#if DEBUG
+#include <stdio.h>
+#endif
+
+#define MAX 100
+
+extern void abort ();
+
+int
+main (void)
+{
+  int i, j;
+  int sum = 0;
+  int A[MAX * MAX];
+  int B[MAX * MAX];
+
+  /* These loops should be loop blocked.  */
+  for (i = 0; i < MAX; i++)
+    for (j = 0; j < MAX; j++)
+      {
+	A[i*MAX + j] = j;
+	B[i*MAX + j] = j;
+      }
+
+  /* These loops should be loop blocked.  */
+  for (i = 0; i < MAX; i++)
+    for (j = 0; j < MAX; j++)
+      A[i*MAX + j] += B[j*MAX + i];
+
+  /* These loops should be loop blocked.  */
+  for (i = 0; i < MAX; i++)
+    for (j = 0; j < MAX; j++)
+      sum += A[i*MAX + j];
+
+#if DEBUG
+  fprintf (stderr, "sum = %d \n", sum);
+#endif
+
+  if (sum != 990000)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "will be loop blocked" 3 "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-12.c b/gcc/testsuite/gcc.dg/graphite/uns-interchange-12.c
new file mode 100644
index 0000000..dc26926
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-12.c
@@ -0,0 +1,56 @@ 
+/* { dg-require-effective-target size32plus } */
+
+#define DEBUG 0
+#if DEBUG
+#include <stdio.h>
+#endif
+
+#define N 200
+
+int A[N][N], B[N][N], C[N][N];
+
+static int __attribute__((noinline))
+matmult (void)
+{
+  int i, j, k;
+
+  /* Loops J and K should be interchanged.  */
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      {
+	A[i][j] = 0;
+	for (k = 0; k < N; k++)
+	  A[i][j] += B[i][k] * C[k][j];
+      }
+
+  return A[0][0] + A[N-1][N-1];
+}
+
+extern void abort ();
+
+int
+main (void)
+{
+  int i, j, res;
+
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      {
+	A[i][j] = 0;
+	B[i][j] = i - j;
+	C[i][j] = i + j;
+      }
+
+  res = matmult ();
+
+#if DEBUG
+  fprintf (stderr, "res = %d \n", res);
+#endif
+
+  if (res != 2626800)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-14.c b/gcc/testsuite/gcc.dg/graphite/uns-interchange-14.c
new file mode 100644
index 0000000..36990ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-14.c
@@ -0,0 +1,58 @@ 
+/* { dg-require-effective-target size32plus } */
+
+#define DEBUG 0
+#if DEBUG
+#include <stdio.h>
+#endif
+
+#define N 200
+
+int A[N][N], B[N][N], C[N][N];
+
+static void __attribute__((noinline))
+matmult (void)
+{
+  int i, j, k;
+
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      A[i][j] = 0;
+
+  /* Loops J and K should be interchanged.  */
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      for (k = 0; k < N; k++)
+	A[i][j] += B[i][k] * C[k][j];
+}
+
+extern void abort ();
+
+int
+main (void)
+{
+  int i, j, res = 0;
+
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      {
+	B[i][j] = j;
+	C[i][j] = i;
+      }
+
+  matmult ();
+
+  for (i = 0; i < N; i++)
+    res += A[i][i];
+
+#if DEBUG
+  fprintf (stderr, "res = %d \n", res);
+#endif
+
+  if (res != 529340000)
+    abort ();
+
+  return 0;
+}
+
+/* PRE destroys the perfect nest and we can't cope with that yet.  */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-15.c b/gcc/testsuite/gcc.dg/graphite/uns-interchange-15.c
new file mode 100644
index 0000000..3ddb74f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-15.c
@@ -0,0 +1,53 @@ 
+/* { dg-require-effective-target size32plus } */
+
+#define DEBUG 0
+#if DEBUG
+#include <stdio.h>
+#endif
+
+#define NMAX 2000
+
+static int x[NMAX], a[NMAX][NMAX];
+
+static int __attribute__((noinline))
+mvt (long N)
+{
+  int i,j;
+
+  /* These two loops should be interchanged.  */
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      x[i] += a[j][i];
+
+  return x[1];
+}
+
+extern void abort ();
+
+int
+main (void)
+{
+  int i, j, res;
+
+  for (i = 0; i < NMAX; i++)
+    for (j = 0; j < NMAX; j++)
+      a[i][j] = j;
+
+  for (i = 0; i < NMAX; i++)
+    x[i] = i;
+
+  res = mvt (NMAX);
+
+#if DEBUG
+  fprintf (stderr, "res = %d \n", res);
+#endif
+
+  if (res != 2001)
+    abort ();
+
+  return 0;
+}
+
+/* PRE destroys the perfect nest and we can't cope with that yet.  */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c b/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
new file mode 100644
index 0000000..cfec110
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
@@ -0,0 +1,47 @@ 
+/* { dg-require-effective-target size32plus } */
+
+#define DEBUG 0
+#if DEBUG
+#include <stdio.h>
+#endif
+
+#define N 111
+#define M 111
+
+static int __attribute__((noinline))
+foo (int *x)
+{
+  int i, j;
+  int sum = 0;
+
+  for (j = 0; j < M; ++j)
+    for (i = 0;  i < N; ++i)
+      sum += x[M * i + j];
+
+  return sum;
+}
+
+extern void abort ();
+
+int
+main (void)
+{
+  int A[N*M];
+  int i, res;
+
+  for (i = 0; i < N*M; i++)
+    A[i] = 2;
+
+  res = foo (A);
+
+#if DEBUG
+  fprintf (stderr, "res = %d \n", res);
+#endif
+
+  if (res != 24642)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-mvt.c b/gcc/testsuite/gcc.dg/graphite/uns-interchange-mvt.c
new file mode 100644
index 0000000..4b8f264
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-mvt.c
@@ -0,0 +1,63 @@ 
+/* { dg-require-effective-target size32plus } */
+
+#define DEBUG 0
+#if DEBUG
+#include <stdio.h>
+#endif
+
+#define NMAX 2000
+
+static int x1[NMAX], x2[NMAX], a[NMAX][NMAX], y1[NMAX], y2[NMAX];
+
+static int __attribute__((noinline))
+mvt (long N)
+{
+
+  int i,j;
+
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      x1[i] = x1[i] + a[i][j] * y1[j];
+
+  /* These two loops should be interchanged.  */
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      x2[i] = x2[i] + a[j][i] * y2[j];
+
+  return x1[0] + x2[0];
+}
+
+extern void abort ();
+
+int
+main (void)
+{
+  int i, j, res;
+
+  for (i = 0; i < NMAX; i++)
+    for (j = 0; j < NMAX; j++)
+      a[i][j] = i + j;
+
+  for (i = 0; i < NMAX; i++)
+    {
+      x1[i] = 0;
+      x2[i] = 2*i;
+      y1[i] = 100 - i;
+      y2[i] = i;
+    }
+
+  res = mvt (NMAX);
+
+#if DEBUG
+  fprintf (stderr, "res = %d \n", res);
+#endif
+
+  if (res != 199900000)
+    abort ();
+
+  return 0;
+}
+
+/* PRE destroys the perfect nest and we can't cope with that yet.  */
+/* { dg-final { scan-tree-dump-times "will be interchanged" 1 "graphite" } } */
+
-- 
1.9.1