Patchwork Fix PR47594: Sign extend constants while translating to Graphite

login
register
mail settings
Submitter Sebastian Pop
Date July 25, 2011, 4:51 p.m.
Message ID <1311612679-8020-1-git-send-email-sebpop@gmail.com>
Download mbox | patch
Permalink /patch/106719/
State New
Headers show

Comments

Sebastian Pop - July 25, 2011, 4:51 p.m.
"Bug 47594 - gfortran.dg/vect/vect-5.f90 execution test fails when
compiled with -O2 -fgraphite-identity"

The problem is due to the fact that Graphite generates this loop:

    for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
      S6(scat_1,scat_3);
    }

that has a "-1" encoded as an unsigned "4294967295".  This constant
comes from the computation of the number of iterations "M - I" of
the inner loop:

        do I = 1, N
          do J = I, M
            A(J,2) = B(J)
          end do
        end do

The patch fixes the problem by sign-extending the constants for the
step of a chain of recurrence in scan_tree_for_params_right_scev.

The same patter could occur for multiplication by a scalar, like in
"-1 * N" and so the patch also fixes these cases in
scan_tree_for_params.

Bootstrapped and tested on amd64-linux.

2011-07-23  Sebastian Pop  <sebastian.pop@amd.com>

	PR middle-end/47594
	* graphite-sese-to-poly.c (scan_tree_for_params_right_scev): Sign
	extend constants.
	(scan_tree_for_params): Same.

	* gfortran.dg/graphite/run-id-pr47594.f90: New.
---
 gcc/ChangeLog                                      |    7 ++++
 gcc/graphite-sese-to-poly.c                        |   26 ++++++++++++--
 gcc/testsuite/ChangeLog                            |    5 +++
 .../gfortran.dg/graphite/run-id-pr47594.f90        |   35 ++++++++++++++++++++
 4 files changed, 69 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/run-id-pr47594.f90
Sebastian Pop - July 26, 2011, 2:02 p.m.
On Tue, Jul 26, 2011 at 03:22, Richard Guenther <rguenther@suse.de> wrote:
> On Mon, 25 Jul 2011, Sebastian Pop wrote:
>
>> "Bug 47594 - gfortran.dg/vect/vect-5.f90 execution test fails when
>> compiled with -O2 -fgraphite-identity"
>>
>> The problem is due to the fact that Graphite generates this loop:
>>
>>     for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
>>       S6(scat_1,scat_3);
>>     }
>>
>> that has a "-1" encoded as an unsigned "4294967295".  This constant
>> comes from the computation of the number of iterations "M - I" of
>> the inner loop:
>>
>>         do I = 1, N
>>           do J = I, M
>>             A(J,2) = B(J)
>>           end do
>>         end do
>>
>> The patch fixes the problem by sign-extending the constants for the
>> step of a chain of recurrence in scan_tree_for_params_right_scev.
>>
>> The same patter could occur for multiplication by a scalar, like in
>> "-1 * N" and so the patch also fixes these cases in
>> scan_tree_for_params.
>
> That certainly feels odd (again).  How does it end up being unsigned
> in the first place?

We got this expression from niter.  niter analysis turns all expressions
into unsigned types before starting computations.  I tried to see if we
could improve niter, but that would be a major work.  I also thought
about using PPL or ISL to implement niter for graphite.

> Randomly sign-extending stuff looks bogus to me.
> Does graphite operate on infinite precision signed integers?  Or
> does it operate on twos-complement fixed precision integers?

Graphite represents constants using mpz_t.

Sebastian
Richard Guenther - July 26, 2011, 2:07 p.m.
On Tue, 26 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 03:22, Richard Guenther <rguenther@suse.de> wrote:
> > On Mon, 25 Jul 2011, Sebastian Pop wrote:
> >
> >> "Bug 47594 - gfortran.dg/vect/vect-5.f90 execution test fails when
> >> compiled with -O2 -fgraphite-identity"
> >>
> >> The problem is due to the fact that Graphite generates this loop:
> >>
> >>     for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
> >>       S6(scat_1,scat_3);
> >>     }
> >>
> >> that has a "-1" encoded as an unsigned "4294967295".  This constant
> >> comes from the computation of the number of iterations "M - I" of
> >> the inner loop:
> >>
> >>         do I = 1, N
> >>           do J = I, M
> >>             A(J,2) = B(J)
> >>           end do
> >>         end do
> >>
> >> The patch fixes the problem by sign-extending the constants for the
> >> step of a chain of recurrence in scan_tree_for_params_right_scev.
> >>
> >> The same patter could occur for multiplication by a scalar, like in
> >> "-1 * N" and so the patch also fixes these cases in
> >> scan_tree_for_params.
> >
> > That certainly feels odd (again).  How does it end up being unsigned
> > in the first place?
> 
> We got this expression from niter.  niter analysis turns all expressions
> into unsigned types before starting computations.  I tried to see if we
> could improve niter, but that would be a major work.  I also thought
> about using PPL or ISL to implement niter for graphite.

Hmm, I see (I suppose to avoid introducing undefined overflow).

> > Randomly sign-extending stuff looks bogus to me.
> > Does graphite operate on infinite precision signed integers?  Or
> > does it operate on twos-complement fixed precision integers?
> 
> Graphite represents constants using mpz_t.

Not exactly an answer but I guess all mpz_t do have a sign and are
of arbitrary precision.  Thus it's wrong to change unsigned + -1U
to mpz_t + -1 unless you truncate to unsigneds precision after
doing that operation.  Do we properly handle this?

Richard.
Sebastian Pop - July 26, 2011, 2:22 p.m.
On Tue, Jul 26, 2011 at 09:07, Richard Guenther <rguenther@suse.de> wrote:
>> > Randomly sign-extending stuff looks bogus to me.
>> > Does graphite operate on infinite precision signed integers?  Or
>> > does it operate on twos-complement fixed precision integers?
>>
>> Graphite represents constants using mpz_t.
>
> Not exactly an answer but I guess all mpz_t do have a sign and are
> of arbitrary precision.  Thus it's wrong to change unsigned + -1U
> to mpz_t + -1 unless you truncate to unsigneds precision after
> doing that operation.  Do we properly handle this?

Graphite is not truncating after conversion of an unsigned expression to mpz_t.

I still don't see how truncating -1U to its precision changes anything,
could you explain?

Thanks,
Sebastian
Richard Guenther - July 26, 2011, 2:34 p.m.
On Tue, 26 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 09:07, Richard Guenther <rguenther@suse.de> wrote:
> >> > Randomly sign-extending stuff looks bogus to me.
> >> > Does graphite operate on infinite precision signed integers?  Or
> >> > does it operate on twos-complement fixed precision integers?
> >>
> >> Graphite represents constants using mpz_t.
> >
> > Not exactly an answer but I guess all mpz_t do have a sign and are
> > of arbitrary precision.  Thus it's wrong to change unsigned + -1U
> > to mpz_t + -1 unless you truncate to unsigneds precision after
> > doing that operation.  Do we properly handle this?
> 
> Graphite is not truncating after conversion of an unsigned expression to mpz_t.
> 
> I still don't see how truncating -1U to its precision changes anything,
> could you explain?

Truncating -1 doesn't matter - it matters that if you perform any
unsigned arithmetic in arbitrary precision signed arithmetic that
you properly truncate after each operation to simulate unsigned
twos-complement wrapping semantic.  And if you did that you wouldn't
need to sign-extend -1U either.

Richard.
Sebastian Pop - July 27, 2011, 4:42 p.m.
On Tue, Jul 26, 2011 at 09:34, Richard Guenther <rguenther@suse.de> wrote:
> Truncating -1 doesn't matter - it matters that if you perform any
> unsigned arithmetic in arbitrary precision signed arithmetic that
> you properly truncate after each operation to simulate unsigned
> twos-complement wrapping semantic.  And if you did that you wouldn't
> need to sign-extend -1U either.

Ok, so I guess that the type of the expression that we generate from
Graphite should be, as the original expression, of unsigned type.
In the previous example,

>     for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
>       S6(scat_1,scat_3);
>     }

this is still valid if the type of "4294967295*scat_1" is unsigned.
That would fix only -fgraphite-identity: we also have to watch out for
operations on the polyhedral representation that would use -1U in
other computations, and here I'm thinking about everything we have
implemented on the polyhedral representation: dependence test,
counting the number of points, i.e., all the heuristics, etc.

When disabling Graphite on all unsigned niter expressions, we get
the following fails:

FAIL: gcc.dg/graphite/scop-0.c scan-tree-dump-times graphite "number
of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-1.c scan-tree-dump-times graphite "number
of SCoPs: 3" 1
FAIL: gcc.dg/graphite/scop-10.c scan-tree-dump-times graphite "number
of SCoPs: 3" 1
FAIL: gcc.dg/graphite/scop-11.c scan-tree-dump-times graphite "number
of SCoPs: 3" 1
FAIL: gcc.dg/graphite/scop-12.c scan-tree-dump-times graphite "number
of SCoPs: 5" 1
FAIL: gcc.dg/graphite/scop-13.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-16.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-17.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-18.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-2.c scan-tree-dump-times graphite "number
of SCoPs: 4" 1
FAIL: gcc.dg/graphite/scop-20.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-21.c scan-tree-dump-times graphite "number
of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-22.c scan-tree-dump-times graphite "number
of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-3.c scan-tree-dump-times graphite "number
of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-4.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-5.c scan-tree-dump-times graphite "number
of SCoPs: 3" 1
FAIL: gcc.dg/graphite/scop-6.c scan-tree-dump-times graphite "number
of SCoPs: 3" 1
FAIL: gcc.dg/graphite/scop-7.c scan-tree-dump-times graphite "number
of SCoPs: 3" 1
FAIL: gcc.dg/graphite/scop-8.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-9.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-dsyr2k.c scan-tree-dump-times graphite
"number of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-dsyrk.c scan-tree-dump-times graphite
"number of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-matmult.c scan-tree-dump-times graphite
"number of SCoPs: 1" 1
FAIL: gcc.dg/graphite/scop-mvt.c scan-tree-dump-times graphite "number
of SCoPs: 2" 1
FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number
of SCoPs: 1" 1
FAIL: gcc.dg/graphite/interchange-0.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-1.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-10.c scan-tree-dump-times graphite
"will be interchanged" 2
FAIL: gcc.dg/graphite/interchange-11.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-12.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-13.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-3.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-4.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-5.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-6.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-7.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/interchange-8.c scan-tree-dump-times graphite
"will be interchanged" 2
FAIL: gcc.dg/graphite/interchange-9.c scan-tree-dump-times graphite
"will be interchanged" 1
FAIL: gcc.dg/graphite/block-1.c scan-tree-dump-times graphite "will be
loop blocked" 3
FAIL: gcc.dg/graphite/block-5.c scan-tree-dump-times graphite "will be
loop blocked" 1
FAIL: gcc.dg/graphite/vect-pr43423.c scan-tree-dump-times vect
"vectorized 2 loops" 1
FAIL: gcc.dg/graphite/pr35356-1.c scan-tree-dump-times graphite "loop_1" 0
FAIL: gcc.dg/graphite/pr35356-2.c scan-tree-dump-times graphite "MIN_EXPR" 4
FAIL: gcc.dg/graphite/pr35356-2.c scan-tree-dump-times graphite "MAX_EXPR" 4

FAIL: gfortran.dg/graphite/interchange-3.f90  -O  scan-tree-dump-times
graphite "will be interchanged" 1
FAIL: gfortran.dg/graphite/block-1.f90  -O  scan-tree-dump-times
graphite "number of SCoPs: 1" 1
FAIL: gfortran.dg/graphite/block-2.f  -O  scan-tree-dump-times
graphite "number of SCoPs: 2" 1

FAIL: libgomp.graphite/bounds.c scan-tree-dump-times graphite "0 loops
carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-1.c scan-tree-dump-times
graphite "1 loops carried no dependency" 2
FAIL: libgomp.graphite/force-parallel-1.c scan-tree-dump-times
optimized "loopfn" 5
FAIL: libgomp.graphite/force-parallel-2.c scan-tree-dump-times
graphite "2 loops carried no dependency" 2
FAIL: libgomp.graphite/force-parallel-2.c scan-tree-dump-times
optimized "loopfn" 5
FAIL: libgomp.graphite/force-parallel-3.c scan-tree-dump-times
graphite "4 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-3.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-3.c scan-tree-dump-times
optimized "loopfn.1" 5
FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times
graphite "2 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times
optimized "loopfn.1" 5
FAIL: libgomp.graphite/force-parallel-5.c scan-tree-dump-times
graphite "2 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-5.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-5.c scan-tree-dump-times
optimized "loopfn.1" 5
FAIL: libgomp.graphite/force-parallel-6.c scan-tree-dump-times
graphite "1 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-6.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-7.c scan-tree-dump-times
graphite "3 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-7.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times
graphite "2 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times
optimized "loopfn.1" 5
FAIL: libgomp.graphite/force-parallel-9.c scan-tree-dump-times
graphite "4 loops carried no dependency" 1
FAIL: libgomp.graphite/force-parallel-9.c scan-tree-dump-times
optimized "loopfn.0" 5
FAIL: libgomp.graphite/force-parallel-9.c scan-tree-dump-times
optimized "loopfn.1" 5

So the only solution that I can see is to implement the niter analysis
as the resolution of a constraint system, and that would avoid creating
the unsigned expressions.

Sebastian
Richard Guenther - July 28, 2011, 8:58 a.m.
On Wed, 27 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 09:34, Richard Guenther <rguenther@suse.de> wrote:
> > Truncating -1 doesn't matter - it matters that if you perform any
> > unsigned arithmetic in arbitrary precision signed arithmetic that
> > you properly truncate after each operation to simulate unsigned
> > twos-complement wrapping semantic.  And if you did that you wouldn't
> > need to sign-extend -1U either.
> 
> Ok, so I guess that the type of the expression that we generate from
> Graphite should be, as the original expression, of unsigned type.
> In the previous example,
> 
> >     for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
> >       S6(scat_1,scat_3);
> >     }
> 
> this is still valid if the type of "4294967295*scat_1" is unsigned.

If 4294967295*scat_1+T_51-1 is always the symbolic number of
iterations then it will be always >= 0, right?  I still do not
quite understand where and how "types" enter the picture for
graphite here - if the niter expression was scat_1 + T_51 with
both unsigned then you'd still have to truncate to the result
types precision in case the polyhedral model internally has
infinite precision.  So I don't think -1U is in any way special
(it probably just appears more often, and we could avoid some
of the issues with folding the above to T_51 - 1 - scat_1).

> That would fix only -fgraphite-identity: we also have to watch out for
> operations on the polyhedral representation that would use -1U in
> other computations, and here I'm thinking about everything we have
> implemented on the polyhedral representation: dependence test,
> counting the number of points, i.e., all the heuristics, etc.
> 
> When disabling Graphite on all unsigned niter expressions, we get
> the following fails:

I think niter expressions are unsigned simply because niter will
always be >= 0.  But the issue doesn't seem to be the unsignedness of 
niter but the fact that the symbolic expression is computed with
unsigned arithmetic?

> FAIL: gcc.dg/graphite/scop-0.c scan-tree-dump-times graphite "number
> of SCoPs: 1" 1

Where I wonder why we end up with unsigned arithmetic for this
testcase for example.  2 * N + 100 is surely all signed.

[...]

> So the only solution that I can see is to implement the niter analysis
> as the resolution of a constraint system, and that would avoid creating
> the unsigned expressions.

So maybe we can instead try to avoid using unsigned arithmetic
for symbolic niters if the source does not have it unsigned?

Richard.

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 65676cb..f7e2f7d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@ 
 2011-07-23  Sebastian Pop  <sebastian.pop@amd.com>
 
+	PR middle-end/47594
+	* graphite-sese-to-poly.c (scan_tree_for_params_right_scev): Sign
+	extend constants.
+	(scan_tree_for_params): Same.
+
+2011-07-23  Sebastian Pop  <sebastian.pop@amd.com>
+
 	* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Build an unsigned
 	iv only when the largest type is unsigned.  Do not call
 	lang_hooks.types.type_for_size.
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 7e23c9d..5c9e984 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -633,7 +633,11 @@  scan_tree_for_params_right_scev (sese s, tree e, int var,
       gcc_assert (TREE_CODE (e) == INTEGER_CST);
 
       mpz_init (val);
-      tree_int_to_gmp (e, val);
+
+      /* Necessary to not get "-1 = 2^n - 1". */
+      mpz_set_double_int
+	(val, double_int_sext (tree_to_double_int (e),
+			       TYPE_PRECISION (TREE_TYPE (e))), false);
       add_value_to_dim (l, expr, val);
       mpz_clear (val);
     }
@@ -729,9 +733,16 @@  scan_tree_for_params (sese s, tree e, ppl_Linear_Expression_t c,
 	  if (c)
 	    {
 	      mpz_t val;
-	      gcc_assert (host_integerp (TREE_OPERAND (e, 1), 0));
+	      tree cst = TREE_OPERAND (e, 1);
+
+	      gcc_assert (host_integerp (cst, 0));
 	      mpz_init (val);
-	      tree_int_to_gmp (TREE_OPERAND (e, 1), val);
+
+	      /* Necessary to not get "-1 = 2^n - 1". */
+	      mpz_set_double_int
+		(val, double_int_sext (tree_to_double_int (cst),
+				       TYPE_PRECISION (TREE_TYPE (cst))), false);
+
 	      mpz_mul (val, val, k);
 	      scan_tree_for_params (s, TREE_OPERAND (e, 0), c, val);
 	      mpz_clear (val);
@@ -744,9 +755,16 @@  scan_tree_for_params (sese s, tree e, ppl_Linear_Expression_t c,
 	  if (c)
 	    {
 	      mpz_t val;
+	      tree cst = TREE_OPERAND (e, 0);
+
 	      gcc_assert (host_integerp (TREE_OPERAND (e, 0), 0));
 	      mpz_init (val);
-	      tree_int_to_gmp (TREE_OPERAND (e, 0), val);
+
+	      /* Necessary to not get "-1 = 2^n - 1". */
+	      mpz_set_double_int
+		(val, double_int_sext (tree_to_double_int (cst),
+				       TYPE_PRECISION (TREE_TYPE (cst))), false);
+
 	      mpz_mul (val, val, k);
 	      scan_tree_for_params (s, TREE_OPERAND (e, 1), c, val);
 	      mpz_clear (val);
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 1f93f4c..b7c2be3 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,10 @@ 
 2011-07-23  Sebastian Pop  <sebastian.pop@amd.com>
 
+	PR middle-end/47594
+	* gfortran.dg/graphite/run-id-pr47594.f90: New.
+
+2011-07-23  Sebastian Pop  <sebastian.pop@amd.com>
+
 	PR middle-end/47653
 	* gcc.dg/graphite/run-id-pr47653.c: New.
 	* gcc.dg/graphite/interchange-3.c: Do not use unsigned types for
diff --git a/gcc/testsuite/gfortran.dg/graphite/run-id-pr47594.f90 b/gcc/testsuite/gfortran.dg/graphite/run-id-pr47594.f90
new file mode 100644
index 0000000..7f36fc6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/graphite/run-id-pr47594.f90
@@ -0,0 +1,35 @@ 
+! { dg-options "-O2 -fgraphite-identity" }
+
+        Subroutine foo (N, M)
+        Integer N
+        Integer M
+        integer A(8,16)
+        integer B(8)
+
+        B = (/ 2, 3, 5, 7, 11, 13, 17, 23 /)
+
+        do I = 1, N
+          do J = I, M
+            A(J,2) = B(J)
+          end do
+        end do
+
+        do I = 1, N
+          do J = I, M
+            if (A(J,2) /= B(J)) then
+              call abort ()
+              endif
+          end do
+        end do
+
+        Return
+        end
+
+
+        program main
+
+        Call foo (16, 8)
+
+        stop
+        end
+