diff mbox

[Fortran] assumed-rank some bound intrinsics support, fix failures and improve diagnostcs

Message ID 501159F1.5060704@sfr.fr
State New
Headers show

Commit Message

Mikael Morin July 26, 2012, 2:53 p.m. UTC
On 21/07/2012 13:08, Tobias Burnus wrote:
> Only failing are:
>   lbound(x) / ubound(x) / shape(x)
> 
Here is a draft for those.
Lightly tested with print *, ...

Mikael

Comments

Mikael Morin July 26, 2012, 3:12 p.m. UTC | #1
On 26/07/2012 16:53, Mikael Morin wrote:
> On 21/07/2012 13:08, Tobias Burnus wrote:
>> Only failing are:
>>   lbound(x) / ubound(x) / shape(x)
>>
> Here is a draft for those.
> Lightly tested with print *, ...
> 
Better with the tests.


$ ./test1
           1           1
           3           8
           3           8

$ ./test2
          11         101
          13         108
           3           8
Tobias Burnus July 26, 2012, 3:32 p.m. UTC | #2
On 07/26/2012 05:12 PM, Mikael Morin wrote:
> On 26/07/2012 16:53, Mikael Morin wrote:
>> Here is a draft for those. Lightly tested with print *, ... 

Looks rather nice. The output for test1 is also  good:

   integer :: a(1:3,-2:5)
gives
   lbound(arg) == [1, 1]
   ubound(arg) == [3, 8]
   shape(arg) == [3, 8]

However, if the dummy is allocatable or a pointer, the result should be:

   lbound(arg) == [1, -2]
   ubound(arg) == [3, 5]
   shape(arg) == [3, 8]

which your second test case doesn't give. (At least that's how I understand TS and F2008.)

Except for that issue, I like your patch. Thanks!

Tobias
Mikael Morin July 27, 2012, 5:26 p.m. UTC | #3
On 26/07/2012 17:32, Tobias Burnus wrote:
> On 07/26/2012 05:12 PM, Mikael Morin wrote:
>> On 26/07/2012 16:53, Mikael Morin wrote:
>>> Here is a draft for those. Lightly tested with print *, ... 
> 
> Looks rather nice. The output for test1 is also  good:
> 
>   integer :: a(1:3,-2:5)
> gives
>   lbound(arg) == [1, 1]
>   ubound(arg) == [3, 8]
>   shape(arg) == [3, 8]
> 
> However, if the dummy is allocatable or a pointer, the result should be:
> 
>   lbound(arg) == [1, -2]
>   ubound(arg) == [3, 5]
>   shape(arg) == [3, 8]
> 
> which your second test case doesn't give.

Hello,

do you have a test case exhibiting the problem?
It seems fine to me.

$ ./test1
           1           1
           3           8
           3           8
           1           1
           3           8
           3           8
           1          -2
           3           5
           3           8
           1          -2
           3           5
           3           8

./test2
          11         101
          13         108
           3           8
          11          97
          12         106
           2          10
          13          99
          15         110
           3          12
Tobias Burnus Aug. 1, 2012, 10 a.m. UTC | #4
On 07/27/2012 07:26 PM, Mikael Morin wrote:
> do you have a test case exhibiting the problem? It seems fine to me.

Your second test case was too convoluted for me - and as I wasn't at 
home, I couldn't test it. I now believe that your patch is okay; I will 
later formally review  it.

Do you intent to wrap it for final inclusion? I think it only lacks a 
dejaGNUified test case and a changelog.

  * * *

However, I found another spot where one needs to have a scalarizer; 
possibly a poor man's version is enough. Namely INTENT(OUT) handling. Do 
you have an idea how to best handle that case?

program test
   implicit none
   type t
     integer, allocatable :: a
   end type t
   type(t) :: b(4,6)
   integer :: i, j

   do j = 1, 6
     do i = 1, 4
       allocate (b(i,j)%a)
     end do
   end do

   call sub (b(::2,::3))
   do j = 1, 6
     do i = 1, 4
       print *, i, j, allocated (b(i,j)%a)
!      if (allocated (b(i,j)%a) .neqv. (mod (i-1,2) /= 0 .or. mod 
(j-1,3) /= 0))&
!        call abort ()
     end do
   end do
contains
   subroutine sub (x)
     type(t), intent(out) :: x(..)
   end subroutine sub
end program test


Tobias

PS: Note to self: Reject passing an assumed-size array to an INTENT(OUT) 
assumed-rank array, at least if it is "polymorphic, finalizable, of a 
type with an allocatable ultimate component, or of a type for which 
default initialization is specified." [TS29113 seems to allow it, but 
one needs some check similar to F2008's C534. A constraint is not enough 
as it doesn't cover all cases, but the rest is the user's responsibility.]
Mikael Morin Aug. 1, 2012, 11:37 a.m. UTC | #5
On 01/08/2012 12:00, Tobias Burnus wrote:
> On 07/27/2012 07:26 PM, Mikael Morin wrote:
>> do you have a test case exhibiting the problem? It seems fine to me.
> 
> Your second test case was too convoluted for me - and as I wasn't at
> home, I couldn't test it. I now believe that your patch is okay; I will
> later formally review  it.
I will formally ask for it. ;-)
> 
> Do you intent to wrap it for final inclusion? I think it only lacks a
> dejaGNUified test case and a changelog.
Will do.
> 
>  * * *
> 
> However, I found another spot where one needs to have a scalarizer;
> possibly a poor man's version is enough. Namely INTENT(OUT) handling.
Indeed.
> Do you have an idea how to best handle that case?
It seems some new code is necessary. I don't know how well it will
fit/reuse the existing though.

I have been thinking about rewriting the scalarizer in a way that would
need less bookkeeping to make things work. Nothing near a patch though,
and it's not something for 4.8. Anyway, here is the interface I had in mind:

gfc_init_loopinfo (loopinfo);
/* generate the code.  */
gfc_conv_expr (loopinfo, expr1);
gfc_conv_expr (loopinfo, expr2);
/* etc, do something, putting loopinfo every time as parameter so that
it is populated appropriately... and then: */
loopblock = gfc_scalarize (loopinfo);

the gfc_scalarize could have a big `if' in it distinguishing known rank
from assumed rank.
Maybe we could take the opportunity to make a seed for a new scalarizer.

Mikael
Tobias Burnus Aug. 1, 2012, 12:37 p.m. UTC | #6
On 08/01/2012 01:37 PM, Mikael Morin wrote:
>> However, I found another spot where one needs to have a scalarizer;
>> possibly a poor man's version is enough. Namely INTENT(OUT) handling.
> Indeed.
>> Do you have an idea how to best handle that case?
> It seems some new code is necessary. I don't know how well it will
> fit/reuse the existing though.

I think we should try to get this working in some way for 4.8 as 
assumed-rank arrays will be used for the finalization wrapper - and it 
would be awesome to have FINAL support in 4.8.

Background: As it is unknown (at compile time) whether a polymorphic 
variable has no final subroutines or one for that rank or an elemental 
one (or some but no suitable ones) - and as there could be a different 
combination for the parent type, the current plan is to add a _final 
proc-pointer to the vtable, which points to a final wrapper procedure 
for that type. It takes (at least for arrays) an assumed-rank array and 
dispatches the calls based on the rank; for an elemental final 
subroutine, it has to "scalarize it". [It's simple to add a special case 
as the array is contiguous - one just needs to "call 
elemental(base_address + i*elem_size)", where i = 
1,size(assumed-size-array).]

And for finalization, it would be great if one could use the INTENT(OUT) 
support. One could alternatively implement it manually on the gfortran 
AST level (gfc_code/gfc_expr) by walking through the derived type or one 
could implement a simplified version, making use of the contiguity of 
the finalized variable.


> I have been thinking about rewriting the scalarizer in a way that would
> need less bookkeeping to make things work. Nothing near a patch though,
> and it's not something for 4.8.

I think it would be good to base it on the new array descriptor, which 
we hopefully have by that time. Additionally, we should consider to support:

a) ARRAY_RANGE_REF: That's probably somewhat independent of 
scalarization, but replaces it in some cases:
   A(:,:,5) = B(:)
can be implemented as ARRAY_RANGE_REF, if the memory is contiguous; one 
just passes an offset and (via the decl) the size of the array 
(section). See trans-expr.c for one example. A range ref is better than 
a memcpy/memmove or a loop - as the first looses the data type and some 
alias information and the second represents the structure in a more 
convoluted way. Either could be recovered by the middle end, but it 
currently isn't and doing it correctly from the beginning makes the ME 
life easier.

b) Middle-end arrays. Richard made an initial patch, cf. 
http://gcc.gnu.org/wiki/GCCGathering2011Fortran . It probably needs some 
polishing and some optimizations have to be implemented, but then it 
should work and allow for further optimizations. [Description in the 
wiki might be partially wrong; blame me - and correct it, if you find 
something.]

I think the latter requires also some thinking about how to handle 
arrays internally: In that case, the array has - at least for the 
scalarization - more than one rank (for the ME) while gfortran normally 
folds everything to rank-1 arrays. Additionally, one needs to think 
about the case where the array has nonunit strides, i.e. where the 
leftmost stride is not sizeof(declared type) but larger by a noninteger 
amount. (e.g. passing a polymorphic array to a TYPE.)

Tobias
Richard Biener Aug. 1, 2012, 12:52 p.m. UTC | #7
On Wed, Aug 1, 2012 at 2:37 PM, Tobias Burnus <burnus@net-b.de> wrote:
> On 08/01/2012 01:37 PM, Mikael Morin wrote:
>>>
>>> However, I found another spot where one needs to have a scalarizer;
>>> possibly a poor man's version is enough. Namely INTENT(OUT) handling.
>>
>> Indeed.
>>>
>>> Do you have an idea how to best handle that case?
>>
>> It seems some new code is necessary. I don't know how well it will
>> fit/reuse the existing though.
>
>
> I think we should try to get this working in some way for 4.8 as
> assumed-rank arrays will be used for the finalization wrapper - and it would
> be awesome to have FINAL support in 4.8.
>
> Background: As it is unknown (at compile time) whether a polymorphic
> variable has no final subroutines or one for that rank or an elemental one
> (or some but no suitable ones) - and as there could be a different
> combination for the parent type, the current plan is to add a _final
> proc-pointer to the vtable, which points to a final wrapper procedure for
> that type. It takes (at least for arrays) an assumed-rank array and
> dispatches the calls based on the rank; for an elemental final subroutine,
> it has to "scalarize it". [It's simple to add a special case as the array is
> contiguous - one just needs to "call elemental(base_address + i*elem_size)",
> where i = 1,size(assumed-size-array).]
>
> And for finalization, it would be great if one could use the INTENT(OUT)
> support. One could alternatively implement it manually on the gfortran AST
> level (gfc_code/gfc_expr) by walking through the derived type or one could
> implement a simplified version, making use of the contiguity of the
> finalized variable.
>
>
>
>> I have been thinking about rewriting the scalarizer in a way that would
>> need less bookkeeping to make things work. Nothing near a patch though,
>> and it's not something for 4.8.
>
>
> I think it would be good to base it on the new array descriptor, which we
> hopefully have by that time. Additionally, we should consider to support:
>
> a) ARRAY_RANGE_REF: That's probably somewhat independent of scalarization,
> but replaces it in some cases:
>   A(:,:,5) = B(:)
> can be implemented as ARRAY_RANGE_REF, if the memory is contiguous; one just
> passes an offset and (via the decl) the size of the array (section). See
> trans-expr.c for one example. A range ref is better than a memcpy/memmove or
> a loop - as the first looses the data type and some alias information and
> the second represents the structure in a more convoluted way. Either could
> be recovered by the middle end, but it currently isn't and doing it
> correctly from the beginning makes the ME life easier.
>
> b) Middle-end arrays. Richard made an initial patch, cf.
> http://gcc.gnu.org/wiki/GCCGathering2011Fortran . It probably needs some
> polishing and some optimizations have to be implemented, but then it should
> work and allow for further optimizations. [Description in the wiki might be
> partially wrong; blame me - and correct it, if you find something.]

Well, I wouldn't concentrate on this one ;)

> I think the latter requires also some thinking about how to handle arrays
> internally: In that case, the array has - at least for the scalarization -
> more than one rank (for the ME) while gfortran normally folds everything to
> rank-1 arrays. Additionally, one needs to think about the case where the
> array has nonunit strides, i.e. where the leftmost stride is not
> sizeof(declared type) but larger by a noninteger amount. (e.g. passing a
> polymorphic array to a TYPE.)

c) Do _not_ fold everything to rank-1 arrays (this makes data
dependence analysis
harder).  If you know the rank of an array use an intermediate array
pointer type
to access the data, like the following C example:

void foo (void *data, int n, int m)
{
  int (*a)[n][m] = (int (*)[n][m]) data;
  int i, j;
  for (i = 0; i < n; ++i)
    for (j = 0; j < m; ++j)
      (*a)[i][j] = 0;
}


d) Think about Frontend optimizations again - using the ISL part of GRAPHITE
on the GFortran IL, possibly driving the scalarizer with the result.

Richard.

> Tobias
diff mbox

Patch

Index: trans-array.c
===================================================================
--- trans-array.c	(révision 189883)
+++ trans-array.c	(copie de travail)
@@ -249,6 +249,20 @@  gfc_conv_descriptor_dtype (tree desc)
 
 
 tree
+gfc_conv_descriptor_rank (tree desc)
+{
+  tree tmp;
+  tree dtype;
+
+  dtype = gfc_conv_descriptor_dtype (desc);
+  tmp = build_int_cst (TREE_TYPE (dtype), GFC_DTYPE_RANK_MASK);
+  tmp = fold_build2_loc (input_location, BIT_AND_EXPR, TREE_TYPE (dtype),
+			 dtype, tmp);
+  return fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp);
+}
+
+
+tree
 gfc_get_descriptor_dimension (tree desc)
 {
   tree type, field;
@@ -3794,6 +3808,40 @@  done:
 	    /* Fall through to supply start and stride.  */
 	    case GFC_ISYM_LBOUND:
 	    case GFC_ISYM_UBOUND:
+	      {
+		gfc_expr *arg;
+
+		/* This is the variant without DIM=...  */
+		gcc_assert (expr->value.function.actual->next->expr == NULL);
+
+		arg = expr->value.function.actual->expr;
+		if (arg->rank == -1)
+		  {
+		    gfc_se se;
+		    tree rank, tmp;
+
+		    /* The rank (hence the return value's shape) is unknown,
+		       we have to retrieve it.  */
+		    gfc_init_se (&se, NULL);
+		    se.descriptor_only = 1;
+		    gfc_conv_expr (&se, arg);
+		    /* This is a bare variable, so there is no preliminary
+		       or cleanup code.  */
+		    gcc_assert (se.pre.head == NULL_TREE
+				&& se.post.head == NULL_TREE);
+		    rank = gfc_conv_descriptor_rank (se.expr);
+		    tmp = fold_build2_loc (input_location, MINUS_EXPR,
+					   gfc_array_index_type,
+					   fold_convert (gfc_array_index_type,
+							 rank),
+					   gfc_index_one_node);
+		    info->end[0] = gfc_evaluate_now (tmp, &loop->pre);
+		    info->start[0] = gfc_index_zero_node;
+		    info->stride[0] = gfc_index_one_node;
+		    continue;
+		  }
+		  /* Otherwise fall through GFC_SS_FUNCTION.  */
+	      }
 	    case GFC_ISYM_LCOBOUND:
 	    case GFC_ISYM_UCOBOUND:
 	    case GFC_ISYM_THIS_IMAGE:
@@ -4430,22 +4478,11 @@  set_loop_bounds (gfc_loopinfo *loop)
 	      continue;
 	    }
 
-	  /* TODO: Pick the best bound if we have a choice between a
-	     function and something else.  */
-	  if (ss_type == GFC_SS_FUNCTION)
-	    {
-	      loopspec[n] = ss;
-	      continue;
-	    }
-
 	  /* Avoid using an allocatable lhs in an assignment, since
 	     there might be a reallocation coming.  */
 	  if (loopspec[n] && ss->is_alloc_lhs)
 	    continue;
 
-	  if (ss_type != GFC_SS_SECTION)
-	    continue;
-
 	  if (!loopspec[n])
 	    loopspec[n] = ss;
 	  /* Criteria for choosing a loop specifier (most important first):
@@ -4520,6 +4557,20 @@  set_loop_bounds (gfc_loopinfo *loop)
 	      gcc_assert (loop->to[n] == NULL_TREE);
 	      break;
 
+	    case GFC_SS_INTRINSIC:
+	      {
+		gfc_expr *expr = loopspec[n]->info->expr;
+
+		/* The {l,u}bound of an assumed rank.  */
+		gcc_assert ((expr->value.function.isym->id == GFC_ISYM_LBOUND
+			     || expr->value.function.isym->id == GFC_ISYM_UBOUND)
+			     && expr->value.function.actual->next->expr == NULL
+			     && expr->value.function.actual->expr->rank == -1);
+
+		loop->to[n] = info->end[dim];
+		break;
+	      }
+
 	    default:
 	      gcc_unreachable ();
 	    }
Index: trans-array.h
===================================================================
--- trans-array.h	(révision 189881)
+++ trans-array.h	(copie de travail)
@@ -154,6 +154,7 @@  tree gfc_conv_descriptor_data_get (tree);
 tree gfc_conv_descriptor_data_addr (tree);
 tree gfc_conv_descriptor_offset_get (tree);
 tree gfc_conv_descriptor_dtype (tree);
+tree gfc_conv_descriptor_rank (tree);
 tree gfc_get_descriptor_dimension (tree);
 tree gfc_conv_descriptor_stride_get (tree, tree);
 tree gfc_conv_descriptor_lbound_get (tree, tree);
Index: iresolve.c
===================================================================
--- iresolve.c	(révision 189881)
+++ iresolve.c	(copie de travail)
@@ -134,9 +134,12 @@  resolve_bound (gfc_expr *f, gfc_expr *array, gfc_e
   if (dim == NULL)
     {
       f->rank = 1;
-      f->shape = gfc_get_shape (1);
-      mpz_init_set_ui (f->shape[0], coarray ? gfc_get_corank (array)
-					    : array->rank);
+      if (array->rank != -1)
+	{
+	  f->shape = gfc_get_shape (1);
+	  mpz_init_set_ui (f->shape[0], coarray ? gfc_get_corank (array)
+						: array->rank);
+	}
     }
 
   f->value.function.name = xstrdup (name);
@@ -2225,8 +2228,12 @@  gfc_resolve_shape (gfc_expr *f, gfc_expr *array, g
     f->ts.kind = gfc_default_integer_kind;
 
   f->rank = 1;
-  f->shape = gfc_get_shape (1);
-  mpz_init_set_ui (f->shape[0], array->rank);
+  if (array->rank != -1)
+    {
+      f->shape = gfc_get_shape (1);
+      mpz_init_set_ui (f->shape[0], array->rank);
+    }
+ 
   f->value.function.name = gfc_get_string (PREFIX ("shape_%d"), f->ts.kind);
 }
 
Index: trans-intrinsic.c
===================================================================
--- trans-intrinsic.c	(révision 189881)
+++ trans-intrinsic.c	(copie de travail)
@@ -1315,20 +1315,6 @@  trans_num_images (gfc_se * se)
 }
 
 
-static tree
-get_rank_from_desc (tree desc)
-{
-  tree tmp;
-  tree dtype;
-
-  dtype = gfc_conv_descriptor_dtype (desc);
-  tmp = build_int_cst (TREE_TYPE (dtype), GFC_DTYPE_RANK_MASK);
-  tmp = fold_build2_loc (input_location, BIT_AND_EXPR, TREE_TYPE (dtype),
-			 dtype, tmp);
-  return fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp);
-}
-
-
 static void
 gfc_conv_intrinsic_rank (gfc_se *se, gfc_expr *expr)
 {
@@ -1345,7 +1331,7 @@  gfc_conv_intrinsic_rank (gfc_se *se, gfc_expr *exp
   gfc_add_block_to_block (&se->pre, &argse.pre);
   gfc_add_block_to_block (&se->post, &argse.post);
 
-  se->expr = get_rank_from_desc (argse.expr);
+  se->expr = gfc_conv_descriptor_rank (argse.expr);
 }
 
 
@@ -1434,7 +1420,7 @@  gfc_conv_intrinsic_bound (gfc_se * se, gfc_expr *
           cond = fold_build2_loc (input_location, LT_EXPR, boolean_type_node,
 				  bound, build_int_cst (TREE_TYPE (bound), 0));
 	  if (as && as->type == AS_ASSUMED_RANK)
-	    tmp = get_rank_from_desc (desc);
+	    tmp = gfc_conv_descriptor_rank (desc);
 	  else
 	    tmp = gfc_rank_cst[GFC_TYPE_ARRAY_RANK (TREE_TYPE (desc))];
           tmp = fold_build2_loc (input_location, GE_EXPR, boolean_type_node,
@@ -5895,7 +5881,7 @@  gfc_conv_associated (gfc_se *se, gfc_expr *expr)
 	  gfc_conv_expr_lhs (&arg1se, arg1->expr);
 	  if (arg1->expr->rank == -1)
 	    {
-	      tmp = get_rank_from_desc (arg1se.expr);
+	      tmp = gfc_conv_descriptor_rank (arg1se.expr);
 	      tmp = fold_build2_loc (input_location, MINUS_EXPR,
 				     TREE_TYPE (tmp), tmp, gfc_index_one_node);
 	    }
Index: simplify.c
===================================================================
--- simplify.c	(révision 189881)
+++ simplify.c	(copie de travail)
@@ -5470,6 +5470,9 @@  gfc_simplify_shape (gfc_expr *source, gfc_expr *ki
   gfc_try t;
   int k = get_kind (BT_INTEGER, kind, "SHAPE", gfc_default_integer_kind);
 
+  if (source->rank == -1)
+    return NULL;
+
   result = gfc_get_array_expr (BT_INTEGER, k, &source->where);
 
   if (source->rank == 0)