Message ID | 501159F1.5060704@sfr.fr |
---|---|
State | New |
Headers | show |
On 26/07/2012 16:53, Mikael Morin wrote: > On 21/07/2012 13:08, Tobias Burnus wrote: >> Only failing are: >> lbound(x) / ubound(x) / shape(x) >> > Here is a draft for those. > Lightly tested with print *, ... > Better with the tests. $ ./test1 1 1 3 8 3 8 $ ./test2 11 101 13 108 3 8
On 07/26/2012 05:12 PM, Mikael Morin wrote: > On 26/07/2012 16:53, Mikael Morin wrote: >> Here is a draft for those. Lightly tested with print *, ... Looks rather nice. The output for test1 is also good: integer :: a(1:3,-2:5) gives lbound(arg) == [1, 1] ubound(arg) == [3, 8] shape(arg) == [3, 8] However, if the dummy is allocatable or a pointer, the result should be: lbound(arg) == [1, -2] ubound(arg) == [3, 5] shape(arg) == [3, 8] which your second test case doesn't give. (At least that's how I understand TS and F2008.) Except for that issue, I like your patch. Thanks! Tobias
On 26/07/2012 17:32, Tobias Burnus wrote: > On 07/26/2012 05:12 PM, Mikael Morin wrote: >> On 26/07/2012 16:53, Mikael Morin wrote: >>> Here is a draft for those. Lightly tested with print *, ... > > Looks rather nice. The output for test1 is also good: > > integer :: a(1:3,-2:5) > gives > lbound(arg) == [1, 1] > ubound(arg) == [3, 8] > shape(arg) == [3, 8] > > However, if the dummy is allocatable or a pointer, the result should be: > > lbound(arg) == [1, -2] > ubound(arg) == [3, 5] > shape(arg) == [3, 8] > > which your second test case doesn't give. Hello, do you have a test case exhibiting the problem? It seems fine to me. $ ./test1 1 1 3 8 3 8 1 1 3 8 3 8 1 -2 3 5 3 8 1 -2 3 5 3 8 ./test2 11 101 13 108 3 8 11 97 12 106 2 10 13 99 15 110 3 12
On 07/27/2012 07:26 PM, Mikael Morin wrote:
> do you have a test case exhibiting the problem? It seems fine to me.
Your second test case was too convoluted for me - and as I wasn't at
home, I couldn't test it. I now believe that your patch is okay; I will
later formally review it.
Do you intent to wrap it for final inclusion? I think it only lacks a
dejaGNUified test case and a changelog.
* * *
However, I found another spot where one needs to have a scalarizer;
possibly a poor man's version is enough. Namely INTENT(OUT) handling. Do
you have an idea how to best handle that case?
program test
implicit none
type t
integer, allocatable :: a
end type t
type(t) :: b(4,6)
integer :: i, j
do j = 1, 6
do i = 1, 4
allocate (b(i,j)%a)
end do
end do
call sub (b(::2,::3))
do j = 1, 6
do i = 1, 4
print *, i, j, allocated (b(i,j)%a)
! if (allocated (b(i,j)%a) .neqv. (mod (i-1,2) /= 0 .or. mod
(j-1,3) /= 0))&
! call abort ()
end do
end do
contains
subroutine sub (x)
type(t), intent(out) :: x(..)
end subroutine sub
end program test
Tobias
PS: Note to self: Reject passing an assumed-size array to an INTENT(OUT)
assumed-rank array, at least if it is "polymorphic, finalizable, of a
type with an allocatable ultimate component, or of a type for which
default initialization is specified." [TS29113 seems to allow it, but
one needs some check similar to F2008's C534. A constraint is not enough
as it doesn't cover all cases, but the rest is the user's responsibility.]
On 01/08/2012 12:00, Tobias Burnus wrote: > On 07/27/2012 07:26 PM, Mikael Morin wrote: >> do you have a test case exhibiting the problem? It seems fine to me. > > Your second test case was too convoluted for me - and as I wasn't at > home, I couldn't test it. I now believe that your patch is okay; I will > later formally review it. I will formally ask for it. ;-) > > Do you intent to wrap it for final inclusion? I think it only lacks a > dejaGNUified test case and a changelog. Will do. > > * * * > > However, I found another spot where one needs to have a scalarizer; > possibly a poor man's version is enough. Namely INTENT(OUT) handling. Indeed. > Do you have an idea how to best handle that case? It seems some new code is necessary. I don't know how well it will fit/reuse the existing though. I have been thinking about rewriting the scalarizer in a way that would need less bookkeeping to make things work. Nothing near a patch though, and it's not something for 4.8. Anyway, here is the interface I had in mind: gfc_init_loopinfo (loopinfo); /* generate the code. */ gfc_conv_expr (loopinfo, expr1); gfc_conv_expr (loopinfo, expr2); /* etc, do something, putting loopinfo every time as parameter so that it is populated appropriately... and then: */ loopblock = gfc_scalarize (loopinfo); the gfc_scalarize could have a big `if' in it distinguishing known rank from assumed rank. Maybe we could take the opportunity to make a seed for a new scalarizer. Mikael
On 08/01/2012 01:37 PM, Mikael Morin wrote: >> However, I found another spot where one needs to have a scalarizer; >> possibly a poor man's version is enough. Namely INTENT(OUT) handling. > Indeed. >> Do you have an idea how to best handle that case? > It seems some new code is necessary. I don't know how well it will > fit/reuse the existing though. I think we should try to get this working in some way for 4.8 as assumed-rank arrays will be used for the finalization wrapper - and it would be awesome to have FINAL support in 4.8. Background: As it is unknown (at compile time) whether a polymorphic variable has no final subroutines or one for that rank or an elemental one (or some but no suitable ones) - and as there could be a different combination for the parent type, the current plan is to add a _final proc-pointer to the vtable, which points to a final wrapper procedure for that type. It takes (at least for arrays) an assumed-rank array and dispatches the calls based on the rank; for an elemental final subroutine, it has to "scalarize it". [It's simple to add a special case as the array is contiguous - one just needs to "call elemental(base_address + i*elem_size)", where i = 1,size(assumed-size-array).] And for finalization, it would be great if one could use the INTENT(OUT) support. One could alternatively implement it manually on the gfortran AST level (gfc_code/gfc_expr) by walking through the derived type or one could implement a simplified version, making use of the contiguity of the finalized variable. > I have been thinking about rewriting the scalarizer in a way that would > need less bookkeeping to make things work. Nothing near a patch though, > and it's not something for 4.8. I think it would be good to base it on the new array descriptor, which we hopefully have by that time. Additionally, we should consider to support: a) ARRAY_RANGE_REF: That's probably somewhat independent of scalarization, but replaces it in some cases: A(:,:,5) = B(:) can be implemented as ARRAY_RANGE_REF, if the memory is contiguous; one just passes an offset and (via the decl) the size of the array (section). See trans-expr.c for one example. A range ref is better than a memcpy/memmove or a loop - as the first looses the data type and some alias information and the second represents the structure in a more convoluted way. Either could be recovered by the middle end, but it currently isn't and doing it correctly from the beginning makes the ME life easier. b) Middle-end arrays. Richard made an initial patch, cf. http://gcc.gnu.org/wiki/GCCGathering2011Fortran . It probably needs some polishing and some optimizations have to be implemented, but then it should work and allow for further optimizations. [Description in the wiki might be partially wrong; blame me - and correct it, if you find something.] I think the latter requires also some thinking about how to handle arrays internally: In that case, the array has - at least for the scalarization - more than one rank (for the ME) while gfortran normally folds everything to rank-1 arrays. Additionally, one needs to think about the case where the array has nonunit strides, i.e. where the leftmost stride is not sizeof(declared type) but larger by a noninteger amount. (e.g. passing a polymorphic array to a TYPE.) Tobias
On Wed, Aug 1, 2012 at 2:37 PM, Tobias Burnus <burnus@net-b.de> wrote: > On 08/01/2012 01:37 PM, Mikael Morin wrote: >>> >>> However, I found another spot where one needs to have a scalarizer; >>> possibly a poor man's version is enough. Namely INTENT(OUT) handling. >> >> Indeed. >>> >>> Do you have an idea how to best handle that case? >> >> It seems some new code is necessary. I don't know how well it will >> fit/reuse the existing though. > > > I think we should try to get this working in some way for 4.8 as > assumed-rank arrays will be used for the finalization wrapper - and it would > be awesome to have FINAL support in 4.8. > > Background: As it is unknown (at compile time) whether a polymorphic > variable has no final subroutines or one for that rank or an elemental one > (or some but no suitable ones) - and as there could be a different > combination for the parent type, the current plan is to add a _final > proc-pointer to the vtable, which points to a final wrapper procedure for > that type. It takes (at least for arrays) an assumed-rank array and > dispatches the calls based on the rank; for an elemental final subroutine, > it has to "scalarize it". [It's simple to add a special case as the array is > contiguous - one just needs to "call elemental(base_address + i*elem_size)", > where i = 1,size(assumed-size-array).] > > And for finalization, it would be great if one could use the INTENT(OUT) > support. One could alternatively implement it manually on the gfortran AST > level (gfc_code/gfc_expr) by walking through the derived type or one could > implement a simplified version, making use of the contiguity of the > finalized variable. > > > >> I have been thinking about rewriting the scalarizer in a way that would >> need less bookkeeping to make things work. Nothing near a patch though, >> and it's not something for 4.8. > > > I think it would be good to base it on the new array descriptor, which we > hopefully have by that time. Additionally, we should consider to support: > > a) ARRAY_RANGE_REF: That's probably somewhat independent of scalarization, > but replaces it in some cases: > A(:,:,5) = B(:) > can be implemented as ARRAY_RANGE_REF, if the memory is contiguous; one just > passes an offset and (via the decl) the size of the array (section). See > trans-expr.c for one example. A range ref is better than a memcpy/memmove or > a loop - as the first looses the data type and some alias information and > the second represents the structure in a more convoluted way. Either could > be recovered by the middle end, but it currently isn't and doing it > correctly from the beginning makes the ME life easier. > > b) Middle-end arrays. Richard made an initial patch, cf. > http://gcc.gnu.org/wiki/GCCGathering2011Fortran . It probably needs some > polishing and some optimizations have to be implemented, but then it should > work and allow for further optimizations. [Description in the wiki might be > partially wrong; blame me - and correct it, if you find something.] Well, I wouldn't concentrate on this one ;) > I think the latter requires also some thinking about how to handle arrays > internally: In that case, the array has - at least for the scalarization - > more than one rank (for the ME) while gfortran normally folds everything to > rank-1 arrays. Additionally, one needs to think about the case where the > array has nonunit strides, i.e. where the leftmost stride is not > sizeof(declared type) but larger by a noninteger amount. (e.g. passing a > polymorphic array to a TYPE.) c) Do _not_ fold everything to rank-1 arrays (this makes data dependence analysis harder). If you know the rank of an array use an intermediate array pointer type to access the data, like the following C example: void foo (void *data, int n, int m) { int (*a)[n][m] = (int (*)[n][m]) data; int i, j; for (i = 0; i < n; ++i) for (j = 0; j < m; ++j) (*a)[i][j] = 0; } d) Think about Frontend optimizations again - using the ISL part of GRAPHITE on the GFortran IL, possibly driving the scalarizer with the result. Richard. > Tobias
Index: trans-array.c =================================================================== --- trans-array.c (révision 189883) +++ trans-array.c (copie de travail) @@ -249,6 +249,20 @@ gfc_conv_descriptor_dtype (tree desc) tree +gfc_conv_descriptor_rank (tree desc) +{ + tree tmp; + tree dtype; + + dtype = gfc_conv_descriptor_dtype (desc); + tmp = build_int_cst (TREE_TYPE (dtype), GFC_DTYPE_RANK_MASK); + tmp = fold_build2_loc (input_location, BIT_AND_EXPR, TREE_TYPE (dtype), + dtype, tmp); + return fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp); +} + + +tree gfc_get_descriptor_dimension (tree desc) { tree type, field; @@ -3794,6 +3808,40 @@ done: /* Fall through to supply start and stride. */ case GFC_ISYM_LBOUND: case GFC_ISYM_UBOUND: + { + gfc_expr *arg; + + /* This is the variant without DIM=... */ + gcc_assert (expr->value.function.actual->next->expr == NULL); + + arg = expr->value.function.actual->expr; + if (arg->rank == -1) + { + gfc_se se; + tree rank, tmp; + + /* The rank (hence the return value's shape) is unknown, + we have to retrieve it. */ + gfc_init_se (&se, NULL); + se.descriptor_only = 1; + gfc_conv_expr (&se, arg); + /* This is a bare variable, so there is no preliminary + or cleanup code. */ + gcc_assert (se.pre.head == NULL_TREE + && se.post.head == NULL_TREE); + rank = gfc_conv_descriptor_rank (se.expr); + tmp = fold_build2_loc (input_location, MINUS_EXPR, + gfc_array_index_type, + fold_convert (gfc_array_index_type, + rank), + gfc_index_one_node); + info->end[0] = gfc_evaluate_now (tmp, &loop->pre); + info->start[0] = gfc_index_zero_node; + info->stride[0] = gfc_index_one_node; + continue; + } + /* Otherwise fall through GFC_SS_FUNCTION. */ + } case GFC_ISYM_LCOBOUND: case GFC_ISYM_UCOBOUND: case GFC_ISYM_THIS_IMAGE: @@ -4430,22 +4478,11 @@ set_loop_bounds (gfc_loopinfo *loop) continue; } - /* TODO: Pick the best bound if we have a choice between a - function and something else. */ - if (ss_type == GFC_SS_FUNCTION) - { - loopspec[n] = ss; - continue; - } - /* Avoid using an allocatable lhs in an assignment, since there might be a reallocation coming. */ if (loopspec[n] && ss->is_alloc_lhs) continue; - if (ss_type != GFC_SS_SECTION) - continue; - if (!loopspec[n]) loopspec[n] = ss; /* Criteria for choosing a loop specifier (most important first): @@ -4520,6 +4557,20 @@ set_loop_bounds (gfc_loopinfo *loop) gcc_assert (loop->to[n] == NULL_TREE); break; + case GFC_SS_INTRINSIC: + { + gfc_expr *expr = loopspec[n]->info->expr; + + /* The {l,u}bound of an assumed rank. */ + gcc_assert ((expr->value.function.isym->id == GFC_ISYM_LBOUND + || expr->value.function.isym->id == GFC_ISYM_UBOUND) + && expr->value.function.actual->next->expr == NULL + && expr->value.function.actual->expr->rank == -1); + + loop->to[n] = info->end[dim]; + break; + } + default: gcc_unreachable (); } Index: trans-array.h =================================================================== --- trans-array.h (révision 189881) +++ trans-array.h (copie de travail) @@ -154,6 +154,7 @@ tree gfc_conv_descriptor_data_get (tree); tree gfc_conv_descriptor_data_addr (tree); tree gfc_conv_descriptor_offset_get (tree); tree gfc_conv_descriptor_dtype (tree); +tree gfc_conv_descriptor_rank (tree); tree gfc_get_descriptor_dimension (tree); tree gfc_conv_descriptor_stride_get (tree, tree); tree gfc_conv_descriptor_lbound_get (tree, tree); Index: iresolve.c =================================================================== --- iresolve.c (révision 189881) +++ iresolve.c (copie de travail) @@ -134,9 +134,12 @@ resolve_bound (gfc_expr *f, gfc_expr *array, gfc_e if (dim == NULL) { f->rank = 1; - f->shape = gfc_get_shape (1); - mpz_init_set_ui (f->shape[0], coarray ? gfc_get_corank (array) - : array->rank); + if (array->rank != -1) + { + f->shape = gfc_get_shape (1); + mpz_init_set_ui (f->shape[0], coarray ? gfc_get_corank (array) + : array->rank); + } } f->value.function.name = xstrdup (name); @@ -2225,8 +2228,12 @@ gfc_resolve_shape (gfc_expr *f, gfc_expr *array, g f->ts.kind = gfc_default_integer_kind; f->rank = 1; - f->shape = gfc_get_shape (1); - mpz_init_set_ui (f->shape[0], array->rank); + if (array->rank != -1) + { + f->shape = gfc_get_shape (1); + mpz_init_set_ui (f->shape[0], array->rank); + } + f->value.function.name = gfc_get_string (PREFIX ("shape_%d"), f->ts.kind); } Index: trans-intrinsic.c =================================================================== --- trans-intrinsic.c (révision 189881) +++ trans-intrinsic.c (copie de travail) @@ -1315,20 +1315,6 @@ trans_num_images (gfc_se * se) } -static tree -get_rank_from_desc (tree desc) -{ - tree tmp; - tree dtype; - - dtype = gfc_conv_descriptor_dtype (desc); - tmp = build_int_cst (TREE_TYPE (dtype), GFC_DTYPE_RANK_MASK); - tmp = fold_build2_loc (input_location, BIT_AND_EXPR, TREE_TYPE (dtype), - dtype, tmp); - return fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp); -} - - static void gfc_conv_intrinsic_rank (gfc_se *se, gfc_expr *expr) { @@ -1345,7 +1331,7 @@ gfc_conv_intrinsic_rank (gfc_se *se, gfc_expr *exp gfc_add_block_to_block (&se->pre, &argse.pre); gfc_add_block_to_block (&se->post, &argse.post); - se->expr = get_rank_from_desc (argse.expr); + se->expr = gfc_conv_descriptor_rank (argse.expr); } @@ -1434,7 +1420,7 @@ gfc_conv_intrinsic_bound (gfc_se * se, gfc_expr * cond = fold_build2_loc (input_location, LT_EXPR, boolean_type_node, bound, build_int_cst (TREE_TYPE (bound), 0)); if (as && as->type == AS_ASSUMED_RANK) - tmp = get_rank_from_desc (desc); + tmp = gfc_conv_descriptor_rank (desc); else tmp = gfc_rank_cst[GFC_TYPE_ARRAY_RANK (TREE_TYPE (desc))]; tmp = fold_build2_loc (input_location, GE_EXPR, boolean_type_node, @@ -5895,7 +5881,7 @@ gfc_conv_associated (gfc_se *se, gfc_expr *expr) gfc_conv_expr_lhs (&arg1se, arg1->expr); if (arg1->expr->rank == -1) { - tmp = get_rank_from_desc (arg1se.expr); + tmp = gfc_conv_descriptor_rank (arg1se.expr); tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (tmp), tmp, gfc_index_one_node); } Index: simplify.c =================================================================== --- simplify.c (révision 189881) +++ simplify.c (copie de travail) @@ -5470,6 +5470,9 @@ gfc_simplify_shape (gfc_expr *source, gfc_expr *ki gfc_try t; int k = get_kind (BT_INTEGER, kind, "SHAPE", gfc_default_integer_kind); + if (source->rank == -1) + return NULL; + result = gfc_get_array_expr (BT_INTEGER, k, &source->where); if (source->rank == 0)