PR 78534 Change character length from int to size_t

Message ID 1481536386-8520-1-git-send-email-blomqvist.janne@gmail.com
State New

Commit Message

Janne Blomqvist Dec. 12, 2016, 9:53 a.m. UTC
In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.

I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.

Also, as there are some places in the frontend where negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.

I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.

A caveat here is the testcase char_result_8.f90, which I changed
slightly to work around PR 78757. It's a somewhat obscure corner case,
and while this patch admittedly makes it more likely to trigger this
bug, it's, well, still a corner case.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

frontend:

2016-12-12  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/78534
	PR fortran/66310
	* arith.c (gfc_check_charlen_range): New function.
	(gfc_range_check): Use gfc_check_charlen_range.
	* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
	hardcoded kind.
	(find_intrinsic_vtab): Likewise.
	* expr.c (gfc_get_character_expr): Length parameter of type
	gfc_charlen_t.
	(gfc_get_int_expr): Value argument of type long.
	(gfc_extract_long): New function.
	* gfortran.h (gfc_typespec): New member is_charlen.
	(gfc_charlen_t): New typedef.
	(gfc_expr): Use gfc_charlen_t for character lengths.
	(gfc_size_kind): New extern variable.
	(gfc_extract_long): New prototype.
	(gfc_get_character_expr): Use gfc_charlen_t for character length.
	(gfc_get_int_expr): Use long type for value argument.
	* iresolve.c (gfc_resolve_repeat): Use gfc_charlen_t,
	gfc_charlen_int_kind, set is_charlen.
	* match.c (select_intrinsic_set_tmp): Use long for charlen.
	* module.c (atom_int): Change type from int to HOST_WIDE_INT.
	(parse_integer): Don't complain about large integers.
	(write_atom): Use HOST_WIDE_INT for integers.
	(mio_integer): Handle integer type mismatch.
	(mio_hwi): New function.
	(mio_intrinsic_op): Use HOST_WIDE_INT.
	(mio_array_ref): Likewise.
	(mio_expr): Likewise.
	* resolve.c (resolve_substring): Use get_type_static_bounds.
	(resolve_select_type): Use long for charlen.
	(resolve_charlen): Use long for charlen, get_type_static_bounds.
	* simplify.c (gfc_simplify_repeat): Likewise.
	* target-memory.c (gfc_interpret_character): Use gfc_charlen_t.
	* trans-array.c (get_array_ctor_var_strlen): Use
	gfc_conv_mpz_to_tree_type.
	* trans-const.c (gfc_conv_mpz_to_tree_type): New function.
	* trans-const.h (gfc_conv_mpz_to_tree_type): New prototype.
	* trans-decl.c (create_function_arglist): Assert that length is
	not NULL_TREE.
	* trans-expr.c (gfc_class_len_or_zero_get): Build const of type
	gfc_charlen_type_node.
	(gfc_conv_intrinsic_to_class): Use gfc_charlen_int_kind instead of
	4, fold_convert to correct type.
	(gfc_conv_class_to_class): Build const of type size_type_node for
	size.
	(gfc_copy_class_to_class): Likewise.
	(gfc_conv_string_length): Use same type in expression.
	(gfc_conv_substring): Likewise, use long for charlen.
	(gfc_conv_string_tmp): Make sure len is of the right type.
	(gfc_conv_concat_op): Use same type in expression.
	(gfc_conv_procedure_call): Likewise.
	(alloc_scalar_allocatable_for_subcomponent_assignment):
	fold_convert to right type.
	(gfc_trans_subcomponent_assign): Likewise.
	(trans_class_vptr_len_assignment): Build const of correct type.
	(gfc_trans_pointer_assignment): Likewise.
	(alloc_scalar_allocatable_for_assignment): fold_convert to right
	type in expr.
	(trans_class_assignment): Build const of correct type.
	* trans-intrinsic.c (gfc_conv_associated): Likewise.
	(gfc_conv_intrinsic_repeat): Do calculation in sizetype.
	* trans-io.c (gfc_build_io_library_fndecls): Use
	gfc_charlen_type_node for character lengths.
	* trans-stmt.c (gfc_trans_label_assign): Build const of
	gfc_charlen_type_node.
	(gfc_trans_character_select): Likewise.
	(gfc_trans_allocate): Likewise, don't typecast strlen result.
	(gfc_trans_deallocate): Don't typecast strlen result.
	* trans-types.c (gfc_size_kind): New variable.
	(gfc_init_types): Determine gfc_charlen_int_kind and gfc_size_kind
	from size_type_node.
	* trans-types.h: Include trans.h, tidying.

testsuite:

2016-12-12  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/78534
	PR fortran/66310
	* gfortran.dg/char_result_8.f90: Work around PR 78757 by using an
	integer of kind C_SIZE_T.
	* gfortran.dg/repeat_4.f90: Use integers of kind C_SIZE_T.
	* gfortran.dg/repeat_7.f90: New test for PR 66310.
	* gfortran.dg/scan_2.f90: Handle potential cast in assignment.
	* gfortran.dg/string_1.f90: Limit to ilp32 targets.
	* gfortran.dg/string_1_lp64.f90: New test.
	* gfortran.dg/string_3.f90: Limit to ilp32 targets.
	* gfortran.dg/string_3_lp64.f90: New test.

libgfortran:

2016-12-12  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/78534
	* intrinsics/args.c (getarg_i4): Use gfc_charlen_type.
	(get_command_argument_i4): Likewise.
	(get_command_i4): Likewise.
	* intrinsics/chmod.c (chmod_internal): Likewise.
	* intrinsics/env.c (get_environment_variable_i4): Likewise.
	* intrinsics/extends_type_of.c (struct vtype): Use size_t for size
	member.
	* intrinsics/gerror.c (gerror): Use gfc_charlen_type.
	* intrinsics/getlog.c (getlog): Likewise.
	* intrinsics/hostnm.c (hostnm_0): Likewise.
	* intrinsics/string_intrinsics_inc.c (string_len_trim): Rework to
	work if gfc_charlen_type is unsigned.
	(string_scan): Likewise.
	* io/transfer.c (transfer_character): Modify prototype.
	(transfer_character_write): Likewise.
	(transfer_character_wide): Likewise.
	(transfer_character_wide_write): Likewise.
	* io/unit.c (is_trim_ok): Use gfc_charlen_type.
	* io/write.c (namelist_write): Likewise.
	* libgfortran.h (gfc_charlen_type): Change typedef to size_t.
---
 gcc/fortran/arith.c                            | 28 +++++++++++++
 gcc/fortran/class.c                            | 12 +++---
 gcc/fortran/expr.c                             | 30 +++++++++++++-
 gcc/fortran/gfortran.h                         | 18 +++++++--
 gcc/fortran/iresolve.c                         | 10 +++--
 gcc/fortran/match.c                            |  4 +-
 gcc/fortran/module.c                           | 43 +++++++++++++-------
 gcc/fortran/resolve.c                          | 31 +++++++++-----
 gcc/fortran/simplify.c                         | 33 ++++++++++-----
 gcc/fortran/target-memory.c                    |  8 ++--
 gcc/fortran/trans-array.c                      |  3 +-
 gcc/fortran/trans-const.c                      | 12 ++++++
 gcc/fortran/trans-const.h                      |  1 +
 gcc/fortran/trans-decl.c                       |  3 +-
 gcc/fortran/trans-expr.c                       | 56 +++++++++++++++-----------
 gcc/fortran/trans-intrinsic.c                  | 54 ++++++++++++-------------
 gcc/fortran/trans-io.c                         |  4 +-
 gcc/fortran/trans-stmt.c                       | 19 ++++-----
 gcc/fortran/trans-types.c                      | 12 +++++-
 gcc/fortran/trans-types.h                      |  5 ++-
 gcc/testsuite/gfortran.dg/char_result_8.f90    |  8 ++--
 gcc/testsuite/gfortran.dg/repeat_4.f90         | 23 ++++++-----
 gcc/testsuite/gfortran.dg/repeat_7.f90         |  8 ++++
 gcc/testsuite/gfortran.dg/scan_2.f90           |  4 +-
 gcc/testsuite/gfortran.dg/string_1.f90         |  1 +
 gcc/testsuite/gfortran.dg/string_1_lp64.f90    | 15 +++++++
 gcc/testsuite/gfortran.dg/string_3.f90         |  1 +
 gcc/testsuite/gfortran.dg/string_3_lp64.f90    | 20 +++++++++
 libgfortran/intrinsics/args.c                  | 10 ++---
 libgfortran/intrinsics/chmod.c                 |  3 +-
 libgfortran/intrinsics/env.c                   |  3 +-
 libgfortran/intrinsics/extends_type_of.c       |  2 +-
 libgfortran/intrinsics/gerror.c                |  2 +-
 libgfortran/intrinsics/getlog.c                |  3 +-
 libgfortran/intrinsics/hostnm.c                |  5 +--
 libgfortran/intrinsics/string_intrinsics_inc.c | 17 ++++----
 libgfortran/io/transfer.c                      | 16 ++++----
 libgfortran/io/unit.c                          |  3 +-
 libgfortran/io/write.c                         |  3 +-
 libgfortran/libgfortran.h                      |  2 +-
 40 files changed, 359 insertions(+), 176 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/repeat_7.f90
 create mode 100644 gcc/testsuite/gfortran.dg/string_1_lp64.f90
 create mode 100644 gcc/testsuite/gfortran.dg/string_3_lp64.f90

Comments

FX Coudert Dec. 12, 2016, 10:20 a.m. UTC | #1
Hi Janne,

This is an ABI change, so it is serious… it will require people to recompile older code and libraries with the new compiler. Do we already plan to break the ABI in this cycle, or is this the first ABI-breaking patch of the cycle? And do we have real-life examples of character strings larger than 2 GB?

> Also, as there are some places in the frontend where negative character
> lengths are used as special flag values, in the frontend the character
> length is handled as a signed variable of the same size as a size_t,
> although in the runtime library it really is size_t.

First, I thought: we should really make it size_t, and have the negative values be well-defined constants, e.g. (size_t) -1

On the other hand, there is the problem of the case where the front-end has different size_t than the target: think 32-bit on 64-bit i386 (front-end size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit machine (front-end size_t smaller than target size_t). So the charlen type bounds need to be determined when the front-end runs, not when it is compiled (i.e. it is not a fixed type).

In iresolve.c, the "Why is this fixup needed?" comment is kinda scary.


> I haven't changed the character length variables for the co-array
> intrinsics, as this is something that may need to be synchronized with
> OpenCoarrays.

Won’t that mean that coarray programs will fail due to ABI mismatch?


FX
Andre Vehreschild Dec. 12, 2016, 10:29 a.m. UTC | #2
Hi FX,

there is already an ABI change. DTIO needed it.

I will take on the coarray ABI changes in the next days and also emit a
pull-request to the opencoarrays to get them to sync. Janne, please wait until
I have added those changes to prevent people from having to re-compile multiple
times.

- Andre

On Mon, 12 Dec 2016 11:20:06 +0100
FX <fxcoudert@gmail.com> wrote:

> Hi Janne,
> 
> This is an ABI change, so it is serious… it will require people to recompile
> older code and libraries with the new compiler. Do we already plan to break
> the ABI in this cycle, or is this the first ABI-breaking patch of the cycle?
> And do we have real-life examples of character strings larger than 2 GB?
> 
> > Also, as there are some places in the frontend where negative character
> > lengths are used as special flag values, in the frontend the character
> > length is handled as a signed variable of the same size as a size_t,
> > although in the runtime library it really is size_t.  
> 
> First, I thought: we should really make it size_t, and have the negative
> values be well-defined constants, e.g. (size_t) -1
> 
> On the other hand, there is the problem of the case where the front-end has
> different size_t than the target: think 32-bit on 64-bit i386 (front-end
> size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit
> machine (front-end size_t smaller than target size_t). So the charlen type
> bounds need to be determined when the front-end runs, not when it is compiled
> (i.e. it is not a fixed type).
> 
> In iresolve.c, the "Why is this fixup needed?" comment is kinda scary.
> 
> 
> > I haven't changed the character length variables for the co-array
> > intrinsics, as this is something that may need to be synchronized with
> > OpenCoarrays.  
> 
> Won’t that mean that coarray programs will fail due to ABI mismatch?
> 
> 
> FX
Janne Blomqvist Dec. 12, 2016, 1:07 p.m. UTC | #3
On Mon, Dec 12, 2016 at 12:20 PM, FX <fxcoudert@gmail.com> wrote:
> Hi Janne,
>
> This is an ABI change, so it is serious… it will require people to recompile older code and libraries with the new compiler. Do we already plan to break the ABI in this cycle, or is this the first ABI-breaking patch of the cycle?

As Andre mentioned, the ABI has already been broken; GFortran 7 will
have libgfortran.so.4.

However, this will also affect people doing C->Fortran calls the
old-fashioned way without ISO_C_BINDING, as they will have to change
the string length argument from int to size_t in their prototypes.
Then again, Intel Fortran did this some years ago so I guess at least
people who care about portability to several compilers are aware.
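
To make the prototype change concrete, here is a minimal C sketch. The names charlen_t, trimmed_len_ and call_trimmed_len are hypothetical, and trimmed_len_ is a C stand-in for a Fortran procedure so that the example compiles on its own. It assumes gfortran's usual convention of passing arguments by reference and appending the hidden character length by value after the explicit arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Type of the hidden length argument (hypothetical name):
   int before the change discussed here, size_t after it. */
typedef size_t charlen_t;

/* C stand-in for a Fortran procedure such as
 *     subroutine trimmed_len(s, n)
 *       character(len=*), intent(in) :: s
 *       integer, intent(out) :: n
 * The explicit arguments are passed by reference; the length of s
 * is appended by value as a hidden trailing argument. */
void trimmed_len_(const char *s, int *n, charlen_t s_len)
{
    assert(s != NULL && n != NULL);
    /* Fortran strings are blank-padded, not NUL-terminated. */
    while (s_len > 0 && s[s_len - 1] == ' ')
        s_len--;
    *n = (int) s_len;
}

/* C-side caller: only the type of the last argument changes. */
int call_trimmed_len(const char *s)
{
    int n = 0;
    trimmed_len_(s, &n, (charlen_t) strlen(s));
    return n;
}
```

With an old-style prototype that still declares the last argument as int, the caller and callee would disagree about the argument's width on LP64 targets, which is why the prototype has to be edited rather than just recompiled.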

> And do we have real-life examples of character strings larger than 2 GB?

Well, people who have needed such will have figured out some
work-around since we haven't supported it, so how would we know? :) It
could be splitting the data into several strings, or switching to
ifort, using C instead of Fortran, or something else.

In any case, I don't expect characters larger than 2 GB to be common
(particularly with the Fortran standard-mandated behaviour of
space-padding to the end in many cases), but as the ABI has been
broken anyways, we might as well fix it.

IIRC at some point there was some discussion of this on
comp.lang.fortran, and somebody mentioned analysis of genomic data as
a use case where large characters can be useful. I don't have any
personal usecase though, at least at the moment.

>> Also, as there are some places in the frontend where negative character
>> lengths are used as special flag values, in the frontend the character
>> length is handled as a signed variable of the same size as a size_t,
>> although in the runtime library it really is size_t.
>
> First, I thought: we should really make it size_t, and have the negative values be well-defined constants, e.g. (size_t) -1

I tried it, but in addition to the issue with negative character
lengths used as flag values, there are issues such as
gfc_get_int_expr(), which takes a kind value and an integer constant
and produces a gfc_expr, but doesn't understand unsigned types. So in
the end I decided it's better to get this patch into working shape and
merged with the ABI changes; the unsigned-ness can be fixed later (in
the end it's just a factor of two in the sizes we can handle, so not a
huge deal).

> On the other hand, there is the problem of the case where the front-end has different size_t than the target: think 32-bit on 64-bit i386 (front-end size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit machine (front-end size_t smaller than target size_t). So the charlen type bounds need to be determined when the front-end runs, not when it is compiled (i.e. it is not a fixed type).

True. Although things like gfc_charlen_type_node should be correct for
the target, the type gfc_charlen_t that I introduced in the frontend
might be too small if one is doing a 32->64 bit cross-compile. So that
should be changed from a typedef of ptrdiff_t to a typedef of
HOST_WIDE_INT which AFAIK is guaranteed to be 64-bit everywhere.

> In iresolve.c, the "Why is this fixup needed?" comment is kinda scary.

Hmm, I think it's a leftover from some earlier experimentation, should
be removed.

>> I haven't changed the character length variables for the co-array
>> intrinsics, as this is something that may need to be synchronized with
>> OpenCoarrays.
>
> Won’t that mean that coarray programs will fail due to ABI mismatch?

No, the co-array intrinsics are, well, intrinsics, so they're handled
specially in the frontend and don't need to follow the normal
argument-passing conventions. But I think it'd be easier if they did,
and might prevent some obscure corner-case bugs. Say, create a
character variable with length 2**31+9, then typecasting to plain int
when calling the intrinsic would wrap around and the library would see
a negative length.
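
The wrap-around in that example can be checked with a small C sketch. The function name narrow_to_int32 is hypothetical; it models what a narrowing cast to a 32-bit int does on two's-complement targets, computed arithmetically so that the result does not depend on implementation-defined conversion behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low 32 bits of a length and reinterpret them as a signed
   32-bit value, returned in a wider type for inspection. */
int64_t narrow_to_int32(uint64_t len)
{
    uint32_t low = (uint32_t) (len & UINT32_C(0xFFFFFFFF));
    int64_t result = (low <= (uint32_t) INT32_MAX)
        ? (int64_t) low
        : (int64_t) low - (INT64_C(1) << 32);
    assert(result >= INT32_MIN && result <= INT32_MAX);
    return result;
}
```

For a length of 2**31+9 = 2147483657 this gives 2147483657 - 4294967296 = -2147483639, so a callee that receives the length as a plain int would see a negative value.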
Janne Blomqvist Dec. 12, 2016, 1:10 p.m. UTC | #4
On Mon, Dec 12, 2016 at 12:29 PM, Andre Vehreschild <vehre@gmx.de> wrote:
> I will take on the coarray ABI changes in the next days and also emit a
> pull-request to the opencoarrays to get them to sync. Janne, please wait until
> I have added those changes to prevent people from having to re-compile multiple
> times.

Ok, thanks for taking care of this!
Janus Weil Dec. 12, 2016, 1:35 p.m. UTC | #5
Hi guys,

> there is already an ABI change. DTIO needed it.

maybe it would be a good idea to document this in places like:
* https://gcc.gnu.org/wiki/GFortran/News
* https://gcc.gnu.org/gcc-7/changes.html

On the first page there are "Compatibility notices" for several
earlier versions which mention stuff like this ...

Cheers,
Janus



> On Mon, 12 Dec 2016 11:20:06 +0100
> FX <fxcoudert@gmail.com> wrote:
>
>> Hi Janne,
>>
>> This is an ABI change, so it is serious… it will require people to recompile
>> older code and libraries with the new compiler. Do we already plan to break
>> the ABI in this cycle, or is this the first ABI-breaking patch of the cycle?
>> And do we have real-life examples of character strings larger than 2 GB?
>>
>> > Also, as there are some places in the frontend where negative character
>> > lengths are used as special flag values, in the frontend the character
>> > length is handled as a signed variable of the same size as a size_t,
>> > although in the runtime library it really is size_t.
>>
>> First, I thought: we should really make it size_t, and have the negative
>> values be well-defined constants, e.g. (size_t) -1
>>
>> On the other hand, there is the problem of the case where the front-end has
>> different size_t than the target: think 32-bit on 64-bit i386 (front-end
>> size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit
>> machine (front-end size_t smaller than target size_t). So the charlen type
>> bounds need to be determined when the front-end runs, not when it is compiled
>> (i.e. it is not a fixed type).
>>
>> In iresolve.c, the "Why is this fixup needed?" comment is kinda scary.
>>
>>
>> > I haven't changed the character length variables for the co-array
>> > intrinsics, as this is something that may need to be synchronized with
>> > OpenCoarrays.
>>
>> Won’t that mean that coarray programs will fail due to ABI mismatch?
>>
>>
>> FX
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de
Janne Blomqvist Dec. 12, 2016, 1:43 p.m. UTC | #6
On Mon, Dec 12, 2016 at 3:35 PM, Janus Weil <janus@gcc.gnu.org> wrote:
> Hi guys,
>
>> there is already an ABI change. DTIO needed it.
>
> maybe it would be a good idea to document this in places like:
> * https://gcc.gnu.org/wiki/GFortran/News
> * https://gcc.gnu.org/gcc-7/changes.html
>
> On the first page there are "Compatibility notices" for several
> earlier versions which mention stuff like this ...

Yes, absolutely. I was planning to do it when/if the patch is accepted
and merged.
Bob Deen Dec. 12, 2016, 6:26 p.m. UTC | #7
> However, this will also affect people doing C->Fortran calls the
> old-fashioned way without ISO_C_BINDING, as they will have to change
> the string length argument from int to size_t in their prototypes.
> Then again, Intel Fortran did this some years ago so I guess at least
> people who care about portability to several compilers are aware.

We do a ton of this (old fashioned c-fortran binding) and changing the string length argument size will have a big impact on us.  We don't use the Intel compiler so we never noticed a change there.

Is there really a use case for strings > 2 GB that justifies the breakage?  I certainly understand wanting to do it "right" but I'm probably not the only one with practical considerations that argue against it if there are no compelling use cases.

Thanks...

-Bob

Bob Deen @ NASA-JPL Multimission Image Processing Lab
Bob.Deen@jpl.nasa.gov
Bob Deen Dec. 19, 2016, 4:43 p.m. UTC | #8
Hi all...

I never saw any followup on this...?

It's one thing to break the ABI between the compiler and the gfortran 
library; those can generally be expected to be in sync.  It's another to 
break the ABI between two *languages*, when there might be no such 
expectation (especially if gcc does NOT break their ABI at the same 
version number transition).  Yes, the pre-ISO_C_BINDING method may be 
old-fashioned, but it is a de-facto standard, and breaking it should not 
be done lightly.

If you do proceed with changing the size, I would request that there at 
least be a facility to reliably tell at compile time (on the C side) 
which definition is being used, so I can adjust our macros accordingly. 
Our code does depend on the size, and it has to be cross-platform (and now, 
if this change is made, cross-version), so with this change I would have 
to support both int and size_t.

A C-side preprocessor symbol definition would do the trick.  Of course 
that assumes the versions of gcc/g++ and gfortran are in sync, which is 
never guaranteed.  But that assumption is better than nothing.  Unless 
someone has a better idea...?

Perhaps it might be best to wait until a time when gcc is also breaking 
their ABI, so that there's no question of code (on either side) working 
across the transition...?

Thanks...

-Bob

P.S.  I'm just a lurker here, but I lurk specifically to look for things 
that will break our code base, like this....  ;-)

Bob.Deen @ NASA-JPL Multimission Image Processing Lab
Bob.Deen@jpl.nasa.gov


On 12/12/16 10:26 AM, Bob Deen wrote:
>
>> However, this will also affect people doing C->Fortran calls the
>> old-fashioned way without ISO_C_BINDING, as they will have to change
>> the string length argument from int to size_t in their prototypes.
>> Then again, Intel Fortran did this some years ago so I guess at least
>> people who care about portability to several compilers are aware.
>
> We do a ton of this (old fashioned c-fortran binding) and changing the string length argument size will have a big impact on us.  We don't use the Intel compiler so we never noticed a change there.
>
> Is there really a use case for strings > 2 GB that justifies the breakage?  I certainly understand wanting to do it "right" but I'm probably not the only one with practical considerations that argue against it if there are no compelling use cases.
>
> Thanks...
>
> -Bob
>
> Bob Deen @ NASA-JPL Multimission Image Processing Lab
> Bob.Deen@jpl.nasa.gov
>
>
Steve Kargl Dec. 19, 2016, 6:20 p.m. UTC | #9
On Mon, Dec 19, 2016 at 08:43:01AM -0800, Bob Deen wrote:
> 
> It's one thing to break the ABI between the compiler and the gfortran 
> library; those can generally be expected to be in sync.  It's another to 
> break the ABI between two *languages*, when there might be no such 
> expectation (especially if gcc does NOT break their ABI at the same 
> version number transition).  Yes, the pre-ISO_C_BINDING method may be 
> old-fashioned, but it is a de-facto standard, and breaking it should not 
> be done lightly.

Do you really think that those of us who actively contribute to 
gfortran development take breaking the ABI lightly?  We have put 
off changes to gfortran's library for several years to specifically 
avoid ABI breakage.  It seems that there is never a "Good Time" to
break the ABI.  However, in this case, support for F2008 9.6.4.8,
Defined Input/Output, necessitates a change in the ABI.  Instead of
breaking the ABI multiple times, it has been decided to try to cleanup
some long standing issues with libgfortran.

> If you do proceed with changing the size, I would request that there at 
> least be a facility to reliably tell at compile time (on the C side) 
> which definition is being used, so I can adjust our macros accordingly. 
> Our code does depend on the size, and it has to be cross-platform (and now, 
> if this change is made, cross-version), so with this change I would have 
> to support both int and size_t.

As the breakage is going to occur with gfortran 7.0, you can do

% cat a.F90
#if defined(__GFORTRAN__) && (__GNUC__ > 6)
print *, '7'
#else
print *, 'not 7'
#endif
end
% gfc7 -E a.F90 | cat -s
# 1 "a.F90"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "a.F90"

print *, '7'

end
% gfortran6 -E a.F90 | cat -s
# 1 "a.F90"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "a.F90"

print *, 'not 7'

end

> Perhaps it might be best to wait until a time when gcc is also breaking 
> their ABI, so that there's no question of code (on either side) working 
> across the transition...?

There is never a good time.  If we are to wait for gcc, should
we remove support for Defined Input/Output from the compiler?
Janne Blomqvist Dec. 19, 2016, 7:33 p.m. UTC | #10
On Mon, Dec 19, 2016 at 6:43 PM, Bob Deen <Bob.Deen@jpl.nasa.gov> wrote:
> Hi all...
>
> I never saw any followup on this...?
>
> It's one thing to break the ABI between the compiler and the gfortran
> library; those can generally be expected to be in sync.  It's another to
> break the ABI between two *languages*, when there might be no such
> expectation (especially if gcc does NOT break their ABI at the same version
> number transition).  Yes, the pre-ISO_C_BINDING method may be old-fashioned,
> but it is a de-facto standard, and breaking it should not be done lightly.

First: No, it's not done "lightly". And secondly, cross-language
interfacing is always tricky, which is why most languages, including
Fortran with ISO_C_BINDING, have devised standardized ways for
communication with C so users don't need to rely on various cross-call
mechanisms that are not guaranteed to work.

That the charlen is a hidden argument of type int added at the end of
the argument list is AFAIK a fairly common implementation choice,
though I'm not sure if it can be called a de-facto standard.
Considering that Intel Fortran switched to size_t several years ago
(5-ish?), that AFAIU it's the most used Fortran compiler around in
addition to GFortran, and that the world hasn't crashed down due to
it, I suspect users can adapt to the change with relatively little
drama.

That gcc would change its ABI at all, and especially in conjunction
with gfortran, is a pipe dream.

C changed to use size_t for string lengths instead of int with ANSI C
in, what, 1989.  With 2-socket servers, typically used e.g. in HPC
clusters, today easily having hundreds of gigs of RAM, limiting
GFortran char lengths to 2 GB for all eternity in the name of
compatibility seems quaint at best. Maybe in your organization Fortran
is legacy code that YE SHALL NOT TOUCH, but GFortran also has to cater
to users who have chosen to write new code in Fortran.

> If you do proceed with changing the size, I would request that there at
> least be a facility to reliably tell at compile time (on the C side) which
> definition is being used, so I can adjust our macros accordingly.

Oh, you have macros rather than hard-coded int all over the place?
Shouldn't it be a relatively trivial affair then to define that macro
appropriately depending on which compiler and which version you're
using?

Steve showed how you can do it for Fortran. From the C side, just
check the version from the __GNUC__ macro.
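
A rough sketch of such a C-side check follows. The names fortran_charlen_t, FORTRAN_CHARLEN_IS_SIZE_T and to_fortran_charlen are hypothetical. The version threshold assumes that the change ships in GCC 7 as proposed in this thread and that the C compiler and gfortran come from the same GCC release; adjust the threshold to whichever release actually carries the change.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed cutoff: GCC major version 7 or later uses size_t for the
   hidden character length. Other compilers, or compilers that report
   an older __GNUC__ value, fall through to int. */
#if defined(__GNUC__) && (__GNUC__ > 6)
typedef size_t fortran_charlen_t;
#define FORTRAN_CHARLEN_IS_SIZE_T 1
#else
typedef int fortran_charlen_t;
#define FORTRAN_CHARLEN_IS_SIZE_T 0
#endif

/* Convert a C length to the hidden-argument type, refusing values
   that would not fit when the type is still int. */
fortran_charlen_t to_fortran_charlen(size_t n)
{
#if !FORTRAN_CHARLEN_IS_SIZE_T
    assert(n <= (size_t) INT_MAX);
#endif
    return (fortran_charlen_t) n;
}
```

Existing wrapper macros can then spell the hidden argument as fortran_charlen_t instead of int, which keeps one code base working on both sides of the transition.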

> Our code
> does depend on the size, and it has to be cross-platform (and now, if this
> change is made, cross-version), so with this change I would have to support
> both int and size_t.

Well, if you add the option to use size_t you should be able to use
ifort as well. :)

> A C-side preprocessor symbol definition would do the trick.  Of course that
> assumes the versions of gcc/g++ and gfortran are in sync, which is never
> guaranteed.  But that assumption is better than nothing.  Unless someone has
> a better idea...?

Yeah, I think that's the best idea.

Another option would be to implement some kind of
-fcharacter-length=[int,size_t] command-line option. But that would
make the patch a lot more complicated since one would need to typecast
the character length argument when calling libgfortran. And, you'd
still have to have some version-dependent checks to see if gfortran
would accept that option. And like other similar options like
-fdefault-this-or-that it would change the ABI, so code compiled with
that option would be incompatible with code compiled without it. So in
the end I'm not convinced such an option would actually make life any
easier for our users.

> Perhaps it might be best to wait until a time when gcc is also breaking
> their ABI, so that there's no question of code (on either side) working
> across the transition...?

AFAIK there is no ABI change planned for gcc. For better or worse, the
C language is relatively stable and doesn't change much.

> P.S.  I'm just a lurker here, but I lurk specifically to look for things
> that will break our code base, like this....  ;-)

Well, then you ought to be aware of the ABI cleanup page on the wiki,
where the char length issue has been listed for, what, 5 years or so,
so it can't really be a surprise that it will happen at some point,
can it...?

>
> Bob.Deen @ NASA-JPL Multimission Image Processing Lab
> Bob.Deen@jpl.nasa.gov
>
>
>
> On 12/12/16 10:26 AM, Bob Deen wrote:
>>
>>
>>> However, this will also affect people doing C->Fortran calls the
>>> old-fashioned way without ISO_C_BINDING, as they will have to change
>>> the string length argument from int to size_t in their prototypes.
>>> Then again, Intel Fortran did this some years ago so I guess at least
>>> people who care about portability to several compilers are aware.
>>
>>
>> We do a ton of this (old fashioned c-fortran binding) and changing the
>> string length argument size will have a big impact on us.  We don't use the
>> Intel compiler so we never noticed a change there.
>>
>> Is there really a use case for strings > 2 GB that justifies the breakage?
>> I certainly understand wanting to do it "right" but I'm probably not the
>> only one with practical considerations that argue against it if there are no
>> compelling use cases.
>>
>> Thanks...
>>
>> -Bob
>>
>> Bob Deen @ NASA-JPL Multimission Image Processing Lab
>> Bob.Deen@jpl.nasa.gov
>>
>>
>
Bob Deen Dec. 20, 2016, 12:43 a.m. UTC | #11
On 12/19/16 11:33 AM, Janne Blomqvist wrote:
> On Mon, Dec 19, 2016 at 6:43 PM, Bob Deen <Bob.Deen@jpl.nasa.gov> wrote:
>> Hi all...
>>
>> I never saw any followup on this...?
>>
>> It's one thing to break the ABI between the compiler and the gfortran
>> library; those can generally be expected to be in sync.  It's another to
>> break the ABI between two *languages*, when there might be no such
>> expectation (especially if gcc does NOT break their ABI at the same version
>> number transition).  Yes, the pre-ISO_C_BINDING method may be old-fashioned,
>> but it is a de-facto standard, and breaking it should not be done lightly.
>
> First: No, it's not done "lightly". And secondly, cross-language
> interfacing is always tricky, which is why most languages, including
> Fortran with ISO_C_BINDING, have devised standardized ways for
> communication with C so users don't need to rely on various cross-call
> mechanisms that are not guaranteed to work.

Apologies if I offended (and to Steve too).  I see all the deliberation 
you're doing for breaking the language->library ABI, and appreciate 
that.  It's well-justified.  My point, however, is that with this change 
you are breaking an entirely *different* ABI - that between Fortran and 
C - and the sum total of discussion was one message from Janne pointing 
out that it was breaking (thanks for that heads-up, I had missed it!), 
with no followup.  Janne, you yourself in that message questioned the 
need for large strings, and had no use cases in response to FX's inquiry.

Now that I think about it, it's not even an ABI change, it's an API 
change... requiring a code change, not just a recompile.

So in this case, this change represents (AFAIK) the only breakage in the 
old-style Fortran<->C ABI/API, with no known use cases... and thus my 
question about whether it's justified.  It's a fair question.  I'm not 
arguing the language->library ABI at all.

> C changed to use size_t for string lengths instead of int with ANSI C
> in, what, 1989.  With 2-socket servers, typically used e.g. in HPC
> clusters, today easily having hundreds of gigs of RAM, limiting
> GFortran char lengths to 2 GB for all eternity in the name of
> compatibility seems quaint at best. Maybe in your organization Fortran
> is legacy code that YE SHALL NOT TOUCH, but GFortran also has to cater
> to users who have chosen to write new code in Fortran.

I understand that.  It just seems that opening up an entirely *new* 
ABI/API for breakage deserved a little more discussion.  Y'all are the 
ones doing the (mostly volunteer) work on gfortran, and I appreciate it. 
  You're also much more invested in the future of the language than I 
(yeah, it's mostly legacy code for us).  If you end up deciding that it 
needs to be done, then I'll deal with it.  I just wanted to chime in 
that there are users who will be affected.  If I'm the only one, I 
wouldn't want to stand in the way of progress - but also don't want to 
get steamrolled if it's not an important change, or if there are other 
affected users.

So... ARE there any other affected users out there??

> Oh, you have macros rather than hard-coded int all over the place?
> Shouldn't it be a relatively trivial affair then to define that macro
> appropriately depending on which compiler and which version you're
> using?

I wouldn't call it trivial by any means... it's tricky code I haven't 
had to look at in 10 years.  But in the end, probably doable.

> Steve showed how you can do it for Fortran. From the C side, just
> check the version from the __GNUC__ macro.

I dislike having to check for version numbers (feels kludgy) but that's 
a personal preference.  That will probably work, with a bit of futzing.
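
A minimal sketch of the version check suggested above, assuming the C compiler and gfortran come from the same GCC release. The names `GFORTRAN_SIZE_T_CHARLEN_SINCE`, `fortran_charlen_t` and `to_fortran_charlen` are hypothetical, and the threshold of 7 is only an assumption taken from the release mentioned in this thread; adjust it to whichever release actually ships the change.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: first GCC major version whose gfortran passes hidden
   character lengths as size_t.  Hypothetical macro, override as needed.  */
#ifndef GFORTRAN_SIZE_T_CHARLEN_SINCE
#define GFORTRAN_SIZE_T_CHARLEN_SINCE 7
#endif

/* __GNUC__ describes the C compiler, so this only works when the
   C compiler and gfortran share a version.  */
#if defined(__GNUC__) && (__GNUC__ >= GFORTRAN_SIZE_T_CHARLEN_SINCE)
typedef size_t fortran_charlen_t;
#else
typedef int fortran_charlen_t;
#endif

/* Convert a C length to the hidden-length type, checking that it fits
   when the hidden length is still an int.  */
fortran_charlen_t to_fortran_charlen(size_t n)
{
    assert(sizeof(fortran_charlen_t) == sizeof(size_t)
           || n <= (size_t) INT_MAX);
    return (fortran_charlen_t) n;
}
```

Existing binding macros could then expand to `fortran_charlen_t` instead of a hard-coded `int`.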

Thanks for your attention...

-Bob

Bob Deen @ NASA-JPL Multimission Image Processing Lab
Bob.Deen@jpl.nasa.gov
FX Coudert Dec. 20, 2016, 9:17 a.m. UTC | #12
Dear Bob,

First, regarding the ABI vs. API question: there is no consistent API for passing strings between Fortran and C, unless one uses Fortran 2003’s ISO_C_BINDING. It’s an ABI detail, in the sense that every compiler chooses its own way: most compilers pass a hidden length parameter, but its size (32-bit, 64-bit, or size_t) and position (either right after the char pointer, or at the end of the argument list) vary between compilers. So any code that does this is already compiler-specific.
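
To illustrate the hidden length parameter, here is a sketch of a C routine written to be callable from Fortran as `n = count_char(str, ch)`. It assumes gfortran's usual conventions (lowercase name with a trailing underscore, hidden lengths passed by value at the end of the argument list, one per character argument). The routine name and the `fortran_charlen_t` typedef are hypothetical; the typedef is `size_t` as proposed in this patch and would be `int` for earlier releases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical typedef for the hidden length type (size_t as proposed).  */
typedef size_t fortran_charlen_t;

/* Count how often the first character of CH occurs in STR.
   Fortran strings are not NUL-terminated, so the hidden lengths
   str_len and ch_len are the only way to know where they end.  */
int count_char_(const char *str, const char *ch,
                fortran_charlen_t str_len, fortran_charlen_t ch_len)
{
    int count = 0;

    assert(str != NULL && ch != NULL);
    if (ch_len == 0)
        return 0;

    for (fortran_charlen_t i = 0; i < str_len; i++)
        if (str[i] == ch[0])
            count++;

    return count;
}
```

A compiler that places each length directly after its char pointer would need a different prototype, which is why such code is compiler-specific.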

Second, there are good reasons we might want to change this. One is possible use cases (although there are few, by definition, because we simply don’t support those right now). The second is compatibility with C string-handling functions, which operate on size_t arguments, which means we can then use those functions without casting types around all the time.
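
As a sketch of the second point: when the length is already a `size_t`, it can be handed to `malloc` and `memcpy` without any casts. The helper name `fortran_to_c_string` is hypothetical and not part of the patch or of libgfortran.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy a blank-padded Fortran string of length LEN into a freshly
   allocated, NUL-terminated C string with trailing blanks removed.
   Returns NULL if allocation fails; the caller frees the result.  */
char *fortran_to_c_string(const char *str, size_t len)
{
    assert(str != NULL || len == 0);

    /* Drop trailing blanks, as Fortran's LEN_TRIM would.  */
    while (len > 0 && str[len - 1] == ' ')
        len--;

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    if (len > 0)
        memcpy(out, str, len);   /* size_t length, no cast needed */
    out[len] = '\0';
    return out;
}
```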

Finally, if we’re making this change, we welcome any feedback on how to make it as easy as possible to handle in user code. Documentation, preprocessor macros, etc.

In particular, one of the things we will need to address is helping widely used code adapt to the change. One example I am thinking of, which uses old-style C/Fortran interfaces, is the MPI libraries (openmpi & mpich). We definitely need to test those to make sure nothing breaks if we are going to proceed; otherwise they need to be fixed upstream well before we release, with due note of the incompatibility in our release notes.


Cheers,
FX
Gerald Pfeifer Dec. 29, 2016, 9:49 p.m. UTC | #13
On Tue, 20 Dec 2016, FX wrote:
> Finally, if we’re making this change, we welcome any feedback on how 
> to make it as easy as possible to handle in user code. Documentation, 
> preprocessor macros, etc.

I believe including this in the (yet to be created) gcc-7/porting_to.html
would be great.

Historically the porting_to.html documents have mostly covered C and 
C++, since that is the source language of the majority of packages in 
a GNU/Linux distribution that GCC touches.  Adding more focus on
Fortran users as well feels like a good idea, though.

(If you want to go ahead, but prefer the page to be created first,
let me know, and I'll take care.)

Gerald

Patch

diff --git a/gcc/fortran/arith.c b/gcc/fortran/arith.c
index 2781f10..ef4b884 100644
--- a/gcc/fortran/arith.c
+++ b/gcc/fortran/arith.c
@@ -31,6 +31,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "arith.h"
 #include "target-memory.h"
 #include "constructor.h"
+#include "tree.h"
+#include "trans-types.h"
 
 /* MPFR does not have a direct replacement for mpz_set_f() from GMP.
    It's easily implemented with a few calls though.  */
@@ -281,6 +283,27 @@  gfc_check_character_range (gfc_char_t c, int kind)
 }
 
 
+/* Check whether a character length is within the range
+   [0, TYPE_MAX_VALUE(gfc_charlen_type_node)].  */
+
+arith
+gfc_check_charlen_range (mpz_t p)
+{
+  mpz_t min, max;
+  arith result = ARITH_OK;
+
+  mpz_init (min);
+  mpz_init (max);
+  get_type_static_bounds (gfc_charlen_type_node, min, max);
+  mpz_clear (min);
+  if ((mpz_sgn (p) < 0) || (mpz_cmp (p, max) > 0))
+    result = ARITH_OVERFLOW;
+
+  mpz_clear (max);
+  return result;
+}
+
+
 /* Given an integer and a kind, make sure that the integer lies within
    the range of the kind.  Returns ARITH_OK, ARITH_ASYMMETRIC or
    ARITH_OVERFLOW.  */
@@ -487,6 +510,9 @@  gfc_range_check (gfc_expr *e)
   arith rc;
   arith rc2;
 
+  if (e->ts.is_charlen)
+    return gfc_check_charlen_range (e->value.integer);
+
   switch (e->ts.type)
     {
     case BT_INTEGER:
@@ -689,6 +715,8 @@  gfc_arith_times (gfc_expr *op1, gfc_expr *op2, gfc_expr **resultp)
     {
     case BT_INTEGER:
       mpz_mul (result->value.integer, op1->value.integer, op2->value.integer);
+      if (op1->ts.is_charlen || op2->ts.is_charlen)
+	result->ts.is_charlen = true;
       break;
 
     case BT_REAL:
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 1fba6c9..62c751e 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -35,7 +35,7 @@  along with GCC; see the file COPYING3.  If not see
     * _vptr: A pointer to the vtable entry (see below) of the dynamic type.
 
     Only for unlimited polymorphic classes:
-    * _len:  An integer(4) to store the string length when the unlimited
+    * _len:  An integer(C_SIZE_T) to store the string length when the unlimited
              polymorphic pointer is used to point to a char array.  The '_len'
              component will be zero when no character array is stored in
              '_data'.
@@ -2310,13 +2310,13 @@  gfc_find_derived_vtab (gfc_symbol *derived)
 	      if (!gfc_add_component (vtype, "_size", &c))
 		goto cleanup;
 	      c->ts.type = BT_INTEGER;
-	      c->ts.kind = 4;
+	      c->ts.kind = gfc_size_kind;
 	      c->attr.access = ACCESS_PRIVATE;
 	      /* Remember the derived type in ts.u.derived,
 		 so that the correct initializer can be set later on
 		 (in gfc_conv_structure).  */
 	      c->ts.u.derived = derived;
-	      c->initializer = gfc_get_int_expr (gfc_default_integer_kind,
+	      c->initializer = gfc_get_int_expr (gfc_size_kind,
 						 NULL, 0);
 
 	      /* Add component _extends.  */
@@ -2676,7 +2676,7 @@  find_intrinsic_vtab (gfc_typespec *ts)
 	      if (!gfc_add_component (vtype, "_size", &c))
 		goto cleanup;
 	      c->ts.type = BT_INTEGER;
-	      c->ts.kind = 4;
+	      c->ts.kind = gfc_size_kind;
 	      c->attr.access = ACCESS_PRIVATE;
 
 	      /* Build a minimal expression to make use of
@@ -2687,11 +2687,11 @@  find_intrinsic_vtab (gfc_typespec *ts)
 	      e = gfc_get_expr ();
 	      e->ts = *ts;
 	      e->expr_type = EXPR_VARIABLE;
-	      c->initializer = gfc_get_int_expr (gfc_default_integer_kind,
+	      c->initializer = gfc_get_int_expr (gfc_size_kind,
 						 NULL,
 						 ts->type == BT_CHARACTER
 						 ? ts->kind
-						 : (int)gfc_element_size (e));
+						 : (long)gfc_element_size (e));
 	      gfc_free_expr (e);
 
 	      /* Add component _extends.  */
diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 3464a20..1c7c036 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -184,7 +184,7 @@  gfc_get_constant_expr (bt type, int kind, locus *where)
    blanked and null-terminated.  */
 
 gfc_expr *
-gfc_get_character_expr (int kind, locus *where, const char *src, int len)
+gfc_get_character_expr (int kind, locus *where, const char *src, gfc_charlen_t len)
 {
   gfc_expr *e;
   gfc_char_t *dest;
@@ -210,7 +210,7 @@  gfc_get_character_expr (int kind, locus *where, const char *src, int len)
 /* Get a new expression node that is an integer constant.  */
 
 gfc_expr *
-gfc_get_int_expr (int kind, locus *where, int value)
+gfc_get_int_expr (int kind, locus *where, long value)
 {
   gfc_expr *p;
   p = gfc_get_constant_expr (BT_INTEGER, kind,
@@ -636,6 +636,32 @@  gfc_extract_int (gfc_expr *expr, int *result)
 }
 
 
+/* Same as gfc_extract_int, but use a long. long isn't optimal either,
+   since it won't help on LLP64 targets like win64, but it's the best
+   we can do due to the mpz_*_si functions that take arguments of type
+   long.  */
+
+const char *
+gfc_extract_long (gfc_expr *expr, long *result)
+{
+  if (expr->expr_type != EXPR_CONSTANT)
+    return _("Constant expression required at %C");
+
+  if (expr->ts.type != BT_INTEGER)
+    return _("Integer expression required at %C");
+
+  if ((mpz_cmp_si (expr->value.integer, LONG_MAX) > 0)
+      || (mpz_cmp_si (expr->value.integer, LONG_MIN) < 0))
+    {
+      return _("Integer value too large in expression at %C");
+    }
+
+  *result = mpz_get_si (expr->value.integer);
+
+  return NULL;
+}
+
+
 /* Recursively copy a list of reference structures.  */
 
 gfc_ref *
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 670c13a..4f30ba3 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1009,6 +1009,8 @@  typedef struct
   int is_iso_c;
   bt f90_type;
   bool deferred;
+  bool is_charlen; /* If the expression is a character length, ignore
+		      type and kind.  */
 }
 gfc_typespec;
 
@@ -2058,6 +2060,12 @@  gfc_intrinsic_sym;
 
 typedef splay_tree gfc_constructor_base;
 
+
+/* This should be a size_t. But occasionally the string length field
+   is used as a flag with values -1 and -2, see
+   e.g. gfc_add_assign_aux_vars.  */
+typedef ptrdiff_t gfc_charlen_t;
+
 typedef struct gfc_expr
 {
   expr_t expr_type;
@@ -2103,7 +2111,7 @@  typedef struct gfc_expr
      the value.  */
   struct
   {
-    int length;
+    gfc_charlen_t length;
     char *string;
   }
   representation;
@@ -2159,7 +2167,7 @@  typedef struct gfc_expr
 
     struct
     {
-      int length;
+      gfc_charlen_t length;
       gfc_char_t *string;
     }
     character;
@@ -2843,6 +2851,7 @@  extern int gfc_atomic_int_kind;
 extern int gfc_atomic_logical_kind;
 extern int gfc_intio_kind;
 extern int gfc_charlen_int_kind;
+extern int gfc_size_kind;
 extern int gfc_numeric_storage_size;
 extern int gfc_character_storage_size;
 
@@ -3074,6 +3083,7 @@  void gfc_resolve_oacc_blocks (gfc_code *, gfc_namespace *);
 void gfc_free_actual_arglist (gfc_actual_arglist *);
 gfc_actual_arglist *gfc_copy_actual_arglist (gfc_actual_arglist *);
 const char *gfc_extract_int (gfc_expr *, int *);
+const char *gfc_extract_long (gfc_expr *, long *);
 bool is_subref_array (gfc_expr *);
 bool gfc_is_simply_contiguous (gfc_expr *, bool, bool);
 bool gfc_check_init_expr (gfc_expr *);
@@ -3091,8 +3101,8 @@  gfc_expr *gfc_get_null_expr (locus *);
 gfc_expr *gfc_get_operator_expr (locus *, gfc_intrinsic_op,gfc_expr *, gfc_expr *);
 gfc_expr *gfc_get_structure_constructor_expr (bt, int, locus *);
 gfc_expr *gfc_get_constant_expr (bt, int, locus *);
-gfc_expr *gfc_get_character_expr (int, locus *, const char *, int len);
-gfc_expr *gfc_get_int_expr (int, locus *, int);
+gfc_expr *gfc_get_character_expr (int, locus *, const char *, gfc_charlen_t len);
+gfc_expr *gfc_get_int_expr (int, locus *, long);
 gfc_expr *gfc_get_logical_expr (int, locus *, bool);
 gfc_expr *gfc_get_iokind_expr (locus *, io_kind);
 
diff --git a/gcc/fortran/iresolve.c b/gcc/fortran/iresolve.c
index b289c9f..39b1984 100644
--- a/gcc/fortran/iresolve.c
+++ b/gcc/fortran/iresolve.c
@@ -2147,7 +2147,7 @@  void
 gfc_resolve_repeat (gfc_expr *f, gfc_expr *string,
 		    gfc_expr *ncopies)
 {
-  int len;
+  gfc_charlen_t len;
   gfc_expr *tmp;
   f->ts.type = BT_CHARACTER;
   f->ts.kind = string->ts.kind;
@@ -2161,7 +2161,7 @@  gfc_resolve_repeat (gfc_expr *f, gfc_expr *string,
   if (string->expr_type == EXPR_CONSTANT)
     {
       len = string->value.character.length;
-      tmp = gfc_get_int_expr (gfc_default_integer_kind, NULL , len);
+      tmp = gfc_get_int_expr (gfc_charlen_int_kind, NULL , len);
     }
   else if (string->ts.u.cl && string->ts.u.cl->length)
     {
@@ -2169,7 +2169,11 @@  gfc_resolve_repeat (gfc_expr *f, gfc_expr *string,
     }
 
   if (tmp)
-    f->ts.u.cl->length = gfc_multiply (tmp, gfc_copy_expr (ncopies));
+    {
+      tmp->ts.kind = gfc_charlen_int_kind; /* Why is this fixup needed?  */
+      tmp->ts.is_charlen = true;
+      f->ts.u.cl->length = gfc_multiply (tmp, gfc_copy_expr (ncopies));
+    }
 }
 
 
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 523cba4..6184289 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -5765,7 +5765,7 @@  select_intrinsic_set_tmp (gfc_typespec *ts)
 {
   char name[GFC_MAX_SYMBOL_LEN];
   gfc_symtree *tmp;
-  int charlen = 0;
+  long charlen = 0;
 
   if (ts->type == BT_CLASS || ts->type == BT_DERIVED)
     return NULL;
@@ -5782,7 +5782,7 @@  select_intrinsic_set_tmp (gfc_typespec *ts)
     sprintf (name, "__tmp_%s_%d", gfc_basic_typename (ts->type),
 	     ts->kind);
   else
-    sprintf (name, "__tmp_%s_%d_%d", gfc_basic_typename (ts->type),
+    sprintf (name, "__tmp_%s_%ld_%d", gfc_basic_typename (ts->type),
 	     charlen, ts->kind);
 
   gfc_get_sym_tree (name, gfc_current_ns, &tmp, false);
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index e727ade..9324a37 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -1141,7 +1141,7 @@  static atom_type last_atom;
 
 #define MAX_ATOM_SIZE 100
 
-static int atom_int;
+static HOST_WIDE_INT atom_int;
 static char *atom_string, atom_name[MAX_ATOM_SIZE];
 
 
@@ -1271,7 +1271,7 @@  parse_string (void)
 }
 
 
-/* Parse a small integer.  */
+/* Parse an integer. Should fit in a HOST_WIDE_INT.  */
 
 static void
 parse_integer (int c)
@@ -1288,8 +1288,6 @@  parse_integer (int c)
 	}
 
       atom_int = 10 * atom_int + c - '0';
-      if (atom_int > 99999999)
-	bad_module ("Integer overflow");
     }
 
 }
@@ -1631,11 +1629,12 @@  write_char (char out)
 static void
 write_atom (atom_type atom, const void *v)
 {
-  char buffer[20];
+  char buffer[32];
 
   /* Workaround -Wmaybe-uninitialized false positive during
      profiledbootstrap by initializing them.  */
-  int i = 0, len;
+  int len;
+  HOST_WIDE_INT i = 0;
   const char *p;
 
   switch (atom)
@@ -1654,11 +1653,9 @@  write_atom (atom_type atom, const void *v)
       break;
 
     case ATOM_INTEGER:
-      i = *((const int *) v);
-      if (i < 0)
-	gfc_internal_error ("write_atom(): Writing negative integer");
+      i = *((const HOST_WIDE_INT *) v);
 
-      sprintf (buffer, "%d", i);
+      snprintf (buffer, sizeof (buffer), HOST_WIDE_INT_PRINT_DEC, i);
       p = buffer;
       break;
 
@@ -1766,7 +1763,10 @@  static void
 mio_integer (int *ip)
 {
   if (iomode == IO_OUTPUT)
-    write_atom (ATOM_INTEGER, ip);
+    {
+      HOST_WIDE_INT hwi = *ip;
+      write_atom (ATOM_INTEGER, &hwi);
+    }
   else
     {
       require_atom (ATOM_INTEGER);
@@ -1774,6 +1774,18 @@  mio_integer (int *ip)
     }
 }
 
+static void
+mio_hwi (HOST_WIDE_INT *hwi)
+{
+  if (iomode == IO_OUTPUT)
+    write_atom (ATOM_INTEGER, hwi);
+  else
+    {
+      require_atom (ATOM_INTEGER);
+      *hwi = atom_int;
+    }
+}
+
 
 /* Read or write a gfc_intrinsic_op value.  */
 
@@ -1783,7 +1795,7 @@  mio_intrinsic_op (gfc_intrinsic_op* op)
   /* FIXME: Would be nicer to do this via the operators symbolic name.  */
   if (iomode == IO_OUTPUT)
     {
-      int converted = (int) *op;
+      HOST_WIDE_INT converted = (HOST_WIDE_INT) *op;
       write_atom (ATOM_INTEGER, &converted);
     }
   else
@@ -2680,7 +2692,7 @@  mio_array_ref (gfc_array_ref *ar)
     {
       for (i = 0; i < ar->dimen; i++)
 	{
-	  int tmp = (int)ar->dimen_type[i];
+	  HOST_WIDE_INT tmp = (HOST_WIDE_INT)ar->dimen_type[i];
 	  write_atom (ATOM_INTEGER, &tmp);
 	}
     }
@@ -3382,6 +3394,7 @@  fix_mio_expr (gfc_expr *e)
 static void
 mio_expr (gfc_expr **ep)
 {
+  HOST_WIDE_INT hwi;
   gfc_expr *e;
   atom_type t;
   int flag;
@@ -3596,7 +3609,9 @@  mio_expr (gfc_expr **ep)
 	  break;
 
 	case BT_CHARACTER:
-	  mio_integer (&e->value.character.length);
+	  hwi = e->value.character.length;
+	  mio_hwi (&hwi);
+	  e->value.character.length = hwi;
 	  e->value.character.string
 	    = CONST_CAST (gfc_char_t *,
 			  mio_allocated_wide_string (e->value.character.string,
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 2093de91..dbe6666 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -29,6 +29,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "data.h"
 #include "target-memory.h" /* for gfc_simplify_transfer */
 #include "constructor.h"
+#include "tree.h"
+#include "trans-types.h" /* For gfc_charlen_type_node  */
 
 /* Types used in equivalence statements.  */
 
@@ -4608,7 +4610,7 @@  resolve_array_ref (gfc_array_ref *ar)
 static bool
 resolve_substring (gfc_ref *ref)
 {
-  int k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);
+  mpz_t min, max;
 
   if (ref->u.ss.start != NULL)
     {
@@ -4668,15 +4670,20 @@  resolve_substring (gfc_ref *ref)
 	  return false;
 	}
 
-      if (compare_bound_mpz_t (ref->u.ss.end,
-			       gfc_integer_kinds[k].huge) == CMP_GT
+      mpz_init (min);
+      mpz_init (max);
+      get_type_static_bounds (gfc_charlen_type_node, min, max);
+      mpz_clear (min);
+      if (compare_bound_mpz_t (ref->u.ss.end, max) == CMP_GT
 	  && (compare_bound (ref->u.ss.end, ref->u.ss.start) == CMP_EQ
 	      || compare_bound (ref->u.ss.end, ref->u.ss.start) == CMP_GT))
 	{
+	  mpz_clear (max);
 	  gfc_error ("Substring end index at %L is too large",
 		     &ref->u.ss.end->where);
 	  return false;
 	}
+      mpz_clear (max);
     }
 
   return true;
@@ -8488,7 +8495,7 @@  resolve_select_type (gfc_code *code, gfc_namespace *old_ns)
   char name[GFC_MAX_SYMBOL_LEN];
   gfc_namespace *ns;
   int error = 0;
-  int charlen = 0;
+  long charlen = 0;
   int rank = 0;
   gfc_ref* ref = NULL;
   gfc_expr *selector_expr = NULL;
@@ -8739,7 +8746,7 @@  resolve_select_type (gfc_code *code, gfc_namespace *old_ns)
 	  if (c->ts.u.cl && c->ts.u.cl->length
 	      && c->ts.u.cl->length->expr_type == EXPR_CONSTANT)
 	    charlen = mpz_get_si (c->ts.u.cl->length->value.integer);
-	  sprintf (name, "__tmp_%s_%d_%d", gfc_basic_typename (c->ts.type),
+	  sprintf (name, "__tmp_%s_%ld_%d", gfc_basic_typename (c->ts.type),
 	           charlen, c->ts.kind);
 	}
       else
@@ -11402,7 +11409,8 @@  resolve_index_expr (gfc_expr *e)
 static bool
 resolve_charlen (gfc_charlen *cl)
 {
-  int i, k;
+  long i;
+  mpz_t min, max;
   bool saved_specification_expr;
 
   if (cl->resolved)
@@ -11438,22 +11446,27 @@  resolve_charlen (gfc_charlen *cl)
 
   /* F2008, 4.4.3.2:  If the character length parameter value evaluates to
      a negative value, the length of character entities declared is zero.  */
-  if (cl->length && !gfc_extract_int (cl->length, &i) && i < 0)
+  if (cl->length && !gfc_extract_long (cl->length, &i) && i < 0)
     gfc_replace_expr (cl->length,
 		      gfc_get_int_expr (gfc_default_integer_kind, NULL, 0));
 
   /* Check that the character length is not too large.  */
-  k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);
+  mpz_init (min);
+  mpz_init (max);
+  get_type_static_bounds (gfc_charlen_type_node, min, max);
+  mpz_clear (min);
   if (cl->length && cl->length->expr_type == EXPR_CONSTANT
       && cl->length->ts.type == BT_INTEGER
-      && mpz_cmp (cl->length->value.integer, gfc_integer_kinds[k].huge) > 0)
+      && mpz_cmp (cl->length->value.integer, max) > 0)
     {
       gfc_error ("String length at %L is too large", &cl->length->where);
       specification_expr = saved_specification_expr;
+      mpz_clear (max);
       return false;
     }
 
   specification_expr = saved_specification_expr;
+  mpz_clear (max);
   return true;
 }
 
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index a46fbc5..ea835f4 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -28,6 +28,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "target-memory.h"
 #include "constructor.h"
 #include "version.h"	/* For version_string.  */
+#include "tree.h"
+#include "trans-types.h"
 
 
 gfc_expr gfc_bad_expr;
@@ -5190,7 +5192,7 @@  gfc_expr *
 gfc_simplify_repeat (gfc_expr *e, gfc_expr *n)
 {
   gfc_expr *result;
-  int i, j, len, ncop, nlen;
+  long len, ncop;
   mpz_t ncopies;
   bool have_length = false;
 
@@ -5232,24 +5234,26 @@  gfc_simplify_repeat (gfc_expr *e, gfc_expr *n)
   /* Check that NCOPIES isn't too large.  */
   if (len)
     {
-      mpz_t max, mlen;
-      int i;
+      mpz_t min, max, mlen, maxb;
 
       /* Compute the maximum value allowed for NCOPIES: huge(cl) / len.  */
+      mpz_init (min);
+      mpz_init (maxb);
       mpz_init (max);
-      i = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);
+      get_type_static_bounds (gfc_charlen_type_node, min, maxb);
+      mpz_clear (min);
 
       if (have_length)
 	{
-	  mpz_tdiv_q (max, gfc_integer_kinds[i].huge,
-		      e->ts.u.cl->length->value.integer);
+	  mpz_tdiv_q (max, maxb, e->ts.u.cl->length->value.integer);
 	}
       else
 	{
 	  mpz_init_set_si (mlen, len);
-	  mpz_tdiv_q (max, gfc_integer_kinds[i].huge, mlen);
+	  mpz_tdiv_q (max, maxb, mlen);
 	  mpz_clear (mlen);
 	}
+      mpz_clear (maxb);
 
       /* The check itself.  */
       if (mpz_cmp (ncopies, max) > 0)
@@ -5274,7 +5278,7 @@  gfc_simplify_repeat (gfc_expr *e, gfc_expr *n)
       (e->ts.u.cl->length &&
        mpz_sgn (e->ts.u.cl->length->value.integer) != 0))
     {
-      const char *res = gfc_extract_int (n, &ncop);
+      const char *res = gfc_extract_long (n, &ncop);
       gcc_assert (res == NULL);
     }
   else
@@ -5284,11 +5288,18 @@  gfc_simplify_repeat (gfc_expr *e, gfc_expr *n)
     return gfc_get_character_expr (e->ts.kind, &e->where, NULL, 0);
 
   len = e->value.character.length;
-  nlen = ncop * len;
+  size_t nlen = ncop * len;
+
+  /* Here's a semi-arbitrary limit. If the string is longer than 4 MB
+     (2**20 elements * 4 bytes (wide chars) per element) defer to
+     runtime instead of consuming (unbounded) memory and CPU at
+     compile time.  */
+  if (nlen > 1048576)
+    return NULL;
 
   result = gfc_get_character_expr (e->ts.kind, &e->where, NULL, nlen);
-  for (i = 0; i < ncop; i++)
-    for (j = 0; j < len; j++)
+  for (size_t i = 0; i < (size_t) ncop; i++)
+    for (size_t j = 0; j < (size_t) len; j++)
       result->value.character.string[j+i*len]= e->value.character.string[j];
 
   result->value.character.string[nlen] = '\0';	/* For debugger */
diff --git a/gcc/fortran/target-memory.c b/gcc/fortran/target-memory.c
index ac9cce2..a62d4c2 100644
--- a/gcc/fortran/target-memory.c
+++ b/gcc/fortran/target-memory.c
@@ -438,11 +438,9 @@  int
 gfc_interpret_character (unsigned char *buffer, size_t buffer_size,
 			 gfc_expr *result)
 {
-  int i;
-
   if (result->ts.u.cl && result->ts.u.cl->length)
     result->value.character.length =
-      (int) mpz_get_ui (result->ts.u.cl->length->value.integer);
+      (gfc_charlen_t) mpz_get_ui (result->ts.u.cl->length->value.integer);
 
   gcc_assert (buffer_size >= size_character (result->value.character.length,
 					     result->ts.kind));
@@ -450,7 +448,7 @@  gfc_interpret_character (unsigned char *buffer, size_t buffer_size,
     gfc_get_wide_string (result->value.character.length + 1);
 
   if (result->ts.kind == gfc_default_character_kind)
-    for (i = 0; i < result->value.character.length; i++)
+    for (gfc_charlen_t i = 0; i < result->value.character.length; i++)
       result->value.character.string[i] = (gfc_char_t) buffer[i];
   else
     {
@@ -459,7 +457,7 @@  gfc_interpret_character (unsigned char *buffer, size_t buffer_size,
       mpz_init (integer);
       gcc_assert (bytes <= sizeof (unsigned long));
 
-      for (i = 0; i < result->value.character.length; i++)
+      for (gfc_charlen_t i = 0; i < result->value.character.length; i++)
 	{
 	  gfc_conv_tree_to_mpz (integer,
 	    native_interpret_expr (gfc_get_char_type (result->ts.kind),
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 8753cbf..f08a614 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -1909,8 +1909,7 @@  get_array_ctor_var_strlen (stmtblock_t *block, gfc_expr * expr, tree * len)
 	  mpz_init_set_ui (char_len, 1);
 	  mpz_add (char_len, char_len, ref->u.ss.end->value.integer);
 	  mpz_sub (char_len, char_len, ref->u.ss.start->value.integer);
-	  *len = gfc_conv_mpz_to_tree (char_len, gfc_default_integer_kind);
-	  *len = convert (gfc_charlen_type_node, *len);
+	  *len = gfc_conv_mpz_to_tree_type (char_len, gfc_charlen_type_node);
 	  mpz_clear (char_len);
 	  return;
 
diff --git a/gcc/fortran/trans-const.c b/gcc/fortran/trans-const.c
index 812dcc6..3438979 100644
--- a/gcc/fortran/trans-const.c
+++ b/gcc/fortran/trans-const.c
@@ -206,6 +206,18 @@  gfc_conv_mpz_to_tree (mpz_t i, int kind)
   return wide_int_to_tree (gfc_get_int_type (kind), val);
 }
 
+
+/* Convert a GMP integer into a tree node of type given by the type
+   argument.  */
+
+tree
+gfc_conv_mpz_to_tree_type (mpz_t i, const tree type)
+{
+  const wide_int val = wi::from_mpz (type, i, true);
+  return wide_int_to_tree (type, val);
+}
+
+
 /* Converts a backend tree into a GMP integer.  */
 
 void
diff --git a/gcc/fortran/trans-const.h b/gcc/fortran/trans-const.h
index 14b0f79..072de47 100644
--- a/gcc/fortran/trans-const.h
+++ b/gcc/fortran/trans-const.h
@@ -20,6 +20,7 @@  along with GCC; see the file COPYING3.  If not see
 
 /* Converts between INT_CST and GMP integer representations.  */
 tree gfc_conv_mpz_to_tree (mpz_t, int);
+tree gfc_conv_mpz_to_tree_type (mpz_t, const tree);
 void gfc_conv_tree_to_mpz (mpz_t, tree);
 
 /* Converts between REAL_CST and MPFR floating-point representations.  */
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index f659a48..9536a24 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -2334,7 +2334,7 @@  create_function_arglist (gfc_symbol * sym)
 
   if (gfc_return_by_reference (sym))
     {
-      tree type = TREE_VALUE (typelist), length = NULL;
+      tree type = TREE_VALUE (typelist), length = NULL_TREE;
 
       if (sym->ts.type == BT_CHARACTER)
 	{
@@ -2405,6 +2405,7 @@  create_function_arglist (gfc_symbol * sym)
       if (sym->ts.type == BT_CHARACTER)
 	{
 	  gfc_allocate_lang_decl (parm);
+	  gcc_assert (length != NULL_TREE);
 	  arglist = chainon (arglist, length);
 	  typelist = TREE_CHAIN (typelist);
 	}
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index cbfad0b..5b4bb2e 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -250,7 +250,7 @@  gfc_class_len_or_zero_get (tree decl)
   return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
 					     TREE_TYPE (len), decl, len,
 					     NULL_TREE)
-			  : integer_zero_node;
+    : build_int_cst (gfc_charlen_type_node, 0);
 }
 
 
@@ -884,7 +884,8 @@  gfc_conv_intrinsic_to_class (gfc_se *parmse, gfc_expr *e,
 		{
 		  /* Amazingly all data is present to compute the length of a
 		   constant string, but the expression is not yet there.  */
-		  e->ts.u.cl->length = gfc_get_constant_expr (BT_INTEGER, 4,
+		  e->ts.u.cl->length = gfc_get_constant_expr (BT_INTEGER,
+							      gfc_charlen_int_kind,
 							      &e->where);
 		  mpz_set_ui (e->ts.u.cl->length->value.integer,
 			      e->value.character.length);
@@ -902,7 +903,7 @@  gfc_conv_intrinsic_to_class (gfc_se *parmse, gfc_expr *e,
       else
 	tmp = integer_zero_node;
 
-      gfc_add_modify (&parmse->pre, ctree, tmp);
+      gfc_add_modify (&parmse->pre, ctree, fold_convert (TREE_TYPE (ctree), tmp));
     }
   else if (class_ts.type == BT_CLASS
 	   && class_ts.u.derived->components
@@ -1041,7 +1042,7 @@  gfc_conv_class_to_class (gfc_se *parmse, gfc_expr *e, gfc_typespec class_ts,
       if (DECL_LANG_SPECIFIC (tmp) && GFC_DECL_SAVED_DESCRIPTOR (tmp))
 	tmp = GFC_DECL_SAVED_DESCRIPTOR (tmp);
 
-      slen = integer_zero_node;
+      slen = build_int_cst (size_type_node, 0);
     }
   else
     {
@@ -1088,7 +1089,7 @@  gfc_conv_class_to_class (gfc_se *parmse, gfc_expr *e, gfc_typespec class_ts,
 	  tmp = slen;
 	}
       else
-	tmp = integer_zero_node;
+	tmp = build_int_cst (size_type_node, 0);
       gfc_add_modify (&parmse->pre, ctree,
 		      fold_convert (TREE_TYPE (ctree), tmp));
 
@@ -1227,7 +1228,7 @@  gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
       if (from != NULL_TREE && unlimited)
 	from_len = gfc_class_len_or_zero_get (from);
       else
-	from_len = integer_zero_node;
+	from_len = build_int_cst (size_type_node, 0);
     }
 
   if (GFC_CLASS_TYPE_P (TREE_TYPE (to)))
@@ -1339,7 +1340,7 @@  gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
 
 	  tmp = fold_build2_loc (input_location, GT_EXPR,
 				 boolean_type_node, from_len,
-				 integer_zero_node);
+				 build_int_cst (TREE_TYPE (from_len), 0));
 	  tmp = fold_build3_loc (input_location, COND_EXPR,
 				 void_type_node, tmp, extcopy, stdcopy);
 	  gfc_add_expr_to_block (&body, tmp);
@@ -1367,7 +1368,7 @@  gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
 	  extcopy = build_call_vec (fcn_type, fcn, args);
 	  tmp = fold_build2_loc (input_location, GT_EXPR,
 				 boolean_type_node, from_len,
-				 integer_zero_node);
+				 build_int_cst (TREE_TYPE (from_len), 0));
 	  tmp = fold_build3_loc (input_location, COND_EXPR,
 				 void_type_node, tmp, extcopy, stdcopy);
 	}
@@ -2195,7 +2196,7 @@  gfc_conv_string_length (gfc_charlen * cl, gfc_expr * expr, stmtblock_t * pblock)
 
   gfc_conv_expr_type (&se, cl->length, gfc_charlen_type_node);
   se.expr = fold_build2_loc (input_location, MAX_EXPR, gfc_charlen_type_node,
-			     se.expr, build_int_cst (gfc_charlen_type_node, 0));
+			     se.expr, build_int_cst (TREE_TYPE (se.expr), 0));
   gfc_add_block_to_block (pblock, &se.pre);
 
   if (cl->backend_decl)
@@ -2267,7 +2268,7 @@  gfc_conv_substring (gfc_se * se, gfc_ref * ref, int kind,
       /* Check lower bound.  */
       fault = fold_build2_loc (input_location, LT_EXPR, boolean_type_node,
 			       start.expr,
-			       build_int_cst (gfc_charlen_type_node, 1));
+			       build_int_cst (TREE_TYPE (start.expr), 1));
       fault = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR,
 			       boolean_type_node, nonempty, fault);
       if (name)
@@ -2303,7 +2304,7 @@  gfc_conv_substring (gfc_se * se, gfc_ref * ref, int kind,
   if (ref->u.ss.end
       && gfc_dep_difference (ref->u.ss.end, ref->u.ss.start, &length))
     {
-      int i_len;
+      long i_len;
 
       i_len = mpz_get_si (length) + 1;
       if (i_len < 0)
@@ -2315,7 +2316,8 @@  gfc_conv_substring (gfc_se * se, gfc_ref * ref, int kind,
   else
     {
       tmp = fold_build2_loc (input_location, MINUS_EXPR, gfc_charlen_type_node,
-			     end.expr, start.expr);
+			     fold_convert (gfc_charlen_type_node, end.expr),
+			     fold_convert (gfc_charlen_type_node, start.expr));
       tmp = fold_build2_loc (input_location, PLUS_EXPR, gfc_charlen_type_node,
 			     build_int_cst (gfc_charlen_type_node, 1), tmp);
       tmp = fold_build2_loc (input_location, MAX_EXPR, gfc_charlen_type_node,
@@ -3115,9 +3117,10 @@  gfc_conv_string_tmp (gfc_se * se, tree type, tree len)
     {
       /* Create a temporary variable to hold the result.  */
       tmp = fold_build2_loc (input_location, MINUS_EXPR,
-			     gfc_charlen_type_node, len,
+			     gfc_charlen_type_node,
+			     fold_convert (gfc_charlen_type_node, len),
 			     build_int_cst (gfc_charlen_type_node, 1));
-      tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node, tmp);
+      tmp = build_range_type (gfc_charlen_type_node, gfc_index_zero_node, tmp);
 
       if (TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE)
 	tmp = build_array_type (TREE_TYPE (TREE_TYPE (type)), tmp);
@@ -3180,7 +3183,9 @@  gfc_conv_concat_op (gfc_se * se, gfc_expr * expr)
     {
       len = fold_build2_loc (input_location, PLUS_EXPR,
 			     TREE_TYPE (lse.string_length),
-			     lse.string_length, rse.string_length);
+			     lse.string_length,
+			     fold_convert (TREE_TYPE (lse.string_length),
+					   rse.string_length));
     }
 
   type = build_pointer_type (type);
@@ -5872,7 +5877,7 @@  gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	  tmp = fold_convert (gfc_charlen_type_node, parmse.expr);
 	  tmp = fold_build2_loc (input_location, MAX_EXPR,
 				 gfc_charlen_type_node, tmp,
-				 build_int_cst (gfc_charlen_type_node, 0));
+				 build_int_cst (TREE_TYPE (tmp), 0));
 	  cl.backend_decl = tmp;
 	}
 
@@ -7199,7 +7204,8 @@  alloc_scalar_allocatable_for_subcomponent_assignment (stmtblock_t *block,
 
   if (cm->ts.type == BT_CHARACTER && cm->ts.deferred)
     /* Update the lhs character length.  */
-    gfc_add_modify (block, lhs_cl_size, size);
+    gfc_add_modify (block, lhs_cl_size,
+		    fold_convert (TREE_TYPE (lhs_cl_size), size));
 }
 
 
@@ -7438,7 +7444,8 @@  gfc_trans_subcomponent_assign (tree dest, gfc_component * cm, gfc_expr * expr,
 				     1, size);
 	  gfc_add_modify (&block, dest,
 			  fold_convert (TREE_TYPE (dest), tmp));
-	  gfc_add_modify (&block, strlen, se.string_length);
+	  gfc_add_modify (&block, strlen,
+			  fold_convert (TREE_TYPE (strlen), se.string_length));
 	  tmp = gfc_build_memcpy_call (dest, se.expr, size);
 	  gfc_add_expr_to_block (&block, tmp);
 	}
@@ -8104,7 +8111,7 @@  trans_class_vptr_len_assignment (stmtblock_t *block, gfc_expr * le,
 		  from_len = gfc_evaluate_now (se.expr, block);
 		}
 	      else
-		from_len = integer_zero_node;
+		from_len = build_int_cst (gfc_charlen_type_node, 0);
 	    }
 	  gfc_add_modify (pre, to_len, fold_convert (TREE_TYPE (to_len),
 						     from_len));
@@ -8233,7 +8240,7 @@  gfc_trans_pointer_assignment (gfc_expr * expr1, gfc_expr * expr2)
 	    gfc_add_modify (&block, lse.string_length, rse.string_length);
 	  else if (lse.string_length != NULL)
 	    gfc_add_modify (&block, lse.string_length,
-			    build_int_cst (gfc_charlen_type_node, 0));
+			    build_int_cst (TREE_TYPE (lse.string_length), 0));
 	}
 
       gfc_add_modify (&block, lse.expr,
@@ -9488,7 +9495,9 @@  alloc_scalar_allocatable_for_assignment (stmtblock_t *block,
   if (expr1->ts.type == BT_CHARACTER && expr1->ts.deferred)
     {
       cond = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
-			      lse.string_length, size);
+			      lse.string_length,
+			      fold_convert (TREE_TYPE (lse.string_length),
+					    size));
       /* Jump past the realloc if the lengths are the same.  */
       tmp = build3_v (COND_EXPR, cond,
 		      build1_v (GOTO_EXPR, jump_label2),
@@ -9505,7 +9514,8 @@  alloc_scalar_allocatable_for_assignment (stmtblock_t *block,
 
       /* Update the lhs character length.  */
       size = string_length;
-      gfc_add_modify (block, lse.string_length, size);
+      gfc_add_modify (block, lse.string_length,
+		      fold_convert (TREE_TYPE (lse.string_length), size));
     }
 }
 
@@ -9666,7 +9676,7 @@  trans_class_assignment (stmtblock_t *block, gfc_expr *lhs, gfc_expr *rhs,
 
 	  tmp = fold_build2_loc (input_location, GT_EXPR,
 				 boolean_type_node, from_len,
-				 integer_zero_node);
+				 build_int_cst (TREE_TYPE (from_len), 0));
 	  return fold_build3_loc (input_location, COND_EXPR,
 				  void_type_node, tmp,
 				  extcopy, stdcopy);
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index d7612f6..df414ee 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -7491,10 +7491,13 @@  gfc_conv_associated (gfc_se *se, gfc_expr *expr)
 
       nonzero_charlen = NULL_TREE;
       if (arg1->expr->ts.type == BT_CHARACTER)
-	nonzero_charlen = fold_build2_loc (input_location, NE_EXPR,
-					   boolean_type_node,
-					   arg1->expr->ts.u.cl->backend_decl,
-					   integer_zero_node);
+	nonzero_charlen
+	  = fold_build2_loc (input_location, NE_EXPR,
+			     boolean_type_node,
+			     arg1->expr->ts.u.cl->backend_decl,
+			     build_int_cst
+			     (TREE_TYPE (arg1->expr->ts.u.cl->backend_decl),
+			      0));
       if (scalar)
         {
 	  /* A pointer to a scalar.  */
@@ -7784,11 +7787,11 @@  gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
 
   /* We store in charsize the size of a character.  */
   i = gfc_validate_kind (BT_CHARACTER, expr->ts.kind, false);
-  size = build_int_cst (size_type_node, gfc_character_kinds[i].bit_size / 8);
+  size = build_int_cst (sizetype, gfc_character_kinds[i].bit_size / 8);
 
   /* Get the arguments.  */
   gfc_conv_intrinsic_function_args (se, expr, args, 3);
-  slen = fold_convert (size_type_node, gfc_evaluate_now (args[0], &se->pre));
+  slen = fold_convert (sizetype, gfc_evaluate_now (args[0], &se->pre));
   src = args[1];
   ncopies = gfc_evaluate_now (args[2], &se->pre);
   ncopies_type = TREE_TYPE (ncopies);
@@ -7805,7 +7808,7 @@  gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
      is valid, and nothing happens.  */
   n = gfc_create_var (ncopies_type, "ncopies");
   cond = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node, slen,
-			  build_int_cst (size_type_node, 0));
+			  size_zero_node);
   tmp = fold_build3_loc (input_location, COND_EXPR, ncopies_type, cond,
 			 build_int_cst (ncopies_type, 0), ncopies);
   gfc_add_modify (&se->pre, n, tmp);
@@ -7815,17 +7818,17 @@  gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
      (or equal to) MAX / slen, where MAX is the maximal integer of
      the gfc_charlen_type_node type.  If slen == 0, we need a special
      case to avoid the division by zero.  */
-  i = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);
-  max = gfc_conv_mpz_to_tree (gfc_integer_kinds[i].huge, gfc_charlen_int_kind);
-  max = fold_build2_loc (input_location, TRUNC_DIV_EXPR, size_type_node,
-			  fold_convert (size_type_node, max), slen);
-  largest = TYPE_PRECISION (size_type_node) > TYPE_PRECISION (ncopies_type)
-	      ? size_type_node : ncopies_type;
+  max = fold_build2_loc (input_location, TRUNC_DIV_EXPR, sizetype,
+			 fold_convert (sizetype,
+				       TYPE_MAX_VALUE (gfc_charlen_type_node)),
+			 slen);
+  largest = TYPE_PRECISION (sizetype) > TYPE_PRECISION (ncopies_type)
+	      ? sizetype : ncopies_type;
   cond = fold_build2_loc (input_location, GT_EXPR, boolean_type_node,
 			  fold_convert (largest, ncopies),
 			  fold_convert (largest, max));
   tmp = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node, slen,
-			 build_int_cst (size_type_node, 0));
+			 size_zero_node);
   cond = fold_build3_loc (input_location, COND_EXPR, boolean_type_node, tmp,
 			  boolean_false_node, cond);
   gfc_trans_runtime_check (true, false, cond, &se->pre, &expr->where,
@@ -7842,8 +7845,8 @@  gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
        for (i = 0; i < ncopies; i++)
          memmove (dest + (i * slen * size), src, slen*size);  */
   gfc_start_block (&block);
-  count = gfc_create_var (ncopies_type, "count");
-  gfc_add_modify (&block, count, build_int_cst (ncopies_type, 0));
+  count = gfc_create_var (sizetype, "count");
+  gfc_add_modify (&block, count, size_zero_node);
   exit_label = gfc_build_label_decl (NULL_TREE);
 
   /* Start the loop body.  */
@@ -7851,7 +7854,7 @@  gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
 
   /* Exit the loop if count >= ncopies.  */
   cond = fold_build2_loc (input_location, GE_EXPR, boolean_type_node, count,
-			  ncopies);
+			  fold_convert (sizetype, ncopies));
   tmp = build1_v (GOTO_EXPR, exit_label);
   TREE_USED (exit_label) = 1;
   tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, cond, tmp,
@@ -7859,25 +7862,22 @@  gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
   gfc_add_expr_to_block (&body, tmp);
 
   /* Call memmove (dest + (i*slen*size), src, slen*size).  */
-  tmp = fold_build2_loc (input_location, MULT_EXPR, gfc_charlen_type_node,
-			 fold_convert (gfc_charlen_type_node, slen),
-			 fold_convert (gfc_charlen_type_node, count));
-  tmp = fold_build2_loc (input_location, MULT_EXPR, gfc_charlen_type_node,
-			 tmp, fold_convert (gfc_charlen_type_node, size));
+  tmp = fold_build2_loc (input_location, MULT_EXPR, sizetype, slen,
+			 count);
+  tmp = fold_build2_loc (input_location, MULT_EXPR, sizetype, tmp,
+			 size);
   tmp = fold_build_pointer_plus_loc (input_location,
 				     fold_convert (pvoid_type_node, dest), tmp);
   tmp = build_call_expr_loc (input_location,
 			     builtin_decl_explicit (BUILT_IN_MEMMOVE),
 			     3, tmp, src,
 			     fold_build2_loc (input_location, MULT_EXPR,
-					      size_type_node, slen,
-					      fold_convert (size_type_node,
-							    size)));
+					      size_type_node, slen, size));
   gfc_add_expr_to_block (&body, tmp);
 
   /* Increment count.  */
-  tmp = fold_build2_loc (input_location, PLUS_EXPR, ncopies_type,
-			 count, build_int_cst (TREE_TYPE (count), 1));
+  tmp = fold_build2_loc (input_location, PLUS_EXPR, sizetype,
+			 count, size_one_node);
   gfc_add_modify (&body, count, tmp);
 
   /* Build the loop.  */
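
Note on the gfc_conv_intrinsic_repeat hunks above (commentary, not part of the patch): the runtime check flags NCOPIES as too large when it exceeds MAX / slen, and it special-cases slen == 0 so that no division by zero occurs. The same arithmetic can be sketched in plain C. The names CHARLEN_MAX_SKETCH and repeat_too_large_sketch are hypothetical, and the sketch assumes that the maximum character length equals PTRDIFF_MAX, i.e. that the signed counterpart of size_t has the width of ptrdiff_t, which holds on common targets but is not guaranteed. Negative NCOPIES values are rejected by a separate check and are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the largest representable character length is the maximum
   of the signed type that has the same size as size_t.  */
#define CHARLEN_MAX_SKETCH ((size_t) PTRDIFF_MAX)

/* Hypothetical helper: return true if repeating a string of SLEN
   characters NCOPIES times would exceed the maximum character length.  */
bool
repeat_too_large_sketch (size_t slen, size_t ncopies)
{
  /* A zero-length string can be repeated any number of times; this
     also avoids dividing by zero below.  */
  if (slen == 0)
    return false;

  /* Dividing instead of multiplying keeps the test free of overflow.  */
  return ncopies > CHARLEN_MAX_SKETCH / slen;
}
```

This matches the limit cases exercised by repeat_4.f90: for a two-character string, huge/2 copies are accepted and huge/2+1 copies are rejected.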
diff --git a/gcc/fortran/trans-io.c b/gcc/fortran/trans-io.c
index 253a5ac..a2992d8 100644
--- a/gcc/fortran/trans-io.c
+++ b/gcc/fortran/trans-io.c
@@ -339,11 +339,11 @@  gfc_build_io_library_fndecls (void)
 
   iocall[IOCALL_X_CHARACTER] = gfc_build_library_function_decl_with_spec (
 	get_identifier (PREFIX("transfer_character")), ".wW",
-	void_type_node, 3, dt_parm_type, pvoid_type_node, gfc_int4_type_node);
+	void_type_node, 3, dt_parm_type, pvoid_type_node, gfc_charlen_type_node);
 
   iocall[IOCALL_X_CHARACTER_WRITE] = gfc_build_library_function_decl_with_spec (
 	get_identifier (PREFIX("transfer_character_write")), ".wR",
-	void_type_node, 3, dt_parm_type, pvoid_type_node, gfc_int4_type_node);
+	void_type_node, 3, dt_parm_type, pvoid_type_node, gfc_charlen_type_node);
 
   iocall[IOCALL_X_CHARACTER_WIDE] = gfc_build_library_function_decl_with_spec (
 	get_identifier (PREFIX("transfer_character_wide")), ".wW",
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index d34bdba..3285ee0 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -112,7 +112,7 @@  gfc_trans_label_assign (gfc_code * code)
       || code->label1->defined == ST_LABEL_DO_TARGET)
     {
       label_tree = gfc_build_addr_expr (pvoid_type_node, label_tree);
-      len_tree = integer_minus_one_node;
+      len_tree = build_int_cst (gfc_charlen_type_node, -1);
     }
   else
     {
@@ -125,7 +125,7 @@  gfc_trans_label_assign (gfc_code * code)
       label_tree = gfc_build_addr_expr (pvoid_type_node, label_tree);
     }
 
-  gfc_add_modify (&se.pre, len, len_tree);
+  gfc_add_modify (&se.pre, len, fold_convert (TREE_TYPE (len), len_tree));
   gfc_add_modify (&se.pre, addr, label_tree);
 
   return gfc_finish_block (&se.pre);
@@ -2750,7 +2750,7 @@  gfc_trans_character_select (gfc_code *code)
     {
       for (d = cp; d; d = d->right)
 	{
-	  int i;
+	  gfc_charlen_t i;
 	  if (d->low)
 	    {
 	      gcc_assert (d->low->expr_type == EXPR_CONSTANT
@@ -2955,7 +2955,7 @@  gfc_trans_character_select (gfc_code *code)
       if (d->low == NULL)
         {
           CONSTRUCTOR_APPEND_ELT (node, ss_string1[k], null_pointer_node);
-          CONSTRUCTOR_APPEND_ELT (node, ss_string1_len[k], integer_zero_node);
+          CONSTRUCTOR_APPEND_ELT (node, ss_string1_len[k], build_int_cst (gfc_charlen_type_node, 0));
         }
       else
         {
@@ -2968,7 +2968,7 @@  gfc_trans_character_select (gfc_code *code)
       if (d->high == NULL)
         {
           CONSTRUCTOR_APPEND_ELT (node, ss_string2[k], null_pointer_node);
-          CONSTRUCTOR_APPEND_ELT (node, ss_string2_len[k], integer_zero_node);
+          CONSTRUCTOR_APPEND_ELT (node, ss_string2_len[k], build_int_cst (gfc_charlen_type_node, 0));
         }
       else
         {
@@ -5640,7 +5640,7 @@  gfc_trans_allocate (gfc_code * code)
 	{
 	  gfc_init_se (&se, NULL);
 	  temp_var_needed = false;
-	  expr3_len = integer_zero_node;
+	  expr3_len = build_int_cst (gfc_charlen_type_node, 0);
 	  e3_is = E3_MOLD;
 	}
       /* Prevent aliasing, i.e., se.expr may be already a
@@ -6036,7 +6036,8 @@  gfc_trans_allocate (gfc_code * code)
 		     e.g., a string.  */
 		  memsz = fold_build2_loc (input_location, GT_EXPR,
 					   boolean_type_node, expr3_len,
-					   integer_zero_node);
+					   build_int_cst
+					   (TREE_TYPE (expr3_len), 0));
 		  memsz = fold_build3_loc (input_location, COND_EXPR,
 					 TREE_TYPE (expr3_esize),
 					 memsz, tmp, expr3_esize);
@@ -6332,7 +6333,7 @@  gfc_trans_allocate (gfc_code * code)
 		gfc_build_addr_expr (pchar_type_node,
 			gfc_build_localized_cstring_const (msg)));
 
-      slen = build_int_cst (gfc_charlen_type_node, ((int) strlen (msg)));
+      slen = build_int_cst (gfc_charlen_type_node, strlen (msg));
       dlen = gfc_get_expr_charlen (code->expr2);
       slen = fold_build2_loc (input_location, MIN_EXPR,
 			      TREE_TYPE (slen), dlen, slen);
@@ -6612,7 +6613,7 @@  gfc_trans_deallocate (gfc_code *code)
       gfc_add_modify (&errmsg_block, errmsg_str,
 		gfc_build_addr_expr (pchar_type_node,
                         gfc_build_localized_cstring_const (msg)));
-      slen = build_int_cst (gfc_charlen_type_node, ((int) strlen (msg)));
+      slen = build_int_cst (gfc_charlen_type_node, strlen (msg));
       dlen = gfc_get_expr_charlen (code->expr2);
 
       gfc_trans_string_copy (&errmsg_block, dlen, errmsg, code->expr2->ts.kind,
diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index 354308f..3a9c39d 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -118,6 +118,9 @@  int gfc_intio_kind;
 /* The integer kind used to store character lengths.  */
 int gfc_charlen_int_kind;
 
+/* Kind of internal integer for storing object sizes.  */
+int gfc_size_kind;
+
 /* The size of the numeric storage unit and character storage unit.  */
 int gfc_numeric_storage_size;
 int gfc_character_storage_size;
@@ -965,9 +968,14 @@  gfc_init_types (void)
   boolean_true_node = build_int_cst (boolean_type_node, 1);
   boolean_false_node = build_int_cst (boolean_type_node, 0);
 
-  /* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
-  gfc_charlen_int_kind = 4;
+
+  /* Character lengths are of type size_t, except signed.  */
+  gfc_charlen_int_kind = get_int_kind_from_node (size_type_node);
   gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
+
+  /* Fortran kind number of size_type_node (size_t). This is used for
+     the _size member in vtables.  */
+  gfc_size_kind = get_int_kind_from_node (size_type_node);
 }
 
 /* Get the type node for the given type and kind.  */
diff --git a/gcc/fortran/trans-types.h b/gcc/fortran/trans-types.h
index e8e92bf..6328125 100644
--- a/gcc/fortran/trans-types.h
+++ b/gcc/fortran/trans-types.h
@@ -23,6 +23,8 @@  along with GCC; see the file COPYING3.  If not see
 #ifndef GFC_BACKEND_H
 #define GFC_BACKEND_H
 
+#include "trans.h"
+
 extern GTY(()) tree gfc_array_index_type;
 extern GTY(()) tree gfc_array_range_type;
 extern GTY(()) tree gfc_character1_type_node;
@@ -35,10 +37,9 @@  extern GTY(()) tree gfc_complex_float128_type_node;
 
 /* This is the type used to hold the lengths of character variables.
    It must be the same as the corresponding definition in gfortran.h.  */
-/* TODO: This is still hardcoded as kind=4 in some bits of the compiler
-   and runtime library.  */
 extern GTY(()) tree gfc_charlen_type_node;
 
+
 /* The following flags give us information on the correspondence of
    real (and complex) kinds with C floating-point types long double
    and __float128.  */
diff --git a/gcc/testsuite/gfortran.dg/char_result_8.f90 b/gcc/testsuite/gfortran.dg/char_result_8.f90
index 69b1196..e265060 100644
--- a/gcc/testsuite/gfortran.dg/char_result_8.f90
+++ b/gcc/testsuite/gfortran.dg/char_result_8.f90
@@ -2,6 +2,7 @@ 
 ! functions that return strings.
 ! { dg-do run }
 program main
+  use iso_c_binding, only : c_size_t
   implicit none
 
   character (len = 30), target :: string
@@ -9,7 +10,7 @@  program main
   call test (f1 (), 30)
   call test (f2 (50), 50)
   call test (f3 (), 30)
-  call test (f4 (70), 70)
+  call test (f4 (70_c_size_t), 70)
 
   call indirect (100)
 contains
@@ -30,7 +31,8 @@  contains
   end function f3
 
   function f4 (i)
-    integer :: i
+    ! Use kind=c_size_t to work around PR 78757
+    integer(c_size_t) :: i
     character (len = i), pointer :: f4
     f4 => string
   end function f4
@@ -40,7 +42,7 @@  contains
     call test (f1 (), 30)
     call test (f2 (i), i)
     call test (f3 (), 30)
-    call test (f4 (i), i)
+    call test (f4 (int(i, c_size_t)), i)
   end subroutine indirect
 
   subroutine test (string, length)
diff --git a/gcc/testsuite/gfortran.dg/repeat_4.f90 b/gcc/testsuite/gfortran.dg/repeat_4.f90
index e5b5acc..99e7aee 100644
--- a/gcc/testsuite/gfortran.dg/repeat_4.f90
+++ b/gcc/testsuite/gfortran.dg/repeat_4.f90
@@ -2,6 +2,7 @@ 
 !
 ! { dg-do compile }
 program test
+  use iso_c_binding, only: k => c_size_t
   implicit none
   character(len=0), parameter :: s0 = "" 
   character(len=1), parameter :: s1 = "a"
@@ -21,18 +22,18 @@  program test
   print *, repeat(t2, -1) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is negative" }
 
   ! Check for too large NCOPIES argument and limit cases
-  print *, repeat(t0, huge(0))
-  print *, repeat(t1, huge(0))
-  print *, repeat(t2, huge(0)) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
-  print *, repeat(s2, huge(0)) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
+  print *, repeat(t0, huge(0_k))
+  print *, repeat(t1, huge(0_k))
+  print *, repeat(t2, huge(0_k)) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
+  print *, repeat(s2, huge(0_k)) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
 
-  print *, repeat(t0, huge(0)/2)
-  print *, repeat(t1, huge(0)/2)
-  print *, repeat(t2, huge(0)/2)
+  print *, repeat(t0, huge(0_k)/2)
+  print *, repeat(t1, huge(0_k)/2)
+  print *, repeat(t2, huge(0_k)/2)
 
-  print *, repeat(t0, huge(0)/2+1)
-  print *, repeat(t1, huge(0)/2+1)
-  print *, repeat(t2, huge(0)/2+1) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
-  print *, repeat(s2, huge(0)/2+1) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
+  print *, repeat(t0, huge(0_k)/2+1)
+  print *, repeat(t1, huge(0_k)/2+1)
+  print *, repeat(t2, huge(0_k)/2+1) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
+  print *, repeat(s2, huge(0_k)/2+1) ! { dg-error "Argument NCOPIES of REPEAT intrinsic is too large " }
 
 end program test
diff --git a/gcc/testsuite/gfortran.dg/repeat_7.f90 b/gcc/testsuite/gfortran.dg/repeat_7.f90
new file mode 100644
index 0000000..82f8dbf
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/repeat_7.f90
@@ -0,0 +1,8 @@ 
+! { dg-do compile }
+! PR 66310
+! Make sure there is a limit on the size of the arrays we try to handle
+! at compile time.
+program p
+  character, parameter :: z = 'z'
+  print *, repeat(z, huge(1_4))
+end program p
diff --git a/gcc/testsuite/gfortran.dg/scan_2.f90 b/gcc/testsuite/gfortran.dg/scan_2.f90
index c58a3a2..5ef0230 100644
--- a/gcc/testsuite/gfortran.dg/scan_2.f90
+++ b/gcc/testsuite/gfortran.dg/scan_2.f90
@@ -30,5 +30,5 @@  program p1
    call s1(.TRUE.)
 end program p1
 
-! { dg-final { scan-tree-dump-times "iscan = _gfortran_string_scan \\(2," 1 "original" } }
-! { dg-final { scan-tree-dump-times "iverify = _gfortran_string_verify \\(2," 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_string_scan \\(2," 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_string_verify \\(2," 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/string_1.f90 b/gcc/testsuite/gfortran.dg/string_1.f90
index 11dc5b7..6a6151e 100644
--- a/gcc/testsuite/gfortran.dg/string_1.f90
+++ b/gcc/testsuite/gfortran.dg/string_1.f90
@@ -1,4 +1,5 @@ 
 ! { dg-do compile }
+! { dg-require-effective-target ilp32 }
 !
 program main
   implicit none
diff --git a/gcc/testsuite/gfortran.dg/string_1_lp64.f90 b/gcc/testsuite/gfortran.dg/string_1_lp64.f90
new file mode 100644
index 0000000..a0edbef
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/string_1_lp64.f90
@@ -0,0 +1,15 @@ 
+! { dg-do compile }
+! { dg-require-effective-target lp64 }
+! { dg-require-effective-target fortran_integer_16 }
+program main
+  implicit none
+  integer(kind=16), parameter :: l1 = 2_16**64_16
+  character (len=2_16**64_16+4_16), parameter :: s = "" ! { dg-error "too large" }
+  character (len=2_16**64_8+4_16) :: ch ! { dg-error "too large" }
+  character (len=l1 + 1_16) :: v ! { dg-error "too large" }
+  character (len=int(huge(0_8),kind=16) + 1_16) :: z ! { dg-error "too large" }
+  character (len=int(huge(0_8),kind=16) + 0_16) :: w
+
+  print *, len(s)
+
+end program main
diff --git a/gcc/testsuite/gfortran.dg/string_3.f90 b/gcc/testsuite/gfortran.dg/string_3.f90
index 7daf8d3..4a88b06 100644
--- a/gcc/testsuite/gfortran.dg/string_3.f90
+++ b/gcc/testsuite/gfortran.dg/string_3.f90
@@ -1,4 +1,5 @@ 
 ! { dg-do compile }
+! { dg-require-effective-target ilp32 }
 !
 subroutine foo(i)
   implicit none
diff --git a/gcc/testsuite/gfortran.dg/string_3_lp64.f90 b/gcc/testsuite/gfortran.dg/string_3_lp64.f90
new file mode 100644
index 0000000..162561f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/string_3_lp64.f90
@@ -0,0 +1,20 @@ 
+! { dg-do compile }
+! { dg-require-effective-target lp64 }
+! { dg-require-effective-target fortran_integer_16 }
+subroutine foo(i)
+  implicit none
+  integer, intent(in) :: i
+  character(len=i) :: s
+
+  s = ''
+  print *, s(1:2_16**64_16+3_16) ! { dg-error "too large" }
+  print *, s(2_16**64_16+3_16:2_16**64_16+4_16) ! { dg-error "too large" }
+  print *, len(s(1:2_16**64_16+3_16)) ! { dg-error "too large" }
+  print *, len(s(2_16**64_16+3_16:2_16**64_16+4_16)) ! { dg-error "too large" }
+
+  print *, s(2_16**64_16+3_16:1)
+  print *, s(2_16**64_16+4_16:2_16**64_16+3_16)
+  print *, len(s(2_16**64_16+3_16:1))
+  print *, len(s(2_16**64_16+4_16:2_16**64_16+3_16))
+
+end subroutine
diff --git a/libgfortran/intrinsics/args.c b/libgfortran/intrinsics/args.c
index 517ebc9..0a5923e 100644
--- a/libgfortran/intrinsics/args.c
+++ b/libgfortran/intrinsics/args.c
@@ -37,7 +37,6 @@  void
 getarg_i4 (GFC_INTEGER_4 *pos, char  *val, gfc_charlen_type val_len)
 {
   int argc;
-  int arglen;
   char **argv;
 
   get_args (&argc, &argv);
@@ -49,7 +48,7 @@  getarg_i4 (GFC_INTEGER_4 *pos, char  *val, gfc_charlen_type val_len)
 
   if ((*pos) + 1 <= argc  && *pos >=0 )
     {
-      arglen = strlen (argv[*pos]);
+      gfc_charlen_type arglen = strlen (argv[*pos]);
       if (arglen > val_len)
 	arglen = val_len;
       memcpy (val, argv[*pos], arglen);
@@ -119,7 +118,8 @@  get_command_argument_i4 (GFC_INTEGER_4 *number, char *value,
 			 GFC_INTEGER_4 *length, GFC_INTEGER_4 *status, 
 			 gfc_charlen_type value_len)
 {
-  int argc, arglen = 0, stat_flag = GFC_GC_SUCCESS;
+  int argc, stat_flag = GFC_GC_SUCCESS;
+  gfc_charlen_type arglen = 0;
   char **argv;
 
   if (number == NULL )
@@ -195,10 +195,10 @@  void
 get_command_i4 (char *command, GFC_INTEGER_4 *length, GFC_INTEGER_4 *status,
 		gfc_charlen_type command_len)
 {
-  int i, argc, arglen, thisarg;
+  int i, argc, thisarg;
   int stat_flag = GFC_GC_SUCCESS;
-  int tot_len = 0;
   char **argv;
+  gfc_charlen_type arglen, tot_len = 0;
 
   if (command == NULL && length == NULL && status == NULL)
     return; /* No need to do anything.  */
diff --git a/libgfortran/intrinsics/chmod.c b/libgfortran/intrinsics/chmod.c
index 5aae77b..bbca4dc 100644
--- a/libgfortran/intrinsics/chmod.c
+++ b/libgfortran/intrinsics/chmod.c
@@ -66,7 +66,6 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 static int
 chmod_internal (char *file, char *mode, gfc_charlen_type mode_len)
 {
-  int i;
   bool ugo[3];
   bool rwxXstugo[9];
   int set_mode, part;
@@ -112,7 +111,7 @@  chmod_internal (char *file, char *mode, gfc_charlen_type mode_len)
   honor_umask = false;
 #endif
 
-  for (i = 0; i < mode_len; i++)
+  for (gfc_charlen_type i = 0; i < mode_len; i++)
     {
       if (!continue_clause)
 	{
diff --git a/libgfortran/intrinsics/env.c b/libgfortran/intrinsics/env.c
index fe94f9e..26ad7fe 100644
--- a/libgfortran/intrinsics/env.c
+++ b/libgfortran/intrinsics/env.c
@@ -94,7 +94,8 @@  get_environment_variable_i4 (char *name, char *value, GFC_INTEGER_4 *length,
 			     gfc_charlen_type name_len,
 			     gfc_charlen_type value_len)
 {
-  int stat = GFC_SUCCESS, res_len = 0;
+  int stat = GFC_SUCCESS;
+  gfc_charlen_type res_len = 0;
   char *name_nt;
   char *res;
 
diff --git a/libgfortran/intrinsics/extends_type_of.c b/libgfortran/intrinsics/extends_type_of.c
index 595dc6a..8c1e4db 100644
--- a/libgfortran/intrinsics/extends_type_of.c
+++ b/libgfortran/intrinsics/extends_type_of.c
@@ -31,7 +31,7 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 typedef struct vtype
 {
   GFC_INTEGER_4 hash;
-  GFC_INTEGER_4 size;
+  size_t size;
   struct vtype *extends;
 }
 vtype;
diff --git a/libgfortran/intrinsics/gerror.c b/libgfortran/intrinsics/gerror.c
index d3b28da..ff05737 100644
--- a/libgfortran/intrinsics/gerror.c
+++ b/libgfortran/intrinsics/gerror.c
@@ -39,7 +39,7 @@  export_proto_np(PREFIX(gerror));
 void 
 PREFIX(gerror) (char * msg, gfc_charlen_type msg_len)
 {
-  int p_len;
+  gfc_charlen_type p_len;
   char *p;
 
   p = gf_strerror (errno, msg, msg_len);
diff --git a/libgfortran/intrinsics/getlog.c b/libgfortran/intrinsics/getlog.c
index 2d3c74c..89b2c88 100644
--- a/libgfortran/intrinsics/getlog.c
+++ b/libgfortran/intrinsics/getlog.c
@@ -71,7 +71,6 @@  export_proto_np(PREFIX(getlog));
 void
 PREFIX(getlog) (char * login, gfc_charlen_type login_len)
 {
-  int p_len;
   char *p;
 
   memset (login, ' ', login_len); /* Blank the string.  */
@@ -108,7 +107,7 @@  PREFIX(getlog) (char * login, gfc_charlen_type login_len)
   if (p == NULL)
     goto cleanup;
 
-  p_len = strlen (p);
+  gfc_charlen_type p_len = strlen (p);
   if (login_len < p_len)
     p_len = login_len;
   memcpy (login, p, p_len);
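
Note on the args.c, gerror.c and getlog.c hunks above (commentary, not part of the patch): they share one pattern, namely blank the destination, take strlen of the source, clamp it to the destination length, then memcpy. Keeping the clamped length in the same unsigned type as the destination length avoids truncating it through an int. A minimal sketch of that pattern, using size_t in place of gfc_charlen_type and a hypothetical function name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the NUL-terminated string SRC into the
   Fortran-style buffer DEST of DEST_LEN characters, padding with blanks
   and truncating if SRC is longer than the buffer.  */
void
copy_blank_padded_sketch (char *dest, size_t dest_len, const char *src)
{
  memset (dest, ' ', dest_len);	/* Blank the string.  */

  size_t n = strlen (src);
  if (dest_len < n)
    n = dest_len;		/* Clamp to the destination length.  */
  memcpy (dest, src, n);
}
```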
diff --git a/libgfortran/intrinsics/hostnm.c b/libgfortran/intrinsics/hostnm.c
index 221852b..aff4d26 100644
--- a/libgfortran/intrinsics/hostnm.c
+++ b/libgfortran/intrinsics/hostnm.c
@@ -88,8 +88,8 @@  w32_gethostname (char *name, size_t len)
 static int
 hostnm_0 (char *name, gfc_charlen_type name_len)
 {
-  int val, i;
   char p[HOST_NAME_MAX + 1];
+  int val;
 
   memset (name, ' ', name_len);
 
@@ -99,8 +99,7 @@  hostnm_0 (char *name, gfc_charlen_type name_len)
 
   if (val == 0)
   {
-    i = -1;
-    while (i < name_len && p[++i] != '\0')
+    for (gfc_charlen_type i = 0; i < name_len && p[i] != '\0'; i++)
       name[i] = p[i];
   }
 
diff --git a/libgfortran/intrinsics/string_intrinsics_inc.c b/libgfortran/intrinsics/string_intrinsics_inc.c
index aa132ce..4131377 100644
--- a/libgfortran/intrinsics/string_intrinsics_inc.c
+++ b/libgfortran/intrinsics/string_intrinsics_inc.c
@@ -224,14 +224,15 @@  string_len_trim (gfc_charlen_type len, const CHARTYPE *s)
 	      break;
 	    }
 	}
-
-      /* Now continue for the last characters with naive approach below.  */
-      assert (i >= 0);
     }
 
   /* Simply look for the first non-blank character.  */
-  while (i >= 0 && s[i] == ' ')
-    --i;
+  while (s[i] == ' ')
+    {
+      if (i == 0)
+	return 0;
+      --i;
+    }
   return i + 1;
 }
 
@@ -327,12 +328,12 @@  string_scan (gfc_charlen_type slen, const CHARTYPE *str,
 
   if (back)
     {
-      for (i = slen - 1; i >= 0; i--)
+      for (i = slen; i != 0; i--)
 	{
 	  for (j = 0; j < setlen; j++)
 	    {
-	      if (str[i] == set[j])
-		return (i + 1);
+	      if (str[i - 1] == set[j])
+		return i;
 	    }
 	}
     }
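
Note on the string_len_trim and string_scan hunks above (commentary, not part of the patch): once gfc_charlen_type is unsigned, a condition such as i >= 0 is always true, so a backward loop has to test for zero before decrementing, or run the index from slen down to 1 and access element i - 1. A standalone sketch of both idioms follows. The function names are hypothetical, size_t and char stand in for gfc_charlen_type and CHARTYPE, and the word-at-a-time fast path of the real string_len_trim is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for string_len_trim: return the length of S
   (LEN characters) with trailing blanks removed.  */
size_t
len_trim_sketch (size_t len, const char *s)
{
  if (len == 0)
    return 0;

  size_t i = len - 1;
  while (s[i] == ' ')
    {
      /* Check before decrementing: an unsigned index cannot go
	 below zero, it would wrap around instead.  */
      if (i == 0)
	return 0;
      --i;
    }
  return i + 1;
}

/* Hypothetical stand-in for the backward branch of string_scan: return
   the 1-based position of the last character of STR that occurs in
   SET, or 0 if there is none.  */
size_t
scan_back_sketch (size_t slen, const char *str,
		  size_t setlen, const char *set)
{
  /* I runs from SLEN down to 1, so it doubles as the 1-based result.  */
  for (size_t i = slen; i != 0; i--)
    for (size_t j = 0; j < setlen; j++)
      if (str[i - 1] == set[j])
	return i;
  return 0;
}
```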
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 5830362..4287668 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -95,17 +95,17 @@  export_proto(transfer_logical);
 extern void transfer_logical_write (st_parameter_dt *, void *, int);
 export_proto(transfer_logical_write);
 
-extern void transfer_character (st_parameter_dt *, void *, int);
+extern void transfer_character (st_parameter_dt *, void *, gfc_charlen_type);
 export_proto(transfer_character);
 
-extern void transfer_character_write (st_parameter_dt *, void *, int);
+extern void transfer_character_write (st_parameter_dt *, void *, gfc_charlen_type);
 export_proto(transfer_character_write);
 
-extern void transfer_character_wide (st_parameter_dt *, void *, int, int);
+extern void transfer_character_wide (st_parameter_dt *, void *, gfc_charlen_type, int);
 export_proto(transfer_character_wide);
 
 extern void transfer_character_wide_write (st_parameter_dt *,
-					   void *, int, int);
+					   void *, gfc_charlen_type, int);
 export_proto(transfer_character_wide_write);
 
 extern void transfer_complex (st_parameter_dt *, void *, int);
@@ -2259,7 +2259,7 @@  transfer_logical_write (st_parameter_dt *dtp, void *p, int kind)
 }
 
 void
-transfer_character (st_parameter_dt *dtp, void *p, int len)
+transfer_character (st_parameter_dt *dtp, void *p, gfc_charlen_type len)
 {
   static char *empty_string[0];
 
@@ -2277,13 +2277,13 @@  transfer_character (st_parameter_dt *dtp, void *p, int len)
 }
 
 void
-transfer_character_write (st_parameter_dt *dtp, void *p, int len)
+transfer_character_write (st_parameter_dt *dtp, void *p, gfc_charlen_type len)
 {
   transfer_character (dtp, p, len);
 }
 
 void
-transfer_character_wide (st_parameter_dt *dtp, void *p, int len, int kind)
+transfer_character_wide (st_parameter_dt *dtp, void *p, gfc_charlen_type len, int kind)
 {
   static char *empty_string[0];
 
@@ -2301,7 +2301,7 @@  transfer_character_wide (st_parameter_dt *dtp, void *p, int len, int kind)
 }
 
 void
-transfer_character_wide_write (st_parameter_dt *dtp, void *p, int len, int kind)
+transfer_character_wide_write (st_parameter_dt *dtp, void *p, gfc_charlen_type len, int kind)
 {
   transfer_character_wide (dtp, p, len, kind);
 }
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 6fa264c..e27f70d 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -440,10 +440,9 @@  is_trim_ok (st_parameter_dt *dtp)
   if (dtp->common.flags & IOPARM_DT_HAS_FORMAT)
     {
       char *p = dtp->format;
-      off_t i;
       if (dtp->common.flags & IOPARM_DT_HAS_BLANK)
 	return false;
-      for (i = 0; i < dtp->format_len; i++)
+      for (gfc_charlen_type i = 0; i < dtp->format_len; i++)
 	{
 	  if (p[i] == '/') return false;
 	  if (p[i] == 'b' || p[i] == 'B')
diff --git a/libgfortran/io/write.c b/libgfortran/io/write.c
index c8bba3c..607e0aa 100644
--- a/libgfortran/io/write.c
+++ b/libgfortran/io/write.c
@@ -2381,7 +2381,6 @@  void
 namelist_write (st_parameter_dt *dtp)
 {
   namelist_info * t1, *t2, *dummy = NULL;
-  index_type i;
   index_type dummy_offset = 0;
   char c;
   char * dummy_name = NULL;
@@ -2403,7 +2402,7 @@  namelist_write (st_parameter_dt *dtp)
   write_character (dtp, "&", 1, 1, NODELIM);
 
   /* Write namelist name in upper case - f95 std.  */
-  for (i = 0 ;i < dtp->namelist_name_len ;i++ )
+  for (gfc_charlen_type i = 0 ;i < dtp->namelist_name_len ;i++ )
     {
       c = toupper ((int) dtp->namelist_name[i]);
       write_character (dtp, &c, 1 ,1, NODELIM);
diff --git a/libgfortran/libgfortran.h b/libgfortran/libgfortran.h
index b9f2471..847ab63 100644
--- a/libgfortran/libgfortran.h
+++ b/libgfortran/libgfortran.h
@@ -249,7 +249,7 @@  typedef GFC_INTEGER_4 GFC_IO_INT;
 typedef ptrdiff_t index_type;
 
 /* The type used for the lengths of character variables.  */
-typedef GFC_INTEGER_4 gfc_charlen_type;
+typedef size_t gfc_charlen_type;
 
 /* Definitions of CHARACTER data types:
      - CHARACTER(KIND=1) corresponds to the C char type,