diff mbox series

Add workaround for broken C/C++ wrappers to LAPACK (PR fortran/90329)

Message ID 20190516080930.GR19695@tucnak
State New
Headers show
Series Add workaround for broken C/C++ wrappers to LAPACK (PR fortran/90329) | expand

Commit Message

Jakub Jelinek May 16, 2019, 8:09 a.m. UTC
Hi!

Fortran subroutines/functions that have CHARACTER arguments have also
hidden arguments at the end of the argument list which hold the string
length.  This is something all Fortran compilers I've tried do and is
done in order to support calling unprototyped subroutines/functions
where one can't know if the callee is using character(len=constant)
or character(len=*) argument, where for the latter one has to pass
the extra argument because there is no other way to propagate that info,
while for the former it is kind of redundant but still part of the ABI;
the compiler just uses the constant directly instead of asking about the
real passed string length.

Another thing is that until PR87689 has been fixed, the Fortran FE has been
broken, used vararg-ish prototypes in most cases and that prevented (all?)
tail call optimizations in the Fortran code.

Apparently it is a common case that buggy C/C++ wrappers around the Fortran
functions just ignore the ABI and don't pass the hidden string length
arguments at all, which seemed to work fine in gfortran until recently
(because the arguments weren't used).  When we started making tail calls
from such functions, this has of course changed, because when making a tail
call we overwrite the caller's argument slots with the arguments we want to
pass to the callee.  Not really sure why it seemed to work with other
compilers; trying https://fortran.godbolt.org/z/ckMt1t
subroutine foo (a, b, c, d, e, f, g, h)
  integer :: b, c, d, e, f, g, h
  character(len=1) :: a
  call bar (a, b, c, d, e, f, g, h)
end subroutine foo
both older/new gfortran and ifort 19.0.0 emit a tail call which overwrites
the hidden string length argument with 1, but when trying the
https://fortran.godbolt.org/z/xpsH8e LAPACK routine, ifort for some reason
doesn't tail call it.

I'm not really happy to provide workarounds for undefined behavior,
especially because that will mean it might take longer if ever if those
buggy programs are fixed.  On the other side, the PR87689 bug fix has been
backported to all release branches and so now not only trunk, but also 9.1,
8.3.1 and 7.4.1 are affected.  Instead of trying to disable all tail calls,
this patch disables tail calls from functions/subroutines that have those
hidden string length arguments and don't use character(len=*) (in that case
the function wouldn't seem to work previously either, because the argument
is really used), where those hidden string length arguments are passed
on the stack and where the tail callee also would want to pass arguments
on the stack (if we spent even more time on this, we could narrow it down
further and check if the tail call would actually store anything overlapping
the hidden string length arguments on the stack).

This workaround probably needs guarding with some Fortran FE specific
option, so that it can be disabled, will defer that to the Fortran
maintainers.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
release branches (not sure about LTO on the release branches, does one need
to bump anything when changing the LTO format by streaming another bit)?

2019-05-16  Jakub Jelinek  <jakub@redhat.com>

	PR fortran/90329
	* tree-core.h (struct tree_decl_common): Document
	decl_nonshareable_flag for PARM_DECLs.
	* tree.h (DECL_HIDDEN_STRING_LENGTH): Define.
	* calls.c (expand_call): Don't try tail call if caller
	has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be
	passed on the stack and callee needs to pass any arguments on the
	stack.
	* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
	else if instead of series of mutually exclusive ifs.  Handle
	DECL_HIDDEN_STRING_LENGTH for PARM_DECLs.
	* tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise.

	* trans-decl.c (create_function_arglist): Set
	DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if
	len is constant.


	Jakub

Comments

Richard Biener May 16, 2019, 9:22 a.m. UTC | #1
On Thu, 16 May 2019, Jakub Jelinek wrote:

> Hi!
> 
> Fortran subroutines/functions that have CHARACTER arguments have also
> hidden arguments at the end of the argument list which hold the string
> length.  This is something all Fortran compilers I've tried do and is
> done in order to support calling unprototyped subroutines/functions
> where one can't know if the callee is using character(len=constant)
> or character(len=*) argument, where for the latter one has to pass
> the extra argument because there is no other way to propagate that info,
> while for the former it is kind of redundant but still part of the ABI;
> the compiler just uses the constant directly instead of asking about the
> real passed string length.
> 
> Another thing is that until PR87689 has been fixed, the Fortran FE has been
> broken, used vararg-ish prototypes in most cases and that prevented (all?)
> tail call optimizations in the Fortran code.
> 
> Apparently it is a common case that buggy C/C++ wrappers around the Fortran
> functions just ignore the ABI and don't pass the hidden string length
> arguments at all, which seemed to work fine in gfortran until recently
> (because the arguments weren't used).  When we started making tail calls
> from such functions, this has of course changed, because when making a tail
> call we overwrite the caller's argument slots with the arguments we want to
> pass to the callee.  Not really sure why it seemed to work with other
> compilers; trying https://fortran.godbolt.org/z/ckMt1t
> subroutine foo (a, b, c, d, e, f, g, h)
>   integer :: b, c, d, e, f, g, h
>   character(len=1) :: a
>   call bar (a, b, c, d, e, f, g, h)
> end subroutine foo
> both older/new gfortran and ifort 19.0.0 emit a tail call which overwrites
> the hidden string length argument with 1, but when trying the
> https://fortran.godbolt.org/z/xpsH8e LAPACK routine, ifort for some reason
> doesn't tail call it.
> 
> I'm not really happy to provide workarounds for undefined behavior,
> especially because that will mean it might take longer if ever if those
> buggy programs are fixed.  On the other side, the PR87689 bug fix has been
> backported to all release branches and so now not only trunk, but also 9.1,
> 8.3.1 and 7.4.1 are affected.  Instead of trying to disable all tail calls,
> this patch disables tail calls from functions/subroutines that have those
> hidden string length arguments and don't use character(len=*) (in that case
> the function wouldn't seem to work previously either, because the argument
> is really used), where those hidden string length arguments are passed
> on the stack and where the tail callee also would want to pass arguments
> on the stack (if we spent even more time on this, we could narrow it down
> further and check if the tail call would actually store anything overlapping
> the hidden string length arguments on the stack).
> 
> This workaround probably needs guarding with some Fortran FE specific
> option, so that it can be disabled, will defer that to the Fortran
> maintainers.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
> release branches (not sure about LTO on the release branches, does one need
> to bump anything when changing the LTO format by streaming another bit)?

You need to bump the LTO_minor_version in lto-streamer.h

Which reminds me I forgot to update LTO_major_version on trunk - can
you do that?

I'm also worried that with this we just never fix those sources
but I agree with fixing the issue on now broken branches.

So, OK for trunk and branches (with the approrpiate lto-streamer.h
changes).  We should eventually revert the change for GCC 10
though.

Thanks,
Richard.

> 2019-05-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR fortran/90329
> 	* tree-core.h (struct tree_decl_common): Document
> 	decl_nonshareable_flag for PARM_DECLs.
> 	* tree.h (DECL_HIDDEN_STRING_LENGTH): Define.
> 	* calls.c (expand_call): Don't try tail call if caller
> 	has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be
> 	passed on the stack and callee needs to pass any arguments on the
> 	stack.
> 	* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
> 	else if instead of series of mutually exclusive ifs.  Handle
> 	DECL_HIDDEN_STRING_LENGTH for PARM_DECLs.
> 	* tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise.
> 
> 	* trans-decl.c (create_function_arglist): Set
> 	DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if
> 	len is constant.
> 
> --- gcc/tree-core.h.jj	2019-02-22 15:22:20.882919620 +0100
> +++ gcc/tree-core.h	2019-05-15 08:00:39.284668438 +0200
> @@ -1683,6 +1683,7 @@ struct GTY(()) tree_decl_common {
>    /* In a VAR_DECL and PARM_DECL, this is DECL_READ_P.  */
>    unsigned decl_read_flag : 1;
>    /* In a VAR_DECL or RESULT_DECL, this is DECL_NONSHAREABLE.  */
> +  /* In a PARM_DECL, this is DECL_HIDDEN_STRING_LENGTH.  */
>    unsigned decl_nonshareable_flag : 1;
>  
>    /* DECL_OFFSET_ALIGN, used only for FIELD_DECLs.  */
> --- gcc/tree.h.jj	2019-05-02 12:18:33.829078755 +0200
> +++ gcc/tree.h	2019-05-15 08:06:11.559171046 +0200
> @@ -904,6 +904,11 @@ extern void omp_clause_range_check_faile
>    (TREE_CHECK2 (NODE, VAR_DECL, \
>  		RESULT_DECL)->decl_common.decl_nonshareable_flag)
>  
> +/* In a PARM_DECL, set for Fortran hidden string length arguments that some
> +   buggy callers don't pass to the callee.  */
> +#define DECL_HIDDEN_STRING_LENGTH(NODE) \
> +  (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
> +
>  /* In a CALL_EXPR, means that the call is the jump from a thunk to the
>     thunked-to function.  */
>  #define CALL_FROM_THUNK_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
> --- gcc/calls.c.jj	2019-04-08 10:11:30.852182466 +0200
> +++ gcc/calls.c	2019-05-15 09:03:56.863754839 +0200
> @@ -3628,6 +3628,28 @@ expand_call (tree exp, rtx target, int i
>        || dbg_cnt (tail_call) == false)
>      try_tail_call = 0;
>  
> +  /* Workaround buggy C/C++ wrappers around Fortran routines with
> +     character(len=constant) arguments if the hidden string length arguments
> +     are passed on the stack; if the callers forget to pass those arguments,
> +     attempting to tail call in such routines leads to stack corruption.
> +     Avoid tail calls in functions where at least one such hidden string
> +     length argument is passed (partially or fully) on the stack in the
> +     caller and the callee needs to pass any arguments on the stack.
> +     See PR90329.  */
> +  if (try_tail_call && maybe_ne (args_size.constant, 0))
> +    for (tree arg = DECL_ARGUMENTS (current_function_decl);
> +	 arg; arg = DECL_CHAIN (arg))
> +      if (DECL_HIDDEN_STRING_LENGTH (arg) && DECL_INCOMING_RTL (arg))
> +	{
> +	  subrtx_iterator::array_type array;
> +	  FOR_EACH_SUBRTX (iter, array, DECL_INCOMING_RTL (arg), NONCONST)
> +	    if (MEM_P (*iter))
> +	      {
> +		try_tail_call = 0;
> +		break;
> +	      }
> +	}
> +
>    /* If the user has marked the function as requiring tail-call
>       optimization, attempt it.  */
>    if (must_tail_call)
> --- gcc/tree-streamer-in.c.jj	2019-01-01 12:37:21.184908879 +0100
> +++ gcc/tree-streamer-in.c	2019-05-15 08:58:02.123629519 +0200
> @@ -251,7 +251,7 @@ unpack_ts_decl_common_value_fields (stru
>        LABEL_DECL_UID (expr) = -1;
>      }
>  
> -  if (TREE_CODE (expr) == FIELD_DECL)
> +  else if (TREE_CODE (expr) == FIELD_DECL)
>      {
>        DECL_PACKED (expr) = (unsigned) bp_unpack_value (bp, 1);
>        DECL_NONADDRESSABLE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
> @@ -259,12 +259,15 @@ unpack_ts_decl_common_value_fields (stru
>        expr->decl_common.off_align = bp_unpack_value (bp, 8);
>      }
>  
> -  if (VAR_P (expr))
> +  else if (VAR_P (expr))
>      {
>        DECL_HAS_DEBUG_EXPR_P (expr) = (unsigned) bp_unpack_value (bp, 1);
>        DECL_NONLOCAL_FRAME (expr) = (unsigned) bp_unpack_value (bp, 1);
>      }
>  
> +  else if (TREE_CODE (expr) == PARM_DECL)
> +    DECL_HIDDEN_STRING_LENGTH (expr) = (unsigned) bp_unpack_value (bp, 1);
> +
>    if (TREE_CODE (expr) == RESULT_DECL
>        || TREE_CODE (expr) == PARM_DECL
>        || VAR_P (expr))
> --- gcc/tree-streamer-out.c.jj	2019-01-24 19:54:20.792500923 +0100
> +++ gcc/tree-streamer-out.c	2019-05-15 09:01:23.957287106 +0200
> @@ -212,7 +212,7 @@ pack_ts_decl_common_value_fields (struct
>        bp_pack_var_len_unsigned (bp, EH_LANDING_PAD_NR (expr));
>      }
>  
> -  if (TREE_CODE (expr) == FIELD_DECL)
> +  else if (TREE_CODE (expr) == FIELD_DECL)
>      {
>        bp_pack_value (bp, DECL_PACKED (expr), 1);
>        bp_pack_value (bp, DECL_NONADDRESSABLE_P (expr), 1);
> @@ -220,12 +220,15 @@ pack_ts_decl_common_value_fields (struct
>        bp_pack_value (bp, expr->decl_common.off_align, 8);
>      }
>  
> -  if (VAR_P (expr))
> +  else if (VAR_P (expr))
>      {
>        bp_pack_value (bp, DECL_HAS_DEBUG_EXPR_P (expr), 1);
>        bp_pack_value (bp, DECL_NONLOCAL_FRAME (expr), 1);
>      }
>  
> +  else if (TREE_CODE (expr) == PARM_DECL)
> +    bp_pack_value (bp, DECL_HIDDEN_STRING_LENGTH (expr), 1);
> +
>    if (TREE_CODE (expr) == RESULT_DECL
>        || TREE_CODE (expr) == PARM_DECL
>        || VAR_P (expr))
> --- gcc/fortran/trans-decl.c.jj	2019-05-15 08:18:16.000000000 +0200
> +++ gcc/fortran/trans-decl.c	2019-05-15 08:31:07.260388229 +0200
> @@ -2512,6 +2512,10 @@ create_function_arglist (gfc_symbol * sy
>  	  DECL_ARG_TYPE (length) = len_type;
>  	  TREE_READONLY (length) = 1;
>  	  gfc_finish_decl (length);
> +	  if (f->sym->ts.u.cl
> +	      && f->sym->ts.u.cl->length
> +	      && f->sym->ts.u.cl->length->expr_type == EXPR_CONSTANT)
> +	    DECL_HIDDEN_STRING_LENGTH (length) = 1;
>  
>  	  /* Remember the passed value.  */
>            if (!f->sym->ts.u.cl ||  f->sym->ts.u.cl->passed_length)
> 
> 	Jakub
>
Jakub Jelinek May 16, 2019, 9:36 a.m. UTC | #2
On Thu, May 16, 2019 at 11:22:08AM +0200, Richard Biener wrote:
> Which reminds me I forgot to update LTO_major_version on trunk - can
> you do that?

Done with:

2019-05-16  Jakub Jelinek  <jakub@redhat.com>

	* lto-streamer.h (LTO_major_version): Bump to 9.

--- gcc/lto-streamer.h	(revision 271283)
+++ gcc/lto-streamer.h	(working copy)
@@ -120,7 +120,7 @@ along with GCC; see the file COPYING3.
      String are represented in the table as pairs, a length in ULEB128
      form followed by the data for the string.  */
 
-#define LTO_major_version 8
+#define LTO_major_version 9
 #define LTO_minor_version 0
 
 typedef unsigned char	lto_decl_flags_t;


	Jakub
Jeff Law May 17, 2019, 3:33 p.m. UTC | #3
On 5/16/19 3:36 AM, Jakub Jelinek wrote:
> On Thu, May 16, 2019 at 11:22:08AM +0200, Richard Biener wrote:
>> Which reminds me I forgot to update LTO_major_version on trunk - can
>> you do that?
> 
> Done with:
> 
> 2019-05-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* lto-streamer.h (LTO_major_version): Bump to 9.
OK
jeff
Jeff Law May 17, 2019, 5:48 p.m. UTC | #4
On 5/16/19 2:09 AM, Jakub Jelinek wrote:
> Hi!
> 
> Fortran subroutines/functions that have CHARACTER arguments have also
> hidden arguments at the end of the argument list which hold the string
> length.  This is something all Fortran compilers I've tried do and is
> done in order to support calling unprototyped subroutines/functions
> where one can't know if the callee is using character(len=constant)
> or character(len=*) argument, where for the latter one has to pass
> the extra argument because there is no other way to propagate that info,
> while for the former it is kind of redundant but still part of the ABI;
> the compiler just uses the constant directly instead of asking about the
> real passed string length.
> 
> Another thing is that until PR87689 has been fixed, the Fortran FE has been
> broken, used vararg-ish prototypes in most cases and that prevented (all?)
> tail call optimizations in the Fortran code.
> 
> Apparently it is a common case that buggy C/C++ wrappers around the Fortran
> functions just ignore the ABI and don't pass the hidden string length
> arguments at all, which seemed to work fine in gfortran until recently
> (because the arguments weren't used).  When we started making tail calls
> from such functions, this has of course changed, because when making a tail
> call we overwrite the caller's argument slots with the arguments we want to
> pass to the callee.  Not really sure why it seemed to work with other
> compilers; trying https://fortran.godbolt.org/z/ckMt1t
> subroutine foo (a, b, c, d, e, f, g, h)
>   integer :: b, c, d, e, f, g, h
>   character(len=1) :: a
>   call bar (a, b, c, d, e, f, g, h)
> end subroutine foo
> both older/new gfortran and ifort 19.0.0 emit a tail call which overwrites
> the hidden string length argument with 1, but when trying the
> https://fortran.godbolt.org/z/xpsH8e LAPACK routine, ifort for some reason
> doesn't tail call it.
> 
> I'm not really happy to provide workarounds for undefined behavior,
> especially because that will mean it might take longer if ever if those
> buggy programs are fixed.  On the other side, the PR87689 bug fix has been
> backported to all release branches and so now not only trunk, but also 9.1,
> 8.3.1 and 7.4.1 are affected.  Instead of trying to disable all tail calls,
> this patch disables tail calls from functions/subroutines that have those
> hidden string length arguments and don't use character(len=*) (in that case
> the function wouldn't seem to work previously either, because the argument
> is really used), where those hidden string length arguments are passed
> on the stack and where the tail callee also would want to pass arguments
> on the stack (if we spent even more time on this, we could narrow it down
> further and check if the tail call would actually store anything overlapping
> the hidden string length arguments on the stack).
> 
> This workaround probably needs guarding with some Fortran FE specific
> option, so that it can be disabled, will defer that to the Fortran
> maintainers.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
> release branches (not sure about LTO on the release branches, does one need
> to bump anything when changing the LTO format by streaming another bit)?
> 
> 2019-05-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR fortran/90329
> 	* tree-core.h (struct tree_decl_common): Document
> 	decl_nonshareable_flag for PARM_DECLs.
> 	* tree.h (DECL_HIDDEN_STRING_LENGTH): Define.
> 	* calls.c (expand_call): Don't try tail call if caller
> 	has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be
> 	passed on the stack and callee needs to pass any arguments on the
> 	stack.
> 	* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
> 	else if instead of series of mutually exclusive ifs.  Handle
> 	DECL_HIDDEN_STRING_LENGTH for PARM_DECLs.
> 	* tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise.
> 
> 	* trans-decl.c (create_function_arglist): Set
> 	DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if
> 	len is constant.
My first, second and third thought has been we shouldn't be catering to
code that is so clearly broken.  Maybe we could do this on the release
branches which would in turn give folks roughly a year to fix up this mess?

jeff
Jerry DeLisle May 17, 2019, 6:57 p.m. UTC | #5
On 5/17/19 10:48 AM, Jeff Law wrote:
> On 5/16/19 2:09 AM, Jakub Jelinek wrote:
>> Hi!
>>
>> Fortran subroutines/functions that have CHARACTER arguments have also
>> hidden arguments at the end of the argument list which hold the string
>> length.  This is something all Fortran compilers I've tried do and is
>> done in order to support calling unprototyped subroutines/functions
>> where one can't know if the callee is using character(len=constant)
>> or character(len=*) argument, where for the latter one has to pass
>> the extra argument because there is no other way to propagate that info,
>> while for the former it is kind of redundant but still part of the ABI;
>> the compiler just uses the constant directly instead of asking about the
>> real passed string length.
>>
>> Another thing is that until PR87689 has been fixed, the Fortran FE has been
>> broken, used vararg-ish prototypes in most cases and that prevented (all?)
>> tail call optimizations in the Fortran code.
>>
>> Apparently it is a common case that buggy C/C++ wrappers around the Fortran
>> functions just ignore the ABI and don't pass the hidden string length
>> arguments at all, which seemed to work fine in gfortran until recently
>> (because the arguments weren't used).  When we started making tail calls
>> from such functions, this has of course changed, because when making a tail
>> call we overwrite the caller's argument slots with the arguments we want to
>> pass to the callee.  Not really sure why it seemed to work with other
>> compilers; trying https://fortran.godbolt.org/z/ckMt1t
>> subroutine foo (a, b, c, d, e, f, g, h)
>>    integer :: b, c, d, e, f, g, h
>>    character(len=1) :: a
>>    call bar (a, b, c, d, e, f, g, h)
>> end subroutine foo
>> both older/new gfortran and ifort 19.0.0 emit a tail call which overwrites
>> the hidden string length argument with 1, but when trying the
>> https://fortran.godbolt.org/z/xpsH8e LAPACK routine, ifort for some reason
>> doesn't tail call it.
>>
>> I'm not really happy to provide workarounds for undefined behavior,
>> especially because that will mean it might take longer if ever if those
>> buggy programs are fixed.  On the other side, the PR87689 bug fix has been
>> backported to all release branches and so now not only trunk, but also 9.1,
>> 8.3.1 and 7.4.1 are affected.  Instead of trying to disable all tail calls,
>> this patch disables tail calls from functions/subroutines that have those
>> hidden string length arguments and don't use character(len=*) (in that case
>> the function wouldn't seem to work previously either, because the argument
>> is really used), where those hidden string length arguments are passed
>> on the stack and where the tail callee also would want to pass arguments
>> on the stack (if we spent even more time on this, we could narrow it down
>> further and check if the tail call would actually store anything overlapping
>> the hidden string length arguments on the stack).
>>
>> This workaround probably needs guarding with some Fortran FE specific
>> option, so that it can be disabled, will defer that to the Fortran
>> maintainers.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
>> release branches (not sure about LTO on the release branches, does one need
>> to bump anything when changing the LTO format by streaming another bit)?
>>
>> 2019-05-16  Jakub Jelinek  <jakub@redhat.com>
>>
>> 	PR fortran/90329
>> 	* tree-core.h (struct tree_decl_common): Document
>> 	decl_nonshareable_flag for PARM_DECLs.
>> 	* tree.h (DECL_HIDDEN_STRING_LENGTH): Define.
>> 	* calls.c (expand_call): Don't try tail call if caller
>> 	has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be
>> 	passed on the stack and callee needs to pass any arguments on the
>> 	stack.
>> 	* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
>> 	else if instead of series of mutually exclusive ifs.  Handle
>> 	DECL_HIDDEN_STRING_LENGTH for PARM_DECLs.
>> 	* tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise.
>>
>> 	* trans-decl.c (create_function_arglist): Set
>> 	DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if
>> 	len is constant.
> My first, second and third thought has been we shouldn't be catering to
> code that is so clearly broken.  Maybe we could do this on the release
> branches which would in turn give folks roughly a year to fix up this mess?
> 
> jeff
> 

Not that I have much say, but I have been following this thread and the others 
about broken wrappers and screwed up prototypes being used by R for there use of 
LAPACK.  I have been holding off saying anything.

I don't thing we should be doing anything in gfortran at all. I think the R 
people need to fix their calls. People get caught by not following the proper 
conventions and then want the compiler to fix it. Its not the compiler writers 
job to fix users bad code. The Fortran conventions of having the string lengths 
appended to the end has been known for many many years and well documented 
everywhere.

Sorry for ranting and I understand everyone is just trying to do the right 
thing.  It would have been about an editorial fix on the R side.

Maybe Jeff as a good compromise here in that at least we would not have to carry 
it forward in time.

Jerry
Janne Blomqvist May 17, 2019, 7:33 p.m. UTC | #6
On Fri, May 17, 2019 at 9:57 PM Jerry DeLisle <jvdelisle@charter.net> wrote:
>
> On 5/17/19 10:48 AM, Jeff Law wrote:
> > My first, second and third thought has been we shouldn't be catering to
> > code that is so clearly broken.  Maybe we could do this on the release
> > branches which would in turn give folks roughly a year to fix up this mess?
> >
> > jeff
> >
>
> Not that I have much say, but I have been following this thread and the others
> about broken wrappers and screwed up prototypes being used by R for there use of
> LAPACK.  I have been holding off saying anything.
>
> I don't thing we should be doing anything in gfortran at all. I think the R
> people need to fix their calls. People get caught by not following the proper
> conventions and then want the compiler to fix it. Its not the compiler writers
> job to fix users bad code. The Fortran conventions of having the string lengths
> appended to the end has been known for many many years and well documented
> everywhere.
>
> Sorry for ranting and I understand everyone is just trying to do the right
> thing.  It would have been about an editorial fix on the R side.
>
> Maybe Jeff as a good compromise here in that at least we would not have to carry
> it forward in time.

I don't think it's this simple, unfortunately. If it would be only R,
then yes, we could help the R people fix their code and then it would
all be done. But seems this is everywhere. It's in CBLAS & LAPACKE
(the official BLAS and LAPACK C interfaces), it's in numpy (probably
scipy also), R, arma. And BLAS/LAPACK are pretty central, and probably
the single biggest reason why non-Fortran programmers use Fortran
code. And while the issue has been known, it seems to have been
happily ignored for the past 30 years.

And yes, while I think one year might be a quite optimistic timeframe
to get this fixed, I agree we shouldn't keep the workaround
indefinitely either. I think the best way would be if CBLAS & LAPACKE
would be fixed, and then we could tell people to use those rather than
their own in-house broken interfaces.
Jakub Jelinek May 17, 2019, 7:37 p.m. UTC | #7
On Fri, May 17, 2019 at 10:33:30PM +0300, Janne Blomqvist wrote:
> I don't think it's this simple, unfortunately. If it would be only R,
> then yes, we could help the R people fix their code and then it would
> all be done. But seems this is everywhere. It's in CBLAS & LAPACKE
> (the official BLAS and LAPACK C interfaces), it's in numpy (probably
> scipy also), R, arma. And BLAS/LAPACK are pretty central, and probably
> the single biggest reason why non-Fortran programmers use Fortran
> code. And while the issue has been known, it seems to have been
> happily ignored for the past 30 years.
> 
> And yes, while I think one year might be a quite optimistic timeframe
> to get this fixed, I agree we shouldn't keep the workaround
> indefinitely either. I think the best way would be if CBLAS & LAPACKE
> would be fixed, and then we could tell people to use those rather than
> their own in-house broken interfaces.

Could we easily detect at resolve time whether a subroutine/function
makes any implicitly prototyped calls and limit this workaround to those
cases (i.e. only old-style Fortran)?  I've mentioned it in the PR, but not
sure how exactly to check that.
The thing is, if all calls are explicitly prototyped, then we've been using
tail calls before as well and so the workaround would just pessimize
something we've been optimizing fine before.

	Jakub
Thomas Koenig May 18, 2019, 11:53 a.m. UTC | #8
Hi Jakub,

> Could we easily detect at resolve time whether a subroutine/function
> makes any implicitly prototyped calls and limit this workaround to those
> cases (i.e. only old-style Fortran)?  I've mentioned it in the PR, but not
> sure how exactly to check that.

I think we could to that.  We could look at all the external symbols and
check with if (sym->attr.if_src == IFSRC_UNKNOWN) .

Let me think about this some more...

Regards

	Thomas
Jerry DeLisle May 18, 2019, 10:08 p.m. UTC | #9
On 5/17/19 12:33 PM, Janne Blomqvist wrote:

--- snip ---
> And yes, while I think one year might be a quite optimistic timeframe
> to get this fixed, I agree we shouldn't keep the workaround
> indefinitely either. I think the best way would be if CBLAS & LAPACKE
> would be fixed, and then we could tell people to use those rather than
> their own in-house broken interfaces.
> 

Thankyou for clarifying, so the real bug here is in CBLAS and LAPACK? Gads!

Cheers,

Jerry
diff mbox series

Patch

--- gcc/tree-core.h.jj	2019-02-22 15:22:20.882919620 +0100
+++ gcc/tree-core.h	2019-05-15 08:00:39.284668438 +0200
@@ -1683,6 +1683,7 @@  struct GTY(()) tree_decl_common {
   /* In a VAR_DECL and PARM_DECL, this is DECL_READ_P.  */
   unsigned decl_read_flag : 1;
   /* In a VAR_DECL or RESULT_DECL, this is DECL_NONSHAREABLE.  */
+  /* In a PARM_DECL, this is DECL_HIDDEN_STRING_LENGTH.  */
   unsigned decl_nonshareable_flag : 1;
 
   /* DECL_OFFSET_ALIGN, used only for FIELD_DECLs.  */
--- gcc/tree.h.jj	2019-05-02 12:18:33.829078755 +0200
+++ gcc/tree.h	2019-05-15 08:06:11.559171046 +0200
@@ -904,6 +904,11 @@  extern void omp_clause_range_check_faile
   (TREE_CHECK2 (NODE, VAR_DECL, \
 		RESULT_DECL)->decl_common.decl_nonshareable_flag)
 
+/* In a PARM_DECL, set for Fortran hidden string length arguments that some
+   buggy callers don't pass to the callee.  */
+#define DECL_HIDDEN_STRING_LENGTH(NODE) \
+  (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
+
 /* In a CALL_EXPR, means that the call is the jump from a thunk to the
    thunked-to function.  */
 #define CALL_FROM_THUNK_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
--- gcc/calls.c.jj	2019-04-08 10:11:30.852182466 +0200
+++ gcc/calls.c	2019-05-15 09:03:56.863754839 +0200
@@ -3628,6 +3628,28 @@  expand_call (tree exp, rtx target, int i
       || dbg_cnt (tail_call) == false)
     try_tail_call = 0;
 
+  /* Workaround buggy C/C++ wrappers around Fortran routines with
+     character(len=constant) arguments if the hidden string length arguments
+     are passed on the stack; if the callers forget to pass those arguments,
+     attempting to tail call in such routines leads to stack corruption.
+     Avoid tail calls in functions where at least one such hidden string
+     length argument is passed (partially or fully) on the stack in the
+     caller and the callee needs to pass any arguments on the stack.
+     See PR90329.  */
+  if (try_tail_call && maybe_ne (args_size.constant, 0))
+    for (tree arg = DECL_ARGUMENTS (current_function_decl);
+	 arg; arg = DECL_CHAIN (arg))
+      if (DECL_HIDDEN_STRING_LENGTH (arg) && DECL_INCOMING_RTL (arg))
+	{
+	  subrtx_iterator::array_type array;
+	  FOR_EACH_SUBRTX (iter, array, DECL_INCOMING_RTL (arg), NONCONST)
+	    if (MEM_P (*iter))
+	      {
+		try_tail_call = 0;
+		break;
+	      }
+	}
+
   /* If the user has marked the function as requiring tail-call
      optimization, attempt it.  */
   if (must_tail_call)
--- gcc/tree-streamer-in.c.jj	2019-01-01 12:37:21.184908879 +0100
+++ gcc/tree-streamer-in.c	2019-05-15 08:58:02.123629519 +0200
@@ -251,7 +251,7 @@  unpack_ts_decl_common_value_fields (stru
       LABEL_DECL_UID (expr) = -1;
     }
 
-  if (TREE_CODE (expr) == FIELD_DECL)
+  else if (TREE_CODE (expr) == FIELD_DECL)
     {
       DECL_PACKED (expr) = (unsigned) bp_unpack_value (bp, 1);
       DECL_NONADDRESSABLE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
@@ -259,12 +259,15 @@  unpack_ts_decl_common_value_fields (stru
       expr->decl_common.off_align = bp_unpack_value (bp, 8);
     }
 
-  if (VAR_P (expr))
+  else if (VAR_P (expr))
     {
       DECL_HAS_DEBUG_EXPR_P (expr) = (unsigned) bp_unpack_value (bp, 1);
       DECL_NONLOCAL_FRAME (expr) = (unsigned) bp_unpack_value (bp, 1);
     }
 
+  else if (TREE_CODE (expr) == PARM_DECL)
+    DECL_HIDDEN_STRING_LENGTH (expr) = (unsigned) bp_unpack_value (bp, 1);
+
   if (TREE_CODE (expr) == RESULT_DECL
       || TREE_CODE (expr) == PARM_DECL
       || VAR_P (expr))
--- gcc/tree-streamer-out.c.jj	2019-01-24 19:54:20.792500923 +0100
+++ gcc/tree-streamer-out.c	2019-05-15 09:01:23.957287106 +0200
@@ -212,7 +212,7 @@  pack_ts_decl_common_value_fields (struct
       bp_pack_var_len_unsigned (bp, EH_LANDING_PAD_NR (expr));
     }
 
-  if (TREE_CODE (expr) == FIELD_DECL)
+  else if (TREE_CODE (expr) == FIELD_DECL)
     {
       bp_pack_value (bp, DECL_PACKED (expr), 1);
       bp_pack_value (bp, DECL_NONADDRESSABLE_P (expr), 1);
@@ -220,12 +220,15 @@  pack_ts_decl_common_value_fields (struct
       bp_pack_value (bp, expr->decl_common.off_align, 8);
     }
 
-  if (VAR_P (expr))
+  else if (VAR_P (expr))
     {
       bp_pack_value (bp, DECL_HAS_DEBUG_EXPR_P (expr), 1);
       bp_pack_value (bp, DECL_NONLOCAL_FRAME (expr), 1);
     }
 
+  else if (TREE_CODE (expr) == PARM_DECL)
+    bp_pack_value (bp, DECL_HIDDEN_STRING_LENGTH (expr), 1);
+
   if (TREE_CODE (expr) == RESULT_DECL
       || TREE_CODE (expr) == PARM_DECL
       || VAR_P (expr))
--- gcc/fortran/trans-decl.c.jj	2019-05-15 08:18:16.000000000 +0200
+++ gcc/fortran/trans-decl.c	2019-05-15 08:31:07.260388229 +0200
@@ -2512,6 +2512,10 @@  create_function_arglist (gfc_symbol * sy
 	  DECL_ARG_TYPE (length) = len_type;
 	  TREE_READONLY (length) = 1;
 	  gfc_finish_decl (length);
+	  if (f->sym->ts.u.cl
+	      && f->sym->ts.u.cl->length
+	      && f->sym->ts.u.cl->length->expr_type == EXPR_CONSTANT)
+	    DECL_HIDDEN_STRING_LENGTH (length) = 1;
 
 	  /* Remember the passed value.  */
           if (!f->sym->ts.u.cl ||  f->sym->ts.u.cl->passed_length)