diff mbox series

[fortran] Fix PR 85781, ICE on valid

Message ID e6b2e37f-3999-46fd-d888-a5a7c213fb4f@tkoenig.net
State New
Headers show
Series [fortran] Fix PR 85781, ICE on valid | expand

Commit Message

Thomas König Jan. 19, 2020, 6:21 p.m. UTC
Hello world,

the attached patch fixes an ICE which could occur for empty
substrings (see test case).

In the spirit of "A patch that works beats an elegant idea every
time" I simply set a substring known to be empty to (1:0) so
that problems further down the road could be avoided.

Regression-tested. OK for trunk?

Regards

	Thomas

2020-01-19  Thomas König  <tkoenig@gcc.gnu.org>

	PR fortran/85781
	* resolve.c (resolve_substring): If the substring is empty, set it
	to (1:0) explicitly.


2020-01-19  Thomas König  <tkoenig@gcc.gnu.org>

	PR fortran/85781
	* gfortran.dg/substr_9.f90: New test.

Comments

Tobias Burnus Jan. 20, 2020, 12:31 p.m. UTC | #1
Hi Thomas,

On 1/19/20 7:21 PM, Thomas König wrote:
> the attached patch fixes an ICE which could occur for empty
> substrings (see test case).

I think one should rather fix the following issue. – While on x86-64 it 
does not seem to cause problems, it might for other platforms. 
Additionally, it is a missed optimization to use "A123" instead of "66" 
in the call.

Namely, gfortran generates a different code when using kind=c_type 
instead of kind=1 – although the kind value is the same! (Without 
BIND(C) or without VALUE, the generated code is the same.) The dump is:

two (character(kind=1)[1:1] x) vs.  three (character(kind=1) x)

Expected is the latter. See attached test case.

Tobias
Tobias Burnus Jan. 20, 2020, 4:07 p.m. UTC | #2
On 1/20/20 1:31 PM, Tobias Burnus wrote:
> I think one should rather fix the following issue.

That's now https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93336

Tobias
Thomas König Jan. 21, 2020, 6:32 p.m. UTC | #3
Hi Tobias,

>> the attached patch fixes an ICE which could occur for empty
>> substrings (see test case).
> 
> I think one should rather fix the following issue.

I am not sure what you mean.  Does that mean that fixing the following
issue will also fix PR 85781, or that the PR should not be fixed?

Regards

	Thomas
Tobias Burnus Jan. 22, 2020, 10:59 a.m. UTC | #4
Hi Thomas, hi all,

first, I have now attached a different fix for PR 85781 (= original 
bug). Can you have a look?

I have the feeling (but didn't check) that your patch does not handle 
the following variant of the test case: "print *, x(m:n)" (i.e. the 
lower bound is not known at compile time).

I hope my patch covers all issues. – OK for the trunk?

Secondly:

On 1/21/20 7:32 PM, Thomas König wrote:
>>> the attached patch fixes an ICE which could occur for empty
>>> substrings (see test case).
>> I think one should rather fix the following issue.
> I am not sure what you mean.  Does that mean that fixing the following
> issue will also fix PR 85781

I am no longer sure what I meant myself ;-)

I initially thought those are directly related – but they now look 
related but independent bugs:

PR 85781 is about getting a non-ARRAY_TYPE (tree dump: "character") and 
using it as ARRAY_TYPE (tree dump: "character[lb:ub]").

While PR93336 is about (1) using  an ARRAY_TYPE when one should not. – 
And, additionally, about missing diagnostic related to (2) bind(c) and 
kind=4, (3) passing zero-length strings to non-zero-length dummy args, 
(4) diagnostic about truncating too long strings (esp. if of 
non-default, non-c_char kind).

Tobias
Steve Kargl Jan. 22, 2020, 3:02 p.m. UTC | #5
On Wed, Jan 22, 2020 at 11:59:12AM +0100, Tobias Burnus wrote:
>
> And, additionally, about missing diagnostic related to (2) bind(c) and 
> kind=4,
>

Are you sure there is a missing diagnostic?  You need to 
add -Wc-binding-type or -Wall to your command line.

subroutine p(c) bind(c)
   use iso_c_binding
   character(kind=4) c
   print *, c
end

% gfcx -Wall -c a.f90
a.f90:1:14:

    1 | subroutine p(c) bind(c)
      |              1
Warning: Variable 'c' at (1) is a dummy argument of the BIND(C) procedure
'p' but may not be C interoperable [-Wc-binding-type]
Tobias Burnus Jan. 22, 2020, 3:43 p.m. UTC | #6
Hi Steve,

On 1/22/20 4:02 PM, Steve Kargl wrote:
> On Wed, Jan 22, 2020 at 11:59:12AM +0100, Tobias Burnus wrote:
>> And, additionally, about missing diagnostic related to (2) bind(c) and
>> kind=4,
> Are you sure there is a missing diagnostic?  You need to
> add -Wc-binding-type or -Wall to your command line.

The problem with the ISO C warnings for character is that they are not 
based on the KIND value but whether the variable comes from the 
ISO_C_BINDING module or not.

Namely, for kind=1 and kind=4 you get the warning you have shown while 
for kind=c_char and kind=c_int32_t you don't. (If you use "integer, 
parameter :: mykind = c_char", the result is the same as for c_char as 
this property is inherited.)

Numerically, 1 and c_char are the same – as are 4 and c_int32_t. I think 
one *can* warn for using kind=1 as this is bad style (= assumption that 
c_char is the same as "1" and the same as the default-kind character).

But "kind=4" is as bad as "kind=c_int32_t"! — And as noted in the PR, 
using c_char and c_int32_t (!) will give a "character(kind=1)" (note the 
kind=1!) dummy argument while kind=1 and kind=4 will give 
"character(kind={1 or 4})[1:1]" (= ARRAY_TYPE).

That's complete nonsense and in some cases it does not even depend on 
whether that's a BIND(C) procedure or not!

Cheers,

Tobias
Steve Kargl Jan. 22, 2020, 3:54 p.m. UTC | #7
On Wed, Jan 22, 2020 at 04:43:49PM +0100, Tobias Burnus wrote:
> Hi Steve,
> 
> On 1/22/20 4:02 PM, Steve Kargl wrote:
> > On Wed, Jan 22, 2020 at 11:59:12AM +0100, Tobias Burnus wrote:
> >> And, additionally, about missing diagnostic related to (2) bind(c) and
> >> kind=4,
> > Are you sure there is a missing diagnostic?  You need to
> > add -Wc-binding-type or -Wall to your command line.
> 
> The problem with the ISO C warnings for character is that they are not 
> based on the KIND value but whether the variable comes from the 
> ISO_C_BINDING module or not.
> 
> Namely, for kind=1 and kind=4 you get the warning you have shown while 
> for kind=c_char and kind=c_int32_t you don't. (If you use "integer, 
> parameter :: mykind = c_char", the result is the same as for c_char as 
> this property is inherited.)
> 
> Numerically, 1 and c_char are the same – as are 4 and c_int32_t. I think 
> one *can* warn for using kind=1 as this is bad style (= assumption that 
> c_char is the same as "1" and the same as the default-kind character).
> 
> But "kind=4" is as bad as "kind=c_int32_t"! — And as noted in the PR, 
> using c_char and c_int32_t (!) will give a "character(kind=1)" (note the 
> kind=1!) dummy argument while kind=1 and kind=4 will give 
> "character(kind={1 or 4})[1:1]" (= ARRAY_TYPE).
> 
> That's complete nonsense and in some cases it does not even depend on 
> whether that's a BIND(C) procedure or not!
> 

Not that I disagree with you, but this is what happens when
UTF-8 (kind=4) is wedged on top of ISO C Binding (kind=c_char),
which is wedged on top of a Fortran 95 compiler.  Supposedly, one
can use attr.is_c_interop and attr.is_iso_c to determine if something
is interoperable (if it is consistently set) and if it is from 
the ISO C Binding module.

The solution is to use unique kind values forall types (reserve
1-9 for integer, 11-19 for real, 21-29 for character, etc).  Then
one simply needs to define a storage association mapping.
Tobias Burnus Jan. 22, 2020, 4:10 p.m. UTC | #8
Hi Steve,

On 1/22/20 4:54 PM, Steve Kargl wrote:
> Supposedly, one can use attr.is_c_interop and attr.is_iso_c to 
> determine if something is interoperable (if it is consistently set) 
> and if it is from the ISO C Binding module. 

I think this is fixable on two sides:

* For the used tree type, one just has to check kind value (the integer) 
plus whether the dummy's procedure is BIND(C). Here the problem is that 
we do not always have this information available (i.e. it needs to be 
passed on). Not difficult, but one needs to find all spots and think of 
how to pass the information.

* For the diagnostic, we can check whether the dummy argument is in 
bind(C) and do some additional diagnostic in this case (e.g. always warn 
for kind=4 with BIND(C) – or reject it unconditionally or with -std=... 
or …).

* The other issue related to kind=4: Fortran permits to pass a too long 
string as actual argument; but this is only permitted in the standard 
for kind=1 and kind=c_char strings. In gfortran, this also works fine 
with kind=4 – but one might still want to warn (with some -Wall or 
-std=f... or ...). Likewise, the trimming happening in this case might 
indicate a bug in the program: "Hello World" as actual to 
character(len=10) is most likely a bug, even if this is valid Fortran.

Cheers,

Tobias
Tobias Burnus Jan. 24, 2020, 8:53 a.m. UTC | #9
Admittedly an early PING.

On 1/22/20 11:59 AM, Tobias Burnus wrote:
> Hi Thomas, hi all,
>
> first, I have now attached a different fix for PR 85781 (= original 
> bug). Can you have a look?
>
> I have the feeling (but didn't check) that your patch does not handle 
> the following variant of the test case: "print *, x(m:n)" (i.e. the 
> lower bound is not known at compile time).

I confirmed that my suspicion was right: the "resolve_substring" patch 
(first patch in this email thread) is not sufficient as all my test 
cases (of this patch) will still ICE with it.

> I hope my patch covers all issues. – OK for the trunk?

Cheers,

Tobias

> Secondly:
>
> On 1/21/20 7:32 PM, Thomas König wrote:
>>>> the attached patch fixes an ICE which could occur for empty
>>>> substrings (see test case).
>>> I think one should rather fix the following issue.
>> I am not sure what you mean.  Does that mean that fixing the following
>> issue will also fix PR 85781
>
> I am no longer sure what I meant myself ;-)
>
> I initially thought those are directly related – but they now look 
> related but independent bugs:
>
> PR 85781 is about getting a non-ARRAY_TYPE (tree dump: "character") 
> and using it as ARRAY_TYPE (tree dump: "character[lb:ub]").
>
> While PR93336 is about (1) using  an ARRAY_TYPE when one should not. – 
> And, additionally, about missing diagnostic related to (2) bind(c) and 
> kind=4, (3) passing zero-length strings to non-zero-length dummy args, 
> (4) diagnostic about truncating too long strings (esp. if of 
> non-default, non-c_char kind).
>
> Tobias
>
Thomas König Jan. 25, 2020, 1:51 p.m. UTC | #10
Hi Tobias,

> I hope my patch covers all issues. – OK for the trunk?

Yep.

Thanks a lot for the patch!

Regards

	Thomas
diff mbox series

Patch

commit 476a7e195399d2dc76b22947dcbde59f36fe5748
Author: Thomas König <tkoenig@gcc.gnu.org>
Date:   Sun Jan 19 19:14:41 2020 +0100

    2020-01-19  Thomas König  <tkoenig@gcc.gnu.org>
    
            PR fortran/85781
            * resolve.c (resolve_substring): If the substring is empty, set it
            to (1:0) explicitly.
    
    2020-01-19  Thomas König  <tkoenig@gcc.gnu.org>
    
            PR fortran/85781
            * gfortran.dg/substr_9.f90: New test.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index e840aec62f2..5f644da9bf2 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -5092,6 +5092,19 @@  resolve_substring (gfc_ref *ref, bool *equal_length)
 	  && compare_bound (ref->u.ss.end, ref->u.ss.length->length) == CMP_EQ
 	  && compare_bound_int (ref->u.ss.start, 1) == CMP_EQ)
 	*equal_length = true;
+
+    }
+
+  /* Set empty substrings to (1:0) to avoid subsequent ICEs.  */
+  if (gfc_dep_compare_expr (ref->u.ss.start, ref->u.ss.end) == 1)
+    {
+      locus loc;
+      loc = ref->u.ss.start->where;
+      gfc_free_expr (ref->u.ss.start);
+      ref->u.ss.start = gfc_get_int_expr (gfc_index_integer_kind, &loc, 1);
+      loc = ref->u.ss.end->where;
+      gfc_free_expr (ref->u.ss.end);
+      ref->u.ss.end = gfc_get_int_expr (gfc_index_integer_kind, &loc, 0);
     }
 
   return true;
diff --git a/gcc/testsuite/gfortran.dg/substr_9.f90 b/gcc/testsuite/gfortran.dg/substr_9.f90
new file mode 100644
index 00000000000..60557d0cc53
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/substr_9.f90
@@ -0,0 +1,7 @@ 
+! { dg-do compile }
+! PR 85781 - this used to cause an ICE.
+subroutine s(x) bind(c)
+   use iso_c_binding, only: c_char
+   character(kind=c_char), value :: x
+   print *, x(2:1)
+end