diff mbox series

[fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

Message ID 1767259.yRpc3bzoFC@andrew-precision-3520
State New
Headers show
Series [fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule | expand

Commit Message

Andrew Benson Jan. 27, 2020, 8:57 p.m. UTC
I created PR93461 for this issue: The following code causes a bogus "symbol is 
already defined" error (using git commit 
73380abd6b2783215c7950a2ade5e3f4b271e2bc):

module aModuleWithAnAllowedName

  interface
     module subroutine aShortName()
     end subroutine aShortName
  end interface
     
end module aModuleWithAnAllowedName

submodule (aModuleWithAnAllowedName) 
aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName

contains

  subroutine aShortName()
    call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
    call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
  end subroutine aShortName
  
  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem

  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
  
end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName



$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data001/abenson/Galacticus/Tools_Devel_Install/bin/../
libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure --prefix=/home/abenson/Galacticus/
Tools_Devel --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200124 (experimental) (GCC) 


$ gfortran  -c symlength.F90 -o symlength.o -ffree-line-length-none -
frecursive  -pthread -Wall -fbacktrace -ffpe-trap=invalid,zero,overflow -fdump-
core -O3 -ffinite-math-only -fno-math-errno -fopenmp -g
/tmp/cc8B4Hmp.s: Assembler messages:
/tmp/cc8B4Hmp.s:20: Error: symbol 
`__amodulewithanallowedname.asubmodulewithaveryveryverylongbutentirelylegalname_MOD_asubroutinewithaverylongnamethatwillcauseaprobl' 
is already defined



The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to 
GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus function name 
(plus the additional "_"'s that get prepended), but insufficient if a submodule 
name is included. The name then gets truncated and can lead to two different 
functions having the same (truncated) symbol name.

The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which allows for 
the submodule name plus the "." added between module and submodule names.

I've attached a patch for this which includes a new test case for this PR. The 
patch regression tests cleanly.

OK to commit?

-Andrew

Comments

Tobias Burnus Jan. 28, 2020, 8:21 a.m. UTC | #1
Hi Andrew,

On 1/27/20 9:57 PM, Andrew Benson wrote:
> The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to
> GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus function name
> (plus the additional "_"'s that get prepended), but insufficient if a submodule
> name is included. The name then gets truncated and can lead to two different
> functions having the same (truncated) symbol name.
>
> The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which allows for
> the submodule name plus the "." added between module and submodule names.
>
> I've attached a patch for this which includes a new test case for this PR. The
> patch regression tests cleanly.
>
> OK to commit?

Can you also update the comment before the #define? It currently has:

  /* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)

Otherweise, it LGTM.

Tobias

PS: I wonder whether there are relevant systems which will fail because they
do not handle that long symbol names...
Andrew Benson Jan. 28, 2020, 4:41 p.m. UTC | #2
Hi Tobias,

> > The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to
> > GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus
> > function name (plus the additional "_"'s that get prepended), but
> > insufficient if a submodule name is included. The name then gets
> > truncated and can lead to two different functions having the same
> > (truncated) symbol name.
> > 
> > The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which
> > allows for the submodule name plus the "." added between module and
> > submodule names.
> > 
> > I've attached a patch for this which includes a new test case for this PR.
> > The patch regression tests cleanly.
> > 
> > OK to commit?
> 
> Can you also update the comment before the #define? It currently has:
> 
>   /* Mangled symbols take the form __module__name.  */
> -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> +#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)

Thanks for the suggestion. An updated patch is attached.

> PS: I wonder whether there are relevant systems which will fail because they
> do not handle that long symbol names...

Good question. Should I hold off on committing the patch until this can be 
tested further? Or should I just go ahead and commit and deal with any such 
problems if they show up?

Also, Richard Biener raised the question (in the PR) of whether this patch 
would be an ABI change. I can see that it probably would be, but don't know 
for sure. If it would change the ABI is there anything else that I need to 
include in the patch?

Thanks,
Andrew
Tobias Burnus Jan. 28, 2020, 5:49 p.m. UTC | #3
Hi Andrew,

On 1/28/20 5:41 PM, Andrew Benson wrote:
>> Can you also update the comment before the #define? It currently has:
> Thanks for the suggestion. An updated patch is attached.
Thanks. LGTM now.
>> PS: I wonder whether there are relevant systems which will fail because they
>> do not handle that long symbol names...
> Good question. Should I hold off on committing the patch until this can be
> tested further? Or should I just go ahead and commit and deal with any such
> problems if they show up?

I think it is fine – as a user, one can always choose a shorter name if 
the system one is interested in doesn't support it. Besides, following 
the current scheme, we do not really have a choice without breaking the ABI.

> Also, Richard Biener raised the question (in the PR) of whether this patch
> would be an ABI change. I can see that it probably would be, but don't know
> for sure. If it would change the ABI is there anything else that I need to
> include in the patch?

As just mentioned on the PR, I think it is a small ABI change – before a 
very long identified got cut off now it is available in full extent. — I 
do not see any simple way to avoid this ABI change and, frankly, I also 
do not think any real-world program uses that long identifiers.

Namely, instead of using 3 times the length of 42 character 
("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3 
times the length of 63 characters 
(ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_). 
Already 42 is quite long for a module, submodule or procedure name. And 
that does work even without the patch. And as only the sum counts, if 
one part only uses 10 characters (e.g. "my_module5"), now 2*58 
characters available two others.

If at the end it really causes problems (source code not available), the 
user still can modify the module file and change the procedure name to 
the trimmed value and continue.

Thus, I do not think it is a real problem in practice. We could be nice, 
however, and add a note to the release notes (i.e. 
https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether 
it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git 
about how to change WWW files and CC Gerald as web maintainer when 
submitting wwwdocs patches.]

Thanks,

Tobias

> patch.diff
>
> diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
> index 52bc045..69171f3 100644
> --- a/gcc/fortran/trans.h
> +++ b/gcc/fortran/trans.h
> @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
>   
>   #include "predict.h"  /* For enum br_predictor and PRED_*.  */
>   
> -/* Mangled symbols take the form __module__name.  */
> -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> +/* Mangled symbols take the form __module__name or __module.submodule__name.  */
> +#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
>   
>   /* Struct for holding a block of statements.  It should be treated as an
>      opaque entity and not modified directly.  This allows us to change the
> diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90
> new file mode 100644
> index 0000000..3bef326
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr93461.f90
> @@ -0,0 +1,22 @@
> +! { dg-do compile }
> +! PR fortran/93461
> +module aModuleWithAnAllowedName
> +  interface
> +     module subroutine aShortName()
> +     end subroutine aShortName
> +  end interface
> +end module aModuleWithAnAllowedName
> +
> +submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
> +contains
> +  subroutine aShortName()
> +    call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> +    call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> +  end subroutine aShortName
> +
> +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
> +
> +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
> +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
>
> ChangeLog
>
> 2020-01-28  Andrew Benson<abensonca@gmail.com>
>
> 	PR fortran/93461
> 	* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
> 	GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
> 	plus the "." between module and submodule names.
>
> 	* gfortran.dg/pr93461.f90: New test.
Andrew Benson Jan. 28, 2020, 6:14 p.m. UTC | #4
Hi Tobias,

Thanks - committed as 

https://gcc.gnu.org/git/gitweb.cgi?
p=gcc.git;a=commit;h=ad690d79cfbb905c5546c9333c5fd089d906505b

I'll send a separate email about changes to the release notes.

-Andrew

On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
> Hi Andrew,
> 
> On 1/28/20 5:41 PM, Andrew Benson wrote:
> >> Can you also update the comment before the #define? It currently has:
> > Thanks for the suggestion. An updated patch is attached.
> 
> Thanks. LGTM now.
> 
> >> PS: I wonder whether there are relevant systems which will fail because
> >> they do not handle that long symbol names...
> > 
> > Good question. Should I hold off on committing the patch until this can be
> > tested further? Or should I just go ahead and commit and deal with any
> > such
> > problems if they show up?
> 
> I think it is fine – as a user, one can always choose a shorter name if
> the system one is interested in doesn't support it. Besides, following
> the current scheme, we do not really have a choice without breaking the ABI.
> > Also, Richard Biener raised the question (in the PR) of whether this patch
> > would be an ABI change. I can see that it probably would be, but don't
> > know
> > for sure. If it would change the ABI is there anything else that I need to
> > include in the patch?
> 
> As just mentioned on the PR, I think it is a small ABI change – before a
> very long identified got cut off now it is available in full extent. — I
> do not see any simple way to avoid this ABI change and, frankly, I also
> do not think any real-world program uses that long identifiers.
> 
> Namely, instead of using 3 times the length of 42 character
> ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3
> times the length of 63 characters
> (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_).
> Already 42 is quite long for a module, submodule or procedure name. And
> that does work even without the patch. And as only the sum counts, if
> one part only uses 10 characters (e.g. "my_module5"), now 2*58
> characters available two others.
> 
> If at the end it really causes problems (source code not available), the
> user still can modify the module file and change the procedure name to
> the trimmed value and continue.
> 
> Thus, I do not think it is a real problem in practice. We could be nice,
> however, and add a note to the release notes (i.e.
> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
> about how to change WWW files and CC Gerald as web maintainer when
> submitting wwwdocs patches.]
> 
> Thanks,
> 
> Tobias
> 
> > patch.diff
> > 
> > diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
> > index 52bc045..69171f3 100644
> > --- a/gcc/fortran/trans.h
> > +++ b/gcc/fortran/trans.h
> > @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
> > 
> >   #include "predict.h"  /* For enum br_predictor and PRED_*.  */
> > 
> > -/* Mangled symbols take the form __module__name.  */
> > -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> > +/* Mangled symbols take the form __module__name or
> > __module.submodule__name.  */ +#define GFC_MAX_MANGLED_SYMBOL_LEN 
> > (GFC_MAX_SYMBOL_LEN*3+5)
> > 
> >   /* Struct for holding a block of statements.  It should be treated as an
> >   
> >      opaque entity and not modified directly.  This allows us to change
> >      the
> > 
> > diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90
> > b/gcc/testsuite/gfortran.dg/pr93461.f90 new file mode 100644
> > index 0000000..3bef326
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/pr93461.f90
> > @@ -0,0 +1,22 @@
> > +! { dg-do compile }
> > +! PR fortran/93461
> > +module aModuleWithAnAllowedName
> > +  interface
> > +     module subroutine aShortName()
> > +     end subroutine aShortName
> > +  end interface
> > +end module aModuleWithAnAllowedName
> > +
> > +submodule (aModuleWithAnAllowedName)
> > aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName +contains
> > +  subroutine aShortName()
> > +    call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> > +    call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> > +  end subroutine aShortName
> > +
> > +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> > +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
> > +
> > +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> > +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
> > +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
> > 
> > ChangeLog
> > 
> > 2020-01-28  Andrew Benson<abensonca@gmail.com>
> > 
> > 	PR fortran/93461
> > 	* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
> > 	GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
> > 	plus the "." between module and submodule names.
> > 	
> > 	* gfortran.dg/pr93461.f90: New test.
Andrew Benson Jan. 29, 2020, 12:45 a.m. UTC | #5
Hi Tobias,

On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
> Thus, I do not think it is a real problem in practice. We could be nice,
> however, and add a note to the release notes (i.e.
> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
> about how to change WWW files and CC Gerald as web maintainer when
> submitting wwwdocs patches.]

I've attached a draft patch to update the release notes about this ABI 
breakage. I don't know if I've explained it sufficiently clearly though?

-Andrew
Gerald Pfeifer Jan. 29, 2020, 9:07 a.m. UTC | #6
On Tue, 28 Jan 2020, Andrew Benson wrote:
> I've attached a draft patch to update the release notes about this ABI 
> breakage. I don't know if I've explained it sufficiently clearly though?

I do not speak Fortran, but your description was easy to read and
understand for me. :-)

Thank you, and okay - let's just give the Fortran team a day or two
to chime in before you push the change.

Gerald
Tobias Burnus Jan. 29, 2020, 10:57 a.m. UTC | #7
LGTM – and also to Gerald. Hence, go ahead!

Thanks for the patch,

Tobias

On 1/29/20 1:45 AM, Andrew Benson wrote:
> Hi Tobias,
>
> On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
>> Thus, I do not think it is a real problem in practice. We could be nice,
>> however, and add a note to the release notes (i.e.
>> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
>> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
>> about how to change WWW files and CC Gerald as web maintainer when
>> submitting wwwdocs patches.]
> I've attached a draft patch to update the release notes about this ABI
> breakage. I don't know if I've explained it sufficiently clearly though?
>
> -Andrew
>
Andrew Benson Jan. 29, 2020, 4:37 p.m. UTC | #8
Thanks Tobias and Gerald - that change is now committed:

https://gcc.gnu.org/git/?p=gcc-wwwdocs.git;a=commit;h=b88b1edba71bbca44429c28b7518d76678ab6acb

On Wednesday, January 29, 2020 11:57:02 AM PST Tobias Burnus wrote:
> LGTM – and also to Gerald. Hence, go ahead!
> 
> Thanks for the patch,
> 
> Tobias
> 
> On 1/29/20 1:45 AM, Andrew Benson wrote:
> > Hi Tobias,
> > 
> > On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
> >> Thus, I do not think it is a real problem in practice. We could be nice,
> >> however, and add a note to the release notes (i.e.
> >> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
> >> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
> >> about how to change WWW files and CC Gerald as web maintainer when
> >> submitting wwwdocs patches.]
> > 
> > I've attached a draft patch to update the release notes about this ABI
> > breakage. I don't know if I've explained it sufficiently clearly though?
> > 
> > -Andrew
diff mbox series

Patch

diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 52bc045..5942320 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -24,7 +24,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "predict.h"  /* For enum br_predictor and PRED_*.  */
 
 /* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
 
 /* Struct for holding a block of statements.  It should be treated as an
    opaque entity and not modified directly.  This allows us to change the
diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90
new file mode 100644
index 0000000..3bef326
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93461.f90
@@ -0,0 +1,22 @@ 
+! { dg-do compile }
+! PR fortran/93461
+module aModuleWithAnAllowedName
+  interface
+     module subroutine aShortName()
+     end subroutine aShortName
+  end interface
+end module aModuleWithAnAllowedName
+
+submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
+contains
+  subroutine aShortName()
+    call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+    call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aShortName
+  
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso  
+end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName