diff mbox

Derive interface buffers from max name length

Message ID 1448974501-30981-2-git-send-email-rep.dot.nop@gmail.com
State New
Headers show

Commit Message

Bernhard Reutner-Fischer Dec. 1, 2015, 12:54 p.m. UTC
These three function used a hardcoded buffer of 100 but would be better
off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum length of a
name in any of our supported standards (63 as of f2003 ff.).

Regstrapped without regressions, ok for trunk stage3 now / next stage1?

gcc/fortran/ChangeLog

2015-11-29  Bernhard Reutner-Fischer  <aldot@gcc.gnu.org>

	* interface.c (check_sym_interfaces, check_uop_interfaces,
	gfc_check_interfaces): Base interface_name buffer off
	GFC_MAX_SYMBOL_LEN.

Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
---
 gcc/fortran/interface.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Janne Blomqvist Dec. 1, 2015, 2:52 p.m. UTC | #1
On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
<rep.dot.nop@gmail.com> wrote:
> These three function used a hardcoded buffer of 100 but would be better
> off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum length of a
> name in any of our supported standards (63 as of f2003 ff.).

Please use xasprintf() instead (and free the result, or course). One
of my backburner projects is to get rid of these static symbol
buffers, and use dynamic buffers (or the symbol table) instead. We
IIRC already have some ugly hacks by using hashing to get around
GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch doesn't
make the situation worse per se, but if you're going to fix it, lets
do it properly.

Ok for GCC 7 stage1 with these changes. I don't think it's worth
putting it into GCC 6 at this point anymore, unless this is actually
fixing some bugs that are visible to users?
Bernhard Reutner-Fischer Dec. 1, 2015, 4:51 p.m. UTC | #2
On 1 December 2015 at 15:52, Janne Blomqvist <blomqvist.janne@gmail.com> wrote:
> On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
> <rep.dot.nop@gmail.com> wrote:
>> These three function used a hardcoded buffer of 100 but would be better
>> off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum length of a
>> name in any of our supported standards (63 as of f2003 ff.).
>
> Please use xasprintf() instead (and free the result, or course). One
> of my backburner projects is to get rid of these static symbol
> buffers, and use dynamic buffers (or the symbol table) instead. We
> IIRC already have some ugly hacks by using hashing to get around
> GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch doesn't
> make the situation worse per se, but if you're going to fix it, lets
> do it properly.

I see.

/scratch/src/gcc-6.0.mine/gcc/fortran$ git grep
"^[[:space:]]*char[[:space:]][[:space:]]*[^[;[:space:]]*\[" | wc -l
142
/scratch/src/gcc-6.0.mine/gcc/fortran$ git grep "xasprintf" | wc -l
32

What about memory fragmentation when switching to heap-based allocation?
Or is there consensus that these are in the noise compared to other
parts of the compiler?

BTW:
$ git grep APO
io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE", NULL };
io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE", NULL };


> Ok for GCC 7 stage1 with these changes. I don't think it's worth
> putting it into GCC 6 at this point anymore, unless this is actually
> fixing some bugs that are visible to users?

Not visible, no, can wait easily.
Janne Blomqvist Dec. 3, 2015, 9:46 a.m. UTC | #3
On Tue, Dec 1, 2015 at 6:51 PM, Bernhard Reutner-Fischer
<rep.dot.nop@gmail.com> wrote:
> On 1 December 2015 at 15:52, Janne Blomqvist <blomqvist.janne@gmail.com> wrote:
>> On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
>> <rep.dot.nop@gmail.com> wrote:
>>> These three function used a hardcoded buffer of 100 but would be better
>>> off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum length of a
>>> name in any of our supported standards (63 as of f2003 ff.).
>>
>> Please use xasprintf() instead (and free the result, or course). One
>> of my backburner projects is to get rid of these static symbol
>> buffers, and use dynamic buffers (or the symbol table) instead. We
>> IIRC already have some ugly hacks by using hashing to get around
>> GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch doesn't
>> make the situation worse per se, but if you're going to fix it, lets
>> do it properly.
>
> I see.
>
> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep
> "^[[:space:]]*char[[:space:]][[:space:]]*[^[;[:space:]]*\[" | wc -l
> 142
> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep "xasprintf" | wc -l
> 32

Yes, that's why it's on the TODO-list rather than on the DONE-list. :)

> What about memory fragmentation when switching to heap-based allocation?
> Or is there consensus that these are in the noise compared to other
> parts of the compiler?

Heap fragmentation is an issue, yes. I'm not sure it's that
performance-critical, but I don't think there is any consensus. I just
want to avoid ugly hacks like symbol hashing to fit within some fixed
buffer. Perhaps an good compromise would be something like std::string
with small string optimization, but as you have seen there is some
resistance to C++. But this is more relevant for mangled symbols, so
GFC_MAX_MANGLED_SYMBOL_LEN is more relevant here, and there's only a
few of them left. So, well, if you're sure that mangled symbols are
never copied into the buffers your patch modifies, please consider
your original patch Ok as well. Whichever you prefer.

Performance-wise I think a bigger benefit would be to use the symbol
table more and then e.g. be able to do pointer comparisons rather than
strcmp(). But that is certainly much more work.

> BTW:
> $ git grep APO
> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE", NULL };
> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE", NULL };

? What are you saying?
Bernhard Reutner-Fischer June 18, 2016, 7:46 p.m. UTC | #4
On December 3, 2015 10:46:09 AM GMT+01:00, Janne Blomqvist <blomqvist.janne@gmail.com> wrote:
>On Tue, Dec 1, 2015 at 6:51 PM, Bernhard Reutner-Fischer
><rep.dot.nop@gmail.com> wrote:
>> On 1 December 2015 at 15:52, Janne Blomqvist
><blomqvist.janne@gmail.com> wrote:
>>> On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
>>> <rep.dot.nop@gmail.com> wrote:
>>>> These three function used a hardcoded buffer of 100 but would be
>better
>>>> off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum length
>of a
>>>> name in any of our supported standards (63 as of f2003 ff.).
>>>
>>> Please use xasprintf() instead (and free the result, or course). One
>>> of my backburner projects is to get rid of these static symbol
>>> buffers, and use dynamic buffers (or the symbol table) instead. We
>>> IIRC already have some ugly hacks by using hashing to get around
>>> GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch doesn't
>>> make the situation worse per se, but if you're going to fix it, lets
>>> do it properly.
>>
>> I see.
>>
>> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep
>> "^[[:space:]]*char[[:space:]][[:space:]]*[^[;[:space:]]*\[" | wc -l
>> 142
>> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep "xasprintf" | wc -l
>> 32
>
>Yes, that's why it's on the TODO-list rather than on the DONE-list. :)
>
>> What about memory fragmentation when switching to heap-based
>allocation?
>> Or is there consensus that these are in the noise compared to other
>> parts of the compiler?
>
>Heap fragmentation is an issue, yes. I'm not sure it's that
>performance-critical, but I don't think there is any consensus. I just
>want to avoid ugly hacks like symbol hashing to fit within some fixed
>buffer. Perhaps an good compromise would be something like std::string
>with small string optimization, but as you have seen there is some
>resistance to C++. But this is more relevant for mangled symbols, so
>GFC_MAX_MANGLED_SYMBOL_LEN is more relevant here, and there's only a
>few of them left. So, well, if you're sure that mangled symbols are
>never copied into the buffers your patch modifies, please consider
>your original patch Ok as well. Whichever you prefer.
>
>Performance-wise I think a bigger benefit would be to use the symbol
>table more and then e.g. be able to do pointer comparisons rather than
>strcmp(). But that is certainly much more work.

Hm, worth a look indeed since that would certainly be a step in the right direction.

>
>> BTW:
>> $ git grep APO
>> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE",
>NULL };
>> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE",
>NULL };
>
>? What are you saying?

delim is duplicated, we should remove one instance.
thanks,
Bernhard Reutner-Fischer Oct. 19, 2017, 8:03 a.m. UTC | #5
On Sat, Jun 18, 2016 at 09:46:17PM +0200, Bernhard Reutner-Fischer wrote:
> On December 3, 2015 10:46:09 AM GMT+01:00, Janne Blomqvist <blomqvist.janne@gmail.com> wrote:
> >On Tue, Dec 1, 2015 at 6:51 PM, Bernhard Reutner-Fischer
> ><rep.dot.nop@gmail.com> wrote:
> >> On 1 December 2015 at 15:52, Janne Blomqvist
> ><blomqvist.janne@gmail.com> wrote:
> >>> On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
> >>> <rep.dot.nop@gmail.com> wrote:
> >>>> These three function used a hardcoded buffer of 100 but would be
> >better
> >>>> off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum length
> >of a
> >>>> name in any of our supported standards (63 as of f2003 ff.).
> >>>
> >>> Please use xasprintf() instead (and free the result, or course). One
> >>> of my backburner projects is to get rid of these static symbol
> >>> buffers, and use dynamic buffers (or the symbol table) instead. We
> >>> IIRC already have some ugly hacks by using hashing to get around
> >>> GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch doesn't
> >>> make the situation worse per se, but if you're going to fix it, lets
> >>> do it properly.
> >>
> >> I see.
> >>
> >> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep
> >> "^[[:space:]]*char[[:space:]][[:space:]]*[^[;[:space:]]*\[" | wc -l
> >> 142
> >> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep "xasprintf" | wc -l
> >> 32
> >
> >Yes, that's why it's on the TODO-list rather than on the DONE-list. :)
> >
> >> What about memory fragmentation when switching to heap-based
> >allocation?
> >> Or is there consensus that these are in the noise compared to other
> >> parts of the compiler?
> >
> >Heap fragmentation is an issue, yes. I'm not sure it's that
> >performance-critical, but I don't think there is any consensus. I just
> >want to avoid ugly hacks like symbol hashing to fit within some fixed
> >buffer. Perhaps an good compromise would be something like std::string
> >with small string optimization, but as you have seen there is some
> >resistance to C++. But this is more relevant for mangled symbols, so
> >GFC_MAX_MANGLED_SYMBOL_LEN is more relevant here, and there's only a
> >few of them left. So, well, if you're sure that mangled symbols are
> >never copied into the buffers your patch modifies, please consider
> >your original patch Ok as well. Whichever you prefer.
> >
> >Performance-wise I think a bigger benefit would be to use the symbol
> >table more and then e.g. be able to do pointer comparisons rather than
> >strcmp(). But that is certainly much more work.
> 
> Hm, worth a look indeed since that would certainly be a step in the right direction.

Installed the initial patch as intermediate step as r253881 for now.

thanks,
> 
> >
> >> BTW:
> >> $ git grep APO
> >> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE",
> >NULL };
> >> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE",
> >NULL };
> >
> >? What are you saying?
> 
> delim is duplicated, we should remove one instance.
> thanks,
>
Bernhard Reutner-Fischer Oct. 20, 2017, 10:46 p.m. UTC | #6
On 19 October 2017 10:03:06 CEST, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote:
>On Sat, Jun 18, 2016 at 09:46:17PM +0200, Bernhard Reutner-Fischer
>wrote:
>> On December 3, 2015 10:46:09 AM GMT+01:00, Janne Blomqvist
><blomqvist.janne@gmail.com> wrote:
>> >On Tue, Dec 1, 2015 at 6:51 PM, Bernhard Reutner-Fischer
>> ><rep.dot.nop@gmail.com> wrote:
>> >> On 1 December 2015 at 15:52, Janne Blomqvist
>> ><blomqvist.janne@gmail.com> wrote:
>> >>> On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
>> >>> <rep.dot.nop@gmail.com> wrote:
>> >>>> These three function used a hardcoded buffer of 100 but would be
>> >better
>> >>>> off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum
>length
>> >of a
>> >>>> name in any of our supported standards (63 as of f2003 ff.).
>> >>>
>> >>> Please use xasprintf() instead (and free the result, or course).
>One
>> >>> of my backburner projects is to get rid of these static symbol
>> >>> buffers, and use dynamic buffers (or the symbol table) instead.
>We
>> >>> IIRC already have some ugly hacks by using hashing to get around
>> >>> GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch
>doesn't
>> >>> make the situation worse per se, but if you're going to fix it,
>lets
>> >>> do it properly.
>> >>
>> >> I see.
>> >>
>> >> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep
>> >> "^[[:space:]]*char[[:space:]][[:space:]]*[^[;[:space:]]*\[" | wc
>-l
>> >> 142
>> >> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep "xasprintf" | wc
>-l
>> >> 32
>> >
>> >Yes, that's why it's on the TODO-list rather than on the DONE-list.
>:)
>> >
>> >> What about memory fragmentation when switching to heap-based
>> >allocation?
>> >> Or is there consensus that these are in the noise compared to
>other
>> >> parts of the compiler?
>> >
>> >Heap fragmentation is an issue, yes. I'm not sure it's that
>> >performance-critical, but I don't think there is any consensus. I
>just
>> >want to avoid ugly hacks like symbol hashing to fit within some
>fixed
>> >buffer. Perhaps an good compromise would be something like
>std::string
>> >with small string optimization, but as you have seen there is some
>> >resistance to C++. But this is more relevant for mangled symbols, so
>> >GFC_MAX_MANGLED_SYMBOL_LEN is more relevant here, and there's only a
>> >few of them left. So, well, if you're sure that mangled symbols are
>> >never copied into the buffers your patch modifies, please consider
>> >your original patch Ok as well. Whichever you prefer.
>> >
>> >Performance-wise I think a bigger benefit would be to use the symbol
>> >table more and then e.g. be able to do pointer comparisons rather
>than
>> >strcmp(). But that is certainly much more work.
>> 
>> Hm, worth a look indeed since that would certainly be a step in the
>right direction.
>
>Installed the initial patch as intermediate step as r253881 for now.

JFYI I'm contemplating to move the stack-based allocations to heap-based ones now, starting with gfc_match_name and gradually moving to pointer comparisons with the stringpool based identifiers. I'll strive to suggest something for discussion in smallish steps when it's ready.

Cheers,
>
>thanks,
>> 
>> >
>> >> BTW:
>> >> $ git grep APO
>> >> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE",
>"NONE",
>> >NULL };
>> >> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE",
>"NONE",
>> >NULL };
>> >
>> >? What are you saying?
>> 
>> delim is duplicated, we should remove one instance.
>> thanks,
>>
diff mbox

Patch

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index dcf3eae..30cc522 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -1696,7 +1696,7 @@  check_interface1 (gfc_interface *p, gfc_interface *q0,
 static void
 check_sym_interfaces (gfc_symbol *sym)
 {
-  char interface_name[100];
+  char interface_name[GFC_MAX_SYMBOL_LEN + sizeof("generic interface ''")];
   gfc_interface *p;
 
   if (sym->ns != gfc_current_ns)
@@ -1733,7 +1733,7 @@  check_sym_interfaces (gfc_symbol *sym)
 static void
 check_uop_interfaces (gfc_user_op *uop)
 {
-  char interface_name[100];
+  char interface_name[GFC_MAX_SYMBOL_LEN + sizeof("operator interface ''")];
   gfc_user_op *uop2;
   gfc_namespace *ns;
 
@@ -1810,7 +1810,7 @@  void
 gfc_check_interfaces (gfc_namespace *ns)
 {
   gfc_namespace *old_ns, *ns2;
-  char interface_name[100];
+  char interface_name[GFC_MAX_SYMBOL_LEN + sizeof("intrinsic '' operator")];
   int i;
 
   old_ns = gfc_current_ns;