Message ID | trinity-b0708fbd-6e5c-4fc0-b74f-4af90740e364-1590780482832@3c-app-gmx-bap60 |
---|---|
State | New |
Headers | show |
Series | [committed,part2] PR fortran/95090 - ICE: identifier overflow | expand |
On Fri, May 29, 2020 at 1:39 PM Harald Anlauf <anlauf@gmx.de> wrote: > > The initial patch for this PR had some fallout which for unknown reason > did only show up on i686, but not on x86_64. With initial guidance by > Manfred Schwarb three further locations exhibiting buffer overrun could > be identified in a gdb session and were fixed. > > Committed as 'obvious' to master. > > Thanks, > Harald > > > PR fortran/95090 - ICE: identifier overflow > > The initial fix for this PR uncovered several latent issues with further > too small string buffers which showed up only when testing on i686. > Provide sufficiently large temporaries. > > 2020-05-29 Harald Anlauf <anlauf@gmx.de> > > gcc/fortran/ > PR fortran/95090 > * class.c (get_unique_type_string): Enlarge temporary for > name-mangling. Use strncpy to prevent buffer overrun. > (get_unique_hashed_string): Enlarge temporary. > (gfc_hash_value): Enlarge temporary for name-mangling. This breaks bootstrap: https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html ../../src-master/gcc/fortran/class.c:487:13: error: ‘char* strncpy(char*, const char*, size_t)’ specified bound 67 equals destination size [-Werror=stringop-truncation] 487 | strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name));
Hi H.J., > Gesendet: Freitag, 29. Mai 2020 um 23:57 Uhr > Von: "H.J. Lu" <hjl.tools@gmail.com> > This breaks bootstrap: > > https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html > > ../../src-master/gcc/fortran/class.c:487:13: error: ‘char* > strncpy(char*, const char*, size_t)’ specified bound 67 equals > destination size [-Werror=stringop-truncation] > 487 | strncpy (dt_name, gfc_dt_upper_string (derived->name), > sizeof (dt_name)); what is the right way to deal with that? I haven't seen any use of strlcpy in the gcc sources. This would do the right thing and would fit here. So should one use the clumsy way: strncpy(buf, str, buflen - 1); if (buflen > 0) buf[buflen - 1]= '\0'; Or is there something more convenient that keeps gcc happy? Harald
On Fri, May 29, 2020 at 3:20 PM Harald Anlauf <anlauf@gmx.de> wrote: > > Hi H.J., > > > Gesendet: Freitag, 29. Mai 2020 um 23:57 Uhr > > Von: "H.J. Lu" <hjl.tools@gmail.com> > > > This breaks bootstrap: > > > > https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html > > > > ../../src-master/gcc/fortran/class.c:487:13: error: ‘char* > > strncpy(char*, const char*, size_t)’ specified bound 67 equals > > destination size [-Werror=stringop-truncation] > > 487 | strncpy (dt_name, gfc_dt_upper_string (derived->name), > > sizeof (dt_name)); > > what is the right way to deal with that? > > I haven't seen any use of strlcpy in the gcc sources. This would do > the right thing and would fit here. > > So should one use the clumsy way: > > strncpy(buf, str, buflen - 1); > if (buflen > 0) > buf[buflen - 1]= '\0'; > > Or is there something more convenient that keeps gcc happy? dt_name[sizeof (dt_name) - 1] = '\0'; strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name) - 1);
On Sat, May 30, 2020 at 12:20:19AM +0200, Harald Anlauf wrote: > > Gesendet: Freitag, 29. Mai 2020 um 23:57 Uhr > > Von: "H.J. Lu" <hjl.tools@gmail.com> > > > This breaks bootstrap: > > > > https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html > > > > ../../src-master/gcc/fortran/class.c:487:13: error: ‘char* > > strncpy(char*, const char*, size_t)’ specified bound 67 equals > > destination size [-Werror=stringop-truncation] > > 487 | strncpy (dt_name, gfc_dt_upper_string (derived->name), > > sizeof (dt_name)); > > what is the right way to deal with that? > > I haven't seen any use of strlcpy in the gcc sources. This would do > the right thing and would fit here. > > So should one use the clumsy way: > > strncpy(buf, str, buflen - 1); > if (buflen > 0) > buf[buflen - 1]= '\0'; > > Or is there something more convenient that keeps gcc happy? Depends on what exactly you want to achieve, but strncpy is pretty much always the wrong answer. Do you really want to clear the rest of buf, when gfc_dt_upper_string is much shorter? That is one thing that is only seldom useful from strncpy behavior. The other is that it doesn't zero terminate on truncation. I'd suggest e.g. remember gfc_dt_upper_string result in a temporary, compute len = strnlen (res, sizeof (dt_name) - 1); and then you can just memcpy and zero terminate buf[len] = '\0'; Or do you ever want to truncate? Jakub
> > > This breaks bootstrap: > > > > > > https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html > > > > > > ../../src-master/gcc/fortran/class.c:487:13: error: ‘char* > > > strncpy(char*, const char*, size_t)’ specified bound 67 equals > > > destination size [-Werror=stringop-truncation] > > > 487 | strncpy (dt_name, gfc_dt_upper_string (derived->name), > > > sizeof (dt_name)); > > > > So should one use the clumsy way: > > > > strncpy(buf, str, buflen - 1); > > if (buflen > 0) > > buf[buflen - 1]= '\0'; > > > > Or is there something more convenient that keeps gcc happy? > > Depends on what exactly you want to achieve, but strncpy is pretty much > always the wrong answer. > Do you really want to clear the rest of buf, when gfc_dt_upper_string > is much shorter? That is one thing that is only seldom useful from strncpy > behavior. The other is that it doesn't zero terminate on truncation. > I'd suggest e.g. remember gfc_dt_upper_string result in a temporary, > compute len = strnlen (res, sizeof (dt_name) - 1); > and then you can just memcpy and zero terminate buf[len] = '\0'; > Or do you ever want to truncate? I'ld like to detect the situation that when somebody modifies name-mangling in a way that generates a buffer overrun during regtesting so that the temporaries to adjust are easier to find. After thinking about your and H.J.'s suggestions, the shortest solution I came up with is: diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index db395624a16..6d0924da2b8 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -484,7 +484,12 @@ get_unique_type_string (char *string, gfc_symbol *derived) if (derived->attr.unlimited_polymorphic) strcpy (dt_name, "STAR"); else - strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name)); + { + dt_name[sizeof (dt_name)-1] = '\0'; + strcpy (dt_name, gfc_dt_upper_string (derived->name)); + if (dt_name[sizeof (dt_name)-1] != '\0') + gfc_internal_error ("get_unique_type_string: identifier overflow"); + } if (derived->attr.unlimited_polymorphic) sprintf (string, "_%s", dt_name); else if (derived->module) That would have given me a useful error on x86_64. Is this OK for master? Harald
On Sat, May 30, 2020 at 02:48:32PM +0200, Harald Anlauf wrote: > I'ld like to detect the situation that when somebody modifies name-mangling in > a way that generates a buffer overrun during regtesting so that the temporaries > to adjust are easier to find. > > After thinking about your and H.J.'s suggestions, the shortest solution > I came up with is: > > diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c > index db395624a16..6d0924da2b8 100644 > --- a/gcc/fortran/class.c > +++ b/gcc/fortran/class.c > @@ -484,7 +484,12 @@ get_unique_type_string (char *string, gfc_symbol *derived) > if (derived->attr.unlimited_polymorphic) > strcpy (dt_name, "STAR"); > else > - strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name)); > + { > + dt_name[sizeof (dt_name)-1] = '\0'; > + strcpy (dt_name, gfc_dt_upper_string (derived->name)); > + if (dt_name[sizeof (dt_name)-1] != '\0') > + gfc_internal_error ("get_unique_type_string: identifier overflow"); > + } > if (derived->attr.unlimited_polymorphic) > sprintf (string, "_%s", dt_name); > else if (derived->module) > > That would have given me a useful error on x86_64. > > Is this OK for master? That is detecting it after the buffer overflow has happened already, that is too late, after UB anything can happen. + { + const char *upper = gfc_dt_upper_string (derived->name); + size_t len = strnlen (upper, sizeof (dt_name)); + gcc_assert (len < sizeof (dt_name)); + memcpy (dt_name, upper, len); + dt_name[len] = '\0'; + } does detect it before overflowing it. Jakub
On Mai 30 2020, Harald Anlauf wrote: > diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c > index db395624a16..6d0924da2b8 100644 > --- a/gcc/fortran/class.c > +++ b/gcc/fortran/class.c > @@ -484,7 +484,12 @@ get_unique_type_string (char *string, gfc_symbol *derived) > if (derived->attr.unlimited_polymorphic) > strcpy (dt_name, "STAR"); > else > - strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name)); > + { > + dt_name[sizeof (dt_name)-1] = '\0'; > + strcpy (dt_name, gfc_dt_upper_string (derived->name)); > + if (dt_name[sizeof (dt_name)-1] != '\0') > + gfc_internal_error ("get_unique_type_string: identifier overflow"); > + } Why do you need to create the copy in the first place? Please revert, if you cannot fix that soon. Andreas.
> That is detecting it after the buffer overflow has happened already, that is > too late, after UB anything can happen. > + { > + const char *upper = gfc_dt_upper_string (derived->name); > + size_t len = strnlen (upper, sizeof (dt_name)); > + gcc_assert (len < sizeof (dt_name)); > + memcpy (dt_name, upper, len); > + dt_name[len] = '\0'; > + } > does detect it before overflowing it. OK. Here's what I committed: PR fortran/95090 - ICE: identifier overflow Implement buffer overrun check for temporary that holds mangled names. 2020-05-30 Harald Anlauf <anlauf@gmx.de> gcc/fortran/ PR fortran/95090 * class.c (get_unique_type_string): Use buffer overrun check. diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index db395624a16..afd8885a1ea 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -484,7 +484,14 @@ get_unique_type_string (char *string, gfc_symbol *derived) if (derived->attr.unlimited_polymorphic) strcpy (dt_name, "STAR"); else - strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name)); + { + const char *upper = gfc_dt_upper_string (derived->name); + size_t len = strnlen (upper, sizeof (dt_name)); + if (len >= sizeof (dt_name)) + gfc_internal_error ("get_unique_type_string: identifier overflow"); + memcpy (dt_name, upper, len); + dt_name[len] = '\0'; + } if (derived->attr.unlimited_polymorphic) sprintf (string, "_%s", dt_name); else if (derived->module) Harald
On Sat, May 30, 2020 at 08:57:46PM +0200, Harald Anlauf wrote: > > That is detecting it after the buffer overflow has happened already, that is > > too late, after UB anything can happen. > > + { > > + const char *upper = gfc_dt_upper_string (derived->name); > > + size_t len = strnlen (upper, sizeof (dt_name)); > > + gcc_assert (len < sizeof (dt_name)); > > + memcpy (dt_name, upper, len); > > + dt_name[len] = '\0'; > > + } > > does detect it before overflowing it. > > OK. Here's what I committed: > > > PR fortran/95090 - ICE: identifier overflow > > Implement buffer overrun check for temporary that holds mangled names. > > 2020-05-30 Harald Anlauf <anlauf@gmx.de> > > gcc/fortran/ > PR fortran/95090 > * class.c (get_unique_type_string): Use buffer overrun check. Thanks. Though, see Andreas' mail, I think he is right that there is really no reason to have that dt_name local buffer at all, just changing dt_name from an array to const char *dt_name; and changing the strcpy to dt_name = "STAR"; and the former strncpy (dt_name, ...) to dt_name = gfc_dt_upper_string (derived->name); should do it. There is a possible buffer overflow in the string with or without that change but to fix that I think it would be desirable to pass not just the string buffer to the function but also the length of the buffer and in the function verify it will not overflow. There is no reason to use sprintf which is fairly expensive, and could be even simplified. So, once dt_name is const char *, change that if (derived->attr.unlimited_polymorphic) sprintf (string, "_%s", dt_name); else if (derived->module) sprintf (string, "%s_%s", derived->module, dt_name); else if (derived->ns->proc_name) sprintf (string, "%s_%s", derived->ns->proc_name->name, dt_name); else sprintf (string, "_%s", dt_name); to something like: const char *first = ""; if (!derived->attr.unlimited_polymorphic) { if (derived->module) first = derived->module; else if (derived->ns->proc_name) first = derived->ns->proc_name->name; } size_t len1 = strlen (first), len2 = strlen (dt_name); if (len1 + 1 + len2 + 1 >= len) // len being the new passed argument - length of the buffer pointed by string gfc_internal_error (...); memcpy (string, first, len1); string[len1] = '_'; memcpy (string + len1 + 1, dt_name, len2 + 1); Jakub
On Sat, May 30, 2020 at 09:11:23PM +0200, Jakub Jelinek via Gcc-patches wrote: > There is a possible buffer overflow in the string with or without that > change but to fix that I think it would be desirable to pass not just the > string buffer to the function but also the length of the buffer and in the > function verify it will not overflow. There is no reason to use sprintf > which is fairly expensive, and could be even simplified. > > So, once dt_name is const char *, change that > if (derived->attr.unlimited_polymorphic) > sprintf (string, "_%s", dt_name); > else if (derived->module) > sprintf (string, "%s_%s", derived->module, dt_name); > else if (derived->ns->proc_name) > sprintf (string, "%s_%s", derived->ns->proc_name->name, dt_name); > else > sprintf (string, "_%s", dt_name); > to something like: > const char *first = ""; > if (!derived->attr.unlimited_polymorphic) > { > if (derived->module) > first = derived->module; > else if (derived->ns->proc_name) > first = derived->ns->proc_name->name; > } > size_t len1 = strlen (first), len2 = strlen (dt_name); > if (len1 + 1 + len2 + 1 >= len) // len being the new passed argument - length of the buffer pointed by string > gfc_internal_error (...); > memcpy (string, first, len1); > string[len1] = '_'; > memcpy (string + len1 + 1, dt_name, len2 + 1); Or if you prefer replace everything starting with len1 above with snprintf (string, len, "%s_%s", first, dt_name); which will truncate (and if you need, you could if ((size_t) snprintf (string, len, "%s_%s", first, dt_name) >= len) gfc_internal_error (...); Jakub
> Thanks. Though, see Andreas' mail, I think he is right that there is really > no reason to have that dt_name local buffer at all, just changing > dt_name from an array to const char *dt_name; and changing the strcpy to > dt_name = "STAR"; > and the former strncpy (dt_name, ...) to > dt_name = gfc_dt_upper_string (derived->name); > should do it. Certainly. The functions in question existed long before I became involved in gfortran development. AFAIU is only the (rather) receently modified name mangling to take into account newer Fortran language features that caused the ICE when the maximum allowed symbol lengths are used. This is hopefully now fully exerted in the testcase. Symbol lengths are checked elsewhere. A user program should not be able to trigger an ICE here. (Fingers crossed) Of course one could always rewrite gfortran to use a more modern C style. We're all looking for more volunteers! (I'm much better at Fortran than at C). Nevertheless, thanks for your patience and your explanations. They might positively influence my future contribution. Harald
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index 9aa3eb7282c..db395624a16 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -479,11 +479,12 @@ gfc_class_initializer (gfc_typespec *ts, gfc_expr *init_expr) static void get_unique_type_string (char *string, gfc_symbol *derived) { - char dt_name[GFC_MAX_SYMBOL_LEN+1]; + /* Provide sufficient space to hold "Pdtsymbol". */ + char dt_name[GFC_MAX_SYMBOL_LEN+4]; if (derived->attr.unlimited_polymorphic) strcpy (dt_name, "STAR"); else - strcpy (dt_name, gfc_dt_upper_string (derived->name)); + strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name)); if (derived->attr.unlimited_polymorphic) sprintf (string, "_%s", dt_name); else if (derived->module) @@ -501,7 +502,8 @@ get_unique_type_string (char *string, gfc_symbol *derived) static void get_unique_hashed_string (char *string, gfc_symbol *derived) { - char tmp[2*GFC_MAX_SYMBOL_LEN+2]; + /* Provide sufficient space to hold "symbol_Pdtsymbol". */ + char tmp[2*GFC_MAX_SYMBOL_LEN+5]; get_unique_type_string (&tmp[0], derived); /* If string is too long, use hash value in hex representation (allow for extra decoration, cf. gfc_build_class_symbol & gfc_find_derived_vtab). @@ -523,7 +525,8 @@ unsigned int gfc_hash_value (gfc_symbol *sym) { unsigned int hash = 0; - char c[2*(GFC_MAX_SYMBOL_LEN+1)]; + /* Provide sufficient space to hold "symbol_Pdtsymbol". */ + char c[2*GFC_MAX_SYMBOL_LEN+5]; int i, len; get_unique_type_string (&c[0], sym);