Message ID | 20181116084325.GD11625@tucnak |
---|---|
State | New |
Series | Reject too large string literals (PR middle-end/87854) |
On 11/16/18 3:43 AM, Jakub Jelinek wrote:
> Hi!
>
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
> char a[__SIZE_MAX__ / 2 + 1];
>      ^
> because one can't do pointer arithmetics on them. But we don't have
> anything similar for string literals. As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
>
> The following patch adds that diagnostics and truncates the string literals.

Ok by me.

nathan
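The condition described in the quoted message can be modelled outside the compiler. The following is a minimal sketch in C, assuming a signed size type of a given bit width; `signed_size_max` and `literal_too_large` are hypothetical names chosen for illustration, not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: largest value representable in
   a signed size type that is `bits` wide (for example 16 on a target
   with a 16-bit size_t).  */
static int64_t
signed_size_max (unsigned bits)
{
  assert (bits >= 2 && bits <= 63);
  return (INT64_C (1) << (bits - 1)) - 1;
}

/* Hypothetical helper: true if a literal occupying `length` bytes
   (a host int, like TREE_STRING_LENGTH) exceeds that maximum.  */
static bool
literal_too_large (int length, unsigned bits)
{
  return signed_size_max (bits) < (int64_t) length;
}
```

With a 16-bit signed size type the largest accepted length is 32767 bytes, so `literal_too_large (32768, 16)` is true. With 32 bits no host `int` length can exceed the maximum, which is consistent with the remark that only targets with a size_t narrower than 32 bits are affected.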
On Fri, Nov 16, 2018 at 07:06:51AM -0500, Nathan Sidwell wrote:
> On 11/16/18 3:43 AM, Jakub Jelinek wrote:
> > Hi!
> >
> > Both C and C++ FE diagnose arrays larger than half of the address space:
> > /tmp/1.c:1:6: error: size of array ‘a’ is too large
> > char a[__SIZE_MAX__ / 2 + 1];
> >      ^
> > because one can't do pointer arithmetics on them. But we don't have
> > anything similar for string literals. As internally we use host int
> > as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> > size_t only.
> >
> > The following patch adds that diagnostics and truncates the string literals.
>
> Ok by me.

No objections from me, either.

Marek
On Fri, 16 Nov 2018, Jakub Jelinek wrote:
> Hi!
>
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
> char a[__SIZE_MAX__ / 2 + 1];
>      ^
> because one can't do pointer arithmetics on them. But we don't have
> anything similar for string literals. As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
>
> The following patch adds that diagnostics and truncates the string literals.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
> a cross to avr. I'll defer adjusting testcases to the maintainers of 16-bit
> ports. From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and
> pr46534.c tests are affected.
>
> Ok for trunk?

OK with me. I'd hope at least one test (existing or new) would actually
test the new diagnostic on 16-bit systems, rather than just those tests
being disabled for affected platforms.
On 11/16/2018 01:43 AM, Jakub Jelinek wrote:
> Hi!
>
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
> char a[__SIZE_MAX__ / 2 + 1];
>      ^
> because one can't do pointer arithmetics on them. But we don't have
> anything similar for string literals. As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
>
> The following patch adds that diagnostics and truncates the string literals.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
> a cross to avr. I'll defer adjusting testcases to the maintainers of 16-bit
> ports. From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and
> pr46534.c tests are affected.
>
> Ok for trunk?
>
> 2018-11-16 Jakub Jelinek <jakub@redhat.com>
>
>     PR middle-end/87854
>     * c-common.c (fix_string_type): Reject string literals larger than
>     TYPE_MAX_VALUE (ssizetype) bytes.
>
> --- gcc/c-family/c-common.c.jj 2018-11-14 13:37:46.921050615 +0100
> +++ gcc/c-family/c-common.c 2018-11-15 15:20:31.138056115 +0100
> @@ -737,31 +737,44 @@ tree
>  fix_string_type (tree value)
>  {
>    int length = TREE_STRING_LENGTH (value);
> -  int nchars;
> +  int nchars, charsz;
>    tree e_type, i_type, a_type;
>
>    /* Compute the number of elements, for the array type. */
>    if (TREE_TYPE (value) == char_array_type_node || !TREE_TYPE (value))
>      {
> -      nchars = length;
> +      charsz = 1;
>        e_type = char_type_node;
>      }
>    else if (TREE_TYPE (value) == char16_array_type_node)
>      {
> -      nchars = length / (TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT;
>        e_type = char16_type_node;
>      }
>    else if (TREE_TYPE (value) == char32_array_type_node)
>      {
> -      nchars = length / (TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT;
>        e_type = char32_type_node;
>      }
>    else
>      {
> -      nchars = length / (TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT;
>        e_type = wchar_type_node;
>      }
>
> +  /* This matters only for targets where ssizetype has smaller precision
> +     than 32 bits. */
> +  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
> +    {
> +      error ("size of string literal is too large");

It would be helpful to mention the size of the literal and the limit
so users who do run into the error don't wonder how to fix it.

Martin
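To make the suggestion concrete, here is a rough sketch in plain C of a message that carries both numbers. The wording and the helper `format_too_large` are hypothetical; this does not use GCC's diagnostic machinery and is not what the patch emits.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper with hypothetical wording: writes a message that
   names the literal's size and the target limit into `out`.  Returns
   the value of snprintf, i.e. the length the full message needs.  */
static int
format_too_large (char *out, size_t outsz, long size, long limit)
{
  assert (out != NULL && outsz > 0);
  return snprintf (out, outsz,
                   "size %ld of string literal exceeds maximum %ld",
                   size, limit);
}
```

Called as `format_too_large (msg, sizeof msg, 40000L, 32767L)`, it fills `msg` with a message naming both 40000 and 32767, assuming the buffer is large enough.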
On Fri, Nov 16, 2018 at 11:25:15AM -0700, Martin Sebor wrote:
> On 11/16/2018 01:43 AM, Jakub Jelinek wrote:
> >
> > +  /* This matters only for targets where ssizetype has smaller precision
> > +     than 32 bits. */
> > +  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
> > +    {
> > +      error ("size of string literal is too large");
>
> It would be helpful to mention the size of the literal and the limit
> so users who do run into the error don't wonder how to fix it.

It is consistent with what we emit for the arrays. So, if the size and
limit info is helpful to users, we should provide that for those too.
I mean the:

      if (name)
        error_at (loc, "size of array %qE is too large", name);
      else
        error_at (loc, "size of unnamed array is too large");

calls in the C FE and similar stuff in C++ FE. Feel free to add that to all
of those.

Jakub
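The truncation arithmetic in the patch under discussion can be tried in isolation. This is a minimal sketch in C, assuming a plain byte buffer in place of the STRING_CST; `truncate_literal` is a hypothetical stand-in, and `max` plays the role of TYPE_MAX_VALUE (ssizetype).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the patch's truncation step.  `buf` holds
   `length` bytes of a literal whose characters are `charsz` bytes wide.
   If `length` exceeds `max`, the length is rounded down to a whole
   number of characters and up to one character's worth of bytes
   starting at the new length is cleared.  Returns the new length.  */
static int
truncate_literal (char *buf, int length, int max, int charsz)
{
  assert (buf != NULL && charsz > 0 && max >= 0 && length >= 0);
  if (length <= max)
    return length;                      /* Nothing to do.  */

  int new_length = max / charsz * charsz;
  int tail = length - new_length;       /* Bytes left after the cut.  */
  memset (buf + new_length, '\0', tail < charsz ? tail : charsz);
  return new_length;
}
```

For a 12-byte buffer of 2-byte characters and a limit of 7, the function returns 6 and zeroes bytes 6 and 7. The rounding keeps the length a multiple of the character size, so the later `nchars = length / charsz` division stays exact.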
--- gcc/c-family/c-common.c.jj 2018-11-14 13:37:46.921050615 +0100
+++ gcc/c-family/c-common.c 2018-11-15 15:20:31.138056115 +0100
@@ -737,31 +737,44 @@ tree
 fix_string_type (tree value)
 {
   int length = TREE_STRING_LENGTH (value);
-  int nchars;
+  int nchars, charsz;
   tree e_type, i_type, a_type;
 
   /* Compute the number of elements, for the array type. */
   if (TREE_TYPE (value) == char_array_type_node || !TREE_TYPE (value))
     {
-      nchars = length;
+      charsz = 1;
       e_type = char_type_node;
     }
   else if (TREE_TYPE (value) == char16_array_type_node)
     {
-      nchars = length / (TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT;
       e_type = char16_type_node;
     }
   else if (TREE_TYPE (value) == char32_array_type_node)
     {
-      nchars = length / (TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT;
       e_type = char32_type_node;
     }
   else
     {
-      nchars = length / (TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT;
       e_type = wchar_type_node;
     }
 
+  /* This matters only for targets where ssizetype has smaller precision
+     than 32 bits. */
+  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
+    {
+      error ("size of string literal is too large");
+      length = tree_to_shwi (TYPE_MAX_VALUE (ssizetype)) / charsz * charsz;
+      char *str = CONST_CAST (char *, TREE_STRING_POINTER (value));
+      memset (str + length, '\0',
+              MIN (TREE_STRING_LENGTH (value) - length, charsz));
+      TREE_STRING_LENGTH (value) = length;
+    }
+  nchars = length / charsz;
+
   /* C89 2.2.4.1, C99 5.2.4.1 (Translation limits). The analogous
      limit in C++98 Annex B is very large (65536) and is not normative,
      so we do not diagnose it (warn_overlength_strings is forced off