Message ID | a0d277c4-41c5-61a8-0284-5b1b245e2c74@gmail.com |
---|---|
State | New |
Headers | show |
Series | warn for strlen of arrays with missing nul (PR 86552) | expand |
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html The fix for bug 86532 has been checked in so this enhancement can now be applied on top of it (with only minor adjustments). On 07/19/2018 02:08 PM, Martin Sebor wrote: > In the discussion of my patch for pr86532 Bernd noted that > GCC silently accepts constant character arrays with no > terminating nul as arguments to strlen (and other string > functions). > > The attached patch is a first step in detecting these kinds > of bugs in strlen calls by issuing -Wstringop-overflow. > The next step is to modify all other handlers of built-in > functions to detect the same problem (not part of this patch). > Yet another step is to detect these problems in arguments > initialized using the non-string form: > > const char a[] = { 'a', 'b', 'c' }; > > This patch is meant to apply on top of the one for bug 86532 > (I tested it with an earlier version of that patch so there > is code in the context that does not appear in the latest > version of the other diff). > > Martin >
Attached is an updated version of the patch that handles more instances of calling strlen() on a constant array that is not a nul-terminated string. No other functions except strlen are explicitly handled yet, and neither are constant arrays with braced-initializer lists like const char a[] = { 'a', 'b', 'c' }; I am testing an independent solution for those (bug 86552). Once those are handled the warning will be able to detect those as well. Tested on x86_64-linux. On 07/25/2018 05:38 PM, Martin Sebor wrote: > Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html > > The fix for bug 86532 has been checked in so this enhancement > can now be applied on top of it (with only minor adjustments). > > On 07/19/2018 02:08 PM, Martin Sebor wrote: >> In the discussion of my patch for pr86532 Bernd noted that >> GCC silently accepts constant character arrays with no >> terminating nul as arguments to strlen (and other string >> functions). >> >> The attached patch is a first step in detecting these kinds >> of bugs in strlen calls by issuing -Wstringop-overflow. >> The next step is to modify all other handlers of built-in >> functions to detect the same problem (not part of this patch). >> Yet another step is to detect these problems in arguments >> initialized using the non-string form: >> >> const char a[] = { 'a', 'b', 'c' }; >> >> This patch is meant to apply on top of the one for bug 86532 >> (I tested it with an earlier version of that patch so there >> is code in the context that does not appear in the latest >> version of the other diff). >> >> Martin >> > PR tree-optimization/86552 - missing warning for reading past the end of non-string arrays gcc/ChangeLog: PR tree-optimization/86552 * builtins.h (warn_string_no_nul): Declare.. (c_strlen): Add argument. * builtins.c (warn_string_no_nul): New function. (fold_builtin_strlen): Add argument. Detect missing nul. (fold_builtin_1): Adjust. (string_length): Add argument and use it. (c_strlen): Same. (expand_builtin_strlen): Detect missing nul. * expr.c (string_constant): Add arguments. Detect missing nul terminator and outermost declaration it's missing in. * expr.h (string_constant): Add argument. * fold-const.c (c_getstr): Change argument to bool*, rename other arguments. * fold-const-call.c (fold_const_call): Detect missing nul. * gimple-fold.c (get_range_strlen): Add argument. (get_maxval_strlen): Adjust. * gimple-fold.h (get_range_strlen): Add argument. gcc/testsuite/ChangeLog: PR tree-optimization/86552 * gcc.dg/warn-string-no-nul.c: New test. diff --git a/gcc/builtins.c b/gcc/builtins.c index aa3e0d8..f4924d5 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -150,7 +150,7 @@ static tree stabilize_va_list_loc (location_t, tree, int); static rtx expand_builtin_expect (tree, rtx); static tree fold_builtin_constant_p (tree); static tree fold_builtin_classify_type (tree); -static tree fold_builtin_strlen (location_t, tree, tree); +static tree fold_builtin_strlen (location_t, tree, tree, tree); static tree fold_builtin_inf (location_t, tree, int); static tree rewrite_call_expr (location_t, tree, int, tree, int, ...); static bool validate_arg (const_tree, enum tree_code code); @@ -550,6 +550,36 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts) return n; } +/* For a call expression EXP to a function that expects a string argument, + issue a diagnostic due to it being a called with an argument NONSTR + that is a character array with no terminating NUL. */ + +void +warn_string_no_nul (location_t loc, tree exp, tree fndecl, tree nonstr) +{ + loc = expansion_point_location_if_in_system_header (loc); + + bool warned; + if (exp) + { + if (!fndecl) + fndecl = get_callee_fndecl (exp); + warned = warning_at (loc, OPT_Wstringop_overflow_, + "%K%qD argument missing terminating nul", + exp, fndecl); + } + else + { + gcc_assert (fndecl); + warned = warning_at (loc, OPT_Wstringop_overflow_, + "%qD argument missing terminating nul", + fndecl); + } + + if (warned && DECL_P (nonstr)) + inform (DECL_SOURCE_LOCATION (nonstr), "referenced argument declared here"); +} + /* Compute the length of a null-terminated character string or wide character string handling character sizes of 1, 2, and 4 bytes. TREE_STRING_LENGTH is not the right way because it evaluates to @@ -567,37 +597,60 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts) accesses. Note that this implies the result is not going to be emitted into the instruction stream. + When ARR is non-null and the string is not properly nul-terminated, + set *ARR to the declaration of the outermost constant object whose + initializer (or one of its elements) is not nul-terminated. + The value returned is of type `ssizetype'. Unfortunately, string_constant can't access the values of const char arrays with initializers, so neither can we do so here. */ tree -c_strlen (tree src, int only_value) +c_strlen (tree src, int only_value, tree *arr /* = NULL */) { STRIP_NOPS (src); + + /* Used to detect non-nul-terminated strings in subexpressions + of a conditional expression. When ARR is null, point it at + one of the elements for simplicity. */ + tree arrs[] = { NULL_TREE, NULL_TREE }; + if (!arr) + arr = arrs; + if (TREE_CODE (src) == COND_EXPR && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0)))) { - tree len1, len2; - - len1 = c_strlen (TREE_OPERAND (src, 1), only_value); - len2 = c_strlen (TREE_OPERAND (src, 2), only_value); + tree len1 = c_strlen (TREE_OPERAND (src, 1), only_value, arrs); + tree len2 = c_strlen (TREE_OPERAND (src, 2), only_value, arrs + 1); if (tree_int_cst_equal (len1, len2)) - return len1; + { + *arr = arrs[0] ? arrs[0] : arrs[1]; + return len1; + } } if (TREE_CODE (src) == COMPOUND_EXPR && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0)))) - return c_strlen (TREE_OPERAND (src, 1), only_value); + return c_strlen (TREE_OPERAND (src, 1), only_value, arr); location_t loc = EXPR_LOC_OR_LOC (src, input_location); /* Offset from the beginning of the string in bytes. */ tree byteoff; - src = string_constant (src, &byteoff); - if (src == 0) - return NULL_TREE; + /* Set if array is nul-terminated, false otherwise. */ + bool nulterm; + src = string_constant (src, &byteoff, &nulterm, arr); + if (!src) + { + *arr = arrs[0] ? arrs[0] : arrs[1]; + return NULL_TREE; + } + + /* Clear *ARR when the string is nul-terminated. It should be + of no interest to callers. */ + if (nulterm) + *arr = NULL_TREE; /* Determine the size of the string element. */ unsigned eltsize @@ -621,6 +674,12 @@ c_strlen (tree src, int only_value) maxelts = maxelts / eltsize - 1; } + /* Unless the caller is prepared to handle it by passing in a non-null + ARR, fail if the terminating nul doesn't fit in the array the string + is stored in (as in const char a[3] = "123"; */ + if (!arr && maxelts < strelts) + return NULL_TREE; + /* PTR can point to the byte representation of any string type, including char* and wchar_t*. */ const char *ptr = TREE_STRING_POINTER (src); @@ -650,7 +709,8 @@ c_strlen (tree src, int only_value) offsave = fold_convert (ssizetype, offsave); tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, offsave, build_int_cst (ssizetype, len * eltsize)); - tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), offsave); + tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), + offsave); return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp, build_zero_cst (ssizetype)); } @@ -690,7 +750,7 @@ c_strlen (tree src, int only_value) Since ELTOFF is our starting index into the string, no further calculation is needed. */ unsigned len = string_length (ptr + eltoff * eltsize, eltsize, - maxelts - eltoff); + strelts - eltoff); return ssize_int (len); } @@ -2855,7 +2915,6 @@ expand_builtin_strlen (tree exp, rtx target, struct expand_operand ops[4]; rtx pat; - tree len; tree src = CALL_EXPR_ARG (exp, 0); rtx src_reg; rtx_insn *before_strlen; @@ -2864,20 +2923,39 @@ expand_builtin_strlen (tree exp, rtx target, unsigned int align; /* If the length can be computed at compile-time, return it. */ - len = c_strlen (src, 0); + tree array; + tree len = c_strlen (src, 0, &array); if (len) - return expand_expr (len, target, target_mode, EXPAND_NORMAL); + { + if (array) + { + /* Array refers to the non-nul terminated constant array + whose length is attempted to be computed. */ + warn_string_no_nul (EXPR_LOCATION (exp), exp, NULL_TREE, array); + return NULL_RTX; + } + return expand_expr (len, target, target_mode, EXPAND_NORMAL); + } /* If the length can be computed at compile-time and is constant integer, but there are side-effects in src, evaluate src for side-effects, then return len. E.g. x = strlen (i++ ? "xfoo" + 1 : "bar"); can be optimized into: i++; x = 3; */ - len = c_strlen (src, 1); - if (len && TREE_CODE (len) == INTEGER_CST) + len = c_strlen (src, 1, &array); + if (len) { - expand_expr (src, const0_rtx, VOIDmode, EXPAND_NORMAL); - return expand_expr (len, target, target_mode, EXPAND_NORMAL); + if (array) + { + warn_string_no_nul (EXPR_LOCATION (exp), exp, NULL_TREE, array); + return NULL_RTX; + } + + if (TREE_CODE (len) == INTEGER_CST) + { + expand_expr (src, const0_rtx, VOIDmode, EXPAND_NORMAL); + return expand_expr (len, target, target_mode, EXPAND_NORMAL); + } } align = get_pointer_alignment (src) / BITS_PER_UNIT; @@ -8255,19 +8333,30 @@ fold_builtin_classify_type (tree arg) return build_int_cst (integer_type_node, type_to_class (TREE_TYPE (arg))); } -/* Fold a call to __builtin_strlen with argument ARG. */ +/* Fold a strlen call to FNDECL of TYPE, and with argument ARG. */ static tree -fold_builtin_strlen (location_t loc, tree type, tree arg) +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg) { if (!validate_arg (arg, POINTER_TYPE)) return NULL_TREE; else { - tree len = c_strlen (arg, 0); - + tree arr = NULL_TREE; + tree len = c_strlen (arg, 0, &arr); if (len) - return fold_convert_loc (loc, type, len); + { + if (loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (arg)) + loc = EXPR_LOCATION (arg); + + /* To avoid warning multiple times about non-nul-terminated + strings only warn if their length has been determined + and it's being folded. */ + if (arr) + warn_string_no_nul (loc, NULL_TREE, fndecl, arr); + + return fold_convert_loc (loc, type, len); + } return NULL_TREE; } @@ -9175,7 +9264,7 @@ fold_builtin_1 (location_t loc, tree fndecl, tree arg0) return fold_builtin_classify_type (arg0); case BUILT_IN_STRLEN: - return fold_builtin_strlen (loc, type, arg0); + return fold_builtin_strlen (loc, fndecl, type, arg0); CASE_FLT_FN (BUILT_IN_FABS): CASE_FLT_FN_FLOATN_NX (BUILT_IN_FABS): diff --git a/gcc/builtins.h b/gcc/builtins.h index 2e0a2f9..73b0b0b 100644 --- a/gcc/builtins.h +++ b/gcc/builtins.h @@ -58,7 +58,7 @@ extern bool get_pointer_alignment_1 (tree, unsigned int *, unsigned HOST_WIDE_INT *); extern unsigned int get_pointer_alignment (tree); extern unsigned string_length (const void*, unsigned, unsigned); -extern tree c_strlen (tree, int); +extern tree c_strlen (tree, int, tree * = NULL); extern void expand_builtin_setjmp_setup (rtx, rtx); extern void expand_builtin_setjmp_receiver (rtx); extern void expand_builtin_update_setjmp_buf (rtx); @@ -103,7 +103,7 @@ extern bool target_char_cst_p (tree t, char *p); extern internal_fn associated_internal_fn (tree); extern internal_fn replacement_internal_fn (gcall *); - +extern void warn_string_no_nul (location_t, tree, tree, tree); extern tree max_object_size (); #endif /* GCC_BUILTINS_H */ diff --git a/gcc/expr.c b/gcc/expr.c index de6709d..edbd7f8 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -11271,10 +11271,14 @@ is_aligning_offset (const_tree offset, const_tree exp) /* Return the tree node if an ARG corresponds to a string constant or zero if it doesn't. If we return nonzero, set *PTR_OFFSET to the (possibly non-constant) offset in bytes within the string that ARG is accessing. + If NULTERM is non-null, consider valid even sequences of characters that + aren't nul-terminated strings. In that case, set NULTERM if ARG refers + to such a sequence and clear it otherwise. The type of the offset is sizetype. */ tree -string_constant (tree arg, tree *ptr_offset) +string_constant (tree arg, tree *ptr_offset, bool *nulterm /* = NULL */, + tree *decl /* = NULL */) { tree array; STRIP_NOPS (arg); @@ -11328,7 +11332,7 @@ string_constant (tree arg, tree *ptr_offset) return NULL_TREE; tree offset; - if (tree str = string_constant (arg0, &offset)) + if (tree str = string_constant (arg0, &offset, nulterm, decl)) { /* Avoid pointers to arrays (see bug 86622). */ if (POINTER_TYPE_P (TREE_TYPE (arg)) @@ -11368,6 +11372,10 @@ string_constant (tree arg, tree *ptr_offset) if (TREE_CODE (array) == STRING_CST) { *ptr_offset = fold_convert (sizetype, offset); + if (nulterm) + *nulterm = true; + if (decl) + *decl = NULL_TREE; return array; } @@ -11414,6 +11422,49 @@ string_constant (tree arg, tree *ptr_offset) if (!array_size || TREE_CODE (array_size) != INTEGER_CST) return NULL_TREE; + unsigned HOST_WIDE_INT array_elts = tree_to_uhwi (array_size); + + /* When ARG refers to an aggregate (of arrays) try to determine + the size of the character array within the aggregate. */ + tree ref = arg; + tree reftype = TREE_TYPE (arg); + + if (TREE_CODE (ref) == MEM_REF) + { + ref = TREE_OPERAND (ref, 0); + if (TREE_CODE (ref) == ADDR_EXPR) + { + ref = TREE_OPERAND (ref, 0); + reftype = TREE_TYPE (ref); + } + } + else + while (TREE_CODE (ref) == ARRAY_REF) + { + reftype = TREE_TYPE (ref); + ref = TREE_OPERAND (ref, 0); + } + + if (TREE_CODE (ref) == COMPONENT_REF) + reftype = TREE_TYPE (ref); + + while (TREE_CODE (reftype) == ARRAY_TYPE) + { + tree next = TREE_TYPE (reftype); + if (TREE_CODE (next) == INTEGER_TYPE) + { + if (tree size = TYPE_SIZE_UNIT (reftype)) + if (tree_fits_uhwi_p (size)) + array_elts = tree_to_uhwi (size); + break; + } + + reftype = TREE_TYPE (reftype); + } + + if (decl) + *decl = array; + /* Avoid returning a string that doesn't fit in the array it is stored in, like const char a[4] = "abcde"; @@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset) unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init); length = string_length (TREE_STRING_POINTER (init), charsize, length / charsize); - if (compare_tree_int (array_size, length + 1) < 0) + if (nulterm) + *nulterm = array_elts > length; + else if (array_elts <= length) return NULL_TREE; *ptr_offset = offset; diff --git a/gcc/expr.h b/gcc/expr.h index cf047d4..e630979 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -288,7 +288,7 @@ expand_normal (tree exp) /* Return the tree node and offset if a given argument corresponds to a string constant. */ -extern tree string_constant (tree, tree *); +extern tree string_constant (tree, tree *, bool * = NULL, tree * = NULL); /* Two different ways of generating switch statements. */ extern int try_casesi (tree, tree, tree, tree, rtx, rtx, rtx, profile_probability); diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c index 06a42060..849a443 100644 --- a/gcc/fold-const-call.c +++ b/gcc/fold-const-call.c @@ -1199,9 +1199,14 @@ fold_const_call (combined_fn fn, tree type, tree arg) switch (fn) { case CFN_BUILT_IN_STRLEN: - if (const char *str = c_getstr (arg)) - return build_int_cst (type, strlen (str)); - return NULL_TREE; + { + bool nulterm; + if (const char *str = c_getstr (arg, NULL, &nulterm)) + if (nulterm) + return build_int_cst (type, strlen (str)); + + return NULL_TREE; + } CASE_CFN_NAN: CASE_FLT_FN_FLOATN_NX (CFN_BUILT_IN_NAN): diff --git a/gcc/fold-const.c b/gcc/fold-const.c index b318fc77..ecbc38c 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -14577,23 +14577,23 @@ fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE_INT off) /* Return a pointer P to a NUL-terminated string representing the sequence of constant characters referred to by SRC (or a subsequence of such characters within it if SRC is a reference to a string plus some - constant offset). If STRLEN is non-null, store stgrlen(P) in *STRLEN. - If STRSIZE is non-null, store in *STRSIZE the size of the array - the string is stored in; in that case, even though P points to a NUL - terminated string, SRC need not refer to one. This can happen when - SRC refers to a constant character array initialized to all non-NUL - values, as in the C declaration: char a[4] = "1234"; */ + constant offset). If STRSIZE is non-null, store the size of the string + literal in *STRSIZE, including any embedded or terminating nuls. If + If NULLTERM is non-null, set *NULLTERM if the referenced string is + guaranteed to contain a terminating NUL. Otherwise clear it. This + can happen in the case of valid C declarations such as: + const char a[3] = "123"; */ const char * -c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */, - unsigned HOST_WIDE_INT *strsize /* = NULL */) +c_getstr (tree src, unsigned HOST_WIDE_INT *strsize /* = NULL */, + bool *nulterm /* = NULL */) { tree offset_node; - if (strlen) - *strlen = 0; + if (strsize) + *strsize = 0; - src = string_constant (src, &offset_node); + src = string_constant (src, &offset_node, nulterm); if (src == 0) return NULL; @@ -14606,47 +14606,42 @@ c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */, offset = tree_to_uhwi (offset_node); } - /* STRING_LENGTH is the size of the string literal, including any - embedded NULs. STRING_SIZE is the size of the array the string + /* STRING_SIZE is the size of the string literal, including any + embedded NULs. ARRAY_SIZE is the size of the array the string literal is stored in. */ - unsigned HOST_WIDE_INT string_length = TREE_STRING_LENGTH (src); - unsigned HOST_WIDE_INT string_size = string_length; + unsigned HOST_WIDE_INT string_size = TREE_STRING_LENGTH (src); + unsigned HOST_WIDE_INT array_size = string_size; tree type = TREE_TYPE (src); if (tree size = TYPE_SIZE_UNIT (type)) if (tree_fits_shwi_p (size)) - string_size = tree_to_uhwi (size); + array_size = tree_to_uhwi (size); + + const char *string = TREE_STRING_POINTER (src); - if (strlen) + if (strsize) { - /* Compute and store the length of the substring at OFFSET. + /* Compute and store the size of the substring at OFFSET. All offsets past the initial length refer to null strings. */ - if (offset <= string_length) - *strlen = string_length - offset; + if (offset <= string_size) + *strsize = string_size - offset; else - *strlen = 0; + *strsize = 0; } - const char *string = TREE_STRING_POINTER (src); - - if (string_length == 0 - || offset >= string_size) + if (string_size == 0 + || offset >= array_size) return NULL; - if (strsize) - { - /* Support even constant character arrays that aren't proper - NUL-terminated strings. */ - *strsize = string_size; - } - else if (string[string_length - 1] != '\0') + if (!nulterm && string[string_size - 1] != '\0') { - /* Support only properly NUL-terminated strings but handle - consecutive strings within the same array, such as the six - substrings in "1\0002\0003". */ + /* When NULTERM is null, support only properly nul-terminated + strings but handle consecutive strings within the same array, + such as the six substrings in "1\0002\0003". Otherwise, let + the caller deal with non-nul-terminated arrays. */ return NULL; } - return offset <= string_length ? string + offset : ""; + return offset <= string_size ? string + offset : ""; } /* Given a tree T, compute which bits in T may be nonzero. */ diff --git a/gcc/fold-const.h b/gcc/fold-const.h index 1b9ccc0..a58a4a2 100644 --- a/gcc/fold-const.h +++ b/gcc/fold-const.h @@ -188,7 +188,7 @@ extern tree const_unop (enum tree_code, tree, tree); extern tree const_binop (enum tree_code, tree, tree, tree); extern bool negate_mathfn_p (combined_fn); extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL, - unsigned HOST_WIDE_INT * = NULL); + bool * = NULL); extern wide_int tree_nonzero_bits (const_tree); /* Return OFF converted to a pointer offset type suitable as offset for diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index c3fa570..9eefb37 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -1275,11 +1275,13 @@ gimple_fold_builtin_memset (gimple_stmt_iterator *gsi, tree c, tree len) Set *FLEXP to true if the range of the string lengths has been obtained from the upper bound of an array at the end of a struct. Such an array may hold a string that's longer than its upper bound - due to it being used as a poor-man's flexible array member. */ + due to it being used as a poor-man's flexible array member. + Clear *NULTERM if ARG refers to a constant array that is known + not be nul-terminated. */ static bool get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, - int fuzzy, bool *flexp) + int fuzzy, bool *flexp, bool *nulterm) { tree var, val = NULL_TREE; gimple *def_stmt; @@ -1301,7 +1303,8 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, if (TREE_CODE (aop0) == INDIRECT_REF && TREE_CODE (TREE_OPERAND (aop0, 0)) == SSA_NAME) return get_range_strlen (TREE_OPERAND (aop0, 0), - length, visited, type, fuzzy, flexp); + length, visited, type, fuzzy, flexp, + nulterm); } else if (TREE_CODE (TREE_OPERAND (op, 0)) == COMPONENT_REF && fuzzy) { @@ -1329,13 +1332,18 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, return false; } else - val = c_strlen (arg, 1); + { + tree arr; + val = c_strlen (arg, 1, &arr); + if (val && arr) + *nulterm = false; + } if (!val && fuzzy) { if (TREE_CODE (arg) == ADDR_EXPR) return get_range_strlen (TREE_OPERAND (arg, 0), length, - visited, type, fuzzy, flexp); + visited, type, fuzzy, flexp, nulterm); if (TREE_CODE (arg) == ARRAY_REF) { @@ -1477,7 +1485,8 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, || gimple_assign_unary_nop_p (def_stmt)) { tree rhs = gimple_assign_rhs1 (def_stmt); - return get_range_strlen (rhs, length, visited, type, fuzzy, flexp); + return get_range_strlen (rhs, length, visited, type, fuzzy, flexp, + nulterm); } else if (gimple_assign_rhs_code (def_stmt) == COND_EXPR) { @@ -1486,7 +1495,7 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, for (unsigned int i = 0; i < 2; i++) if (!get_range_strlen (ops[i], length, visited, type, fuzzy, - flexp)) + flexp, nulterm)) { if (fuzzy == 2) *maxlen = build_all_ones_cst (size_type_node); @@ -1513,7 +1522,8 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, if (arg == gimple_phi_result (def_stmt)) continue; - if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp)) + if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp, + nulterm)) { if (fuzzy == 2) *maxlen = build_all_ones_cst (size_type_node); @@ -1545,19 +1555,27 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, and false if PHIs and COND_EXPRs are to be handled optimistically, if we can determine string length minimum and maximum; it will use the minimum from the ones where it can be determined. - STRICT false should be only used for warning code. */ + STRICT false should be only used for warning code. + When non-null, clear *NULTERM if ARG refers to a constant array + that is known not be nul-terminated. Otherwise set it to true. */ bool -get_range_strlen (tree arg, tree minmaxlen[2], bool strict) +get_range_strlen (tree arg, tree minmaxlen[2], bool strict /* = false */, + bool *nulterm /* = NULL */) { bitmap visited = NULL; minmaxlen[0] = NULL_TREE; minmaxlen[1] = NULL_TREE; + bool nultermbuf; + if (!nulterm) + nulterm = &nultermbuf; + *nulterm = true; + bool flexarray = false; if (!get_range_strlen (arg, minmaxlen, &visited, 1, strict ? 1 : 2, - &flexarray)) + &flexarray, nulterm)) { minmaxlen[0] = NULL_TREE; minmaxlen[1] = NULL_TREE; @@ -1576,7 +1594,7 @@ get_maxval_strlen (tree arg, int type) tree len[2] = { NULL_TREE, NULL_TREE }; bool dummy; - if (!get_range_strlen (arg, len, &visited, type, 0, &dummy)) + if (!get_range_strlen (arg, len, &visited, type, 0, &dummy, NULL)) len[1] = NULL_TREE; if (visited) BITMAP_FREE (visited); @@ -3496,12 +3514,14 @@ static bool gimple_fold_builtin_strlen (gimple_stmt_iterator *gsi) { gimple *stmt = gsi_stmt (*gsi); + tree arg = gimple_call_arg (stmt, 0); wide_int minlen; wide_int maxlen; tree lenrange[2]; - if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange, true) + bool nulterm; + if (!get_range_strlen (arg, lenrange, true, &nulterm) && lenrange[0] && TREE_CODE (lenrange[0]) == INTEGER_CST && lenrange[1] && TREE_CODE (lenrange[1]) == INTEGER_CST) { @@ -3523,6 +3543,10 @@ gimple_fold_builtin_strlen (gimple_stmt_iterator *gsi) if (minlen == maxlen) { + if (!nulterm) + warn_string_no_nul (gimple_location (stmt), NULL_TREE, + gimple_call_fndecl (stmt), arg); + lenrange[0] = force_gimple_operand_gsi (gsi, lenrange[0], true, NULL, true, GSI_SAME_STMT); replace_call_with_value (gsi, lenrange[0]); diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h index 04e9bfa..fe11728 100644 --- a/gcc/gimple-fold.h +++ b/gcc/gimple-fold.h @@ -25,7 +25,7 @@ along with GCC; see the file COPYING3. If not see extern tree create_tmp_reg_or_ssa_name (tree, gimple *stmt = NULL); extern tree canonicalize_constructor_val (tree, tree); extern tree get_symbol_constant_value (tree); -extern bool get_range_strlen (tree, tree[2], bool = false); +extern bool get_range_strlen (tree, tree[2], bool = false, bool * = NULL); extern tree get_maxval_strlen (tree, int); extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree); extern bool fold_stmt (gimple_stmt_iterator *); diff --git a/gcc/testsuite/gcc.dg/warn-string-no-nul.c b/gcc/testsuite/gcc.dg/warn-string-no-nul.c new file mode 100644 index 0000000..838528f --- /dev/null +++ b/gcc/testsuite/gcc.dg/warn-string-no-nul.c @@ -0,0 +1,239 @@ +/* PR tree-optimization/86552 - missing warning for reading past the end + of non-string arrays + { dg-do compile } + { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */ + +extern __SIZE_TYPE__ strlen (const char*); + +const char a[5] = "12345"; /* { dg-message "declared here" } */ + +int i0 = 0; +int i1 = 1; + +void sink (int, ...); + +#define CONCAT(a, b) a ## b +#define CAT(a, b) CONCAT(a, b) + +#define T(str) \ + __attribute__ ((noipa)) \ + void CAT (test_, __LINE__) (void) { \ + sink (strlen (str)); \ + } typedef void dummy_type + +T (a); /* { dg-warning "argument missing terminating nul" } */ +T (&a[0]); /* { dg-warning "nul" } */ +T (&a[0] + 1); /* { dg-warning "nul" } */ +T (&a[1]); /* { dg-warning "nul" } */ +T (&a[i0]); /* { dg-warning "nul" } */ +T (&a[i0] + 1); /* { dg-warning "nul" } */ + + +const char b[][5] = { /* { dg-message "declared here" } */ + "12", "123", "1234", "54321" +}; + +T (b[0]); +T (b[1]); +T (b[2]); +T (b[3]); /* { dg-warning "nul" } */ +T (b[i0]); + +T (&b[2][1]); +T (&b[2][1] + 1); +T (&b[2][i0]); +T (&b[2][1] + i0); + +T (&b[3][1]); /* { dg-warning "nul" } */ +T (&b[3][1] + 1); /* { dg-warning "nul" } */ +T (&b[3][i0]); /* { dg-warning "nul" } */ +T (&b[3][1] + i0); /* { dg-warning "nul" } */ +T (&b[3][i0] + i1); /* { dg-warning "nul" } */ + +T (i0 ? "" : b[0]); +T (i0 ? "" : b[1]); +T (i0 ? "" : b[2]); +T (i0 ? "" : b[3]); /* { dg-warning "nul" } */ +T (i0 ? b[0] : ""); +T (i0 ? b[1] : ""); +T (i0 ? b[2] : ""); +T (i0 ? b[3] : ""); /* { dg-warning "nul" } */ + +T (i0 ? "1234" : b[3]); /* { dg-warning "nul" } */ +T (i0 ? b[3] : "1234"); /* { dg-warning "nul" } */ + +T (i0 ? a : b[3]); /* { dg-warning "nul" } */ +T (i0 ? b[0] : b[2]); +T (i0 ? b[2] : b[3]); /* { dg-warning "nul" } */ +T (i0 ? b[3] : b[2]); /* { dg-warning "nul" } */ + +T (i0 ? b[0] : &b[3][0] + 1); /* { dg-warning "nul" } */ +T (i0 ? b[1] : &b[3][1] + i0); /* { dg-warning "nul" } */ + +/* It's possible to detect the missing nul in the following two + expressions but GCC doesn't do it yet. */ +T (i0 ? &b[3][1] + i0 : b[2]); /* { dg-warning "nul" "bug" { xfail *-*-* } } */ +T (i0 ? &b[3][i0] : &b[3][i1]); /* { dg-warning "nul" "bug" { xfail *-*-* } } */ + + +struct A { char a[5], b[5]; }; + +const struct A s = { "1234", "12345" }; + +T (s.a); +T (&s.a[0]); +T (&s.a[0] + 1); +T (&s.a[0] + i0); +T (&s.a[1]); +T (&s.a[1] + 1); +T (&s.a[1] + i0); + +T (s.b); /* { dg-warning "nul" } */ +T (&s.b[0]); /* { dg-warning "nul" } */ +T (&s.b[0] + 1); /* { dg-warning "nul" } */ +T (&s.b[0] + i0); /* { dg-warning "nul" } */ +T (&s.b[1]); /* { dg-warning "nul" } */ +T (&s.b[1] + 1); /* { dg-warning "nul" } */ +T (&s.b[1] + i0); /* { dg-warning "nul" } */ + +struct B { struct A a[2]; }; + +const struct B ba[] = { + { { { "123", "12345" }, { "12345", "123" } } }, + { { { "12345", "123" }, { "123", "12345" } } }, + { { { "1", "12" }, { "123", "1234" } } }, + { { { "123", "1234" }, { "12345", "12" } } } +}; + +T (ba[0].a[0].a); +T (&ba[0].a[0].a[0]); +T (&ba[0].a[0].a[0] + 1); +T (&ba[0].a[0].a[0] + i0); +T (&ba[0].a[0].a[1]); +T (&ba[0].a[0].a[1] + 1); +T (&ba[0].a[0].a[1] + i0); + +T (ba[0].a[0].b); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[0]); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[0] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[0] + i0); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[1]); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[1] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[1] + i0); /* { dg-warning "nul" } */ + +T (ba[0].a[1].a); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[0]); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[0] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[0] + i0); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[1]); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[1] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[1] + i0); /* { dg-warning "nul" } */ + +T (ba[0].a[1].b); +T (&ba[0].a[1].b[0]); +T (&ba[0].a[1].b[0] + 1); +T (&ba[0].a[1].b[0] + i0); +T (&ba[0].a[1].b[1]); +T (&ba[0].a[1].b[1] + 1); +T (&ba[0].a[1].b[1] + i0); + + +T (ba[1].a[0].a); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[0]); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[0] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[0] + i0); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[1]); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[1] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[1] + i0); /* { dg-warning "nul" } */ + +T (ba[1].a[0].b); +T (&ba[1].a[0].b[0]); +T (&ba[1].a[0].b[0] + 1); +T (&ba[1].a[0].b[0] + i0); +T (&ba[1].a[0].b[1]); +T (&ba[1].a[0].b[1] + 1); +T (&ba[1].a[0].b[1] + i0); + +T (ba[1].a[1].a); +T (&ba[1].a[1].a[0]); +T (&ba[1].a[1].a[0] + 1); +T (&ba[1].a[1].a[0] + i0); +T (&ba[1].a[1].a[1]); +T (&ba[1].a[1].a[1] + 1); +T (&ba[1].a[1].a[1] + i0); + +T (ba[1].a[1].b); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[0]); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[0] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[0] + i0); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[1]); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[1] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[1] + i0); /* { dg-warning "nul" } */ + + +T (ba[2].a[0].a); +T (&ba[2].a[0].a[0]); +T (&ba[2].a[0].a[0] + 1); +T (&ba[2].a[0].a[0] + i0); +T (&ba[2].a[0].a[1]); +T (&ba[2].a[0].a[1] + 1); +T (&ba[2].a[0].a[1] + i0); + +T (ba[2].a[0].b); +T (&ba[2].a[0].b[0]); +T (&ba[2].a[0].b[0] + 1); +T (&ba[2].a[0].b[0] + i0); +T (&ba[2].a[0].b[1]); +T (&ba[2].a[0].b[1] + 1); +T (&ba[2].a[0].b[1] + i0); + +T (ba[2].a[1].a); +T (&ba[2].a[1].a[0]); +T (&ba[2].a[1].a[0] + 1); +T (&ba[2].a[1].a[0] + i0); +T (&ba[2].a[1].a[1]); +T (&ba[2].a[1].a[1] + 1); +T (&ba[2].a[1].a[1] + i0); + + +T (ba[3].a[0].a); +T (&ba[3].a[0].a[0]); +T (&ba[3].a[0].a[0] + 1); +T (&ba[3].a[0].a[0] + i0); +T (&ba[3].a[0].a[1]); +T (&ba[3].a[0].a[1] + 1); +T (&ba[3].a[0].a[1] + i0); + +T (ba[3].a[0].b); +T (&ba[3].a[0].b[0]); +T (&ba[3].a[0].b[0] + 1); +T (&ba[3].a[0].b[0] + i0); +T (&ba[3].a[0].b[1]); +T (&ba[3].a[0].b[1] + 1); +T (&ba[3].a[0].b[1] + i0); + +T (ba[3].a[1].a); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[0]); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[0] + 1); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[0] + i0); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[1]); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[1] + 1); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[1] + i0); /* { dg-warning "nul" } */ + +T (ba[3].a[1].b); +T (&ba[3].a[1].b[0]); +T (&ba[3].a[1].b[0] + 1); +T (&ba[3].a[1].b[0] + i0); +T (&ba[3].a[1].b[1]); +T (&ba[3].a[1].b[1] + 1); +T (&ba[3].a[1].b[1] + i0); + + +T (i0 ? ba[0].a[0].a : ba[0].a[0].b); /* { dg-warning "nul" } */ +T (i0 ? ba[0].a[0].a : ba[0].a[0].b); /* { dg-warning "nul" } */ + +T (i0 ? &ba[0].a[0].a[0] : &ba[3].a[1].a[0]); /* { dg-warning "nul" } */ +T (i0 ? &ba[3].a[1].a[1] : ba[0].a[0].a); /* { dg-warning "nul" } */ + +T (i0 ? ba[0].a[0].a : ba[0].a[1].b); +T (i0 ? ba[0].a[1].b : ba[0].a[0].a);
PR tree-optimization/86552 - missing warning for reading past the end of non-string arrays gcc/ChangeLog: PR tree-optimization/86552 * builtins.c (warn_string_no_nul): New function. (string_length): Add argument and use it. (c_strlen): Same. (expand_builtin_strlen): Detect missing nul. (fold_builtin_1): Adjust. * builtins.h (c_strlen): Add argument. * expr.c (string_constant): Add arguments. Detect missing nul terminator and outermost declaration it's missing in. * expr.h (string_constant): Add argument. * fold-const.c (c_getstr): Revert test. gcc/testsuite/ChangeLog: PR tree-optimization/86552 * gcc.dg/warn-string-no-nul.c: New test. diff --git a/gcc/builtins.c b/gcc/builtins.c index 03cf012..9885c4b 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -150,7 +150,7 @@ static tree stabilize_va_list_loc (location_t, tree, int); static rtx expand_builtin_expect (tree, rtx); static tree fold_builtin_constant_p (tree); static tree fold_builtin_classify_type (tree); -static tree fold_builtin_strlen (location_t, tree, tree); +static tree fold_builtin_strlen (location_t, tree, tree, tree); static tree fold_builtin_inf (location_t, tree, int); static tree rewrite_call_expr (location_t, tree, int, tree, int, ...); static bool validate_arg (const_tree, enum tree_code code); @@ -550,6 +550,36 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts) return n; } +/* For a call expression EXP to a function that expects a string argument, + issue a diagnostic due to it being a called with an argument NONSTR + that is a character array with no terminating NUL. */ + +static void +warn_string_no_nul (location_t loc, tree exp, tree fndecl, tree nonstr) +{ + loc = expansion_point_location_if_in_system_header (loc); + + bool warned; + if (exp) + { + if (!fndecl) + fndecl = get_callee_fndecl (exp); + warned = warning_at (loc, OPT_Wstringop_overflow_, + "%K%qD argument missing terminating nul", + exp, fndecl); + } + else + { + gcc_assert (fndecl); + warned = warning_at (loc, OPT_Wstringop_overflow_, + "%qD argument missing terminating nul", + fndecl); + } + + if (warned && DECL_P (nonstr)) + inform (DECL_SOURCE_LOCATION (nonstr), "referenced argument declared here"); +} + /* Compute the length of a null-terminated character string or wide character string handling character sizes of 1, 2, and 4 bytes. TREE_STRING_LENGTH is not the right way because it evaluates to @@ -567,13 +597,17 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts) accesses. Note that this implies the result is not going to be emitted into the instruction stream. + When ARR is non-null and the string is not properly nul-terminated, + set *ARR to the declaration of the outermost constant object whose + initializer (or one of its elements) is not nul-terminated. + The value returned is of type `ssizetype'. Unfortunately, string_constant can't access the values of const char arrays with initializers, so neither can we do so here. */ tree -c_strlen (tree src, int only_value) +c_strlen (tree src, int only_value, tree *arr /* = NULL */) { STRIP_NOPS (src); if (TREE_CODE (src) == COND_EXPR @@ -581,24 +615,31 @@ c_strlen (tree src, int only_value) { tree len1, len2; - len1 = c_strlen (TREE_OPERAND (src, 1), only_value); - len2 = c_strlen (TREE_OPERAND (src, 2), only_value); + len1 = c_strlen (TREE_OPERAND (src, 1), only_value, arr); + len2 = c_strlen (TREE_OPERAND (src, 2), only_value, arr); if (tree_int_cst_equal (len1, len2)) return len1; } if (TREE_CODE (src) == COMPOUND_EXPR && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0)))) - return c_strlen (TREE_OPERAND (src, 1), only_value); + return c_strlen (TREE_OPERAND (src, 1), only_value, arr); location_t loc = EXPR_LOC_OR_LOC (src, input_location); /* Offset from the beginning of the string in bytes. */ tree byteoff; - src = string_constant (src, &byteoff); - if (src == 0) + /* Set if array is nul-terminated, false otherwise. */ + bool nulterm; + src = string_constant (src, &byteoff, &nulterm, arr); + if (!src) return NULL_TREE; + /* Clear *ARR when the string is nul-terminated. It should be + of no interest to callers. */ + if (nulterm && arr) + *arr = NULL_TREE; + /* Determine the size of the string element. */ unsigned eltsize = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (src)))); @@ -650,7 +691,8 @@ c_strlen (tree src, int only_value) offsave = fold_convert (ssizetype, offsave); tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, offsave, build_int_cst (ssizetype, len * eltsize)); - tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), offsave); + tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), + offsave); return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp, build_zero_cst (ssizetype)); } @@ -690,7 +732,7 @@ c_strlen (tree src, int only_value) Since ELTOFF is our starting index into the string, no further calculation is needed. */ unsigned len = string_length (ptr + eltoff * eltsize, eltsize, - maxelts - eltoff); + strelts - eltoff); return ssize_int (len); } @@ -2855,7 +2897,6 @@ expand_builtin_strlen (tree exp, rtx target, struct expand_operand ops[4]; rtx pat; - tree len; tree src = CALL_EXPR_ARG (exp, 0); rtx src_reg; rtx_insn *before_strlen; @@ -2864,20 +2905,37 @@ expand_builtin_strlen (tree exp, rtx target, unsigned int align; /* If the length can be computed at compile-time, return it. */ - len = c_strlen (src, 0); + tree array; + tree len = c_strlen (src, 0, &array); if (len) - return expand_expr (len, target, target_mode, EXPAND_NORMAL); + { + if (array) + { + warn_string_no_nul (EXPR_LOCATION (exp), exp, NULL_TREE, array); + return NULL_RTX; + } + return expand_expr (len, target, target_mode, EXPAND_NORMAL); + } /* If the length can be computed at compile-time and is constant integer, but there are side-effects in src, evaluate src for side-effects, then return len. E.g. x = strlen (i++ ? "xfoo" + 1 : "bar"); can be optimized into: i++; x = 3; */ - len = c_strlen (src, 1); - if (len && TREE_CODE (len) == INTEGER_CST) + len = c_strlen (src, 1, &array); + if (len) { - expand_expr (src, const0_rtx, VOIDmode, EXPAND_NORMAL); - return expand_expr (len, target, target_mode, EXPAND_NORMAL); + if (array) + { + warn_string_no_nul (EXPR_LOCATION (exp), exp, NULL_TREE, array); + return NULL_RTX; + } + + if (TREE_CODE (len) == INTEGER_CST) + { + expand_expr (src, const0_rtx, VOIDmode, EXPAND_NORMAL); + return expand_expr (len, target, target_mode, EXPAND_NORMAL); + } } align = get_pointer_alignment (src) / BITS_PER_UNIT; @@ -8238,19 +8296,27 @@ fold_builtin_classify_type (tree arg) return build_int_cst (integer_type_node, type_to_class (TREE_TYPE (arg))); } -/* Fold a call to __builtin_strlen with argument ARG. */ +/* Fold a strlen call to FNDECL of TYPE, and with argument ARG. */ static tree -fold_builtin_strlen (location_t loc, tree type, tree arg) +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg) { if (!validate_arg (arg, POINTER_TYPE)) return NULL_TREE; else { - tree len = c_strlen (arg, 0); - + tree arr = NULL_TREE; + tree len = c_strlen (arg, 0, &arr); if (len) - return fold_convert_loc (loc, type, len); + { + /* To avoid warning multiple times about non-nul-terminated + strings only warn if their length has been determined + and it's being folded. */ + if (arr) + warn_string_no_nul (loc, NULL_TREE, fndecl, arr); + + return fold_convert_loc (loc, type, len); + } return NULL_TREE; } @@ -9158,7 +9224,7 @@ fold_builtin_1 (location_t loc, tree fndecl, tree arg0) return fold_builtin_classify_type (arg0); case BUILT_IN_STRLEN: - return fold_builtin_strlen (loc, type, arg0); + return fold_builtin_strlen (loc, fndecl, type, arg0); CASE_FLT_FN (BUILT_IN_FABS): CASE_FLT_FN_FLOATN_NX (BUILT_IN_FABS): diff --git a/gcc/builtins.h b/gcc/builtins.h index c922904..9446c09 100644 --- a/gcc/builtins.h +++ b/gcc/builtins.h @@ -57,7 +57,7 @@ extern unsigned int get_object_alignment (tree); extern bool get_pointer_alignment_1 (tree, unsigned int *, unsigned HOST_WIDE_INT *); extern unsigned int get_pointer_alignment (tree); -extern tree c_strlen (tree, int); +extern tree c_strlen (tree, int, tree * = NULL); extern void expand_builtin_setjmp_setup (rtx, rtx); extern void expand_builtin_setjmp_receiver (rtx); extern void expand_builtin_update_setjmp_buf (rtx); diff --git a/gcc/expr.c b/gcc/expr.c index 79ead3d..79bcbbe 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -11271,10 +11271,14 @@ is_aligning_offset (const_tree offset, const_tree exp) /* Return the tree node if an ARG corresponds to a string constant or zero if it doesn't. If we return nonzero, set *PTR_OFFSET to the (possibly non-constant) offset in bytes within the string that ARG is accessing. + If NULTERM is non-null, consider valid even sequences of characters that + aren't nul-terminated strings. In that case, set NULTERM if ARG refers + to such a sequence and clear it otherwise. The type of the offset is sizetype. */ tree -string_constant (tree arg, tree *ptr_offset) +string_constant (tree arg, tree *ptr_offset, bool *nulterm /* = NULL */, + tree *decl /* = NULL */) { tree array; STRIP_NOPS (arg); @@ -11335,7 +11339,7 @@ string_constant (tree arg, tree *ptr_offset) return NULL_TREE; tree offset; - if (tree str = string_constant (arg0, &offset)) + if (tree str = string_constant (arg0, &offset, nulterm, decl)) { tree type = TREE_TYPE (arg1); *ptr_offset = fold_build2 (PLUS_EXPR, type, offset, arg1); @@ -11357,12 +11361,10 @@ string_constant (tree arg, tree *ptr_offset) if (TREE_CODE (TREE_TYPE (array)) != ARRAY_TYPE) return NULL_TREE; - while (TREE_CODE (chartype) == ARRAY_TYPE - || TREE_CODE (chartype) == POINTER_TYPE) - chartype = TREE_TYPE (chartype); + gcc_assert (TREE_CODE (chartype) == POINTER_TYPE); - if (TREE_CODE (chartype) != INTEGER_TYPE) - return NULL; + while (TREE_CODE (chartype) != INTEGER_TYPE) + chartype = TREE_TYPE (chartype); /* Set the non-constant offset to the non-constant index scaled by the size of the character type. */ @@ -11374,6 +11376,8 @@ string_constant (tree arg, tree *ptr_offset) if (TREE_CODE (array) == STRING_CST) { *ptr_offset = fold_convert (sizetype, offset); + if (decl) + *decl = NULL_TREE; return array; } @@ -11420,6 +11424,38 @@ string_constant (tree arg, tree *ptr_offset) if (!array_size || TREE_CODE (array_size) != INTEGER_CST) return NULL_TREE; + unsigned HOST_WIDE_INT array_elts = tree_to_uhwi (array_size); + + /* When ARG refers to an aggregate (of arrays) determine the size + of the character array within the aggregate. */ + tree ref = arg; + tree reftype = TREE_TYPE (arg); + while (TREE_CODE (ref) == ARRAY_REF) + { + reftype = TREE_TYPE (ref); + ref = TREE_OPERAND (ref, 0); + } + + if (TREE_CODE (ref) == COMPONENT_REF) + reftype = TREE_TYPE (ref); + + while (TREE_CODE (reftype) == ARRAY_TYPE) + { + tree next = TREE_TYPE (reftype); + if (TREE_CODE (next) == INTEGER_TYPE) + { + if (tree size = TYPE_SIZE_UNIT (reftype)) + if (tree_fits_uhwi_p (size)) + array_elts = tree_to_uhwi (size); + break; + } + + reftype = TREE_TYPE (reftype); + } + + if (decl) + *decl = array; + /* Avoid returning a string that doesn't fit in the array it is stored in, like const char a[4] = "abcde"; @@ -11430,7 +11466,9 @@ string_constant (tree arg, tree *ptr_offset) but not to strlen(). */ unsigned HOST_WIDE_INT length = strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init)); - if (compare_tree_int (array_size, length + 1) < 0) + if (nulterm) + *nulterm = array_elts > length; + else if (array_elts <= length) return NULL_TREE; *ptr_offset = offset; diff --git a/gcc/expr.h b/gcc/expr.h index cf047d4..e630979 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -288,7 +288,7 @@ expand_normal (tree exp) /* Return the tree node and offset if a given argument corresponds to a string constant. */ -extern tree string_constant (tree, tree *); +extern tree string_constant (tree, tree *, bool * = NULL, tree * = NULL); /* Two different ways of generating switch statements. */ extern int try_casesi (tree, tree, tree, tree, rtx, rtx, rtx, profile_probability); diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 15bbf95..b318fc77 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -14638,8 +14638,7 @@ c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */, NUL-terminated strings. */ *strsize = string_size; } - else if (string_size < string_length - || string[string_length - 1] != '\0') + else if (string[string_length - 1] != '\0') { /* Support only properly NUL-terminated strings but handle consecutive strings within the same array, such as the six diff --git a/gcc/testsuite/gcc.dg/warn-string-no-nul.c b/gcc/testsuite/gcc.dg/warn-string-no-nul.c new file mode 100644 index 0000000..e470ade --- /dev/null +++ b/gcc/testsuite/gcc.dg/warn-string-no-nul.c @@ -0,0 +1,200 @@ +/* PR tree-optimization/86552 - missing warning for reading past the end + of non-string arrays + { dg-do compile } + { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */ + +extern __SIZE_TYPE__ strlen (const char*); + +const char a[5] = "12345"; /* { dg-message "declared here" } */ + +int i0 = 0; + +void sink (int, ...); + +#define CONCAT(a, b) a ## b +#define CAT(a, b) CONCAT(a, b) + +#define T(str) \ + __attribute__ ((noipa)) \ + void CAT (test_, __LINE__) (void) { \ + sink (strlen (str)); \ + } typedef void dummy_type + +T (a); /* { dg-warning "argument missing terminating nul" } */ +T (&a[0]); /* { dg-warning "nul" } */ +T (&a[0] + 1); /* { dg-warning "nul" } */ +T (&a[1]); /* { dg-warning "nul" } */ +T (&a[i0]); /* { dg-warning "nul" } */ + + +const char b[][5] = { /* { dg-message "declared here" } */ + "12", "123", "1234", "54321" +}; + +T (b[0]); +T (b[1]); +T (b[2]); +T (b[3]); /* { dg-warning "nul" } */ + +T (&b[2][1]); +T (&b[2][1] + 1); +T (&b[2][i0]); +T (&b[2][1] + i0); + +T (&b[3][1]); /* { dg-warning "nul" } */ +T (&b[3][1] + 1); /* { dg-warning "nul" } */ +T (&b[3][i0]); /* { dg-warning "nul" } */ +T (&b[3][1] + i0); /* { dg-warning "nul" } */ + + +struct A { char a[5], b[5]; }; + +const struct A s = { "1234", "12345" }; + +T (s.a); +T (&s.a[0]); +T (&s.a[0] + 1); +T (&s.a[0] + i0); +T (&s.a[1]); +T (&s.a[1] + 1); +T (&s.a[1] + i0); + +T (s.b); /* { dg-warning "nul" } */ +T (&s.b[0]); /* { dg-warning "nul" } */ +T (&s.b[0] + 1); /* { dg-warning "nul" } */ +T (&s.b[0] + i0); /* { dg-warning "nul" } */ +T (&s.b[1]); /* { dg-warning "nul" } */ +T (&s.b[1] + 1); /* { dg-warning "nul" } */ +T (&s.b[1] + i0); /* { dg-warning "nul" } */ + +struct B { struct A a[2]; }; + +const struct B ba[] = { + { { { "123", "12345" }, { "12345", "123" } } }, + { { { "12345", "123" }, { "123", "12345" } } }, + { { { "1", "12" }, { "123", "1234" } } }, + { { { "123", "1234" }, { "12345", "12" } } } +}; + +T (ba[0].a[0].a); +T (&ba[0].a[0].a[0]); +T (&ba[0].a[0].a[0] + 1); +T (&ba[0].a[0].a[0] + i0); +T (&ba[0].a[0].a[1]); +T (&ba[0].a[0].a[1] + 1); +T (&ba[0].a[0].a[1] + i0); + +T (ba[0].a[0].b); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[0]); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[0] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[0] + i0); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[1]); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[1] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[0].b[1] + i0); /* { dg-warning "nul" } */ + +T (ba[0].a[1].a); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[0]); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[0] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[0] + i0); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[1]); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[1] + 1); /* { dg-warning "nul" } */ +T (&ba[0].a[1].a[1] + i0); /* { dg-warning "nul" } */ + +T (ba[0].a[1].b); +T (&ba[0].a[1].b[0]); +T (&ba[0].a[1].b[0] + 1); +T (&ba[0].a[1].b[0] + i0); +T (&ba[0].a[1].b[1]); +T (&ba[0].a[1].b[1] + 1); +T (&ba[0].a[1].b[1] + i0); + + +T (ba[1].a[0].a); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[0]); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[0] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[0] + i0); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[1]); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[1] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[0].a[1] + i0); /* { dg-warning "nul" } */ + +T (ba[1].a[0].b); +T (&ba[1].a[0].b[0]); +T (&ba[1].a[0].b[0] + 1); +T (&ba[1].a[0].b[0] + i0); +T (&ba[1].a[0].b[1]); +T (&ba[1].a[0].b[1] + 1); +T (&ba[1].a[0].b[1] + i0); + +T (ba[1].a[1].a); +T (&ba[1].a[1].a[0]); +T (&ba[1].a[1].a[0] + 1); +T (&ba[1].a[1].a[0] + i0); +T (&ba[1].a[1].a[1]); +T (&ba[1].a[1].a[1] + 1); +T (&ba[1].a[1].a[1] + i0); + +T (ba[1].a[1].b); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[0]); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[0] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[0] + i0); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[1]); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[1] + 1); /* { dg-warning "nul" } */ +T (&ba[1].a[1].b[1] + i0); /* { dg-warning "nul" } */ + + +T (ba[2].a[0].a); +T (&ba[2].a[0].a[0]); +T (&ba[2].a[0].a[0] + 1); +T (&ba[2].a[0].a[0] + i0); +T (&ba[2].a[0].a[1]); +T (&ba[2].a[0].a[1] + 1); +T (&ba[2].a[0].a[1] + i0); + +T (ba[2].a[0].b); +T (&ba[2].a[0].b[0]); +T (&ba[2].a[0].b[0] + 1); +T (&ba[2].a[0].b[0] + i0); +T (&ba[2].a[0].b[1]); +T (&ba[2].a[0].b[1] + 1); +T (&ba[2].a[0].b[1] + i0); + +T (ba[2].a[1].a); +T (&ba[2].a[1].a[0]); +T (&ba[2].a[1].a[0] + 1); +T (&ba[2].a[1].a[0] + i0); +T (&ba[2].a[1].a[1]); +T (&ba[2].a[1].a[1] + 1); +T (&ba[2].a[1].a[1] + i0); + + +T (ba[3].a[0].a); +T (&ba[3].a[0].a[0]); +T (&ba[3].a[0].a[0] + 1); +T (&ba[3].a[0].a[0] + i0); +T (&ba[3].a[0].a[1]); +T (&ba[3].a[0].a[1] + 1); +T (&ba[3].a[0].a[1] + i0); + +T (ba[3].a[0].b); +T (&ba[3].a[0].b[0]); +T (&ba[3].a[0].b[0] + 1); +T (&ba[3].a[0].b[0] + i0); +T (&ba[3].a[0].b[1]); +T (&ba[3].a[0].b[1] + 1); +T (&ba[3].a[0].b[1] + i0); + +T (ba[3].a[1].a); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[0]); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[0] + 1); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[0] + i0); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[1]); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[1] + 1); /* { dg-warning "nul" } */ +T (&ba[3].a[1].a[1] + i0); /* { dg-warning "nul" } */ + +T (ba[3].a[1].b); +T (&ba[3].a[1].b[0]); +T (&ba[3].a[1].b[0] + 1); +T (&ba[3].a[1].b[0] + i0); +T (&ba[3].a[1].b[1]); +T (&ba[3].a[1].b[1] + 1); +T (&ba[3].a[1].b[1] + i0);