Message ID | 1497231174.27153.9.camel@stu.xidian.edu.cn |
---|---|
State | New |
Headers | show |
On 06/11/2017 07:32 PM, Xi Ruoyao wrote: > This patch adds warning option -Wstring-plus-int for C/C++. > > gcc/ChangeLog: > > 2017-06-12 Xi Ruoyao <ryxi@stu.xidian.edu.cn> > > * c-family/c.opt: New option -Wstring-plus-int. > * c-family/c-common.c (pointer_int_sum): Checking for > -Wstring-plus-int. This is a very useful warning but I would suggest to word it in terms of what it actually does rather than what it might be intended to do. E.g., for const char *p = "123" + 7; issue warning: offset 7 exceeds the upper bound 3 of the array rather than warning: adding 'int' to a string does not append to the string (I have trouble envisioning on what grounds someone might expect the addition to have this effect.) Given that the warning only triggers when the upper bound of an array is exceeded I would also suggest to consider including the warning in -Warray-bounds. (With that, it would be useful to also detect exceeding the upper bound of non-literal arrays as well.) Martin
On 2017-06-19 10:51 -0600, Martin Sebor wrote: > On 06/11/2017 07:32 PM, Xi Ruoyao wrote: > > This patch adds warning option -Wstring-plus-int for C/C++. > > > > gcc/ChangeLog: > > > > 2017-06-12 Xi Ruoyao <ryxi@stu.xidian.edu.cn> > > > > * c-family/c.opt: New option -Wstring-plus-int. > > * c-family/c-common.c (pointer_int_sum): Checking for > > -Wstring-plus-int. > > This is a very useful warning but I would suggest to word it > in terms of what it actually does rather than what it might be > intended to do. E.g., for > > const char *p = "123" + 7; > > issue > > warning: offset 7 exceeds the upper bound 3 of the array > > rather than > > warning: adding 'int' to a string does not append to the string > > (I have trouble envisioning on what grounds someone might expect > the addition to have this effect.) How about something like `const char *p = "123" + getchar();` ? I'd like this for -Wstring-plus-int=1: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int=] const char *p = "123" + 7; ^ note: offset 7 exceeds the size 4 of the string, using the result may lead to undefined behaviour. (Clang permits "123" + 4 since its result is well defined in standard. Maybe we could permit "123" + 3 only.) For level 1 we only warn for such obvious mistakes. And for -Wstring-plus-int=2: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int=] const char *p = "123" + getchar(); ^ note: the offset may exceed the size of the string. (Clang also warn while it's impossible to know whether the offset exceeds. It seems aggressively so we can make it level 2.) > Given that the warning only triggers when the upper bound of > an array is exceeded I would also suggest to consider including > the warning in -Warray-bounds. (With that, it would be useful > to also detect exceeding the upper bound of non-literal arrays > as well.) We can let -Warray-bounds enable -Wstring-plus-int=1, but not =2.
On 06/19/2017 11:28 AM, Xi Ruoyao wrote: > On 2017-06-19 10:51 -0600, Martin Sebor wrote: >> On 06/11/2017 07:32 PM, Xi Ruoyao wrote: >>> This patch adds warning option -Wstring-plus-int for C/C++. >>> >>> gcc/ChangeLog: >>> >>> 2017-06-12 Xi Ruoyao <ryxi@stu.xidian.edu.cn> >>> >>> * c-family/c.opt: New option -Wstring-plus-int. >>> * c-family/c-common.c (pointer_int_sum): Checking for >>> -Wstring-plus-int. >> >> This is a very useful warning but I would suggest to word it >> in terms of what it actually does rather than what it might be >> intended to do. E.g., for >> >> const char *p = "123" + 7; >> >> issue >> >> warning: offset 7 exceeds the upper bound 3 of the array >> >> rather than >> >> warning: adding 'int' to a string does not append to the string >> >> (I have trouble envisioning on what grounds someone might expect >> the addition to have this effect.) > > How about something like `const char *p = "123" + getchar();` ? I'm not sure I correctly understand the question (or whether it's meant in response to my comment in parentheses) but let me clarify what I meant. In my view, the group of C++ (and certainly C) programmers who might expect "123" + i to append the string representation of the integer result to a string literal isn't significant enough to focus the warning on. Whether or not the addition is valid depends on the value of the integer operand. There are three sets of cases where the addition is or may be invalid: 1) the integer operand is an out of bounds constant 2) the integer operand's non-constant value or the lower bound of its range is known to be out of bounds, 3) the lower bound of the operand's range is in bounds but the upper bound is out of bounds (as in the getchar example). (1) can be handled with lexical analysis alone (as in you parch) but it's prone to a high rate of false negatives. (3) can also be handled by lexical analysis alone but it's prone to a high rate of false positives. (2) has no false positives but some false negatives. It can only be detected with optimization. With that in mind the warning would serve a greater purpose by being aimed more broadly and describing the nature of the error: forming an invalid pointer. I believe it would best be implemented analogously to or even integrated into -Warray-bounds. I.e., I suggest covering set (2) above. > > I'd like this for -Wstring-plus-int=1: > > warning: adding 'int' to a string does not append to the string > [-Wstring-plus-int=] > const char *p = "123" + 7; > ^ > note: offset 7 exceeds the size 4 of the string, using the result > may lead to undefined behaviour. The addition itself is undefined, regardless of whether or not the result is used. > > (Clang permits "123" + 4 since its result is well defined in standard. > Maybe we could permit "123" + 3 only.) "123" is an array of 4 elements, with "123" + 4 pointing just past the last (NUL) element. It's valid to form a pointer past the last element of an array and warning about it would likely be viewed as a false positive (certainly if it were an out-of-bounds type of warning). Martin
On 2017-06-19 12:44 -0600, Martin Sebor wrote: > On 06/19/2017 11:28 AM, Xi Ruoyao wrote: > > On 2017-06-19 10:51 -0600, Martin Sebor wrote: > > > On 06/11/2017 07:32 PM, Xi Ruoyao wrote: > > > > This patch adds warning option -Wstring-plus-int for C/C++. > > > > > > > > gcc/ChangeLog: > > > > > > > > 2017-06-12 Xi Ruoyao <ryxi@stu.xidian.edu.cn> > > > > > > > > * c-family/c.opt: New option -Wstring-plus-int. > > > > * c-family/c-common.c (pointer_int_sum): Checking for > > > > -Wstring-plus-int. > > > > > > This is a very useful warning but I would suggest to word it > > > in terms of what it actually does rather than what it might be > > > intended to do. E.g., for > > > > > > const char *p = "123" + 7; > > > > > > issue > > > > > > warning: offset 7 exceeds the upper bound 3 of the array > > > > > > rather than > > > > > > warning: adding 'int' to a string does not append to the string > > > > > > (I have trouble envisioning on what grounds someone might expect > > > the addition to have this effect.) > > > > How about something like `const char *p = "123" + getchar();` ? > > I'm not sure I correctly understand the question (or whether > it's meant in response to my comment in parentheses) but let > me clarify what I meant. > > In my view, the group of C++ (and certainly C) programmers who > might expect "123" + i to append the string representation of > the integer result to a string literal isn't significant enough > to focus the warning on. > > Whether or not the addition is valid depends on the value of > the integer operand. There are three sets of cases where the > addition is or may be invalid: > > 1) the integer operand is an out of bounds constant > 2) the integer operand's non-constant value or the lower bound > of its range is known to be out of bounds, > 3) the lower bound of the operand's range is in bounds but > the upper bound is out of bounds (as in the getchar example). > > (1) can be handled with lexical analysis alone (as in you parch) > but it's prone to a high rate of false negatives. (3) can also > be handled by lexical analysis alone but it's prone to a high > rate of false positives. (2) has no false positives but some > false negatives. It can only be detected with optimization. > > With that in mind the warning would serve a greater purpose > by being aimed more broadly and describing the nature of the > error: forming an invalid pointer. I believe it would best > be implemented analogously to or even integrated into > -Warray-bounds. I.e., I suggest covering set (2) above. Now I think I've been cheat by GCC wiki, which states PR 62181 is an "easy-hack" :) I'll try to improve -Warray-bounds. > > > > I'd like this for -Wstring-plus-int=1: > > > > warning: adding 'int' to a string does not append to the string > > [-Wstring-plus-int=] > > const char *p = "123" + 7; > > ^ > > note: offset 7 exceeds the size 4 of the string, using the result > > may lead to undefined behaviour. > > The addition itself is undefined, regardless of whether or not > the result is used. > > > > > (Clang permits "123" + 4 since its result is well defined in standard. > > Maybe we could permit "123" + 3 only.) > > "123" is an array of 4 elements, with "123" + 4 pointing just past > the last (NUL) element. It's valid to form a pointer past the last > element of an array and warning about it would likely be viewed as > a false positive (certainly if it were an out-of-bounds type of > warning). > > Martin
On 2017-06-19 12:44 -0600, Martin Sebor wrote: > On 06/19/2017 11:28 AM, Xi Ruoyao wrote: > > On 2017-06-19 10:51 -0600, Martin Sebor wrote: > > > On 06/11/2017 07:32 PM, Xi Ruoyao wrote: > > > > This patch adds warning option -Wstring-plus-int for C/C++. > > > > > > > > gcc/ChangeLog: > > > > > > > > 2017-06-12 Xi Ruoyao <ryxi@stu.xidian.edu.cn> > > > > > > > > * c-family/c.opt: New option -Wstring-plus-int. > > > > * c-family/c-common.c (pointer_int_sum): Checking for > > > > -Wstring-plus-int. > > > > > > This is a very useful warning but I would suggest to word it > > > in terms of what it actually does rather than what it might be > > > intended to do. E.g., for > > > > > > const char *p = "123" + 7; > > > > > > issue > > > > > > warning: offset 7 exceeds the upper bound 3 of the array > > > > > > rather than > > > > > > warning: adding 'int' to a string does not append to the string > > > > > > (I have trouble envisioning on what grounds someone might expect > > > the addition to have this effect.) > > > > How about something like `const char *p = "123" + getchar();` ? > > I'm not sure I correctly understand the question (or whether > it's meant in response to my comment in parentheses) but let > me clarify what I meant. > > In my view, the group of C++ (and certainly C) programmers who > might expect "123" + i to append the string representation of > the integer result to a string literal isn't significant enough > to focus the warning on. > > Whether or not the addition is valid depends on the value of > the integer operand. There are three sets of cases where the > addition is or may be invalid: > > 1) the integer operand is an out of bounds constant > 2) the integer operand's non-constant value or the lower bound > of its range is known to be out of bounds, > 3) the lower bound of the operand's range is in bounds but > the upper bound is out of bounds (as in the getchar example). > > (1) can be handled with lexical analysis alone (as in you parch) > but it's prone to a high rate of false negatives. (3) can also > be handled by lexical analysis alone but it's prone to a high > rate of false positives. (2) has no false positives but some > false negatives. It can only be detected with optimization. > > With that in mind the warning would serve a greater purpose > by being aimed more broadly and describing the nature of the > error: forming an invalid pointer. I believe it would best > be implemented analogously to or even integrated into > -Warray-bounds. I.e., I suggest covering set (2) above. I created PR 81172. For const char *p = "123" + 'c' we should have: warning: offset 99 is above array bounds, the behaviour is undefined [-Warray-bounds] const char *p = "123" + 'c'; and perhaps (for the case the pointer operand is a string or a string pointer): note: adding integer to a string does not append to the string. > > > > I'd like this for -Wstring-plus-int=1: > > > > warning: adding 'int' to a string does not append to the string > > [-Wstring-plus-int=] > > const char *p = "123" + 7; > > ^ > > note: offset 7 exceeds the size 4 of the string, using the result > > may lead to undefined behaviour. > > The addition itself is undefined, regardless of whether or not > the result is used. > > > > > (Clang permits "123" + 4 since its result is well defined in standard. > > Maybe we could permit "123" + 3 only.) > > "123" is an array of 4 elements, with "123" + 4 pointing just past > the last (NUL) element. It's valid to form a pointer past the last > element of an array and warning about it would likely be viewed as > a false positive (certainly if it were an out-of-bounds type of > warning). > > Martin
On Thu, 22 Jun 2017, Xi Ruoyao wrote: > I created PR 81172. For const char *p = "123" + 'c' we should have: > > warning: offset 99 is above array bounds, the behaviour is > undefined [-Warray-bounds] > const char *p = "123" + 'c'; > > and perhaps (for the case the pointer operand is a string or a string > pointer): > note: adding integer to a string does not append to the string. I do think this makes sense. I'm really not convinced there is a lot of code out there that uses the "123" + i idiom to create a pointer, even if it is legitimate code, so a warning makes sense to me. Gerald
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 4395e51..1eee48f 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -3100,6 +3100,31 @@ pointer_int_sum (location_t loc, enum tree_code resultcode, else size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type)); + /* Handle -Wstring-plus-int, warn for adding string literals + and an integer which may result in a wild pointer. */ + if (warn_string_plus_int + && resultcode == PLUS_EXPR + && char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (result_type)))) + { + tree orig_ptrop = tree_strip_nop_conversions(ptrop); + if (TREE_CODE (orig_ptrop) == ADDR_EXPR) + { + tree obj = TREE_OPERAND (orig_ptrop, 0); + if (TREE_CODE (obj) == STRING_CST) + { + tree t = TYPE_DOMAIN (TREE_TYPE (obj)); + if (TREE_CODE (intop) != INTEGER_CST + || tree_int_cst_lt (intop, TYPE_MIN_VALUE (t)) + || int_cst_value (intop) + > int_cst_value (TYPE_MAX_VALUE (t)) + 1) + warning_at (loc, OPT_Wstring_plus_int, + "add %qT to string does not append to " + "the string", + TREE_TYPE (intop)); + } + } + } + /* We are manipulating pointer values, so we don't need to warn about relying on undefined signed overflow. We disable the warning here because we use integer types so fold won't know that diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 37bb236..94ba3eb 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -732,6 +732,11 @@ C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_stringop_overflow) Ini Under the control of Object Size type, warn about buffer overflow in string manipulation functions like memcpy and strcpy. +Wstring-plus-int +C ObjC C++ ObjC++ Var(warn_string_plus_int) Warning +Warn about adding strings and integers, which is likely an ill-formed +attempt to append the string. + Wsuggest-attribute=format C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning Warn about functions which might be candidates for format attributes. -- 2.7.1