Message ID | 571DDE39.7060200@redhat.com |
---|---|
State | New |
Headers | show |
On 04/25/2016 05:07 AM, Bernd Schmidt wrote: > + if (TREE_CODE (arg2) == CONST_DECL) > + arg2 = DECL_INITIAL (arg2); > + int literal_mask = ((!!integer_zerop (arg1) << 1) > + | (!!integer_zerop (arg2) << 2)); Are you deliberately treating an enumerator as a literal 0? I'd think we should set literal_mask before stripping CONST_DECL. OK with that change. Jason
On 04/25/2016 07:55 PM, Jason Merrill wrote: > On 04/25/2016 05:07 AM, Bernd Schmidt wrote: >> + if (TREE_CODE (arg2) == CONST_DECL) >> + arg2 = DECL_INITIAL (arg2); >> + int literal_mask = ((!!integer_zerop (arg1) << 1) >> + | (!!integer_zerop (arg2) << 2)); > > Are you deliberately treating an enumerator as a literal 0? I'd think > we should set literal_mask before stripping CONST_DECL. OK with that > change. Just to be sure, is that OK for everything or just the C++ parts? Bernd
On Tue, Apr 26, 2016 at 9:04 AM, Bernd Schmidt <bschmidt@redhat.com> wrote: > On 04/25/2016 07:55 PM, Jason Merrill wrote: >> >> On 04/25/2016 05:07 AM, Bernd Schmidt wrote: >>> >>> + if (TREE_CODE (arg2) == CONST_DECL) >>> + arg2 = DECL_INITIAL (arg2); >>> + int literal_mask = ((!!integer_zerop (arg1) << 1) >>> + | (!!integer_zerop (arg2) << 2)); >> >> >> Are you deliberately treating an enumerator as a literal 0? I'd think >> we should set literal_mask before stripping CONST_DECL. OK with that >> change. > > > Just to be sure, is that OK for everything or just the C++ parts? Everything. Jason
On 04/25/2016 03:07 AM, Bernd Schmidt wrote: > On 04/22/2016 03:57 PM, Jason Merrill wrote: >> This looks good, but can we move the code into c-common rather than >> duplicate it? > > That would be this patch. Also passes testing on x86_64-linux. Hi Bernd, I like the idea behind the patch! So much so in fact, and since it happens to be in an area where I've been doing some of my own work, I have a few observations and suggestions. Please forgive me for the length of the email. The documentation for the new option implies that it should warn for calls to memset where the third argument contains the number of elements not multiplied by the element size. But in my (quick) testing it only warns when the argument is a constant equal to the number of elements and less than the size of the array. For example, neither of the following is diagnosed: int a [4]; __builtin_memset (a, 0, 2 + 2); __builtin_memset (a, 0, 4 * 1); __builtin_memset (a, 0, 3); __builtin_memset (a, 0, 4 * sizeof a); If it's possible and not too difficult, it would be nice if the detection logic could be made a bit smarter to also diagnose these less trivial cases (and matched the documented behavior). Even beyond that, I also wonder if the warning could also be issued when writing any constant number of bytes that is not a multiple of the element size. This would be useful not just for memset but also for memcpy. (The premise being that it's unusual to want to zero out or copy just a few bytes of any array element and leave the remaining bytes of that element unchanged.) I also have a comment on the text and content of the warning: memset used with length equal to number of elements without multiplication with element size FWIW, multiplication is typically done "by" a number (not with one). More important, I often find it helpful if a diagnostic spells out the numeric values involved in the computation (e.g., array bounds or sizes). I find this especially helpful in C++ where the values of symbolic constants are often the result of non-trivial computations involving non-type template parameters or constexpr variables or function calls, and thus difficult to derive just by looking at the source code. Here's an idea for rewording the diagnostic to include this information: warning: memset called to set '3' bytes which is not a positive multiple of element size in array 'a' with type int[3]' note: array 'a' is declared here Finally, I would be remiss not to mention that the patch has an instance of trailing space in it (gasp! ;) Personally, I'm not bothered by it but it seems like a good opportunity to highlight that these things happen even to the most careful of us, and not necessarily as a result of not being careful or aware of the coding guidelines. My point is that no amount of documentation will or diligence will prevent these kinds of problems, and dwelling on them in code reviews isn't the best use of our time. Let's put in place a tool that takes care of these nits for us so we can focus on things a tool can't help us with! Martin
On 04/26/2016 11:23 PM, Martin Sebor wrote: > The documentation for the new option implies that it should warn > for calls to memset where the third argument contains the number > of elements not multiplied by the element size. But in my (quick) > testing it only warns when the argument is a constant equal to > the number of elements and less than the size of the array. For > example, neither of the following is diagnosed: > > int a [4]; > __builtin_memset (a, 0, 2 + 2); > __builtin_memset (a, 0, 4 * 1); > __builtin_memset (a, 0, 3); > __builtin_memset (a, 0, 4 * sizeof a); > > If it's possible and not too difficult, it would be nice if > the detection logic could be made a bit smarter to also diagnose > these less trivial cases (and matched the documented behavior). I've thought about some of these cases. The problem is there are legitimate cases of calling memset for only part of an array. I wanted to start with something that is unlikely to give false positives. A multiplication by the wrong sizeof would be a nice thing to spot. Would you like to work on followup patches? I probably won't get to it in a while. > Even beyond that, I also wonder if the warning could also be > issued when writing any constant number of bytes that is not > a multiple of the element size. This would be useful not just > for memset but also for memcpy. (The premise being that it's > unusual to want to zero out or copy just a few bytes of any > array element and leave the remaining bytes of that element > unchanged.) Probably a good idea. > I also have a comment on the text and content of the warning: > > memset used with length equal to number of elements without > multiplication with element size > > FWIW, multiplication is typically done "by" a number (not with > one). Fixed. > Here's an idea > for rewording the diagnostic to include this information: > > warning: memset called to set '3' bytes which is not a positive > multiple of element size in array 'a' with type int[3]' > note: array 'a' is declared here I'm finding this too verbose, but I guess that's a matter of taste. > Finally, I would be remiss not to mention that the patch has > an instance of trailing space in it (gasp! ;) Fixed before committing. > Personally, > I'm not bothered by it but it seems like a good opportunity > to highlight that these things happen even to the most careful > of us, and not necessarily as a result of not being careful or > aware of the coding guidelines. My point is that no amount of > documentation will or diligence will prevent these kinds of > problems, and dwelling on them in code reviews isn't the best > use of our time. The first part is probably true, but we have code review exactly _because_ people make mistakes. Without it, the code base would degenerate rapidly. So, well spotted. > Let's put in place a tool that takes care of > these nits for us so we can focus on things a tool can't help > us with! I don't believe in tools for this, Machines are stupid, and I think the problem is AI-complete. In this case the check-GNU-style script has two other complaints about the patch which a human can tell are wrong. Enforcing patches to pass a mechanical check would be a mistake IMO, since it would force people into contortions trying to placate an unintelligent program, making code quality worse. Bernd
On 04/27/2016 03:55 AM, Bernd Schmidt wrote: > On 04/26/2016 11:23 PM, Martin Sebor wrote: >> The documentation for the new option implies that it should warn >> for calls to memset where the third argument contains the number >> of elements not multiplied by the element size. But in my (quick) >> testing it only warns when the argument is a constant equal to >> the number of elements and less than the size of the array. For >> example, neither of the following is diagnosed: >> >> int a [4]; >> __builtin_memset (a, 0, 2 + 2); >> __builtin_memset (a, 0, 4 * 1); >> __builtin_memset (a, 0, 3); >> __builtin_memset (a, 0, 4 * sizeof a); >> >> If it's possible and not too difficult, it would be nice if >> the detection logic could be made a bit smarter to also diagnose >> these less trivial cases (and matched the documented behavior). > > I've thought about some of these cases. The problem is there are > legitimate cases of calling memset for only part of an array. I wanted > to start with something that is unlikely to give false positives. So I wonder if what we really want is to track which bytes in the object are set and which are not -- utilizing both memset and standard stores and if the object as a whole is not initialized, then warn. We've actually got a lot of the code that would be necessary to detect this in tree DSE, with more coming in this stage1 as I extend it to handle some missing cases. Clearly a follow-up rather than a requirement for the current patch to move forward. Jeff
On 04/27/2016 03:55 AM, Bernd Schmidt wrote: > On 04/26/2016 11:23 PM, Martin Sebor wrote: >> The documentation for the new option implies that it should warn >> for calls to memset where the third argument contains the number >> of elements not multiplied by the element size. But in my (quick) >> testing it only warns when the argument is a constant equal to >> the number of elements and less than the size of the array. For >> example, neither of the following is diagnosed: >> >> int a [4]; >> __builtin_memset (a, 0, 2 + 2); >> __builtin_memset (a, 0, 4 * 1); >> __builtin_memset (a, 0, 3); >> __builtin_memset (a, 0, 4 * sizeof a); >> >> If it's possible and not too difficult, it would be nice if >> the detection logic could be made a bit smarter to also diagnose >> these less trivial cases (and matched the documented behavior). > > I've thought about some of these cases. The problem is there are > legitimate cases of calling memset for only part of an array. I wanted > to start with something that is unlikely to give false positives. > > A multiplication by the wrong sizeof would be a nice thing to spot. > Would you like to work on followup patches? I probably won't get to it > in a while. Yes, I think enhancing this warning would be in line with the _FORTIFY_SOURCE improvements I'm starting to look into now. I agree that minimizing false positives is important. I'm not sure there is complete consensus on exactly what qualifies as a false positive, but that's probably a discussion we can have once we have a patch and some tests to look at). Martin
For the record, this new warning/code found at least three actual bugs in Wine (two of which were in the testsuite, but still). Definitely a useful addition. Gerald
* doc/invoke.texi (Warning Options): Add -Wmemset-elt-size. (-Wmemset-elt-size): New item. c-family/ * c.opt (Wmemset-elt-size): New option. * c-common.c (warn_for_memset): New function. * c-common.h (warn_for_memset): Declare. c/ * c-parser.c (c_parser_postfix_expression_after_primary): Call warn_for_memset instead of warning directly here. cp/ * parser.c (cp_parser_postfix_expression): Call warn_for_memset instead of warning directly here. testsuite/ * c-c++-common/memset-array.c: New test. Index: gcc/c/c-parser.c =================================================================== --- gcc/c/c-parser.c (revision 235384) +++ gcc/c/c-parser.c (working copy) @@ -8282,18 +8282,15 @@ c_parser_postfix_expression_after_primar expr.value, exprlist, sizeof_arg, sizeof_ptr_memacc_comptypes); - if (warn_memset_transposed_args - && TREE_CODE (expr.value) == FUNCTION_DECL + if (TREE_CODE (expr.value) == FUNCTION_DECL && DECL_BUILT_IN_CLASS (expr.value) == BUILT_IN_NORMAL && DECL_FUNCTION_CODE (expr.value) == BUILT_IN_MEMSET - && vec_safe_length (exprlist) == 3 - && integer_zerop ((*exprlist)[2]) - && (literal_zero_mask & (1 << 2)) != 0 - && (!integer_zerop ((*exprlist)[1]) - || (literal_zero_mask & (1 << 1)) == 0)) - warning_at (expr_loc, OPT_Wmemset_transposed_args, - "%<memset%> used with constant zero length parameter; " - "this could be due to transposed parameters"); + && vec_safe_length (exprlist) == 3) + { + tree arg0 = (*exprlist)[0]; + tree arg2 = (*exprlist)[2]; + warn_for_memset (expr_loc, arg0, arg2, literal_zero_mask); + } start = expr.get_start (); finish = parser->tokens_buf[0].get_finish (); Index: gcc/c-family/c-common.c =================================================================== --- gcc/c-family/c-common.c (revision 235384) +++ gcc/c-family/c-common.c (working copy) @@ -11767,6 +11767,49 @@ warn_for_div_by_zero (location_t loc, tr warning_at (loc, OPT_Wdiv_by_zero, "division by zero"); } +/* Warn for patterns where memset appears to be used incorrectly. The + warning location should be LOC. ARG0, and ARG2 are the first and + last arguments to the call, while LITERAL_ZERO_MASK has a 1 bit for + each argument that was a literal zero. */ + +void +warn_for_memset (location_t loc, tree arg0, tree arg2, + int literal_zero_mask) +{ + if (warn_memset_transposed_args + && integer_zerop (arg2) + && (literal_zero_mask & (1 << 2)) != 0 + && (literal_zero_mask & (1 << 1)) == 0) + warning_at (loc, OPT_Wmemset_transposed_args, + "%<memset%> used with constant zero length " + "parameter; this could be due to transposed " + "parameters"); + + if (warn_memset_elt_size && TREE_CODE (arg2) == INTEGER_CST) + { + STRIP_NOPS (arg0); + if (TREE_CODE (arg0) == ADDR_EXPR) + arg0 = TREE_OPERAND (arg0, 0); + tree type = TREE_TYPE (arg0); + if (TREE_CODE (type) == ARRAY_TYPE) + { + tree elt_type = TREE_TYPE (type); + tree domain = TYPE_DOMAIN (type); + if (!integer_onep (TYPE_SIZE_UNIT (elt_type)) + && TYPE_MAXVAL (domain) + && TYPE_MINVAL (domain) + && integer_zerop (TYPE_MINVAL (domain)) + && integer_onep (fold_build2 (MINUS_EXPR, domain, + arg2, + TYPE_MAXVAL (domain)))) + warning_at (loc, OPT_Wmemset_elt_size, + "%<memset%> used with length equal to " + "number of elements without multiplication " + "with element size"); + } + } +} + /* Subroutine of build_binary_op. Give warnings for comparisons between signed and unsigned quantities that may fail. Do the checking based on the original operand trees ORIG_OP0 and ORIG_OP1, Index: gcc/c-family/c-common.h =================================================================== --- gcc/c-family/c-common.h (revision 235384) +++ gcc/c-family/c-common.h (working copy) @@ -902,6 +902,7 @@ extern void c_parse_file (void); extern void c_parse_final_cleanups (void); extern void warn_for_omitted_condop (location_t, tree); +extern void warn_for_memset (location_t, tree, tree, int); /* These macros provide convenient access to the various _STMT nodes. */ Index: gcc/c-family/c.opt =================================================================== --- gcc/c-family/c.opt (revision 235384) +++ gcc/c-family/c.opt (working copy) @@ -565,6 +565,10 @@ Wmemset-transposed-args C ObjC C++ ObjC++ Var(warn_memset_transposed_args) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn about suspicious calls to memset where the third argument is constant literal zero and the second is not. +Wmemset-elt-size +C ObjC C++ ObjC++ Var(warn_memset_elt_size) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) +Warn about suspicious calls to memset where the third argument contains the number of elements not multiplied by the element size. + Wmisleading-indentation C C++ Common Var(warn_misleading_indentation) Warning LangEnabledBy(C C++,Wall) Warn when the indentation of the code does not reflect the block structure. Index: gcc/cp/parser.c =================================================================== --- gcc/cp/parser.c (revision 235384) +++ gcc/cp/parser.c (working copy) @@ -6829,20 +6829,19 @@ cp_parser_postfix_expression (cp_parser } } - if (warn_memset_transposed_args) + if (TREE_CODE (postfix_expression) == FUNCTION_DECL + && DECL_BUILT_IN_CLASS (postfix_expression) == BUILT_IN_NORMAL + && DECL_FUNCTION_CODE (postfix_expression) == BUILT_IN_MEMSET + && vec_safe_length (args) == 3) { - if (TREE_CODE (postfix_expression) == FUNCTION_DECL - && DECL_BUILT_IN_CLASS (postfix_expression) == BUILT_IN_NORMAL - && DECL_FUNCTION_CODE (postfix_expression) == BUILT_IN_MEMSET - && vec_safe_length (args) == 3 - && TREE_CODE ((*args)[2]) == INTEGER_CST - && integer_zerop ((*args)[2]) - && !(TREE_CODE ((*args)[1]) == INTEGER_CST - && integer_zerop ((*args)[1]))) - warning (OPT_Wmemset_transposed_args, - "%<memset%> used with constant zero length " - "parameter; this could be due to transposed " - "parameters"); + tree arg0 = (*args)[0]; + tree arg1 = (*args)[1]; + tree arg2 = (*args)[2]; + if (TREE_CODE (arg2) == CONST_DECL) + arg2 = DECL_INITIAL (arg2); + int literal_mask = ((!!integer_zerop (arg1) << 1) + | (!!integer_zerop (arg2) << 2)); + warn_for_memset (input_location, arg0, arg2, literal_mask); } if (TREE_CODE (postfix_expression) == COMPONENT_REF) Index: gcc/doc/invoke.texi =================================================================== --- gcc/doc/invoke.texi (revision 235384) +++ gcc/doc/invoke.texi (working copy) @@ -273,7 +273,7 @@ Objective-C and Objective-C++ Dialects}. -Wno-int-to-pointer-cast -Winvalid-memory-model -Wno-invalid-offsetof @gol -Winvalid-pch -Wlarger-than=@var{len} @gol -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol --Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol +-Wmain -Wmaybe-uninitialized -Wmemset-elt-size -Wmemset-transposed-args @gol -Wmisleading-indentation -Wmissing-braces @gol -Wmissing-field-initializers -Wmissing-include-dirs @gol -Wno-multichar -Wnonnull -Wnonnull-compare @gol @@ -3547,6 +3547,7 @@ Options} and @ref{Objective-C and Object -Wlogical-not-parentheses -Wmain @r{(only for C/ObjC and unless} @option{-ffreestanding}@r{)} @gol -Wmaybe-uninitialized @gol +-Wmemset-elt-size @gol -Wmemset-transposed-args @gol -Wmisleading-indentation @r{(only for C/C++)} @gol -Wmissing-braces @r{(only for C/ObjC)} @gol @@ -5256,6 +5257,15 @@ Warn when the @code{sizeof} operator is declared as an array in a function definition. This warning is enabled by default for C and C++ programs. +@item -Wmemset-elt-size +@opindex Wmemset-elt-size +@opindex Wno-memset-elt-size +Warn for suspicious calls to the @code{memset} built-in function, if the +first argument references an array, and the third argument is a number +equal to the number of elements, but not equal to the size of the array +in memory. This indicates that the user has omitted a multiplication by +the element size. This warning is enabled by @option{-Wall}. + @item -Wmemset-transposed-args @opindex Wmemset-transposed-args @opindex Wno-memset-transposed-args Index: gcc/testsuite/c-c++-common/memset-array.c =================================================================== --- gcc/testsuite/c-c++-common/memset-array.c (revision 0) +++ gcc/testsuite/c-c++-common/memset-array.c (working copy) @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +/* { dg-options "-Wmemset-elt-size" } */ +enum a { + a_1, + a_2, + a_n +}; +int t1[20]; +int t2[a_n]; + +struct s +{ + int t[20]; +}; + +void foo (struct s *s) +{ + __builtin_memset (t1, 0, 20); /* { dg-warning "element size" } */ + __builtin_memset (t2, 0, a_n); /* { dg-warning "element size" } */ + __builtin_memset (s->t, 0, 20); /* { dg-warning "element size" } */ +} + +char u1[20]; +char u2[a_n]; + +struct s2 +{ + char u[20]; +}; + +void bar (struct s2 *s) +{ + __builtin_memset (u1, 0, 20); + __builtin_memset (u2, 0, a_n); + __builtin_memset (s->u, 0, 20); +}