Message ID | 864beb08-0073-cde2-ad21-a855a7bf7dc9@gmail.com |
---|---|
State | New |
Headers | show |
Series | use pointer size rather than array size when storing the former (PR 93829) | expand |
On 20 February 2020 01:26:58 CET, Martin Sebor <msebor@gmail.com> wrote: + Sets *NULTREM if the representation contains a zero byte, and sets s/NULTREM/NULTERM/ thanks,
On Wed, 2020-02-19 at 17:26 -0700, Martin Sebor wrote: > The buffer overflow detection for multi-char stores uses the size > of a source array even when what's actually being accessed (read > and stored) is a pointer to the array. That leads to incorrect > warnings in some cases. > > The attached patch corrects the function that computes the size of > the access to set it to that of a pointer instead if the source is > an address expression. > > Tested on x86_64-linux. > if (TREE_CODE (exp) == ADDR_EXPR) > - exp = TREE_OPERAND (exp, 0); > + { > + /* If the size of the access hasn't been determined yet it's that > + of a pointer. */ > + if (!nbytes) > + nbytes = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (exp))); > + exp = TREE_OPERAND (exp, 0); > + } > This doesn't make any sense to me. You're always going to get the size of a pointer here. Don't you want the size of the TYPE of the operand? Jeff
On 2/26/20 3:09 PM, Jeff Law wrote: > On Wed, 2020-02-19 at 17:26 -0700, Martin Sebor wrote: >> The buffer overflow detection for multi-char stores uses the size >> of a source array even when what's actually being accessed (read >> and stored) is a pointer to the array. That leads to incorrect >> warnings in some cases. >> >> The attached patch corrects the function that computes the size of >> the access to set it to that of a pointer instead if the source is >> an address expression. >> >> Tested on x86_64-linux. > >> if (TREE_CODE (exp) == ADDR_EXPR) >> - exp = TREE_OPERAND (exp, 0); >> + { >> + /* If the size of the access hasn't been determined yet it's that >> + of a pointer. */ >> + if (!nbytes) >> + nbytes = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (exp))); >> + exp = TREE_OPERAND (exp, 0); >> + } >> > This doesn't make any sense to me. You're always going to get the size of a > pointer here. Don't you want the size of the TYPE of the operand? The function correctly handles cases when the expression is an array because the array is what's being stored. The bug is in stripping the ADDR_EXPR and then using the size of the operand when what's being stored is actually a pointer. This test case illustrates the difference: struct S { char *p, *q, r[8]; } a; void f (void) { struct S b = { 0, "1234567", "abcdefgh" }; __builtin_memcpy (&a, &b, sizeof a); } The memcpy results in: MEM <char *> [(char * {ref-all})&b] = 0B; MEM <char *> [(char * {ref-all})&b + 8B] = "1234567"; MEM <unsigned char[24]> [(char * {ref-all})&a] = MEM <unsigned char[24]> [(char * {ref-all})&b]; The RHS of the second MEM_REF is ADDR_EXPR (STRING_CST). The RHS of the third MEM_REF is the array case (that's handled correctly). I think computing the size of the store could be simplified (it seems it should be the same as type of either the RHS or the LHS) but I didn't want to mess with things too much this late. Martin
On Wed, 2020-02-26 at 16:18 -0700, Martin Sebor wrote: > On 2/26/20 3:09 PM, Jeff Law wrote: > > On Wed, 2020-02-19 at 17:26 -0700, Martin Sebor wrote: > > > The buffer overflow detection for multi-char stores uses the size > > > of a source array even when what's actually being accessed (read > > > and stored) is a pointer to the array. That leads to incorrect > > > warnings in some cases. > > > > > > The attached patch corrects the function that computes the size of > > > the access to set it to that of a pointer instead if the source is > > > an address expression. > > > > > > Tested on x86_64-linux. > > > if (TREE_CODE (exp) == ADDR_EXPR) > > > - exp = TREE_OPERAND (exp, 0); > > > + { > > > + /* If the size of the access hasn't been determined yet it's that > > > + of a pointer. */ > > > + if (!nbytes) > > > + nbytes = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (exp))); > > > + exp = TREE_OPERAND (exp, 0); > > > + } > > > > > This doesn't make any sense to me. You're always going to get the size of > > a > > pointer here. Don't you want the size of the TYPE of the operand? > > The function correctly handles cases when the expression is an array > because the array is what's being stored. The bug is in stripping > the ADDR_EXPR and then using the size of the operand when what's > being stored is actually a pointer. > > This test case illustrates the difference: > > struct S { char *p, *q, r[8]; } a; > > void f (void) > { > struct S b = { 0, "1234567", "abcdefgh" }; > __builtin_memcpy (&a, &b, sizeof a); > } > > The memcpy results in: > > MEM <char *> [(char * {ref-all})&b] = 0B; > MEM <char *> [(char * {ref-all})&b + 8B] = "1234567"; > MEM <unsigned char[24]> [(char * {ref-all})&a] = MEM <unsigned > char[24]> [(char * {ref-all})&b]; > > The RHS of the second MEM_REF is ADDR_EXPR (STRING_CST). The RHS > of the third MEM_REF is the array case (that's handled correctly). > > I think computing the size of the store could be simplified (it > seems it should be the same as type of either the RHS or the LHS) > but I didn't want to mess with things too much this late. Sorry, I should have just thrown this under the debugger, but I was short of time. ISTM like the problem is we have an RHS like: > > > <addr_expr 0x7fffea93f360 > > type <pointer_type 0x7fffea939d20 > > type <array_type 0x7fffea939690 type <integer_type 0x7fffea8193f0 > > char> > > BLK > > size <integer_cst 0x7fffea81e3a8 constant 328> > > unit-size <integer_cst 0x7fffea935f48 constant 41> > > align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > > 0x7fffea939690 domain <integer_type 0x7fffea9395e8> > > pointer_to_this <pointer_type 0x7fffea939d20>> > > unsigned DI > > size <integer_cst 0x7fffea800cd8 constant 64> > > unit-size <integer_cst 0x7fffea800cf0 constant 8> > > align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > > 0x7fffea939d20> > > readonly constant > > arg:0 <string_cst 0x7fffea814100 type <array_type 0x7fffea939690> > > readonly constant static > > "0123456789012345678901234567890123456789\000"> > > j.c:20:10 start: j.c:20:10 finish: j.c:20:10> > > The current code unconditionally strips the ADDR_EXPR at this point and as a result will use the length of STRING_CST node, which is obviously wrong. With your change, if we haven't already determined a length, we extract the length of the pointer then strip the ADDR_EXPR. Yea, this is all ripe for some cleanup. For example it's not at all clear that just because we don't have a size at this point in count_nonzero_bytes and we've got an ADDR_EXPR that we want the size of the pointer. That may in fact be true, but it's not easy to ascertain with the current spaghetti code. OK for the trunk, but let's queue this area for some cleanup in gcc-11. jeff > Martin >
PR middle-end/93829 - bogus -Wstringop-overflow on memcpy of a struct with a pointer member from another with a long string gcc/testsuite/ChangeLog: PR middle-end/93829 * gcc.dg/Wstringop-overflow-32.c: New test. gcc/ChangeLog: PR middle-end/93829 * tree-ssa-strlen.c (count_nonzero_bytes): Set the size to that of a pointer in the outermost ADDR_EXPRs. diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index 9a88a85b07c..1cdb581a51f 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -4587,12 +4587,15 @@ int ssa_name_limit_t::next_ssa_name (tree ssa_name) /* Determines the minimum and maximum number of leading non-zero bytes in the representation of EXP and set LENRANGE[0] and LENRANGE[1] - to each. Sets LENRANGE[2] to the total number of bytes in - the representation. Sets *NULTREM if the representation contains - a zero byte, and sets *ALLNUL if all the bytes are zero. + to each. + Sets LENRANGE[2] to the total size of the access (which may be less + than LENRANGE[1] when what's being referenced by EXP is a pointer + rather than an array). + Sets *NULTREM if the representation contains a zero byte, and sets + *ALLNUL if all the bytes are zero. OFFSET and NBYTES are the offset into the representation and - the size of the access to it determined from a MEM_REF or zero - for other expressions. + the size of the access to it determined from an ADDR_EXPR (i.e., + a pointer) or MEM_REF or zero for other expressions. Uses RVALS to determine range information. Avoids recursing deeper than the limits in SNLIM allow. Returns true on success and false otherwise. */ @@ -4692,7 +4695,13 @@ count_nonzero_bytes (tree exp, unsigned HOST_WIDE_INT offset, } if (TREE_CODE (exp) == ADDR_EXPR) - exp = TREE_OPERAND (exp, 0); + { + /* If the size of the access hasn't been determined yet it's that + of a pointer. */ + if (!nbytes) + nbytes = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (exp))); + exp = TREE_OPERAND (exp, 0); + } if (TREE_CODE (exp) == SSA_NAME) { @@ -4788,9 +4797,10 @@ count_nonzero_bytes (tree exp, unsigned HOST_WIDE_INT offset, return false; if (!nbytes) - /* If NBYTES hasn't been determined earlier from MEM_REF, - set it here. It includes all internal nuls, including - the terminating one if the string has one. */ + /* If NBYTES hasn't been determined earlier, either from ADDR_EXPR + (i.e., it's the size of a pointer), or from MEM_REF (as the size + of the access), set it here to the size of the string, including + all internal and trailing nuls if the string has any. */ nbytes = nchars - offset; prep = TREE_STRING_POINTER (exp) + offset; diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-32.c b/gcc/testsuite/gcc.dg/Wstringop-overflow-32.c new file mode 100644 index 00000000000..e5939567a4d --- /dev/null +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-32.c @@ -0,0 +1,51 @@ +/* PR middle-end/93829 - bogus -Wstringop-overflow on memcpy of a struct + with a pointer member from another with a long string + { dg-do compile } + { dg-options "-O2 -Wall" } */ + +extern void* memcpy (void*, const void*, __SIZE_TYPE__); + +#define S40 "0123456789012345678901234567890123456789" + +const char s40[] = S40; + +struct S +{ + const void *p, *q, *r; +} s, sa[2]; + + +void test_lit_decl (void) +{ + struct S t = { 0, S40, 0 }; + + memcpy (&s, &t, sizeof t); // { dg-bogus "-Wstringop-overflow" } +} + +void test_str_decl (void) +{ + struct S t = { 0, s40, 0 }; + + memcpy (&s, &t, sizeof t); // { dg-bogus "-Wstringop-overflow" } +} + + +void test_lit_ssa (int i) +{ + if (i < 1) + i = 1; + struct S *p = &sa[i]; + struct S t = { 0, S40, 0 }; + + memcpy (p, &t, sizeof t); // { dg-bogus "-Wstringop-overflow" } +} + +void test_str_ssa (int i) +{ + if (i < 1) + i = 1; + struct S *p = &sa[i]; + struct S t = { 0, s40, 0 }; + + memcpy (p, &t, sizeof t); // { dg-bogus "-Wstringop-overflow" } +}