Message ID: 3b35c9b9-491c-4d2d-a2db-737dc2c535d5@gmail.com
State: New
Series: handle non-constant offsets in -Wstringop-overflow (PR 77608)
On 11/17/2017 12:36 PM, Martin Sebor wrote:
> The attached patch enhances -Wstringop-overflow to detect more
> instances of buffer overflow at compile time by handling non-
> constant offsets into the destination object that are known to
> be in some range.  The solution could be improved by handling
> even more cases (e.g., anti-ranges or offsets relative to
> pointers beyond the beginning of an object) but it's a start.
>
> In addition to bootstrapping/regtesting GCC, also tested with
> Binutils/GDB, Glibc, and the Linux kernel on x86_64 with no
> regressions.
>
> Martin
>
> The top of GDB fails to compile at the moment so the validation
> there was incomplete.
>
> gcc-77608.diff
>
> PR middle-end/77608 - missing protection on trivially detectable runtime buffer overflow
>
> gcc/ChangeLog:
>
>	PR middle-end/77608
>	* builtins.c (compute_objsize): Handle non-constant offsets.
>
> gcc/testsuite/ChangeLog:
>
>	PR middle-end/77608
>	* gcc.dg/Wstringop-overflow.c: New test.

The recursive call into compute_objsize passing in the ostype avoids having to think about the whole object vs. nearest containing object issues.  Right?  What's left to worry about is the maximum or minimum remaining bytes in the object.  At least that's my understanding of how ostype works here.

So we get the amount remaining, ignoring the variable offset, from the recursive call (SIZE).  The space left after we account for the variable offset is [SIZE - MAX, SIZE - MIN].  So ISTM that for types 0/1 you have to return SIZE - MIN (which you do) and for types 2/3 you have to return SIZE - MAX, which I think you get wrong (and you have to account for the possibility that MAX or MIN is greater than SIZE and thus there's nothing left).

The other concern here is the precision of the answer and how it's going to be used by callers.  But after a lot of thought I think we're OK on this front, based on how b_o_s is supposed to handle cases where it's given a pointer to multiple objects which all have known, but possibly different, sizes.

> +	  /* compute_builtin_object_size fails for addresses with
> +	     non-constant offsets.  Try to determine the range of
> +	     such an offset here and use it to adjust the constant
> +	     size.  */
> +	  tree off = gimple_assign_rhs2 (stmt);
> +	  if (TREE_CODE (off) == SSA_NAME
> +	      && INTEGRAL_TYPE_P (TREE_TYPE (off)))
> +	    {
> +	      wide_int min, max;
> +	      enum value_range_type rng = get_range_info (off, &min, &max);
> +
> +	      if (rng == VR_RANGE)
> +		{
> +		  if (tree size = compute_objsize (dest, ostype))
> +		    {
> +		      wide_int wisiz = wi::to_wide (size);
> +		      if (wi::sign_mask (min))
> +			/* ignore negative offsets for now */;

Put the comment before the condition, like this:

  /* Ignore negative offsets for now.  */
  if (wi::sign_mask (min))
    ;

That makes the empty if clause easier to spot.  You could even argue that instead of an empty if clause it should be:

  /* Ignore negative offsets for now.  */
  if (wi::sign_mask (min))
    return NULL_TREE;

We're giving up at this point anyway, so why make the reader follow further control flow to see that that's what's ultimately going to happen?

> +		      else if (wi::ltu_p (min, wisiz))
> +			return wide_int_to_tree (TREE_TYPE (size),
> +						 wi::sub (wisiz, min));

So this is for types 0/1 (maximum space available).  I think the condition should be:

  else if ((ostype & 2) == 0
	   && wi::ltu_p (min, wisiz))
    return SIZE - MIN;

Then you'll need:

  else if ((ostype & 2) == 2
	   && wi::ltu_p (max, wisiz))
    return SIZE - MAX;
  else
    return size_zero_node;

I'd like Jakub or Richi to chime in here to check my analysis.  It's getting late :-)

Jeff
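The [SIZE - MAX, SIZE - MIN] arithmetic above can be sketched outside of GCC.  The following is a hypothetical standalone model (remaining_bytes is not a GCC function; the names are made up for illustration) of the rule Jeff proposes: types 0/1 report the maximum space remaining, types 2/3 the minimum, and either bound clamps to zero once the offset can reach past the object.

```c
#include <stddef.h>

/* Hypothetical model (not GCC source) of the remaining-size arithmetic:
   SIZE is the space left before applying the variable offset, and the
   offset is known to lie in [MIN, MAX], so the space left afterwards is
   [SIZE - MAX, SIZE - MIN].  Object Size types 0/1 want the maximum
   (SIZE - MIN); types 2/3 want the minimum (SIZE - MAX).  */
static size_t
remaining_bytes (size_t size, size_t min, size_t max, int ostype)
{
  if ((ostype & 2) == 0)
    /* Types 0/1: optimistic bound using the smallest offset.  */
    return min < size ? size - min : 0;
  /* Types 2/3: conservative bound using the largest offset.  */
  return max < size ? size - max : 0;
}
```

For example, with a 7-byte object and an offset in [1, 2], types 0/1 yield 6 remaining bytes while types 2/3 yield 5; once MIN (or MAX, for types 2/3) reaches 7 or more, nothing is left.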
On 11/18/2017 12:53 AM, Jeff Law wrote:
> On 11/17/2017 12:36 PM, Martin Sebor wrote:
>> The attached patch enhances -Wstringop-overflow to detect more
>> instances of buffer overflow at compile time by handling non-
>> constant offsets into the destination object that are known to
>> be in some range.  The solution could be improved by handling
>> even more cases (e.g., anti-ranges or offsets relative to
>> pointers beyond the beginning of an object) but it's a start.
>> [...]

I should have made it even clearer that this change affects just the warnings, not the runtime protection with _FORTIFY_SOURCE.  The latter is driven by the __builtin_object_size intrinsic and by the compute_builtin_object_size GCC function.

This patch does not change what __builtin_object_size returns.

> The recursive call into compute_objsize passing in the ostype avoids
> having to think about the whole object vs nearest containing object
> issues.  Right?

Yes, something like that.

> What's left to worry about is maximum or minimum remaining bytes in the
> object.  At least that's my understanding of how ostype works here.

-Wstringop-overflow (i.e., warnings only) always uses the largest object size for memxxx functions (type 0) and the smallest (type 2) for string functions by default.  This can be changed, but I suspect no one does because (IME) few users understand what the other modes mean or what good using them instead of the default might do.  It was probably a mistake to expose the mode via the option.

I'll address the remaining comments separately.

Martin
On 11/18/2017 09:47 AM, Martin Sebor wrote:
> On 11/18/2017 12:53 AM, Jeff Law wrote:
>> [...]
>
> I should have made it even more clear that this change affects
> just warnings, not the runtime protection with _FORTIFY_SOURCE.
> The latter is driven by the __builtin_object_size intrinsic and
> by the compute_builtin_object_size GCC function.
>
> This patch does not change what __builtin_object_size returns.

Understood.  But I think that the documented behavior of b_o_s directly impacts how this routine is supposed to work.

> -Wstringop-overflow (i.e., warnings only) always uses the largest
> object size for memxxx functions (type 0) and the smallest (type
> 2) for string functions by default.  This can be changed but I
> suspect no one does because (IME) few users understand what
> the other modes mean or what good using them vs the default might
> do.  It was probably a mistake to expose the mode via the option.

Then maybe only do the deeper analysis for max mode?  As it stands, if someone were to call it in a min mode I think they're going to be rather surprised by the answer.

jeff
On 11/18/2017 12:53 AM, Jeff Law wrote:
> [...]
>
> So we get the amount remaining, ignoring the variable offset, from the
> recursive call (SIZE).  The space left after we account for the variable
> offset is [SIZE - MAX, SIZE - MIN].  So ISTM for type 0/1 you have to
> return SIZE-MIN (which you do) and for type 2/3 you have to return
> SIZE-MAX which I think you get wrong (and you have to account for the
> possibility that MAX or MIN is greater than SIZE and thus there's
> nothing left).

Subtracting the upper bound of the offset from the size instead of the lower bound when the caller is asking for the minimum object size would make the result essentially meaningless in common cases where the offset is smaller than size_t, as in:

  char a[7];

  void f (const char *s, unsigned i)
  {
    __builtin_strcpy (a + i, s);
  }

Here, i's range is [0, UINT_MAX].

IMO, it's only useful to use the lower bound here; otherwise the result would only rarely be nonzero.

This is also what other warnings that deal with ranges do.  For example, -Warray-bounds only considers the lower bound (unless it's negative) when deciding whether or not to warn for

  int g (unsigned i)
  {
    return a[i];
  }

It would be too noisy to be of practical use otherwise (even at level 2).

Martin
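Martin's point can be made concrete with a small numeric sketch (illustrative only, not GCC code): with a 7-byte array and an offset whose only known range is [0, UINT_MAX], the conservative SIZE - MAX answer clamps to zero for every call, while SIZE - MIN keeps the full 7 bytes and lets the copy length drive the diagnostic.

```c
#include <limits.h>
#include <stddef.h>

/* Illustrative helpers (not GCC code): compute the space left in an
   object of SIZE bytes after an offset, using either the lower or the
   upper bound of the offset's known range, clamped at zero.  */
static size_t
space_using_lower_bound (size_t size, size_t min_off)
{
  return min_off < size ? size - min_off : 0;
}

static size_t
space_using_upper_bound (size_t size, size_t max_off)
{
  return max_off < size ? size - max_off : 0;
}
```

With size = 7 and the range [0, UINT_MAX], the lower-bound answer is 7 (so a strcpy of 8 or more bytes can still be flagged), while the upper-bound answer is 0 for every such call site.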
On 11/19/2017 04:28 PM, Martin Sebor wrote:
> On 11/18/2017 12:53 AM, Jeff Law wrote:
>> [...]
>
> Subtracting the upper bound of the offset from the size instead
> of the lower bound when the caller is asking for the minimum
> object size would make the result essentially meaningless in
> common cases where the offset is smaller than size_t, as in:
>
>   char a[7];
>
>   void f (const char *s, unsigned i)
>   {
>     __builtin_strcpy (a + i, s);
>   }
>
> Here, i's range is [0, UINT_MAX].
>
> IMO, it's only useful to use the lower bound here, otherwise
> the result would only rarely be non-zero.

But when we're asking for the minimum left, aren't we essentially asking "how much space am I absolutely sure I can write"?  And if that is the question, then the only conservatively correct answer is to subtract the high bound.

> This is also what other warnings that deal with ranges do.  For
> -Warray-bounds only considers the lower bound (unless it's negative)
> when deciding whether or not to warn for
>
>   int g (unsigned i)
>   {
>     return a[i];
>   }
>
> It would be too noisy to be of practical use otherwise (even at
> level 2).

Which argues that:

1. Our ranges suck.
2. We're currently avoiding dealing with that by giving answers that are not conservatively correct.

Right?

Jeff
On 11/21/2017 09:55 AM, Jeff Law wrote:
> On 11/19/2017 04:28 PM, Martin Sebor wrote:
>> [...]
>>
>> IMO, it's only useful to use the lower bound here, otherwise
>> the result would only rarely be non-zero.
> But when we're asking for the minimum left, aren't we essentially asking
> for "how much space am I absolutely sure I can write"?  And if that is
> the question, then the only conservatively correct answer is to subtract
> the high bound.

I suppose you could look at it that way, but IME with this work (now, and also last year when I submitted a patch actually changing the built-in), using the upper bound is just not that useful because it's too often way too big.  There's no way to distinguish an out-of-range upper bound that's the result of an inadequate attempt to constrain a value from an out-of-range upper bound that is sufficiently constrained but in a way GCC doesn't see.

There are no clients of this API that would be affected by the decision one way or the other (unless the user specifies a -Wstringop-overflow= argument greater than the default 2), so I don't think what we do now matters much, if at all.  Perhaps it will in the future, with some of the range improvements that you, Aldy, and Andrew have been working on.

That said, if it helps us move forward with this enhancement I'll use the upper bound -- let me know.  In the future, when it is actually used, we'll adjust it as necessary.

> Which argues that:
>
> 1. Our ranges suck
> 2. We're currently avoiding dealing with that by giving answers that are
>    not conservatively correct.
>
> Right?

I don't feel quite as strongly.  Modulo the pesky bugs we get every so often for some of the corner cases or known limitations, the ranges seem reasonably accurate in most cases, and the warnings, for the most part, strike a reasonable balance between false positives and negatives.  But I certainly agree that there is room to improve, and I look forward to taking some of the range enhancements out for a spin.

Martin
On 11/21/2017 12:07 PM, Martin Sebor wrote:
> On 11/21/2017 09:55 AM, Jeff Law wrote:
>> [...]
>
> I suppose you could look at it that way but IME with this work
> (now, and also last year when I submitted a patch actually
> changing the built-in), using the upper bound is just not that
> useful because it's too often way too big.  There's no way to
> distinguish an out-of-range upper bound that's the result of
> an inadequate attempt to constrain a value from an out-of-range
> upper bound that is sufficiently constrained but in a way GCC
> doesn't see.

Understood.

So while it's reasonable to not warn in those cases where we just have crap range information (that's always going to be the case for some code regardless of how good my work or Andrew/Aldy's work is), we have to be very careful and make sure that nobody acts on this information for optimization purposes, because what we're returning is not conservatively correct.

> There are no clients of this API that would be affected by
> the decision one way or the other (unless the user specifies
> a -Wstringop-overflow= argument greater than the default 2)
> so I don't think what we do now matters much, if at all.

Right, but what's to stop someone without knowledge of the implementation and its quirk of not returning the conservatively safe result from using the results in other ways?

What would the impact be of simply not supporting those queries, essentially returning "I don't know" rather than returning something that isn't conservatively correct?

If we have to support both queries and we're going to return something that is not conservatively correct, then it really needs to be documented, to avoid folks using the code in ways that are not safe.

> Perhaps in the future with some of the range improvements
> that you, Aldy and Andrew have been working on.

I think they'll help, but there's always going to be code that can't be analyzed; it's just inherent in the languages we use.

> That said, if it helps us move forward with this enhancement
> I'll use the upper bound -- let me know.  In the future, when
> it is actually used, we'll adjust it as necessary.

Understood.  Let's answer the questions above first.  Namely, do we have to support the other mode?  I thought I saw something that indicated it wasn't ever used that way by the current clients.  So if we can just return "don't know" for those ostype queries, that's a reasonable place to go.  Otherwise, we should look at a clear comment about the limitations of the code to help prevent (ab)use of the analysis in contexts where it's not appropriate.

Jeff
On 11/22/2017 05:03 PM, Jeff Law wrote:
> [...]
>
> Right, but what's to stop someone without knowledge of the
> implementation and its quirk of not returning the conservatively safe
> result from using the results in other ways.

Presumably they would find out by testing their code.  But this is a hypothetical scenario.  I added the function for warnings.  I wasn't expecting it to be used for optimization, no such uses have emerged, and I don't have the impression that anyone is contemplating adding them (certainly not in stage 3).  If you think the function could be useful for optimization then we should certainly consider changing it as we gain experience with it under those conditions.

> What would the impact be of simply not supporting those queries,
> essentially returning "I don't know" rather than returning something
> that isn't conservatively correct?

Except for false negatives with -Wstringop-overflow=3 and =4 (i.e., Object Size types 2 and 3), I don't think there would be any impact.  As I said, the function isn't used for optimization, and I don't think the option is used in these modes either, so in my mind it matters very little.  I don't think there are any tests for it in these modes, either.

> Understood.  Let's answer the questions above first.  Namely, do we have
> to support the other mode?  [...]

The -Wstringop-overflow= option supports all four object size types.  I suspect only types 0 and 1 are used, but I also don't see a pressing need to disable or compromise the others.  If you do, I'm fine with changing the option either to only support modes 0 and 1, or to remove the argument altogether (and use the same defaults as today, i.e., mode 0 for memxxx functions and mode 1 for strxxx functions).  But I see no benefit in changing the function to give up in modes 2 and 3 and have it cause false negatives.  It simply isn't necessary today.

For now, I've added comments to clarify the return value and the purpose of the function.  Let me know if that's good enough or if you want me to make any other changes.

Martin

PR middle-end/77608 - missing protection on trivially detectable runtime buffer overflow

gcc/ChangeLog:

	PR middle-end/77608
	* builtins.c (compute_objsize): Handle non-constant offsets.

gcc/testsuite/ChangeLog:

	PR middle-end/77608
	* gcc.dg/Wstringop-overflow.c: New test.
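For reference, the object size types discussed throughout this thread use the same encoding as __builtin_object_size: bit 1 clear means maximum bytes remaining (types 0/1), bit 1 set means minimum (types 2/3), and bit 0 selects the closest enclosing subobject rather than the whole object.  A minimal sketch (buf and the helper names are made up for illustration; these trivially constant cases fold even without optimization):

```c
#include <stddef.h>

/* Direct addresses into a global array: __builtin_object_size type 0
   reports the maximum number of bytes from the pointer to the end of
   the whole object.  */
static char buf[10];

static size_t
bytes_from_start (void)
{
  /* Whole 10-byte object from its beginning.  */
  return __builtin_object_size (buf, 0);
}

static size_t
bytes_from_offset_4 (void)
{
  /* A constant offset of 4 into the 10-byte object leaves 6 bytes.  */
  return __builtin_object_size (&buf[4], 0);
}
```

The patch under discussion extends the non-constant-offset case that compute_builtin_object_size gives up on, but only for the warning path, not for what the builtin itself returns.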
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 650de0d..e35c514 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3262,8 +3262,12 @@ check_sizes (int opt, tree exp, tree size, tree maxlen, tree src, tree objsize)
 
 /* Helper to compute the size of the object referenced by the DEST
    expression which must have pointer type, using Object Size type
    OSTYPE (only the least significant 2 bits are used).  Return
-   the size of the object if successful or NULL when the size cannot
-   be determined.  */
+   an estimate of the size of the object if successful or NULL when
+   the size cannot be determined.  When the referenced object involves
+   a non-constant offset in some range the returned value represents
+   the largest size given the smallest non-negative offset in the
+   range.  The function is intended for diagnostics and is not
+   suitable for optimization.  */
 
 tree
 compute_objsize (tree dest, int ostype)
@@ -3276,24 +3280,57 @@ compute_objsize (tree dest, int ostype)
   if (compute_builtin_object_size (dest, ostype, &size))
     return build_int_cst (sizetype, size);
 
-  /* Unless computing the largest size (for memcpy and other raw memory
-     functions), try to determine the size of the object from its type.  */
-  if (!ostype)
-    return NULL_TREE;
-
   if (TREE_CODE (dest) == SSA_NAME)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
       if (!is_gimple_assign (stmt))
 	return NULL_TREE;
 
+      dest = gimple_assign_rhs1 (stmt);
+
       tree_code code = gimple_assign_rhs_code (stmt);
-      if (code != ADDR_EXPR && code != POINTER_PLUS_EXPR)
-	return NULL_TREE;
+      if (code == POINTER_PLUS_EXPR)
+	{
+	  /* compute_builtin_object_size fails for addresses with
+	     non-constant offsets.  Try to determine the range of
+	     such an offset here and use it to adjust the constant
+	     size.  */
+	  tree off = gimple_assign_rhs2 (stmt);
+	  if (TREE_CODE (off) == SSA_NAME
+	      && INTEGRAL_TYPE_P (TREE_TYPE (off)))
+	    {
+	      wide_int min, max;
+	      enum value_range_type rng = get_range_info (off, &min, &max);
 
-      dest = gimple_assign_rhs1 (stmt);
+	      if (rng == VR_RANGE)
+		{
+		  if (tree size = compute_objsize (dest, ostype))
+		    {
+		      wide_int wisiz = wi::to_wide (size);
+
+		      /* Ignore negative offsets for now.  For others,
+			 use the lower bound as the most optimistic
+			 estimate of the (remaining) size.  */
+		      if (wi::sign_mask (min))
+			;
+		      else if (wi::ltu_p (min, wisiz))
+			return wide_int_to_tree (TREE_TYPE (size),
+						 wi::sub (wisiz, min));
+		      else
+			return size_zero_node;
+		    }
+		}
+	    }
+	}
+      else if (code != ADDR_EXPR)
+	return NULL_TREE;
     }
 
+  /* Unless computing the largest size (for memcpy and other raw memory
+     functions), try to determine the size of the object from its type.  */
+  if (!ostype)
+    return NULL_TREE;
+
   if (TREE_CODE (dest) != ADDR_EXPR)
     return NULL_TREE;
diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow.c b/gcc/testsuite/gcc.dg/Wstringop-overflow.c
new file mode 100644
index 0000000..b5bd40e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstringop-overflow.c
@@ -0,0 +1,132 @@
+/* PR middle-end/77608 - missing protection on trivially detectable runtime
+   buffer overflow
+   { dg-do compile }
+   { dg-options "-O2 -Wstringop-overflow -ftrack-macro-expansion=0" } */
+
+#define SIZE_MAX __SIZE_MAX__
+#define DIFF_MAX __PTRDIFF_MAX__
+#define DIFF_MIN (-DIFF_MAX - 1)
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void*, const void*, size_t);
+extern char* strcpy (char*, const char*);
+extern char* strncpy (char*, const char*, size_t);
+
+void sink (void*);
+
+extern size_t unsigned_value (void)
+{
+  extern volatile size_t unsigned_value_source;
+  return unsigned_value_source;
+}
+
+size_t unsigned_range (size_t min, size_t max)
+{
+  size_t val = unsigned_value ();
+  return val < min || max < val ? min : val;
+}
+
+#define UR(min, max) unsigned_range (min, max)
+
+
+char a7[7];
+
+struct MemArray { char a9[9]; char a1[1]; };
+
+void test_memcpy_array (const void *s)
+{
+#define T(d, s, n) (memcpy ((d), (s), (n)), sink (d))
+
+  T (a7 + UR (0, 1), s, 7);
+  T (a7 + UR (0, 7), s, 7);
+  T (a7 + UR (0, 8), s, 7);
+  T (a7 + UR (0, DIFF_MAX), s, 7);
+  T (a7 + UR (0, SIZE_MAX), s, 7);
+
+  T (a7 + UR (1, 2), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 6" } */
+  T (a7 + UR (2, 3), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 5" } */
+  T (a7 + UR (6, 9), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 1" } */
+  T (a7 + UR (7, 9), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (8, 9), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+
+  T (a7 + UR (9, 10), s, 7);  /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (DIFF_MAX, DIFF_MAX + (size_t)1), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (DIFF_MAX, SIZE_MAX), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+
+  /* This is valid.  */
+  char *d = a7 + 7;
+  T (d + UR (-8, -7), s, 7);
+}
+
+/* Verify the absence of warnings for memcpy writing beyond object
+   boundaries.  */
+
+void test_memcpy_memarray (struct MemArray *p, const void *s)
+{
+#undef T
+#define T(d, s, n) (memcpy ((d), (s), (n)), sink (d))
+
+  /* The following are valid.  */
+  T (p->a9 + UR (0, 1), s, 9);
+  T (p->a9 + UR (0, 7), s, 9);
+  T (p->a9 + UR (0, 8), s, 9);
+  T (p->a9 + UR (0, DIFF_MAX), s, 9);
+  T (p->a9 + UR (0, SIZE_MAX), s, 9);
+
+  /* The following are invalid.  Unfortunately, there is apparently enough
+     code out there that abuses memcpy to write past the end of one member
+     and into the members that follow so the following are not diagnosed
+     by design.  It sure would be nice not to have to cater to hacks like
+     these...
*/ + T (p->a9 + UR (1, 2), s, 9); + T (p->a9 + UR (1, 2), s, 123); +} + + +void test_strcpy_array (void) +{ +#undef T +#define T(d, s) (strcpy ((d), (s)), sink (d)) + + T (a7 + UR (0, 1), "012345"); + T (a7 + UR (0, 7), "012345"); + T (a7 + UR (0, 8), "012345"); + T (a7 + UR (0, DIFF_MAX), "012345"); + T (a7 + UR (0, SIZE_MAX), "012345"); + + T (a7 + UR (1, 2), "012345"); /* { dg-warning "writing 7 bytes into a region of size 6" } */ + T (a7 + UR (2, 3), "012345"); /* { dg-warning "writing 7 bytes into a region of size 5" } */ + T (a7 + UR (6, 9), "012345"); /* { dg-warning "writing 7 bytes into a region of size 1" } */ + T (a7 + UR (7, 9), "012345"); /* { dg-warning "writing 7 bytes into a region of size 0" } */ + T (a7 + UR (8, 9), "012345"); /* { dg-warning "writing 7 bytes into a region of size 0" } */ + + T (a7 + UR (9, 10), "012345"); /* { dg-warning "writing 7 bytes into a region of size 0" } */ + T (a7 + UR (DIFF_MAX, DIFF_MAX + (size_t)1), "012345"); /* { dg-warning "writing 7 bytes into a region of size 0" } */ + T (a7 + UR (DIFF_MAX, SIZE_MAX), "012345"); /* { dg-warning "writing 7 bytes into a region of size 0" } */ + + char *d = a7 + 7; + + T (d + UR (-8, -7), "012345"); +} + +void test_strncpy_memarray (struct MemArray *p, const void *s) +{ +#undef T +#define T(d, s, n) (strncpy ((d), (s), (n)), sink (d)) + + T (p->a9 + UR (0, 1), s, 9); + T (p->a9 + UR (0, 7), s, 9); + T (p->a9 + UR (0, 8), s, 9); + T (p->a9 + UR (0, DIFF_MAX), s, 9); + T (p->a9 + UR (0, SIZE_MAX), s, 9); + + T (p->a9 + UR (1, 2), s, 9); /* { dg-warning "writing 9 bytes into a region of size 8" } */ + T (p->a9 + UR (2, 3), s, 9); /* { dg-warning "writing 9 bytes into a region of size 7" } */ + T (p->a9 + UR (6, 9), s, 9); /* { dg-warning "writing 9 bytes into a region of size 3" } */ + T (p->a9 + UR (9, 10), s, 9); /* { dg-warning "writing 9 bytes into a region of size 0" } */ + T (p->a9 + UR (10, 11), s, 9); /* { dg-warning "writing 9 bytes into a region of size 0" } */ + + T (p->a9 
+ UR (DIFF_MAX, DIFF_MAX + (size_t)1), s, 1); /* { dg-warning "writing 1 byte into a region of size 0" } */ + T (p->a9 + UR (DIFF_MAX, SIZE_MAX), s, 3); /* { dg-warning "writing 3 bytes into a region of size 0" } */ +}
On 11/30/2017 01:30 PM, Martin Sebor wrote: > On 11/22/2017 05:03 PM, Jeff Law wrote: >> On 11/21/2017 12:07 PM, Martin Sebor wrote: >>> On 11/21/2017 09:55 AM, Jeff Law wrote: >>>> On 11/19/2017 04:28 PM, Martin Sebor wrote: >>>>> On 11/18/2017 12:53 AM, Jeff Law wrote: >>>>>> On 11/17/2017 12:36 PM, Martin Sebor wrote: >>>>>>> The attached patch enhances -Wstringop-overflow to detect more >>>>>>> instances of buffer overflow at compile time by handling non- >>>>>>> constant offsets into the destination object that are known to >>>>>>> be in some range. The solution could be improved by handling >>>>>>> even more cases (e.g., anti-ranges or offsets relative to >>>>>>> pointers beyond the beginning of an object) but it's a start. >>>>>>> >>>>>>> In addition to bootsrapping/regtesting GCC, also tested with >>>>>>> Binutils/GDB, Glibc, and the Linux kernel on x86_64 with no >>>>>>> regressions. >>>>>>> >>>>>>> Martin >>>>>>> >>>>>>> The top of GDB fails to compile at the moment so the validation >>>>>>> there was incomplete. >>>>>>> >>>>>>> gcc-77608.diff >>>>>>> >>>>>>> >>>>>>> PR middle-end/77608 - missing protection on trivially detectable >>>>>>> runtime buffer overflow >>>>>>> >>>>>>> gcc/ChangeLog: >>>>>>> >>>>>>> PR middle-end/77608 >>>>>>> * builtins.c (compute_objsize): Handle non-constant offsets. >>>>>>> >>>>>>> gcc/testsuite/ChangeLog: >>>>>>> >>>>>>> PR middle-end/77608 >>>>>>> * gcc.dg/Wstringop-overflow.c: New test. >>>>>> The recursive call into compute_objsize passing in the ostype avoids >>>>>> having to think about the whole object vs nearest containing object >>>>>> issues. Right? >>>>>> >>>>>> What's left to worry about is maximum or minimum remaining bytes >>>>>> in the >>>>>> object. At least that's my understanding of how ostype works here. >>>>>> >>>>>> So we get the amount remaining, ignoring the variable offset, from >>>>>> the >>>>>> recursive call (SIZE). 
The space left after we account for the >>>>>> variable >>>>>> offset is [SIZE - MAX, SIZE - MIN]. So ISTM for type 0/1 you have to >>>>>> return SIZE-MIN (which you do) and for type 2/3 you have to return >>>>>> SIZE-MAX which I think you get wrong (and you have to account for the >>>>>> possibility that MAX or MIN is greater than SIZE and thus there's >>>>>> nothing left). >>>>> >>>>> Subtracting the upper bound of the offset from the size instead >>>>> of the lower bound when the caller is asking for the minimum >>>>> object size would make the result essentially meaningless in >>>>> common cases where the offset is smaller than size_t, as in: >>>>> >>>>> char a[7]; >>>>> >>>>> void f (const char *s, unsigned i) >>>>> { >>>>> __builtin_strcpy (a + i, s); >>>>> } >>>>> >>>>> Here, i's range is [0, UINT_MAX]. >>>>> >>>>> IMO, it's only useful to use the lower bound here, otherwise >>>>> the result would only rarely be non-zero. >>>> But when we're asking for the minimum left, aren't we essentially >>>> asking >>>> for "how much space am I absolutely sure I can write"? And if that is >>>> the question, then the only conservatively correct answer is to >>>> subtract >>>> the high bound. >>> >>> I suppose you could look at it that way but IME with this work >>> (now, and also last year when I submitted a patch actually >>> changing the built-in), using the upper bound is just not that >>> useful because it's too often way too big. There's no way to >>> distinguish an out-of-range upper bound that's the result of >>> an inadequate attempt to constrain a value from an out-of-range >>> upper bound that is sufficiently constrained but in a way GCC >>> doesn't see. >> Understood. 
>> >> So while it's reasonable to not warn in those cases where we just have >> crap range information (that's always going to be the case for some code >> regardless of how good my work or Andrew/Aldy's work is), we have to be >> very careful and make sure that nobody acts on this information for >> optimization purposes because what we're returning is not conservatively >> correct. >> >> >>> >>> There are no clients of this API that would be affected by >>> the decision one way or the other (unless the user specifies >>> a -Wstringop-overflow= argument greater than the default 2) >>> so I don't think what we do now matters much, if at all. >> Right, but what's to stop someone without knowledge of the >> implementation and its quirk of not returning the conservatively safe >> result from using the results in other ways. > > Presumably they would find out by testing their code. But this > is a hypothetical scenario. I added the function for warnings. > I wasn't expecting it to be used for optimization, no such uses > have emerged, and I don't have the impression that anyone is > contemplating adding them (certainly not in stage 3). If you > think the function could be useful for optimization then we > should certainly consider changing it as we gain experience > with it under those conditions. Merely passing tests does not mean the code is correct, we both have the war stories and scars to prove it. :-) Hell, I prove it to myself nearly daily :( Furthermore, just because nobody is using the function today in an optimization context does not mean it will always be the case. Worse yet, someone could potentially use a caller of compute_objsize without knowing about the limitations. In fact, if I look at how we handle expand_builtin_mempcpy we have: /* Avoid expanding mempcpy into memcpy when the call is determined to overflow the buffer. This also prevents the same overflow from being diagnosed again when expanding memcpy. 
*/ if (!check_memop_sizes (exp, dest, src, len)) return NULL_RTX; While that's not strictly meant to be an optimization, it is in effect changing the code we generate based on the return value of check_memop_sizes, which comes from compute_objsize. Thankfully, the worst that happens here is we'll fail to turn the mempcpy into a memcpy if compute_objsize/check_memop_sizes returns a value that is not strictly correct. But I think it highlights how easy it is to end up having code generation changing based on the results of compute_objsize. >> What would the impact be of simply not supporting those queries, >> essentially returning "I don't know" rather than returning something >> that isn't conservatively correct? > > Except for false negatives with -Wstringop-overflow=3 and =4 > (i.e., Object Size type 2 and 3) I don't think there would be > any impact. As I said, the function isn't used for optimization > and I don't think the option is used in these modes either, so > in my mind it matters very little. I don't even think there > are any tests for it in these modes, either. So based on my research around the mempcpy stuff above, my thinking is refined a bit. Inherently the result of compute_objsize is inexact when presented with a non-constant offset (and it may be in other contexts as well). That introduces a level of risk in that callers may well end up using the result of compute_objsize in ways that are unsafe. The best way I know of to deal with that is to make the result a multi-state so that we can distinguish between the exact results and inexact results. I.e., the fact that the function returns an inexact result gets baked into its API. It's a lot harder to misuse. I could possibly live with a pair of comments though. 
In compute_objsize we'd want a comment clarifying that the result is inexact and discourage using the result to make code generation decisions of any kind (i.e., more general than just discouraging using the return value for optimization purposes). In the mempcpy code we'd want a comment indicating why it's safe to use the result the way we do. Essentially if compute_objsize returns true, then we transform the result into a memcpy + return value adjustment, otherwise we just call mempcpy. In both cases we copy the same data and the final return value is the same. > > For now, I've added comments to clarify the return value and > the purpose of the function. Let me know if that's good enough > or if you want me to make any other changes. So would you rather go with baking the inexact nature of the return value into the API or the pair of comments noted above? jeff
On 12/01/2017 01:26 AM, Jeff Law wrote: > On 11/30/2017 01:30 PM, Martin Sebor wrote: >> On 11/22/2017 05:03 PM, Jeff Law wrote: >>> On 11/21/2017 12:07 PM, Martin Sebor wrote: >>>> On 11/21/2017 09:55 AM, Jeff Law wrote: >>>>> On 11/19/2017 04:28 PM, Martin Sebor wrote: >>>>>> On 11/18/2017 12:53 AM, Jeff Law wrote: >>>>>>> On 11/17/2017 12:36 PM, Martin Sebor wrote: >>>>>>>> The attached patch enhances -Wstringop-overflow to detect more >>>>>>>> instances of buffer overflow at compile time by handling non- >>>>>>>> constant offsets into the destination object that are known to >>>>>>>> be in some range. The solution could be improved by handling >>>>>>>> even more cases (e.g., anti-ranges or offsets relative to >>>>>>>> pointers beyond the beginning of an object) but it's a start. >>>>>>>> >>>>>>>> In addition to bootsrapping/regtesting GCC, also tested with >>>>>>>> Binutils/GDB, Glibc, and the Linux kernel on x86_64 with no >>>>>>>> regressions. >>>>>>>> >>>>>>>> Martin >>>>>>>> >>>>>>>> The top of GDB fails to compile at the moment so the validation >>>>>>>> there was incomplete. >>>>>>>> >>>>>>>> gcc-77608.diff >>>>>>>> >>>>>>>> >>>>>>>> PR middle-end/77608 - missing protection on trivially detectable >>>>>>>> runtime buffer overflow >>>>>>>> >>>>>>>> gcc/ChangeLog: >>>>>>>> >>>>>>>> PR middle-end/77608 >>>>>>>> * builtins.c (compute_objsize): Handle non-constant offsets. >>>>>>>> >>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>> >>>>>>>> PR middle-end/77608 >>>>>>>> * gcc.dg/Wstringop-overflow.c: New test. >>>>>>> The recursive call into compute_objsize passing in the ostype avoids >>>>>>> having to think about the whole object vs nearest containing object >>>>>>> issues. Right? >>>>>>> >>>>>>> What's left to worry about is maximum or minimum remaining bytes >>>>>>> in the >>>>>>> object. At least that's my understanding of how ostype works here. 
>>>>>>> >>>>>>> So we get the amount remaining, ignoring the variable offset, from >>>>>>> the >>>>>>> recursive call (SIZE). The space left after we account for the >>>>>>> variable >>>>>>> offset is [SIZE - MAX, SIZE - MIN]. So ISTM for type 0/1 you have to >>>>>>> return SIZE-MIN (which you do) and for type 2/3 you have to return >>>>>>> SIZE-MAX which I think you get wrong (and you have to account for the >>>>>>> possibility that MAX or MIN is greater than SIZE and thus there's >>>>>>> nothing left). >>>>>> >>>>>> Subtracting the upper bound of the offset from the size instead >>>>>> of the lower bound when the caller is asking for the minimum >>>>>> object size would make the result essentially meaningless in >>>>>> common cases where the offset is smaller than size_t, as in: >>>>>> >>>>>> char a[7]; >>>>>> >>>>>> void f (const char *s, unsigned i) >>>>>> { >>>>>> __builtin_strcpy (a + i, s); >>>>>> } >>>>>> >>>>>> Here, i's range is [0, UINT_MAX]. >>>>>> >>>>>> IMO, it's only useful to use the lower bound here, otherwise >>>>>> the result would only rarely be non-zero. >>>>> But when we're asking for the minimum left, aren't we essentially >>>>> asking >>>>> for "how much space am I absolutely sure I can write"? And if that is >>>>> the question, then the only conservatively correct answer is to >>>>> subtract >>>>> the high bound. >>>> >>>> I suppose you could look at it that way but IME with this work >>>> (now, and also last year when I submitted a patch actually >>>> changing the built-in), using the upper bound is just not that >>>> useful because it's too often way too big. There's no way to >>>> distinguish an out-of-range upper bound that's the result of >>>> an inadequate attempt to constrain a value from an out-of-range >>>> upper bound that is sufficiently constrained but in a way GCC >>>> doesn't see. >>> Understood. 
>>> >>> So while it's reasonable to not warn in those cases where we just have >>> crap range information (that's always going to be the case for some code >>> regardless of how good my work or Andrew/Aldy's work is), we have to be >>> very careful and make sure that nobody acts on this information for >>> optimization purposes because what we're returning is not conservatively >>> correct. >>> >>> >>>> >>>> There are no clients of this API that would be affected by >>>> the decision one way or the other (unless the user specifies >>>> a -Wstringop-overflow= argument greater than the default 2) >>>> so I don't think what we do now matters much, if at all. >>> Right, but what's to stop someone without knowledge of the >>> implementation and its quirk of not returning the conservatively safe >>> result from using the results in other ways. >> >> Presumably they would find out by testing their code. But this >> is a hypothetical scenario. I added the function for warnings. >> I wasn't expecting it to be used for optimization, no such uses >> have emerged, and I don't have the impression that anyone is >> contemplating adding them (certainly not in stage 3). If you >> think the function could be useful for optimization then we >> should certainly consider changing it as we gain experience >> with it under those conditions. > Merely passing tests does not mean the code is correct, we both have the > war stories and scars to prove it. :-) Hell, I prove it to myself > nearly daily :( > > Furthermore, just because nobody is using the function today in an > optimization context does not mean it will always be the case. Worse > yet, someone could potentially use a caller of compute_objsize without > knowing about the limitations. > > In fact, if I look at how we handle expand_builtin_mempcpy we have: > > /* Avoid expanding mempcpy into memcpy when the call is determined > to overflow the buffer. 
This also prevents the same overflow > from being diagnosed again when expanding memcpy. */ > if (!check_memop_sizes (exp, dest, src, len)) > return NULL_RTX; > > While that's not strictly meant to be an optimization, it is in effect > changing the code we generate based on the return value of > check_memop_sizes, which comes from compute_objsize. Thankfully, the > worst that happens here is we'll fail to turn the mempcpy into a memcpy > if compute_objsize/check_memop_sizes returns a value that is not > strictly correct. But I think it highlights how easy it is to end up > having code generation changing based on the results of compute_objsize. I think you have misread the code (which is easy to do), so I'm not sure it does highlight it. The mempcpy code unconditionally uses Object Size type 0 (see check_memop_sizes) so it cannot be used the way you are concerned about. All raw memory built-ins that write into memory do. The string functions do use other Object Size modes, depending on the -Wstringop-overflow=N argument (by default mode 1), but they don't use it to disable the expansion. Even if any of the existing functions did use the function to disable the expansion, because compute_objsize uses the lower bound of the offset, the result is the upper bound of the size and so a call is only considered invalid if it definitely is out-of-bounds. In this scenario, using the upper bound would have the effect of disabling the expansion even in cases where they are not provably in bounds (though only in Object Size modes 2 and 3, which I suspect are never used). I.e., it would have an adverse impact not only in terms of false positives but also on the expansion. If we were talking about changing __builtin_object_size (we're not), using the upper bound would cause string function calls with an in-bounds lower bound and an out-of-bounds upper bound to abort with _FORTIFY_SOURCE. 
Conversely, giving up by returning a "don't know" result in this case would fail to prevent provable buffer overflows. I believe that where ranges are involved, it only makes sense to use the conservative bound (which may be the lower or the upper bound, depending on the circumstances). But it just isn't viable to use the other one because it leads to wrong (suboptimal) results. >>> What would the impact be of simply not supporting those queries, >>> essentially returning "I don't know" rather than returning something >>> that isn't conservatively correct? >> >> Except for false negatives with -Wstringop-overflow=3 and =4 >> (i.e., Object Size type 2 and 3) I don't think there would be >> any impact. As I said, the function isn't used for optimization >> and I don't think the option is used in these modes either, so >> in my mind it matters very little. I don't even think there >> are any tests for it in these modes, either. > SO based on my research around the mempcpy stuff above, my thinking is > refined a bit. > > Inherently the result of compute_objsize is inexact when presented with > a non-constant offset (and it may be in other contexts as well). That > introduces a level of risk in that callers may well end up using the > result of compute_objsize in ways that are unsafe. > > THe best way I know of to deal with that is to make the result a > multi-state so that we can distinguish between the exact results and > inexact results. ie, the fact that the function returns an inexact > result gets baked into its API. It's a lot harder to mis-use. The result already is multi-state: NULL means "don't know," which is interpreted as valid. > I could possibly live with a pair of comments though. In > compute_objsize we'd want a comment clarifying that the result is > inexact and discourage using the result to make code generation > decisions of any kind (ie, more general than just discouraging using the > return value for optimization purposes). 
> > In the mempcpy code we'd want a comment indicating why it's safe to use > the result the way we do. Essentially if compute_objsize returns true, > then we transform the result into a memcpy + return value adjustment, > otherwise we just call mempcpy. In both cases we copy the same data and > the final return value is the same. > >> >> For now, I've added comments to clarify the return value and >> the purpose of the function. Let me know if that's good enough >> or if you want me to make any other changes. > SO would you rather go with baking the inexact nature of the return > value into the API or the pair of comments noted above? I added the comment to compute_objsize in the last iteration of the patch. As I explained above, expand_builtin_mempcpy isn't affected by this patch. The comments above and within the function already explain what happens when a buffer overflow is detected and why, so I'm not sure what else to say there. If I misunderstood your suggestion, please clarify. The other change you are suggesting means restoring the false negatives in Object Size modes 2 and 3 where my patch detected true positives. Here's an example to demonstrate the effect. With my original patch, the overflow in both functions below is diagnosed with all -Wstringop-overflow=N arguments. With the change you asked for, only the memcpy overflow is diagnosed with all arguments. The strncpy overflow is only diagnosed with N=1 and 2, and suppressed with N=3 and N=4. That's clearly a worse outcome, but in the interest of moving forward I've made the change. I think the overall improvement is worthwhile despite this flaw. 
$ cat a.c && gcc -O2 -S -Wstringop-overflow=3 a.c char d[7]; void f (const void *s, unsigned i) { if (i < 3 || 9 < i) i = 3; __builtin_memcpy (d + i, s, 5); // diagnosed } void g (const char *s, unsigned i) { if (i < 3 || 9 < i) i = 3; __builtin_strncpy (d + i, s, 5); // false negative } a.c: In function ‘f’: a.c:8:3: warning: ‘__builtin_memcpy’ writing 5 bytes into a region of size 4 overflows the destination [-Wstringop-overflow=] __builtin_memcpy (d + i, s, 5); // diagnosed ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Martin PR middle-end/77608 - missing protection on trivially detectable runtime buffer overflow gcc/ChangeLog: PR middle-end/77608 * builtins.c (compute_objsize): Handle non-constant offsets. gcc/testsuite/ChangeLog: PR middle-end/77608 * gcc.dg/Wstringop-overflow.c: New test. diff --git a/gcc/builtins.c b/gcc/builtins.c index 650de0d..0565eaf 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -3262,8 +3262,12 @@ check_sizes (int opt, tree exp, tree size, tree maxlen, tree src, tree objsize) /* Helper to compute the size of the object referenced by the DEST expression which must have pointer type, using Object Size type OSTYPE (only the least significant 2 bits are used). Return - the size of the object if successful or NULL when the size cannot - be determined. */ + an estimate of the size of the object if successful or NULL when + the size cannot be determined. When the referenced object involves + a non-constant offset in some range the returned value represents + the largest size given the smallest non-negative offset in the + range. The function is intended for diagnostics and is not + suitable for optimization. */ tree compute_objsize (tree dest, int ostype) @@ -3276,24 +3280,68 @@ compute_objsize (tree dest, int ostype) if (compute_builtin_object_size (dest, ostype, &size)) return build_int_cst (sizetype, size); - /* Unless computing the largest size (for memcpy and other raw memory - functions), try to determine the size of the object from its type. 
*/ - if (!ostype) - return NULL_TREE; - if (TREE_CODE (dest) == SSA_NAME) { gimple *stmt = SSA_NAME_DEF_STMT (dest); if (!is_gimple_assign (stmt)) return NULL_TREE; + dest = gimple_assign_rhs1 (stmt); + tree_code code = gimple_assign_rhs_code (stmt); - if (code != ADDR_EXPR && code != POINTER_PLUS_EXPR) - return NULL_TREE; + if (code == POINTER_PLUS_EXPR) + { + /* compute_builtin_object_size fails for addresses with + non-constant offsets. Try to determine the range of + such an offset here and use it to adjust the constant + size. */ + tree off = gimple_assign_rhs2 (stmt); + if (TREE_CODE (off) == SSA_NAME + && INTEGRAL_TYPE_P (TREE_TYPE (off))) + { + wide_int min, max; + enum value_range_type rng = get_range_info (off, &min, &max); - dest = gimple_assign_rhs1 (stmt); + if (rng == VR_RANGE) + { + if (tree size = compute_objsize (dest, ostype)) + { + wide_int wisiz = wi::to_wide (size); + + /* Ignore negative offsets for now. */ + if (wi::sign_mask (min)) + ; + else if (ostype > 1) + { + /* In Object Size modes 2 and 3, return zero + if the remaining size is definitely zero, + otherwise return a "don't know" result. */ + if (wi::leu_p (wisiz, min)) + return size_zero_node; + + return NULL_TREE; + } + else if (wi::ltu_p (min, wisiz)) + /* In Object Size modes 0 and 1, use the lower + bound of the offset to compute the most + optimistic estimate of the (remaining) size. */ + return wide_int_to_tree (TREE_TYPE (size), + wi::sub (wisiz, min)); + else + return size_zero_node; + } + } + } + } + else if (code != ADDR_EXPR) + return NULL_TREE; } + /* Unless computing the largest size (for memcpy and other raw memory + functions), try to determine the size of the object from its type. 
On 12/01/2017 11:06 AM, Martin Sebor wrote:
> On 12/01/2017 01:26 AM, Jeff Law wrote:
>> On 11/30/2017 01:30 PM, Martin Sebor wrote:
>>> On 11/22/2017 05:03 PM, Jeff Law wrote:
>>>> On 11/21/2017 12:07 PM, Martin Sebor wrote:
>>>>> On 11/21/2017 09:55 AM, Jeff Law wrote:
>>>>>> On 11/19/2017 04:28 PM, Martin Sebor wrote:
>>>>>>> On 11/18/2017 12:53 AM, Jeff Law wrote:
>>>>>>>> On 11/17/2017 12:36 PM, Martin Sebor wrote:
[ Lots of snipping ]
>>
>> In fact, if I look at how we handle expand_builtin_mempcpy we have:
>>
>>   /* Avoid expanding mempcpy into memcpy when the call is determined
>>      to overflow the buffer.  This also prevents the same overflow
>>      from being diagnosed again when expanding memcpy.  */
>>   if (!check_memop_sizes (exp, dest, src, len))
>>     return NULL_RTX;
>>
>> While that's not strictly meant to be an optimization, it is in effect
>> changing the code we generate based on the return value of
>> check_memop_sizes, which comes from compute_objsize.  Thankfully, the
>> worst that happens here is we'll fail to turn the mempcpy into a memcpy
>> if compute_objsize/check_memop_sizes returns a value that is not
>> strictly correct.  But I think it highlights how easy it is to end up
>> having code generation changing based on the results of compute_objsize.
>
> I think you have misread the code (which is easy to do), so I'm
> not sure it does highlight it.

Just to be clear, I'm convinced the code should DTRT as-written and that
its behavior is not changed by your patch.

My point is that these routines should not generally be used to influence
code generation/optimization decisions.  In an ideal world we could
express and enforce that rule.  But we don't live in that ideal world,
and as a result it's relatively easy to use those routines to influence
code generation decisions without even knowing it.
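As an aside, the reason the worst case above is benign: mempcpy and memcpy copy exactly the same bytes and differ only in the value they return, so skipping the lowering never changes observable behavior. A minimal sketch of the equivalence (the helper name `mempcpy_via_memcpy` is ours, not GCC's):

```c
#include <string.h>

/* mempcpy (d, s, n) behaves like memcpy (d, s, n) except that it
   returns d + n rather than d; lowering one to the other changes
   only how the return value is formed, never what is copied.  */
static void *
mempcpy_via_memcpy (void *d, const void *s, size_t n)
{
  return (char *) memcpy (d, s, n) + n;
}
```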
So again, anytime we have something like what we see above, where we
call check_memop_sizes/compute_objsize and the result is used to select
between paths that change code generation/optimization, we need a
comment.  We also need a comment in compute_objsize to discourage its
use in any contexts where the result affects code
generation/optimization.

Again, the code is safe as-is and not affected by your patch.  I'm just
asking for a comment for those functions and at any site where those
functions are used to influence code generation/optimization decisions.

[ More snipping.. Hopefully not losing too much context.. ]

>> So would you rather go with baking the inexact nature of the return
>> value into the API or the pair of comments noted above?
>
> I added the comment to compute_objsize in the last iteration
> of the patch.
>
> As I explained above, expand_builtin_mempcpy isn't affected by
> this patch.  The comments above and within the function already
> explain what happens when a buffer overflow is detected and why,
> so I'm not sure what else to say there.  If I misunderstood your
> suggestion please clarify.

The comment I'm looking for in expand_builtin_mempcpy would be
something like this:

  /* Policy does not generally allow using compute_objsize (which is
     used internally by check_memop_sizes) to change code generation
     or drive optimization decisions.  In this instance it is safe
     because the code we generate has the same semantics regardless of
     the return value of check_memop_sizes.  Exactly the same amount
     of data is copied and the return value is exactly the same in
     both cases.  Furthermore, check_memop_sizes always uses mode 0
     for the call to compute_objsize, so the imprecise nature of
     compute_objsize is avoided.  */

> The other change you are suggesting means restoring the false
> negatives in Object Size modes 2 and 3 where my patch detected
> true positives.  Here's an example to demonstrate the effect.
> With my original patch, the overflow in both functions below
> is diagnosed with all -Wstringop-overflow=N arguments.  With
> the change you asked for, only the memcpy overflow is diagnosed
> with all arguments.  The strncpy overflow is only diagnosed
> with N=1 and N=2, and suppressed with N=3 and N=4.  That's clearly
> a worse outcome, but in the interest of moving forward I've made
> the change.  I think the overall improvement is worthwhile despite
> this flaw.

So given the comments about restrictions on how compute_objsize is
used, I don't think we need to make the change I originally asked for.
Go with whatever you think is best WRT handling of that bound
computation.  Either computation of that bound has a degree of
imprecision due to the range involvement, which makes the result
unsuitable for influencing code generation or optimization.

In the compute_objsize comment I'd use "The function is intended for
diagnostics and should not be used to influence code generation or
optimization."  I think that's slightly better.

OK with either computation and the updated comment for compute_objsize.
Please add a comment to the mempcpy bits as well.

Jeff
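The two candidate bound computations being weighed here can be modeled outside of GCC in plain C (a simplification of ours: plain size_t arithmetic in place of wide_int, negative offsets ignored). Modes 0/1 report the maximum space remaining, modes 2/3 the minimum:

```c
#include <stddef.h>

/* Model: a destination points into an object with SIZE bytes left,
   at a variable offset known to be in [MIN, MAX].  Object-size modes
   0/1 want the maximum remaining space (SIZE - MIN); modes 2/3 want
   the minimum (SIZE - MAX).  Either result clamps to zero when the
   offset may point at or past the end of the object.  */
static size_t
remaining_size (size_t size, size_t min, size_t max, int ostype)
{
  size_t off = (ostype & 2) ? max : min;
  return off < size ? size - off : 0;
}
```

For a 7-byte array and an offset in [1, 2], the model gives 6 bytes in mode 0 but 5 in mode 2, illustrating why the two computations diverge for the stricter modes.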
PR middle-end/77608 - missing protection on trivially detectable runtime buffer overflow

gcc/ChangeLog:

	PR middle-end/77608
	* builtins.c (compute_objsize): Handle non-constant offsets.

gcc/testsuite/ChangeLog:

	PR middle-end/77608
	* gcc.dg/Wstringop-overflow.c: New test.

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 254879)
+++ gcc/builtins.c	(working copy)
@@ -3276,11 +3276,6 @@ compute_objsize (tree dest, int ostype)
   if (compute_builtin_object_size (dest, ostype, &size))
     return build_int_cst (sizetype, size);

-  /* Unless computing the largest size (for memcpy and other raw memory
-     functions), try to determine the size of the object from its type.  */
-  if (!ostype)
-    return NULL_TREE;
-
   if (TREE_CODE (dest) == SSA_NAME)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
@@ -3287,13 +3282,47 @@ compute_objsize (tree dest, int ostype)
       if (!is_gimple_assign (stmt))
	return NULL_TREE;

+      dest = gimple_assign_rhs1 (stmt);
+
       tree_code code = gimple_assign_rhs_code (stmt);
-      if (code != ADDR_EXPR && code != POINTER_PLUS_EXPR)
+      if (code == POINTER_PLUS_EXPR)
+	{
+	  /* compute_builtin_object_size fails for addresses with
+	     non-constant offsets.  Try to determine the range of
+	     such an offset here and use it to adjust the constant
+	     size.  */
+	  tree off = gimple_assign_rhs2 (stmt);
+	  if (TREE_CODE (off) == SSA_NAME
+	      && INTEGRAL_TYPE_P (TREE_TYPE (off)))
+	    {
+	      wide_int min, max;
+	      enum value_range_type rng = get_range_info (off, &min, &max);
+
+	      if (rng == VR_RANGE)
+		{
+		  if (tree size = compute_objsize (dest, ostype))
+		    {
+		      wide_int wisiz = wi::to_wide (size);
+		      if (wi::sign_mask (min))
+			/* ignore negative offsets for now */;
+		      else if (wi::ltu_p (min, wisiz))
+			return wide_int_to_tree (TREE_TYPE (size),
+						 wi::sub (wisiz, min));
+		      else
+			return size_zero_node;
+		    }
+		}
+	    }
+	}
+      else if (code != ADDR_EXPR)
	return NULL_TREE;
-
-      dest = gimple_assign_rhs1 (stmt);
     }

+  /* Unless computing the largest size (for memcpy and other raw memory
+     functions), try to determine the size of the object from its type.  */
+  if (!ostype)
+    return NULL_TREE;
+
   if (TREE_CODE (dest) != ADDR_EXPR)
     return NULL_TREE;
Index: gcc/testsuite/gcc.dg/Wstringop-overflow.c
===================================================================
--- gcc/testsuite/gcc.dg/Wstringop-overflow.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wstringop-overflow.c	(working copy)
@@ -0,0 +1,132 @@
+/* PR middle-end/77608 - missing protection on trivially detectable runtime
+   buffer overflow
+   { dg-do compile }
+   { dg-options "-O2 -Wstringop-overflow -ftrack-macro-expansion=0" } */
+
+#define SIZE_MAX __SIZE_MAX__
+#define DIFF_MAX __PTRDIFF_MAX__
+#define DIFF_MIN (-DIFF_MAX - 1)
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void*, const void*, size_t);
+extern char* strcpy (char*, const char*);
+extern char* strncpy (char*, const char*, size_t);
+
+void sink (void*);
+
+extern size_t unsigned_value (void)
+{
+  extern volatile size_t unsigned_value_source;
+  return unsigned_value_source;
+}
+
+size_t unsigned_range (size_t min, size_t max)
+{
+  size_t val = unsigned_value ();
+  return val < min || max < val ? min : val;
+}
+
+#define UR(min, max) unsigned_range (min, max)
+
+
+char a7[7];
+
+struct MemArray { char a9[9]; char a1[1]; };
+
+void test_memcpy_array (const void *s)
+{
+#define T(d, s, n) (memcpy ((d), (s), (n)), sink (d))
+
+  T (a7 + UR (0, 1), s, 7);
+  T (a7 + UR (0, 7), s, 7);
+  T (a7 + UR (0, 8), s, 7);
+  T (a7 + UR (0, DIFF_MAX), s, 7);
+  T (a7 + UR (0, SIZE_MAX), s, 7);
+
+  T (a7 + UR (1, 2), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 6" } */
+  T (a7 + UR (2, 3), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 5" } */
+  T (a7 + UR (6, 9), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 1" } */
+  T (a7 + UR (7, 9), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (8, 9), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+
+  T (a7 + UR (9, 10), s, 7);  /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (DIFF_MAX, DIFF_MAX + (size_t)1), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (DIFF_MAX, SIZE_MAX), s, 7);   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+
+  /* This is valid.  */
+  char *d = a7 + 7;
+  T (d + UR (-8, -7), s, 7);
+}
+
+/* Verify the absence of warnings for memcpy writing beyond object
+   boundaries.  */
+
+void test_memcpy_memarray (struct MemArray *p, const void *s)
+{
+#undef T
+#define T(d, s, n) (memcpy ((d), (s), (n)), sink (d))
+
+  /* The following are valid.  */
+  T (p->a9 + UR (0, 1), s, 9);
+  T (p->a9 + UR (0, 7), s, 9);
+  T (p->a9 + UR (0, 8), s, 9);
+  T (p->a9 + UR (0, DIFF_MAX), s, 9);
+  T (p->a9 + UR (0, SIZE_MAX), s, 9);
+
+  /* The following are invalid.  Unfortunately, there is apparently enough
+     code out there that abuses memcpy to write past the end of one member
+     and into the members that follow so the following are not diagnosed
+     by design.  It sure would be nice not to have to cater to hacks like
+     these...  */
+  T (p->a9 + UR (1, 2), s, 9);
+  T (p->a9 + UR (1, 2), s, 123);
+}
+
+
+void test_strcpy_array (void)
+{
+#undef T
+#define T(d, s) (strcpy ((d), (s)), sink (d))
+
+  T (a7 + UR (0, 1), "012345");
+  T (a7 + UR (0, 7), "012345");
+  T (a7 + UR (0, 8), "012345");
+  T (a7 + UR (0, DIFF_MAX), "012345");
+  T (a7 + UR (0, SIZE_MAX), "012345");
+
+  T (a7 + UR (1, 2), "012345");   /* { dg-warning "writing 7 bytes into a region of size 6" } */
+  T (a7 + UR (2, 3), "012345");   /* { dg-warning "writing 7 bytes into a region of size 5" } */
+  T (a7 + UR (6, 9), "012345");   /* { dg-warning "writing 7 bytes into a region of size 1" } */
+  T (a7 + UR (7, 9), "012345");   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (8, 9), "012345");   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+
+  T (a7 + UR (9, 10), "012345");  /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (DIFF_MAX, DIFF_MAX + (size_t)1), "012345");   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+  T (a7 + UR (DIFF_MAX, SIZE_MAX), "012345");   /* { dg-warning "writing 7 bytes into a region of size 0" } */
+
+  char *d = a7 + 7;
+
+  T (d + UR (-8, -7), "012345");
+}
+
+void test_strncpy_memarray (struct MemArray *p, const void *s)
+{
+#undef T
+#define T(d, s, n) (strncpy ((d), (s), (n)), sink (d))
+
+  T (p->a9 + UR (0, 1), s, 9);
+  T (p->a9 + UR (0, 7), s, 9);
+  T (p->a9 + UR (0, 8), s, 9);
+  T (p->a9 + UR (0, DIFF_MAX), s, 9);
+  T (p->a9 + UR (0, SIZE_MAX), s, 9);
+
+  T (p->a9 + UR (1, 2), s, 9);    /* { dg-warning "writing 9 bytes into a region of size 8" } */
+  T (p->a9 + UR (2, 3), s, 9);    /* { dg-warning "writing 9 bytes into a region of size 7" } */
+  T (p->a9 + UR (6, 9), s, 9);    /* { dg-warning "writing 9 bytes into a region of size 3" } */
+  T (p->a9 + UR (9, 10), s, 9);   /* { dg-warning "writing 9 bytes into a region of size 0" } */
+  T (p->a9 + UR (10, 11), s, 9);  /* { dg-warning "writing 9 bytes into a region of size 0" } */
+
+  T (p->a9 + UR (DIFF_MAX, DIFF_MAX + (size_t)1), s, 1);  /* { dg-warning "writing 1 byte into a region of size 0" } */
+  T (p->a9 + UR (DIFF_MAX, SIZE_MAX), s, 3);  /* { dg-warning "writing 3 bytes into a region of size 0" } */
+}
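A note on the test's UR helper: the clamping ternary in unsigned_range is what lets VRP attach the range [min, max] to the offset, which compute_objsize then retrieves via get_range_info. Its runtime behavior, modeled standalone (the name `unsigned_range_model` is ours):

```c
#include <stddef.h>

/* Standalone model of the test's unsigned_range helper: clamp VAL
   into [MIN, MAX], collapsing out-of-range values to MIN.  The
   comparison is what allows value-range propagation to record the
   range [MIN, MAX] for the result.  */
static size_t
unsigned_range_model (size_t val, size_t min, size_t max)
{
  return val < min || max < val ? min : val;
}
```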