diff mbox series

make -Wrestrict for strcat more meaningful (PR 83698)

Message ID b034a2a1-9193-5fd2-016b-da1fbc05d139@gmail.com
State New
Headers show
Series make -Wrestrict for strcat more meaningful (PR 83698) | expand

Commit Message

Martin Sebor Jan. 16, 2018, 8:36 p.m. UTC
PR 83698 - bogus offset in -Wrestrict messages for strcat of
unknown strings, points out that the offsets printed by
-Wrestrict for possibly overlapping strcat calls with unknown
strings don't look meaningful in some cases.  The root cause
of the bogus values is wrapping during the conversion from
offset_int in which the pass tracks numerical values to
HOST_WIDE_INT for printing.  (The problem will go away once
GCC's pretty-printer supports wide int formatting.)  For
instance, the following:

   extern char *d;
   strcat (d + 3, d + 5);

results in

   warning: ‘strcat’ accessing 0 or more bytes at offsets 3 and 5 may 
overlap 1 byte at offset [3, -9223372036854775806]

which, besides printing the bogus negative offset on LP64
targets, isn't correct because strcat always accesses at least
one byte (the nul) and there can be no overlap at offset 3.
To be more accurate, the warning should say something like:

   warning: ‘strcat’ accessing 3 or more bytes at offsets 3 and 5 may 
overlap 1 byte at offset 5 [-Wrestrict]

because the function must access at least 3 bytes in order to
cause an overlap, and when it does, the overlap starts at the
higher of the two offsets, i.e., 5.  (Though it's virtually
impossible to have a single sentence and a singled set of
numbers cover all the cases with perfect accuracy.)

The attached patch fixes these issues to make the printed values
make more sense.  (It doesn't affect when diagnostics are printed.)

Although this isn't strictly a regression, it has an impact on
the readability of the warnings.  If left unchanged, the original
messages are likely to confuse users and lead to bug reports.

Martin

Comments

Jakub Jelinek Jan. 16, 2018, 9:32 p.m. UTC | #1
On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
> --- gcc/gimple-ssa-warn-restrict.c	(revision 256752)
> +++ gcc/gimple-ssa-warn-restrict.c	(working copy)
> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>  	  base = SSA_NAME_VAR (base);
>        }
>  
> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
> +    {
> +      if (offrange[0] < 0 && offrange[1] > 0)
> +	offrange[0] = 0;
> +    }

Why the 2 nested ifs?

> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>      return false;
>  
>    /* When strcat overlap is certain it is always a single byte:
> -     the terminatinn NUL, regardless of offsets and sizes.  When
> +     the terminating NUL, regardless of offsets and sizes.  When
>       overlap is only possible its range is [0, 1].  */
>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>    acs.ovlsiz[1] = 1;
> -  acs.ovloff[0] = (dstref->sizrange[0] + dstref->offrange[0]).to_shwi ();
> -  acs.ovloff[1] = (dstref->sizrange[1] + dstref->offrange[1]).to_shwi ();

You use to_shwi many times in the patch, do the callers or something earlier
in this function guarantee that you aren't throwing away any bits (unlike
tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
Especially when you perform additions like here, even if both wide_ints fit
into a shwi, the result might not.

	Jakub
Martin Sebor Jan. 17, 2018, 12:35 a.m. UTC | #2
On 01/16/2018 02:32 PM, Jakub Jelinek wrote:
> On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
>> --- gcc/gimple-ssa-warn-restrict.c	(revision 256752)
>> +++ gcc/gimple-ssa-warn-restrict.c	(working copy)
>> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>>  	  base = SSA_NAME_VAR (base);
>>        }
>>
>> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
>> +    {
>> +      if (offrange[0] < 0 && offrange[1] > 0)
>> +	offrange[0] = 0;
>> +    }
>
> Why the 2 nested ifs?

No particular reason.  There may have been more code in there
that I ended up removing.  Or a comment.  I can remove the
extra braces when the patch is approved.

>
>> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>>      return false;
>>
>>    /* When strcat overlap is certain it is always a single byte:
>> -     the terminatinn NUL, regardless of offsets and sizes.  When
>> +     the terminating NUL, regardless of offsets and sizes.  When
>>       overlap is only possible its range is [0, 1].  */
>>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>>    acs.ovlsiz[1] = 1;
>> -  acs.ovloff[0] = (dstref->sizrange[0] + dstref->offrange[0]).to_shwi ();
>> -  acs.ovloff[1] = (dstref->sizrange[1] + dstref->offrange[1]).to_shwi ();
>
> You use to_shwi many times in the patch, do the callers or something earlier
> in this function guarantee that you aren't throwing away any bits (unlike
> tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
> Especially when you perform additions like here, even if both wide_ints fit
> into a shwi, the result might not.

No, I'm not sure.  In fact, it wouldn't surprise me if it did
happen.  It doesn't cause false positives or negatives but it
can make the offsets less than meaningful in cases where they
are within valid bounds.  There are also cases where they are
meaningless to begin with and there is little the pass can do
about that.

IMO, the ideal solution to the first problem is to add a format
specifier for wide ints to the pretty printer and get rid of
the conversions.  It's probably too late for something like
that now but I'd like to do it for GCC 9.  Unless someone
files a bug/regression, it's also too late for me to go and
try to find and fix these conversions now.

Martin

PS While looking for a case you asked about I came up with
the following.  I don't think there's any slicing involved
but the offsets are just as meaningless as if there were.
I think the way to do significantly better is to detect
out-of-bounds offsets earlier (e.g., as in this patch:
https://gcc.gnu.org/ml/gcc-patches/2017-10/msg02143.html)

$ cat z.c && gcc -O2 -S -Warray-bounds -m32 z.c
extern int a[];

void f (__PTRDIFF_TYPE__ i)
{
   if (i < __PTRDIFF_MAX__ - 7 || __PTRDIFF_MAX__ - 5 < i)
     i = __PTRDIFF_MAX__ -  7;

   const int *s = a + i;

   __builtin_memcpy (a, &s[i], 3);
}
z.c: In function ‘f’:
z.c:10:3: warning: ‘__builtin_memcpy’ offset [-64, -48] is out of the 
bounds of object ‘a’ with type ‘int[]’ [-Warray-bounds]
    __builtin_memcpy (a, &s[i], 3);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
z.c:1:12: note: ‘a’ declared here
  extern int a[];
             ^
Martin Sebor Jan. 30, 2018, 5:24 p.m. UTC | #3
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html

This is a minor improvement but there have been a couple of
comments lately pointing out that the numbers in the -Wrestrict
messages can make them confusing.

Jakub, since you made one of those comments (the other came up
in the context of bug 84095 - [8 Regression] false-positive
-Wrestrict warnings for memcpy within array).  Can you please
either approve this patch or suggest changes?

On 01/16/2018 05:35 PM, Martin Sebor wrote:
> On 01/16/2018 02:32 PM, Jakub Jelinek wrote:
>> On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
>>> --- gcc/gimple-ssa-warn-restrict.c    (revision 256752)
>>> +++ gcc/gimple-ssa-warn-restrict.c    (working copy)
>>> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>>>        base = SSA_NAME_VAR (base);
>>>        }
>>>
>>> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
>>> +    {
>>> +      if (offrange[0] < 0 && offrange[1] > 0)
>>> +    offrange[0] = 0;
>>> +    }
>>
>> Why the 2 nested ifs?
>
> No particular reason.  There may have been more code in there
> that I ended up removing.  Or a comment.  I can remove the
> extra braces when the patch is approved.
>
>>
>>> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>>>      return false;
>>>
>>>    /* When strcat overlap is certain it is always a single byte:
>>> -     the terminatinn NUL, regardless of offsets and sizes.  When
>>> +     the terminating NUL, regardless of offsets and sizes.  When
>>>       overlap is only possible its range is [0, 1].  */
>>>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>>>    acs.ovlsiz[1] = 1;
>>> -  acs.ovloff[0] = (dstref->sizrange[0] +
>>> dstref->offrange[0]).to_shwi ();
>>> -  acs.ovloff[1] = (dstref->sizrange[1] +
>>> dstref->offrange[1]).to_shwi ();
>>
>> You use to_shwi many times in the patch, do the callers or something
>> earlier
>> in this function guarantee that you aren't throwing away any bits (unlike
>> tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
>> Especially when you perform additions like here, even if both
>> wide_ints fit
>> into a shwi, the result might not.
>
> No, I'm not sure.  In fact, it wouldn't surprise me if it did
> happen.  It doesn't cause false positives or negatives but it
> can make the offsets less than meaningful in cases where they
> are within valid bounds.  There are also cases where they are
> meaningless to begin with and there is little the pass can do
> about that.
>
> IMO, the ideal solution to the first problem is to add a format
> specifier for wide ints to the pretty printer and get rid of
> the conversions.  It's probably too late for something like
> that now but I'd like to do it for GCC 9.  Unless someone
> files a bug/regression, it's also too late for me to go and
> try to find and fix these conversions now.
>
> Martin
>
> PS While looking for a case you asked about I came up with
> the following.  I don't think there's any slicing involved
> but the offsets are just as meaningless as if there were.
> I think the way to do significantly better is to detect
> out-of-bounds offsets earlier (e.g., as in this patch:
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg02143.html)
>
> $ cat z.c && gcc -O2 -S -Warray-bounds -m32 z.c
> extern int a[];
>
> void f (__PTRDIFF_TYPE__ i)
> {
>   if (i < __PTRDIFF_MAX__ - 7 || __PTRDIFF_MAX__ - 5 < i)
>     i = __PTRDIFF_MAX__ -  7;
>
>   const int *s = a + i;
>
>   __builtin_memcpy (a, &s[i], 3);
> }
> z.c: In function ‘f’:
> z.c:10:3: warning: ‘__builtin_memcpy’ offset [-64, -48] is out of the
> bounds of object ‘a’ with type ‘int[]’ [-Warray-bounds]
>    __builtin_memcpy (a, &s[i], 3);
>    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> z.c:1:12: note: ‘a’ declared here
>  extern int a[];
>             ^
>
Martin Sebor Feb. 6, 2018, 3:20 a.m. UTC | #4
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html

As I mentioned, this doesn't solve a regression per se but
rather implements what I consider an important usability
improvement to the -Wrestrict warnings.  Printing offsets
that are the most meaningful makes debugging the problems
the warnings point out easier.

On 01/30/2018 10:24 AM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html
>
> This is a minor improvement but there have been a couple of
> comments lately pointing out that the numbers in the -Wrestrict
> messages can make them confusing.
>
> Jakub, since you made one of those comments (the other came up
> in the context of bug 84095 - [8 Regression] false-positive
> -Wrestrict warnings for memcpy within array).  Can you please
> either approve this patch or suggest changes?
>
> On 01/16/2018 05:35 PM, Martin Sebor wrote:
>> On 01/16/2018 02:32 PM, Jakub Jelinek wrote:
>>> On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
>>>> --- gcc/gimple-ssa-warn-restrict.c    (revision 256752)
>>>> +++ gcc/gimple-ssa-warn-restrict.c    (working copy)
>>>> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>>>>        base = SSA_NAME_VAR (base);
>>>>        }
>>>>
>>>> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
>>>> +    {
>>>> +      if (offrange[0] < 0 && offrange[1] > 0)
>>>> +    offrange[0] = 0;
>>>> +    }
>>>
>>> Why the 2 nested ifs?
>>
>> No particular reason.  There may have been more code in there
>> that I ended up removing.  Or a comment.  I can remove the
>> extra braces when the patch is approved.
>>
>>>
>>>> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>>>>      return false;
>>>>
>>>>    /* When strcat overlap is certain it is always a single byte:
>>>> -     the terminatinn NUL, regardless of offsets and sizes.  When
>>>> +     the terminating NUL, regardless of offsets and sizes.  When
>>>>       overlap is only possible its range is [0, 1].  */
>>>>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>>>>    acs.ovlsiz[1] = 1;
>>>> -  acs.ovloff[0] = (dstref->sizrange[0] +
>>>> dstref->offrange[0]).to_shwi ();
>>>> -  acs.ovloff[1] = (dstref->sizrange[1] +
>>>> dstref->offrange[1]).to_shwi ();
>>>
>>> You use to_shwi many times in the patch, do the callers or something
>>> earlier
>>> in this function guarantee that you aren't throwing away any bits
>>> (unlike
>>> tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
>>> Especially when you perform additions like here, even if both
>>> wide_ints fit
>>> into a shwi, the result might not.
>>
>> No, I'm not sure.  In fact, it wouldn't surprise me if it did
>> happen.  It doesn't cause false positives or negatives but it
>> can make the offsets less than meaningful in cases where they
>> are within valid bounds.  There are also cases where they are
>> meaningless to begin with and there is little the pass can do
>> about that.
>>
>> IMO, the ideal solution to the first problem is to add a format
>> specifier for wide ints to the pretty printer and get rid of
>> the conversions.  It's probably too late for something like
>> that now but I'd like to do it for GCC 9.  Unless someone
>> files a bug/regression, it's also too late for me to go and
>> try to find and fix these conversions now.
>>
>> Martin
>>
>> PS While looking for a case you asked about I came up with
>> the following.  I don't think there's any slicing involved
>> but the offsets are just as meaningless as if there were.
>> I think the way to do significantly better is to detect
>> out-of-bounds offsets earlier (e.g., as in this patch:
>> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg02143.html)
>>
>> $ cat z.c && gcc -O2 -S -Warray-bounds -m32 z.c
>> extern int a[];
>>
>> void f (__PTRDIFF_TYPE__ i)
>> {
>>   if (i < __PTRDIFF_MAX__ - 7 || __PTRDIFF_MAX__ - 5 < i)
>>     i = __PTRDIFF_MAX__ -  7;
>>
>>   const int *s = a + i;
>>
>>   __builtin_memcpy (a, &s[i], 3);
>> }
>> z.c: In function ‘f’:
>> z.c:10:3: warning: ‘__builtin_memcpy’ offset [-64, -48] is out of the
>> bounds of object ‘a’ with type ‘int[]’ [-Warray-bounds]
>>    __builtin_memcpy (a, &s[i], 3);
>>    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> z.c:1:12: note: ‘a’ declared here
>>  extern int a[];
>>             ^
>>
>
Martin Sebor Feb. 13, 2018, 3:15 a.m. UTC | #5
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html

On 02/05/2018 08:20 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html
>
> As I mentioned, this doesn't solve a regression per se but
> rather implements what I consider an important usability
> improvement to the -Wrestrict warnings.  Printing offsets
> that are the most meaningful makes debugging the problems
> the warnings point out easier.
>
> On 01/30/2018 10:24 AM, Martin Sebor wrote:
>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html
>>
>> This is a minor improvement but there have been a couple of
>> comments lately pointing out that the numbers in the -Wrestrict
>> messages can make them confusing.
>>
>> Jakub, since you made one of those comments (the other came up
>> in the context of bug 84095 - [8 Regression] false-positive
>> -Wrestrict warnings for memcpy within array).  Can you please
>> either approve this patch or suggest changes?
>>
>> On 01/16/2018 05:35 PM, Martin Sebor wrote:
>>> On 01/16/2018 02:32 PM, Jakub Jelinek wrote:
>>>> On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
>>>>> --- gcc/gimple-ssa-warn-restrict.c    (revision 256752)
>>>>> +++ gcc/gimple-ssa-warn-restrict.c    (working copy)
>>>>> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr,
>>>>> tree si
>>>>>        base = SSA_NAME_VAR (base);
>>>>>        }
>>>>>
>>>>> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
>>>>> +    {
>>>>> +      if (offrange[0] < 0 && offrange[1] > 0)
>>>>> +    offrange[0] = 0;
>>>>> +    }
>>>>
>>>> Why the 2 nested ifs?
>>>
>>> No particular reason.  There may have been more code in there
>>> that I ended up removing.  Or a comment.  I can remove the
>>> extra braces when the patch is approved.
>>>
>>>>
>>>>> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>>>>>      return false;
>>>>>
>>>>>    /* When strcat overlap is certain it is always a single byte:
>>>>> -     the terminatinn NUL, regardless of offsets and sizes.  When
>>>>> +     the terminating NUL, regardless of offsets and sizes.  When
>>>>>       overlap is only possible its range is [0, 1].  */
>>>>>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>>>>>    acs.ovlsiz[1] = 1;
>>>>> -  acs.ovloff[0] = (dstref->sizrange[0] +
>>>>> dstref->offrange[0]).to_shwi ();
>>>>> -  acs.ovloff[1] = (dstref->sizrange[1] +
>>>>> dstref->offrange[1]).to_shwi ();
>>>>
>>>> You use to_shwi many times in the patch, do the callers or something
>>>> earlier
>>>> in this function guarantee that you aren't throwing away any bits
>>>> (unlike
>>>> tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
>>>> Especially when you perform additions like here, even if both
>>>> wide_ints fit
>>>> into a shwi, the result might not.
>>>
>>> No, I'm not sure.  In fact, it wouldn't surprise me if it did
>>> happen.  It doesn't cause false positives or negatives but it
>>> can make the offsets less than meaningful in cases where they
>>> are within valid bounds.  There are also cases where they are
>>> meaningless to begin with and there is little the pass can do
>>> about that.
>>>
>>> IMO, the ideal solution to the first problem is to add a format
>>> specifier for wide ints to the pretty printer and get rid of
>>> the conversions.  It's probably too late for something like
>>> that now but I'd like to do it for GCC 9.  Unless someone
>>> files a bug/regression, it's also too late for me to go and
>>> try to find and fix these conversions now.
>>>
>>> Martin
>>>
>>> PS While looking for a case you asked about I came up with
>>> the following.  I don't think there's any slicing involved
>>> but the offsets are just as meaningless as if there were.
>>> I think the way to do significantly better is to detect
>>> out-of-bounds offsets earlier (e.g., as in this patch:
>>> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg02143.html)
>>>
>>> $ cat z.c && gcc -O2 -S -Warray-bounds -m32 z.c
>>> extern int a[];
>>>
>>> void f (__PTRDIFF_TYPE__ i)
>>> {
>>>   if (i < __PTRDIFF_MAX__ - 7 || __PTRDIFF_MAX__ - 5 < i)
>>>     i = __PTRDIFF_MAX__ -  7;
>>>
>>>   const int *s = a + i;
>>>
>>>   __builtin_memcpy (a, &s[i], 3);
>>> }
>>> z.c: In function ‘f’:
>>> z.c:10:3: warning: ‘__builtin_memcpy’ offset [-64, -48] is out of the
>>> bounds of object ‘a’ with type ‘int[]’ [-Warray-bounds]
>>>    __builtin_memcpy (a, &s[i], 3);
>>>    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> z.c:1:12: note: ‘a’ declared here
>>>  extern int a[];
>>>             ^
>>>
>>
>
Jeff Law Feb. 14, 2018, 5:36 a.m. UTC | #6
On 01/16/2018 05:35 PM, Martin Sebor wrote:
> On 01/16/2018 02:32 PM, Jakub Jelinek wrote:
>> On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
>>> --- gcc/gimple-ssa-warn-restrict.c    (revision 256752)
>>> +++ gcc/gimple-ssa-warn-restrict.c    (working copy)
>>> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>>>        base = SSA_NAME_VAR (base);
>>>        }
>>>
>>> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
>>> +    {
>>> +      if (offrange[0] < 0 && offrange[1] > 0)
>>> +    offrange[0] = 0;
>>> +    }
>>
>> Why the 2 nested ifs?
> 
> No particular reason.  There may have been more code in there
> that I ended up removing.  Or a comment.  I can remove the
> extra braces when the patch is approved.
> 
>>
>>> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>>>      return false;
>>>
>>>    /* When strcat overlap is certain it is always a single byte:
>>> -     the terminatinn NUL, regardless of offsets and sizes.  When
>>> +     the terminating NUL, regardless of offsets and sizes.  When
>>>       overlap is only possible its range is [0, 1].  */
>>>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>>>    acs.ovlsiz[1] = 1;
>>> -  acs.ovloff[0] = (dstref->sizrange[0] +
>>> dstref->offrange[0]).to_shwi ();
>>> -  acs.ovloff[1] = (dstref->sizrange[1] +
>>> dstref->offrange[1]).to_shwi ();
>>
>> You use to_shwi many times in the patch, do the callers or something
>> earlier
>> in this function guarantee that you aren't throwing away any bits (unlike
>> tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
>> Especially when you perform additions like here, even if both
>> wide_ints fit
>> into a shwi, the result might not.
> 
> No, I'm not sure.  In fact, it wouldn't surprise me if it did
> happen.  It doesn't cause false positives or negatives but it
> can make the offsets less than meaningful in cases where they
> are within valid bounds.  There are also cases where they are
> meaningless to begin with and there is little the pass can do
> about that.
I was kind of expecting an update to try and address some of these
issues.  Though after re-reading your response the consequence of
throwing away bits here is just the diagnostic is not as precise as it
could be, right?  ie, it doesn't change when we issue a diagnostic, just
the contents of the diagnostic.

I filed this into my gcc9 bucket because it doesn't fix a regression,
but it appears that a regression fix does depend on this stuff to some
degree (84095).  So I'll try to take a look at this shortly so that we
can unblock 84095.


Jeff
Jeff Law Feb. 14, 2018, 5:53 a.m. UTC | #7
On 01/16/2018 01:36 PM, Martin Sebor wrote:
> PR 83698 - bogus offset in -Wrestrict messages for strcat of
> unknown strings, points out that the offsets printed by
> -Wrestrict for possibly overlapping strcat calls with unknown
> strings don't look meaningful in some cases.  The root cause
> of the bogus values is wrapping during the conversion from
> offset_int in which the pass tracks numerical values to
> HOST_WIDE_INT for printing.  (The problem will go away once
> GCC's pretty-printer supports wide int formatting.)  For
> instance, the following:
> 
>   extern char *d;
>   strcat (d + 3, d + 5);
> 
> results in
> 
>   warning: ‘strcat’ accessing 0 or more bytes at offsets 3 and 5 may
> overlap 1 byte at offset [3, -9223372036854775806]
> 
> which, besides printing the bogus negative offset on LP64
> targets, isn't correct because strcat always accesses at least
> one byte (the nul) and there can be no overlap at offset 3.
> To be more accurate, the warning should say something like:
> 
>   warning: ‘strcat’ accessing 3 or more bytes at offsets 3 and 5 may
> overlap 1 byte at offset 5 [-Wrestrict]
> 
> because the function must access at least 3 bytes in order to
> cause an overlap, and when it does, the overlap starts at the
> higher of the two offsets, i.e., 5.  (Though it's virtually
> impossible to have a single sentence and a singled set of
> numbers cover all the cases with perfect accuracy.)
> 
> The attached patch fixes these issues to make the printed values
> make more sense.  (It doesn't affect when diagnostics are printed.)
> 
> Although this isn't strictly a regression, it has an impact on
> the readability of the warnings.  If left unchanged, the original
> messages are likely to confuse users and lead to bug reports.
> 
> Martin
> 
> gcc-83698.diff
> 
> 
> PR tree-optimization/83698 - bogus offset in -Wrestrict messages for strcat of unknown strings
> 
> gcc/ChangeLog:
> 
> 	PR tree-optimization/83698
> 	* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): For
> 	arrays constrain the offset range to their bounds.
> 	(builtin_access::strcat_overlap): Adjust the bounds of overlap offset.
> 	(builtin_access::overlap): Avoid setting the size of overlap if it's
> 	already been set.
> 	(maybe_diag_overlap): Also consider arrays when deciding what values
> 	of offsets to include in diagnostics.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR tree-optimization/83698
> 	* gcc.dg/Wrestrict-7.c: New test.
> 	* c-c++-common/Wrestrict.c: Adjust expected values for strcat.
> 	* gcc.target/i386/chkp-stropt-17.c: Same.
> 
> Index: gcc/gimple-ssa-warn-restrict.c
> ===================================================================
> --- gcc/gimple-ssa-warn-restrict.c	(revision 256752)
> +++ gcc/gimple-ssa-warn-restrict.c	(working copy)
> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>  	  base = SSA_NAME_VAR (base);
>        }
>  
> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
> +    {
> +      if (offrange[0] < 0 && offrange[1] > 0)
> +	offrange[0] = 0;
> +    }
> +
Comment before this code so that it's more obvious to the reader that
you're narrowing the range of the low bound because it's an array
(presumably Ada's virtual origins aren't a concern here).

Note some of the preceding context has changed, but I don't think it
materially affects your patch, only the ability to apply it cleanly.

Jakub had a nit with that hunk (unnecessarily nested conditional).  I'm
going to assume you'll address that.

I concur that the changes to strcat_overlap Jakub was concerned about
merely affect the diagnostic output, not whether or not we emit a
diagnostic.  I'm going to assume that even though we may drop bits that
the net is better with this patch rather than without.

OK for the trunk with the comment and nit fixed.

jeff
diff mbox series

Patch

PR tree-optimization/83698 - bogus offset in -Wrestrict messages for strcat of unknown strings

gcc/ChangeLog:

	PR tree-optimization/83698
	* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): For
	arrays constrain the offset range to their bounds.
	(builtin_access::strcat_overlap): Adjust the bounds of overlap offset.
	(builtin_access::overlap): Avoid setting the size of overlap if it's
	already been set.
	(maybe_diag_overlap): Also consider arrays when deciding what values
	of offsets to include in diagnostics.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83698
	* gcc.dg/Wrestrict-7.c: New test.
	* c-c++-common/Wrestrict.c: Adjust expected values for strcat.
	* gcc.target/i386/chkp-stropt-17.c: Same.

Index: gcc/gimple-ssa-warn-restrict.c
===================================================================
--- gcc/gimple-ssa-warn-restrict.c	(revision 256752)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -384,6 +384,12 @@  builtin_memref::builtin_memref (tree expr, tree si
 	  base = SSA_NAME_VAR (base);
       }
 
+  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
+    {
+      if (offrange[0] < 0 && offrange[1] > 0)
+	offrange[0] = 0;
+    }
+
   if (size)
     {
       tree range[2];
@@ -1079,14 +1085,35 @@  builtin_access::strcat_overlap ()
     return false;
 
   /* When strcat overlap is certain it is always a single byte:
-     the terminatinn NUL, regardless of offsets and sizes.  When
+     the terminating NUL, regardless of offsets and sizes.  When
      overlap is only possible its range is [0, 1].  */
   acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
   acs.ovlsiz[1] = 1;
-  acs.ovloff[0] = (dstref->sizrange[0] + dstref->offrange[0]).to_shwi ();
-  acs.ovloff[1] = (dstref->sizrange[1] + dstref->offrange[1]).to_shwi ();
 
-  acs.sizrange[0] = wi::smax (acs.dstsiz[0], srcref->sizrange[0]).to_shwi ();
+  offset_int endoff = dstref->offrange[0] + dstref->sizrange[0];
+  if (endoff <= srcref->offrange[0])
+    acs.ovloff[0] = wi::smin (maxobjsize, srcref->offrange[0]).to_shwi ();
+  else
+    acs.ovloff[0] = wi::smin (maxobjsize, endoff).to_shwi ();
+
+  acs.sizrange[0] = wi::smax (wi::abs (endoff - srcref->offrange[0]) + 1,
+			      srcref->sizrange[0]).to_shwi ();
+  if (dstref->offrange[0] == dstref->offrange[1])
+    {
+      if (srcref->offrange[0] == srcref->offrange[1])
+	acs.ovloff[1] = acs.ovloff[0];
+      else
+	acs.ovloff[1]
+	  = wi::smin (maxobjsize,
+		      srcref->offrange[1] + srcref->sizrange[1]).to_shwi ();
+    }
+  else
+    acs.ovloff[1]
+      = wi::smin (maxobjsize,
+		  dstref->offrange[1] + dstref->sizrange[1]).to_shwi ();
+
+  if (acs.sizrange[0] == 0)
+    acs.sizrange[0] = 1;
   acs.sizrange[1] = wi::smax (acs.dstsiz[1], srcref->sizrange[1]).to_shwi ();
   return true;
 }
@@ -1224,8 +1251,12 @@  builtin_access::overlap ()
   /* Call the appropriate function to determine the overlap.  */
   if ((this->*detect_overlap) ())
     {
-      sizrange[0] = wi::smax (acs.dstsiz[0], srcref->sizrange[0]).to_shwi ();
-      sizrange[1] = wi::smax (acs.dstsiz[1], srcref->sizrange[1]).to_shwi ();
+      if (!sizrange[1])
+	{
+	  /* Unless the access size range has already been set, do so here.  */
+	  sizrange[0] = wi::smax (acs.dstsiz[0], srcref->sizrange[0]).to_shwi ();
+	  sizrange[1] = wi::smax (acs.dstsiz[1], srcref->sizrange[1]).to_shwi ();
+	}
       return true;
     }
 
@@ -1401,10 +1432,17 @@  maybe_diag_overlap (location_t loc, gcall *call, b
   /* Use more concise wording when one of the offsets is unbounded
      to avoid confusing the user with large and mostly meaningless
      numbers.  */
-  bool open_range = ((dstref.offrange[0] == -maxobjsize - 1
-		      && dstref.offrange[1] == maxobjsize)
-		     || (srcref.offrange[0] == -maxobjsize - 1
-			 && srcref.offrange[1] == maxobjsize));
+  bool open_range;
+  if (DECL_P (dstref.base) && TREE_CODE (TREE_TYPE (dstref.base)) == ARRAY_TYPE)
+    open_range = ((dstref.offrange[0] == 0
+		   && dstref.offrange[1] == maxobjsize)
+		  || (srcref.offrange[0] == 0
+		      && srcref.offrange[1] == maxobjsize));
+  else
+    open_range = ((dstref.offrange[0] == -maxobjsize - 1
+		   && dstref.offrange[1] == maxobjsize)
+		  || (srcref.offrange[0] == -maxobjsize - 1
+		      && srcref.offrange[1] == maxobjsize));
 
   if (sizrange[0] == sizrange[1] || sizrange[1] == 1)
     {
Index: gcc/testsuite/c-c++-common/Wrestrict.c
===================================================================
--- gcc/testsuite/c-c++-common/Wrestrict.c	(revision 256752)
+++ gcc/testsuite/c-c++-common/Wrestrict.c	(working copy)
@@ -616,11 +616,14 @@  void test_strcat_var (char *d, const char *s)
   } while (0)
 
   T (d, d);                       /* { dg-warning "source argument is the same as destination" "strcat" } */
-  T (d, d + 1);                   /* { dg-warning "accessing 0 or more bytes at offsets 0 and 1 may overlap 1 byte" "strcat" } */
-  T (d, d + 2);                   /* { dg-warning "accessing 0 or more bytes at offsets 0 and 2 may overlap 1 byte" "strcat" } */
-  T (d, d + 999);                 /* { dg-warning "accessing 0 or more bytes at offsets 0 and 999 may overlap 1 byte" "strcat" } */
-  T (d, d + -99);                 /* { dg-warning "accessing 0 or more bytes at offsets 0 and -99 may overlap 1 byte" "strcat" } */
+  T (d, d + 1);                   /* { dg-warning "accessing 2 or more bytes at offsets 0 and 1 may overlap 1 byte" "strcat" } */
+  T (d, d + 2);                   /* { dg-warning "accessing 3 or more bytes at offsets 0 and 2 may overlap 1 byte at offset 2" "strcat" } */
+  T (d, d + 999);                 /* { dg-warning "accessing 1000 or more bytes at offsets 0 and 999 may overlap 1 byte at offset 999" "strcat" } */
 
+  /* The source string must be at least 100 bytes in length for the copy
+     below to overlap.  */
+  T (d, d + -99);                 /* { dg-warning "accessing 100 or more bytes at offsets 0 and -99 may overlap 1 byte" "strcat" } */
+
   size_t n = unsigned_value ();
 
   T (d + n, d + n);                       /* { dg-warning "\\\[-Wrestrict" "strcat" } */
Index: gcc/testsuite/gcc.dg/Wrestrict-7.c
===================================================================
--- gcc/testsuite/gcc.dg/Wrestrict-7.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-7.c	(working copy)
@@ -0,0 +1,51 @@ 
+/* PR tree-optimization/83698 - bogus offset in -Wrestrict messages for
+   strcat of unknown strings
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict -ftrack-macro-expansion=0" } */
+
+extern char* strcat (char*, const char*);
+
+void sink (char*);
+
+#define T(d, s) sink (strcat (d, s))
+
+extern char arr[];
+
+
+void test_strcat_array_cst_offset (void)
+{
+  T (arr, arr + 1);           /* { dg-warning "accessing 2 or more bytes at offsets 0 and 1 may overlap 1 byte at offset 1" } */
+  T (arr, arr + 2);           /* { dg-warning "accessing 3 or more bytes at offsets 0 and 2 may overlap 1 byte at offset 2" } */
+  T (arr, arr + 13);          /* { dg-warning "accessing 14 or more bytes at offsets 0 and 13 may overlap 1 byte at offset 13" } */
+
+  T (arr + 1, arr);           /* { dg-warning "accessing 2 or more bytes at offsets 1 and 0 may overlap 1 byte at offset 1" } */
+  T (arr + 17, arr + 11);     /* { dg-warning "accessing 7 or more bytes at offsets 17 and 11 may overlap 1 byte at offset 17" } */
+  T (arr + 36, arr + 20);     /* { dg-warning "accessing 17 or more bytes at offsets 36 and 20 may overlap 1 byte at offset 36" } */
+}
+
+void test_strcat_ptr_cst_offset (char *d)
+{
+  T (d - 12, d + 34);         /* { dg-warning "accessing 47 or more bytes at offsets -12 and 34 may overlap 1 byte at offset 34" } */
+  T (d + 12, d + 34);         /* { dg-warning "accessing 23 or more bytes at offsets 12 and 34 may overlap 1 byte at offset 34" } */
+  T (d + 20, d + 36);         /* { dg-warning "accessing 17 or more bytes at offsets 20 and 36 may overlap 1 byte at offset 36" } */
+}
+
+void test_strcat_array_var_offset (int i, int j)
+{
+  T (arr + i, arr);           /* { dg-warning "accessing 1 or more bytes at offsets \\\[0, \[0-9\]+] and 0 may overlap 1 byte at offset \\\[0, \[0-9\]+]" } */
+  T (arr, arr + j);           /* { dg-warning "accessing 1 or more bytes at offsets 0 and \\\[0, \[0-9\]+] may overlap 1 byte at offset \\\[0, \[0-9\]+]" } */
+  T (arr + i, arr + j);       /* { dg-warning "accessing 1 or more bytes at offsets \\\[0, \[0-9\]+] and \\\[0, \[0-9\]+] may overlap 1 byte at offset \\\[0, \[0-9\]+]" } */
+
+  T (arr + i, arr + 5);       /* { dg-warning "accessing 6 or more bytes at offsets \\\[0, \[0-9\]+] and 5 may overlap 1 byte at offset \\\[5, \[0-9\]+]" } */
+  T (arr + 7, arr + j);       /* { dg-warning "accessing 8 or more bytes at offsets 7 and \\\[0, \[0-9\]+] may overlap 1 byte at offset \\\[7, \[0-9\]+]" } */
+}
+
+void test_strcat_ptr_var_offset (char *d, int i, int j)
+{
+  T (d + i, d);               /* { dg-warning "accessing \[0-9\]+ or more bytes at offsets \\\[-\[0-9\]+, \[0-9\]+] and 0 may overlap 1 byte at offset \\\[0, \[0-9\]+]" } */
+  T (d, d + j);               /* { dg-warning "accessing \[0-9\]+ or more bytes at offsets 0 and \\\[-\[0-9\]+, \[0-9\]+] may overlap 1 byte at offset \\\[0, \[0-9\]+]" } */
+  T (d + i, d + j);           /* { dg-warning "accessing 1 or more bytes at offsets \\\[-\[0-9\]+, \[0-9\]+] and \\\[-\[0-9\]+, \[0-9\]+] may overlap 1 byte at offset \\\[-\[0-9\]+, \[0-9\]+]" } */
+
+  T (d + i, d + 3);           /* { dg-warning "accessing \[0-9\]+ or more bytes at offsets \\\[-\[0-9\]+, \[0-9\]+] and 3 may overlap 1 byte at offset \\\[3, \[0-9\]+]" } */
+  T (d + 9, d + j);           /* { dg-warning "accessing \[0-9\]+ or more bytes at offsets 9 and \\\[-\[0-9\]+, \[0-9\]+] may overlap 1 byte at offset \\\[9, \[0-9\]+]" } */
+}
Index: gcc/testsuite/gcc.target/i386/chkp-stropt-17.c
===================================================================
--- gcc/testsuite/gcc.target/i386/chkp-stropt-17.c	(revision 256752)
+++ gcc/testsuite/gcc.target/i386/chkp-stropt-17.c	(working copy)
@@ -51,7 +51,7 @@  void test_strcpy (void)
 
 void test_strcat (int n)
 {
-  strcat (a, a + 3);   /* { dg-warning ".strcat\.chkp. accessing 0 or more bytes at offsets 0 and 3 may overlap 1 byte" } */
+  strcat (a, a + 3);   /* { dg-warning ".strcat\.chkp. accessing 4 or more bytes at offsets 0 and 3 may overlap 1 byte at offset 3" } */
 }
 
 void test_strncat (int n)