diff mbox series

relax -Wsizeof-pointer-memaccess for strncpy with size of source (PR 85931)

Message ID da9a3582-33e1-6f0a-cb91-36b0e699806d@gmail.com
State New
Headers show
Series relax -Wsizeof-pointer-memaccess for strncpy with size of source (PR 85931) | expand

Commit Message

Martin Sebor May 30, 2018, 2:57 a.m. UTC
Warning for a strncpy call whose bound is the same as the size
of the source and suggesting to use the size of the source is
less than helpful when both sizes are the same, as in:

   char a[4], b[4];
   strncpy (a, b, sizeof b);

The attached patch suppresses the -Wsizeof-pointer-memaccess
warning for these cases.  To do that even for VLAs (in some
cases), the patch enhances operand_equal_p() to handle
SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
of a variable-length array.

Is this okay for trunk and GCC 8?

Martin

Comments

Martin Sebor June 4, 2018, 11:50 p.m. UTC | #1
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01698.html

On 05/29/2018 08:57 PM, Martin Sebor wrote:
> Warning for a strncpy call whose bound is the same as the size
> of the source and suggesting to use the size of the source is
> less than helpful when both sizes are the same, as in:
>
>   char a[4], b[4];
>   strncpy (a, b, sizeof b);
>
> The attached patch suppresses the -Wsizeof-pointer-memaccess
> warning for these cases.  To do that even for VLAs (in some
> cases), the patch enhances operand_equal_p() to handle
> SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
> of a variable-length array.
>
> Is this okay for trunk and GCC 8?
>
> Martin
Martin Sebor June 11, 2018, 5:35 p.m. UTC | #2
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01698.html

On 06/04/2018 05:50 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01698.html
>
> On 05/29/2018 08:57 PM, Martin Sebor wrote:
>> Warning for a strncpy call whose bound is the same as the size
>> of the source and suggesting to use the size of the source is
>> less than helpful when both sizes are the same, as in:
>>
>>   char a[4], b[4];
>>   strncpy (a, b, sizeof b);
>>
>> The attached patch suppresses the -Wsizeof-pointer-memaccess
>> warning for these cases.  To do that even for VLAs (in some
>> cases), the patch enhances operand_equal_p() to handle
>> SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
>> of a variable-length array.
>>
>> Is this okay for trunk and GCC 8?
>>
>> Martin
>
Jeff Law June 11, 2018, 9:57 p.m. UTC | #3
On 05/29/2018 08:57 PM, Martin Sebor wrote:
> Warning for a strncpy call whose bound is the same as the size
> of the source and suggesting to use the size of the source is
> less than helpful when both sizes are the same, as in:
> 
>   char a[4], b[4];
>   strncpy (a, b, sizeof b);
> 
> The attached patch suppresses the -Wsizeof-pointer-memaccess
> warning for these cases.  To do that even for VLAs (in some
> cases), the patch enhances operand_equal_p() to handle
> SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
> of a variable-length array.
> 
> Is this okay for trunk and GCC 8?
> 
> Martin
> 
> gcc-85931.diff
> 
> 
> PR c/85931 -  -Wsizeof-pointer-memaccess for strncpy with size of source
> 
> gcc/c-family/ChangeLog:
> 
> 	PR c/85931
> 	* c-warn.c (sizeof_pointer_memaccess_warning): Avoid warning when
> 	sizeof source and destination yields the same value.
> 
> gcc/ChangeLog:
> 
> 	PR c/85931
> 	* fold-const.c (operand_equal_p): Handle SAVE_EXPR.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR c/85931
> 	* gcc.dg/Wstringop-truncation-3.c: New test.
OK for the trunk.  Richi and Jakub have the final say for the branch.

I'm a bit surprised that you don't just use operand_equal_p for both the
INTEGER_CST and !INTEGER_CST cases when testing dstsz == srcsz

Jeff
Martin Sebor June 12, 2018, 5:17 p.m. UTC | #4
On 06/11/2018 03:57 PM, Jeff Law wrote:
> On 05/29/2018 08:57 PM, Martin Sebor wrote:
>> Warning for a strncpy call whose bound is the same as the size
>> of the source and suggesting to use the size of the source is
>> less than helpful when both sizes are the same, as in:
>>
>>   char a[4], b[4];
>>   strncpy (a, b, sizeof b);
>>
>> The attached patch suppresses the -Wsizeof-pointer-memaccess
>> warning for these cases.  To do that even for VLAs (in some
>> cases), the patch enhances operand_equal_p() to handle
>> SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
>> of a variable-length array.
>>
>> Is this okay for trunk and GCC 8?
>>
>> Martin
>>
>> gcc-85931.diff
>>
>>
>> PR c/85931 -  -Wsizeof-pointer-memaccess for strncpy with size of source
>>
>> gcc/c-family/ChangeLog:
>>
>> 	PR c/85931
>> 	* c-warn.c (sizeof_pointer_memaccess_warning): Avoid warning when
>> 	sizeof source and destination yields the same value.
>>
>> gcc/ChangeLog:
>>
>> 	PR c/85931
>> 	* fold-const.c (operand_equal_p): Handle SAVE_EXPR.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR c/85931
>> 	* gcc.dg/Wstringop-truncation-3.c: New test.
> OK for the trunk.  Richi and Jakub have the final say for the branch.
>
> I'm a bit surprised that you don't just use operand_equal_p for both the
> INTEGER_CST and !INTEGER_CST cases when testing dstsz == srcsz

It seemed that since the CST case compared the values of the size
expressions and the other case their lexicographic form they needed
to be different.  But operand_equal_p() does value comparison for
constants so the conditional can be simplified by using just it
for both cases.  Thanks for the suggestion!  I've made that
simplification and committed r261515.

Jakub/Richard, can you please review the commit and let me know
if you have any concerns with backporting it to the GCC 8 branch?
(I will proceed if I don't hear anything this week.)

Thanks
Martin
Richard Biener June 13, 2018, 7:46 a.m. UTC | #5
On Tue, 12 Jun 2018, Martin Sebor wrote:

> On 06/11/2018 03:57 PM, Jeff Law wrote:
> > On 05/29/2018 08:57 PM, Martin Sebor wrote:
> > > Warning for a strncpy call whose bound is the same as the size
> > > of the source and suggesting to use the size of the source is
> > > less than helpful when both sizes are the same, as in:
> > > 
> > >   char a[4], b[4];
> > >   strncpy (a, b, sizeof b);
> > > 
> > > The attached patch suppresses the -Wsizeof-pointer-memaccess
> > > warning for these cases.  To do that even for VLAs (in some
> > > cases), the patch enhances operand_equal_p() to handle
> > > SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
> > > of a variable-length array.
> > > 
> > > Is this okay for trunk and GCC 8?
> > > 
> > > Martin
> > > 
> > > gcc-85931.diff
> > > 
> > > 
> > > PR c/85931 -  -Wsizeof-pointer-memaccess for strncpy with size of source
> > > 
> > > gcc/c-family/ChangeLog:
> > > 
> > > 	PR c/85931
> > > 	* c-warn.c (sizeof_pointer_memaccess_warning): Avoid warning when
> > > 	sizeof source and destination yields the same value.
> > > 
> > > gcc/ChangeLog:
> > > 
> > > 	PR c/85931
> > > 	* fold-const.c (operand_equal_p): Handle SAVE_EXPR.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > > 	PR c/85931
> > > 	* gcc.dg/Wstringop-truncation-3.c: New test.
> > OK for the trunk.  Richi and Jakub have the final say for the branch.
> > 
> > I'm a bit surprised that you don't just use operand_equal_p for both the
> > INTEGER_CST and !INTEGER_CST cases when testing dstsz == srcsz
> 
> It seemed that since the CST case compared the values of the size
> expressions and the other case their lexicographic form they needed
> to be different.  But operand_equal_p() does value comparison for
> constants so the conditional can be simplified by using just it
> for both cases.  Thanks for the suggestion!  I've made that
> simplification and committed r261515.
> 
> Jakub/Richard, can you please review the commit and let me know
> if you have any concerns with backporting it to the GCC 8 branch?
> (I will proceed if I don't hear anything this week.)

I'm fine with backporting it.

Richard.
Jakub Jelinek June 13, 2018, 7:58 a.m. UTC | #6
On Wed, Jun 13, 2018 at 09:46:28AM +0200, Richard Biener wrote:
> > > > gcc/ChangeLog:
> > > > 
> > > > 	PR c/85931
> > > > 	* fold-const.c (operand_equal_p): Handle SAVE_EXPR.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > 	PR c/85931
> > > > 	* gcc.dg/Wstringop-truncation-3.c: New test.
> > > OK for the trunk.  Richi and Jakub have the final say for the branch.
> > > 
> > > I'm a bit surprised that you don't just use operand_equal_p for both the
> > > INTEGER_CST and !INTEGER_CST cases when testing dstsz == srcsz
> > 
> > It seemed that since the CST case compared the values of the size
> > expressions and the other case their lexicographic form they needed
> > to be different.  But operand_equal_p() does value comparison for
> > constants so the conditional can be simplified by using just it
> > for both cases.  Thanks for the suggestion!  I've made that
> > simplification and committed r261515.
> > 
> > Jakub/Richard, can you please review the commit and let me know
> > if you have any concerns with backporting it to the GCC 8 branch?
> > (I will proceed if I don't hear anything this week.)
> 
> I'm fine with backporting it.

I'm actually worried about the fold-const.c change and don't understand, why
it has been done at all in the context of this PR.

        case SAVE_EXPR:
          if (flags & OEP_LEXICOGRAPHIC)
            return OP_SAME (0);
          return 0;

means it does something different from the state before the patch:
	default:
	  return 0;
only if OEP_LEXICOGRAPHIC is set, which AFAIK is only in the
-Wduplicated-branches warning code, so why it is needed for
-Wstringop-truncation warning is unclear to me.  More importantly,
especially -fsanitize=undefined has a tendency of creating trees
which contain two or more uses of the same SAVE_EXPR at each level, and very
deep nesting levels of such SAVE_EXPRs, so if we handle it without checking
for duplicates, it results in exponential compile time complexity.
So, if we really want to handle SAVE_EXPR here, we'd need to create some
cache where we'd remember if in the same toplevel operand_equal_p call we've
already compared two particular SAVE_EXPRs and what was the result.

	Jakub
Jakub Jelinek June 13, 2018, 8:22 a.m. UTC | #7
On Wed, Jun 13, 2018 at 09:58:49AM +0200, Jakub Jelinek wrote:
> I'm actually worried about the fold-const.c change and don't understand, why
> it has been done at all in the context of this PR.
> 
>         case SAVE_EXPR:
>           if (flags & OEP_LEXICOGRAPHIC)
>             return OP_SAME (0);
>           return 0;
> 
> means it does something different from the state before the patch:
> 	default:
> 	  return 0;
> only if OEP_LEXICOGRAPHIC is set, which AFAIK is only in the
> -Wduplicated-branches warning code, so why it is needed for
> -Wstringop-truncation warning is unclear to me.  More importantly,
> especially -fsanitize=undefined has a tendency of creating trees
> which contain two or more uses of the same SAVE_EXPR at each level, and very
> deep nesting levels of such SAVE_EXPRs, so if we handle it without checking
> for duplicates, it results in exponential compile time complexity.
> So, if we really want to handle SAVE_EXPR here, we'd need to create some
> cache where we'd remember if in the same toplevel operand_equal_p call we've
> already compared two particular SAVE_EXPRs and what was the result.

Random testcase for -Wduplicated-branches -fsanitize=shift:
int
foo (int x, int y)
{
  if (x)
    y = y << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
          << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
          << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
          << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1;
  else
    y = y << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
          << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
          << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
          << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1;
  return y;
}

Though it seems we have that problem already in inchash::add_expr.  In that
case perhaps we could have a pointer to a hashmap in inchash::hash objects,
clear it in the ctors and destroy/clear in inchash::hash::end (), though we
have the add_commutative that has two separate hash objects.

	Jakub
Jakub Jelinek June 13, 2018, 8:44 a.m. UTC | #8
On Wed, Jun 13, 2018 at 10:22:29AM +0200, Jakub Jelinek wrote:
> Random testcase for -Wduplicated-branches -fsanitize=shift:
> int
> foo (int x, int y)
> {
>   if (x)
>     y = y << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
>           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
>           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
>           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1;
>   else
>     y = y << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
>           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
>           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
>           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1;
>   return y;
> }
> 
> Though it seems we have that problem already in inchash::add_expr.  In that
> case perhaps we could have a pointer to a hashmap in inchash::hash objects,
> clear it in the ctors and destroy/clear in inchash::hash::end (), though we
> have the add_commutative that has two separate hash objects.

It isn't specific to just -fsanitize=undefined, even without that there are
cases we can end up with lots of nested SAVE_EXPRs, like
-Wduplicated-branches:
int bar (void);

void
foo (int x, int *y)
{
  if (x)
    y[0] += (y[1] += (y[2] += (y[3] += (y[4] += (y[5] += (y[6] += (y[7] += (y[8] +=
	    (y[9] += (y[10] += (y[11] += (y[12] += (y[13] += (y[14] += (y[15] += (y[16] +=
	    (y[17] += (y[18] += (y[19] += (y[20] += (y[21] += (y[22] += (y[23] += (y[24] +=
	    (y[25] += (y[26] += (y[27] += (y[28] += (y[29] += (y[30] += (y[31] += (y[32] +=
	    (y[33] += (y[34] += (y[35] += (y[36] += (y[37] += (y[38] += (y[39] += (y[40] +=
	    (y[41] += (y[42] += (y[43] += (y[44] += (y[45] += (y[46] += (y[47] += (y[48] +=
	    (y[49] += (y[50] += (y[51] += (y[52] += (y[53] += (y[54] += (y[55] += (y[56] +=
	    (y[57] += (y[58] += (y[59] += (y[60] += (y[61] += (y[62] += (y[63] += (y[64] += bar ()
	    ))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));
  else
    y[0] += (y[1] += (y[2] += (y[3] += (y[4] += (y[5] += (y[6] += (y[7] += (y[8] +=
	    (y[9] += (y[10] += (y[11] += (y[12] += (y[13] += (y[14] += (y[15] += (y[16] +=
	    (y[17] += (y[18] += (y[19] += (y[20] += (y[21] += (y[22] += (y[23] += (y[24] +=
	    (y[25] += (y[26] += (y[27] += (y[28] += (y[29] += (y[30] += (y[31] += (y[32] +=
	    (y[33] += (y[34] += (y[35] += (y[36] += (y[37] += (y[38] += (y[39] += (y[40] +=
	    (y[41] += (y[42] += (y[43] += (y[44] += (y[45] += (y[46] += (y[47] += (y[48] +=
	    (y[49] += (y[50] += (y[51] += (y[52] += (y[53] += (y[54] += (y[55] += (y[56] +=
	    (y[57] += (y[58] += (y[59] += (y[60] += (y[61] += (y[62] += (y[63] += (y[64] += bar ()
	    ))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));
}

	Jakub
Richard Biener June 13, 2018, 8:51 a.m. UTC | #9
On Wed, 13 Jun 2018, Jakub Jelinek wrote:

> On Wed, Jun 13, 2018 at 10:22:29AM +0200, Jakub Jelinek wrote:
> > Random testcase for -Wduplicated-branches -fsanitize=shift:
> > int
> > foo (int x, int y)
> > {
> >   if (x)
> >     y = y << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
> >           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
> >           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
> >           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1;
> >   else
> >     y = y << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
> >           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
> >           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1
> >           << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1 << 1 >> 1;
> >   return y;
> > }
> > 
> > Though it seems we have that problem already in inchash::add_expr.  In that
> > case perhaps we could have a pointer to a hashmap in inchash::hash objects,
> > clear it in the ctors and destroy/clear in inchash::hash::end (), though we
> > have the add_commutative that has two separate hash objects.
> 
> It isn't specific to just -fsanitize=undefined, even without that there are
> cases we can end up with lots of nested SAVE_EXPRs, like
> -Wduplicated-branches:
> int bar (void);
> 
> void
> foo (int x, int *y)
> {
>   if (x)
>     y[0] += (y[1] += (y[2] += (y[3] += (y[4] += (y[5] += (y[6] += (y[7] += (y[8] +=
> 	    (y[9] += (y[10] += (y[11] += (y[12] += (y[13] += (y[14] += (y[15] += (y[16] +=
> 	    (y[17] += (y[18] += (y[19] += (y[20] += (y[21] += (y[22] += (y[23] += (y[24] +=
> 	    (y[25] += (y[26] += (y[27] += (y[28] += (y[29] += (y[30] += (y[31] += (y[32] +=
> 	    (y[33] += (y[34] += (y[35] += (y[36] += (y[37] += (y[38] += (y[39] += (y[40] +=
> 	    (y[41] += (y[42] += (y[43] += (y[44] += (y[45] += (y[46] += (y[47] += (y[48] +=
> 	    (y[49] += (y[50] += (y[51] += (y[52] += (y[53] += (y[54] += (y[55] += (y[56] +=
> 	    (y[57] += (y[58] += (y[59] += (y[60] += (y[61] += (y[62] += (y[63] += (y[64] += bar ()
> 	    ))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));
>   else
>     y[0] += (y[1] += (y[2] += (y[3] += (y[4] += (y[5] += (y[6] += (y[7] += (y[8] +=
> 	    (y[9] += (y[10] += (y[11] += (y[12] += (y[13] += (y[14] += (y[15] += (y[16] +=
> 	    (y[17] += (y[18] += (y[19] += (y[20] += (y[21] += (y[22] += (y[23] += (y[24] +=
> 	    (y[25] += (y[26] += (y[27] += (y[28] += (y[29] += (y[30] += (y[31] += (y[32] +=
> 	    (y[33] += (y[34] += (y[35] += (y[36] += (y[37] += (y[38] += (y[39] += (y[40] +=
> 	    (y[41] += (y[42] += (y[43] += (y[44] += (y[45] += (y[46] += (y[47] += (y[48] +=
> 	    (y[49] += (y[50] += (y[51] += (y[52] += (y[53] += (y[54] += (y[55] += (y[56] +=
> 	    (y[57] += (y[58] += (y[59] += (y[60] += (y[61] += (y[62] += (y[63] += (y[64] += bar ()
> 	    ))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));
> }

Your concern is for compile-time, not for correctness, right?

I think that

  /* If ARG0 and ARG1 are the same SAVE_EXPR, they are necessarily equal.
     We don't care about side effects in that case because the SAVE_EXPR
     takes care of that for us. In all other cases, two expressions are
     equal if they have no side effects.  If we have two identical
     expressions with side effects that should be treated the same due
     to the only side effects being identical SAVE_EXPR's, that will
     be detected in the recursive calls below.
     If we are taking an invariant address of two identical objects
     they are necessarily equal as well.  */
  if (arg0 == arg1 && ! (flags & OEP_ONLY_CONST)
      && (TREE_CODE (arg0) == SAVE_EXPR
          || (flags & OEP_MATCH_SIDE_EFFECTS)
          || (! TREE_SIDE_EFFECTS (arg0) && ! TREE_SIDE_EFFECTS (arg1))))
    return 1;

is supposed to handle == SAVE_EXPRs and thus the hunk should be
conditionalized on arg0 != arg1 or rather for OEP_LEXICOGRAPHIC
always return true for arg0 == arg1?

I also wondered why this change was necessary for the fix but
it looked safe correctness-wise.

Richard.
Jakub Jelinek June 13, 2018, 8:57 a.m. UTC | #10
On Wed, Jun 13, 2018 at 10:51:41AM +0200, Richard Biener wrote:
> Your concern is for compile-time, not for correctness, right?

Yes.

> I think that
> 
>   /* If ARG0 and ARG1 are the same SAVE_EXPR, they are necessarily equal.
>      We don't care about side effects in that case because the SAVE_EXPR
>      takes care of that for us. In all other cases, two expressions are
>      equal if they have no side effects.  If we have two identical
>      expressions with side effects that should be treated the same due
>      to the only side effects being identical SAVE_EXPR's, that will
>      be detected in the recursive calls below.
>      If we are taking an invariant address of two identical objects
>      they are necessarily equal as well.  */
>   if (arg0 == arg1 && ! (flags & OEP_ONLY_CONST)
>       && (TREE_CODE (arg0) == SAVE_EXPR
>           || (flags & OEP_MATCH_SIDE_EFFECTS)
>           || (! TREE_SIDE_EFFECTS (arg0) && ! TREE_SIDE_EFFECTS (arg1))))
>     return 1;
> 
> is supposed to handle == SAVE_EXPRs and thus the hunk should be
> conditionalized on arg0 != arg1 or rather for OEP_LEXICOGRAPHIC
> always return true for arg0 == arg1?

The above handles the arg0 == arg1 case, sure, but that is not what
is compared in the -Wduplicated-branches testcases I've posted.
There we have different SAVE_EXPRs (one set for the then branch, another
for the else branch) which hash the same and we call operand_equal_p on it
then, but without some sort of caching (both in the inchash::add_expr
SAVE_EXPR case and operand_equal_p OEP_LEXICOGRAPHIC arg0 != arg1 SAVE_EXPR
case) we will check the same thing again and again.

	Jakub
Martin Sebor June 13, 2018, 2:53 p.m. UTC | #11
On 06/13/2018 01:58 AM, Jakub Jelinek wrote:
> On Wed, Jun 13, 2018 at 09:46:28AM +0200, Richard Biener wrote:
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 	PR c/85931
>>>>> 	* fold-const.c (operand_equal_p): Handle SAVE_EXPR.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 	PR c/85931
>>>>> 	* gcc.dg/Wstringop-truncation-3.c: New test.
>>>> OK for the trunk.  Richi and Jakub have the final say for the branch.
>>>>
>>>> I'm a bit surprised that you don't just use operand_equal_p for both the
>>>> INTEGER_CST and !INTEGER_CST cases when testing dstsz == srcsz
>>>
>>> It seemed that since the CST case compared the values of the size
>>> expressions and the other case their lexicographic form they needed
>>> to be different.  But operand_equal_p() does value comparison for
>>> constants so the conditional can be simplified by using just it
>>> for both cases.  Thanks for the suggestion!  I've made that
>>> simplification and committed r261515.
>>>
>>> Jakub/Richard, can you please review the commit and let me know
>>> if you have any concerns with backporting it to the GCC 8 branch?
>>> (I will proceed if I don't hear anything this week.)
>>
>> I'm fine with backporting it.
>
> I'm actually worried about the fold-const.c change and don't understand, why
> it has been done at all in the context of this PR.
>
>         case SAVE_EXPR:
>           if (flags & OEP_LEXICOGRAPHIC)
>             return OP_SAME (0);
>           return 0;
>
> means it does something different from the state before the patch:
> 	default:
> 	  return 0;
> only if OEP_LEXICOGRAPHIC is set, which AFAIK is only in the
> -Wduplicated-branches warning code, so why it is needed for
> -Wstringop-truncation warning is unclear to me.

It lets sizeof_pointer_memaccess_warning detect that sizeof b
refers to the size of the VLA b in cases like the one below
(the sizeof b expression is represented as SAVE_EXPR (n)) and
avoid false positives:

   void f (void*, ...);

   void g (unsigned n)
   {
     char a[n], b[n];
     f (a, b);
     strncpy (a, b, sizeof b);   // bogus -Wsizeof-pointer-memaccess
     f (a);
   }

Short of moving the SAVE_EXPR logic out of operand_equal_p()
and into sizeof_pointer_memaccess_warning, is there a better
way to determine that?  (I would expect the SAVE_EXPR logic
in operand_equal_p() to be useful in other contexts besides
this warning.)

Martin

> More importantly,
> especially -fsanitize=undefined has a tendency of creating trees
> which contain two or more uses of the same SAVE_EXPR at each level, and very
> deep nesting levels of such SAVE_EXPRs, so if we handle it without checking
> for duplicates, it results in exponential compile time complexity.
> So, if we really want to handle SAVE_EXPR here, we'd need to create some
> cache where we'd remember if in the same toplevel operand_equal_p call we've
> already compared two particular SAVE_EXPRs and what was the result.
>
> 	Jakub
>
Jakub Jelinek June 13, 2018, 3:16 p.m. UTC | #12
On Wed, Jun 13, 2018 at 08:53:20AM -0600, Martin Sebor wrote:
> > I'm actually worried about the fold-const.c change and don't understand, why
> > it has been done at all in the context of this PR.
> > 
> >         case SAVE_EXPR:
> >           if (flags & OEP_LEXICOGRAPHIC)
> >             return OP_SAME (0);
> >           return 0;
> > 
> > means it does something different from the state before the patch:
> > 	default:
> > 	  return 0;
> > only if OEP_LEXICOGRAPHIC is set, which AFAIK is only in the
> > -Wduplicated-branches warning code, so why it is needed for
> > -Wstringop-truncation warning is unclear to me.
> 
> It lets sizeof_pointer_memaccess_warning detect that sizeof b
> refers to the size of the VLA b in cases like the one below
> (the sizeof b expression is represented as SAVE_EXPR (n)) and
> avoid false positives:

I've missed you've added another operand_equal_p call with
OEP_LEXICOGRAPHIC, but you are using it for something it hasn't been
designed for.
Say if the array sizes are not n, but foo (n) for some
unsigned foo (unsigned).  That function can return completely different
values between the two calls, while operand_equal_p with OEP_LEXICOGRAPHIC
will still say they are the same.  That is fine for -Wduplicated-branches
it has been designed for.
Or consider
  char a[n];
  n += 3;
  char b[n];
  f (a, b);
  strncpy (a, b, sizeof b);
  f (a);
The lengths are even known not to be equal here, but OEP_LEXICOGRAPHIC
happily says they are the same.  On the other hand, you could have:
  unsigned n2 = n;
  char a[n], b[n2];
and OEP_LEXICOGRAPHIC would say they are not equal, even when they are equal
at runtime.

>   void f (void*, ...);
> 
>   void g (unsigned n)
>   {
>     char a[n], b[n];
>     f (a, b);
>     strncpy (a, b, sizeof b);   // bogus -Wsizeof-pointer-memaccess
>     f (a);
>   }
> 
> Short of moving the SAVE_EXPR logic out of operand_equal_p()
> and into sizeof_pointer_memaccess_warning, is there a better
> way to determine that?  (I would expect the SAVE_EXPR logic
> in operand_equal_p() to be useful in other contexts besides
> this warning.)

Only -Wduplicated-branches (and now your code too) are using
OEP_LEXICOGRAPHIC, nothing else, so it doesn't make a difference for other
uses.  There we only care about SAVE_EXPR pointer equality, if they aren't
pointer equal, they might, but might not evaluate the same.

If you want to be conservative with the warning, then just don't warn if
sizeof doesn't return a constant.

	Jakub
Martin Sebor June 13, 2018, 5:48 p.m. UTC | #13
On 06/13/2018 09:16 AM, Jakub Jelinek wrote:
> On Wed, Jun 13, 2018 at 08:53:20AM -0600, Martin Sebor wrote:
>>> I'm actually worried about the fold-const.c change and don't understand, why
>>> it has been done at all in the context of this PR.
>>>
>>>         case SAVE_EXPR:
>>>           if (flags & OEP_LEXICOGRAPHIC)
>>>             return OP_SAME (0);
>>>           return 0;
>>>
>>> means it does something different from the state before the patch:
>>> 	default:
>>> 	  return 0;
>>> only if OEP_LEXICOGRAPHIC is set, which AFAIK is only in the
>>> -Wduplicated-branches warning code, so why it is needed for
>>> -Wstringop-truncation warning is unclear to me.
>>
>> It lets sizeof_pointer_memaccess_warning detect that sizeof b
>> refers to the size of the VLA b in cases like the one below
>> (the sizeof b expression is represented as SAVE_EXPR (n)) and
>> avoid false positives:
>
> I've missed you've added another operand_equal_p call with
> OEP_LEXICOGRAPHIC, but you are using it for something it hasn't been
> designed for.
> Say if the array sizes are not n, but foo (n) for some
> unsigned foo (unsigned).  That function can return completely different
> values between the two calls, while operand_equal_p with OEP_LEXICOGRAPHIC
> will still say they are the same.  That is fine for -Wduplicated-branches
> it has been designed for.
> Or consider
>   char a[n];
>   n += 3;
>   char b[n];
>   f (a, b);
>   strncpy (a, b, sizeof b);
>   f (a);
> The lengths are even known not to be equal here, but OEP_LEXICOGRAPHIC
> happily says they are the same.  On the other hand, you could have:
>   unsigned n2 = n;
>   char a[n], b[n2];
> and OEP_LEXICOGRAPHIC would say they are not equal, even when they are equal
> at runtime.

True.  -Wsizeof-pointer-memaccess is not perfect (irrespective
of this change).  It warns for "suspicious length parameters
to certain string and memory built-in functions if the argument
uses sizeof."  It's unavoidably prone to missing some problems
or flagging non-issues (such as bug 57627).

>
>>   void f (void*, ...);
>>
>>   void g (unsigned n)
>>   {
>>     char a[n], b[n];
>>     f (a, b);
>>     strncpy (a, b, sizeof b);   // bogus -Wsizeof-pointer-memaccess
>>     f (a);
>>   }
>>
>> Short of moving the SAVE_EXPR logic out of operand_equal_p()
>> and into sizeof_pointer_memaccess_warning, is there a better
>> way to determine that?  (I would expect the SAVE_EXPR logic
>> in operand_equal_p() to be useful in other contexts besides
>> this warning.)
>
> Only -Wduplicated-branches (and now your code too) are using
> OEP_LEXICOGRAPHIC, nothing else, so it doesn't make a difference for other
> uses.  There we only care about SAVE_EXPR pointer equality, if they aren't
> pointer equal, they might, but might not evaluate the same.
>
> If you want to be conservative with the warning, then just don't warn if
> sizeof doesn't return a constant.

I don't think it's necessary to weaken the warning further.

Since nothing else uses OEP_LEXICOGRAPHIC, do you have any
other concerns about backporting the change to GCC 8?

Martin
diff mbox series

Patch

PR c/85931 -  -Wsizeof-pointer-memaccess for strncpy with size of source

gcc/c-family/ChangeLog:

	PR c/85931
	* c-warn.c (sizeof_pointer_memaccess_warning): Avoid warning when
	sizeof source and destination yields the same value.

gcc/ChangeLog:

	PR c/85931
	* fold-const.c (operand_equal_p): Handle SAVE_EXPR.

gcc/testsuite/ChangeLog:

	PR c/85931
	* gcc.dg/Wstringop-truncation-3.c: New test.

diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index e7bcbb1..96a56d4 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -792,13 +792,31 @@  sizeof_pointer_memaccess_warning (location_t *sizeof_arg_loc, tree callee,
     {
       /* The argument type may be an array.  Diagnose bounded string
 	 copy functions that specify the bound in terms of the source
-	 argument rather than the destination.  */
+	 argument rather than the destination unless they are equal
+	 to one another.  Handle constant sizes and also try to handle
+	 sizeof expressions involving VLAs.  */
       if (strop && !cmp && fncode != BUILT_IN_STRNDUP && src)
 	{
 	  tem = tree_strip_nop_conversions (src);
 	  if (TREE_CODE (tem) == ADDR_EXPR)
 	    tem = TREE_OPERAND (tem, 0);
-	  if (operand_equal_p (tem, sizeof_arg[idx], OEP_ADDRESS_OF))
+
+	  tree d = tree_strip_nop_conversions (dest);
+	  if (TREE_CODE (d) == ADDR_EXPR)
+	    d = TREE_OPERAND (d, 0);
+
+	  tree dstsz = TYPE_SIZE_UNIT (TREE_TYPE (d));
+	  tree srcsz = TYPE_SIZE_UNIT (TREE_TYPE (tem));
+
+	  if ((!dstsz
+	       || !srcsz
+	       || (TREE_CODE (dstsz) != INTEGER_CST
+		   && TREE_CODE (srcsz) != INTEGER_CST
+		   && !operand_equal_p (dstsz, srcsz, OEP_LEXICOGRAPHIC))
+	       || (TREE_CODE (dstsz) == INTEGER_CST
+		   && TREE_CODE (srcsz) == INTEGER_CST
+		   && !tree_int_cst_equal (dstsz, srcsz)))
+	      && operand_equal_p (tem, sizeof_arg[idx], OEP_ADDRESS_OF))
 	    warning_at (sizeof_arg_loc[idx], OPT_Wsizeof_pointer_memaccess,
 			"argument to %<sizeof%> in %qD call is the same "
 			"expression as the source; did you mean to use "
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 3258aad..ead6e53 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3358,6 +3358,7 @@  operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
 
 	case CLEANUP_POINT_EXPR:
 	case EXPR_STMT:
+	case SAVE_EXPR:
 	  if (flags & OEP_LEXICOGRAPHIC)
 	    return OP_SAME (0);
 	  return 0;
diff --git a/gcc/testsuite/gcc.dg/Wstringop-truncation-3.c b/gcc/testsuite/gcc.dg/Wstringop-truncation-3.c
new file mode 100644
index 0000000..57f4d64
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstringop-truncation-3.c
@@ -0,0 +1,59 @@ 
+/* PR c/85931 - -Wsizeof-pointer-memaccess for strncpy with size of source
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wstringop-truncation -ftrack-macro-expansion=0" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern char* strncpy (char*, const char*, size_t);
+
+extern char a3[3], b3[3];
+extern char a5[5], b5[5];
+extern char ax[], bx[];
+
+struct SA
+{
+  char a3[3], b3[3];
+  char a5[5], b5[5];
+  char ax[];
+};
+
+void sink (void*, ...);
+
+#define T(d, s, n)   sink (strncpy (d, s, n))
+
+void test_array (unsigned n)
+{
+  T (a3, b3, 3);
+  /* For the following statemenmt, GCC 8.1 issues warning:
+
+       argument to ‘sizeof’ in ‘strncpy’ call is the same expression
+       as the source; did you mean to use the size of the destination?
+
+     Since the size of both the source and destination the warning
+     isn't helpful.  Verify that it isn't issued.  */
+  T (a3, b3, sizeof b3);    /* { dg-bogus "\\\[-Wsizeof-pointer-memaccess" } */
+
+  T (a3, ax, sizeof a3);    /* { dg-warning "\\\[-Wstringop-truncation" } */
+  T (ax, a3, sizeof a3);    /* { dg-warning "argument to .sizeof. in .strncpy. call is the same expression as the source" } */
+
+  char an[n], bn[n];
+  sink (an, bn);
+
+  T (an, bn, sizeof bn);    /* { dg-bogus "\\\[-Wsizeof-pointer-memaccess" } */
+}
+
+void test_member_array (struct SA *sa, unsigned n)
+{
+  T (sa->a3, sa->b3, 3);
+  T (sa->a3, sa->b3, sizeof sa->b3);  /* { dg-bogus "\\\[-Wsizeof-pointer-memaccess" } */
+
+  T (sa->a3, sa->ax, sizeof sa->a3);  /* { dg-warning "\\\[-Wstringop-truncation" } */
+  T (sa->ax, sa->a3, sizeof sa->a3);  /* { dg-warning "argument to .sizeof. in .strncpy. call is the same expression as the source" } */
+
+  struct VarLenStruct {
+    char an[n], bn[n];
+  } x;
+
+  sink (&x);
+  T (x.an, x.bn, sizeof x.bn);        /* { dg-bogus "\\\[-Wsizeof-pointer-memaccess" } */
+}