diff mbox series

make POINTER_PLUS offset sizetype (PR 97956)

Message ID 10756ea2-c660-9add-b60f-f3cf49260e38@gmail.com
State New
Headers show
Series make POINTER_PLUS offset sizetype (PR 97956) | expand

Commit Message

Martin Sebor Nov. 25, 2020, 12:45 a.m. UTC
Offsets in pointer expressions are signed but GCC prefers to
represent them as sizetype instead, and sometimes (though not
always) crashes during GIMPLE verification when they're not.
The sometimes-but-not-always part makes it easy for mistakes
to slip in and go undetected for months, until someone either
trips over it by accident, or deliberately tries to break
things (the test case in the bug relies on declaring memchr
with the third argument of type signed long which is what's
apparently needed to trigger the ICE).  The attached patch
corrects a couple of such mistakes.

Martin

PS It would save us the time and effort dealing with these
bugs to either detect (or even correct) the mistakes early,
at the time the POINTER_PLUS_EXPR is built.  Adding an assert
to gimple_build_assign()) to verify that it has the expected
type (or converting the operand to sizetype) as in the change
below does that.  I'm pretty sure I submitted a patch like it
in the past but it was rejected.  If I'm wrong or if there are
no objections to it now I'll be happy to commit it as well.

Both patches were tested on x86_64-linux.

Comments

Richard Biener Nov. 25, 2020, 9:31 a.m. UTC | #1
On Wed, Nov 25, 2020 at 1:45 AM Martin Sebor via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Offsets in pointer expressions are signed but GCC prefers to
> represent them as sizetype instead, and sometimes (though not
> always) crashes during GIMPLE verification when they're not.
> The sometimes-but-not-always part makes it easy for mistakes
> to slip in and go undetected for months, until someone either
> trips over it by accident, or deliberately tries to break
> things (the test case in the bug relies on declaring memchr
> with the third argument of type signed long which is what's
> apparently needed to trigger the ICE).  The attached patch
> corrects a couple of such mistakes.
>
> Martin
>
> PS It would save us the time and effort dealing with these
> bugs to either detect (or even correct) the mistakes early,
> at the time the POINTER_PLUS_EXPR is built.  Adding an assert
> to gimple_build_assign()) to verify that it has the expected
> type (or converting the operand to sizetype) as in the change
> below does that.  I'm pretty sure I submitted a patch like it
> in the past but it was rejected.  If I'm wrong or if there are
> no objections to it now I'll be happy to commit it as well.

We already verify this in verify_gimple_assign_binary after
each pass.  Iff then this would argue for verifying all built
stmts immediately, assigns with verify_gimple_assign.
But I think this is overkill - your testcase is already catched
by the IL verification.

Btw, are you sure the offset returned by constant_byte_string
is never checked to be positive in callers?

The gimple-fold.c hunk and the new testcase are OK.

Richard.

> Both patches were tested on x86_64-linux.
>
> diff --git a/gcc/gimple.c b/gcc/gimple.c
> index e3e508daf2f..8e88bab9e41 100644
> --- a/gcc/gimple.c
> +++ b/gcc/gimple.c
> @@ -489,6 +489,9 @@ gassign *
>   gimple_build_assign (tree lhs, enum tree_code subcode, tree op1,
>                       tree op2 MEM_STAT_DECL)
>   {
> +  if (subcode == POINTER_PLUS_EXPR)
> +    gcc_checking_assert (ptrofftype_p (TREE_TYPE (op2)));
> +
>     return gimple_build_assign_1 (lhs, subcode, op1, op2, NULL_TREE
>                                  PASS_MEM_STAT);
>   }
Martin Sebor Nov. 25, 2020, 6:22 p.m. UTC | #2
On 11/25/20 2:31 AM, Richard Biener wrote:
> On Wed, Nov 25, 2020 at 1:45 AM Martin Sebor via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Offsets in pointer expressions are signed but GCC prefers to
>> represent them as sizetype instead, and sometimes (though not
>> always) crashes during GIMPLE verification when they're not.
>> The sometimes-but-not-always part makes it easy for mistakes
>> to slip in and go undetected for months, until someone either
>> trips over it by accident, or deliberately tries to break
>> things (the test case in the bug relies on declaring memchr
>> with the third argument of type signed long which is what's
>> apparently needed to trigger the ICE).  The attached patch
>> corrects a couple of such mistakes.
>>
>> Martin
>>
>> PS It would save us the time and effort dealing with these
>> bugs to either detect (or even correct) the mistakes early,
>> at the time the POINTER_PLUS_EXPR is built.  Adding an assert
>> to gimple_build_assign()) to verify that it has the expected
>> type (or converting the operand to sizetype) as in the change
>> below does that.  I'm pretty sure I submitted a patch like it
>> in the past but it was rejected.  If I'm wrong or if there are
>> no objections to it now I'll be happy to commit it as well.
> 
> We already verify this in verify_gimple_assign_binary after
> each pass.  Iff then this would argue for verifying all built
> stmts immediately, assigns with verify_gimple_assign.
> But I think this is overkill - your testcase is already catched
> by the IL verification.

You're right, having the check wouldn't have prevented this bug.
But I'm not worried about this test case.  What I'd like to do
is reduce the risk of similar problems happening in the future
where the check would help.  Catching problems earlier by having
functions verify their pre- and postconditions is good practice.
So yes, I think all these build() functions should do that (not
necessarily to the same extent as the full-blown IL verification
but at least the basics).

> 
> Btw, are you sure the offset returned by constant_byte_string
> is never checked to be positive in callers?

The function sets *PTR_OFFSET to sizetype in all but this one
case (actually, it also sets it to integer_zero_one).  Callers
then typically compare it to the length of the string to see
if it's less.  If not, the result is discarded because it refers
outside the string.  It's tested for equality to zero but I don't
see it being checked to see if it's positive and I'm not sure to
what end.  What's your concern?

Anyway, as an experiment, I've changed the function to set
the offset to ssizetype instead of sizetype and reran a subset
of the test suite with the check in gimple_build_assign and it
didn't trigger.  So I guess the sloppiness here doesn't matter.

That said, there is a bug in the function that I noticed while
making this change so it wasn't a completely pointless exercise.
The function should call itself recursively but instead it calls
string_constant.  I'll resubmit the sizetype change with the fix
for this bug

Martin

> 
> The gimple-fold.c hunk and the new testcase are OK.
> 
> Richard.
> 
>> Both patches were tested on x86_64-linux.
>>
>> diff --git a/gcc/gimple.c b/gcc/gimple.c
>> index e3e508daf2f..8e88bab9e41 100644
>> --- a/gcc/gimple.c
>> +++ b/gcc/gimple.c
>> @@ -489,6 +489,9 @@ gassign *
>>    gimple_build_assign (tree lhs, enum tree_code subcode, tree op1,
>>                        tree op2 MEM_STAT_DECL)
>>    {
>> +  if (subcode == POINTER_PLUS_EXPR)
>> +    gcc_checking_assert (ptrofftype_p (TREE_TYPE (op2)));
>> +
>>      return gimple_build_assign_1 (lhs, subcode, op1, op2, NULL_TREE
>>                                   PASS_MEM_STAT);
>>    }
Richard Biener Nov. 26, 2020, 7:02 a.m. UTC | #3
On Wed, Nov 25, 2020 at 7:22 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 11/25/20 2:31 AM, Richard Biener wrote:
> > On Wed, Nov 25, 2020 at 1:45 AM Martin Sebor via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Offsets in pointer expressions are signed but GCC prefers to
> >> represent them as sizetype instead, and sometimes (though not
> >> always) crashes during GIMPLE verification when they're not.
> >> The sometimes-but-not-always part makes it easy for mistakes
> >> to slip in and go undetected for months, until someone either
> >> trips over it by accident, or deliberately tries to break
> >> things (the test case in the bug relies on declaring memchr
> >> with the third argument of type signed long which is what's
> >> apparently needed to trigger the ICE).  The attached patch
> >> corrects a couple of such mistakes.
> >>
> >> Martin
> >>
> >> PS It would save us the time and effort dealing with these
> >> bugs to either detect (or even correct) the mistakes early,
> >> at the time the POINTER_PLUS_EXPR is built.  Adding an assert
> >> to gimple_build_assign()) to verify that it has the expected
> >> type (or converting the operand to sizetype) as in the change
> >> below does that.  I'm pretty sure I submitted a patch like it
> >> in the past but it was rejected.  If I'm wrong or if there are
> >> no objections to it now I'll be happy to commit it as well.
> >
> > We already verify this in verify_gimple_assign_binary after
> > each pass.  Iff then this would argue for verifying all built
> > stmts immediately, assigns with verify_gimple_assign.
> > But I think this is overkill - your testcase is already catched
> > by the IL verification.
>
> You're right, having the check wouldn't have prevented this bug.
> But I'm not worried about this test case.  What I'd like to do
> is reduce the risk of similar problems happening in the future
> where the check would help.  Catching problems earlier by having
> functions verify their pre- and postconditions is good practice.
> So yes, I think all these build() functions should do that (not
> necessarily to the same extent as the full-blown IL verification
> but at least the basics).
>
> >
> > Btw, are you sure the offset returned by constant_byte_string
> > is never checked to be positive in callers?
>
> The function sets *PTR_OFFSET to sizetype in all but this one
> case (actually, it also sets it to integer_zero_one).  Callers
> then typically compare it to the length of the string to see
> if it's less.  If not, the result is discarded because it refers
> outside the string.  It's tested for equality to zero but I don't
> see it being checked to see if it's positive and I'm not sure to
> what end.  What's your concern?

My concern was that code might do

 /* Handle broken code.  */
 if (tree_int_cst_sgn (offset) < 0)
  return NULL_TREE;

.. expect offset to be within the string ..

but I didn't look at much context.  If the function uses sizetype
everywhere else that concern is moot and thus this hunk is OK
as well.

Thanks,
Richard.

> Anyway, as an experiment, I've changed the function to set
> the offset to ssizetype instead of sizetype and reran a subset
> of the test suite with the check in gimple_build_assign and it
> didn't trigger.  So I guess the sloppiness here doesn't matter.
>
> That said, there is a bug in the function that I noticed while
> making this change so it wasn't a completely pointless exercise.
> The function should call itself recursively but instead it calls
> string_constant.  I'll resubmit the sizetype change with the fix
> for this bug
>
> Martin
>
> >
> > The gimple-fold.c hunk and the new testcase are OK.
> >
> > Richard.
> >
> >> Both patches were tested on x86_64-linux.
> >>
> >> diff --git a/gcc/gimple.c b/gcc/gimple.c
> >> index e3e508daf2f..8e88bab9e41 100644
> >> --- a/gcc/gimple.c
> >> +++ b/gcc/gimple.c
> >> @@ -489,6 +489,9 @@ gassign *
> >>    gimple_build_assign (tree lhs, enum tree_code subcode, tree op1,
> >>                        tree op2 MEM_STAT_DECL)
> >>    {
> >> +  if (subcode == POINTER_PLUS_EXPR)
> >> +    gcc_checking_assert (ptrofftype_p (TREE_TYPE (op2)));
> >> +
> >>      return gimple_build_assign_1 (lhs, subcode, op1, op2, NULL_TREE
> >>                                   PASS_MEM_STAT);
> >>    }
>
diff mbox series

Patch

PR middle-end/97956 - ICE due to type mismatch in pointer_plus_expr during memchr folding

gcc/ChangeLog:

	PR middle-end/97956
	* expr.c (constant_byte_string): Use sizetype for pointer offsets.
	* gimple-fold.c (gimple_fold_builtin_memchr): Ditto.

gcc/testsuite/ChangeLog:

	PR middle-end/97956
	* gcc.dg/memchr-3.c: New test.

diff --git a/gcc/expr.c b/gcc/expr.c
index 25e93b6d46f..b2197f613c7 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11893,7 +11893,7 @@  constant_byte_string (tree arg, tree *ptr_offset, tree *mem_size, tree *decl,
       init = TREE_OPERAND (init, 0);
 
       *mem_size = size_int (TREE_STRING_LENGTH (init));
-      *ptr_offset = wide_int_to_tree (ssizetype, base_off);
+      *ptr_offset = wide_int_to_tree (sizetype, base_off);
 
       if (decl)
 	*decl = array;
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 905c0a057cb..2e98a2c70cd 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2689,7 +2689,7 @@  gimple_fold_builtin_memchr (gimple_stmt_iterator *gsi)
 	  gimple_seq stmts = NULL;
 	  if (lhs != NULL_TREE)
 	    {
-	      tree offset_cst = build_int_cst (TREE_TYPE (len), offset);
+	      tree offset_cst = build_int_cst (sizetype, offset);
 	      gassign *stmt = gimple_build_assign (lhs, POINTER_PLUS_EXPR,
 						   arg1, offset_cst);
 	      gimple_seq_add_stmt_without_update (&stmts, stmt);
diff --git a/gcc/testsuite/gcc.dg/memchr-3.c b/gcc/testsuite/gcc.dg/memchr-3.c
new file mode 100644
index 00000000000..c1f4e9e10dc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/memchr-3.c
@@ -0,0 +1,25 @@ 
+/* PR middle-end/97956 - ICE due to type mismatch in pointer_plus_expr
+   during memchr folding
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+typedef __INT8_TYPE__  int8_t;
+typedef __INT32_TYPE__ int32_t;
+
+extern void* memchr (const void*, int, long);
+
+struct SX
+{
+  int32_t n;
+  int8_t a[];
+};
+
+const struct SX sx = { 0x1221 };
+const char sx_rep[] = { };
+
+void test_find (void)
+{
+  int n = 0, nb = (const char*)&sx.a - (const char*)&sx;
+  const char *p = (const char*)&sx, *q = sx_rep;
+  n += p + 1 == memchr (p, q[1], nb);
+}