Message ID | 10756ea2-c660-9add-b60f-f3cf49260e38@gmail.com |
---|---|
State | New |
Headers | show |
Series | make POINTER_PLUS offset sizetype (PR 97956) | expand |
On Wed, Nov 25, 2020 at 1:45 AM Martin Sebor via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Offsets in pointer expressions are signed but GCC prefers to > represent them as sizetype instead, and sometimes (though not > always) crashes during GIMPLE verification when they're not. > The sometimes-but-not-always part makes it easy for mistakes > to slip in and go undetected for months, until someone either > trips over it by accident, or deliberately tries to break > things (the test case in the bug relies on declaring memchr > with the third argument of type signed long which is what's > apparently needed to trigger the ICE). The attached patch > corrects a couple of such mistakes. > > Martin > > PS It would save us the time and effort dealing with these > bugs to either detect (or even correct) the mistakes early, > at the time the POINTER_PLUS_EXPR is built. Adding an assert > to gimple_build_assign()) to verify that it has the expected > type (or converting the operand to sizetype) as in the change > below does that. I'm pretty sure I submitted a patch like it > in the past but it was rejected. If I'm wrong or if there are > no objections to it now I'll be happy to commit it as well. We already verify this in verify_gimple_assign_binary after each pass. Iff then this would argue for verifying all built stmts immediately, assigns with verify_gimple_assign. But I think this is overkill - your testcase is already catched by the IL verification. Btw, are you sure the offset returned by constant_byte_string is never checked to be positive in callers? The gimple-fold.c hunk and the new testcase are OK. Richard. > Both patches were tested on x86_64-linux. > > diff --git a/gcc/gimple.c b/gcc/gimple.c > index e3e508daf2f..8e88bab9e41 100644 > --- a/gcc/gimple.c > +++ b/gcc/gimple.c > @@ -489,6 +489,9 @@ gassign * > gimple_build_assign (tree lhs, enum tree_code subcode, tree op1, > tree op2 MEM_STAT_DECL) > { > + if (subcode == POINTER_PLUS_EXPR) > + gcc_checking_assert (ptrofftype_p (TREE_TYPE (op2))); > + > return gimple_build_assign_1 (lhs, subcode, op1, op2, NULL_TREE > PASS_MEM_STAT); > }
On 11/25/20 2:31 AM, Richard Biener wrote: > On Wed, Nov 25, 2020 at 1:45 AM Martin Sebor via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> Offsets in pointer expressions are signed but GCC prefers to >> represent them as sizetype instead, and sometimes (though not >> always) crashes during GIMPLE verification when they're not. >> The sometimes-but-not-always part makes it easy for mistakes >> to slip in and go undetected for months, until someone either >> trips over it by accident, or deliberately tries to break >> things (the test case in the bug relies on declaring memchr >> with the third argument of type signed long which is what's >> apparently needed to trigger the ICE). The attached patch >> corrects a couple of such mistakes. >> >> Martin >> >> PS It would save us the time and effort dealing with these >> bugs to either detect (or even correct) the mistakes early, >> at the time the POINTER_PLUS_EXPR is built. Adding an assert >> to gimple_build_assign()) to verify that it has the expected >> type (or converting the operand to sizetype) as in the change >> below does that. I'm pretty sure I submitted a patch like it >> in the past but it was rejected. If I'm wrong or if there are >> no objections to it now I'll be happy to commit it as well. > > We already verify this in verify_gimple_assign_binary after > each pass. Iff then this would argue for verifying all built > stmts immediately, assigns with verify_gimple_assign. > But I think this is overkill - your testcase is already catched > by the IL verification. You're right, having the check wouldn't have prevented this bug. But I'm not worried about this test case. What I'd like to do is reduce the risk of similar problems happening in the future where the check would help. Catching problems earlier by having functions verify their pre- and postconditions is good practice. So yes, I think all these build() functions should do that (not necessarily to the same extent as the full-blown IL verification but at least the basics). > > Btw, are you sure the offset returned by constant_byte_string > is never checked to be positive in callers? The function sets *PTR_OFFSET to sizetype in all but this one case (actually, it also sets it to integer_zero_one). Callers then typically compare it to the length of the string to see if it's less. If not, the result is discarded because it refers outside the string. It's tested for equality to zero but I don't see it being checked to see if it's positive and I'm not sure to what end. What's your concern? Anyway, as an experiment, I've changed the function to set the offset to ssizetype instead of sizetype and reran a subset of the test suite with the check in gimple_build_assign and it didn't trigger. So I guess the sloppiness here doesn't matter. That said, there is a bug in the function that I noticed while making this change so it wasn't a completely pointless exercise. The function should call itself recursively but instead it calls string_constant. I'll resubmit the sizetype change with the fix for this bug Martin > > The gimple-fold.c hunk and the new testcase are OK. > > Richard. > >> Both patches were tested on x86_64-linux. >> >> diff --git a/gcc/gimple.c b/gcc/gimple.c >> index e3e508daf2f..8e88bab9e41 100644 >> --- a/gcc/gimple.c >> +++ b/gcc/gimple.c >> @@ -489,6 +489,9 @@ gassign * >> gimple_build_assign (tree lhs, enum tree_code subcode, tree op1, >> tree op2 MEM_STAT_DECL) >> { >> + if (subcode == POINTER_PLUS_EXPR) >> + gcc_checking_assert (ptrofftype_p (TREE_TYPE (op2))); >> + >> return gimple_build_assign_1 (lhs, subcode, op1, op2, NULL_TREE >> PASS_MEM_STAT); >> }
On Wed, Nov 25, 2020 at 7:22 PM Martin Sebor <msebor@gmail.com> wrote: > > On 11/25/20 2:31 AM, Richard Biener wrote: > > On Wed, Nov 25, 2020 at 1:45 AM Martin Sebor via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> Offsets in pointer expressions are signed but GCC prefers to > >> represent them as sizetype instead, and sometimes (though not > >> always) crashes during GIMPLE verification when they're not. > >> The sometimes-but-not-always part makes it easy for mistakes > >> to slip in and go undetected for months, until someone either > >> trips over it by accident, or deliberately tries to break > >> things (the test case in the bug relies on declaring memchr > >> with the third argument of type signed long which is what's > >> apparently needed to trigger the ICE). The attached patch > >> corrects a couple of such mistakes. > >> > >> Martin > >> > >> PS It would save us the time and effort dealing with these > >> bugs to either detect (or even correct) the mistakes early, > >> at the time the POINTER_PLUS_EXPR is built. Adding an assert > >> to gimple_build_assign()) to verify that it has the expected > >> type (or converting the operand to sizetype) as in the change > >> below does that. I'm pretty sure I submitted a patch like it > >> in the past but it was rejected. If I'm wrong or if there are > >> no objections to it now I'll be happy to commit it as well. > > > > We already verify this in verify_gimple_assign_binary after > > each pass. Iff then this would argue for verifying all built > > stmts immediately, assigns with verify_gimple_assign. > > But I think this is overkill - your testcase is already catched > > by the IL verification. > > You're right, having the check wouldn't have prevented this bug. > But I'm not worried about this test case. What I'd like to do > is reduce the risk of similar problems happening in the future > where the check would help. Catching problems earlier by having > functions verify their pre- and postconditions is good practice. > So yes, I think all these build() functions should do that (not > necessarily to the same extent as the full-blown IL verification > but at least the basics). > > > > > Btw, are you sure the offset returned by constant_byte_string > > is never checked to be positive in callers? > > The function sets *PTR_OFFSET to sizetype in all but this one > case (actually, it also sets it to integer_zero_one). Callers > then typically compare it to the length of the string to see > if it's less. If not, the result is discarded because it refers > outside the string. It's tested for equality to zero but I don't > see it being checked to see if it's positive and I'm not sure to > what end. What's your concern? My concern was that code might do /* Handle broken code. */ if (tree_int_cst_sgn (offset) < 0) return NULL_TREE; .. expect offset to be within the string .. but I didn't look at much context. If the function uses sizetype everywhere else that concern is moot and thus this hunk is OK as well. Thanks, Richard. > Anyway, as an experiment, I've changed the function to set > the offset to ssizetype instead of sizetype and reran a subset > of the test suite with the check in gimple_build_assign and it > didn't trigger. So I guess the sloppiness here doesn't matter. > > That said, there is a bug in the function that I noticed while > making this change so it wasn't a completely pointless exercise. > The function should call itself recursively but instead it calls > string_constant. I'll resubmit the sizetype change with the fix > for this bug > > Martin > > > > > The gimple-fold.c hunk and the new testcase are OK. > > > > Richard. > > > >> Both patches were tested on x86_64-linux. > >> > >> diff --git a/gcc/gimple.c b/gcc/gimple.c > >> index e3e508daf2f..8e88bab9e41 100644 > >> --- a/gcc/gimple.c > >> +++ b/gcc/gimple.c > >> @@ -489,6 +489,9 @@ gassign * > >> gimple_build_assign (tree lhs, enum tree_code subcode, tree op1, > >> tree op2 MEM_STAT_DECL) > >> { > >> + if (subcode == POINTER_PLUS_EXPR) > >> + gcc_checking_assert (ptrofftype_p (TREE_TYPE (op2))); > >> + > >> return gimple_build_assign_1 (lhs, subcode, op1, op2, NULL_TREE > >> PASS_MEM_STAT); > >> } >
PR middle-end/97956 - ICE due to type mismatch in pointer_plus_expr during memchr folding gcc/ChangeLog: PR middle-end/97956 * expr.c (constant_byte_string): Use sizetype for pointer offsets. * gimple-fold.c (gimple_fold_builtin_memchr): Ditto. gcc/testsuite/ChangeLog: PR middle-end/97956 * gcc.dg/memchr-3.c: New test. diff --git a/gcc/expr.c b/gcc/expr.c index 25e93b6d46f..b2197f613c7 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -11893,7 +11893,7 @@ constant_byte_string (tree arg, tree *ptr_offset, tree *mem_size, tree *decl, init = TREE_OPERAND (init, 0); *mem_size = size_int (TREE_STRING_LENGTH (init)); - *ptr_offset = wide_int_to_tree (ssizetype, base_off); + *ptr_offset = wide_int_to_tree (sizetype, base_off); if (decl) *decl = array; diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index 905c0a057cb..2e98a2c70cd 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -2689,7 +2689,7 @@ gimple_fold_builtin_memchr (gimple_stmt_iterator *gsi) gimple_seq stmts = NULL; if (lhs != NULL_TREE) { - tree offset_cst = build_int_cst (TREE_TYPE (len), offset); + tree offset_cst = build_int_cst (sizetype, offset); gassign *stmt = gimple_build_assign (lhs, POINTER_PLUS_EXPR, arg1, offset_cst); gimple_seq_add_stmt_without_update (&stmts, stmt); diff --git a/gcc/testsuite/gcc.dg/memchr-3.c b/gcc/testsuite/gcc.dg/memchr-3.c new file mode 100644 index 00000000000..c1f4e9e10dc --- /dev/null +++ b/gcc/testsuite/gcc.dg/memchr-3.c @@ -0,0 +1,25 @@ +/* PR middle-end/97956 - ICE due to type mismatch in pointer_plus_expr + during memchr folding + { dg-do compile } + { dg-options "-O2 -Wall" } */ + +typedef __INT8_TYPE__ int8_t; +typedef __INT32_TYPE__ int32_t; + +extern void* memchr (const void*, int, long); + +struct SX +{ + int32_t n; + int8_t a[]; +}; + +const struct SX sx = { 0x1221 }; +const char sx_rep[] = { }; + +void test_find (void) +{ + int n = 0, nb = (const char*)&sx.a - (const char*)&sx; + const char *p = (const char*)&sx, *q = sx_rep; + n += p + 1 == memchr (p, q[1], nb); +}