Message ID | 20230308210930.128620-1-polacek@redhat.com |
---|---|
State | New |
Series | ubsan: missed -fsanitize=bounds for compound ops [PR108060] |
On Wed, 8 Mar 2023, Marek Polacek wrote:

> In this PR we are dealing with a missing .UBSAN_BOUNDS, so the
> out-of-bounds access in the test makes the program crash before
> a UBSan diagnostic was emitted. In C and C++, c_genericize gets
>
>   a[b] = a[b] | c;
>
> but in C, both a[b] are one identical shared tree (not in C++ because
> cp_fold/ARRAY_REF created two same but not identical trees). Since
> ubsan_walk_array_refs_r keeps a pset, in C we produce
>
>   a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] = a[b] | c;
>
> because the LHS is walked before the RHS.
>
> Since r7-1900, we gimplify the RHS before the LHS. So the statement above
> gets gimplified into
>
>   _1 = a[b];
>   c.0_2 = c;
>   b.1 = b;
>   .UBSAN_BOUNDS (0B, b.1, 8);
>
> With this patch we produce:
>
>   a[b] = a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] | c;
>
> which gets gimplified into:
>
>   b.0 = b;
>   .UBSAN_BOUNDS (0B, b.0, 8);
>   _1 = a[b.0];
>
> therefore we emit a runtime error before making the bad array access.
>
> I think it's OK that only the RHS gets a .UBSAN_BOUNDS, as in few lines
> above: the instrumented array access dominates the array access on the
> LHS, and I've verified that
>
>   b = 0;
>   a[b] = (a[b], b = -32768, a[0] | c);
>
> works as expected: the inner a[b] is OK but we do emit an error for the
> a[b] on the LHS.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?

I think this is a reasonable way to address the regression, so OK.

Thanks,
Richard.

>     PR sanitizer/108060
>     PR sanitizer/109050
>
> gcc/c-family/ChangeLog:
>
>     * c-gimplify.cc (ubsan_walk_array_refs_r): For a MODIFY_EXPR, instrument
>     the RHS before the LHS.
>
> gcc/testsuite/ChangeLog:
>
>     * c-c++-common/ubsan/bounds-17.c: New test.
>     * c-c++-common/ubsan/bounds-18.c: New test.
>     * c-c++-common/ubsan/bounds-19.c: New test.
>     * c-c++-common/ubsan/bounds-20.c: New test.
> ---
>  gcc/c-family/c-gimplify.cc                   | 12 ++++++++++++
>  gcc/testsuite/c-c++-common/ubsan/bounds-17.c | 17 +++++++++++++++++
>  gcc/testsuite/c-c++-common/ubsan/bounds-18.c | 17 +++++++++++++++++
>  gcc/testsuite/c-c++-common/ubsan/bounds-19.c | 20 ++++++++++++++++++++
>  gcc/testsuite/c-c++-common/ubsan/bounds-20.c | 16 ++++++++++++++++
>  5 files changed, 82 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-17.c
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-18.c
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-19.c
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-20.c
>
> diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
> index 74b276b2b26..ef5c7d919fc 100644
> --- a/gcc/c-family/c-gimplify.cc
> +++ b/gcc/c-family/c-gimplify.cc
> @@ -106,6 +106,18 @@ ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
>      }
>    else if (TREE_CODE (*tp) == ARRAY_REF)
>      ubsan_maybe_instrument_array_ref (tp, false);
> +  else if (TREE_CODE (*tp) == MODIFY_EXPR)
> +    {
> +      /* Since r7-1900, we gimplify RHS before LHS.  Consider
> +           a[b] |= c;
> +         wherein we can have a single shared tree a[b] in both LHS and RHS.
> +         If we only instrument the LHS and the access is invalid, the program
> +         could crash before emitting a UBSan error.  So instrument the RHS
> +         first.  */
> +      *walk_subtrees = 0;
> +      walk_tree (&TREE_OPERAND (*tp, 1), ubsan_walk_array_refs_r, pset, pset);
> +      walk_tree (&TREE_OPERAND (*tp, 0), ubsan_walk_array_refs_r, pset, pset);
> +    }
>    return NULL_TREE;
>  }
>
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-17.c b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
> new file mode 100644
> index 00000000000..b727e3235b8
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
> @@ -0,0 +1,17 @@
> +/* PR sanitizer/108060 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +/* { dg-skip-if "" { *-*-* } "-flto" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +int a[8];
> +int c;
> +
> +int
> +main ()
> +{
> +  int b = -32768;
> +  a[b] |= c;
> +}
> +
> +/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-18.c b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
> new file mode 100644
> index 00000000000..556abc0e1c0
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
> @@ -0,0 +1,17 @@
> +/* PR sanitizer/108060 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +/* { dg-skip-if "" { *-*-* } "-flto" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +int a[8];
> +int c;
> +
> +int
> +main ()
> +{
> +  int b = -32768;
> +  a[b] = a[b] | c;
> +}
> +
> +/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-19.c b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
> new file mode 100644
> index 00000000000..54217ae399f
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
> @@ -0,0 +1,20 @@
> +/* PR sanitizer/108060 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +/* { dg-skip-if "" { *-*-* } "-flto" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +int a[8];
> +int a2[18];
> +int c;
> +
> +int
> +main ()
> +{
> +  int b = 0;
> +  a[0] = (a2[b], b = -32768, a[0] | c);
> +  b = 0;
> +  a[b] = (a[b], b = -32768, a[0] | c);
> +}
> +
> +/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-20.c b/gcc/testsuite/c-c++-common/ubsan/bounds-20.c
> new file mode 100644
> index 00000000000..a78c67129e0
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/ubsan/bounds-20.c
> @@ -0,0 +1,16 @@
> +/* PR sanitizer/109050 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds -fno-sanitize-recover=all" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +long a;
> +int b;
> +int
> +main ()
> +{
> +  int c[4] = {0, 1, 2, 3};
> +  a = 0;
> +  c[a - 9806816] |= b;
> +}
> +
> +/* { dg-output "index -9806816 out of bounds for type 'int \\\[4\\\]'" } */
>
> base-commit: 2e3dd14dd287ca94b72c36ed28a1ae30887f77ce
>
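The two gimplified sequences quoted in the patch description differ only in where the bounds check sits relative to the load of a[b]. As a rough illustration, the following C sketch models that ordering. Everything in it is hypothetical and not GCC code: `index_ok` stands in for .UBSAN_BOUNDS, valid indices are assumed to be 0 through 7, and an out-of-range raw load is reported as `CRASHED` instead of actually being performed.

```c
#include <assert.h>

#define A_LEN 8

/* What the model reports for one `a[b] |= c` statement.  */
enum outcome { STORED, DIAGNOSED, CRASHED };

static int a[A_LEN];

/* Hypothetical stand-in for .UBSAN_BOUNDS: nonzero for a valid index.  */
static int
index_ok (long idx)
{
  return idx >= 0 && idx < A_LEN;
}

/* Order without the patch: load, then check, then store.  */
enum outcome
run_check_after_load (long b, int c)
{
  /* _1 = a[b];  The raw load comes first.  The model reports an
     out-of-range index here as a crash rather than performing it.  */
  if (!index_ok (b))
    return CRASHED;
  int loaded = a[b];
  /* The bounds check would sit here, reached only for a valid index.  */
  a[b] = loaded | c;
  return STORED;
}

/* Order with the patch: check, then load, then store.  */
enum outcome
run_check_before_load (long b, int c)
{
  /* The bounds check runs before any access to the array.  */
  if (!index_ok (b))
    return DIAGNOSED;
  a[b] = a[b] | c;
  return STORED;
}
```

With b = -32768, `run_check_after_load` reports `CRASHED` and `run_check_before_load` reports `DIAGNOSED`. For a valid index both perform the same store.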
On Thu, Mar 09, 2023 at 08:12:47AM +0000, Richard Biener wrote:
> I think this is a reasonable way to address the regression, so OK.

It is true that both C and C++ (including c++14_down and c++17 and later
where the latter have different ordering rules) evaluate the lhs of
MODIFY_EXPR after rhs, so conceptually this patch makes sense.

But I wonder why we do in ubsan_maybe_instrument_array_ref:
      if (e != NULL_TREE)
        {
          tree t = copy_node (*expr_p);
          TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
                                        e, op1);
          *expr_p = t;
        }
rather than modification of the ARRAY_REF's operand in place. If we
did that, we wouldn't really care about the order, shared tree would
be instrumented once, with SAVE_EXPR in there making sure we don't
compute that multiple times. Is that because the 2 copies could
have side-effects and we do want to evaluate those multiple times?
Jakub
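The difference between copying the node and modifying it in place can be sketched with a toy model. This is not GCC code: `struct node`, `walk`, `checked_uses` and the `seen` flag (a stand-in for the pset) are all hypothetical, and the model ignores SAVE_EXPR and side-effects entirely. It only shows which use of a shared a[b] ends up carrying a check.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy tree codes; these only mimic the shape of GCC's trees.  */
enum kind { LEAF, ARRAY_REF, BIT_IOR, MODIFY };

struct node
{
  enum kind kind;
  struct node *op[2];
  int seen;     /* stand-in for membership in the walker's pointer set */
  int checked;  /* 1 once a bounds check is attached to this node */
};

enum policy { COPY_NODE, IN_PLACE };

/* Allocate a node; nodes are never freed, which is fine for a sketch.  */
static struct node *
mk (enum kind kind, struct node *op0, struct node *op1)
{
  struct node *n = calloc (1, sizeof *n);
  if (n == NULL)
    abort ();
  n->kind = kind;
  n->op[0] = op0;
  n->op[1] = op1;
  return n;
}

/* Visit *SLOT once; nodes already seen are skipped, as with a pset.  */
static void
walk (struct node **slot, enum policy policy, int rhs_first)
{
  assert (slot != NULL);
  struct node *n = *slot;
  if (n == NULL || n->seen)
    return;
  n->seen = 1;
  if (n->kind == ARRAY_REF)
    {
      if (policy == COPY_NODE)
        {
          /* Only the use stored in *SLOT receives the checked copy.  */
          struct node *copy = mk (ARRAY_REF, n->op[0], n->op[1]);
          copy->seen = copy->checked = 1;
          *slot = n = copy;
        }
      else
        n->checked = 1;  /* every use of the shared node sees the check */
    }
  /* For an assignment, optionally walk operand 1 (RHS) before operand 0.  */
  int first = (n->kind == MODIFY && rhs_first) ? 1 : 0;
  walk (&n->op[first], policy, rhs_first);
  walk (&n->op[1 - first], policy, rhs_first);
}

/* Build `a[b] = a[b] | c` with one shared a[b] node, walk it, and return
   bit 0 if the LHS use is checked and bit 1 if the RHS use is checked.  */
int
checked_uses (enum policy policy, int rhs_first)
{
  struct node *ref = mk (ARRAY_REF, mk (LEAF, NULL, NULL),
                         mk (LEAF, NULL, NULL));
  struct node *stmt = mk (MODIFY, ref,
                          mk (BIT_IOR, ref, mk (LEAF, NULL, NULL)));
  walk (&stmt, policy, rhs_first);
  return stmt->op[0]->checked | (stmt->op[1]->op[0]->checked << 1);
}
```

In this model `checked_uses (COPY_NODE, 0)` yields 1 (only the LHS use is checked) and `checked_uses (COPY_NODE, 1)` yields 2 (only the RHS use). `checked_uses (IN_PLACE, ...)` yields 3 for either walk order, which is the order-independence the question above is getting at.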
diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
index 74b276b2b26..ef5c7d919fc 100644
--- a/gcc/c-family/c-gimplify.cc
+++ b/gcc/c-family/c-gimplify.cc
@@ -106,6 +106,18 @@ ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
     }
   else if (TREE_CODE (*tp) == ARRAY_REF)
     ubsan_maybe_instrument_array_ref (tp, false);
+  else if (TREE_CODE (*tp) == MODIFY_EXPR)
+    {
+      /* Since r7-1900, we gimplify RHS before LHS.  Consider
+           a[b] |= c;
+         wherein we can have a single shared tree a[b] in both LHS and RHS.
+         If we only instrument the LHS and the access is invalid, the program
+         could crash before emitting a UBSan error.  So instrument the RHS
+         first.  */
+      *walk_subtrees = 0;
+      walk_tree (&TREE_OPERAND (*tp, 1), ubsan_walk_array_refs_r, pset, pset);
+      walk_tree (&TREE_OPERAND (*tp, 0), ubsan_walk_array_refs_r, pset, pset);
+    }
   return NULL_TREE;
 }

diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-17.c b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
new file mode 100644
index 00000000000..b727e3235b8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int c;
+
+int
+main ()
+{
+  int b = -32768;
+  a[b] |= c;
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-18.c b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
new file mode 100644
index 00000000000..556abc0e1c0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int c;
+
+int
+main ()
+{
+  int b = -32768;
+  a[b] = a[b] | c;
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-19.c b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
new file mode 100644
index 00000000000..54217ae399f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
@@ -0,0 +1,20 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int a2[18];
+int c;
+
+int
+main ()
+{
+  int b = 0;
+  a[0] = (a2[b], b = -32768, a[0] | c);
+  b = 0;
+  a[b] = (a[b], b = -32768, a[0] | c);
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-20.c b/gcc/testsuite/c-c++-common/ubsan/bounds-20.c
new file mode 100644
index 00000000000..a78c67129e0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-20.c
@@ -0,0 +1,16 @@
+/* PR sanitizer/109050 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fno-sanitize-recover=all" } */
+/* { dg-shouldfail "ubsan" } */
+
+long a;
+int b;
+int
+main ()
+{
+  int c[4] = {0, 1, 2, 3};
+  a = 0;
+  c[a - 9806816] |= b;
+}
+
+/* { dg-output "index -9806816 out of bounds for type 'int \\\[4\\\]'" } */