Message ID | 20170511134951.GI4910@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, May 11, 2017 at 3:49 PM, Marek Polacek <polacek@redhat.com> wrote: > I had an interesting time coming to grips with these two PRs. But it > essentially comes down to the fold call in save_expr. With that, we can call > fold() on an expression whose operands weren't folded yet, but that is what > code in fold_binary_loc assumes. Since fold() is not recursive, we could end > up there with expressions such as > i * ((unsigned long) -0 + 1) * 2 > causing various sorts of problems: turning valid code into invalid, infinite > recursion, etc. > > It's not clear why save_expr folds. I did some archeology but all I could > find was r67189 (that's 2003) which had: > > - tree t = expr; > + tree t = fold (expr); > tree inner; > > - /* Don't fold a COMPONENT_EXPR: if the operand was a CONSTRUCTOR (the > - only time it will fold), it can cause problems with PLACEHOLDER_EXPRs > - in Ada. Moreover, it isn't at all clear why we fold here at all. */ > - if (TREE_CODE (t) != COMPONENT_REF) > - t = fold (t); > > so it wasn't clear even 14 years ago. But now we know the call is harmful. > > Now, fold() doesn't recurse, but can it happen that it folds something anyway? > And maybe turn the expression into an invariant so that we don't need to wrap > it in a SAVE_EXPR? Turns out it can, but in the C FE, which either uses > c_save_expr that c_fully_folds, or calls save_expr when in_late_binary_op (so > the operands have already been folded), it's very rare, and can only be triggered > by code such as > > void > f (int i) > { > int (*a)[i]; > int x[sizeof (*a)]; > } > > so I'm not very worried about that. For C++, Jakub suggested to add SAVE_EXPR > handling to cp_fold. > > Future work: get rid of c_save_expr, only use save_expr, and teach c_fully_fold > how to fold SAVE_EXPR (once). Although there's this c_wrap_maybe_const > business... > > Bootstrapped/regtested on x86_64-linux, ok for trunk? Interesting - I tried this once (removing fold () from save_expr) but it failed miserably (don't remember the details). Maybe the cp/ part fixes it up. Did you include Ada in testing? Otherwise the tree.c change is ok. (yay, one less to go in the attempt to remove fold ()) Thanks! Richard. > 2017-05-11 Marek Polacek <polacek@redhat.com> > > PR sanitizer/80536 > PR sanitizer/80386 > * cp-gimplify.c (cp_fold): Handle SAVE_EXPR. > > * tree.c (save_expr): Don't fold the expression. > > * c-c++-common/ubsan/pr80536.c: New test. > * g++.dg/ubsan/pr80386.C: New test. > > diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c > index de62414..38a8ed4 100644 > --- gcc/cp/cp-gimplify.c > +++ gcc/cp/cp-gimplify.c > @@ -2428,6 +2428,15 @@ cp_fold (tree x) > x = fold (x); > break; > > + case SAVE_EXPR: > + /* A SAVE_EXPR might contain e.g. (0 * i) + (0 * j), which, after > + folding, evaluates to an invariant. In that case no need to wrap > + this folded tree with a SAVE_EXPR. */ > + r = cp_fold (TREE_OPERAND (x, 0)); > + if (tree_invariant_p (r)) > + x = r; > + break; > + > default: > return org_x; > } > diff --git gcc/testsuite/c-c++-common/ubsan/pr80536.c gcc/testsuite/c-c++-common/ubsan/pr80536.c > index e69de29..23913ad 100644 > --- gcc/testsuite/c-c++-common/ubsan/pr80536.c > +++ gcc/testsuite/c-c++-common/ubsan/pr80536.c > @@ -0,0 +1,9 @@ > +/* PR sanitizer/80536 */ > +/* { dg-do compile } */ > +/* { dg-options "-fsanitize=undefined" } */ > + > +int > +foo (int i) > +{ > + return ((i * (unsigned long long) (-0 + 1UL)) * 2) % 1; > +} > diff --git gcc/testsuite/g++.dg/ubsan/pr80386.C gcc/testsuite/g++.dg/ubsan/pr80386.C > index e69de29..60122da 100644 > --- gcc/testsuite/g++.dg/ubsan/pr80386.C > +++ gcc/testsuite/g++.dg/ubsan/pr80386.C > @@ -0,0 +1,13 @@ > +// PR sanitizer/80386 > +// { dg-do run } > +// { dg-options "-fsanitize=undefined -fno-sanitize-recover" } > + > +static unsigned long long int i = 13996271126042720493ULL; > + > +int > +main () > +{ > + int r = (((2921 + 0) - short(i)) + 0x7fffffff) >> 0; > + asm volatile ("" : "+g" (r)); > + return 0; > +} > diff --git gcc/tree.c gcc/tree.c > index b76b521..7506725 100644 > --- gcc/tree.c > +++ gcc/tree.c > @@ -3337,7 +3337,6 @@ tree_invariant_p (tree t) > tree > save_expr (tree expr) > { > - tree t = fold (expr); > tree inner; > > /* If the tree evaluates to a constant, then we don't want to hide that > @@ -3345,33 +3344,32 @@ save_expr (tree expr) > However, a read-only object that has side effects cannot be bypassed. > Since it is no problem to reevaluate literals, we just return the > literal node. */ > - inner = skip_simple_arithmetic (t); > + inner = skip_simple_arithmetic (expr); > if (TREE_CODE (inner) == ERROR_MARK) > return inner; > > if (tree_invariant_p_1 (inner)) > - return t; > + return expr; > > /* If INNER contains a PLACEHOLDER_EXPR, we must evaluate it each time, since > it means that the size or offset of some field of an object depends on > the value within another field. > > - Note that it must not be the case that T contains both a PLACEHOLDER_EXPR > + Note that it must not be the case that EXPR contains both a PLACEHOLDER_EXPR > and some variable since it would then need to be both evaluated once and > evaluated more than once. Front-ends must assure this case cannot > happen by surrounding any such subexpressions in their own SAVE_EXPR > and forcing evaluation at the proper time. */ > if (contains_placeholder_p (inner)) > - return t; > + return expr; > > - t = build1 (SAVE_EXPR, TREE_TYPE (expr), t); > - SET_EXPR_LOCATION (t, EXPR_LOCATION (expr)); > + expr = build1_loc (EXPR_LOCATION (expr), SAVE_EXPR, TREE_TYPE (expr), expr); > > /* This expression might be placed ahead of a jump to ensure that the > value was computed on both sides of the jump. So make sure it isn't > eliminated as dead. */ > - TREE_SIDE_EFFECTS (t) = 1; > - return t; > + TREE_SIDE_EFFECTS (expr) = 1; > + return expr; > } > > /* Look inside EXPR into any simple arithmetic operations. Return the > > Marek
On Fri, May 12, 2017 at 09:32:48AM +0200, Richard Biener wrote: > On Thu, May 11, 2017 at 3:49 PM, Marek Polacek <polacek@redhat.com> wrote: > > I had an interesting time coming to grips with these two PRs. But it > > essentially comes down to the fold call in save_expr. With that, we can call > > fold() on an expression whose operands weren't folded yet, but that is what > > code in fold_binary_loc assumes. Since fold() is not recursive, we could end > > up there with expressions such as > > i * ((unsigned long) -0 + 1) * 2 > > causing various sorts of problems: turning valid code into invalid, infinite > > recursion, etc. > > > > It's not clear why save_expr folds. I did some archeology but all I could > > find was r67189 (that's 2003) which had: > > > > - tree t = expr; > > + tree t = fold (expr); > > tree inner; > > > > - /* Don't fold a COMPONENT_EXPR: if the operand was a CONSTRUCTOR (the > > - only time it will fold), it can cause problems with PLACEHOLDER_EXPRs > > - in Ada. Moreover, it isn't at all clear why we fold here at all. */ > > - if (TREE_CODE (t) != COMPONENT_REF) > > - t = fold (t); > > > > so it wasn't clear even 14 years ago. But now we know the call is harmful. I've come to believe this was purely to discover more invariants. > > Now, fold() doesn't recurse, but can it happen that it folds something anyway? > > And maybe turn the expression into an invariant so that we don't need to wrap > > it in a SAVE_EXPR? Turns out it can, but in the C FE, which either uses > > c_save_expr that c_fully_folds, or calls save_expr when in_late_binary_op (so > > the operands have already been folded), it's very rare, and can only be triggered > > by code such as > > > > void > > f (int i) > > { > > int (*a)[i]; > > int x[sizeof (*a)]; > > } > > > > so I'm not very worried about that. For C++, Jakub suggested to add SAVE_EXPR > > handling to cp_fold. > > > > Future work: get rid of c_save_expr, only use save_expr, and teach c_fully_fold > > how to fold SAVE_EXPR (once). Although there's this c_wrap_maybe_const > > business... > > > > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > Interesting - I tried this once (removing fold () from save_expr) but > it failed miserably > (don't remember the details). Maybe the cp/ part fixes it up. I think that must've been before C++ added all that constexpr bits. The cp_fold part is just an optimization that triggers very rarely. > Did you include Ada in testing? Not before, so I've tested this now with Ada included -- looks fine still. > Otherwise the tree.c change is ok. Thanks. Jason/Joseph, any comments? > (yay, one less to go in the attempt to remove fold ()) Do we have any low hanging fruit here? Suppose not... Marek
On Fri, May 12, 2017 at 12:09 PM, Marek Polacek <polacek@redhat.com> wrote: > On Fri, May 12, 2017 at 09:32:48AM +0200, Richard Biener wrote: >> On Thu, May 11, 2017 at 3:49 PM, Marek Polacek <polacek@redhat.com> wrote: >> > I had an interesting time coming to grips with these two PRs. But it >> > essentially comes down to the fold call in save_expr. With that, we can call >> > fold() on an expression whose operands weren't folded yet, but that is what >> > code in fold_binary_loc assumes. Since fold() is not recursive, we could end >> > up there with expressions such as >> > i * ((unsigned long) -0 + 1) * 2 >> > causing various sorts of problems: turning valid code into invalid, infinite >> > recursion, etc. >> > >> > It's not clear why save_expr folds. I did some archeology but all I could >> > find was r67189 (that's 2003) which had: >> > >> > - tree t = expr; >> > + tree t = fold (expr); >> > tree inner; >> > >> > - /* Don't fold a COMPONENT_EXPR: if the operand was a CONSTRUCTOR (the >> > - only time it will fold), it can cause problems with PLACEHOLDER_EXPRs >> > - in Ada. Moreover, it isn't at all clear why we fold here at all. */ >> > - if (TREE_CODE (t) != COMPONENT_REF) >> > - t = fold (t); >> > >> > so it wasn't clear even 14 years ago. But now we know the call is harmful. > > I've come to believe this was purely to discover more invariants. > >> > Now, fold() doesn't recurse, but can it happen that it folds something anyway? >> > And maybe turn the expression into an invariant so that we don't need to wrap >> > it in a SAVE_EXPR? Turns out it can, but in the C FE, which either uses >> > c_save_expr that c_fully_folds, or calls save_expr when in_late_binary_op (so >> > the operands have already been folded), it's very rare, and can only be triggered >> > by code such as >> > >> > void >> > f (int i) >> > { >> > int (*a)[i]; >> > int x[sizeof (*a)]; >> > } >> > >> > so I'm not very worried about that. For C++, Jakub suggested to add SAVE_EXPR >> > handling to cp_fold. >> > >> > Future work: get rid of c_save_expr, only use save_expr, and teach c_fully_fold >> > how to fold SAVE_EXPR (once). Although there's this c_wrap_maybe_const >> > business... >> > >> > Bootstrapped/regtested on x86_64-linux, ok for trunk? >> >> Interesting - I tried this once (removing fold () from save_expr) but >> it failed miserably >> (don't remember the details). Maybe the cp/ part fixes it up. > > I think that must've been before C++ added all that constexpr bits. The > cp_fold part is just an optimization that triggers very rarely. > >> Did you include Ada in testing? > > Not before, so I've tested this now with Ada included -- looks fine still. > >> Otherwise the tree.c change is ok. > > Thanks. Jason/Joseph, any comments? > >> (yay, one less to go in the attempt to remove fold ()) > > Do we have any low hanging fruit here? Suppose not... Factoring out the quaternary cases to better handle fold (build4 (code, ... that happens in quite a few places. And then there's tree t = build3 (BIT_FIELD_REF, currop->type, genop0, op1, op2); REF_REVERSE_STORAGE_ORDER (t) = currop->reverse; return fold (t); for the lack of [fold_]buildN getting the REF_REVERSE_STORAGE_ORDER flag ... (I fully blame Eric for this ;)). Ok, now starts to be non-low-hanging. Richard. > Marek
On Fri, 12 May 2017, Marek Polacek wrote: > > Otherwise the tree.c change is ok. > > Thanks. Jason/Joseph, any comments? I don't have any comments on this. (c_save_expr folds to avoid c_fully_fold needing to go inside SAVE_EXPRs.)
On Fri, May 12, 2017 at 6:09 AM, Marek Polacek <polacek@redhat.com> wrote: > On Fri, May 12, 2017 at 09:32:48AM +0200, Richard Biener wrote: >> On Thu, May 11, 2017 at 3:49 PM, Marek Polacek <polacek@redhat.com> wrote: >> > I had an interesting time coming to grips with these two PRs. But it >> > essentially comes down to the fold call in save_expr. With that, we can call >> > fold() on an expression whose operands weren't folded yet, but that is what >> > code in fold_binary_loc assumes. Since fold() is not recursive, we could end >> > up there with expressions such as >> > i * ((unsigned long) -0 + 1) * 2 >> > causing various sorts of problems: turning valid code into invalid, infinite >> > recursion, etc. >> > >> > It's not clear why save_expr folds. I did some archeology but all I could >> > find was r67189 (that's 2003) which had: >> > >> > - tree t = expr; >> > + tree t = fold (expr); >> > tree inner; >> > >> > - /* Don't fold a COMPONENT_EXPR: if the operand was a CONSTRUCTOR (the >> > - only time it will fold), it can cause problems with PLACEHOLDER_EXPRs >> > - in Ada. Moreover, it isn't at all clear why we fold here at all. */ >> > - if (TREE_CODE (t) != COMPONENT_REF) >> > - t = fold (t); >> > >> > so it wasn't clear even 14 years ago. But now we know the call is harmful. > > I've come to believe this was purely to discover more invariants. > >> > Now, fold() doesn't recurse, but can it happen that it folds something anyway? >> > And maybe turn the expression into an invariant so that we don't need to wrap >> > it in a SAVE_EXPR? Turns out it can, but in the C FE, which either uses >> > c_save_expr that c_fully_folds, or calls save_expr when in_late_binary_op (so >> > the operands have already been folded), it's very rare, and can only be triggered >> > by code such as >> > >> > void >> > f (int i) >> > { >> > int (*a)[i]; >> > int x[sizeof (*a)]; >> > } >> > >> > so I'm not very worried about that. For C++, Jakub suggested to add SAVE_EXPR >> > handling to cp_fold. >> > >> > Future work: get rid of c_save_expr, only use save_expr, and teach c_fully_fold >> > how to fold SAVE_EXPR (once). Although there's this c_wrap_maybe_const >> > business... >> > >> > Bootstrapped/regtested on x86_64-linux, ok for trunk? >> >> Interesting - I tried this once (removing fold () from save_expr) but >> it failed miserably >> (don't remember the details). Maybe the cp/ part fixes it up. > > I think that must've been before C++ added all that constexpr bits. The > cp_fold part is just an optimization that triggers very rarely. > >> Did you include Ada in testing? > > Not before, so I've tested this now with Ada included -- looks fine still. > >> Otherwise the tree.c change is ok. > > Thanks. Jason/Joseph, any comments? Looks good to me. Jason
diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c index de62414..38a8ed4 100644 --- gcc/cp/cp-gimplify.c +++ gcc/cp/cp-gimplify.c @@ -2428,6 +2428,15 @@ cp_fold (tree x) x = fold (x); break; + case SAVE_EXPR: + /* A SAVE_EXPR might contain e.g. (0 * i) + (0 * j), which, after + folding, evaluates to an invariant. In that case no need to wrap + this folded tree with a SAVE_EXPR. */ + r = cp_fold (TREE_OPERAND (x, 0)); + if (tree_invariant_p (r)) + x = r; + break; + default: return org_x; } diff --git gcc/testsuite/c-c++-common/ubsan/pr80536.c gcc/testsuite/c-c++-common/ubsan/pr80536.c index e69de29..23913ad 100644 --- gcc/testsuite/c-c++-common/ubsan/pr80536.c +++ gcc/testsuite/c-c++-common/ubsan/pr80536.c @@ -0,0 +1,9 @@ +/* PR sanitizer/80536 */ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=undefined" } */ + +int +foo (int i) +{ + return ((i * (unsigned long long) (-0 + 1UL)) * 2) % 1; +} diff --git gcc/testsuite/g++.dg/ubsan/pr80386.C gcc/testsuite/g++.dg/ubsan/pr80386.C index e69de29..60122da 100644 --- gcc/testsuite/g++.dg/ubsan/pr80386.C +++ gcc/testsuite/g++.dg/ubsan/pr80386.C @@ -0,0 +1,13 @@ +// PR sanitizer/80386 +// { dg-do run } +// { dg-options "-fsanitize=undefined -fno-sanitize-recover" } + +static unsigned long long int i = 13996271126042720493ULL; + +int +main () +{ + int r = (((2921 + 0) - short(i)) + 0x7fffffff) >> 0; + asm volatile ("" : "+g" (r)); + return 0; +} diff --git gcc/tree.c gcc/tree.c index b76b521..7506725 100644 --- gcc/tree.c +++ gcc/tree.c @@ -3337,7 +3337,6 @@ tree_invariant_p (tree t) tree save_expr (tree expr) { - tree t = fold (expr); tree inner; /* If the tree evaluates to a constant, then we don't want to hide that @@ -3345,33 +3344,32 @@ save_expr (tree expr) However, a read-only object that has side effects cannot be bypassed. Since it is no problem to reevaluate literals, we just return the literal node. */ - inner = skip_simple_arithmetic (t); + inner = skip_simple_arithmetic (expr); if (TREE_CODE (inner) == ERROR_MARK) return inner; if (tree_invariant_p_1 (inner)) - return t; + return expr; /* If INNER contains a PLACEHOLDER_EXPR, we must evaluate it each time, since it means that the size or offset of some field of an object depends on the value within another field. - Note that it must not be the case that T contains both a PLACEHOLDER_EXPR + Note that it must not be the case that EXPR contains both a PLACEHOLDER_EXPR and some variable since it would then need to be both evaluated once and evaluated more than once. Front-ends must assure this case cannot happen by surrounding any such subexpressions in their own SAVE_EXPR and forcing evaluation at the proper time. */ if (contains_placeholder_p (inner)) - return t; + return expr; - t = build1 (SAVE_EXPR, TREE_TYPE (expr), t); - SET_EXPR_LOCATION (t, EXPR_LOCATION (expr)); + expr = build1_loc (EXPR_LOCATION (expr), SAVE_EXPR, TREE_TYPE (expr), expr); /* This expression might be placed ahead of a jump to ensure that the value was computed on both sides of the jump. So make sure it isn't eliminated as dead. */ - TREE_SIDE_EFFECTS (t) = 1; - return t; + TREE_SIDE_EFFECTS (expr) = 1; + return expr; } /* Look inside EXPR into any simple arithmetic operations. Return the