Message ID | 20130913180135.GU23899@redhat.com |
---|---|
State | New |
Headers | show |
On Fri, 13 Sep 2013, Marek Polacek wrote: > This is kind of fugly, but don't have anything better at the moment. > 2013-09-13 Marek Polacek <polacek@redhat.com> > > PR sanitizer/58413 > c-family/ > * c-ubsan.c (ubsan_instrument_shift): Don't instrument > an expression if we can prove it is correct. Shouldn't the conditions used here for an expression being proved correct match those for instrumentation, i.e. depend on flag_isoc99 and on (cxx_dialect == cxx11 || cxx_dialect == cxx1y)?
On Fri, Sep 13, 2013 at 07:18:24PM +0000, Joseph S. Myers wrote: > On Fri, 13 Sep 2013, Marek Polacek wrote: > > > This is kind of fugly, but don't have anything better at the moment. > > 2013-09-13 Marek Polacek <polacek@redhat.com> > > > > PR sanitizer/58413 > > c-family/ > > * c-ubsan.c (ubsan_instrument_shift): Don't instrument > > an expression if we can prove it is correct. > > Shouldn't the conditions used here for an expression being proved correct > match those for instrumentation, i.e. depend on flag_isoc99 and on > (cxx_dialect == cxx11 || cxx_dialect == cxx1y)? I don't think so: for the unsigned case we could restrict it to C only, but it doesn't hurt doing it even for C++; in the signed case we care only about C, but we can't restrict it to flag_isoc99 only, since we need to prove the correctnes even for ANSI C. Marek
On Mon, 16 Sep 2013, Marek Polacek wrote: > On Fri, Sep 13, 2013 at 07:18:24PM +0000, Joseph S. Myers wrote: > > On Fri, 13 Sep 2013, Marek Polacek wrote: > > > > > This is kind of fugly, but don't have anything better at the moment. > > > 2013-09-13 Marek Polacek <polacek@redhat.com> > > > > > > PR sanitizer/58413 > > > c-family/ > > > * c-ubsan.c (ubsan_instrument_shift): Don't instrument > > > an expression if we can prove it is correct. > > > > Shouldn't the conditions used here for an expression being proved correct > > match those for instrumentation, i.e. depend on flag_isoc99 and on > > (cxx_dialect == cxx11 || cxx_dialect == cxx1y)? > > I don't think so: for the unsigned case we could restrict it to C > only, but it doesn't hurt doing it even for C++; in the signed case > we care only about C, but we can't restrict it to flag_isoc99 only, > since we need to prove the correctnes even for ANSI C. I don't understand how this answers my question. * The following principle applies: for any command-line options, with ubsan enabled, if an integer operation with particular (non-constant) operands is accepted by the sanitization code at runtime, the same operation with the same operand values (and types) as constants should be accepted at compile time (and at runtime) in contexts where an integer constant expression is required. Does this patch make the compiler meet this principle, for all the different command-line options that vary what is accepted at runtime? * The following principle also applies: for any command-line options, with ubsan enabled, if an integer operation with particular (non-constant) operands is rejected by the sanitization code at runtime, the same operation with the same operand values (and types) as constants should be rejected at compile time (or at runtime) in contexts where an integer constant expression is required. Does this patch make the compiler meet this principle, for all the different command-line options that vary what is accepted at runtime?
On Fri, Sep 13, 2013 at 08:01:36PM +0200, Marek Polacek wrote: > 2013-09-13 Marek Polacek <polacek@redhat.com> > > PR sanitizer/58413 > c-family/ > * c-ubsan.c (ubsan_instrument_shift): Don't instrument > an expression if we can prove it is correct. > > testsuite/ > * c-c++-common/ubsan/shift-4.c: New test. > * c-c++-common/ubsan/shift-5.c: New test. > * gcc.dg/ubsan/c-shift-1.c: New test. > > --- gcc/c-family/c-ubsan.c.mp3 2013-09-13 19:19:55.410925155 +0200 > +++ gcc/c-family/c-ubsan.c 2013-09-13 19:20:16.766006242 +0200 > @@ -104,6 +104,40 @@ ubsan_instrument_shift (location_t loc, > tree uprecm1 = build_int_cst (op1_utype, op0_prec - 1); > tree precm1 = build_int_cst (type1, op0_prec - 1); > > + /* If OP0 is of an unsigned type, we may prove that OP1 is not > + greater than UPRECM1 (and not negative); in that case we can > + bail out early. */ > + if (TYPE_UNSIGNED (type0) > + && TREE_CODE (op1) == INTEGER_CST > + && tree_int_cst_sgn (op1) != -1 > + && !tree_int_cst_lt (uprecm1, op1)) > + return NULL_TREE; > + > + /* Even when OP0 is of a signed type, we may prove that there's > + nothing wrong with the shift if both operands are INTEGER_CSTs > + and wouldn't trigger UB. We do this only for C. > + XXX Should we treat ANSI C specially wrt 1 << 31? */ > + if (TREE_CODE (op0) == INTEGER_CST > + && TREE_CODE (op1) == INTEGER_CST > + && !TYPE_UNSIGNED (type0) > + && tree_int_cst_sgn (op1) != -1 > + && !tree_int_cst_lt (uprecm1, op1) > + && !c_dialect_cxx ()) > + { > + /* If this is a right shift, we can return now. */ > + if (code == RSHIFT_EXPR) > + return NULL_TREE; > + > + /* Otherwise see comment below. */ > + tree x = fold_convert_loc (loc, unsigned_type_for (type0), op0); > + x = fold_build2 (RSHIFT_EXPR, TREE_TYPE (x), x, > + fold_build2 (MINUS_EXPR, TREE_TYPE (precm1), precm1, > + op1)); > + if (integer_zerop (x)) > + return NULL_TREE; > + } > + > + /* Ok, we have to do the instrumentation. */ I'd say the above is going to be a maintainance nightmare, with all the code duplication. And you are certainly going to miss cases that way, e.g. void foo (void) { int A[-2 / -1] = {}; } I'd say instead of adding all this, you should just at the right spot insert if (integer_zerop (t)) return NULL_TREE; or similar. For shift instrumentation, I guess you could add if (integer_zerop (t) && (tt == NULL_TREE || integer_zerop (tt))) return NULL_TREE; right before: t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t); > +/* PR sanitizer/58413 */ > +/* { dg-do run } */ > +/* { dg-options "-fsanitize=shift -w" } */ > + > +int x = 7; > +int > +main (void) > +{ > + /* All of the following should pass. */ > + int A[128 >> 5] = {}; > + int B[128 << 5] = {}; > + > + static int e = > + ((int) > + (0x00000000 | ((31 & ((1 << (4)) - 1)) << (((15) + 6) + 4)) | > + ((0) << ((15) + 6)) | ((0) << (15)))); This relies on int32plus, so needs to be /* { dg-do run { target int32plus } } */ > --- gcc/testsuite/c-c++-common/ubsan/shift-5.c.mp3 2013-09-13 18:25:06.195847144 +0200 > +++ gcc/testsuite/c-c++-common/ubsan/shift-5.c 2013-09-13 19:16:38.990211229 +0200 > @@ -0,0 +1,21 @@ > +/* { dg-do compile} */ > +/* { dg-options "-fsanitize=shift -w" } */ > +/* { dg-shouldfail "ubsan" } */ > + > +int x; > +int > +main (void) > +{ > + /* None of the following should pass. */ > + switch (x) > + { > + case 1 >> -1: /* { dg-error "" } */ > + case -1 >> -1: /* { dg-error "" } */ > + case 1 << -1: /* { dg-error "" } */ > + case -1 << -1: /* { dg-error "" } */ > + case -1 >> 200: /* { dg-error "" } */ > + case 1 << 200: /* { dg-error "" } */ Can't you fill in the error you are expecting, or is the problem that the wording is very different between C and C++? Jakub
On Mon, Sep 16, 2013 at 03:59:12PM +0000, Joseph S. Myers wrote: > On Mon, 16 Sep 2013, Marek Polacek wrote: > > > On Fri, Sep 13, 2013 at 07:18:24PM +0000, Joseph S. Myers wrote: > > > On Fri, 13 Sep 2013, Marek Polacek wrote: > > > > > > > This is kind of fugly, but don't have anything better at the moment. > > > > 2013-09-13 Marek Polacek <polacek@redhat.com> > > > > > > > > PR sanitizer/58413 > > > > c-family/ > > > > * c-ubsan.c (ubsan_instrument_shift): Don't instrument > > > > an expression if we can prove it is correct. > > > > > > Shouldn't the conditions used here for an expression being proved correct > > > match those for instrumentation, i.e. depend on flag_isoc99 and on > > > (cxx_dialect == cxx11 || cxx_dialect == cxx1y)? > > > > I don't think so: for the unsigned case we could restrict it to C > > only, but it doesn't hurt doing it even for C++; in the signed case > > we care only about C, but we can't restrict it to flag_isoc99 only, > > since we need to prove the correctnes even for ANSI C. > > I don't understand how this answers my question. I'm sorry. Please disregard the original (ugly) patch, the folloing applies to the new (pretty) patch <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01283.html>. > * The following principle applies: for any command-line options, with > ubsan enabled, if an integer operation with particular (non-constant) > operands is accepted by the sanitization code at runtime, the same > operation with the same operand values (and types) as constants should be > accepted at compile time (and at runtime) in contexts where an integer > constant expression is required. Does this patch make the compiler meet > this principle, for all the different command-line options that vary what > is accepted at runtime? I believe so. E.g. int i = 4, j = 3, k; k = i << j; is ok, thus the following is ok as well case (4 << 3) (for C++/C with various -std=*). > * The following principle also applies: for any command-line options, with > ubsan enabled, if an integer operation with particular (non-constant) > operands is rejected by the sanitization code at runtime, the same > operation with the same operand values (and types) as constants should be > rejected at compile time (or at runtime) in contexts where an integer > constant expression is required. Does this patch make the compiler meet > this principle, for all the different command-line options that vary what > is accepted at runtime? And I think this applies as well. At runtime we reject e.g. int i = 1, j = 120, k; k = i << j; and at compile-time we reject enum e { red = 0 << 120, }; Marek
--- gcc/c-family/c-ubsan.c.mp3 2013-09-13 19:19:55.410925155 +0200 +++ gcc/c-family/c-ubsan.c 2013-09-13 19:20:16.766006242 +0200 @@ -104,6 +104,40 @@ ubsan_instrument_shift (location_t loc, tree uprecm1 = build_int_cst (op1_utype, op0_prec - 1); tree precm1 = build_int_cst (type1, op0_prec - 1); + /* If OP0 is of an unsigned type, we may prove that OP1 is not + greater than UPRECM1 (and not negative); in that case we can + bail out early. */ + if (TYPE_UNSIGNED (type0) + && TREE_CODE (op1) == INTEGER_CST + && tree_int_cst_sgn (op1) != -1 + && !tree_int_cst_lt (uprecm1, op1)) + return NULL_TREE; + + /* Even when OP0 is of a signed type, we may prove that there's + nothing wrong with the shift if both operands are INTEGER_CSTs + and wouldn't trigger UB. We do this only for C. + XXX Should we treat ANSI C specially wrt 1 << 31? */ + if (TREE_CODE (op0) == INTEGER_CST + && TREE_CODE (op1) == INTEGER_CST + && !TYPE_UNSIGNED (type0) + && tree_int_cst_sgn (op1) != -1 + && !tree_int_cst_lt (uprecm1, op1) + && !c_dialect_cxx ()) + { + /* If this is a right shift, we can return now. */ + if (code == RSHIFT_EXPR) + return NULL_TREE; + + /* Otherwise see comment below. */ + tree x = fold_convert_loc (loc, unsigned_type_for (type0), op0); + x = fold_build2 (RSHIFT_EXPR, TREE_TYPE (x), x, + fold_build2 (MINUS_EXPR, TREE_TYPE (precm1), precm1, + op1)); + if (integer_zerop (x)) + return NULL_TREE; + } + + /* Ok, we have to do the instrumentation. */ t = fold_convert_loc (loc, op1_utype, op1); t = fold_build2 (GT_EXPR, boolean_type_node, t, uprecm1); @@ -125,7 +159,7 @@ ubsan_instrument_shift (location_t loc, x < 0 || ((unsigned) x >> (precm1 - y)) if > 1, is undefined. */ if (code == LSHIFT_EXPR - && !TYPE_UNSIGNED (TREE_TYPE (op0)) + && !TYPE_UNSIGNED (type0) && (cxx_dialect == cxx11 || cxx_dialect == cxx1y)) { tree x = fold_build2 (MINUS_EXPR, integer_type_node, precm1, op1); --- gcc/testsuite/c-c++-common/ubsan/shift-4.c.mp3 2013-09-13 18:25:02.294833062 +0200 +++ gcc/testsuite/c-c++-common/ubsan/shift-4.c 2013-09-13 18:43:56.171046915 +0200 @@ -0,0 +1,30 @@ +/* PR sanitizer/58413 */ +/* { dg-do run } */ +/* { dg-options "-fsanitize=shift -w" } */ + +int x = 7; +int +main (void) +{ + /* All of the following should pass. */ + int A[128 >> 5] = {}; + int B[128 << 5] = {}; + + static int e = + ((int) + (0x00000000 | ((31 & ((1 << (4)) - 1)) << (((15) + 6) + 4)) | + ((0) << ((15) + 6)) | ((0) << (15)))); + + if (e != 503316480) + __builtin_abort (); + + switch (x) + { + case 1 >> 4: + case 1 << 4: + case 128 << (4 + 1): + case 128 >> (4 + 1): + return 1; + } + return 0; +} --- gcc/testsuite/c-c++-common/ubsan/shift-5.c.mp3 2013-09-13 18:25:06.195847144 +0200 +++ gcc/testsuite/c-c++-common/ubsan/shift-5.c 2013-09-13 19:16:38.990211229 +0200 @@ -0,0 +1,21 @@ +/* { dg-do compile} */ +/* { dg-options "-fsanitize=shift -w" } */ +/* { dg-shouldfail "ubsan" } */ + +int x; +int +main (void) +{ + /* None of the following should pass. */ + switch (x) + { + case 1 >> -1: /* { dg-error "" } */ + case -1 >> -1: /* { dg-error "" } */ + case 1 << -1: /* { dg-error "" } */ + case -1 << -1: /* { dg-error "" } */ + case -1 >> 200: /* { dg-error "" } */ + case 1 << 200: /* { dg-error "" } */ + return 1; + } + return 0; +} --- gcc/testsuite/gcc.dg/ubsan/c-shift-1.c.mp3 2013-09-13 19:01:21.334825808 +0200 +++ gcc/testsuite/gcc.dg/ubsan/c-shift-1.c 2013-09-13 19:01:21.334825808 +0200 @@ -0,0 +1,18 @@ +/* { dg-do compile} */ +/* { dg-options "-fsanitize=shift -w" } */ +/* { dg-shouldfail "ubsan" } */ + +int x; +int +main (void) +{ + /* None of the following should pass. */ + int A[1 >> -1] = {}; /* { dg-error "variable-sized object may not be initialized" } */ + int B[-1 >> -1] = {}; /* { dg-error "variable-sized object may not be initialized" } */ + int D[1 << -1] = {}; /* { dg-error "variable-sized object may not be initialized" } */ + int E[-1 << -1] = {}; /* { dg-error "variable-sized object may not be initialized" } */ + int F[-1 >> 200] = {}; /* { dg-error "variable-sized object may not be initialized" } */ + int G[1 << 200] = {}; /* { dg-error "variable-sized object may not be initialized" } */ + + return 0; +}