Message ID | 20130912122655.GN23899@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, Sep 12, 2013 at 02:26:55PM +0200, Marek Polacek wrote:
> the size of a VLA is positive. I.e., We also issue an error if the size of the
s/We also/we/
On Thu, 12 Sep 2013, Marek Polacek wrote: > This patch adds the instrumentation of VLA bounds. Basically, it just > checks that the size of a VLA is positive. I.e., We also issue an error > if the size of the VLA is 0. It catches e.g. This is not an objection to this patch, but there are a few other bits of VLA bound instrumentation that could be done as well. If the size has a wide-enough type to be bigger than the target's SIZE_MAX, and is indeed bigger than SIZE_MAX, that could be detected at runtime as well. Or if the multiplication of array size and element size exceeds SIZE_MAX (this covers both elements of constant size, and elements that are themselves VLAs, but the former can be handled more efficiently by comparing against an appropriate constant rather than needing a runtime test for whether a multiplication in size_t overflows). (Actually, I believe sizes (in bytes) greater than target PTRDIFF_MAX, not just SIZE_MAX, should be caught, because pointer subtraction cannot work reliably with larger objects. So it's not just when the size or multiplication overflow size_t, but when they overflow ptrdiff_t.)
On Thu, 12 Sep 2013, Joseph S. Myers wrote: > (Actually, I believe sizes (in bytes) greater than target PTRDIFF_MAX, not > just SIZE_MAX, should be caught, because pointer subtraction cannot work > reliably with larger objects. So it's not just when the size or > multiplication overflow size_t, but when they overflow ptrdiff_t.) And, to add a bit more to the list of possible ubsan features (is this todo list maintained anywhere?), even if the size is such that operations on the array are in principle defined, it's possible that adjusting the stack pointer by too much may take it into other areas of memory and so cause stack overflow that doesn't get detected by the kernel. So maybe ubsan should imply -fstack-check or similar. Everything about VLA checking - checks on the size being positive and on it not being larger than PTRDIFF_MAX, and on avoiding stack overflow - applies equally to alloca: calls to alloca should also be instrumented. (But I think alloca (0) is valid.)
On Thu, Sep 12, 2013 at 03:52:18PM +0000, Joseph S. Myers wrote: > On Thu, 12 Sep 2013, Marek Polacek wrote: > > > This patch adds the instrumentation of VLA bounds. Basically, it just > > checks that the size of a VLA is positive. I.e., We also issue an error > > if the size of the VLA is 0. It catches e.g. > > This is not an objection to this patch, but there are a few other bits of > VLA bound instrumentation that could be done as well. If the size has a > wide-enough type to be bigger than the target's SIZE_MAX, and is indeed > bigger than SIZE_MAX, that could be detected at runtime as well. Or if > the multiplication of array size and element size exceeds SIZE_MAX (this > covers both elements of constant size, and elements that are themselves > VLAs, but the former can be handled more efficiently by comparing against > an appropriate constant rather than needing a runtime test for whether a > multiplication in size_t overflows). > > (Actually, I believe sizes (in bytes) greater than target PTRDIFF_MAX, not > just SIZE_MAX, should be caught, because pointer subtraction cannot work > reliably with larger objects. So it's not just when the size or > multiplication overflow size_t, but when they overflow ptrdiff_t.) Yup, this all sounds good. I'll look at this tomorrow. I think I'd prefer doing this as a follow-up, after the C/C++ FE parts are reviewed; doing SIZE_MAX/PTRDIFF_MAX checking then should require changes only in c-ubsan.c. Marek
On Thu, Sep 12, 2013 at 04:05:48PM +0000, Joseph S. Myers wrote: > On Thu, 12 Sep 2013, Joseph S. Myers wrote: > > > (Actually, I believe sizes (in bytes) greater than target PTRDIFF_MAX, not > > just SIZE_MAX, should be caught, because pointer subtraction cannot work > > reliably with larger objects. So it's not just when the size or > > multiplication overflow size_t, but when they overflow ptrdiff_t.) > > And, to add a bit more to the list of possible ubsan features (is this > todo list maintained anywhere?), even if the size is such that operations No, I don't have such a list (at least nothing online/public). > on the array are in principle defined, it's possible that adjusting the > stack pointer by too much may take it into other areas of memory and so > cause stack overflow that doesn't get detected by the kernel. So maybe > ubsan should imply -fstack-check or similar. Works for me. > Everything about VLA checking - checks on the size being positive and on > it not being larger than PTRDIFF_MAX, and on avoiding stack overflow - > applies equally to alloca: calls to alloca should also be instrumented. > (But I think alloca (0) is valid.) Yes, good idea. I've just checked and clang doesn't check the size passed to alloca, I think it'd be good addition to have it. And yeah - alloca (0) seems to be valid; when expanding __builtin_alloca we call allocate_dynamic_stack_space and that contains /* If we're asking for zero bytes, it doesn't matter what we point to since we can't dereference it. But return a reasonable address anyway. */ if (size == const0_rtx) return virtual_stack_dynamic_rtx; Thanks, Marek
On Thu, Sep 12, 2013 at 04:05:48PM +0000, Joseph S. Myers wrote: > cause stack overflow that doesn't get detected by the kernel. So maybe > ubsan should imply -fstack-check or similar. Well, I have a patch for that, but I no longer think that ubsan should imply -fstack-check, since e.g. int main (void) { int x = -1; int b[x - 4]; /* ... */ return 0; } segfaults at runtime on int b[x - 4]; line when -fstack-check is used (even without sanitizing), so we wouldn't give proper diagnostics for stmts following that line... Marek
> Well, I have a patch for that, but I no longer think that ubsan should > imply -fstack-check, since e.g. > > int > main (void) > { > int x = -1; > int b[x - 4]; > /* ... */ > return 0; > } > > segfaults at runtime on int b[x - 4]; line when -fstack-check is used > (even without sanitizing), so we wouldn't give proper diagnostics > for stmts following that line... In Ada we catch the sigsegv, turn it into an exception and unwind.
On Fri, 13 Sep 2013, Marek Polacek wrote: > On Thu, Sep 12, 2013 at 04:05:48PM +0000, Joseph S. Myers wrote: > > cause stack overflow that doesn't get detected by the kernel. So maybe > > ubsan should imply -fstack-check or similar. > > Well, I have a patch for that, but I no longer think that ubsan should > imply -fstack-check, since e.g. > > int > main (void) > { > int x = -1; > int b[x - 4]; > /* ... */ > return 0; > } > > segfaults at runtime on int b[x - 4]; line when -fstack-check is used > (even without sanitizing), so we wouldn't give proper diagnostics > for stmts following that line... A guaranteed segfault is better than doing something undefined. But I'd expect sanitizing to make the initial check that the array size in bytes is in the range [1, PTRDIFF_MAX] and -fstack-check only to come into play if that passes (for sizes that are too large for the stack limit in effect at runtime although within the range that is in principle valid). You probably don't want to enable -fstack-check from ubsan until the checks for the range [1, PTRDIFF_MAX] are in place. (Those checks, incidentally, would need to apply not just to arrays whose specified size is variable, but also to constant-size arrays of variable-size arrays - if you have a VLA type, then define an array VLA array[10]; then you need to check that the result of the multiplication of sizes in bytes doesn't exceed PTRDIFF_MAX. So the more general checks can't all go in the place where you're inserting the checks for a single variable size in isolation.)
On Thu, Sep 12, 2013 at 04:05:48PM +0000, Joseph S. Myers wrote: > On Thu, 12 Sep 2013, Joseph S. Myers wrote: > > > (Actually, I believe sizes (in bytes) greater than target PTRDIFF_MAX, not > > just SIZE_MAX, should be caught, because pointer subtraction cannot work > > reliably with larger objects. So it's not just when the size or > > multiplication overflow size_t, but when they overflow ptrdiff_t.) > > And, to add a bit more to the list of possible ubsan features (is this > todo list maintained anywhere?), even if the size is such that operations > on the array are in principle defined, it's possible that adjusting the > stack pointer by too much may take it into other areas of memory and so > cause stack overflow that doesn't get detected by the kernel. So maybe > ubsan should imply -fstack-check or similar. > > Everything about VLA checking - checks on the size being positive and on > it not being larger than PTRDIFF_MAX, and on avoiding stack overflow - > applies equally to alloca: calls to alloca should also be instrumented. > (But I think alloca (0) is valid.) Problem here is that libubsan doesn't contain appropriate routines for this VLA/alloca extended checking, it really can only issue "variable length array bound evaluates to non-positive value", nothing else. So perhaps reach out to some clang mailing list and try to implement it first in the libubsan... Marek
On 09/12/2013 06:05 PM, Joseph S. Myers wrote: > On Thu, 12 Sep 2013, Joseph S. Myers wrote: > >> (Actually, I believe sizes (in bytes) greater than target PTRDIFF_MAX, not >> just SIZE_MAX, should be caught, because pointer subtraction cannot work >> reliably with larger objects. So it's not just when the size or >> multiplication overflow size_t, but when they overflow ptrdiff_t.) > > And, to add a bit more to the list of possible ubsan features (is this > todo list maintained anywhere?), even if the size is such that operations > on the array are in principle defined, it's possible that adjusting the > stack pointer by too much may take it into other areas of memory and so > cause stack overflow that doesn't get detected by the kernel. So maybe > ubsan should imply -fstack-check or similar. I have on my to-do list to make -fstack-check production-ready, by avoiding unnecessary instrumentation. My initial experiments weren't too successful, but I think it should be possible to greatly reduce its overhead. If everything else fails, the idea is to reuse the Go split stack limit and check against that. The idea is that this would eventually be enabled in production code, much like -fstack-protector. I'm quite busy with other work at the moment, and a patch from me is probably months away, though. :-(
--- gcc/opts.c.mp 2013-09-12 13:30:53.299113222 +0200 +++ gcc/opts.c 2013-09-12 13:31:31.496263290 +0200 @@ -1426,6 +1426,7 @@ common_handle_option (struct gcc_options { "undefined", SANITIZE_UNDEFINED, sizeof "undefined" - 1 }, { "unreachable", SANITIZE_UNREACHABLE, sizeof "unreachable" - 1 }, + { "vla-bound", SANITIZE_VLA, sizeof "vla-bound" - 1 }, { NULL, 0, 0 } }; const char *comma; --- gcc/c-family/c-ubsan.c.mp 2013-09-12 13:30:17.306967720 +0200 +++ gcc/c-family/c-ubsan.c 2013-09-12 13:31:31.469263169 +0200 @@ -25,7 +25,6 @@ along with GCC; see the file COPYING3. #include "alloc-pool.h" #include "cgraph.h" #include "gimple.h" -#include "hash-table.h" #include "output.h" #include "toplev.h" #include "ubsan.h" @@ -89,8 +88,7 @@ ubsan_instrument_division (location_t lo return t; } -/* Instrument left and right shifts. If not instrumenting, return - NULL_TREE. */ +/* Instrument left and right shifts. */ tree ubsan_instrument_shift (location_t loc, enum tree_code code, @@ -155,4 +153,23 @@ ubsan_instrument_shift (location_t loc, t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node); return t; +} + +/* Instrument variable length array bound. */ + +tree +ubsan_instrument_vla (location_t loc, tree size) +{ + tree type = TREE_TYPE (size); + tree t, tt; + + t = fold_build2 (LE_EXPR, boolean_type_node, size, build_int_cst (type, 0)); + tree data = ubsan_create_data ("__ubsan_vla_data", + loc, ubsan_type_descriptor (type), NULL_TREE); + data = build_fold_addr_expr_loc (loc, data); + tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE); + tt = build_call_expr_loc (loc, tt, 2, data, ubsan_encode_value (size)); + t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node); + + return t; } --- gcc/c-family/c-ubsan.h.mp 2013-09-12 13:30:25.609000661 +0200 +++ gcc/c-family/c-ubsan.h 2013-09-12 13:31:31.475263194 +0200 @@ -23,5 +23,6 @@ along with GCC; see the file COPYING3. extern tree ubsan_instrument_division (location_t, tree, tree); extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree); +extern tree ubsan_instrument_vla (location_t, tree); #endif /* GCC_C_UBSAN_H */ --- gcc/sanitizer.def.mp 2013-09-12 13:30:59.667138181 +0200 +++ gcc/sanitizer.def 2013-09-12 13:31:31.497263294 +0200 @@ -297,3 +297,7 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN "__ubsan_handle_builtin_unreachable", BT_FN_VOID_PTR, ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE, + "__ubsan_handle_vla_bound_not_positive", + BT_FN_VOID_PTR_PTR, + ATTR_COLD_NOTHROW_LEAF_LIST) --- gcc/ubsan.c.mp 2013-09-12 13:31:06.197163836 +0200 +++ gcc/ubsan.c 2013-09-12 13:31:31.498263298 +0200 @@ -261,7 +261,9 @@ ubsan_type_descriptor (tree type) /* At least for INTEGER_TYPE/REAL_TYPE/COMPLEX_TYPE, this should work. ??? For e.g. type_unsigned_for (type), the TYPE_NAME would be NULL. */ - if (TYPE_NAME (type) != NULL) + if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE) + tname = IDENTIFIER_POINTER (TYPE_NAME (type)); + else if (TYPE_NAME (type) != NULL) tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type))); else tname = "<unknown>"; --- gcc/flag-types.h.mp 2013-09-12 13:30:47.130090269 +0200 +++ gcc/flag-types.h 2013-09-12 13:31:31.495263285 +0200 @@ -201,7 +201,9 @@ enum sanitize_code { SANITIZE_SHIFT = 1 << 2, SANITIZE_DIVIDE = 1 << 3, SANITIZE_UNREACHABLE = 1 << 4, + SANITIZE_VLA = 1 << 5, SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE + | SANITIZE_VLA }; /* flag_vtable_verify initialization levels. */ --- gcc/cp/decl.c.mp 2013-09-12 13:30:39.641060204 +0200 +++ gcc/cp/decl.c 2013-09-12 13:31:31.488263253 +0200 @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. #include "c-family/c-common.h" #include "c-family/c-objc.h" #include "c-family/c-pragma.h" +#include "c-family/c-ubsan.h" #include "diagnostic.h" #include "intl.h" #include "debug.h" @@ -8462,6 +8463,24 @@ create_array_type_for_decl (tree name, t if (cxx_dialect >= cxx1y && array_of_runtime_bound_p (type)) pedwarn (input_location, OPT_Wvla, "array of array of runtime bound"); + /* Do the instrumentation of VLAs if desired. */ + if ((flag_sanitize & SANITIZE_VLA) + && size && !TREE_CONSTANT (size) + /* From C++1y onwards, we throw an exception on a negative length size + of an array. */ + && cxx_dialect < cxx1y) + { + /* Prevent bogus set-but-not-used warnings: we're definitely using + the variable. */ + if (VAR_P (size)) + DECL_READ_P (size) = 1; + /* Evaluate the array size only once. */ + size = cp_save_expr (size); + size = fold_build2 (COMPOUND_EXPR, TREE_TYPE (size), + ubsan_instrument_vla (input_location, size), + size); + } + /* Figure out the index type for the array. */ if (size) itype = compute_array_index_type (name, size, tf_warning_or_error); --- gcc/c/c-decl.c.mp 2013-09-12 13:30:32.352029153 +0200 +++ gcc/c/c-decl.c 2013-09-12 13:31:31.478263209 +0200 @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3. #include "c-family/c-common.h" #include "c-family/c-objc.h" #include "c-family/c-pragma.h" +#include "c-family/c-ubsan.h" #include "c-lang.h" #include "langhooks.h" #include "tree-iterator.h" @@ -5381,6 +5382,16 @@ grokdeclarator (const struct c_declarato with known value. */ this_size_varies = size_varies = true; warn_variable_length_array (name, size); + if (flag_sanitize & SANITIZE_VLA + && decl_context == NORMAL) + { + /* Evaluate the array size only once. */ + size = c_save_expr (size); + size = c_fully_fold (size, false, NULL); + size = fold_build2 (COMPOUND_EXPR, TREE_TYPE (size), + ubsan_instrument_vla (loc, size), + size); + } } if (integer_zerop (size) && !this_size_varies) --- gcc/testsuite/g++.dg/ubsan/cxx1y-vla.C.mp 2013-09-12 13:17:55.242089503 +0200 +++ gcc/testsuite/g++.dg/ubsan/cxx1y-vla.C 2013-09-12 13:27:38.649460187 +0200 @@ -0,0 +1,13 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=vla-bound -w -std=c++1y" } */ +/* { dg-shouldfail "ubsan" } */ + +int +main (void) +{ + int y = -18; + int a[y]; + return 0; +} + +/* { dg-output "terminate called after throwing an instance" } */ --- gcc/testsuite/c-c++-common/ubsan/vla-3.c.mp 2013-09-12 10:48:11.719745997 +0200 +++ gcc/testsuite/c-c++-common/ubsan/vla-3.c 2013-09-12 12:06:43.178724666 +0200 @@ -0,0 +1,16 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=vla-bound -w" } */ + +/* Don't instrument the arrays here. */ +int +foo (int n, int a[]) +{ + return a[n - 1]; +} + +int +main (void) +{ + int a[6] = { }; + return foo (3, a); +} --- gcc/testsuite/c-c++-common/ubsan/vla-2.c.mp 2013-09-12 10:47:56.662693753 +0200 +++ gcc/testsuite/c-c++-common/ubsan/vla-2.c 2013-09-12 12:06:28.698670292 +0200 @@ -0,0 +1,15 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=vla-bound -w" } */ + +int +main (void) +{ + const int t = 0; + struct s { + int x; + /* Don't instrument this one. */ + int g[t]; + }; + + return 0; +} --- gcc/testsuite/c-c++-common/ubsan/vla-4.c.mp 2013-09-12 10:48:14.023754028 +0200 +++ gcc/testsuite/c-c++-common/ubsan/vla-4.c 2013-09-12 12:00:37.639137936 +0200 @@ -0,0 +1,13 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=vla-bound" } */ + +int +main (void) +{ + int x = 1; + /* Check that the size of an array is evaluated only once. */ + int a[++x]; + if (x != 2) + __builtin_abort (); + return 0; +} --- gcc/testsuite/c-c++-common/ubsan/vla-1.c.mp 2013-09-12 10:47:54.377685875 +0200 +++ gcc/testsuite/c-c++-common/ubsan/vla-1.c 2013-09-12 11:00:37.693810414 +0200 @@ -0,0 +1,48 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=vla-bound -w" } */ + +static int +bar (void) +{ + return -42; +} + +typedef long int V; +int +main (void) +{ + int x = -1; + double di = -3.2; + V v = -666; + + int a[x]; + int aa[x][x]; + int aaa[x][x][x]; + int b[x - 4]; + int c[(int) di]; + int d[1 + x]; + int e[1 ? x : -1]; + int f[++x]; + int g[(signed char) --x]; + int h[(++x, --x, x)]; + int i[v]; + int j[bar ()]; + + return 0; +} + +/* { dg-output "variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -5(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -3(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value 0(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value 0(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -666(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -42(\n|\r\n|\r)" } */ --- gcc/asan.c.mp 2013-09-12 13:30:10.530941672 +0200 +++ gcc/asan.c 2013-09-12 13:31:31.469263169 +0200 @@ -2018,6 +2018,9 @@ initialize_sanitizer_builtins (void) tree BT_FN_VOID = build_function_type_list (void_type_node, NULL_TREE); tree BT_FN_VOID_PTR = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE); + tree BT_FN_VOID_PTR_PTR + = build_function_type_list (void_type_node, ptr_type_node, + ptr_type_node, NULL_TREE); tree BT_FN_VOID_PTR_PTR_PTR = build_function_type_list (void_type_node, ptr_type_node, ptr_type_node, ptr_type_node, NULL_TREE);