[RFC] Implement Undefined Behavior Sanitizer

Message ID 20130605175728.GD4160@redhat.com
State New

Commit Message

Marek Polacek June 5, 2013, 5:57 p.m. UTC
Hi!

This is an attempt to add the Undefined Behavior Sanitizer to GCC.
Note that it's a very alpha version; so far it doesn't do much.
At the moment it should handle division by zero, INT_MIN / -1,
and various shift cases (shifting by a negative value, shifting when
the second operand is >= TYPE_PRECISION (first_operand), and suchlike).
(On integer types only, so far.)
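
As a minimal sketch of the two division conditions mentioned above, assuming
plain int operands (the helper name div_is_undefined is hypothetical and is
not part of the patch), the run-time check amounts to something like:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the patch.  Returns true when a / b
   (or a % b) would be undefined for int operands: either the divisor is
   zero, or the quotient INT_MIN / -1 does not fit in int.  */
static bool
div_is_undefined (int a, int b)
{
  return b == 0 || (a == INT_MIN && b == -1);
}
```

The patch builds the same condition out of trees rather than C source, so
this is only meant to show which operand values get reported.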

It works by creating a COMPOUND_EXPR around the original expression, so e.g.
it creates:

if (b < 0 || (b > 31 || a < 0))
  {
    __builtin___ubsan_handle_shift_out_of_bounds ();
  }
else
  {
    0
  }, a << b;

from original "a <<= b;".
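
Written as ordinary C, the shape of that tree is roughly a conditional call
joined to the original shift by the comma operator.  The sketch below assumes
int operands; report_shift_out_of_bounds is a hypothetical stand-in for the
ubsan runtime handler, and instrumented_shl is likewise an illustrative name:

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ubsan runtime handler; like the builtin
   declared in the patch, it does not return.  */
static void
report_shift_out_of_bounds (void)
{
  fputs ("shift out of bounds\n", stderr);
  abort ();
}

/* Mirrors the generated tree: a conditional call to the handler, then the
   original shift, joined by the comma operator (the COMPOUND_EXPR).  */
static int
instrumented_shl (int a, int b)
{
  /* Largest allowed shift count: 31 when int is 32 bits wide.  */
  const int max_count = (int) (sizeof (int) * CHAR_BIT) - 1;

  return ((b < 0 || b > max_count || a < 0)
          ? report_shift_out_of_bounds ()
          : (void) 0),
         a << b;
}
```

This only reproduces the condition shown above; it does not catch a
non-negative value being shifted into or past the sign bit.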

There is of course a lot of stuff that needs to be done, more
specifically:
  0) fix an ICE which I've noticed right now ;(
        long a = 1;
        int b = 3;
        a <<= b;
      (error: mismatching comparison operand types)
      temporarily solved by surrounding "doing_shift = true;"
      with if (comptypes (type0, type1))
      But that needs a better solution I'm afraid.  Bah.
  1) import & build the ubsan library from LLVM
     I've already spent some time on this, but failed miserably.  I thought
     that importing ubsan/ from LLVM into libsanitizer/, adding
     libsanitizer/ubsan/Makefile.{am,in}, editing libsanitizer/Makefile.am
     and libsanitizer/configure.ac, then running something like
     aclocal && automake would be sufficient, but no.  I'd very much
     appreciate any help with this; is someone willing to help me with this
     one?  And it seemed so easy...
  2) construct arguments for ubsan library
     I guess that if we want to call for instance
     void __ubsan::__ubsan_handle_shift_out_of_bounds(ShiftOutOfBoundsData *Data,
                                                 ValueHandle LHS, ValueHandle RHS)
     from GCC, we need to construct arguments compatible with
     ShiftOutOfBoundsData/ValueHandle.  
     So, perhaps we need some helper function that constructs the CALL_EXPR
     for the builtin; so far I haven't spent much time on this and don't know
     what exactly to do here.  Time to look at what asan/tsan do.
  3) add parsing of -fsanitize=<...>
     LLVM supports combinations such as -fsanitize=shift,divbyzero, and we
     should too.  This doesn't sound like a big deal; just parse the arguments
     and set various flags, or error out on invalid combinations.
  4) and of course, more instrumentation (C/C++ FE, gimple level)
     What comes to mind is:
     - float/double to integer conversions,
     - integer overflows (a long list of various cases here),
     - invalid conversions of int to bool,
     - reaching a __builtin_unreachable() call,
     - VLA sizes (e.g. negative size),
     - store to/load of misaligned address,
     - store to/load of null pointer,
     - etc.
     For the time being, I plan to work on overflow instrumentation.
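
For item 3, a rough sketch of parsing a comma-separated -fsanitize= argument
into a flag mask could look like the following.  The names SAN_SHIFT,
SAN_DIVBYZERO and parse_sanitize_list are hypothetical, and the two accepted
strings are simply the ones from the example above, not a checked list of
what any compiler accepts:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits, one per check.  */
enum { SAN_SHIFT = 1 << 0, SAN_DIVBYZERO = 1 << 1 };

/* Parse a comma-separated list such as "shift,divbyzero".  Returns the
   OR of the matching flag bits, or -1 if any element is empty or
   unknown.  */
static int
parse_sanitize_list (const char *arg)
{
  static const struct { const char *name; int flag; } known[] = {
    { "shift", SAN_SHIFT },
    { "divbyzero", SAN_DIVBYZERO },
  };
  int flags = 0;

  for (;;)
    {
      /* Length of the current element, up to the next comma or the end.  */
      size_t len = strcspn (arg, ",");
      int found = 0;

      for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strlen (known[i].name) == len
            && memcmp (known[i].name, arg, len) == 0)
          {
            flags |= known[i].flag;
            found = 1;
          }

      if (!found)
        return -1;
      if (arg[len] == '\0')
        return flags;
      arg += len + 1;  /* Skip the element and its trailing comma.  */
    }
}
```

Rejecting invalid combinations would be a separate step on the returned
mask; the sketch only rejects unknown or empty names.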

Regtested/bootstrapped on x86_64-linux.

Comments, please?

2013-06-05  Marek Polacek  <polacek@redhat.com>

	* Makefile.in: Add ubsan.c
	* common.opt: Add -fsanitize=undefined option.
	* doc/invoke.texi: Document the new flag.
	* ubsan.h: New file.
	* ubsan.c): New file.
	* sanitizer.def (DEF_SANITIZER_BUILTIN):
	* builtins.def: Define BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW and
	BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS.
	* cp/typeck.c (cp_build_binary_op): Add division by zero and shift
	instrumentation.
	* c/c-typeck.c (build_binary_op): Likewise.
	* builtin-attrs.def: Define ATTR_COLD.
	* asan.c (initialize_sanitizer_builtins): Build
	BT_FN_VOID_PTR_PTR_PTR.


	Marek

Comments

Andrew Pinski June 5, 2013, 6:44 p.m. UTC | #1
On Wed, Jun 5, 2013 at 10:57 AM, Marek Polacek <polacek@redhat.com> wrote:
> Hi!
>
> This is an attempt to add the Undefined Behavior Sanitizer to GCC.
> Note that it's very alpha version; so far it doesn't do that much,
> at the moment it should handle division by zero cases, INT_MIN / -1,
> and various shift cases (shifting by a negative value, shifting when
> second operand is >= than TYPE_PRECISION (first_operand) and suchlike.
> (On integer types, so far.)
>
> It works by creating a COMPOUND_EXPR around original expression, so e.g.
> it creates:
>
> if (b < 0 || (b > 31 || a < 0))
>   {
>     __builtin___ubsan_handle_shift_out_of_bounds ();
>   }
> else
>   {
>     0
>   }, a << b;
>
> from original "a <<= b;".
>
> There is of course a lot of stuff that needs to be done, more
> specifically:
>   0) fix an ICE which I've noticed right now ;(
>         long a = 1;
>         int b = 3;
>         a <<= b;
>       (error: mismatching comparison operand types)
>       temporarily solved by surrounding "doing_shift = true;"
>       with if (comptypes (type0, type1))
>       But that needs a better solution I'm afraid.  Bah.
>   1) import & build the ubsan library from LLVM
>      I've already spent some time on this, but failed miserably.  I've thought
>      that importing ubsan/ from LLVM into libsanitizer/, adding
>      libsanitizer/ubsan/Makefile.{am,in}, editing libsanitizer/Makefile.am
>      and libsanitizer/configure.ac, then something like aclocal && automake
>      could be sufficient, but no.  I'd very much appreciate any help with
>      this; is someone willing to help me with this one?  And it seemed so easy...
>   2) construct arguments for ubsan library
>      I guess that if we want to call for instance
>      void __ubsan::__ubsan_handle_shift_out_of_bounds(ShiftOutOfBoundsData *Data,
>                                                  ValueHandle LHS, ValueHandle RHS)
>      from GCC, we need to construct arguments compatible with
>      ShiftOutOfBoundsData/ValueHandle.
>      So, perhaps we need some helper function that constructs the CALL_EXPR
>      for the builtin; so far I haven't spent much time on this and don't know
>      what exactly to do here.  Time to look at what asan/tsan do.
>   3) add parsing of -fsanitize=<...>
>      LLVM supports e.g. -fsanitize=shift,divbyzero combination, we should too.
>      This doesn't sound like a big deal; just parse the arguments and set
>      various flags, or error out on invalid combinations.
>   4) and of course, more instrumentation (C/C++ FE, gimple level)
>      What comes to mind is:
>      - float/double to integer conversions,
>      - integer overflows (a long list of various cases here),
>      - invalid conversions of int to bool,
>      - reaching a __builtin_unreachable() call,
>      - VLAs size (e.g. negative size),
>      - store to/load of misaligned address,
>      - store to/load of null pointer,
>      - etc.
>      For the time being, I plan to work on overflows instrumentation.
>
> Regtested/bootstrapped on x86_64-linux.
>
> Comments, please?
I think it might be better to handle this during gimplification
rather than while parsing.  The main reason is that constexpr
evaluation might fail due to the added function calls.

Also, please don't shorten file names like ubsan; we already have file
names which don't fit in the older POSIX tar format and need extended
length support.

Thanks,
Andrew Pinski


>
> 2013-06-05  Marek Polacek  <polacek@redhat.com>
>
>         * Makefile.in: Add ubsan.c
>         * common.opt: Add -fsanitize=undefined option.
>         * doc/invoke.texi: Document the new flag.
>         * ubsan.h: New file.
>         * ubsan.c): New file.
>         * sanitizer.def (DEF_SANITIZER_BUILTIN):
>         * builtins.def: Define BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW and
>         BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS.
>         * cp/typeck.c (cp_build_binary_op): Add division by zero and shift
>         instrumentation.
>         * c/c-typeck.c (build_binary_op): Likewise.
>         * builtin-attrs.def: Define ATTR_COLD.
>         * asan.c (initialize_sanitizer_builtins): Build
>         BT_FN_VOID_PTR_PTR_PTR.
>
> --- gcc/sanitizer.def.mp        2013-06-05 18:23:41.077439836 +0200
> +++ gcc/sanitizer.def   2013-06-05 18:26:04.749921181 +0200
> @@ -283,3 +283,13 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_ATOM
>  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_ATOMIC_SIGNAL_FENCE,
>                       "__tsan_atomic_signal_fence",
>                       BT_FN_VOID_INT, ATTR_NOTHROW_LEAF_LIST)
> +
> +/* Undefined Behavior Sanitizer */
> +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
> +                     "__ubsan_handle_divrem_overflow",
> +                     BT_FN_VOID_PTR_PTR_PTR,
> +                     ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS,
> +                     "__ubsan_handle_shift_out_of_bounds",
> +                     BT_FN_VOID_PTR_PTR_PTR,
> +                     ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
> --- gcc/builtins.def.mp 2013-06-05 18:23:41.072439816 +0200
> +++ gcc/builtins.def    2013-06-05 18:26:04.728921097 +0200
> @@ -155,7 +155,7 @@ along with GCC; see the file COPYING3.
>  #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
>    DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
>                true, true, true, ATTRS, true, \
> -              (flag_asan || flag_tsan))
> +              (flag_asan || flag_tsan || flag_ubsan))
>
>  #undef DEF_CILKPLUS_BUILTIN
>  #define DEF_CILKPLUS_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
> --- gcc/Makefile.in.mp  2013-06-05 18:23:25.807388466 +0200
> +++ gcc/Makefile.in     2013-06-05 18:26:04.723921077 +0200
> @@ -1377,6 +1377,7 @@ OBJS = \
>         tree-affine.o \
>         asan.o \
>         tsan.o \
> +       ubsan.o \
>         tree-call-cdce.o \
>         tree-cfg.o \
>         tree-cfgcleanup.o \
> @@ -2259,6 +2260,10 @@ tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_
>     $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_CORE_H) $(GIMPLE_H) tree-iterator.h \
>     intl.h cfghooks.h output.h options.h c-family/c-common.h tsan.h asan.h \
>     tree-ssa-propagate.h
> +ubsan.o : ubsan.c $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
> +   output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) $(CFGLOOP_H) \
> +   tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
> +   $(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h
>  tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
>     $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
>     $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \
> --- gcc/doc/invoke.texi.mp      2013-06-05 18:29:18.301611796 +0200
> +++ gcc/doc/invoke.texi 2013-06-05 18:33:53.756623280 +0200
> @@ -5143,6 +5143,11 @@ Memory access instructions will be instr
>  data race bugs.
>  See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for more details.
>
> +@item -fsanitize=undefined
> +Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector
> +Various computations will be instrumented to detect
> +undefined behavior, e.g. division by zero or various overflows.
> +
>  @item -fdump-final-insns@r{[}=@var{file}@r{]}
>  @opindex fdump-final-insns
>  Dump the final internal representation (RTL) to @var{file}.  If the
> --- gcc/ubsan.h.mp      2013-06-05 18:23:55.083486235 +0200
> +++ gcc/ubsan.h 2013-06-05 18:10:21.284693807 +0200
> @@ -0,0 +1,27 @@
> +/* UndefinedBehaviorSanitizer, undefined behavior detector.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +   Contributed by Marek Polacek <polacek@redhat.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_UBSAN_H
> +#define GCC_UBSAN_H
> +
> +extern tree ubsan_instrument_division (location_t, enum tree_code, tree, tree);
> +extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
> +
> +#endif  /* GCC_UBSAN_H  */
> --- gcc/ubsan.c.mp      2013-06-05 18:23:49.411467508 +0200
> +++ gcc/ubsan.c 2013-06-05 18:00:25.000000000 +0200
> @@ -0,0 +1,107 @@
> +/* UndefinedBehaviorSanitizer, undefined behavior detector.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +   Contributed by Marek Polacek <polacek@redhat.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "c-family/c-common.h"
> +
> +/* Instrument division by zero and INT_MIN / -1.  */
> +
> +tree
> +ubsan_instrument_division (location_t loc, enum tree_code code,
> +                          tree op0, tree op1)
> +{
> +  tree t, tt;
> +  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
> +
> +  if (TREE_CODE (TREE_TYPE (op0)) != INTEGER_TYPE
> +      || TREE_CODE (TREE_TYPE (op1)) != INTEGER_TYPE)
> +    return orig;
> +
> +  /* If we *know* that the divisor is not -1 or 0, we don't have to
> +     instrument this expression.
> +     ??? We could use decl_constant_value to cover up more cases.  */
> +  if (TREE_CODE (op1) == INTEGER_CST
> +      && integer_nonzerop (op1)
> +      && !integer_minus_onep (op1))
> +    return orig;
> +
> +  tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
> +                   integer_minus_one_node);
> +  t = fold_build2 (EQ_EXPR, boolean_type_node, op0,
> +                  TYPE_MIN_VALUE (TREE_TYPE (op0)));
> +  t = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, t, tt);
> +  tt = build2 (EQ_EXPR, boolean_type_node,
> +              op1, integer_zero_node);
> +  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, t);
> +  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW);
> +  // XXX Do we want _loc version here?
> +  tt = build_call_expr_loc (loc, tt, 0);
> +  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
> +  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (orig), t, orig);
> +
> +  return t;
> +}
> +
> +/* Instrument left and right shifts.  */
> +
> +tree
> +ubsan_instrument_shift (location_t loc, enum tree_code code,
> +                       tree op0, tree op1)
> +{
> +  tree t, tt;
> +  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
> +  tree prec = build_int_cst (TREE_TYPE (op0),
> +                            TYPE_PRECISION (TREE_TYPE (op0)));
> +
> +  t = fold_build2 (LT_EXPR, boolean_type_node, op1, integer_zero_node);
> +  tt = fold_build2 (GE_EXPR, boolean_type_node, op1, prec);
> +
> +  /* int a = 1;
> +     a <<= 31;
> +     is undefined in C99/C11.  */
> +  if (code == LSHIFT_EXPR
> +      && !TYPE_UNSIGNED (TREE_TYPE (op0))
> +      && (flag_isoc99 || flag_isoc11))
> +    {
> +      tree prec1 = build_int_cst (TREE_TYPE (op1),
> +                                 TYPE_PRECISION (TREE_TYPE (op1)) - 1);
> +      tree x = fold_build2 (EQ_EXPR, boolean_type_node, op1, prec1);
> +      tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, x);
> +    }
> +
> +  /* For left shift, shifting a negative value is undefined.  */
> +  if (code == LSHIFT_EXPR)
> +    {
> +      tree x = fold_build2 (LT_EXPR, boolean_type_node, op0,
> +                           integer_zero_node);
> +      tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, x);
> +    }
> +
> +  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t, tt);
> +  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS);
> +  tt = build_call_expr_loc (loc, tt, 0);
> +  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
> +  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (orig), t, orig);
> +
> +  return t;
> +}
> --- gcc/cp/typeck.c.mp  2013-06-05 18:23:41.076439832 +0200
> +++ gcc/cp/typeck.c     2013-06-05 18:26:04.746921169 +0200
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.
>  #include "intl.h"
>  #include "target.h"
>  #include "convert.h"
> +#include "ubsan.h"
>  #include "c-family/c-common.h"
>  #include "c-family/c-objc.h"
>  #include "params.h"
> @@ -3891,6 +3892,12 @@ cp_build_binary_op (location_t location,
>    op0 = orig_op0;
>    op1 = orig_op1;
>
> +  /* Remember whether we're doing / or %.  */
> +  bool doing_div_or_mod = false;
> +
> +  /* Remember whether we're doing << or >>.  */
> +  bool doing_shift = false;
> +
>    if (code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR
>        || code == TRUTH_OR_EXPR || code == TRUTH_ORIF_EXPR
>        || code == TRUTH_XOR_EXPR)
> @@ -4070,8 +4077,15 @@ cp_build_binary_op (location_t location,
>         {
>           enum tree_code tcode0 = code0, tcode1 = code1;
>           tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
> +         cop1 = maybe_constant_value (cop1);
> +
> +         if (!processing_template_decl && tcode0 == INTEGER_TYPE
> +             && (TREE_CODE (cop1) != INTEGER_CST
> +                 || integer_zerop (cop1)
> +                 || integer_minus_onep (cop1)))
> +           doing_div_or_mod = true;
>
> -         warn_for_div_by_zero (location, maybe_constant_value (cop1));
> +         warn_for_div_by_zero (location, cop1);
>
>           if (tcode0 == COMPLEX_TYPE || tcode0 == VECTOR_TYPE)
>             tcode0 = TREE_CODE (TREE_TYPE (TREE_TYPE (op0)));
> @@ -4109,8 +4123,14 @@ cp_build_binary_op (location_t location,
>      case FLOOR_MOD_EXPR:
>        {
>         tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
> +       cop1 = maybe_constant_value (cop1);
>
> -       warn_for_div_by_zero (location, maybe_constant_value (cop1));
> +       if (!processing_template_decl && code0 == INTEGER_TYPE
> +           && (TREE_CODE (cop1) != INTEGER_CST
> +               || integer_zerop (cop1)
> +               || integer_minus_onep (cop1)))
> +         doing_div_or_mod = true;
> +       warn_for_div_by_zero (location, cop1);
>        }
>
>        if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
> @@ -4164,6 +4184,7 @@ cp_build_binary_op (location_t location,
>           if (TREE_CODE (const_op1) != INTEGER_CST)
>             const_op1 = op1;
>           result_type = type0;
> +         doing_shift = true;
>           if (TREE_CODE (const_op1) == INTEGER_CST)
>             {
>               if (tree_int_cst_lt (const_op1, integer_zero_node))
> @@ -4211,6 +4232,7 @@ cp_build_binary_op (location_t location,
>           if (TREE_CODE (const_op1) != INTEGER_CST)
>             const_op1 = op1;
>           result_type = type0;
> +         doing_shift = true;
>           if (TREE_CODE (const_op1) == INTEGER_CST)
>             {
>               if (tree_int_cst_lt (const_op1, integer_zero_node))
> @@ -4607,6 +4629,18 @@ cp_build_binary_op (location_t location,
>        break;
>      }
>
> +  if (flag_ubsan && doing_div_or_mod)
> +    {
> +      resultcode = COMPOUND_EXPR;
> +      return ubsan_instrument_division (location, code, op0, op1);
> +    }
> +
> +  if (flag_ubsan && doing_shift)
> +    {
> +      resultcode = COMPOUND_EXPR;
> +      return ubsan_instrument_shift (location, code, op0, op1);
> +    }
> +
>    if (((code0 == INTEGER_TYPE || code0 == REAL_TYPE || code0 == COMPLEX_TYPE
>         || code0 == ENUMERAL_TYPE)
>         && (code1 == INTEGER_TYPE || code1 == REAL_TYPE
> --- gcc/common.opt.mp   2013-06-05 18:23:41.075439828 +0200
> +++ gcc/common.opt      2013-06-05 18:26:04.740921145 +0200
> @@ -858,6 +858,10 @@ fsanitize=thread
>  Common Report Var(flag_tsan)
>  Enable ThreadSanitizer, a data race detector
>
> +fsanitize=undefined
> +Common Report Var(flag_ubsan)
> +Enable UndefinedBehaviorSanitizer, an undefined behavior detector
> +
>  fasynchronous-unwind-tables
>  Common Report Var(flag_asynchronous_unwind_tables) Optimization
>  Generate unwind tables that are exact at each instruction boundary
> --- gcc/builtin-attrs.def.mp    2013-06-05 18:23:41.071439812 +0200
> +++ gcc/builtin-attrs.def       2013-06-05 18:26:04.727921093 +0200
> @@ -83,6 +83,7 @@ DEF_LIST_INT_INT (5,6)
>  #undef DEF_LIST_INT_INT
>
>  /* Construct trees for identifiers.  */
> +DEF_ATTR_IDENT (ATTR_COLD, "cold")
>  DEF_ATTR_IDENT (ATTR_CONST, "const")
>  DEF_ATTR_IDENT (ATTR_FORMAT, "format")
>  DEF_ATTR_IDENT (ATTR_FORMAT_ARG, "format_arg")
> @@ -130,6 +131,8 @@ DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHRO
>                         ATTR_NULL, ATTR_NOTHROW_LIST)
>  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
>                         ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
> +DEF_ATTR_TREE_LIST (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST, ATTR_COLD,\
> +                       ATTR_NULL, ATTR_NORETURN_NOTHROW_LEAF_LIST)
>  DEF_ATTR_TREE_LIST (ATTR_CONST_NORETURN_NOTHROW_LEAF_LIST, ATTR_CONST,\
>                         ATTR_NULL, ATTR_NORETURN_NOTHROW_LEAF_LIST)
>  DEF_ATTR_TREE_LIST (ATTR_MALLOC_NOTHROW_LIST, ATTR_MALLOC,     \
> --- gcc/c/c-typeck.c.mp 2013-06-05 18:23:41.073439820 +0200
> +++ gcc/c/c-typeck.c    2013-06-05 18:26:04.736921129 +0200
> @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-iterator.h"
>  #include "bitmap.h"
>  #include "gimple.h"
> +#include "ubsan.h"
>  #include "c-family/c-objc.h"
>  #include "c-family/c-common.h"
>
> @@ -9542,6 +9543,12 @@ build_binary_op (location_t location, en
>       operands to truth-values.  */
>    bool boolean_op = false;
>
> +  /* Remember whether we're doing / or %.  */
> +  bool doing_div_or_mod = false;
> +
> +  /* Remember whether we're doing << or >>.  */
> +  bool doing_shift = false;
> +
>    if (location == UNKNOWN_LOCATION)
>      location = input_location;
>
> @@ -9743,6 +9750,7 @@ build_binary_op (location_t location, en
>      case FLOOR_DIV_EXPR:
>      case ROUND_DIV_EXPR:
>      case EXACT_DIV_EXPR:
> +      doing_div_or_mod = true;
>        warn_for_div_by_zero (location, op1);
>
>        if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
> @@ -9790,6 +9798,7 @@ build_binary_op (location_t location, en
>
>      case TRUNC_MOD_EXPR:
>      case FLOOR_MOD_EXPR:
> +      doing_div_or_mod = true;
>        warn_for_div_by_zero (location, op1);
>
>        if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
> @@ -9888,6 +9897,7 @@ build_binary_op (location_t location, en
>        else if ((code0 == INTEGER_TYPE || code0 == FIXED_POINT_TYPE)
>           && code1 == INTEGER_TYPE)
>         {
> +         doing_shift = true;
>           if (TREE_CODE (op1) == INTEGER_CST)
>             {
>               if (tree_int_cst_sgn (op1) < 0)
> @@ -9940,6 +9950,7 @@ build_binary_op (location_t location, en
>        else if ((code0 == INTEGER_TYPE || code0 == FIXED_POINT_TYPE)
>           && code1 == INTEGER_TYPE)
>         {
> +         doing_shift = true;
>           if (TREE_CODE (op1) == INTEGER_CST)
>             {
>               if (tree_int_cst_sgn (op1) < 0)
> @@ -10224,6 +10235,20 @@ build_binary_op (location_t location, en
>        return error_mark_node;
>      }
>
> +  if (flag_ubsan && doing_div_or_mod)
> +    {
> +      ret = ubsan_instrument_division (location, code, op0, op1);
> +      resultcode = COMPOUND_EXPR;
> +      goto return_build_binary_op;
> +    }
> +
> +  if (flag_ubsan && doing_shift)
> +    {
> +      ret = ubsan_instrument_shift (location, code, op0, op1);
> +      resultcode = COMPOUND_EXPR;
> +      goto return_build_binary_op;
> +    }
> +
>    if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE || code0 == COMPLEX_TYPE
>         || code0 == FIXED_POINT_TYPE || code0 == VECTOR_TYPE)
>        &&
> --- gcc/asan.c.mp       2013-06-05 18:23:41.070439808 +0200
> +++ gcc/asan.c  2013-06-05 18:26:04.726921089 +0200
> @@ -2034,6 +2034,9 @@ initialize_sanitizer_builtins (void)
>    tree BT_FN_VOID = build_function_type_list (void_type_node, NULL_TREE);
>    tree BT_FN_VOID_PTR
>      = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
> +  tree BT_FN_VOID_PTR_PTR_PTR
> +    = build_function_type_list (void_type_node, ptr_type_node,
> +                               ptr_type_node, ptr_type_node, NULL_TREE);
>    tree BT_FN_VOID_PTR_PTRMODE
>      = build_function_type_list (void_type_node, ptr_type_node,
>                                 build_nonstandard_integer_type (POINTER_SIZE,
> @@ -2099,6 +2102,9 @@ initialize_sanitizer_builtins (void)
>  #undef ATTR_TMPURE_NORETURN_NOTHROW_LEAF_LIST
>  #define ATTR_TMPURE_NORETURN_NOTHROW_LEAF_LIST \
>    ECF_TM_PURE | ATTR_NORETURN_NOTHROW_LEAF_LIST
> +#undef ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST
> +#define ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST \
> +  /* ECF_COLD missing */ ATTR_NORETURN_NOTHROW_LEAF_LIST
>  #undef DEF_SANITIZER_BUILTIN
>  #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
>    decl = add_builtin_function ("__builtin_" NAME, TYPE, ENUM,          \
>
>         Marek
Jakub Jelinek June 5, 2013, 7:19 p.m. UTC | #2
On Wed, Jun 05, 2013 at 07:57:28PM +0200, Marek Polacek wrote:
> There is of course a lot of stuff that needs to be done, more
> specifically:
>   0) fix an ICE which I've noticed right now ;(
>         long a = 1;
>         int b = 3;
>         a <<= b;
>       (error: mismatching comparison operand types)
>       temporarily solved by surrounding "doing_shift = true;"
>       with if (comptypes (type0, type1))
>       But that needs a better solution I'm afraid.  Bah.

> +  tree t, tt;
> +  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
> +  tree prec = build_int_cst (TREE_TYPE (op0),
> +			     TYPE_PRECISION (TREE_TYPE (op0)));

You compare prec with op1, so they should have the same type.  Shifts
are one of the few binary ops whose operands can have different types
(the result type is the same as that of the first argument; the second
argument can be something else).  So, if you use TREE_TYPE (op1) as the
type of prec, you should be fine.  More importantly, perhaps you can just
use precm1 in all the places: use GT_EXPR for tt with precm1, and
use it in the EQ as well.

That said, the C99 rules look somewhat different.  0 << 31 is perfectly
valid, and so is int x = 0; x << 31.  What is undefined in C99 (likely C11
too and maybe C89 as well?) is the usual shift count out of bounds (negative
or >= prec), or the first operand being signed and negative, or the first
operand being signed and non-negative but, for x << y, x * 2^y overflowing
in the type of x.
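
A minimal sketch of those rules for a left shift of int operands, written as
a plain predicate (shl_is_undefined_c99 is a hypothetical name, used only
for illustration):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper.  Returns true when x << y is undefined under the
   rules described above, for int operands.  */
static bool
shl_is_undefined_c99 (int x, int y)
{
  const int prec = (int) (sizeof (int) * CHAR_BIT);

  /* Shift count out of bounds.  */
  if (y < 0 || y >= prec)
    return true;
  /* Negative first operand.  */
  if (x < 0)
    return true;
  /* Non-negative x: x * 2^y fits in int exactly when x <= INT_MAX >> y.  */
  return x > (INT_MAX >> y);
}
```

With a 32-bit int this accepts 0 << 31 and 1 << 30 but reports 1 << 31.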

>   1) import & build the ubsan library from LLVM
>      I've already spent some time on this, but failed miserably.  I've thought
>      that importing ubsan/ from LLVM into libsanitizer/, adding
>      libsanitizer/ubsan/Makefile.{am,in}, editing libsanitizer/Makefile.am
>      and libsanitizer/configure.ac, then something like aclocal && automake
>      could be sufficient, but no.  I'd very much appreciate any help with
>      this; is someone willing to help me with this one?  And it seemed so easy...

I'll look at this tomorrow.

>   2) construct arguments for ubsan library
>      I guess that if we want to call for instance
>      void __ubsan::__ubsan_handle_shift_out_of_bounds(ShiftOutOfBoundsData *Data,
>                                                  ValueHandle LHS, ValueHandle RHS)
>      from GCC, we need to construct arguments compatible with
>      ShiftOutOfBoundsData/ValueHandle.  
>      So, perhaps we need some helper function that constructs the CALL_EXPR
>      for the builtin; so far I haven't spent much time on this and don't know
>      what exactly to do here.  Time to look at what asan/tsan do.
>   3) add parsing of -fsanitize=<...>
>      LLVM supports e.g. -fsanitize=shift,divbyzero combination, we should too.
>      This doesn't sound like a big deal; just parse the arguments and set
>      various flags, or error out on invalid combinations.
>   4) and of course, more instrumentation (C/C++ FE, gimple level)
>      What comes to mind is:
>      - float/double to integer conversions,
>      - integer overflows (a long list of various cases here),
>      - invalid conversions of int to bool,
>      - reaching a __builtin_unreachable() call,
>      - VLAs size (e.g. negative size),
>      - store to/load of misaligned address,
>      - store to/load of null pointer,
>      - etc.
>      For the time being, I plan to work on overflows instrumentation.

For at least signed addition, subtraction and multiplication overflow we
ideally want to handle it very efficiently on CPUs that can do so,
so on x86_64/i386 pretty much addl followed by jo.
We need some builtin for that, either one with two return values
(this can be done right now, say by returning a vector or complex int,
where one integer is the result of the addition/subtraction/multiplication
and the other a flag saying whether we've overflowed), or maybe we want a
new tree code for that or something.
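
To illustrate the "two return values" idea at the source level only, here is
a portable sketch for signed addition.  The struct add_result and the
function checked_add are hypothetical names, and this says nothing about how
such a builtin would be expanded to addl/jo:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical pair of results: the sum and an overflow flag.  */
struct add_result
{
  int value;
  bool overflowed;
};

/* Adds a and b without ever evaluating an overflowing signed addition.
   On overflow the flag is set and value is left as 0 in this sketch.  */
static struct add_result
checked_add (int a, int b)
{
  struct add_result r;

  r.overflowed = (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
  r.value = r.overflowed ? 0 : a + b;
  return r;
}
```

An instrumented a + b would then call the ubsan handler when the flag is set
and use value otherwise.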

> 2013-06-05  Marek Polacek  <polacek@redhat.com>
> 
> 	* Makefile.in: Add ubsan.c

Missing dot at end of line.

> 	* common.opt: Add -fsanitize=undefined option.
> 	* doc/invoke.texi: Document the new flag.
> 	* ubsan.h: New file.
> 	* ubsan.c): New file.

Extra ).  I'd prefer it if the support routines for ubsan instrumentation
done only in the C/C++ FEs lived in c-family/c-ubsan.[ch] or so.
ubsan.[ch] can perhaps then be used for any instrumentation done at the
gimplification level (if anything is suitable for that), or as support code
for both that and c-ubsan.c.

> 	* sanitizer.def (DEF_SANITIZER_BUILTIN):

Define. ?

> 	* builtins.def: Define BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW and
> 	BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS.

	* builtins.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
	BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS): Define.

cp/ stuff goes into cp/ ChangeLog, without cp/ paths.

> 	* cp/typeck.c (cp_build_binary_op): Add division by zero and shift
> 	instrumentation.

Please make sure you only add it for !processing_template_decl.

Again, c/ ChangeLog.

> 	* c/c-typeck.c (build_binary_op): Likewise.
> 	* builtin-attrs.def: Define ATTR_COLD.

(ATTR_COLD): Define.

Also, the question is where exactly to place these calls to c-ubsan.c
functions.  You generally want them before stuff like short_compare and
similar handling, but on the other hand you want them after type promotion
(seems ok already) and, e.g. for the division, also after conversion to a
single result_type.  Say the ubsan division libcall wants both arguments
to have the same type (unlike the ubsan shift call, which of course has two
types), so if you have long long l; char c; l / c or c / l you want
both arguments already converted to long long.

	Jakub
Jakub Jelinek June 5, 2013, 7:23 p.m. UTC | #3
On Wed, Jun 05, 2013 at 11:44:07AM -0700, Andrew Pinski wrote:
> On Wed, Jun 5, 2013 at 10:57 AM, Marek Polacek <polacek@redhat.com> wrote:
> > Comments, please?
> I think it might be better to do handle this while gimplification
> happens rather than while parsing.  The main reason is that constexpr
> might fail due to the added function calls.

Gimplification is too late, the FEs perform various operation shortenings
etc. in many cases, and what exactly is undefined behavior is apparently
heavily dependent on the particular language (C has different rules from
C++).  Yes, constexpr is something to consider in this light, but not
something that can't be handled (recognizing ubsan builtins and just
handling them specially).

> Also please don't shorten file names like ubsan,  we already have file
> names which don't fit in the older POSIX tar format and needs extended
> length support.

We already have asan.c and tsan.c, and that is how it is commonly called.

	Jakub
Jakub Jelinek June 5, 2013, 7:35 p.m. UTC | #4
On Wed, Jun 05, 2013 at 09:19:10PM +0200, Jakub Jelinek wrote:
> On Wed, Jun 05, 2013 at 07:57:28PM +0200, Marek Polacek wrote:
> > +  tree t, tt;
> > +  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
> > +  tree prec = build_int_cst (TREE_TYPE (op0),
> > +			     TYPE_PRECISION (TREE_TYPE (op0)));

BTW, also, to check that the shift count is not < 0 or >= prec, you can
just test that fold_convert_loc (loc, unsigned_type_for (TREE_TYPE (op1)), op1)
is less than or equal to (LE_EXPR) precm1 (also using the unsigned type).
While optimizers often fold it to that, you might very well just create
fewer trees from the start.
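
As a rough illustration in plain C (hypothetical function names, int
operands assumed), the two-comparison form and the single unsigned
comparison flag the same shift counts:

```c
/* Two-comparison form: count < 0 || count >= prec.  */
int
count_out_of_range_2cmp (int count, int prec)
{
  return count < 0 || count >= prec;
}

/* Single-comparison form: a negative count converts to a large unsigned
   value, so one unsigned comparison against prec - 1 covers both cases.  */
int
count_out_of_range_1cmp (int count, int prec)
{
  unsigned int precm1 = (unsigned int) prec - 1;
  return (unsigned int) count > precm1;
}
```

This is only a sketch of the arithmetic; the tree-building code would
express the same comparison with fold_build2.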

The C99 undefined behavior of left signed shift can be tested by
testing if ((unsigned type for op0's type) op0) >> (precm1 - y) is
non-zero.

	Jakub
Andrew Pinski June 5, 2013, 7:40 p.m. UTC | #5
On Wed, Jun 5, 2013 at 12:23 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Jun 05, 2013 at 11:44:07AM -0700, Andrew Pinski wrote:
>> On Wed, Jun 5, 2013 at 10:57 AM, Marek Polacek <polacek@redhat.com> wrote:
>> > Comments, please?
>> I think it might be better to do handle this while gimplification
>> happens rather than while parsing.  The main reason is that constexpr
>> might fail due to the added function calls.
>
> Gimplification is too late, the FEs perform various operation shortenings
> etc. in many cases, and what exactly is undefined behavior is apparently
> heavily dependent on the particular language (C has different rules from
> C++).  Yes, constexpr is something to consider in this light, but not
> something that can't be handled (recognizing ubsan builtins and just
> handling them specially).
>
>> Also please don't shorten file names like ubsan,  we already have file
>> names which don't fit in the older POSIX tar format and needs extended
>> length support.
>
> We already have asan.c and tsan.c, and that is how it is commonly called.

Can we just move them to array-sanitizer and thread-sanitizer?  I
think those are better names than asan and tsan.  Shortened names are
not useful when a new person is learning the code.

Thanks,
Andrew

>
>         Jakub
Joseph Myers June 5, 2013, 7:50 p.m. UTC | #6
On Wed, 5 Jun 2013, Marek Polacek wrote:

> It works by creating a COMPOUND_EXPR around original expression, so e.g.
> it creates:
> 
> if (b < 0 || (b > 31 || a < 0))
>   {
>     __builtin___ubsan_handle_shift_out_of_bounds ();
>   }
> else
>   {
>     0
>   }, a << b;
> 
> from original "a <<= b;".

For the "a < 0" here, and signed left shift of a positive value shifting a 
1 into or past the sign bit, I think it should be possible to control the 
checks separately from other checks on shifts - both because those cases 
were implementation-defined in C90, only undefined in C99/C11, and because 
they are widely used in practice.

> There is of course a lot of stuff that needs to be done, more
> specifically:

5) Testcases (or if applicable, running existing testcases coming with the 
library).

6) Map -ftrapv onto an appropriate subset of this option that handles the 
cases -ftrapv was meant to handle (so arithmetic overflow, which I'd say 
should include INT_MIN / -1).

>   4) and of course, more instrumentation (C/C++ FE, gimple level)
>      What comes to mind is:
>      - float/double to integer conversions,

Under Annex F, these return an unspecified value rather than being 
undefined behavior.

>      - integer overflows (a long list of various cases here),

Strictly, including INT_MIN % -1 (both / and % are undefined if the result 
of either is unrepresentable) - it appears you've already got that.  Of 
course INT_MIN % -1 and INT_MIN / -1 should *work* reliably with -fwrapv, 
which is another bug (30484).
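
For reference, a minimal sketch of the results one would expect from
wrapping semantics, assuming two's complement int.  The helper names are
hypothetical and this is not how GCC implements -fwrapv; it only spells out
the arithmetic without ever executing the problematic division.

```c
#include <limits.h>

/* Quotient with wrapping semantics; assumes b != 0.  INT_MIN / -1 is not
   representable, so the wrapped result is INT_MIN itself.  */
int
wrapv_div (int a, int b)
{
  if (b == -1)
    return a == INT_MIN ? INT_MIN : -a;
  return a / b;
}

/* Remainder with wrapping semantics; assumes b != 0.  Any value % -1 is
   mathematically 0, including INT_MIN % -1.  */
int
wrapv_rem (int a, int b)
{
  if (b == -1)
    return 0;
  return a % b;
}
```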

>      - invalid conversions of int to bool,

What do you mean?  Conversion to bool is just a comparison != 0.

>      - VLAs size (e.g. negative size),

Or the multiplication used to compute the size in bytes overflows (really, 
there should be some code generated expanding the stack bit by bit to 
avoid it accidentally overflowing into another allocated area of memory, I 
suppose).
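
A minimal sketch of such a size check in plain C.  The function name and
signature are hypothetical, and it covers only the non-positive size and
the overflowing multiplication, not the stack expansion mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: returns 1 and stores the byte count in *bytes if a
   VLA of n elements of elem_size bytes each has a valid size, else 0.  */
int
vla_size_ok (long long n, size_t elem_size, size_t *bytes)
{
  if (n <= 0)
    return 0;	/* C requires the size to be greater than zero.  */
  if (elem_size != 0 && (unsigned long long) n > SIZE_MAX / elem_size)
    return 0;	/* n * elem_size would overflow size_t.  */
  *bytes = (size_t) n * elem_size;
  return 1;
}
```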

> +@item -fsanitize=undefined
> +Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector
> +Various computations will be instrumented to detect
> +undefined behavior, e.g. division by zero or various overflows.

e.g.@:
Joseph Myers June 5, 2013, 7:57 p.m. UTC | #7
On Wed, 5 Jun 2013, Jakub Jelinek wrote:

> On Wed, Jun 05, 2013 at 11:44:07AM -0700, Andrew Pinski wrote:
> > On Wed, Jun 5, 2013 at 10:57 AM, Marek Polacek <polacek@redhat.com> wrote:
> > > Comments, please?
> > I think it might be better to do handle this while gimplification
> > happens rather than while parsing.  The main reason is that constexpr
> > might fail due to the added function calls.
> 
> Gimplification is too late, the FEs perform various operation shortenings
> etc. in many cases, and what exactly is undefined behavior is apparently
> heavily dependent on the particular language (C has different rules from
> C++).  Yes, constexpr is something to consider in this light, but not
> something that can't be handled (recognizing ubsan builtins and just
> handling them specially).

Agreed, this needs handling before folding and other optimizations in the 
front ends to have predictable results.

It may make sense to try running the whole testsuite with this option, 
minus tests of -fwrapv, to make sure it doesn't break any corner cases of 
valid tests (of course it may well show up some invalid tests).  In 
particular, gcc.dg/*const-expr* and gcc.dg/overflow-warn*.  Generating 
extra diagnostics for code in those tests that already gets a diagnostic 
is OK, as long as it doesn't generate diagnostics for non-overflow cases 
in those tests that aren't meant to be treated as overflow, or lose 
diagnostics for cases that are required to be diagnosed.
Jakub Jelinek June 6, 2013, 6:07 a.m. UTC | #8
On Wed, Jun 05, 2013 at 09:35:08PM +0200, Jakub Jelinek wrote:
> On Wed, Jun 05, 2013 at 09:19:10PM +0200, Jakub Jelinek wrote:
> > On Wed, Jun 05, 2013 at 07:57:28PM +0200, Marek Polacek wrote:
> > > +  tree t, tt;
> > > +  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
> > > +  tree prec = build_int_cst (TREE_TYPE (op0),
> > > +			     TYPE_PRECISION (TREE_TYPE (op0)));
> 
> BTW, also, to check that the shift count is not < 0 or >= prec, you can
> just test that fold_convert_loc (loc, unsigned_type_for (TREE_TYPE (op1)), op1)
> is LE_EXPR than precm1 (also using the unsigned type).
> While optimizers often fold it to that, you might very well just create
> fewer trees from the start.
> 
> The C99 undefined behavior of left signed shift can be tested by
> testing if ((unsigned type for op0's type) op0) >> (precm1 - y) is
> non-zero.

The C++11/C++14 undefined behavior of left signed shift can be tested
similarly, if ((unsigned type for op0's type) op0) >> (precm1 - y)
is greater than one, then it is undefined behavior.
Jason, does
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457
apply just to C++11/C++14, or to C++03 too?

In C++03 I see in [expr.shift]/2
"The value of E1 << E2 is E1 (interpreted as a bit pattern) left-shifted E2
bit positions; vacated bits are zero-filled. If E1 has an unsigned type,
the value of the result is E1 multiplied by the quantity 2 raised to
the power E2, reduced modulo ULONG_MAX+1 if E1 has type unsigned long,
UINT_MAX+1 otherwise."  Is that the same case as C90 then, the wording seems
to be pretty much the same?

As for controlling the C99 (or even C++11?) warning individually, either it
can be a separate suboption of -fsanitize=, like -fsanitize=shift,lshiftc99
(but then, would lshiftc99 be included in undefined and similar option
groups?), or, IMHO better, we just convince ubsan upstream to have an env var
for controlling the lshift diagnostics.  GCC would then always emit checks for
precisely what the current -std= makes undefined behavior (though, because of
DRs, that is somewhat fuzzy: pre-DR1457 C++11 vs. post-DR1457 C++11), and
users would choose through the env var: ok, please ignore left shift warnings
of the 1 << 31 style, or ignore those and also the 2 << 31 style.

	Jakub
Konstantin Serebryany June 6, 2013, 7:46 a.m. UTC | #9
On Wed, Jun 5, 2013 at 11:40 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Wed, Jun 5, 2013 at 12:23 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Wed, Jun 05, 2013 at 11:44:07AM -0700, Andrew Pinski wrote:
>>> On Wed, Jun 5, 2013 at 10:57 AM, Marek Polacek <polacek@redhat.com> wrote:
>>> > Comments, please?
>>> I think it might be better to do handle this while gimplification
>>> happens rather than while parsing.  The main reason is that constexpr
>>> might fail due to the added function calls.
>>
>> Gimplification is too late, the FEs perform various operation shortenings
>> etc. in many cases, and what exactly is undefined behavior is apparently
>> heavily dependent on the particular language (C has different rules from
>> C++).  Yes, constexpr is something to consider in this light, but not
>> something that can't be handled (recognizing ubsan builtins and just
>> handling them specially).
>>
>>> Also please don't shorten file names like ubsan,  we already have file
>>> names which don't fit in the older POSIX tar format and needs extended
>>> length support.
>>
>> We already have asan.c and tsan.c, and that is how it is commonly called.
>
> Can we just move them to array-sanitizer and thread-sanitizer?  I

s/array-sanitizer/address-sanitizer/

If we are going to import the ubsan run-time from LLVM's
projects/compiler-rt/lib/ubsan, we may also need to update the contents of
libsanitizer/sanitizer_common and keep them in sync afterwards.
(ubsan shares a few bits of code with asan/tsan/msan.)
The simplest way to do that is to extend libsanitizer/merge.sh.
--kcc


> think those are better names than asan and tsan.  Shorten names are
> not useful when a new person is learning the code.
>
> Thanks,
> Andrew
>
>>
>>         Jakub
Jason Merrill June 6, 2013, 12:17 p.m. UTC | #10
On 06/06/2013 02:07 AM, Jakub Jelinek wrote:
> Jason, does
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457
> apply just to C++11/C++14, or to C++03 too?

The committee hasn't said anything about which DRs since C++03 apply to 
it.  I take the position that most do, but not this one, since it is a 
change to wording that doesn't exist in C++03.

> In C++03 I see in [expr.shift]/2
> "The value of E1 << E2 is E1 (interpreted as a bit pattern) left-shifted E2
> bit positions; vacated bits are zero-filled. If E1 has an unsigned type,
> the value of the result is E1 multiplied by the quantity 2 raised to
> the power E2, reduced modulo ULONG_MAX+1 if E1 has type unsigned long,
> UINT_MAX+1 otherwise."  Is that the same case as C90 then, the wording seems
> to be pretty much the same?

Yes, that's the same as C90.

> what the current -std= makes as undefined behavior (though, because of DRs
> that is somewhat fuzzy, pre-DR1457 C++11 vs. post-DR1457 C++11)

In contrast to the C++03 situation, the committee has been clear about 
which DRs apply to C++11 and which to C++1y, and this one does apply to 
C++11.

It's unfortunate that C and C++ have different rules here.  I'm actually 
inclined to agree with comment 48 from 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n834.htm that we should 
have left the C90/C++98 rules alone, but I guess that comment was rejected.

Jason
Segher Boessenkool June 6, 2013, 1:26 p.m. UTC | #11
> The C++11/C++14 undefined behavior of left signed shift can be tested
> similarly, if ((unsigned type for op0's type) op0) >> (precm1 - y)
> is greater than one, then it is undefined behavior.
> Jason, does
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457
> apply just to C++11/C++14, or to C++03 too?

Doesn't DR1457 also leave

    neg << 0

as undefined, where "neg" is a negative value?  That isn't caught by
your "greater than one" expression.


Segher
Jakub Jelinek June 6, 2013, 1:35 p.m. UTC | #12
On Thu, Jun 06, 2013 at 03:26:19PM +0200, Segher Boessenkool wrote:
> >The C++11/C++14 undefined behavior of left signed shift can be tested
> >similarly, if ((unsigned type for op0's type) op0) >> (precm1 - y)
> >is greater than one, then it is undefined behavior.
> >Jason, does
> >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457
> >apply just to C++11/C++14, or to C++03 too?
> 
> Doesn't DR1457 also leave
> 
>    neg << 0
> 
> as undefined, where "neg" is a negative value?  That isn't caught by
> your "greater than one" expression.

Yeah, of course, it needs to be for any shift x << y or x >> y (signed or unsigned):
1) if ((unsigned) y > precm1) ub
plus for signed x << y:
2) for C99/C11 if ((unsigned) x >> (precm1 - y)) ub
3) for C++11/C++14 if (x < 0 || ((unsigned) x >> (precm1 - y)) > 1) ub
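
Rules 2) and 3) can be written out in plain C roughly as follows.  This is
a sketch under stated assumptions: x is an int with no padding bits, the
function names are hypothetical, and rule 1) is taken as a precondition.

```c
#include <assert.h>
#include <limits.h>

/* precm1 for type int.  */
#define PRECM1 ((int) (sizeof (int) * CHAR_BIT) - 1)

/* Rule 2, C99/C11: x << y is undefined if x is negative or if a 1 bit
   would be shifted into or past the sign bit.  */
int
lshift_is_ub_c99 (int x, int y)
{
  assert (y >= 0 && y <= PRECM1);	/* Rule 1 already checked.  */
  return ((unsigned int) x >> (PRECM1 - y)) != 0;
}

/* Rule 3, C++11/C++14 with DR1457: a negative x is undefined, but a 1 bit
   may land in the sign bit as long as nothing is shifted past it.  */
int
lshift_is_ub_cxx11 (int x, int y)
{
  assert (y >= 0 && y <= PRECM1);	/* Rule 1 already checked.  */
  return x < 0 || ((unsigned int) x >> (PRECM1 - y)) > 1;
}
```

With these, 1 << 31 is flagged only by the C99/C11 rule, 2 << 31 by both,
and neg << 0 by both (the C99/C11 test catches it because the sign bit of
the unsigned value is set).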

	Jakub
Marek Polacek June 7, 2013, 12:38 p.m. UTC | #13
On Wed, Jun 05, 2013 at 07:50:52PM +0000, Joseph S. Myers wrote:
> On Wed, 5 Jun 2013, Marek Polacek wrote:
> 
> > It works by creating a COMPOUND_EXPR around original expression, so e.g.
> > it creates:
> > 
> > if (b < 0 || (b > 31 || a < 0))
> >   {
> >     __builtin___ubsan_handle_shift_out_of_bounds ();
> >   }
> > else
> >   {
> >     0
> >   }, a << b;
> > 
> > from original "a <<= b;".
> 
> For the "a < 0" here, and signed left shift of a positive value shifting a 
> 1 into or past the sign bit, I think it should be possible to control the 
> checks separately from other checks on shifts - both because those cases 
> were implementation-defined in C90, only undefined in C99/C11, and because 
> they are widely used in practice.

Ok, I see.

> > There is of course a lot of stuff that needs to be done, more
> > specifically:
> 
> 5) Testcases (or if applicable, running existing testcases coming with the 
> library).

Yeah -- we definitely want to have some testcases; the trouble is
that, like for tsan, we don't have any infrastructure for that yet.
Probably we could just put new tests into gcc.dg and put
-fsanitize=undefined into dg-options?  Or maybe tweak .exp files and
run some testcases also with -fsanitize=undefined, but the thing is
that we can't use dg-do compile tests, we need dg-do run tests.

> 6) Map -ftrapv onto an appropriate subset of this option that handles the 
> cases -ftrapv was meant to handle (so arithmetic overflow, which I'd say 
> should include INT_MIN / -1).

Ok, we can look at this maybe later when ubsan is more mature.

> >   4) and of course, more instrumentation (C/C++ FE, gimple level)
> >      What comes to mind is:
> >      - float/double to integer conversions,
> 
> Under Annex F, these return an unspecified value rather than being 
> undefined behavior.

Aha, good to know.  I've mentioned it because clang instruments that.

> >      - integer overflows (a long list of various cases here),
> 
> Strictly, including INT_MIN % -1 (both / and % are undefined if the result 
> of either is unrepresentable) - it appears you've already got that.  Of 
> course INT_MIN % -1 and INT_MIN / -1 should *work* reliably with -fwrapv, 
> which is another bug (30484).
> 
> >      - invalid conversions of int to bool,
> 
> What do you mean?  Conversion to bool is just a comparison != 0.

Something like e.g.:

unsigned char c = 42;
int
main (void)
{
  _Bool *b = (_Bool *) &c;
  return *b;
}

(clang catches this.)
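
A rough idea of what such a check amounts to, written as plain C.  The
helper name is hypothetical, and the sketch assumes sizeof (_Bool) == 1,
which holds on common targets but is not guaranteed by the standard; it
says nothing about how clang actually implements its check.

```c
#include <string.h>

/* Hypothetical check: returns 1 if the first byte of the object at P is
   a valid _Bool representation (0 or 1), 0 otherwise.  */
int
bool_load_is_valid (const void *p)
{
  unsigned char raw;
  /* Read the raw byte without going through a _Bool lvalue.  */
  memcpy (&raw, p, sizeof raw);
  return raw <= 1;
}
```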

> >      - VLAs size (e.g. negative size),
> 
> Or the multiplication used to compute the size in bytes overflows (really, 
> there should be some code generated expanding the stack bit by bit to 
> avoid it accidentally overflowing into another allocated area of memory, I 
> suppose).

Yeah, that sounds interesting as well.

> > +@item -fsanitize=undefined
> > +Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector
> > +Various computations will be instrumented to detect
> > +undefined behavior, e.g. division by zero or various overflows.
> 
> e.g.@:

Fixed.  Thanks!

	Marek

Patch

--- gcc/sanitizer.def.mp	2013-06-05 18:23:41.077439836 +0200
+++ gcc/sanitizer.def	2013-06-05 18:26:04.749921181 +0200
@@ -283,3 +283,13 @@  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_ATOM
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_ATOMIC_SIGNAL_FENCE,
 		      "__tsan_atomic_signal_fence",
 		      BT_FN_VOID_INT, ATTR_NOTHROW_LEAF_LIST)
+
+/* Undefined Behavior Sanitizer */
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
+		      "__ubsan_handle_divrem_overflow",
+		      BT_FN_VOID_PTR_PTR_PTR,
+		      ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS,
+		      "__ubsan_handle_shift_out_of_bounds",
+		      BT_FN_VOID_PTR_PTR_PTR,
+		      ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
--- gcc/builtins.def.mp	2013-06-05 18:23:41.072439816 +0200
+++ gcc/builtins.def	2013-06-05 18:26:04.728921097 +0200
@@ -155,7 +155,7 @@  along with GCC; see the file COPYING3.
 #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
 	       true, true, true, ATTRS, true, \
-	       (flag_asan || flag_tsan))
+	       (flag_asan || flag_tsan || flag_ubsan))
 
 #undef DEF_CILKPLUS_BUILTIN
 #define DEF_CILKPLUS_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
--- gcc/Makefile.in.mp	2013-06-05 18:23:25.807388466 +0200
+++ gcc/Makefile.in	2013-06-05 18:26:04.723921077 +0200
@@ -1377,6 +1377,7 @@  OBJS = \
 	tree-affine.o \
 	asan.o \
 	tsan.o \
+	ubsan.o \
 	tree-call-cdce.o \
 	tree-cfg.o \
 	tree-cfgcleanup.o \
@@ -2259,6 +2260,10 @@  tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_
    $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_CORE_H) $(GIMPLE_H) tree-iterator.h \
    intl.h cfghooks.h output.h options.h c-family/c-common.h tsan.h asan.h \
    tree-ssa-propagate.h
+ubsan.o : ubsan.c $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
+   output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) $(CFGLOOP_H) \
+   tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
+   $(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \
--- gcc/doc/invoke.texi.mp	2013-06-05 18:29:18.301611796 +0200
+++ gcc/doc/invoke.texi	2013-06-05 18:33:53.756623280 +0200
@@ -5143,6 +5143,11 @@  Memory access instructions will be instr
 data race bugs.
 See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for more details.
 
+@item -fsanitize=undefined
+Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector
+Various computations will be instrumented to detect
+undefined behavior, e.g. division by zero or various overflows.
+
 @item -fdump-final-insns@r{[}=@var{file}@r{]}
 @opindex fdump-final-insns
 Dump the final internal representation (RTL) to @var{file}.  If the
--- gcc/ubsan.h.mp	2013-06-05 18:23:55.083486235 +0200
+++ gcc/ubsan.h	2013-06-05 18:10:21.284693807 +0200
@@ -0,0 +1,27 @@ 
+/* UndefinedBehaviorSanitizer, undefined behavior detector.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Marek Polacek <polacek@redhat.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_UBSAN_H
+#define GCC_UBSAN_H
+
+extern tree ubsan_instrument_division (location_t, enum tree_code, tree, tree);
+extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
+
+#endif  /* GCC_UBSAN_H  */
--- gcc/ubsan.c.mp	2013-06-05 18:23:49.411467508 +0200
+++ gcc/ubsan.c	2013-06-05 18:00:25.000000000 +0200
@@ -0,0 +1,107 @@ 
+/* UndefinedBehaviorSanitizer, undefined behavior detector.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Marek Polacek <polacek@redhat.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-family/c-common.h"
+
+/* Instrument division by zero and INT_MIN / -1.  */
+
+tree
+ubsan_instrument_division (location_t loc, enum tree_code code,
+			   tree op0, tree op1)
+{
+  tree t, tt;
+  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
+
+  if (TREE_CODE (TREE_TYPE (op0)) != INTEGER_TYPE
+      || TREE_CODE (TREE_TYPE (op1)) != INTEGER_TYPE)
+    return orig;
+
+  /* If we *know* that the divisor is not -1 or 0, we don't have to
+     instrument this expression.
+     ??? We could use decl_constant_value to cover up more cases.  */
+  if (TREE_CODE (op1) == INTEGER_CST
+      && integer_nonzerop (op1)
+      && !integer_minus_onep (op1))
+    return orig;
+
+  tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
+		    integer_minus_one_node);
+  t = fold_build2 (EQ_EXPR, boolean_type_node, op0,
+		   TYPE_MIN_VALUE (TREE_TYPE (op0)));
+  t = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, t, tt);
+  tt = build2 (EQ_EXPR, boolean_type_node,
+	       op1, integer_zero_node);
+  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, t);
+  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW);
+  // XXX Do we want _loc version here?
+  tt = build_call_expr_loc (loc, tt, 0);
+  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
+  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (orig), t, orig);
+
+  return t;
+}
+
+/* Instrument left and right shifts.  */
+
+tree
+ubsan_instrument_shift (location_t loc, enum tree_code code,
+			tree op0, tree op1)
+{
+  tree t, tt;
+  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
+  tree prec = build_int_cst (TREE_TYPE (op0),
+			     TYPE_PRECISION (TREE_TYPE (op0)));
+
+  t = fold_build2 (LT_EXPR, boolean_type_node, op1, integer_zero_node);
+  tt = fold_build2 (GE_EXPR, boolean_type_node, op1, prec);
+
+  /* int a = 1;
+     a <<= 31;
+     is undefined in C99/C11.  */
+  if (code == LSHIFT_EXPR
+      && !TYPE_UNSIGNED (TREE_TYPE (op0))
+      && (flag_isoc99 || flag_isoc11))
+    {
+      tree prec1 = build_int_cst (TREE_TYPE (op1),
+				  TYPE_PRECISION (TREE_TYPE (op1)) - 1);
+      tree x = fold_build2 (EQ_EXPR, boolean_type_node, op1, prec1);
+      tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, x);
+    }
+
+  /* For left shift, shifting a negative value is undefined.  */
+  if (code == LSHIFT_EXPR)
+    {
+      tree x = fold_build2 (LT_EXPR, boolean_type_node, op0,
+			    integer_zero_node);
+      tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, x);
+    }
+
+  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t, tt);
+  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS);
+  tt = build_call_expr_loc (loc, tt, 0);
+  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
+  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (orig), t, orig);
+
+  return t;
+}
--- gcc/cp/typeck.c.mp	2013-06-05 18:23:41.076439832 +0200
+++ gcc/cp/typeck.c	2013-06-05 18:26:04.746921169 +0200
@@ -35,6 +35,7 @@  along with GCC; see the file COPYING3.
 #include "intl.h"
 #include "target.h"
 #include "convert.h"
+#include "ubsan.h"
 #include "c-family/c-common.h"
 #include "c-family/c-objc.h"
 #include "params.h"
@@ -3891,6 +3892,12 @@  cp_build_binary_op (location_t location,
   op0 = orig_op0;
   op1 = orig_op1;
 
+  /* Remember whether we're doing / or %.  */
+  bool doing_div_or_mod = false;
+
+  /* Remember whether we're doing << or >>.  */
+  bool doing_shift = false;
+
   if (code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR
       || code == TRUTH_OR_EXPR || code == TRUTH_ORIF_EXPR
       || code == TRUTH_XOR_EXPR)
@@ -4070,8 +4077,15 @@  cp_build_binary_op (location_t location,
 	{
 	  enum tree_code tcode0 = code0, tcode1 = code1;
 	  tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
+	  cop1 = maybe_constant_value (cop1);
+
+	  if (!processing_template_decl && tcode0 == INTEGER_TYPE
+	      && (TREE_CODE (cop1) != INTEGER_CST
+		  || integer_zerop (cop1)
+		  || integer_minus_onep (cop1)))
+	    doing_div_or_mod = true;
 
-	  warn_for_div_by_zero (location, maybe_constant_value (cop1));
+	  warn_for_div_by_zero (location, cop1);
 
 	  if (tcode0 == COMPLEX_TYPE || tcode0 == VECTOR_TYPE)
 	    tcode0 = TREE_CODE (TREE_TYPE (TREE_TYPE (op0)));
@@ -4109,8 +4123,14 @@  cp_build_binary_op (location_t location,
     case FLOOR_MOD_EXPR:
       {
 	tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
+	cop1 = maybe_constant_value (cop1);
 
-	warn_for_div_by_zero (location, maybe_constant_value (cop1));
+	if (!processing_template_decl && code0 == INTEGER_TYPE
+	    && (TREE_CODE (cop1) != INTEGER_CST
+		|| integer_zerop (cop1)
+		|| integer_minus_onep (cop1)))
+	  doing_div_or_mod = true;
+	warn_for_div_by_zero (location, cop1);
       }
 
       if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
@@ -4164,6 +4184,7 @@  cp_build_binary_op (location_t location,
 	  if (TREE_CODE (const_op1) != INTEGER_CST)
 	    const_op1 = op1;
 	  result_type = type0;
+	  doing_shift = true;
 	  if (TREE_CODE (const_op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_lt (const_op1, integer_zero_node))
@@ -4211,6 +4232,7 @@  cp_build_binary_op (location_t location,
 	  if (TREE_CODE (const_op1) != INTEGER_CST)
 	    const_op1 = op1;
 	  result_type = type0;
+	  doing_shift = true;
 	  if (TREE_CODE (const_op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_lt (const_op1, integer_zero_node))
@@ -4607,6 +4629,18 @@  cp_build_binary_op (location_t location,
       break;
     }
 
+  if (flag_ubsan && doing_div_or_mod)
+    {
+      resultcode = COMPOUND_EXPR;
+      return ubsan_instrument_division (location, code, op0, op1);
+    }
+
+  if (flag_ubsan && doing_shift)
+    {
+      resultcode = COMPOUND_EXPR;
+      return ubsan_instrument_shift (location, code, op0, op1);
+    }
+
   if (((code0 == INTEGER_TYPE || code0 == REAL_TYPE || code0 == COMPLEX_TYPE
 	|| code0 == ENUMERAL_TYPE)
        && (code1 == INTEGER_TYPE || code1 == REAL_TYPE
--- gcc/common.opt.mp	2013-06-05 18:23:41.075439828 +0200
+++ gcc/common.opt	2013-06-05 18:26:04.740921145 +0200
@@ -858,6 +858,10 @@  fsanitize=thread
 Common Report Var(flag_tsan)
 Enable ThreadSanitizer, a data race detector
 
+fsanitize=undefined
+Common Report Var(flag_ubsan)
+Enable UndefinedBehaviorSanitizer, an undefined behavior detector
+
 fasynchronous-unwind-tables
 Common Report Var(flag_asynchronous_unwind_tables) Optimization
 Generate unwind tables that are exact at each instruction boundary
--- gcc/builtin-attrs.def.mp	2013-06-05 18:23:41.071439812 +0200
+++ gcc/builtin-attrs.def	2013-06-05 18:26:04.727921093 +0200
@@ -83,6 +83,7 @@  DEF_LIST_INT_INT (5,6)
 #undef DEF_LIST_INT_INT
 
 /* Construct trees for identifiers.  */
+DEF_ATTR_IDENT (ATTR_COLD, "cold")
 DEF_ATTR_IDENT (ATTR_CONST, "const")
 DEF_ATTR_IDENT (ATTR_FORMAT, "format")
 DEF_ATTR_IDENT (ATTR_FORMAT_ARG, "format_arg")
@@ -130,6 +131,8 @@  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHRO
 			ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
 			ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
+DEF_ATTR_TREE_LIST (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST, ATTR_COLD,\
+			ATTR_NULL, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_CONST_NORETURN_NOTHROW_LEAF_LIST, ATTR_CONST,\
 			ATTR_NULL, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_MALLOC_NOTHROW_LIST, ATTR_MALLOC,	\
--- gcc/c/c-typeck.c.mp	2013-06-05 18:23:41.073439820 +0200
+++ gcc/c/c-typeck.c	2013-06-05 18:26:04.736921129 +0200
@@ -37,6 +37,7 @@  along with GCC; see the file COPYING3.
 #include "tree-iterator.h"
 #include "bitmap.h"
 #include "gimple.h"
+#include "ubsan.h"
 #include "c-family/c-objc.h"
 #include "c-family/c-common.h"
 
@@ -9542,6 +9543,12 @@  build_binary_op (location_t location, en
      operands to truth-values.  */
   bool boolean_op = false;
 
+  /* Remember whether we're doing / or %.  */
+  bool doing_div_or_mod = false;
+
+  /* Remember whether we're doing << or >>.  */
+  bool doing_shift = false;
+
   if (location == UNKNOWN_LOCATION)
     location = input_location;
 
@@ -9743,6 +9750,7 @@  build_binary_op (location_t location, en
     case FLOOR_DIV_EXPR:
     case ROUND_DIV_EXPR:
     case EXACT_DIV_EXPR:
+      doing_div_or_mod = true;
       warn_for_div_by_zero (location, op1);
 
       if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
@@ -9790,6 +9798,7 @@  build_binary_op (location_t location, en
 
     case TRUNC_MOD_EXPR:
     case FLOOR_MOD_EXPR:
+      doing_div_or_mod = true;
       warn_for_div_by_zero (location, op1);
 
       if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
@@ -9888,6 +9897,7 @@  build_binary_op (location_t location, en
       else if ((code0 == INTEGER_TYPE || code0 == FIXED_POINT_TYPE)
 	  && code1 == INTEGER_TYPE)
 	{
+	  doing_shift = true;
 	  if (TREE_CODE (op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_sgn (op1) < 0)
@@ -9940,6 +9950,7 @@  build_binary_op (location_t location, en
       else if ((code0 == INTEGER_TYPE || code0 == FIXED_POINT_TYPE)
 	  && code1 == INTEGER_TYPE)
 	{
+	  doing_shift = true;
 	  if (TREE_CODE (op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_sgn (op1) < 0)
@@ -10224,6 +10235,20 @@  build_binary_op (location_t location, en
       return error_mark_node;
     }
 
+  if (flag_ubsan && doing_div_or_mod)
+    {
+      ret = ubsan_instrument_division (location, code, op0, op1);
+      resultcode = COMPOUND_EXPR;
+      goto return_build_binary_op;
+    }
+
+  if (flag_ubsan && doing_shift)
+    {
+      ret = ubsan_instrument_shift (location, code, op0, op1);
+      resultcode = COMPOUND_EXPR;
+      goto return_build_binary_op;
+    }
+
   if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE || code0 == COMPLEX_TYPE
        || code0 == FIXED_POINT_TYPE || code0 == VECTOR_TYPE)
       &&
--- gcc/asan.c.mp	2013-06-05 18:23:41.070439808 +0200
+++ gcc/asan.c	2013-06-05 18:26:04.726921089 +0200
@@ -2034,6 +2034,9 @@  initialize_sanitizer_builtins (void)
   tree BT_FN_VOID = build_function_type_list (void_type_node, NULL_TREE);
   tree BT_FN_VOID_PTR
     = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  tree BT_FN_VOID_PTR_PTR_PTR
+    = build_function_type_list (void_type_node, ptr_type_node,
+				ptr_type_node, ptr_type_node, NULL_TREE);
   tree BT_FN_VOID_PTR_PTRMODE
     = build_function_type_list (void_type_node, ptr_type_node,
 				build_nonstandard_integer_type (POINTER_SIZE,
@@ -2099,6 +2102,9 @@  initialize_sanitizer_builtins (void)
 #undef ATTR_TMPURE_NORETURN_NOTHROW_LEAF_LIST
 #define ATTR_TMPURE_NORETURN_NOTHROW_LEAF_LIST \
   ECF_TM_PURE | ATTR_NORETURN_NOTHROW_LEAF_LIST
+#undef ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST
+#define ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST \
+  /* ECF_COLD missing */ ATTR_NORETURN_NOTHROW_LEAF_LIST
 #undef DEF_SANITIZER_BUILTIN
 #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
   decl = add_builtin_function ("__builtin_" NAME, TYPE, ENUM,		\