Patchwork [RFC] Implement Undefined Behavior Sanitizer (take 2)

Submitter Marek Polacek
Date June 11, 2013, 6:44 p.m.
Message ID <20130611184430.GO4160@redhat.com>
Permalink /patch/250606/
State New

Comments

Marek Polacek - June 11, 2013, 6:44 p.m.
On Sat, Jun 08, 2013 at 08:22:33PM +0200, Jakub Jelinek wrote:
> > >+  tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
> > >+		    integer_minus_one_node);
> > 
> > Don't we usually try to have both operands of a comparison of the
> > same type?
> 
> Not just usually, it really has to be build_int_cst (TREE_TYPE (op1), -1).
> And, more importantly, at least in cp_build_binary_op the calls need to be
> moved further down in the function, at least after if (processing_template_decl)
> but e.g. for division the trouble is that shorten_binary_op is performed
> before actually promoting one or both operand to the result_type.  I guess
> for the diagnostics which prints the types, it would be best to diagnose
> using the promoted types and result_type constructed out of that, but
> without shorten_binary_op etc., that is just an optimization I think.
> So, maybe record the original result_type before shortening, and if
> shortening changed that, convert the arguments for the instrumentation only
> to the original result_type, otherwise use the conversion done normally.
> For shifts this isn't a big deal, because they always use result_type of the
> first operand after promotion, and the ubsan handler wants to see two types
> there (the question is, does it want for the shift amount look for the
> original shift count type, or the one converted to int)?
> 
> Also, perhaps it would be better if these ubsan_instrument* functions
> didn't return a COMPOUND_EXPR, but instead just the lhs of that (i.e. the
> actual instrumentation) and let the caller set some var to that and if that
> var is non-NULL, after building the binary operation build a COMPOUND_EXPR
> with lhs being the instrumentation and rhs the binary operation itself.

All should be resolved.
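
As a source-level analogy for the COMPOUND_EXPR arrangement suggested
above, here is a minimal C sketch using the comma operator.  The names
report_div_by_zero and checked_div are hypothetical and are not part of
the patch or of any sanitizer runtime; the stand-in handler returns
(the handlers in the patch are declared noreturn), so the sketch guards
the division itself.

```c
/* Hypothetical stand-in for a runtime handler.  It only counts calls so
   the sketch stays runnable.  */
static int reports;

static void
report_div_by_zero (void)
{
  reports++;
}

static int
checked_div (int a, int b)
{
  /* Left operand of the comma: the instrumentation, evaluated only for
     its side effect.  Right operand: the operation, whose value the
     whole expression yields.  The right operand avoids dividing by zero
     because the stand-in handler returns.  */
  return ((b == 0 ? report_div_by_zero () : (void) 0),
          (b == 0 ? 0 : a / b));
}
```

The caller sees the value of the right operand, which is why the
instrumentation can be attached after the binary operation is built.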

> > >+  t = fold_build2 (EQ_EXPR, boolean_type_node, op0,
> > >+		   TYPE_MIN_VALUE (TREE_TYPE (op0)));
> > 
> > I didn't see where this test was restricted to the signed case
> > (0u/-1 is well defined)?
> > 
> > >+  t = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, t, tt);
> > >+  tt = build2 (EQ_EXPR, boolean_type_node,
> > >+	       op1, integer_zero_node);
> > 
> > Why not fold this one?
> 
> Sure.  And yeah, the INT_MIN/-1 checking needs to be done for signed types
> only.

Done.
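
To illustrate why the INT_MIN / -1 test applies to signed operands only,
here is a plain-C sketch of the two predicates.  The function names are
made up for this note; the patch builds the equivalent conditions as
trees rather than calling helpers like these.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a / b (or a % b) would be undefined for signed int operands:
   a zero divisor, or the one quotient that does not fit in int.  */
static bool
sdiv_is_undefined (int a, int b)
{
  return b == 0 || (a == INT_MIN && b == -1);
}

/* For unsigned operands only a zero divisor is a problem.  Here -1
   converted to unsigned is just UINT_MAX, so 0u / -1 is well defined.  */
static bool
udiv_is_undefined (unsigned a, unsigned b)
{
  (void) a;
  return b == 0;
}
```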

> > >+tree
> > >+ubsan_instrument_shift (location_t loc, enum tree_code code,
> > >+			tree op0, tree op1)
> > >+{
> > >+  tree t, tt = NULL_TREE;
> > >+  tree orig = build2 (code, TREE_TYPE (op0), op0, op1);
> > >+  tree uprecm1 = build_int_cst (unsigned_type_for (TREE_TYPE (op1)),
> > >+			       TYPE_PRECISION (TREE_TYPE (op0)) - 1);
> > >+  tree precm1 = build_int_cst (TREE_TYPE (op1),
> > >+			       TYPE_PRECISION (TREE_TYPE (op0)) - 1);
> > 
> > (if we later want to extend this to vector-scalar shifts,
> > element_precision will be better than TYPE_PRECISION)
> > 
> > Name unsigned_type_for (TREE_TYPE (op1)) and TYPE_PRECISION
> > (TREE_TYPE (op0)) that are used several times?

Done.
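
For reference, a plain-C sketch of the two conditions that
ubsan_instrument_shift in the patch below builds for a signed x << y
under C99/C11.  The helper names and the INT_PREC macro are invented for
this note, and the sketch assumes int has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Width of int in bits, assuming no padding bits.  */
#define INT_PREC ((int) (sizeof (int) * CHAR_BIT))

/* Shift count out of range: negative, or at least the precision of the
   shifted operand.  The cast to unsigned folds both cases into one
   comparison, as the uprecm1 test in the patch does.  */
static bool
shift_count_out_of_range (int y)
{
  return (unsigned) y > (unsigned) (INT_PREC - 1);
}

/* C99/C11 signed left shift: undefined if x is negative or if the
   result is not representable.  Mirrors the check in the patch:
   (unsigned) x >> (precm1 - y) must be zero.  */
static bool
c99_lshift_is_undefined (int x, int y)
{
  if (shift_count_out_of_range (y))
    return true;
  return ((unsigned) x >> (INT_PREC - 1 - y)) != 0;
}
```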

Thanks for the review.
Here's another version, hopefully all issues are fixed.  During
the rewriting I had to fix a few ICEs, so this patch took more time.
I guess I might've misunderstood the cp_convert part, so sorry if
I did it wrong.

Lightly tested, I'm really starting to miss the ubsan testsuite ;).

Regtested on x86_64-linux.

2013-06-11  Marek Polacek  <polacek@redhat.com>

	* Makefile.in: Add ubsan.c.
	* common.opt: Add -fsanitize=undefined option.
	* doc/invoke.texi: Document the new flag.
	* sanitizer.def (DEF_SANITIZER_BUILTIN): Define.
	* builtin-attrs.def (ATTR_COLD): Define.
	* asan.c (initialize_sanitizer_builtins): Build
	BT_FN_VOID_PTR_PTR_PTR.
	* builtins.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
	BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS): Define.

c-family/
	* c-ubsan.c: New file.
	* c-ubsan.h: New file.

cp/
	* typeck.c (cp_build_binary_op): Add division by zero and shift
	instrumentation.

c/
	* c-typeck.c (build_binary_op): Add division by zero and shift
	instrumentation.



   	Marek
Marc Glisse - June 11, 2013, 7:14 p.m.
Hello,

couple comments (not a true review)

On Tue, 11 Jun 2013, Marek Polacek wrote:

> +tree
> +ubsan_instrument_division (location_t loc, tree op0, tree op1)
> +{
> +  tree t, tt;
> +  tree type0 = TREE_TYPE (op0);
> +  tree type1 = TREE_TYPE (op1);

Can the 2 types be different? I thought divisions had homogeneous 
arguments, and the instrumentation was done late enough to avoid any 
potential issue, but maybe not...

> +  tree type1_zero_cst = build_int_cst (type1, 0);

It is a bit funny to do that before the following test ;-)

> +  if (TREE_CODE (type0) != INTEGER_TYPE
> +      || TREE_CODE (type1) != INTEGER_TYPE)
> +    return NULL_TREE;
> +
> +  /* If we *know* that the divisor is not -1 or 0, we don't have to
> +     instrument this expression.
> +     ??? We could use decl_constant_value to cover up more cases.  */
> +  if (TREE_CODE (op1) == INTEGER_CST
> +      && integer_nonzerop (op1)
> +      && !integer_minus_onep (op1))
> +    return NULL_TREE;
> +
> +  /* We check INT_MIN / -1 only for signed types.  */
> +  if (!TYPE_UNSIGNED (type0) && !TYPE_UNSIGNED (type1))
> +    {
> +      tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
> +			build_int_cst (type1, -1));
> +      t = fold_build2 (EQ_EXPR, boolean_type_node, op0,
> +		       TYPE_MIN_VALUE (type0));
> +      t = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, t, tt);
> +    }
> +  else
> +    t = type1_zero_cst;
> +  tt = fold_build2 (EQ_EXPR, boolean_type_node,
> +		    op1, type1_zero_cst);
> +  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, t);

If you wrote the comparison with 0 first, you could put the OR in the 
signed branch instead of relying on folding |0, no?

> +  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW);
> +  tt = build_call_expr_loc (loc, tt, 0);
> +  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
> +
> +  return t;
> +}
Marek Polacek - June 11, 2013, 7:44 p.m.
Hi!

On Tue, Jun 11, 2013 at 09:14:36PM +0200, Marc Glisse wrote:
> Hello,
> 
> couple comments (not a true review)

Thanks anyway ;).

> On Tue, 11 Jun 2013, Marek Polacek wrote:
> 
> >+tree
> >+ubsan_instrument_division (location_t loc, tree op0, tree op1)
> >+{
> >+  tree t, tt;
> >+  tree type0 = TREE_TYPE (op0);
> >+  tree type1 = TREE_TYPE (op1);
> 
> Can the 2 types be different? I thought divisions had homogeneous
> arguments, and the instrumentation was done late enough to avoid any
> potential issue, but maybe not...

Yeah, they can; they only have to be of arithmetic type.

> >+  tree type1_zero_cst = build_int_cst (type1, 0);
> 
> It is a bit funny to do that before the following test ;-)

Perhaps, yes.  Moved it...

> >+  if (TREE_CODE (type0) != INTEGER_TYPE
> >+      || TREE_CODE (type1) != INTEGER_TYPE)
> >+    return NULL_TREE;

..here.

> >+  /* We check INT_MIN / -1 only for signed types.  */
> >+  if (!TYPE_UNSIGNED (type0) && !TYPE_UNSIGNED (type1))
> >+    {
> >+      tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
> >+			build_int_cst (type1, -1));
> >+      t = fold_build2 (EQ_EXPR, boolean_type_node, op0,
> >+		       TYPE_MIN_VALUE (type0));
> >+      t = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, t, tt);
> >+    }
> >+  else
> >+    t = type1_zero_cst;
> >+  tt = fold_build2 (EQ_EXPR, boolean_type_node,
> >+		    op1, type1_zero_cst);
> >+  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, t);
> 
> If you wrote the comparison with 0 first, you could put the OR in
> the signed branch instead of relying on folding |0, no?

Duh, indeed.  Will adjust.  Thanks!

	Marek
Jakub Jelinek - June 11, 2013, 8:09 p.m.
On Tue, Jun 11, 2013 at 09:44:40PM +0200, Marek Polacek wrote:
> > >+  tree type0 = TREE_TYPE (op0);
> > >+  tree type1 = TREE_TYPE (op1);
> > 
> > Can the 2 types be different? I thought divisions had homogeneous
> > arguments, and the instrumentation was done late enough to avoid any
> > potential issue, but maybe not...
> 
> Yeah, they can; they only have to be of arithmetic type.

Nope, if this is after conversion to result_type (resp. orig_type),
then they both have result_type resp. orig_type type.

Shift is different, there the two arguments can have different type.

	Jakub
Marek Polacek - June 11, 2013, 8:20 p.m.
On Tue, Jun 11, 2013 at 10:09:00PM +0200, Jakub Jelinek wrote:
> On Tue, Jun 11, 2013 at 09:44:40PM +0200, Marek Polacek wrote:
> > > >+  tree type0 = TREE_TYPE (op0);
> > > >+  tree type1 = TREE_TYPE (op1);
> > > 
> > > Can the 2 types be different? I thought divisions had homogeneous
> > > arguments, and the instrumentation was done late enough to avoid any
> > > potential issue, but maybe not...
> > 
> > Yeah, they can; they only have to be of arithmetic type.
> 
> Nope, if this is after conversion to result_type (resp. orig_type),
> then they both have result_type resp. orig_type type.
> 
> Shift is different, there the two arguments can have different type.

But currently I'm cp_convert-ing the arguments to orig_type only if
we were performing the shortening which changed the result_type.
If, with current patch, I put debug_tree (type0); debug_tree (type1);
into ubsan_instrument_division, I see different types (int vs.
unsigned int etc.).

	Marek
Jakub Jelinek - June 11, 2013, 8:33 p.m.
On Tue, Jun 11, 2013 at 10:20:24PM +0200, Marek Polacek wrote:
> On Tue, Jun 11, 2013 at 10:09:00PM +0200, Jakub Jelinek wrote:
> > On Tue, Jun 11, 2013 at 09:44:40PM +0200, Marek Polacek wrote:
> > > > >+  tree type0 = TREE_TYPE (op0);
> > > > >+  tree type1 = TREE_TYPE (op1);
> > > > 
> > > > Can the 2 types be different? I thought divisions had homogeneous
> > > > arguments, and the instrumentation was done late enough to avoid any
> > > > potential issue, but maybe not...
> > > 
> > > Yeah, they can; they only have to be of arithmetic type.
> > 
> > Nope, if this is after conversion to result_type (resp. orig_type),
> > then they both have result_type resp. orig_type type.
> > 
> > Shift is different, there the two arguments can have different type.
> 
> But currently I'm cp_convert-ing the arguments to orig_type only if
> we were performing the shortening which changed the result_type.
> If, with current patch, I put debug_tree (type0); debug_tree (type1);
> into ubsan_instrument_division, I see different types (int vs.
> unsigned int etc.).

That means you probably should move the function call down in
cp_build_binary_op (resp. C counterpart), after the arguments are converted
to result_type?

	Jakub
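
A small C illustration of why the operand types agree once they have
been converted to the result type: for int / unsigned, the usual
arithmetic conversions turn the int operand into unsigned before the
division happens.  The function name is hypothetical and only serves
this note.

```c
#include <limits.h>

/* The int operand is converted to unsigned before dividing, so the
   division is performed entirely in unsigned arithmetic and the
   INT_MIN / -1 case cannot arise here.  */
static unsigned
mixed_div (int a, unsigned b)
{
  return a / b;
}
```

At the source level the operands look like int and unsigned int, which
matches the debug_tree output mentioned above when the instrumentation
runs before the conversion.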
Marek Polacek - June 11, 2013, 8:40 p.m.
On Tue, Jun 11, 2013 at 10:33:25PM +0200, Jakub Jelinek wrote:
> That means you probably should move the function call down in
> cp_build_binary_op (resp. C counterpart), after the arguments are converted
> to result_type?

Ok, certainly.  Seems the arguments are converted here:

  if (! converted)
    {
      if (TREE_TYPE (op0) != result_type)
	op0 = cp_convert_and_check (result_type, op0, complain);
      if (TREE_TYPE (op1) != result_type)
	op1 = cp_convert_and_check (result_type, op1, complain);

      if (op0 == error_mark_node || op1 == error_mark_node)
	return error_mark_node;
    }

I'll move the instrumentation after the hunk above.  And then 
in ubsan_instrument_division I might want to have just 
tree type = TREE_TYPE (op0);, maybe together with an assert like
gcc_assert (TREE_TYPE (op0) == TREE_TYPE (op1)).

	Marek
Jakub Jelinek - June 11, 2013, 8:44 p.m.
On Tue, Jun 11, 2013 at 10:40:12PM +0200, Marek Polacek wrote:
> On Tue, Jun 11, 2013 at 10:33:25PM +0200, Jakub Jelinek wrote:
> > That means you probably should move the function call down in
> > cp_build_binary_op (resp. C counterpart), after the arguments are converted
> > to result_type?
> 
> Ok, certainly.  Seems the arguments are converted here:
> 
>   if (! converted)
>     {
>       if (TREE_TYPE (op0) != result_type)
> 	op0 = cp_convert_and_check (result_type, op0, complain);
>       if (TREE_TYPE (op1) != result_type)
> 	op1 = cp_convert_and_check (result_type, op1, complain);
> 
>       if (op0 == error_mark_node || op1 == error_mark_node)
> 	return error_mark_node;
>     }
> 
> I'll move the instrumentation after the hunk above.  And then 
> in ubsan_instrument_division I might want to have just 
> tree type = TREE_TYPE (op0);, maybe together with an assert like
> gcc_assert (TREE_TYPE (op0) == TREE_TYPE (op1)).

There is another thing to solve BTW, op0 and/or op1 might have side-effects,
if you are going to evaluate them more than once, they need to be surrounded
into cp_save_expr resp. c_save_expr.

	Jakub
Marek Polacek - June 11, 2013, 8:52 p.m.
On Tue, Jun 11, 2013 at 10:44:12PM +0200, Jakub Jelinek wrote:
> There is another thing to solve BTW, op0 and/or op1 might have side-effects,
> if you are going to evaluate them more than once, they need to be surrounded
> into cp_save_expr resp. c_save_expr.

I see.  Thanks for the notice.

	Marek
Marek Polacek - June 12, 2013, 1:48 p.m.
On Tue, Jun 11, 2013 at 10:44:12PM +0200, Jakub Jelinek wrote:
> There is another thing to solve BTW, op0 and/or op1 might have side-effects,
> if you are going to evaluate them more than once, they need to be surrounded
> into cp_save_expr resp. c_save_expr.

There's that unpleasant thing that cp_save_expr is declared in
cp/cp-tree.h, but we don't want to include cp/*.h or c/*.h files
in c-family/c-ubsan.c.  Should I use save_expr from tree.c instead?
I seem to recall that that isn't the best thing to do...

	Marek
Jakub Jelinek - June 12, 2013, 1:52 p.m.
On Wed, Jun 12, 2013 at 03:48:24PM +0200, Marek Polacek wrote:
> On Tue, Jun 11, 2013 at 10:44:12PM +0200, Jakub Jelinek wrote:
> > There is another thing to solve BTW, op0 and/or op1 might have side-effects,
> > if you are going to evaluate them more than once, they need to be surrounded
> > into cp_save_expr resp. c_save_expr.
> 
> There's that unpleasant thing that cp_save_expr is declared in
> cp/cp-tree.h, but we don't want to include cp/*.h or c/*.h files
> in c-family/c-ubsan.c.  Should I use save_expr from tree.c instead?
> I seem to recall that that isn't the best thing to do...

No, you really need to use the cp_save_expr/c_save_expr, especially for
C it e.g. fully folds etc.  You want to call that in
cp_build_binary_op etc., also because you want both the instrument_expr
itself, but also the original binary expression to use the SAVE_EXPRs if
they are created.

	Jakub
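
The double-evaluation problem can be shown in plain C.  This is only an
analogy for what SAVE_EXPR achieves at the tree level; next_divisor,
naive and saved are hypothetical names, not GCC functions.

```c
/* Counts how often the operand expression is evaluated.  */
static int calls;

/* An operand with a side effect.  */
static int
next_divisor (void)
{
  calls++;
  return 2;
}

/* The operand expression is duplicated into the check and into the
   operation, so its side effect happens twice.  */
static int
naive (int a)
{
  return next_divisor () == 0 ? 0 : a / next_divisor ();
}

/* The operand is evaluated once into a temporary, and both the check
   and the operation use the saved value.  */
static int
saved (int a)
{
  int b = next_divisor ();
  return b == 0 ? 0 : a / b;
}
```

This is why the saving has to happen in the caller: both the
instrumentation and the original binary expression must refer to the
same saved operand.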

Patch

--- gcc/c-family/c-ubsan.c.mp	2013-06-11 19:51:55.555492466 +0200
+++ gcc/c-family/c-ubsan.c	2013-06-11 19:29:16.925551907 +0200
@@ -0,0 +1,126 @@ 
+/* UndefinedBehaviorSanitizer, undefined behavior detector.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Marek Polacek <polacek@redhat.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-family/c-common.h"
+#include "c-family/c-ubsan.h"
+
+/* Instrument division by zero and INT_MIN / -1.  If not instrumenting,
+   return NULL_TREE.  */
+
+tree
+ubsan_instrument_division (location_t loc, tree op0, tree op1)
+{
+  tree t, tt;
+  tree type0 = TREE_TYPE (op0);
+  tree type1 = TREE_TYPE (op1);
+  tree type1_zero_cst = build_int_cst (type1, 0);
+
+  if (TREE_CODE (type0) != INTEGER_TYPE
+      || TREE_CODE (type1) != INTEGER_TYPE)
+    return NULL_TREE;
+
+  /* If we *know* that the divisor is not -1 or 0, we don't have to
+     instrument this expression.
+     ??? We could use decl_constant_value to cover up more cases.  */
+  if (TREE_CODE (op1) == INTEGER_CST
+      && integer_nonzerop (op1)
+      && !integer_minus_onep (op1))
+    return NULL_TREE;
+
+  /* We check INT_MIN / -1 only for signed types.  */
+  if (!TYPE_UNSIGNED (type0) && !TYPE_UNSIGNED (type1))
+    {
+      tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
+			build_int_cst (type1, -1));
+      t = fold_build2 (EQ_EXPR, boolean_type_node, op0,
+		       TYPE_MIN_VALUE (type0));
+      t = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, t, tt);
+    }
+  else
+    t = type1_zero_cst;
+  tt = fold_build2 (EQ_EXPR, boolean_type_node,
+		    op1, type1_zero_cst);
+  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, tt, t);
+  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW);
+  tt = build_call_expr_loc (loc, tt, 0);
+  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
+
+  return t;
+}
+
+/* Instrument left and right shifts.  If not instrumenting, return
+   NULL_TREE.  */
+
+tree
+ubsan_instrument_shift (location_t loc, enum tree_code code,
+			tree op0, tree op1)
+{
+  tree t, tt = NULL_TREE;
+  tree op1_utype = unsigned_type_for (TREE_TYPE (op1));
+  HOST_WIDE_INT op0_prec = TYPE_PRECISION (TREE_TYPE (op0));
+  tree uprecm1 = build_int_cst (op1_utype, op0_prec - 1);
+  tree precm1 = build_int_cst (TREE_TYPE (op1), op0_prec - 1);
+
+  t = fold_convert_loc (loc, op1_utype, op1);
+  t = fold_build2 (GT_EXPR, boolean_type_node, t, uprecm1);
+
+  /* For signed x << y, in C99/C11, the following:
+     (unsigned) x >> (precm1 - y)
+     if non-zero, is undefined.  */
+  if (code == LSHIFT_EXPR
+      && !TYPE_UNSIGNED (TREE_TYPE (op0))
+      && flag_isoc99)
+    {
+      tree x = fold_build2 (MINUS_EXPR, integer_type_node, precm1, op1);
+      tt = fold_convert_loc (loc, unsigned_type_for (TREE_TYPE (op0)), op0);
+      tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
+      tt = fold_build2 (NE_EXPR, boolean_type_node, tt,
+			build_int_cst (TREE_TYPE (tt), 0));
+    }
+
+  /* For signed x << y, in C++11/C++14, the following:
+     x < 0 || ((unsigned) x >> (precm1 - y))
+     if > 1, is undefined.  */
+  if (code == LSHIFT_EXPR
+      && !TYPE_UNSIGNED (TREE_TYPE (op0))
+      && (cxx_dialect == cxx11 || cxx_dialect == cxx1y))
+    {
+      tree x = fold_build2 (MINUS_EXPR, integer_type_node, precm1, op1);
+      tt = fold_convert_loc (loc, unsigned_type_for (TREE_TYPE (op0)), op0);
+      tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
+      tt = fold_build2 (GT_EXPR, boolean_type_node, tt,
+			build_int_cst (TREE_TYPE (tt), 1));
+      x = fold_build2 (LT_EXPR, boolean_type_node, op0,
+		       build_int_cst (TREE_TYPE (op0), 0));
+      tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, x, tt);
+    }
+
+  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t,
+		   tt ? tt : integer_zero_node);
+  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS);
+  tt = build_call_expr_loc (loc, tt, 0);
+  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
+
+  return t;
+}
--- gcc/c-family/c-ubsan.h.mp	2013-06-11 19:51:50.616457500 +0200
+++ gcc/c-family/c-ubsan.h	2013-06-11 16:51:38.297942275 +0200
@@ -0,0 +1,27 @@ 
+/* UndefinedBehaviorSanitizer, undefined behavior detector.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Marek Polacek <polacek@redhat.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_UBSAN_H
+#define GCC_UBSAN_H
+
+extern tree ubsan_instrument_division (location_t, tree, tree);
+extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
+
+#endif  /* GCC_UBSAN_H  */
--- gcc/sanitizer.def.mp	2013-06-11 19:51:43.781408808 +0200
+++ gcc/sanitizer.def	2013-06-11 19:53:37.768224970 +0200
@@ -283,3 +283,13 @@  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_ATOM
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_ATOMIC_SIGNAL_FENCE,
 		      "__tsan_atomic_signal_fence",
 		      BT_FN_VOID_INT, ATTR_NOTHROW_LEAF_LIST)
+
+/* Undefined Behavior Sanitizer */
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
+		      "__ubsan_handle_divrem_overflow",
+		      BT_FN_VOID_PTR_PTR_PTR,
+		      ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS,
+		      "__ubsan_handle_shift_out_of_bounds",
+		      BT_FN_VOID_PTR_PTR_PTR,
+		      ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
--- gcc/builtins.def.mp	2013-06-11 19:51:43.790408877 +0200
+++ gcc/builtins.def	2013-06-11 19:53:37.721224606 +0200
@@ -155,7 +155,7 @@  along with GCC; see the file COPYING3.
 #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
 	       true, true, true, ATTRS, true, \
-	       (flag_asan || flag_tsan))
+	       (flag_asan || flag_tsan || flag_ubsan))
 
 #undef DEF_CILKPLUS_BUILTIN
 #define DEF_CILKPLUS_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
--- gcc/Makefile.in.mp	2013-06-11 19:51:43.780408801 +0200
+++ gcc/Makefile.in	2013-06-11 19:53:37.710224521 +0200
@@ -1150,7 +1150,7 @@  C_COMMON_OBJS = c-family/c-common.o c-fa
   c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o \
   c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \
   c-family/c-semantics.o c-family/c-ada-spec.o tree-mudflap.o \
-  c-family/array-notation-common.o
+  c-family/array-notation-common.o c-family/c-ubsan.o
 
 # Language-independent object files.
 # We put the insn-*.o files first so that a parallel make will build
@@ -2021,6 +2021,9 @@  c-family/array-notation-common.o : c-fam
 c-family/stub-objc.o : c-family/stub-objc.c $(CONFIG_H) $(SYSTEM_H) \
 	coretypes.h $(TREE_H) $(C_COMMON_H) c-family/c-objc.h
 
+c-family/c-ubsan.o : c-family/c-ubsan.c $(CONFIG_H) $(SYSTEM_H) \
+	coretypes.h $(TREE_H) $(C_COMMON_H) c-family/c-ubsan.h
+
 default-c.o: config/default-c.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
   $(C_TARGET_H) $(C_TARGET_DEF_H)
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
--- gcc/doc/invoke.texi.mp	2013-06-11 19:51:43.784408831 +0200
+++ gcc/doc/invoke.texi	2013-06-11 19:53:37.761224914 +0200
@@ -5143,6 +5143,11 @@  Memory access instructions will be instr
 data race bugs.
 See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for more details.
 
+@item -fsanitize=undefined
+Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector.
+Various computations will be instrumented to detect
+undefined behavior, e.g.@: division by zero or various overflows.
+
 @item -fdump-final-insns@r{[}=@var{file}@r{]}
 @opindex fdump-final-insns
 Dump the final internal representation (RTL) to @var{file}.  If the
--- gcc/cp/typeck.c.mp	2013-06-11 19:51:43.785408839 +0200
+++ gcc/cp/typeck.c	2013-06-11 19:53:37.747224808 +0200
@@ -37,6 +37,7 @@  along with GCC; see the file COPYING3.
 #include "convert.h"
 #include "c-family/c-common.h"
 #include "c-family/c-objc.h"
+#include "c-family/c-ubsan.h"
 #include "params.h"
 
 static tree pfn_from_ptrmemfunc (tree);
@@ -3867,6 +3868,7 @@  cp_build_binary_op (location_t location,
   tree final_type = 0;
 
   tree result;
+  tree orig_type = NULL;
 
   /* Nonzero if this is an operation like MIN or MAX which can
      safely be computed in short if both args are promoted shorts.
@@ -3891,6 +3893,15 @@  cp_build_binary_op (location_t location,
   op0 = orig_op0;
   op1 = orig_op1;
 
+  /* Remember whether we're doing / or %.  */
+  bool doing_div_or_mod = false;
+
+  /* Remember whether we're doing << or >>.  */
+  bool doing_shift = false;
+
+  /* Tree holding instrumentation expression.  */
+  tree instrument_expr = NULL;
+
   if (code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR
       || code == TRUTH_OR_EXPR || code == TRUTH_ORIF_EXPR
       || code == TRUTH_XOR_EXPR)
@@ -4070,8 +4081,12 @@  cp_build_binary_op (location_t location,
 	{
 	  enum tree_code tcode0 = code0, tcode1 = code1;
 	  tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
+	  cop1 = maybe_constant_value (cop1);
 
-	  warn_for_div_by_zero (location, maybe_constant_value (cop1));
+	  if (!processing_template_decl && tcode0 == INTEGER_TYPE)
+	    doing_div_or_mod = true;
+
+	  warn_for_div_by_zero (location, cop1);
 
 	  if (tcode0 == COMPLEX_TYPE || tcode0 == VECTOR_TYPE)
 	    tcode0 = TREE_CODE (TREE_TYPE (TREE_TYPE (op0)));
@@ -4109,8 +4124,11 @@  cp_build_binary_op (location_t location,
     case FLOOR_MOD_EXPR:
       {
 	tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
+	cop1 = maybe_constant_value (cop1);
 
-	warn_for_div_by_zero (location, maybe_constant_value (cop1));
+	if (!processing_template_decl && code0 == INTEGER_TYPE)
+	  doing_div_or_mod = true;
+	warn_for_div_by_zero (location, cop1);
       }
 
       if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
@@ -4164,6 +4182,7 @@  cp_build_binary_op (location_t location,
 	  if (TREE_CODE (const_op1) != INTEGER_CST)
 	    const_op1 = op1;
 	  result_type = type0;
+	  doing_shift = true;
 	  if (TREE_CODE (const_op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_lt (const_op1, integer_zero_node))
@@ -4211,6 +4230,7 @@  cp_build_binary_op (location_t location,
 	  if (TREE_CODE (const_op1) != INTEGER_CST)
 	    const_op1 = op1;
 	  result_type = type0;
+	  doing_shift = true;
 	  if (TREE_CODE (const_op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_lt (const_op1, integer_zero_node))
@@ -4765,8 +4785,9 @@  cp_build_binary_op (location_t location,
 
       if (shorten && none_complex)
 	{
+	  orig_type = result_type;
 	  final_type = result_type;
-	  result_type = shorten_binary_op (result_type, op0, op1, 
+	  result_type = shorten_binary_op (result_type, op0, op1,
 					   shorten == -1);
 	}
 
@@ -4814,6 +4835,31 @@  cp_build_binary_op (location_t location,
 	}
     }
 
+  if (flag_ubsan && doing_div_or_mod && !processing_template_decl)
+    {
+      op0 = maybe_constant_value (fold_non_dependent_expr_sfinae (op0,
+								  tf_none));
+      op1 = maybe_constant_value (fold_non_dependent_expr_sfinae (op1,
+								  tf_none));
+      /* For diagnostics we want to use the promoted types without
+	 shorten_binary_op.  So convert the arguments to the
+	 original result_type.  */
+      if (orig_type != NULL && result_type != orig_type)
+        {
+	  op0 = cp_convert (orig_type, op0, complain);
+	  op1 = cp_convert (orig_type, op1, complain);
+	}
+      instrument_expr = ubsan_instrument_division (location, op0, op1);
+    }
+  else if (flag_ubsan && doing_shift && !processing_template_decl)
+    {
+      op0 = maybe_constant_value (fold_non_dependent_expr_sfinae (op0,
+								  tf_none));
+      op1 = maybe_constant_value (fold_non_dependent_expr_sfinae (op1,
+								  tf_none));
+      instrument_expr = ubsan_instrument_shift (location, code, op0, op1);
+    }
+
   /* If CONVERTED is zero, both args will be converted to type RESULT_TYPE.
      Then the expression will be built.
      It will be given type FINAL_TYPE if that is nonzero;
@@ -4842,6 +4888,10 @@  cp_build_binary_op (location_t location,
       && !TREE_OVERFLOW_P (op1))
     overflow_warning (location, result);
 
+  if (flag_ubsan && instrument_expr != NULL)
+    result = fold_build2 (COMPOUND_EXPR, TREE_TYPE (result),
+			  instrument_expr, result);
+
   return result;
 }
 
--- gcc/common.opt.mp	2013-06-11 19:51:43.787408855 +0200
+++ gcc/common.opt	2013-06-11 19:53:37.742224768 +0200
@@ -858,6 +858,10 @@  fsanitize=thread
 Common Report Var(flag_tsan)
 Enable ThreadSanitizer, a data race detector
 
+fsanitize=undefined
+Common Report Var(flag_ubsan)
+Enable UndefinedBehaviorSanitizer, an undefined behavior detector
+
 fasynchronous-unwind-tables
 Common Report Var(flag_asynchronous_unwind_tables) Optimization
 Generate unwind tables that are exact at each instruction boundary
--- gcc/builtin-attrs.def.mp	2013-06-11 19:51:43.791408885 +0200
+++ gcc/builtin-attrs.def	2013-06-11 19:53:37.717224576 +0200
@@ -83,6 +83,7 @@  DEF_LIST_INT_INT (5,6)
 #undef DEF_LIST_INT_INT
 
 /* Construct trees for identifiers.  */
+DEF_ATTR_IDENT (ATTR_COLD, "cold")
 DEF_ATTR_IDENT (ATTR_CONST, "const")
 DEF_ATTR_IDENT (ATTR_FORMAT, "format")
 DEF_ATTR_IDENT (ATTR_FORMAT_ARG, "format_arg")
@@ -130,6 +131,8 @@  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHRO
 			ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
 			ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
+DEF_ATTR_TREE_LIST (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST, ATTR_COLD,\
+			ATTR_NULL, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_CONST_NORETURN_NOTHROW_LEAF_LIST, ATTR_CONST,\
 			ATTR_NULL, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_MALLOC_NOTHROW_LIST, ATTR_MALLOC,	\
--- gcc/c/c-typeck.c.mp	2013-06-11 19:51:43.789408869 +0200
+++ gcc/c/c-typeck.c	2013-06-11 19:53:37.737224731 +0200
@@ -39,6 +39,7 @@  along with GCC; see the file COPYING3.
 #include "gimple.h"
 #include "c-family/c-objc.h"
 #include "c-family/c-common.h"
+#include "c-family/c-ubsan.h"
 
 /* Possible cases of implicit bad conversions.  Used to select
    diagnostic messages in convert_for_assignment.  */
@@ -9527,6 +9528,15 @@  build_binary_op (location_t location, en
      operands to truth-values.  */
   bool boolean_op = false;
 
+  /* Remember whether we're doing / or %.  */
+  bool doing_div_or_mod = false;
+
+  /* Remember whether we're doing << or >>.  */
+  bool doing_shift = false;
+
+  /* Tree holding instrumentation expression.  */
+  tree instrument_expr = NULL;
+
   if (location == UNKNOWN_LOCATION)
     location = input_location;
 
@@ -9728,6 +9738,7 @@  build_binary_op (location_t location, en
     case FLOOR_DIV_EXPR:
     case ROUND_DIV_EXPR:
     case EXACT_DIV_EXPR:
+      doing_div_or_mod = true;
       warn_for_div_by_zero (location, op1);
 
       if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
@@ -9775,6 +9786,7 @@  build_binary_op (location_t location, en
 
     case TRUNC_MOD_EXPR:
     case FLOOR_MOD_EXPR:
+      doing_div_or_mod = true;
       warn_for_div_by_zero (location, op1);
 
       if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
@@ -9873,6 +9885,7 @@  build_binary_op (location_t location, en
       else if ((code0 == INTEGER_TYPE || code0 == FIXED_POINT_TYPE)
 	  && code1 == INTEGER_TYPE)
 	{
+	  doing_shift = true;
 	  if (TREE_CODE (op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_sgn (op1) < 0)
@@ -9925,6 +9938,7 @@  build_binary_op (location_t location, en
       else if ((code0 == INTEGER_TYPE || code0 == FIXED_POINT_TYPE)
 	  && code1 == INTEGER_TYPE)
 	{
+	  doing_shift = true;
 	  if (TREE_CODE (op1) == INTEGER_CST)
 	    {
 	      if (tree_int_cst_sgn (op1) < 0)
@@ -10209,6 +10223,11 @@  build_binary_op (location_t location, en
       return error_mark_node;
     }
 
+  if (flag_ubsan && doing_div_or_mod)
+    instrument_expr = ubsan_instrument_division (location, op0, op1);
+  else if (flag_ubsan && doing_shift)
+    instrument_expr = ubsan_instrument_shift (location, code, op0, op1);
+
   if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE || code0 == COMPLEX_TYPE
        || code0 == FIXED_POINT_TYPE || code0 == VECTOR_TYPE)
       &&
@@ -10492,6 +10511,11 @@  build_binary_op (location_t location, en
   if (semantic_result_type)
     ret = build1 (EXCESS_PRECISION_EXPR, semantic_result_type, ret);
   protected_set_expr_location (ret, location);
+
+  if (flag_ubsan && instrument_expr != NULL)
+    ret = fold_build2 (COMPOUND_EXPR, TREE_TYPE (ret),
+		       instrument_expr, ret);
+
   return ret;
 }
 
--- gcc/asan.c.mp	2013-06-11 19:51:43.793408901 +0200
+++ gcc/asan.c	2013-06-11 19:53:37.713224545 +0200
@@ -2034,6 +2034,9 @@  initialize_sanitizer_builtins (void)
   tree BT_FN_VOID = build_function_type_list (void_type_node, NULL_TREE);
   tree BT_FN_VOID_PTR
     = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  tree BT_FN_VOID_PTR_PTR_PTR
+    = build_function_type_list (void_type_node, ptr_type_node,
+				ptr_type_node, ptr_type_node, NULL_TREE);
   tree BT_FN_VOID_PTR_PTRMODE
     = build_function_type_list (void_type_node, ptr_type_node,
 				build_nonstandard_integer_type (POINTER_SIZE,
@@ -2099,6 +2102,9 @@  initialize_sanitizer_builtins (void)
 #undef ATTR_TMPURE_NORETURN_NOTHROW_LEAF_LIST
 #define ATTR_TMPURE_NORETURN_NOTHROW_LEAF_LIST \
   ECF_TM_PURE | ATTR_NORETURN_NOTHROW_LEAF_LIST
+#undef ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST
+#define ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST \
+  /* ECF_COLD missing */ ATTR_NORETURN_NOTHROW_LEAF_LIST
 #undef DEF_SANITIZER_BUILTIN
 #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
   decl = add_builtin_function ("__builtin_" NAME, TYPE, ENUM,		\