Message ID | ZfAbWpHgGI/Bs6hp@tucnak |
---|---|
State | New |
Series | gimple-iterator, ubsan: Fix ICE during instrumentation of returns_twice calls [PR112709] |
On Tue, 12 Mar 2024, Jakub Jelinek wrote: > Hi! > > ubsan, asan (both PR112709) and _BitInt lowering (PR113466) want to > insert some instrumentation or adjustment statements before some statement. > This unfortunately creates invalid IL if inserting before a returns_twice > call, because we require that such calls are the first statement in a basic > block and that we have an edge from the .ABNORMAL_DISPATCHER block to > the block containing the returns_twice call (in addition to other edge(s)). > > The following patch adds helper functions for such insertions and uses it > for now in ubsan (I'll post a follow up which uses it in asan and will > work later on the _BitInt lowering PR). > > In particular, if the bb with returns_twice call at the start has just > 2 edges, one EDGE_ABNORMAL from .ABNORMAL_DISPATCHER and another > (non-EDGE_ABNORMAL/EDGE_EH) from some other bb, it just inserts the > statement or sequence on that other edge. > If the bb has more predecessor edges or the one not from > .ABNORMAL_DISPATCHER is e.g. an EH edge (this latter case likely shouldn't > happen, one would need labels or something like that), the patch splits the > block with returns_twice call such that there is just one edge next to > .ABNORMAL_DISPATCHER edge and adjusts PHIs as needed to make it happen. > The functions also replace uses of PHIs from the returns_twice bb with > the corresponding PHI arguments, because otherwise it would be invalid IL. > > E.g. 
in ubsan/pr112709-2.c (qux) we have before the ubsan pass > <bb 10> : > # .MEM_5(ab) = PHI <.MEM_4(9), .MEM_25(ab)(11)> > # _7(ab) = PHI <_20(9), _8(ab)(11)> > # .MEM_21(ab) = VDEF <.MEM_5(ab)> > _22 = bar (*_7(ab)); > where bar is returns_twice call and bb 11 has .ABNORMAL_DISPATCHER call, > this patch instruments it like: > <bb 9> : > # .MEM_4 = PHI <.MEM_17(ab)(4), .MEM_10(D)(5), .MEM_14(ab)(8)> > # DEBUG BEGIN_STMT > # VUSE <.MEM_4> > _20 = p; > # .MEM_27 = VDEF <.MEM_4> > .UBSAN_NULL (_20, 0B, 0); > # VUSE <.MEM_27> > _2 = __builtin_dynamic_object_size (_20, 0); > # .MEM_28 = VDEF <.MEM_27> > .UBSAN_OBJECT_SIZE (_20, 1024, _2, 0); > > <bb 10> : > # .MEM_5(ab) = PHI <.MEM_28(9), .MEM_25(ab)(11)> > # _7(ab) = PHI <_20(9), _8(ab)(11)> > # .MEM_21(ab) = VDEF <.MEM_5(ab)> > _22 = bar (*_7(ab)); > The edge from .ABNORMAL_DISPATCHER is there just to represent the > returning for 2nd and later times, the instrumentation can't be > done at that point as there is no code executed during that point. > The ubsan/pr112709-1.c testcase includes non-virtual PHIs to cover > the handling of those as well. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2024-03-12 Jakub Jelinek <jakub@redhat.com> > > PR sanitizer/112709 > * gimple-iterator.h (gsi_insert_before_returns_twice_call, > gsi_insert_seq_before_returns_twice_call): Declare. > * gimple-iterator.cc: Include gimplify.h. > (edge_before_returns_twice_call, gsi_insert_before_returns_twice_call, > gsi_insert_seq_before_returns_twice_call): New functions. > * ubsan.cc (ubsan_insert_before, ubsan_insert_seq_before): New > functions. > (instrument_mem_ref, instrument_pointer_overflow, > instrument_nonnull_arg, instrument_nonnull_return): Use > ubsan_insert_before instead of gsi_insert_before. > (maybe_instrument_pointer_overflow): Use force_gimple_operand, > gimple_seq_add_seq_without_update and ubsan_insert_seq_before > instead of force_gimple_operand_gsi. > (instrument_object_size): Likewise. 
Use ubsan_insert_before instead > of gsi_insert_before. > > * gcc.dg/ubsan/pr112709-1.c: New test. > * gcc.dg/ubsan/pr112709-2.c: New test. > > --- gcc/gimple-iterator.h.jj 2024-02-08 11:17:37.144661383 +0100 > +++ gcc/gimple-iterator.h 2024-03-11 16:16:24.670464737 +0100 > @@ -93,6 +93,10 @@ extern void gsi_insert_on_edge (edge, gi > extern void gsi_insert_seq_on_edge (edge, gimple_seq); > extern basic_block gsi_insert_on_edge_immediate (edge, gimple *); > extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq); > +extern basic_block gsi_insert_before_returns_twice_call (basic_block, > + gimple *); > +extern basic_block gsi_insert_seq_before_returns_twice_call (basic_block, > + gimple_seq); > extern void gsi_commit_edge_inserts (void); > extern void gsi_commit_one_edge_insert (edge, basic_block *); > extern gphi_iterator gsi_start_phis (basic_block); > --- gcc/gimple-iterator.cc.jj 2024-02-08 11:17:37.144661383 +0100 > +++ gcc/gimple-iterator.cc 2024-03-11 18:17:08.994804808 +0100 > @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. > #include "tree-cfg.h" > #include "tree-ssa.h" > #include "value-prof.h" > +#include "gimplify.h" > > > /* Mark the statement STMT as modified, and update it. */ > @@ -944,3 +945,135 @@ gsi_start_phis (basic_block bb) > > return i; > } > + > +/* Helper function for gsi_insert_before_returns_twice_call and > + gsi_insert_seq_before_returns_twice_call. Find edge to > + insert statements before returns_twice call at the start of BB, > + if there isn't just one, split the bb and adjust PHIs to ensure > + that. 
*/ > + > +static edge > +edge_before_returns_twice_call (basic_block bb) > +{ > + gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb); > + gcc_checking_assert (is_gimple_call (gsi_stmt (gsi)) > + && (gimple_call_flags (gsi_stmt (gsi)) > + & ECF_RETURNS_TWICE) != 0); > + edge_iterator ei; > + edge e, ad_edge = NULL, other_edge = NULL; > + bool split = false; > + /* Look for a fallthru and possibly a single backedge. */ I can't see the backedge handling (I'm not sure it's actually required?) > + FOR_EACH_EDGE (e, ei, bb->preds) > + { > + if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) EDGE_EH is special because of the landing pad? > + { > + gimple_stmt_iterator gsi > + = gsi_start_nondebug_after_labels_bb (e->src); > + gimple *ad = gsi_stmt (gsi); > + if (ad && gimple_call_internal_p (ad, IFN_ABNORMAL_DISPATCHER)) > + { > + gcc_checking_assert (ad_edge == NULL); > + ad_edge = e; > + continue; > + } > + } > + if (other_edge || e->flags & (EDGE_ABNORMAL | EDGE_EH)) > + split = true; > + other_edge = e; > + } > + gcc_checking_assert (ad_edge); > + if (other_edge == NULL) > + split = true; > + if (split) > + { > + other_edge = split_block_after_labels (bb); > + e = make_edge (ad_edge->src, other_edge->dest, EDGE_ABNORMAL); In theory there could be multiple abnormal edges. Did you try to use make_forwarder_block instead of manually doing it? Or does that fail because we end up redirecting the abnormal edges which you avoid by re-making it new and scrapping the old one? I fail to remember why we can't redirect abnormal edges - yes, there's no flow to be redirected but the process of "redirection" still should work. 
> + for (gphi_iterator gsi = gsi_start_phis (other_edge->src); > + !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gphi *phi = gsi.phi (); > + tree lhs = gimple_phi_result (phi); > + tree new_lhs; > + if (virtual_operand_p (lhs)) > + new_lhs = make_ssa_name (SSA_NAME_VAR (lhs)); > + else > + new_lhs = make_ssa_name (TREE_TYPE (lhs)); > + gimple_phi_set_result (phi, new_lhs); > + gphi *new_phi = create_phi_node (lhs, other_edge->dest); > + add_phi_arg (new_phi, new_lhs, other_edge, UNKNOWN_LOCATION); > + add_phi_arg (new_phi, gimple_phi_arg_def_from_edge (phi, ad_edge), > + e, gimple_phi_arg_location_from_edge (phi, ad_edge)); > + } > + remove_edge (ad_edge); > + } > + return other_edge; > +} > + > +/* Insert G before BB which starts with an ECF_RETURNS_TWICE call. > + If BB has a single predecessor edge except for edge from > + .ABNORMAL_DISPATCHER, insert on that edge, otherwise split > + BB before the call, adjust PHIs and insert into the new BB. > + If G refers to any results of BB's PHIs, replace them afterwards > + with corresponding PHI arg. */ > + > +basic_block > +gsi_insert_before_returns_twice_call (basic_block bb, gimple *g) > +{ > + edge e = edge_before_returns_twice_call (bb); > + basic_block new_bb = gsi_insert_on_edge_immediate (e, g); > + use_operand_p use_p; > + ssa_op_iter iter; > + bool m = false; > + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) > + { > + tree s = USE_FROM_PTR (use_p); > + if (SSA_NAME_DEF_STMT (s) > + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI > + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) > + { > + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); > + SET_USE (use_p, unshare_expr (r)); > + m = true; > + } > + } Ick. Doesn't the forwarder keep the PHI defs and new PHI defs are created in e->dest instead, I don't see how we need this? Btw, gsi_insert_before_returns_twice_call might be usable as gsi_insert_after_labels ()? 
Or alternatively not expose gsi_insert_before_returns_twice_call but instead make split_block () behave "correctly" for abnormals (or add split_block_before_returns_twice_call ()). But that's just bike-shedding. Richard. > + if (m) > + update_stmt (g); > + return new_bb; > +} > + > +/* Similiarly for sequence SEQ. */ > + > +basic_block > +gsi_insert_seq_before_returns_twice_call (basic_block bb, gimple_seq seq) > +{ > + if (gimple_seq_empty_p (seq)) > + return NULL; > + gimple *f = gimple_seq_first_stmt (seq); > + gimple *l = gimple_seq_last_stmt (seq); > + edge e = edge_before_returns_twice_call (bb); > + basic_block new_bb = gsi_insert_seq_on_edge_immediate (e, seq); > + use_operand_p use_p; > + ssa_op_iter iter; > + for (gimple_stmt_iterator gsi = gsi_for_stmt (f); ; gsi_next (&gsi)) > + { > + gimple *g = gsi_stmt (gsi); > + bool m = false; > + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) > + { > + tree s = USE_FROM_PTR (use_p); > + if (SSA_NAME_DEF_STMT (s) > + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI > + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) > + { > + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); > + SET_USE (use_p, unshare_expr (r)); > + m = true; > + } > + } > + if (m) > + update_stmt (g); > + if (g == l) > + break; > + } > + return new_bb; > +} > --- gcc/ubsan.cc.jj 2024-01-03 11:51:32.312720350 +0100 > +++ gcc/ubsan.cc 2024-03-11 17:07:58.104243341 +0100 > @@ -1432,6 +1432,37 @@ ubsan_expand_vptr_ifn (gimple_stmt_itera > return true; > } > > +/* Insert G stmt before ITER. If ITER is a returns_twice call, > + insert it on an appropriate edge instead. 
*/ > + > +static void > +ubsan_insert_before (gimple_stmt_iterator *iter, gimple *g) > +{ > + gimple *stmt = gsi_stmt (*iter); > + if (stmt > + && is_gimple_call (stmt) > + && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0) > + gsi_insert_before_returns_twice_call (gsi_bb (*iter), g); > + else > + gsi_insert_before (iter, g, GSI_SAME_STMT); > +} > + > +/* Similarly for sequence SEQ. */ > + > +static void > +ubsan_insert_seq_before (gimple_stmt_iterator *iter, gimple_seq seq) > +{ > + if (gimple_seq_empty_p (seq)) > + return; > + gimple *stmt = gsi_stmt (*iter); > + if (stmt > + && is_gimple_call (stmt) > + && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0) > + gsi_insert_seq_before_returns_twice_call (gsi_bb (*iter), seq); > + else > + gsi_insert_seq_before (iter, seq, GSI_SAME_STMT); > +} > + > /* Instrument a memory reference. BASE is the base of MEM, IS_LHS says > whether the pointer is on the left hand side of the assignment. */ > > @@ -1458,7 +1489,7 @@ instrument_mem_ref (tree mem, tree base, > tree alignt = build_int_cst (pointer_sized_int_node, align); > gcall *g = gimple_build_call_internal (IFN_UBSAN_NULL, 3, t, kind, alignt); > gimple_set_location (g, gimple_location (gsi_stmt (*iter))); > - gsi_insert_before (iter, g, GSI_SAME_STMT); > + ubsan_insert_before (iter, g); > } > > /* Perform the pointer instrumentation. */ > @@ -1485,7 +1516,7 @@ instrument_pointer_overflow (gimple_stmt > return; > gcall *g = gimple_build_call_internal (IFN_UBSAN_PTR, 2, ptr, off); > gimple_set_location (g, gimple_location (gsi_stmt (*gsi))); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > } > > /* Instrument pointer arithmetics if any. 
*/ > @@ -1577,10 +1608,11 @@ maybe_instrument_pointer_overflow (gimpl > else > t = fold_convert (sizetype, moff); > } > - t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, > - GSI_SAME_STMT); > - base_addr = force_gimple_operand_gsi (gsi, base_addr, true, NULL_TREE, true, > - GSI_SAME_STMT); > + gimple_seq seq, this_seq; > + t = force_gimple_operand (t, &seq, true, NULL_TREE); > + base_addr = force_gimple_operand (base_addr, &this_seq, true, NULL_TREE); > + gimple_seq_add_seq_without_update (&seq, this_seq); > + ubsan_insert_seq_before (gsi, seq); > instrument_pointer_overflow (gsi, base_addr, t); > } > > @@ -2035,7 +2067,7 @@ instrument_nonnull_arg (gimple_stmt_iter > { > g = gimple_build_assign (make_ssa_name (TREE_TYPE (arg)), arg); > gimple_set_location (g, loc[0]); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > arg = gimple_assign_lhs (g); > } > > @@ -2068,7 +2100,7 @@ instrument_nonnull_arg (gimple_stmt_iter > g = gimple_build_call (fn, 1, data); > } > gimple_set_location (g, loc[0]); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > ubsan_create_edge (g); > } > *gsi = gsi_for_stmt (stmt); > @@ -2124,7 +2156,7 @@ instrument_nonnull_return (gimple_stmt_i > g = gimple_build_call (fn, 2, data, data2); > } > gimple_set_location (g, loc[0]); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > ubsan_create_edge (g); > *gsi = gsi_for_stmt (stmt); > } > @@ -2231,6 +2263,7 @@ instrument_object_size (gimple_stmt_iter > tree sizet; > tree base_addr = base; > gimple *bos_stmt = NULL; > + gimple_seq seq = NULL; > if (decl_p) > base_addr = build1 (ADDR_EXPR, > build_pointer_type (TREE_TYPE (base)), base); > @@ -2244,19 +2277,12 @@ instrument_object_size (gimple_stmt_iter > sizet = builtin_decl_explicit (BUILT_IN_DYNAMIC_OBJECT_SIZE); > sizet = build_call_expr_loc (loc, sizet, 2, base_addr, > integer_zero_node); > - sizet = force_gimple_operand_gsi (gsi, sizet, 
false, NULL_TREE, true, > - GSI_SAME_STMT); > + sizet = force_gimple_operand (sizet, &seq, false, NULL_TREE); > /* If the call above didn't end up being an integer constant, go one > statement back and get the __builtin_object_size stmt. Save it, > we might need it later. */ > if (SSA_VAR_P (sizet)) > - { > - gsi_prev (gsi); > - bos_stmt = gsi_stmt (*gsi); > - > - /* Move on to where we were. */ > - gsi_next (gsi); > - } > + bos_stmt = gsi_stmt (gsi_last (seq)); > } > else > return; > @@ -2298,21 +2324,24 @@ instrument_object_size (gimple_stmt_iter > && !TREE_ADDRESSABLE (base)) > mark_addressable (base); > > + /* We have to emit the check. */ > + gimple_seq this_seq; > + t = force_gimple_operand (t, &this_seq, true, NULL_TREE); > + gimple_seq_add_seq_without_update (&seq, this_seq); > + ptr = force_gimple_operand (ptr, &this_seq, true, NULL_TREE); > + gimple_seq_add_seq_without_update (&seq, this_seq); > + ubsan_insert_seq_before (gsi, seq); > + > if (bos_stmt > && gimple_call_builtin_p (bos_stmt, BUILT_IN_DYNAMIC_OBJECT_SIZE)) > ubsan_create_edge (bos_stmt); > > - /* We have to emit the check. */ > - t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, > - GSI_SAME_STMT); > - ptr = force_gimple_operand_gsi (gsi, ptr, true, NULL_TREE, true, > - GSI_SAME_STMT); > tree ckind = build_int_cst (unsigned_char_type_node, > is_lhs ? UBSAN_STORE_OF : UBSAN_LOAD_OF); > gimple *g = gimple_build_call_internal (IFN_UBSAN_OBJECT_SIZE, 4, > ptr, t, sizet, ckind); > gimple_set_location (g, loc); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > } > > /* Instrument values passed to builtin functions. 
*/ > --- gcc/testsuite/gcc.dg/ubsan/pr112709-1.c.jj 2024-03-11 16:55:09.038760951 +0100 > +++ gcc/testsuite/gcc.dg/ubsan/pr112709-1.c 2024-03-11 16:55:02.159854950 +0100 > @@ -0,0 +1,52 @@ > +/* PR sanitizer/112709 */ > +/* { dg-do compile } */ > +/* { dg-options "-fsanitize=undefined -O2" } */ > + > +struct S { char c[1024]; }; > +int foo (int); > + > +__attribute__((returns_twice, noipa)) struct S > +bar (int x) > +{ > + (void) x; > + struct S s = {}; > + s.c[42] = 42; > + return s; > +} > + > +void > +baz (struct S *p) > +{ > + foo (1); > + *p = bar (0); > +} > + > +void > +qux (int x, struct S *p) > +{ > + if (x == 25) > + x = foo (2); > + else if (x == 42) > + x = foo (foo (3)); > + *p = bar (x); > +} > + > +void > +corge (int x, struct S *p) > +{ > + void *q[] = { &&l1, &&l2, &&l3, &&l3 }; > + if (x == 25) > + { > + l1: > + x = foo (2); > + } > + else if (x == 42) > + { > + l2: > + x = foo (foo (3)); > + } > +l3: > + *p = bar (x); > + if (x < 4) > + goto *q[x & 3]; > +} > --- gcc/testsuite/gcc.dg/ubsan/pr112709-2.c.jj 2024-03-11 16:55:37.000378840 +0100 > +++ gcc/testsuite/gcc.dg/ubsan/pr112709-2.c 2024-03-11 17:13:37.517599492 +0100 > @@ -0,0 +1,50 @@ > +/* PR sanitizer/112709 */ > +/* { dg-do compile } */ > +/* { dg-options "-fsanitize=undefined -O2" } */ > + > +struct S { char c[1024]; } *p; > +int foo (int); > + > +__attribute__((returns_twice, noipa)) int > +bar (struct S x) > +{ > + (void) x.c[0]; > + return 0; > +} > + > +void > +baz (int *y) > +{ > + foo (1); > + *y = bar (*p); > +} > + > +void > +qux (int x, int *y) > +{ > + if (x == 25) > + x = foo (2); > + else if (x == 42) > + x = foo (foo (3)); > + *y = bar (*p); > +} > + > +void > +corge (int x, int *y) > +{ > + void *q[] = { &&l1, &&l2, &&l3, &&l3 }; > + if (x == 25) > + { > + l1: > + x = foo (2); > + } > + else if (x == 42) > + { > + l2: > + x = foo (foo (3)); > + } > +l3: > + *y = bar (*p); > + if (x < 4) > + goto *q[x & 3]; > +} > > Jakub > >
On Tue, Mar 12, 2024 at 11:42:03AM +0100, Richard Biener wrote: > > +static edge > > +edge_before_returns_twice_call (basic_block bb) > > +{ > > + gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb); > > + gcc_checking_assert (is_gimple_call (gsi_stmt (gsi)) > > + && (gimple_call_flags (gsi_stmt (gsi)) > > + & ECF_RETURNS_TWICE) != 0); > > + edge_iterator ei; > > + edge e, ad_edge = NULL, other_edge = NULL; > > + bool split = false; > > + /* Look for a fallthru and possibly a single backedge. */ > > I can't see the backedge handling (I'm not sure it's actually required?) Oops, that comment line just shouldn't be there. Grepping around, I've pasted it from tree-ssa-uninit.cc (warn_uninit_phi_uses) - just wanted to grab something with the edge_iterator, edge declaration and FOR_EACH_EDGE on bb->preds, and forgot to remove that comment line. Consider it removed. > > + FOR_EACH_EDGE (e, ei, bb->preds) > > + { > > + if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) > > EDGE_EH is special because of the landing pad? Yes, though I'd think a landing pad would need to start with a label. 
Anyway, the (x & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL test is what is tested for ABNORMAL_DISPATCHER in other places: tree-cfgcleanup.cc- if ((ei_edge (ei)->flags tree-cfgcleanup.cc: & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) tree-cfgcleanup.cc- { tree-cfgcleanup.cc- gimple_stmt_iterator gsi tree-cfgcleanup.cc- = gsi_start_nondebug_after_labels_bb (dest); tree-cfgcleanup.cc- gimple *g = gsi_stmt (gsi); tree-cfgcleanup.cc- if (g && gimple_call_internal_p (g, IFN_ABNORMAL_DISPATCHER)) tree-cfgcleanup.cc- abnormal_dispatchers.safe_push (dest); (that is where I copied it from), tree-cfg.cc- FOR_EACH_EDGE (e, ei, bb->succs) tree-cfg.cc: if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) tree-cfg.cc- { tree-cfg.cc- gimple_stmt_iterator gsi tree-cfg.cc- = gsi_start_nondebug_after_labels_bb (e->dest); tree-cfg.cc- gimple *g = gsi_stmt (gsi); tree-cfg.cc- if (g && gimple_call_internal_p (g, IFN_ABNORMAL_DISPATCHER)) tree-cfg.cc- return e->dest; Though, admittedly, git blame shows both were written by myself, so it doesn't count much. The edge from IFN_ABNORMAL_DISPATCHER has EDGE_ABNORMAL flag set: make_edge (*dispatcher, for_bb, EDGE_ABNORMAL); and I bet some EDGE_* flags can be set on that edge later on, but EDGE_EH certainly shouldn't be. Bet in both of those preexisting cases and in this new one it could be done just as flags & EDGE_ABNORMAL test, though perhaps the tree-cfgcleanup.cc and tree-cfg.cc cases can see there EDGE_EH edges and would query the first non-debug stmt in them uselessly, while perhaps among the returns_twice predecessor EDGE_EH is very unlikely and not worth testing. 
So, if you want, I can change that line to if ((e->flags & EDGE_ABNORMAL) != 0) > > > + { > > + gimple_stmt_iterator gsi > > + = gsi_start_nondebug_after_labels_bb (e->src); > > + gimple *ad = gsi_stmt (gsi); > > + if (ad && gimple_call_internal_p (ad, IFN_ABNORMAL_DISPATCHER)) > > + { > > + gcc_checking_assert (ad_edge == NULL); > > + ad_edge = e; > > + continue; > > + } > > + } > > + if (other_edge || e->flags & (EDGE_ABNORMAL | EDGE_EH)) > > + split = true; > > + other_edge = e; > > + } > > + gcc_checking_assert (ad_edge); > > + if (other_edge == NULL) > > + split = true; > > + if (split) > > + { > > + other_edge = split_block_after_labels (bb); > > + e = make_edge (ad_edge->src, other_edge->dest, EDGE_ABNORMAL); > > In theory there could be multiple abnormal edges. Sure, but a function has at most one .ABNORMAL_DISPATCHER block. The edge from that block is the only one that needs to be kept pointing to the returns_twice function. In the testcases I've even tried to get other abnormal edges there, but because of labels we actually have labels on the block that falls through to the returns_twice block. I bet for EH edges it is the same. > Did you try > to use make_forwarder_block instead of manually doing it? Looking at it, it does a lot of unnecessary work. We don't need to do any redirection of anything but moving the .ABNORMAL_DISPATCHER edge after splitting the block. And indeed, it would fail on redirection of the EDGE_ABNORMAL edge. > Or > does that fail because we end up redirecting the abnormal edges > which you avoid by re-making it new and scrapping the old one? > I fail to remember why we can't redirect abnormal edges - yes, > there's no flow to be redirected but the process of "redirection" > still should work. Seems the redirection always wants to adjust the src blocks of the edges. 
That isn't something that should be done for this particular abnormal edge, that one actually can be redirected just by replacing it, though for say EH edges we'd need to adjust the on the side EH info etc. > > + for (gphi_iterator gsi = gsi_start_phis (other_edge->src); > > + !gsi_end_p (gsi); gsi_next (&gsi)) > > + { > > + gphi *phi = gsi.phi (); > > + tree lhs = gimple_phi_result (phi); > > + tree new_lhs; > > + if (virtual_operand_p (lhs)) > > + new_lhs = make_ssa_name (SSA_NAME_VAR (lhs)); > > + else > > + new_lhs = make_ssa_name (TREE_TYPE (lhs)); Guess I should use new_lhs = copy_ssa_name (lhs); instead > > + gimple_phi_set_result (phi, new_lhs); > > + gphi *new_phi = create_phi_node (lhs, other_edge->dest); > > + add_phi_arg (new_phi, new_lhs, other_edge, UNKNOWN_LOCATION); > > + add_phi_arg (new_phi, gimple_phi_arg_def_from_edge (phi, ad_edge), > > + e, gimple_phi_arg_location_from_edge (phi, ad_edge)); > > + } > > + remove_edge (ad_edge); > > + } > > + return other_edge; > > +} > > + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) > > + { > > + tree s = USE_FROM_PTR (use_p); > > + if (SSA_NAME_DEF_STMT (s) > > + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI > > + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) > > + { > > + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); > > + SET_USE (use_p, unshare_expr (r)); > > + m = true; > > + } > > + } > > Ick. Doesn't the forwarder keep the PHI defs and new PHI defs > are created in e->dest instead, I don't see how we need this? Admittedly the above is the ugliest part of the patch IMHO. It isn't needed in all cases, but e.g. for the pr112709-2.c (qux) case we have before ubsan pass # _7(ab) = PHI <_20(9), _8(ab)(11)> _22 = bar (*_7(ab)); and want to instrument the *_7(ab) load among the parameters for object size. So, it wants to add _2 = __builtin_dynamic_object_size (_7(ab), 0); sequence and later: .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0); before the call insn. 
If it isn't a returns_twice call, it would just add those statements into the same basic block using gsi_insert*before. But because it is a returns_twice call, it needs to be done in a different block. And having all of ubsan/asan/bitintlower find that out first and figure out that it should use _20 instead of _7(ab) seems hard, especially since for cases like: void freddy (int x, int *y, struct S *p) { bar (*p); ++p; if (x == 25) x = foo (2); else if (x == 42) x = foo (foo (3)); *y = bar (*p); } added to pr112709-2.c we'd have: # x_5(ab) = PHI <x_3(ab)(7), x_4(ab)(3), x_24(6), x_21(10)> # p_8(ab) = PHI <p_16(ab)(7), p_7(ab)(3), p_16(ab)(6), p_16(ab)(10)> _26 = bar (*p_8(ab)); before ubsan and the .ABNORMAL_DISPATCHER block is 3. And the caller doesn't know what to replace the p_8(ab) in there with until the new bb is created and PHIs are adjusted. Though, I ran into ICE on the above testcase, not on the second bar call with the multiple edges, but on the first one, because unlike everything else added in the testcases, the first bar splits the e edge on gsi_insert_on_edge_immediate, so the assumption that e is the edge from the single non-EDGE_ABNORMAL predecessor of the returns_twice block to the returns_twice block is then no longer true. Fixed by adding if (new_bb) e = single_succ_edge (new_bb); twice. > Btw, gsi_insert_before_returns_twice_call might be usable as > gsi_insert_after_labels ()? It could be called like that too, sure, though still the callers would need to check if they want to insert before some statement or "after labels" which for the returns_twice case actually isn't after any labels, but on an edge and if there isn't one (except for the EDGE_ABNORMAL), ensure there is just one. In theory even gsi_insert_before and gsi_insert_seq_before could do those, but I think most of the half thousand users of those don't need that behavior and would be surprised if something like that happened. 
> Or alternatively not expose > gsi_insert_before_returns_twice_call but instead make > split_block () behave "correctly" for abnormals (or add > split_block_before_returns_twice_call ()). But that's just bike-shedding. I don't see how that is possible. The usual case is that no split_block is needed, usually the returns_twice block has just a single normal predecessor and the .ABNORMAL_DISPATCHER predecessor. And in that case we shouldn't be creating a further block, just insert on the edge with the PHI-result-to-arg replacements. Similarly, even if we do split it, all further instrumentation that should go before the call should go to the created block, not result in further splits. Anyway, here is an updated patch, with the bogus comment removed, copy_ssa_name used, the ICE fix mentioned above and the addition of the freddy functions to the tests. Tested so far with make check-gcc check-g++ RUNTESTFLAGS='ubsan.exp asan.exp' 2024-03-12 Jakub Jelinek <jakub@redhat.com> PR sanitizer/112709 * gimple-iterator.h (gsi_insert_before_returns_twice_call, gsi_insert_seq_before_returns_twice_call): Declare. * gimple-iterator.cc: Include gimplify.h. (edge_before_returns_twice_call, gsi_insert_before_returns_twice_call, gsi_insert_seq_before_returns_twice_call): New functions. * ubsan.cc (ubsan_insert_before, ubsan_insert_seq_before): New functions. (instrument_mem_ref, instrument_pointer_overflow, instrument_nonnull_arg, instrument_nonnull_return): Use ubsan_insert_before instead of gsi_insert_before. (maybe_instrument_pointer_overflow): Use force_gimple_operand, gimple_seq_add_seq_without_update and ubsan_insert_seq_before instead of force_gimple_operand_gsi. (instrument_object_size): Likewise. Use ubsan_insert_before instead of gsi_insert_before. * gcc.dg/ubsan/pr112709-1.c: New test. * gcc.dg/ubsan/pr112709-2.c: New test. 
--- gcc/gimple-iterator.h.jj 2024-03-12 10:15:41.253529859 +0100 +++ gcc/gimple-iterator.h 2024-03-12 12:34:56.026880701 +0100 @@ -93,6 +93,10 @@ extern void gsi_insert_on_edge (edge, gi extern void gsi_insert_seq_on_edge (edge, gimple_seq); extern basic_block gsi_insert_on_edge_immediate (edge, gimple *); extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq); +extern basic_block gsi_insert_before_returns_twice_call (basic_block, + gimple *); +extern basic_block gsi_insert_seq_before_returns_twice_call (basic_block, + gimple_seq); extern void gsi_commit_edge_inserts (void); extern void gsi_commit_one_edge_insert (edge, basic_block *); extern gphi_iterator gsi_start_phis (basic_block); --- gcc/gimple-iterator.cc.jj 2024-03-12 10:15:41.209530471 +0100 +++ gcc/gimple-iterator.cc 2024-03-12 12:53:26.174467429 +0100 @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. #include "tree-cfg.h" #include "tree-ssa.h" #include "value-prof.h" +#include "gimplify.h" /* Mark the statement STMT as modified, and update it. */ @@ -944,3 +945,134 @@ gsi_start_phis (basic_block bb) return i; } + +/* Helper function for gsi_insert_before_returns_twice_call and + gsi_insert_seq_before_returns_twice_call. Find edge to + insert statements before returns_twice call at the start of BB, + if there isn't just one, split the bb and adjust PHIs to ensure + that. 
*/ + +static edge +edge_before_returns_twice_call (basic_block bb) +{ + gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb); + gcc_checking_assert (is_gimple_call (gsi_stmt (gsi)) + && (gimple_call_flags (gsi_stmt (gsi)) + & ECF_RETURNS_TWICE) != 0); + edge_iterator ei; + edge e, ad_edge = NULL, other_edge = NULL; + bool split = false; + FOR_EACH_EDGE (e, ei, bb->preds) + { + if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) + { + gimple_stmt_iterator gsi + = gsi_start_nondebug_after_labels_bb (e->src); + gimple *ad = gsi_stmt (gsi); + if (ad && gimple_call_internal_p (ad, IFN_ABNORMAL_DISPATCHER)) + { + gcc_checking_assert (ad_edge == NULL); + ad_edge = e; + continue; + } + } + if (other_edge || e->flags & (EDGE_ABNORMAL | EDGE_EH)) + split = true; + other_edge = e; + } + gcc_checking_assert (ad_edge); + if (other_edge == NULL) + split = true; + if (split) + { + other_edge = split_block_after_labels (bb); + e = make_edge (ad_edge->src, other_edge->dest, EDGE_ABNORMAL); + for (gphi_iterator gsi = gsi_start_phis (other_edge->src); + !gsi_end_p (gsi); gsi_next (&gsi)) + { + gphi *phi = gsi.phi (); + tree lhs = gimple_phi_result (phi); + tree new_lhs = copy_ssa_name (lhs); + gimple_phi_set_result (phi, new_lhs); + gphi *new_phi = create_phi_node (lhs, other_edge->dest); + add_phi_arg (new_phi, new_lhs, other_edge, UNKNOWN_LOCATION); + add_phi_arg (new_phi, gimple_phi_arg_def_from_edge (phi, ad_edge), + e, gimple_phi_arg_location_from_edge (phi, ad_edge)); + } + remove_edge (ad_edge); + } + return other_edge; +} + +/* Insert G before BB which starts with an ECF_RETURNS_TWICE call. + If BB has a single predecessor edge except for edge from + .ABNORMAL_DISPATCHER, insert on that edge, otherwise split + BB before the call, adjust PHIs and insert into the new BB. + If G refers to any results of BB's PHIs, replace them afterwards + with corresponding PHI arg. 
*/ + +basic_block +gsi_insert_before_returns_twice_call (basic_block bb, gimple *g) +{ + edge e = edge_before_returns_twice_call (bb); + basic_block new_bb = gsi_insert_on_edge_immediate (e, g); + if (new_bb) + e = single_succ_edge (new_bb); + use_operand_p use_p; + ssa_op_iter iter; + bool m = false; + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) + { + tree s = USE_FROM_PTR (use_p); + if (SSA_NAME_DEF_STMT (s) + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) + { + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); + SET_USE (use_p, unshare_expr (r)); + m = true; + } + } + if (m) + update_stmt (g); + return new_bb; +} + +/* Similiarly for sequence SEQ. */ + +basic_block +gsi_insert_seq_before_returns_twice_call (basic_block bb, gimple_seq seq) +{ + if (gimple_seq_empty_p (seq)) + return NULL; + gimple *f = gimple_seq_first_stmt (seq); + gimple *l = gimple_seq_last_stmt (seq); + edge e = edge_before_returns_twice_call (bb); + basic_block new_bb = gsi_insert_seq_on_edge_immediate (e, seq); + if (new_bb) + e = single_succ_edge (new_bb); + use_operand_p use_p; + ssa_op_iter iter; + for (gimple_stmt_iterator gsi = gsi_for_stmt (f); ; gsi_next (&gsi)) + { + gimple *g = gsi_stmt (gsi); + bool m = false; + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) + { + tree s = USE_FROM_PTR (use_p); + if (SSA_NAME_DEF_STMT (s) + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) + { + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); + SET_USE (use_p, unshare_expr (r)); + m = true; + } + } + if (m) + update_stmt (g); + if (g == l) + break; + } + return new_bb; +} --- gcc/ubsan.cc.jj 2024-03-12 10:15:41.306529121 +0100 +++ gcc/ubsan.cc 2024-03-12 12:34:56.027880687 +0100 @@ -1432,6 +1432,37 @@ ubsan_expand_vptr_ifn (gimple_stmt_itera return true; } +/* Insert G stmt before ITER. 
If ITER is a returns_twice call, + insert it on an appropriate edge instead. */ + +static void +ubsan_insert_before (gimple_stmt_iterator *iter, gimple *g) +{ + gimple *stmt = gsi_stmt (*iter); + if (stmt + && is_gimple_call (stmt) + && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0) + gsi_insert_before_returns_twice_call (gsi_bb (*iter), g); + else + gsi_insert_before (iter, g, GSI_SAME_STMT); +} + +/* Similarly for sequence SEQ. */ + +static void +ubsan_insert_seq_before (gimple_stmt_iterator *iter, gimple_seq seq) +{ + if (gimple_seq_empty_p (seq)) + return; + gimple *stmt = gsi_stmt (*iter); + if (stmt + && is_gimple_call (stmt) + && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0) + gsi_insert_seq_before_returns_twice_call (gsi_bb (*iter), seq); + else + gsi_insert_seq_before (iter, seq, GSI_SAME_STMT); +} + /* Instrument a memory reference. BASE is the base of MEM, IS_LHS says whether the pointer is on the left hand side of the assignment. */ @@ -1458,7 +1489,7 @@ instrument_mem_ref (tree mem, tree base, tree alignt = build_int_cst (pointer_sized_int_node, align); gcall *g = gimple_build_call_internal (IFN_UBSAN_NULL, 3, t, kind, alignt); gimple_set_location (g, gimple_location (gsi_stmt (*iter))); - gsi_insert_before (iter, g, GSI_SAME_STMT); + ubsan_insert_before (iter, g); } /* Perform the pointer instrumentation. */ @@ -1485,7 +1516,7 @@ instrument_pointer_overflow (gimple_stmt return; gcall *g = gimple_build_call_internal (IFN_UBSAN_PTR, 2, ptr, off); gimple_set_location (g, gimple_location (gsi_stmt (*gsi))); - gsi_insert_before (gsi, g, GSI_SAME_STMT); + ubsan_insert_before (gsi, g); } /* Instrument pointer arithmetics if any. 
*/ @@ -1577,10 +1608,11 @@ maybe_instrument_pointer_overflow (gimpl else t = fold_convert (sizetype, moff); } - t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, - GSI_SAME_STMT); - base_addr = force_gimple_operand_gsi (gsi, base_addr, true, NULL_TREE, true, - GSI_SAME_STMT); + gimple_seq seq, this_seq; + t = force_gimple_operand (t, &seq, true, NULL_TREE); + base_addr = force_gimple_operand (base_addr, &this_seq, true, NULL_TREE); + gimple_seq_add_seq_without_update (&seq, this_seq); + ubsan_insert_seq_before (gsi, seq); instrument_pointer_overflow (gsi, base_addr, t); } @@ -2035,7 +2067,7 @@ instrument_nonnull_arg (gimple_stmt_iter { g = gimple_build_assign (make_ssa_name (TREE_TYPE (arg)), arg); gimple_set_location (g, loc[0]); - gsi_insert_before (gsi, g, GSI_SAME_STMT); + ubsan_insert_before (gsi, g); arg = gimple_assign_lhs (g); } @@ -2068,7 +2100,7 @@ instrument_nonnull_arg (gimple_stmt_iter g = gimple_build_call (fn, 1, data); } gimple_set_location (g, loc[0]); - gsi_insert_before (gsi, g, GSI_SAME_STMT); + ubsan_insert_before (gsi, g); ubsan_create_edge (g); } *gsi = gsi_for_stmt (stmt); @@ -2124,7 +2156,7 @@ instrument_nonnull_return (gimple_stmt_i g = gimple_build_call (fn, 2, data, data2); } gimple_set_location (g, loc[0]); - gsi_insert_before (gsi, g, GSI_SAME_STMT); + ubsan_insert_before (gsi, g); ubsan_create_edge (g); *gsi = gsi_for_stmt (stmt); } @@ -2231,6 +2263,7 @@ instrument_object_size (gimple_stmt_iter tree sizet; tree base_addr = base; gimple *bos_stmt = NULL; + gimple_seq seq = NULL; if (decl_p) base_addr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (base)), base); @@ -2244,19 +2277,12 @@ instrument_object_size (gimple_stmt_iter sizet = builtin_decl_explicit (BUILT_IN_DYNAMIC_OBJECT_SIZE); sizet = build_call_expr_loc (loc, sizet, 2, base_addr, integer_zero_node); - sizet = force_gimple_operand_gsi (gsi, sizet, false, NULL_TREE, true, - GSI_SAME_STMT); + sizet = force_gimple_operand (sizet, &seq, false, NULL_TREE); /* If the 
call above didn't end up being an integer constant, go one statement back and get the __builtin_object_size stmt. Save it, we might need it later. */ if (SSA_VAR_P (sizet)) - { - gsi_prev (gsi); - bos_stmt = gsi_stmt (*gsi); - - /* Move on to where we were. */ - gsi_next (gsi); - } + bos_stmt = gsi_stmt (gsi_last (seq)); } else return; @@ -2298,21 +2324,24 @@ instrument_object_size (gimple_stmt_iter && !TREE_ADDRESSABLE (base)) mark_addressable (base); + /* We have to emit the check. */ + gimple_seq this_seq; + t = force_gimple_operand (t, &this_seq, true, NULL_TREE); + gimple_seq_add_seq_without_update (&seq, this_seq); + ptr = force_gimple_operand (ptr, &this_seq, true, NULL_TREE); + gimple_seq_add_seq_without_update (&seq, this_seq); + ubsan_insert_seq_before (gsi, seq); + if (bos_stmt && gimple_call_builtin_p (bos_stmt, BUILT_IN_DYNAMIC_OBJECT_SIZE)) ubsan_create_edge (bos_stmt); - /* We have to emit the check. */ - t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, - GSI_SAME_STMT); - ptr = force_gimple_operand_gsi (gsi, ptr, true, NULL_TREE, true, - GSI_SAME_STMT); tree ckind = build_int_cst (unsigned_char_type_node, is_lhs ? UBSAN_STORE_OF : UBSAN_LOAD_OF); gimple *g = gimple_build_call_internal (IFN_UBSAN_OBJECT_SIZE, 4, ptr, t, sizet, ckind); gimple_set_location (g, loc); - gsi_insert_before (gsi, g, GSI_SAME_STMT); + ubsan_insert_before (gsi, g); } /* Instrument values passed to builtin functions. 
*/ --- gcc/testsuite/gcc.dg/ubsan/pr112709-1.c.jj 2024-03-12 12:34:56.027880687 +0100 +++ gcc/testsuite/gcc.dg/ubsan/pr112709-1.c 2024-03-12 12:57:58.993687023 +0100 @@ -0,0 +1,64 @@ +/* PR sanitizer/112709 */ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=undefined -O2" } */ + +struct S { char c[1024]; }; +int foo (int); + +__attribute__((returns_twice, noipa)) struct S +bar (int x) +{ + (void) x; + struct S s = {}; + s.c[42] = 42; + return s; +} + +void +baz (struct S *p) +{ + foo (1); + *p = bar (0); +} + +void +qux (int x, struct S *p) +{ + if (x == 25) + x = foo (2); + else if (x == 42) + x = foo (foo (3)); + *p = bar (x); +} + +void +corge (int x, struct S *p) +{ + void *q[] = { &&l1, &&l2, &&l3, &&l3 }; + if (x == 25) + { + l1: + x = foo (2); + } + else if (x == 42) + { + l2: + x = foo (foo (3)); + } +l3: + *p = bar (x); + if (x < 4) + goto *q[x & 3]; +} + +void +freddy (int x, struct S *p) +{ + *p = bar (x); + ++p; + if (x == 25) + x = foo (2); + else if (x == 42) + x = foo (foo (3)); + *p = bar (x); +} --- gcc/testsuite/gcc.dg/ubsan/pr112709-2.c.jj 2024-03-12 12:34:56.027880687 +0100 +++ gcc/testsuite/gcc.dg/ubsan/pr112709-2.c 2024-03-12 12:58:13.305488706 +0100 @@ -0,0 +1,62 @@ +/* PR sanitizer/112709 */ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=undefined -O2" } */ + +struct S { char c[1024]; } *p; +int foo (int); + +__attribute__((returns_twice, noipa)) int +bar (struct S x) +{ + (void) x.c[0]; + return 0; +} + +void +baz (int *y) +{ + foo (1); + *y = bar (*p); +} + +void +qux (int x, int *y) +{ + if (x == 25) + x = foo (2); + else if (x == 42) + x = foo (foo (3)); + *y = bar (*p); +} + +void +corge (int x, int *y) +{ + void *q[] = { &&l1, &&l2, &&l3, &&l3 }; + if (x == 25) + { + l1: + x = foo (2); + } + else if (x == 42) + { + l2: + x = foo (foo (3)); + } +l3: + *y = bar (*p); + if (x < 4) + goto *q[x & 3]; +} + +void +freddy (int x, int *y, struct S *p) +{ + bar (*p); + ++p; + if (x == 25) + x = foo (2); + else if (x == 42) + x = foo 
(foo (3)); + *y = bar (*p); +} Jakub
On Tue, 12 Mar 2024, Jakub Jelinek wrote: > On Tue, Mar 12, 2024 at 11:42:03AM +0100, Richard Biener wrote: > > > +static edge > > > +edge_before_returns_twice_call (basic_block bb) > > > +{ > > > + gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb); > > > + gcc_checking_assert (is_gimple_call (gsi_stmt (gsi)) > > > + && (gimple_call_flags (gsi_stmt (gsi)) > > > + & ECF_RETURNS_TWICE) != 0); > > > + edge_iterator ei; > > > + edge e, ad_edge = NULL, other_edge = NULL; > > > + bool split = false; > > > + /* Look for a fallthru and possibly a single backedge. */ > > > > I can't see the backedge handling (I'm not sure it's actually required?) > > Oops, that comment line just shouldn't be there. Grepping around, I've > pasted it from tree-ssa-uninit.cc (warn_uninit_phi_uses) - just wanted > to grab something with the edge_iterator, edge declaration and > FOR_EACH_EDGE on bb->preds) and forgot to remove that comment line. > > Consider it removed. > > > > + FOR_EACH_EDGE (e, ei, bb->preds) > > > + { > > > + if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) > > > > EDGE_EH is special because of the landing pad? > > Yes, though I'd think a landing pad would need to start with a label. 
> Anyway, the (x & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL > test is what is tested for ABNORMAL_DISPATCHER in other places: > tree-cfgcleanup.cc- if ((ei_edge (ei)->flags > tree-cfgcleanup.cc: & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) > tree-cfgcleanup.cc- { > tree-cfgcleanup.cc- gimple_stmt_iterator gsi > tree-cfgcleanup.cc- = gsi_start_nondebug_after_labels_bb (dest); > tree-cfgcleanup.cc- gimple *g = gsi_stmt (gsi); > tree-cfgcleanup.cc- if (g && gimple_call_internal_p (g, IFN_ABNORMAL_DISPATCHER)) > tree-cfgcleanup.cc- abnormal_dispatchers.safe_push (dest); > (that is where I copied it from), > tree-cfg.cc- FOR_EACH_EDGE (e, ei, bb->succs) > tree-cfg.cc: if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) > tree-cfg.cc- { > tree-cfg.cc- gimple_stmt_iterator gsi > tree-cfg.cc- = gsi_start_nondebug_after_labels_bb (e->dest); > tree-cfg.cc- gimple *g = gsi_stmt (gsi); > tree-cfg.cc- if (g && gimple_call_internal_p (g, IFN_ABNORMAL_DISPATCHER)) > tree-cfg.cc- return e->dest; > Though, admittedly, git blame shows both were written by myself, so it > doesn't count much. The edge from IFN_ABNORMAL_DISPATCHER has EDGE_ABNORMAL > flag set: > make_edge (*dispatcher, for_bb, EDGE_ABNORMAL); > and I bet some EDGE_* flags can be set on that edge later on, but EDGE_EH > certainly shouldn't be. > Bet in both of those preexisting cases and in this new one it could be done > just as flags & EDGE_ABNORMAL test, though perhaps the tree-cfgcleanup.cc > and tree-cfg.cc cases can see there EDGE_EH edges and would query the first > non-debug stmt in them uselessly, while perhaps among the returns_twice > predecessor EDGE_EH is very unlikely and not worth testing. 
> So, if you want, I can change that line to > if ((e->flags & EDGE_ABNORMAL) != 0) > > > > > + { > > > + gimple_stmt_iterator gsi > > > + = gsi_start_nondebug_after_labels_bb (e->src); > > > + gimple *ad = gsi_stmt (gsi); > > > + if (ad && gimple_call_internal_p (ad, IFN_ABNORMAL_DISPATCHER)) > > > + { > > > + gcc_checking_assert (ad_edge == NULL); > > > + ad_edge = e; > > > + continue; > > > + } > > > + } > > > + if (other_edge || e->flags & (EDGE_ABNORMAL | EDGE_EH)) > > > + split = true; > > > + other_edge = e; > > > + } > > > + gcc_checking_assert (ad_edge); > > > + if (other_edge == NULL) > > > + split = true; > > > + if (split) > > > + { > > > + other_edge = split_block_after_labels (bb); > > > + e = make_edge (ad_edge->src, other_edge->dest, EDGE_ABNORMAL); > > > > In theory there could be multiple abnormal edges. > > Sure, but a function has at most one .ABNORMAL_DISPATCHER block. > Edge from that block is the only one that needs to be kept pointing to > the returns_twice function. In the testcases I've even tried to > get other abnormal edges there, but because of labels we actually have > labels on the block that falls through to the returns_twice block. > I bet for EH edges it is the same. > > > Did you try > > to use make_forwarder_block instead of manually doing it? > > Looking at it, it does a lot of unnecessary work. We don't need > to do any redirection of anything but moving the .ABNORMAL_DISPATCHER > edge after splitting the block. And indeed, it would fail on redirection > of the EDGE_ABNORMAL edge. I'm mostly concerned about maintainability here, but let's leave that aside because of the edge redirection that doesn't work (I don't see why we couldn't just make it work, but ...) > > Or > > does that fail because we end up redirecting the abnormal edges > > which you avoid by re-making it new and scrapping the old one? 
> > I fail to remember why we can't redirect abnormal edges - yes, > > there's no flow to be redirected but the process of "redirection" > > still should work. > > Seems the redirection always wants to adjust the src blocks of the > edges. That isn't something that should be done for this particular > abnormal edge, that one actually can be redirected just by replacing it, > though for say EH edges we'd need to adjust the on the side EH info etc. > > > > + for (gphi_iterator gsi = gsi_start_phis (other_edge->src); > > > + !gsi_end_p (gsi); gsi_next (&gsi)) > > > + { > > > + gphi *phi = gsi.phi (); > > > + tree lhs = gimple_phi_result (phi); > > > + tree new_lhs; > > > + if (virtual_operand_p (lhs)) > > > + new_lhs = make_ssa_name (SSA_NAME_VAR (lhs)); > > > + else > > > + new_lhs = make_ssa_name (TREE_TYPE (lhs)); > > Guess I should use new_lhs = copy_ssa_name (lhs); instead > > > > + gimple_phi_set_result (phi, new_lhs); > > > + gphi *new_phi = create_phi_node (lhs, other_edge->dest); > > > + add_phi_arg (new_phi, new_lhs, other_edge, UNKNOWN_LOCATION); > > > + add_phi_arg (new_phi, gimple_phi_arg_def_from_edge (phi, ad_edge), > > > + e, gimple_phi_arg_location_from_edge (phi, ad_edge)); > > > + } > > > + remove_edge (ad_edge); > > > + } > > > + return other_edge; > > > +} > > > > + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) > > > + { > > > + tree s = USE_FROM_PTR (use_p); > > > + if (SSA_NAME_DEF_STMT (s) > > > + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI > > > + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) > > > + { > > > + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); > > > + SET_USE (use_p, unshare_expr (r)); > > > + m = true; > > > + } > > > + } > > > > Ick. Doesn't the forwarder keep the PHI defs and new PHI defs > > are created in e->dest instead, I don't see how we need this? > > Admittedly the above is the ugliest part of the patch IMHO. > It isn't needed in all cases, but e.g. 
for the pr112709-2.c (qux) case > we have before ubsan pass > # _7(ab) = PHI <_20(9), _8(ab)(11)> > _22 = bar (*_7(ab)); > and want to instrument the *_7(ab) load among the parameters for object > size. > So, it wants to add > _2 = __builtin_dynamic_object_size (_7(ab), 0); > sequence and later: > .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0); > before the call insn. If it isn't a noreturn call, it would just > add those statements into the same basic block using gsi_insert*before. > But because it is a returns_twice call, it needs to be done in a different > block. And having all of ubsan/asan/bitintlower find that out first and > figure out that it should use _20 instead of _7(ab) seems hard, especially > that for cases like: I would have expected the result as # _7 = PHI <_20(9)> _2 = __builtin_dynamic_object_size (_7, 0); .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0); // fallthru # _99(ab) = PHI <_7, _8(ab)(11)> _22 = bar (*_99(ab)); where all uses of _7 in the IL before splitting get replaced via their immediate uses but the to-be inserted IL can continue using the defs (that requires the not inserted IL to not have update_stmt called on them, of course). I think split_block would have the PHIs organized that way already, you are adding the _99 defs anyway, right? That is, + tree new_lhs = copy_ssa_name (lhs); + gimple_phi_set_result (phi, new_lhs); + gphi *new_phi = create_phi_node (lhs, other_edge->dest); here you swap the defs, I'd omit that. And instead essentially do replace_uses_by (lhs, new_lhs) here (without the folding and stuff). You have to clear SSA_NAME_OCCURS_IN_ABNORMAL_PHI on the old and set it on the new LHS if it was set. I think that should contain the "SSA update" to edge_before_returns_twice_call for the case it splits the block? 
> void > freddy (int x, int *y, struct S *p) > { > bar (*p); > ++p; > if (x == 25) > x = foo (2); > else if (x == 42) > x = foo (foo (3)); > *y = bar (*p); > } > added to pr112709-2.c we'd have: > # x_5(ab) = PHI <x_3(ab)(7), x_4(ab)(3), x_24(6), x_21(10)> > # p_8(ab) = PHI <p_16(ab)(7), p_7(ab)(3), p_16(ab)(6), p_16(ab)(10)> > _26 = bar (*p_8(ab)); > before ubsan and the .ABNORMAL_DISPATCHER block is 3. > And the caller doesn't know what to replace the p_8(ab) in there with > until the new bb is created and PHIs are adjusted. > > Though, I ran into ICE on the above testcase, not on the second bar call > with the multiple edges, but on the first one, because unlike everything > else added in the testcases, the first bar splits the e edge on > gsi_insert_on_edge_immediate, so the assumption that e is the edge from > the single non-EDGE_ABNORMAL predecessor of the returns_twice block > to the returns_twice block is then no longer true. Fixed by adding > if (new_bb) > e = single_succ_edge (new_bb); > twice. > > > Btw, gsi_insert_before_returns_twice_call might be usable as > > gsi_insert_after_labels ()? > > It could be called like that too, sure, though still the callers > would need to check if they want to insert before some statement or > "after labels" which for the returns_twice case actually isn't after any > labels, but on an edge and if there isn't one (except for the EDGE_ABNORMAL), > ensure there is just one. > In theory even gsi_insert_before and gsi_insert_seq_before could do those, > but I think most of the half thousand users of those don't need that > behavior and would be surprised if something like that happened. > > > Or alternatively not expose > > gsi_insert_before_returns_twice_call but instead make > > split_block () behave "correctly" for abnormals (or add > > split_block_before_returns_twice_call ()). But that's just bike-shedding. > I don't see how that is possible. 
The usual case is that no split_block > is needed, usually the returns_twice block has just a single normal > predecessor and the .ABNORMAL_DISPATCHER predecessor. And in that case > we shouldn't be creating a further block, just insert on the edge with the > PHI result to arg replacements. Similarly, even if we do split it, all > further instrumentation that should go before the call should go to the > created block, not result in further splits. > > Anyway, here is an updated patch, with the bogus comment removed, > copy_ssa_name used, the ICE fix mentioned above and the additions of freddy > functions into the tests. > > Tested so far with make check-gcc check-g++ RUNTESTFLAGS='ubsan.exp asan.exp' > > 2024-03-12  Jakub Jelinek  <jakub@redhat.com> > > PR sanitizer/112709 > * gimple-iterator.h (gsi_insert_before_returns_twice_call, > gsi_insert_seq_before_returns_twice_call): Declare. > * gimple-iterator.cc: Include gimplify.h. > (edge_before_returns_twice_call, gsi_insert_before_returns_twice_call, > gsi_insert_seq_before_returns_twice_call): New functions. > * ubsan.cc (ubsan_insert_before, ubsan_insert_seq_before): New > functions. > (instrument_mem_ref, instrument_pointer_overflow, > instrument_nonnull_arg, instrument_nonnull_return): Use > ubsan_insert_before instead of gsi_insert_before. > (maybe_instrument_pointer_overflow): Use force_gimple_operand, > gimple_seq_add_seq_without_update and ubsan_insert_seq_before > instead of force_gimple_operand_gsi. > (instrument_object_size): Likewise. Use ubsan_insert_before instead > of gsi_insert_before. > > * gcc.dg/ubsan/pr112709-1.c: New test. > * gcc.dg/ubsan/pr112709-2.c: New test. 
> > --- gcc/gimple-iterator.h.jj 2024-03-12 10:15:41.253529859 +0100 > +++ gcc/gimple-iterator.h 2024-03-12 12:34:56.026880701 +0100 > @@ -93,6 +93,10 @@ extern void gsi_insert_on_edge (edge, gi > extern void gsi_insert_seq_on_edge (edge, gimple_seq); > extern basic_block gsi_insert_on_edge_immediate (edge, gimple *); > extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq); > +extern basic_block gsi_insert_before_returns_twice_call (basic_block, > + gimple *); > +extern basic_block gsi_insert_seq_before_returns_twice_call (basic_block, > + gimple_seq); > extern void gsi_commit_edge_inserts (void); > extern void gsi_commit_one_edge_insert (edge, basic_block *); > extern gphi_iterator gsi_start_phis (basic_block); > --- gcc/gimple-iterator.cc.jj 2024-03-12 10:15:41.209530471 +0100 > +++ gcc/gimple-iterator.cc 2024-03-12 12:53:26.174467429 +0100 > @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. > #include "tree-cfg.h" > #include "tree-ssa.h" > #include "value-prof.h" > +#include "gimplify.h" > > > /* Mark the statement STMT as modified, and update it. */ > @@ -944,3 +945,134 @@ gsi_start_phis (basic_block bb) > > return i; > } > + > +/* Helper function for gsi_insert_before_returns_twice_call and > + gsi_insert_seq_before_returns_twice_call. Find edge to > + insert statements before returns_twice call at the start of BB, > + if there isn't just one, split the bb and adjust PHIs to ensure > + that. 
*/ > + > +static edge > +edge_before_returns_twice_call (basic_block bb) > +{ > + gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb); > + gcc_checking_assert (is_gimple_call (gsi_stmt (gsi)) > + && (gimple_call_flags (gsi_stmt (gsi)) > + & ECF_RETURNS_TWICE) != 0); > + edge_iterator ei; > + edge e, ad_edge = NULL, other_edge = NULL; > + bool split = false; > + FOR_EACH_EDGE (e, ei, bb->preds) > + { > + if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL) > + { > + gimple_stmt_iterator gsi > + = gsi_start_nondebug_after_labels_bb (e->src); > + gimple *ad = gsi_stmt (gsi); > + if (ad && gimple_call_internal_p (ad, IFN_ABNORMAL_DISPATCHER)) > + { > + gcc_checking_assert (ad_edge == NULL); > + ad_edge = e; > + continue; > + } > + } > + if (other_edge || e->flags & (EDGE_ABNORMAL | EDGE_EH)) > + split = true; > + other_edge = e; > + } > + gcc_checking_assert (ad_edge); > + if (other_edge == NULL) > + split = true; > + if (split) > + { > + other_edge = split_block_after_labels (bb); > + e = make_edge (ad_edge->src, other_edge->dest, EDGE_ABNORMAL); > + for (gphi_iterator gsi = gsi_start_phis (other_edge->src); > + !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gphi *phi = gsi.phi (); > + tree lhs = gimple_phi_result (phi); > + tree new_lhs = copy_ssa_name (lhs); > + gimple_phi_set_result (phi, new_lhs); > + gphi *new_phi = create_phi_node (lhs, other_edge->dest); > + add_phi_arg (new_phi, new_lhs, other_edge, UNKNOWN_LOCATION); > + add_phi_arg (new_phi, gimple_phi_arg_def_from_edge (phi, ad_edge), > + e, gimple_phi_arg_location_from_edge (phi, ad_edge)); > + } > + remove_edge (ad_edge); > + } > + return other_edge; > +} > + > +/* Insert G before BB which starts with an ECF_RETURNS_TWICE call. > + If BB has a single predecessor edge except for edge from > + .ABNORMAL_DISPATCHER, insert on that edge, otherwise split > + BB before the call, adjust PHIs and insert into the new BB. 
> +   If G refers to any results of BB's PHIs, replace them afterwards > +   with corresponding PHI arg. */ > + > +basic_block > +gsi_insert_before_returns_twice_call (basic_block bb, gimple *g) > +{ > + edge e = edge_before_returns_twice_call (bb); > + basic_block new_bb = gsi_insert_on_edge_immediate (e, g); > + if (new_bb) > + e = single_succ_edge (new_bb); > + use_operand_p use_p; > + ssa_op_iter iter; > + bool m = false; > + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) > + { > + tree s = USE_FROM_PTR (use_p); > + if (SSA_NAME_DEF_STMT (s) > + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI > + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) > + { > + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); > + SET_USE (use_p, unshare_expr (r)); > + m = true; > + } > + } > + if (m) > + update_stmt (g); > + return new_bb; > +} > + > +/* Similarly for sequence SEQ. */ > + > +basic_block > +gsi_insert_seq_before_returns_twice_call (basic_block bb, gimple_seq seq) > +{ > + if (gimple_seq_empty_p (seq)) > + return NULL; > + gimple *f = gimple_seq_first_stmt (seq); > + gimple *l = gimple_seq_last_stmt (seq); > + edge e = edge_before_returns_twice_call (bb); > + basic_block new_bb = gsi_insert_seq_on_edge_immediate (e, seq); > + if (new_bb) > + e = single_succ_edge (new_bb); > + use_operand_p use_p; > + ssa_op_iter iter; > + for (gimple_stmt_iterator gsi = gsi_for_stmt (f); ; gsi_next (&gsi)) > + { > + gimple *g = gsi_stmt (gsi); > + bool m = false; > + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE) > + { > + tree s = USE_FROM_PTR (use_p); > + if (SSA_NAME_DEF_STMT (s) > + && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI > + && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest) > + { > + tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e); > + SET_USE (use_p, unshare_expr (r)); > + m = true; > + } > + } > + if (m) > + update_stmt (g); > + if (g == l) > + break; > + } > + return new_bb; > +} > --- gcc/ubsan.cc.jj 2024-03-12 
10:15:41.306529121 +0100 > +++ gcc/ubsan.cc 2024-03-12 12:34:56.027880687 +0100 > @@ -1432,6 +1432,37 @@ ubsan_expand_vptr_ifn (gimple_stmt_itera > return true; > } > > +/* Insert G stmt before ITER. If ITER is a returns_twice call, > + insert it on an appropriate edge instead. */ > + > +static void > +ubsan_insert_before (gimple_stmt_iterator *iter, gimple *g) > +{ > + gimple *stmt = gsi_stmt (*iter); > + if (stmt > + && is_gimple_call (stmt) > + && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0) > + gsi_insert_before_returns_twice_call (gsi_bb (*iter), g); > + else > + gsi_insert_before (iter, g, GSI_SAME_STMT); > +} > + > +/* Similarly for sequence SEQ. */ > + > +static void > +ubsan_insert_seq_before (gimple_stmt_iterator *iter, gimple_seq seq) > +{ > + if (gimple_seq_empty_p (seq)) > + return; > + gimple *stmt = gsi_stmt (*iter); > + if (stmt > + && is_gimple_call (stmt) > + && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0) > + gsi_insert_seq_before_returns_twice_call (gsi_bb (*iter), seq); > + else > + gsi_insert_seq_before (iter, seq, GSI_SAME_STMT); > +} > + > /* Instrument a memory reference. BASE is the base of MEM, IS_LHS says > whether the pointer is on the left hand side of the assignment. */ > > @@ -1458,7 +1489,7 @@ instrument_mem_ref (tree mem, tree base, > tree alignt = build_int_cst (pointer_sized_int_node, align); > gcall *g = gimple_build_call_internal (IFN_UBSAN_NULL, 3, t, kind, alignt); > gimple_set_location (g, gimple_location (gsi_stmt (*iter))); > - gsi_insert_before (iter, g, GSI_SAME_STMT); > + ubsan_insert_before (iter, g); > } > > /* Perform the pointer instrumentation. */ > @@ -1485,7 +1516,7 @@ instrument_pointer_overflow (gimple_stmt > return; > gcall *g = gimple_build_call_internal (IFN_UBSAN_PTR, 2, ptr, off); > gimple_set_location (g, gimple_location (gsi_stmt (*gsi))); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > } > > /* Instrument pointer arithmetics if any. 
*/ > @@ -1577,10 +1608,11 @@ maybe_instrument_pointer_overflow (gimpl > else > t = fold_convert (sizetype, moff); > } > - t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, > - GSI_SAME_STMT); > - base_addr = force_gimple_operand_gsi (gsi, base_addr, true, NULL_TREE, true, > - GSI_SAME_STMT); > + gimple_seq seq, this_seq; > + t = force_gimple_operand (t, &seq, true, NULL_TREE); > + base_addr = force_gimple_operand (base_addr, &this_seq, true, NULL_TREE); > + gimple_seq_add_seq_without_update (&seq, this_seq); > + ubsan_insert_seq_before (gsi, seq); > instrument_pointer_overflow (gsi, base_addr, t); > } > > @@ -2035,7 +2067,7 @@ instrument_nonnull_arg (gimple_stmt_iter > { > g = gimple_build_assign (make_ssa_name (TREE_TYPE (arg)), arg); > gimple_set_location (g, loc[0]); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > arg = gimple_assign_lhs (g); > } > > @@ -2068,7 +2100,7 @@ instrument_nonnull_arg (gimple_stmt_iter > g = gimple_build_call (fn, 1, data); > } > gimple_set_location (g, loc[0]); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > ubsan_create_edge (g); > } > *gsi = gsi_for_stmt (stmt); > @@ -2124,7 +2156,7 @@ instrument_nonnull_return (gimple_stmt_i > g = gimple_build_call (fn, 2, data, data2); > } > gimple_set_location (g, loc[0]); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > ubsan_create_edge (g); > *gsi = gsi_for_stmt (stmt); > } > @@ -2231,6 +2263,7 @@ instrument_object_size (gimple_stmt_iter > tree sizet; > tree base_addr = base; > gimple *bos_stmt = NULL; > + gimple_seq seq = NULL; > if (decl_p) > base_addr = build1 (ADDR_EXPR, > build_pointer_type (TREE_TYPE (base)), base); > @@ -2244,19 +2277,12 @@ instrument_object_size (gimple_stmt_iter > sizet = builtin_decl_explicit (BUILT_IN_DYNAMIC_OBJECT_SIZE); > sizet = build_call_expr_loc (loc, sizet, 2, base_addr, > integer_zero_node); > - sizet = force_gimple_operand_gsi (gsi, sizet, 
false, NULL_TREE, true, > - GSI_SAME_STMT); > + sizet = force_gimple_operand (sizet, &seq, false, NULL_TREE); > /* If the call above didn't end up being an integer constant, go one > statement back and get the __builtin_object_size stmt. Save it, > we might need it later. */ > if (SSA_VAR_P (sizet)) > - { > - gsi_prev (gsi); > - bos_stmt = gsi_stmt (*gsi); > - > - /* Move on to where we were. */ > - gsi_next (gsi); > - } > + bos_stmt = gsi_stmt (gsi_last (seq)); > } > else > return; > @@ -2298,21 +2324,24 @@ instrument_object_size (gimple_stmt_iter > && !TREE_ADDRESSABLE (base)) > mark_addressable (base); > > + /* We have to emit the check. */ > + gimple_seq this_seq; > + t = force_gimple_operand (t, &this_seq, true, NULL_TREE); > + gimple_seq_add_seq_without_update (&seq, this_seq); > + ptr = force_gimple_operand (ptr, &this_seq, true, NULL_TREE); > + gimple_seq_add_seq_without_update (&seq, this_seq); > + ubsan_insert_seq_before (gsi, seq); > + > if (bos_stmt > && gimple_call_builtin_p (bos_stmt, BUILT_IN_DYNAMIC_OBJECT_SIZE)) > ubsan_create_edge (bos_stmt); > > - /* We have to emit the check. */ > - t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, > - GSI_SAME_STMT); > - ptr = force_gimple_operand_gsi (gsi, ptr, true, NULL_TREE, true, > - GSI_SAME_STMT); > tree ckind = build_int_cst (unsigned_char_type_node, > is_lhs ? UBSAN_STORE_OF : UBSAN_LOAD_OF); > gimple *g = gimple_build_call_internal (IFN_UBSAN_OBJECT_SIZE, 4, > ptr, t, sizet, ckind); > gimple_set_location (g, loc); > - gsi_insert_before (gsi, g, GSI_SAME_STMT); > + ubsan_insert_before (gsi, g); > } > > /* Instrument values passed to builtin functions. 
*/
> --- gcc/testsuite/gcc.dg/ubsan/pr112709-1.c.jj	2024-03-12 12:34:56.027880687 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/pr112709-1.c	2024-03-12 12:57:58.993687023 +0100
> @@ -0,0 +1,64 @@
> +/* PR sanitizer/112709 */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=undefined -O2" } */
> +
> +struct S { char c[1024]; };
> +int foo (int);
> +
> +__attribute__((returns_twice, noipa)) struct S
> +bar (int x)
> +{
> +  (void) x;
> +  struct S s = {};
> +  s.c[42] = 42;
> +  return s;
> +}
> +
> +void
> +baz (struct S *p)
> +{
> +  foo (1);
> +  *p = bar (0);
> +}
> +
> +void
> +qux (int x, struct S *p)
> +{
> +  if (x == 25)
> +    x = foo (2);
> +  else if (x == 42)
> +    x = foo (foo (3));
> +  *p = bar (x);
> +}
> +
> +void
> +corge (int x, struct S *p)
> +{
> +  void *q[] = { &&l1, &&l2, &&l3, &&l3 };
> +  if (x == 25)
> +    {
> +    l1:
> +      x = foo (2);
> +    }
> +  else if (x == 42)
> +    {
> +    l2:
> +      x = foo (foo (3));
> +    }
> +l3:
> +  *p = bar (x);
> +  if (x < 4)
> +    goto *q[x & 3];
> +}
> +
> +void
> +freddy (int x, struct S *p)
> +{
> +  *p = bar (x);
> +  ++p;
> +  if (x == 25)
> +    x = foo (2);
> +  else if (x == 42)
> +    x = foo (foo (3));
> +  *p = bar (x);
> +}
> --- gcc/testsuite/gcc.dg/ubsan/pr112709-2.c.jj	2024-03-12 12:34:56.027880687 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/pr112709-2.c	2024-03-12 12:58:13.305488706 +0100
> @@ -0,0 +1,62 @@
> +/* PR sanitizer/112709 */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=undefined -O2" } */
> +
> +struct S { char c[1024]; } *p;
> +int foo (int);
> +
> +__attribute__((returns_twice, noipa)) int
> +bar (struct S x)
> +{
> +  (void) x.c[0];
> +  return 0;
> +}
> +
> +void
> +baz (int *y)
> +{
> +  foo (1);
> +  *y = bar (*p);
> +}
> +
> +void
> +qux (int x, int *y)
> +{
> +  if (x == 25)
> +    x = foo (2);
> +  else if (x == 42)
> +    x = foo (foo (3));
> +  *y = bar (*p);
> +}
> +
> +void
> +corge (int x, int *y)
> +{
> +  void *q[] = { &&l1, &&l2, &&l3, &&l3 };
> +  if (x == 25)
> +    {
> +    l1:
> +      x = foo (2);
> +    }
> +  else if (x == 42)
> +    {
> +    l2:
> +      x = foo (foo (3));
> +    }
> +l3:
> +  *y = bar (*p);
> +  if (x < 4)
> +    goto *q[x & 3];
> +}
> +
> +void
> +freddy (int x, int *y, struct S *p)
> +{
> +  bar (*p);
> +  ++p;
> +  if (x == 25)
> +    x = foo (2);
> +  else if (x == 42)
> +    x = foo (foo (3));
> +  *y = bar (*p);
> +}
>
>
> 	Jakub
On Tue, Mar 12, 2024 at 01:47:53PM +0100, Richard Biener wrote:
> > Admittedly the above is the ugliest part of the patch IMHO.
> > It isn't needed in all cases, but e.g. for the pr112709-2.c (qux) case
> > we have before ubsan pass
> >   # _7(ab) = PHI <_20(9), _8(ab)(11)>
> >   _22 = bar (*_7(ab));
> > and want to instrument the *_7(ab) load among the parameters for object
> > size.
> > So, it wants to add
> >   _2 = __builtin_dynamic_object_size (_7(ab), 0);
> > sequence and later:
> >   .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0);
> > before the call insn.  If it isn't a noreturn call, it would just
> > add those statements into the same basic block using gsi_insert*before.
> > But because it is a returns_twice call, it needs to be done in a different
> > block.  And having all of ubsan/asan/bitintlower find that out first and
> > figure out that it should use _20 instead of _7(ab) seems hard, especially
> > that for cases like:
>
> I would have expected the result as
>
>   # _7 = PHI <_20(9)>
>   _2 = __builtin_dynamic_object_size (_7, 0);
>   .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0);
> // fallthru
>
>   # _99(ab) = PHI <_7, _8(ab)(11)>
>   _22 = bar (*_99(ab));
>
> where all uses of _7 in the IL before splitting get replaced via
> their immediate uses but the to-be inserted IL can continue using
> the defs (that requires the not inserted IL to not have update_stmt
> called on them, of course).
>
> I think split_block would have the PHIs organized that way already,
> you are adding the _99 defs anyway, right?  That is,
>
> +	  tree new_lhs = copy_ssa_name (lhs);
> +	  gimple_phi_set_result (phi, new_lhs);
> +	  gphi *new_phi = create_phi_node (lhs, other_edge->dest);
>
> here you swap the defs, I'd omit that.  And instead essentially do
> replace_uses_by (lhs, new_lhs) here (without the folding and stuff).
> You have to clear SSA_NAME_OCCURS_IN_ABNORMAL_PHI on the old
> and set it on the new LHS if it was set.
>
> I think that should contain the "SSA update" to
> edge_before_returns_twice_call for the case it splits the block?

I have considered that, but there are problems with that.

The most important is that in the most common case, there is no
block splitting nor any edge splitting, all we have is that
  # _99(ab) = PHI <_7(6), _8(ab)(11)>
  _22 = bar (*_99(ab));
or similar.  And we can do several insertions of statements or sequences
before the bar call.  If it is not returns_twice or even not a call,
we want to use _99(ab) in there and insert just before the call/statement.
If it is returns_twice call, we want to replace the _99(ab) uses with
something that is actually usable on the 6->10 edge.
So, if we wanted to keep using the _99(ab) (wouldn't it be even wrong
that it is (ab)?), we'd need to essentially on every addition
add statements like
  _99(ab) = _7;
(for each PHI node except the virtual one) before the added statement or
statements (or add it only if used in the statement/sequence?) and
change the PHI to be
  # _100(ab) = PHI <_99(ab)(6), _8(ab)(11)>
and replace all uses:
  _22 = bar (*_100(ab));
Right now ubsan can insert something like 3 statements/sequences per
argument, so there can be thousands of them and the block could have
thousands of PHIs, so that is millions of useless SSA_NAME copies.

So, the intention of edge_before_returns_twice_call is just that
it in the common case just finds the non-EDGE_ABNORMAL edge if there is one,
if there isn't just one, it adjusts the IL such that there is just one.
And then the next step is to handle that case.

	Jakub
On Tue, 12 Mar 2024, Jakub Jelinek wrote:

> On Tue, Mar 12, 2024 at 01:47:53PM +0100, Richard Biener wrote:
> > > Admittedly the above is the ugliest part of the patch IMHO.
> > > It isn't needed in all cases, but e.g. for the pr112709-2.c (qux) case
> > > we have before ubsan pass
> > >   # _7(ab) = PHI <_20(9), _8(ab)(11)>
> > >   _22 = bar (*_7(ab));
> > > and want to instrument the *_7(ab) load among the parameters for object
> > > size.
> > > So, it wants to add
> > >   _2 = __builtin_dynamic_object_size (_7(ab), 0);
> > > sequence and later:
> > >   .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0);
> > > before the call insn.  If it isn't a noreturn call, it would just
> > > add those statements into the same basic block using gsi_insert*before.
> > > But because it is a returns_twice call, it needs to be done in a different
> > > block.  And having all of ubsan/asan/bitintlower find that out first and
> > > figure out that it should use _20 instead of _7(ab) seems hard, especially
> > > that for cases like:
> >
> > I would have expected the result as
> >
> >   # _7 = PHI <_20(9)>
> >   _2 = __builtin_dynamic_object_size (_7, 0);
> >   .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0);
> > // fallthru
> >
> >   # _99(ab) = PHI <_7, _8(ab)(11)>
> >   _22 = bar (*_99(ab));
> >
> > where all uses of _7 in the IL before splitting get replaced via
> > their immediate uses but the to-be inserted IL can continue using
> > the defs (that requires the not inserted IL to not have update_stmt
> > called on them, of course).
> >
> > I think split_block would have the PHIs organized that way already,
> > you are adding the _99 defs anyway, right?  That is,
> >
> > +	  tree new_lhs = copy_ssa_name (lhs);
> > +	  gimple_phi_set_result (phi, new_lhs);
> > +	  gphi *new_phi = create_phi_node (lhs, other_edge->dest);
> >
> > here you swap the defs, I'd omit that.  And instead essentially do
> > replace_uses_by (lhs, new_lhs) here (without the folding and stuff).
> > You have to clear SSA_NAME_OCCURS_IN_ABNORMAL_PHI on the old
> > and set it on the new LHS if it was set.
> >
> > I think that should contain the "SSA update" to
> > edge_before_returns_twice_call for the case it splits the block?
>
> I have considered that, but there are problems with that.
>
> The most important is that in the most common case, there is no
> block splitting nor any edge splitting, all we have is that
>   # _99(ab) = PHI <_7(6), _8(ab)(11)>
>   _22 = bar (*_99(ab));
> or similar.  And we can do several insertions of statements or sequences
> before the bar call.  If it is not returns_twice or even not a call,
> we want to use _99(ab) in there and insert just before the call/statement.
> If it is returns_twice call, we want to replace the _99(ab) uses with
> something that is actually usable on the 6->10 edge.
> So, if we wanted to keep using the _99(ab) (wouldn't it be even wrong
> that it is (ab)?), we'd need to essentially on every addition
> add statements like
>   _99(ab) = _7;
> (for each PHI node except the virtual one) before the added statement or
> statements (or add it only if used in the statement/sequence?) and
> change the PHI to be
>   # _100(ab) = PHI <_99(ab)(6), _8(ab)(11)>
> and replace all uses:
>   _22 = bar (*_100(ab));
> Right now ubsan can insert something like 3 statements/sequences per
> argument, so there can be thousands of them and the block could have
> thousands of PHIs, so that is millions of useless SSA_NAME copies.

Ah, yeah, I see :/

> So, the intention of edge_before_returns_twice_call is just that
> it in the common case just finds the non-EDGE_ABNORMAL edge if there is one,
> if there isn't just one, it adjusts the IL such that there is just one.
> And then the next step is to handle that case.

So I guess the updated patch is OK then.

Thanks,
Richard.
--- gcc/gimple-iterator.h.jj	2024-02-08 11:17:37.144661383 +0100
+++ gcc/gimple-iterator.h	2024-03-11 16:16:24.670464737 +0100
@@ -93,6 +93,10 @@ extern void gsi_insert_on_edge (edge, gi
 extern void gsi_insert_seq_on_edge (edge, gimple_seq);
 extern basic_block gsi_insert_on_edge_immediate (edge, gimple *);
 extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq);
+extern basic_block gsi_insert_before_returns_twice_call (basic_block,
+							 gimple *);
+extern basic_block gsi_insert_seq_before_returns_twice_call (basic_block,
+							     gimple_seq);
 extern void gsi_commit_edge_inserts (void);
 extern void gsi_commit_one_edge_insert (edge, basic_block *);
 extern gphi_iterator gsi_start_phis (basic_block);
--- gcc/gimple-iterator.cc.jj	2024-02-08 11:17:37.144661383 +0100
+++ gcc/gimple-iterator.cc	2024-03-11 18:17:08.994804808 +0100
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.
 #include "tree-cfg.h"
 #include "tree-ssa.h"
 #include "value-prof.h"
+#include "gimplify.h"
 
 /* Mark the statement STMT as modified, and update it.  */
 
@@ -944,3 +945,135 @@ gsi_start_phis (basic_block bb)
 
   return i;
 }
+
+/* Helper function for gsi_insert_before_returns_twice_call and
+   gsi_insert_seq_before_returns_twice_call.  Find edge to
+   insert statements before returns_twice call at the start of BB,
+   if there isn't just one, split the bb and adjust PHIs to ensure
+   that.  */
+
+static edge
+edge_before_returns_twice_call (basic_block bb)
+{
+  gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb);
+  gcc_checking_assert (is_gimple_call (gsi_stmt (gsi))
+		       && (gimple_call_flags (gsi_stmt (gsi))
+			   & ECF_RETURNS_TWICE) != 0);
+  edge_iterator ei;
+  edge e, ad_edge = NULL, other_edge = NULL;
+  bool split = false;
+  /* Look for a fallthru and possibly a single backedge.  */
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      if ((e->flags & (EDGE_ABNORMAL | EDGE_EH)) == EDGE_ABNORMAL)
+	{
+	  gimple_stmt_iterator gsi
+	    = gsi_start_nondebug_after_labels_bb (e->src);
+	  gimple *ad = gsi_stmt (gsi);
+	  if (ad && gimple_call_internal_p (ad, IFN_ABNORMAL_DISPATCHER))
+	    {
+	      gcc_checking_assert (ad_edge == NULL);
+	      ad_edge = e;
+	      continue;
+	    }
+	}
+      if (other_edge || e->flags & (EDGE_ABNORMAL | EDGE_EH))
+	split = true;
+      other_edge = e;
+    }
+  gcc_checking_assert (ad_edge);
+  if (other_edge == NULL)
+    split = true;
+  if (split)
+    {
+      other_edge = split_block_after_labels (bb);
+      e = make_edge (ad_edge->src, other_edge->dest, EDGE_ABNORMAL);
+      for (gphi_iterator gsi = gsi_start_phis (other_edge->src);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gphi *phi = gsi.phi ();
+	  tree lhs = gimple_phi_result (phi);
+	  tree new_lhs;
+	  if (virtual_operand_p (lhs))
+	    new_lhs = make_ssa_name (SSA_NAME_VAR (lhs));
+	  else
+	    new_lhs = make_ssa_name (TREE_TYPE (lhs));
+	  gimple_phi_set_result (phi, new_lhs);
+	  gphi *new_phi = create_phi_node (lhs, other_edge->dest);
+	  add_phi_arg (new_phi, new_lhs, other_edge, UNKNOWN_LOCATION);
+	  add_phi_arg (new_phi, gimple_phi_arg_def_from_edge (phi, ad_edge),
+		       e, gimple_phi_arg_location_from_edge (phi, ad_edge));
+	}
+      remove_edge (ad_edge);
+    }
+  return other_edge;
+}
+
+/* Insert G before BB which starts with an ECF_RETURNS_TWICE call.
+   If BB has a single predecessor edge except for edge from
+   .ABNORMAL_DISPATCHER, insert on that edge, otherwise split
+   BB before the call, adjust PHIs and insert into the new BB.
+   If G refers to any results of BB's PHIs, replace them afterwards
+   with corresponding PHI arg.  */
+
+basic_block
+gsi_insert_before_returns_twice_call (basic_block bb, gimple *g)
+{
+  edge e = edge_before_returns_twice_call (bb);
+  basic_block new_bb = gsi_insert_on_edge_immediate (e, g);
+  use_operand_p use_p;
+  ssa_op_iter iter;
+  bool m = false;
+  FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE)
+    {
+      tree s = USE_FROM_PTR (use_p);
+      if (SSA_NAME_DEF_STMT (s)
+	  && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI
+	  && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest)
+	{
+	  tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e);
+	  SET_USE (use_p, unshare_expr (r));
+	  m = true;
+	}
+    }
+  if (m)
+    update_stmt (g);
+  return new_bb;
+}
+
+/* Similarly for sequence SEQ.  */
+
+basic_block
+gsi_insert_seq_before_returns_twice_call (basic_block bb, gimple_seq seq)
+{
+  if (gimple_seq_empty_p (seq))
+    return NULL;
+  gimple *f = gimple_seq_first_stmt (seq);
+  gimple *l = gimple_seq_last_stmt (seq);
+  edge e = edge_before_returns_twice_call (bb);
+  basic_block new_bb = gsi_insert_seq_on_edge_immediate (e, seq);
+  use_operand_p use_p;
+  ssa_op_iter iter;
+  for (gimple_stmt_iterator gsi = gsi_for_stmt (f); ; gsi_next (&gsi))
+    {
+      gimple *g = gsi_stmt (gsi);
+      bool m = false;
+      FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, SSA_OP_USE)
+	{
+	  tree s = USE_FROM_PTR (use_p);
+	  if (SSA_NAME_DEF_STMT (s)
+	      && gimple_code (SSA_NAME_DEF_STMT (s)) == GIMPLE_PHI
+	      && gimple_bb (SSA_NAME_DEF_STMT (s)) == e->dest)
+	    {
+	      tree r = gimple_phi_arg_def_from_edge (SSA_NAME_DEF_STMT (s), e);
+	      SET_USE (use_p, unshare_expr (r));
+	      m = true;
+	    }
+	}
+      if (m)
+	update_stmt (g);
+      if (g == l)
+	break;
+    }
+  return new_bb;
+}
--- gcc/ubsan.cc.jj	2024-01-03 11:51:32.312720350 +0100
+++ gcc/ubsan.cc	2024-03-11 17:07:58.104243341 +0100
@@ -1432,6 +1432,37 @@ ubsan_expand_vptr_ifn (gimple_stmt_itera
   return true;
 }
 
+/* Insert G stmt before ITER.  If ITER is a returns_twice call,
+   insert it on an appropriate edge instead.  */
+
+static void
+ubsan_insert_before (gimple_stmt_iterator *iter, gimple *g)
+{
+  gimple *stmt = gsi_stmt (*iter);
+  if (stmt
+      && is_gimple_call (stmt)
+      && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0)
+    gsi_insert_before_returns_twice_call (gsi_bb (*iter), g);
+  else
+    gsi_insert_before (iter, g, GSI_SAME_STMT);
+}
+
+/* Similarly for sequence SEQ.  */
+
+static void
+ubsan_insert_seq_before (gimple_stmt_iterator *iter, gimple_seq seq)
+{
+  if (gimple_seq_empty_p (seq))
+    return;
+  gimple *stmt = gsi_stmt (*iter);
+  if (stmt
+      && is_gimple_call (stmt)
+      && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) != 0)
+    gsi_insert_seq_before_returns_twice_call (gsi_bb (*iter), seq);
+  else
+    gsi_insert_seq_before (iter, seq, GSI_SAME_STMT);
+}
+
 /* Instrument a memory reference.  BASE is the base of MEM, IS_LHS says
    whether the pointer is on the left hand side of the assignment.  */
 
@@ -1458,7 +1489,7 @@ instrument_mem_ref (tree mem, tree base,
   tree alignt = build_int_cst (pointer_sized_int_node, align);
   gcall *g = gimple_build_call_internal (IFN_UBSAN_NULL, 3, t, kind, alignt);
   gimple_set_location (g, gimple_location (gsi_stmt (*iter)));
-  gsi_insert_before (iter, g, GSI_SAME_STMT);
+  ubsan_insert_before (iter, g);
 }
 
 /* Perform the pointer instrumentation.  */
@@ -1485,7 +1516,7 @@ instrument_pointer_overflow (gimple_stmt
     return;
   gcall *g = gimple_build_call_internal (IFN_UBSAN_PTR, 2, ptr, off);
   gimple_set_location (g, gimple_location (gsi_stmt (*gsi)));
-  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+  ubsan_insert_before (gsi, g);
 }
 
 /* Instrument pointer arithmetics if any.  */
@@ -1577,10 +1608,11 @@ maybe_instrument_pointer_overflow (gimpl
       else
	t = fold_convert (sizetype, moff);
     }
-  t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true,
-				GSI_SAME_STMT);
-  base_addr = force_gimple_operand_gsi (gsi, base_addr, true, NULL_TREE, true,
-					GSI_SAME_STMT);
+  gimple_seq seq, this_seq;
+  t = force_gimple_operand (t, &seq, true, NULL_TREE);
+  base_addr = force_gimple_operand (base_addr, &this_seq, true, NULL_TREE);
+  gimple_seq_add_seq_without_update (&seq, this_seq);
+  ubsan_insert_seq_before (gsi, seq);
   instrument_pointer_overflow (gsi, base_addr, t);
 }
 
@@ -2035,7 +2067,7 @@ instrument_nonnull_arg (gimple_stmt_iter
	    {
	      g = gimple_build_assign (make_ssa_name (TREE_TYPE (arg)), arg);
	      gimple_set_location (g, loc[0]);
-	      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+	      ubsan_insert_before (gsi, g);
	      arg = gimple_assign_lhs (g);
	    }
 
@@ -2068,7 +2100,7 @@ instrument_nonnull_arg (gimple_stmt_iter
	      g = gimple_build_call (fn, 1, data);
	    }
	  gimple_set_location (g, loc[0]);
-	  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+	  ubsan_insert_before (gsi, g);
	  ubsan_create_edge (g);
	}
      *gsi = gsi_for_stmt (stmt);
@@ -2124,7 +2156,7 @@ instrument_nonnull_return (gimple_stmt_i
	  g = gimple_build_call (fn, 2, data, data2);
	}
      gimple_set_location (g, loc[0]);
-      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      ubsan_insert_before (gsi, g);
      ubsan_create_edge (g);
      *gsi = gsi_for_stmt (stmt);
    }
@@ -2231,6 +2263,7 @@ instrument_object_size (gimple_stmt_iter
   tree sizet;
   tree base_addr = base;
   gimple *bos_stmt = NULL;
+  gimple_seq seq = NULL;
   if (decl_p)
     base_addr = build1 (ADDR_EXPR,
			build_pointer_type (TREE_TYPE (base)), base);
@@ -2244,19 +2277,12 @@ instrument_object_size (gimple_stmt_iter
       sizet = builtin_decl_explicit (BUILT_IN_DYNAMIC_OBJECT_SIZE);
       sizet = build_call_expr_loc (loc, sizet, 2, base_addr,
				    integer_zero_node);
-      sizet = force_gimple_operand_gsi (gsi, sizet, false, NULL_TREE, true,
-					GSI_SAME_STMT);
+      sizet = force_gimple_operand (sizet, &seq, false, NULL_TREE);
       /* If the call above didn't end up being an integer constant, go one
	 statement back and get the __builtin_object_size stmt.  Save it,
	 we might need it later.  */
       if (SSA_VAR_P (sizet))
-	{
-	  gsi_prev (gsi);
-	  bos_stmt = gsi_stmt (*gsi);
-
-	  /* Move on to where we were.  */
-	  gsi_next (gsi);
-	}
+	bos_stmt = gsi_stmt (gsi_last (seq));
     }
   else
     return;
@@ -2298,21 +2324,24 @@ instrument_object_size (gimple_stmt_iter
      && !TREE_ADDRESSABLE (base))
    mark_addressable (base);
 
+  /* We have to emit the check.  */
+  gimple_seq this_seq;
+  t = force_gimple_operand (t, &this_seq, true, NULL_TREE);
+  gimple_seq_add_seq_without_update (&seq, this_seq);
+  ptr = force_gimple_operand (ptr, &this_seq, true, NULL_TREE);
+  gimple_seq_add_seq_without_update (&seq, this_seq);
+  ubsan_insert_seq_before (gsi, seq);
+
   if (bos_stmt
      && gimple_call_builtin_p (bos_stmt, BUILT_IN_DYNAMIC_OBJECT_SIZE))
    ubsan_create_edge (bos_stmt);
 
-  /* We have to emit the check.  */
-  t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true,
-				GSI_SAME_STMT);
-  ptr = force_gimple_operand_gsi (gsi, ptr, true, NULL_TREE, true,
-				  GSI_SAME_STMT);
   tree ckind = build_int_cst (unsigned_char_type_node,
			      is_lhs ? UBSAN_STORE_OF : UBSAN_LOAD_OF);
   gimple *g = gimple_build_call_internal (IFN_UBSAN_OBJECT_SIZE, 4,
					  ptr, t, sizet, ckind);
   gimple_set_location (g, loc);
-  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+  ubsan_insert_before (gsi, g);
 }
 
 /* Instrument values passed to builtin functions.  */
--- gcc/testsuite/gcc.dg/ubsan/pr112709-1.c.jj	2024-03-11 16:55:09.038760951 +0100
+++ gcc/testsuite/gcc.dg/ubsan/pr112709-1.c	2024-03-11 16:55:02.159854950 +0100
@@ -0,0 +1,52 @@
+/* PR sanitizer/112709 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined -O2" } */
+
+struct S { char c[1024]; };
+int foo (int);
+
+__attribute__((returns_twice, noipa)) struct S
+bar (int x)
+{
+  (void) x;
+  struct S s = {};
+  s.c[42] = 42;
+  return s;
+}
+
+void
+baz (struct S *p)
+{
+  foo (1);
+  *p = bar (0);
+}
+
+void
+qux (int x, struct S *p)
+{
+  if (x == 25)
+    x = foo (2);
+  else if (x == 42)
+    x = foo (foo (3));
+  *p = bar (x);
+}
+
+void
+corge (int x, struct S *p)
+{
+  void *q[] = { &&l1, &&l2, &&l3, &&l3 };
+  if (x == 25)
+    {
+    l1:
+      x = foo (2);
+    }
+  else if (x == 42)
+    {
+    l2:
+      x = foo (foo (3));
+    }
+l3:
+  *p = bar (x);
+  if (x < 4)
+    goto *q[x & 3];
+}
--- gcc/testsuite/gcc.dg/ubsan/pr112709-2.c.jj	2024-03-11 16:55:37.000378840 +0100
+++ gcc/testsuite/gcc.dg/ubsan/pr112709-2.c	2024-03-11 17:13:37.517599492 +0100
@@ -0,0 +1,50 @@
+/* PR sanitizer/112709 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined -O2" } */
+
+struct S { char c[1024]; } *p;
+int foo (int);
+
+__attribute__((returns_twice, noipa)) int
+bar (struct S x)
+{
+  (void) x.c[0];
+  return 0;
+}
+
+void
+baz (int *y)
+{
+  foo (1);
+  *y = bar (*p);
+}
+
+void
+qux (int x, int *y)
+{
+  if (x == 25)
+    x = foo (2);
+  else if (x == 42)
+    x = foo (foo (3));
+  *y = bar (*p);
+}
+
+void
+corge (int x, int *y)
+{
+  void *q[] = { &&l1, &&l2, &&l3, &&l3 };
+  if (x == 25)
+    {
+    l1:
+      x = foo (2);
+    }
+  else if (x == 42)
+    {
+    l2:
+      x = foo (foo (3));
+    }
+l3:
+  *y = bar (*p);
+  if (x < 4)
+    goto *q[x & 3];
+}