From patchwork Thu May 26 13:43:45 2011
From: Michael Matz
To: gcc-patches@gcc.gnu.org
Date: Thu, 26 May 2011 15:43:45 +0200 (CEST)
Subject: RFC: explicitely mark out-of-scope deaths

Hi,

we discussed this on IRC; here's the RFC patch.  It bootstraps but causes
some minor regressions, most probably due to checks for the special clobber
insns still missing in a few places, and sometimes due to testsuite regexps
that need adjusting.

Anyway, stack slot sharing currently uses the heuristic that variables
declared in non-intersecting scopes can share their stack space.  Code
movement transformations make this heuristic invalid, and we're lucky that
we're hit by problems only very seldom.

The solution is to make the border of liveness of local variables (at least
those that aren't gimple registers, as the latter are moved to the outermost
scope anyway and hence all conflict) explicit in the intermediate
representation, either preventing invalid code movement or updating the IR
to reflect such movement.

The idea is to explicitly mark the death of each such variable with a
special instruction whose interpretation is "clobber all space associated
with this entity".  We nearly have such instructions already, namely simple
stores with an empty constructor on the RHS.  We just have to mark them in a
way that distinguishes them from normal such stores (which are interpreted
as filling the LHS with zero bytes).  I chose volatility as that marker (it
normally doesn't make sense on a CONSTRUCTOR node).

Stores are better than builtin functions here, so that we don't artificially
take the addresses of the decls in question.  They are also better than a
completely new kind of instruction, as stores are already handled by all
passes, and the semantics match quite well if you consider the RHS to be
undefined.  Such a store acts as a code movement barrier, or alternatively
can be moved together with other instructions.  It can also be duplicated,
and generally transforms naturally under any other transformation we want to
do.  The live area of a local decl is then defined as the region between the
first mention of the decl's name and those clobber instructions.
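To make this concrete, here is a small made-up C example (not taken from the
patch or its testsuite), annotated with where the clobbers would be emitted;
the "a ={v} {};" form follows the dump syntax mentioned in the tree-stdarg.c
hunk below:

    extern int use (char *);

    int
    foo (void)
    {
      int r = 0;
      {
        char a[64];
        r += use (a);
      }
      /* clobber expected here:  a ={v} {};  */
      {
        char b[64];   /* may share a's stack slot */
        r += use (b);
      }
      /* clobber expected here:  b ={v} {};  */
      return r;
    }

The live area of 'a' runs from its first mention down to its clobber; a pass
that moves a use of 'b' above the clobber of 'a' has to move (or duplicate)
that clobber as well, so the resulting overlap stays visible when computing
the conflicts for stack slot sharing.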
This is conservative in that we overestimate the start of life: it begins at
the first mention of the decl's name (which could be merely taking its
address) rather than at the first store to it, or at the start of its scope,
which we could identify by a similar new instruction.  A simple bitmap
dataflow algorithm can be used to compute this area, and from it the
conflicts for stack slot sharing.

The nice thing about explicit death markers is that we could also use them
to implement a warning for cases where addresses of out-of-scope variables
definitely or possibly escape past their point of death.

For a full solution these death points need to be reflected in RTL too, so
that the scheduler doesn't reorder instructions across them.  Alternatively
we can create a conflict between all alias sets sharing one stack slot (and
then have to disable object-based disambiguation).  I haven't yet
investigated whether we could perhaps just use the CLOBBER rtx for that.
The patch doesn't remove the current code for generating conflicts, but it
does disable it.

So, I'm asking for comments, in particular about my choice of an empty
volatile CONSTRUCTOR as the marker; perhaps someone has a nicer idea.
(Note the remarks above, though, and also that these clobbers must not be
removed as useless, otherwise we get less sharing, though thankfully not
wrong code at least.)


Ciao,
Michael.

Index: gengtype.c
===================================================================
--- gengtype.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ gengtype.c	2011-05-26 14:15:41.000000000 +0200
@@ -3613,14 +3613,13 @@ write_field_root (outf_p f, pair_p v, ty
 		  int has_length, struct fileloc *line, const char *if_marked,
 		  bool emit_pch, type_p field_type, const char *field_name)
 {
+  struct pair newv;
   /* If the field reference is relative to V, rather than to some
      subcomponent of V, we can mark any subarrays with a single stride.
      We're effectively treating the field as a global variable in its
      own right.  */
   if (v && type == v->type)
     {
-      struct pair newv;
-
       newv = *v;
       newv.type = field_type;
       newv.name = ACONCAT ((v->name, ".", field_name, NULL));
Index: tree-stdarg.c
===================================================================
--- tree-stdarg.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ tree-stdarg.c	2011-05-26 14:15:41.000000000 +0200
@@ -872,8 +872,12 @@ execute_optimize_stdarg (void)
 	      if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
 		  == GIMPLE_SINGLE_RHS)
 		{
+		  /* Check for ap ={v} {}.  */
+		  if (TREE_CODE (rhs) == CONSTRUCTOR
+		      && TREE_THIS_VOLATILE (rhs))
+		    continue;
 		  /* Check for ap[0].field = temp.  */
-		  if (va_list_counter_struct_op (&si, lhs, rhs, true))
+		  else if (va_list_counter_struct_op (&si, lhs, rhs, true))
 		    continue;

 		  /* Check for temp = ap[0].field.  */
Index: gimplify.c
===================================================================
--- gimplify.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ gimplify.c	2011-05-26 14:15:41.000000000 +0200
@@ -1127,7 +1127,8 @@ gimplify_bind_expr (tree *expr_p, gimple
   bool old_save_stack = gimplify_ctxp->save_stack;
   tree t;
   gimple gimple_bind;
-  gimple_seq body;
+  gimple_seq body, cleanup;
+  gimple stack_save;

   tree temp = voidify_wrapper_expr (bind_expr, NULL);

@@ -1173,22 +1174,48 @@ gimplify_bind_expr (tree *expr_p, gimple
   gimplify_stmt (&BIND_EXPR_BODY (bind_expr), &body);
   gimple_bind_set_body (gimple_bind, body);

+  cleanup = NULL;
+  stack_save = NULL;
   if (gimplify_ctxp->save_stack)
     {
-      gimple stack_save, stack_restore, gs;
-      gimple_seq cleanup, new_body;
+      gimple stack_restore;

       /* Save stack on entry and restore it on exit.  Add a try_finally
	 block to achieve this.  Note that mudflap depends on the
	 format of the emitted code: see mx_register_decls().  */
       build_stack_save_restore (&stack_save, &stack_restore);

-      cleanup = new_body = NULL;
       gimplify_seq_add_stmt (&cleanup, stack_restore);
+    }
+
+  /* Add kills for all variables that go out of scope.  */
+  for (t = BIND_EXPR_VARS (bind_expr); t ; t = DECL_CHAIN (t))
+    {
+      if (TREE_CODE (t) == VAR_DECL
+	  && !is_global_var (t)
+	  && !DECL_HARD_REGISTER (t)
+	  && !TREE_THIS_VOLATILE (t)
+	  && !DECL_HAS_VALUE_EXPR_P (t)
+	  && 1
+	  && needs_to_live_in_memory (t))
+	{
+	  tree clobber = build_constructor (TREE_TYPE (t), NULL);
+	  TREE_THIS_VOLATILE (clobber) = 1;
+	  gimplify_seq_add_stmt (&cleanup, gimple_build_assign (t, clobber));
+	}
+    }
+
+  if (cleanup)
+    {
+      gimple gs;
+      gimple_seq new_body;
+
+      new_body = NULL;
       gs = gimple_build_try (gimple_bind_body (gimple_bind), cleanup,
			     GIMPLE_TRY_FINALLY);

-      gimplify_seq_add_stmt (&new_body, stack_save);
+      if (stack_save)
+	gimplify_seq_add_stmt (&new_body, stack_save);
       gimplify_seq_add_stmt (&new_body, gs);
       gimple_bind_set_body (gimple_bind, new_body);
     }
Index: cfgexpand.c
===================================================================
--- cfgexpand.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ cfgexpand.c	2011-05-26 14:15:41.000000000 +0200
@@ -182,6 +182,7 @@ struct stack_var
 static struct stack_var *stack_vars;
 static size_t stack_vars_alloc;
 static size_t stack_vars_num;
+static struct pointer_map_t *decl_to_stack_part;

 /* An array of indices such that stack_vars[stack_vars_sorted[i]].size
    is non-decreasing.  */
@@ -260,7 +261,11 @@ add_stack_var (tree decl)
       stack_vars
	= XRESIZEVEC (struct stack_var, stack_vars, stack_vars_alloc);
     }
+  if (!decl_to_stack_part)
+    decl_to_stack_part = pointer_map_create ();
+
   v = &stack_vars[stack_vars_num];
+  * (size_t *)pointer_map_insert (decl_to_stack_part, decl) = stack_vars_num;

   v->decl = decl;
   v->size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1);
@@ -305,6 +310,14 @@ stack_var_conflict_p (size_t x, size_t y
 {
   struct stack_var *a = &stack_vars[x];
   struct stack_var *b = &stack_vars[y];
+  if (x == y)
+    return false;
+  /* Partitions containing an SSA name result from gimple registers
+     with things like unsupported modes.  They are top-level and
+     hence conflict with everything else.  */
+  if (TREE_CODE (a->decl) == SSA_NAME || TREE_CODE (b->decl) == SSA_NAME)
+    return true;
+
   if (!a->conflicts || !b->conflicts)
     return false;
   return bitmap_bit_p (a->conflicts, y);
@@ -375,6 +388,159 @@ add_alias_set_conflicts (void)
     }
 }

+static bool
+visit_op (gimple stmt ATTRIBUTE_UNUSED, tree op, void *data)
+{
+  bitmap active = (bitmap)data;
+  op = get_base_address (op);
+  if (op
+      && DECL_P (op)
+      && DECL_RTL_IF_SET (op) == pc_rtx)
+    {
+      size_t *v = (size_t *) pointer_map_contains (decl_to_stack_part, op);
+      if (v)
+	bitmap_set_bit (active, *v);
+    }
+  return false;
+}
+
+static bool
+visit_conflict (gimple stmt ATTRIBUTE_UNUSED, tree op, void *data)
+{
+  bitmap active = (bitmap)data;
+  op = get_base_address (op);
+  if (op
+      && DECL_P (op)
+      && DECL_RTL_IF_SET (op) == pc_rtx)
+    {
+      size_t *v =
+	(size_t *) pointer_map_contains (decl_to_stack_part, op);
+      if (v && bitmap_set_bit (active, *v))
+	{
+	  size_t num = *v;
+	  bitmap_iterator bi;
+	  unsigned i;
+	  gcc_assert (num < stack_vars_num);
+	  EXECUTE_IF_SET_IN_BITMAP (active, 0, i, bi)
+	    add_stack_var_conflict (num, i);
+	}
+    }
+  return false;
+}
+
+static void
+add_scope_conflicts (void)
+{
+  basic_block bb;
+  bool changed;
+  gimple_stmt_iterator gsi;
+  bitmap work = BITMAP_ALLOC (NULL);
+
+  FOR_ALL_BB (bb)
+    bb->aux = BITMAP_ALLOC (NULL);
+
+  changed = true;
+  while (changed)
+    {
+      changed = false;
+      FOR_EACH_BB (bb)
+	{
+	  bitmap active = (bitmap)bb->aux;
+	  edge e;
+	  edge_iterator ei;
+	  bitmap_clear (work);
+	  FOR_EACH_EDGE (e, ei, bb->preds)
+	    bitmap_ior_into (work, (bitmap)e->src->aux);
+
+	  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	    {
+	      gimple stmt = gsi_stmt (gsi);
+	      if (!is_gimple_debug (stmt))
+		walk_stmt_load_store_addr_ops (stmt, work, visit_op,
+					       visit_op, visit_op);
+	    }
+	  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	    {
+	      gimple stmt = gsi_stmt (gsi);
+
+	      if (gimple_assign_single_p (stmt)
+		  && gimple_assign_rhs_code (stmt) == CONSTRUCTOR
+		  && TREE_THIS_VOLATILE (gimple_assign_rhs1 (stmt)))
+		{
+		  tree lhs = gimple_assign_lhs (stmt);
+		  size_t *v;
+		  /* Nested function lowering might introduce LHSs
+		     that are COMPONENT_REFs.  */
+		  if (TREE_CODE (lhs) != VAR_DECL)
+		    continue;
+		  if (DECL_RTL_IF_SET (lhs) == pc_rtx
+		      && (v = (size_t *)
+			  pointer_map_contains (decl_to_stack_part, lhs)))
+		    bitmap_clear_bit (work, *v);
+		}
+	      else if (!is_gimple_debug (stmt))
+		walk_stmt_load_store_addr_ops (stmt, work, visit_op,
+					       visit_op, visit_op);
+	    }
+	  if (bitmap_ior_into (active, work))
+	    changed = true;
+	}
+    }
+
+  FOR_EACH_BB (bb)
+    {
+      edge e;
+      edge_iterator ei;
+      bitmap_iterator bi;
+      unsigned i;
+
+      bitmap_clear (work);
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	bitmap_ior_into (work, (bitmap)e->src->aux);
+
+      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+	{
+	  unsigned j;
+	  bitmap_iterator bj;
+	  EXECUTE_IF_SET_IN_BITMAP (work, i, j, bj)
+	    add_stack_var_conflict (i, j);
+	}
+
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  if (!is_gimple_debug (stmt))
+	    walk_stmt_load_store_addr_ops (stmt, work, visit_conflict,
+					   visit_conflict, visit_conflict);
+	}
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+
+	  if (gimple_assign_single_p (stmt)
+	      && gimple_assign_rhs_code (stmt) == CONSTRUCTOR
+	      && TREE_THIS_VOLATILE (gimple_assign_rhs1 (stmt)))
+	    {
+	      tree lhs = gimple_assign_lhs (stmt);
+	      size_t *v;
+	      if (TREE_CODE (lhs) != VAR_DECL)
+		continue;
+	      if (DECL_RTL_IF_SET (lhs) == pc_rtx
+		  && (v = (size_t *)
+		      pointer_map_contains (decl_to_stack_part, lhs)))
+		bitmap_clear_bit (work, *v);
+	    }
+	  else if (!is_gimple_debug (stmt))
+	    walk_stmt_load_store_addr_ops (stmt, work, visit_conflict,
+					   visit_conflict, visit_conflict);
+	}
+    }
+
+  BITMAP_FREE (work);
+  FOR_ALL_BB (bb)
+    BITMAP_FREE (bb->aux);
+}
+
 /* A subroutine of partition_stack_vars.  A comparison function for qsort,
    sorting an array of indices by the properties of the object.  */

@@ -1111,7 +1277,7 @@ expand_used_vars_for_block (tree block, 
      possible for variables whose address escapes), we mirror the block
      tree in the interference graph.  Here we cause all variables at this
      level, and all sublevels, to conflict.  */
-  if (old_sv_num < this_sv_num)
+  if (0 && old_sv_num < this_sv_num)
     {
       new_sv_num = stack_vars_num;

@@ -1304,6 +1470,8 @@ fini_vars_expansion (void)
   XDELETEVEC (stack_vars_sorted);
   stack_vars = NULL;
   stack_vars_alloc = stack_vars_num = 0;
+  pointer_map_destroy (decl_to_stack_part);
+  decl_to_stack_part = NULL;
 }

 /* Make a fair guess for the size of the stack frame of the function
@@ -1458,6 +1626,7 @@ expand_used_vars (void)

   if (stack_vars_num > 0)
     {
+      add_scope_conflicts ();
       /* Due to the way alias sets work, no variables with non-conflicting
	 alias sets may be assigned the same address.  Add conflicts to
	 reflect this.  */
@@ -1950,8 +2119,14 @@ expand_gimple_stmt_1 (gimple stmt)
		    == GIMPLE_SINGLE_RHS);
	if (gimple_has_location (stmt) && CAN_HAVE_LOCATION_P (rhs))
	  SET_EXPR_LOCATION (rhs, gimple_location (stmt));
-	expand_assignment (lhs, rhs,
-			   gimple_assign_nontemporal_move_p (stmt));
+	if (TREE_CODE (rhs) == CONSTRUCTOR
+	    && TREE_THIS_VOLATILE (rhs))
+	  /* This is a clobber to mark the going out of scope for
+	     this LHS.  */
+	  ;
+	else
+	  expand_assignment (lhs, rhs,
+			     gimple_assign_nontemporal_move_p (stmt));
       }
     else
       {
Index: tree-ssa-live.c
===================================================================
--- tree-ssa-live.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ tree-ssa-live.c	2011-05-26 14:15:41.000000000 +0200
@@ -688,6 +688,7 @@ remove_unused_locals (void)
   referenced_var_iterator rvi;
   bitmap global_unused_vars = NULL;
   unsigned srcidx, dstidx, num;
+  bool have_local_clobbers = false;

   /* Removing declarations from lexical blocks when not optimizing is
      not only a waste of time, it actually causes differences in stack
@@ -720,6 +721,14 @@ remove_unused_locals (void)
	  if (is_gimple_debug (stmt))
	    continue;

+	  if (gimple_assign_single_p (stmt)
+	      && gimple_assign_rhs_code (stmt) == CONSTRUCTOR
+	      && TREE_THIS_VOLATILE (gimple_assign_rhs1 (stmt)))
+	    {
+	      have_local_clobbers = true;
+	      continue;
+	    }
+
	  if (b)
	    TREE_USED (b) = true;

@@ -753,6 +762,36 @@ remove_unused_locals (void)
	  TREE_USED (e->goto_block) = true;
     }

+  if (have_local_clobbers)
+    FOR_EACH_BB (bb)
+      {
+	gimple_stmt_iterator gsi;
+
+	for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+	  {
+	    gimple stmt = gsi_stmt (gsi);
+	    tree b = gimple_block (stmt);
+
+	    if (gimple_assign_single_p (stmt)
+		&& gimple_assign_rhs_code (stmt) == CONSTRUCTOR
+		&& TREE_THIS_VOLATILE (gimple_assign_rhs1 (stmt)))
+	      {
+		tree lhs = gimple_assign_lhs (stmt);
+		lhs = get_base_address (lhs);
+		if (DECL_P (lhs) && (!var_ann (lhs) || !is_used_p (lhs)))
+		  {
+		    unlink_stmt_vdef (stmt);
+		    gsi_remove (&gsi, true);
+		    release_defs (stmt);
+		    continue;
+		  }
+		if (b)
+		  TREE_USED (b) = true;
+	      }
+	    gsi_next (&gsi);
+	  }
+      }
+
   cfun->has_local_explicit_reg_vars = false;

   /* Remove unmarked local vars from local_decls.  */
Index: tree-sra.c
===================================================================
--- tree-sra.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ tree-sra.c	2011-05-26 14:15:41.000000000 +0200
@@ -1041,6 +1041,11 @@ build_accesses_from_assign (gimple stmt)
   if (disqualify_ops_if_throwing_stmt (stmt, lhs, rhs))
     return false;

+  /* Scope clobbers don't influence scalarization.  */
+  if (TREE_CODE (rhs) == CONSTRUCTOR
+      && TREE_THIS_VOLATILE (rhs))
+    return false;
+
   racc = build_access_from_expr_1 (rhs, stmt, false);
   lacc = build_access_from_expr_1 (lhs, stmt, true);
Index: tree-ssa-dce.c
===================================================================
--- tree-ssa-dce.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ tree-ssa-dce.c	2011-05-26 14:15:41.000000000 +0200
@@ -846,19 +846,17 @@ propagate_necessity (struct edge_list *e
       else if (gimple_assign_single_p (stmt))
	{
	  tree rhs;
-	  bool rhs_aliased = false;
	  /* If this is a load mark things necessary.  */
	  rhs = gimple_assign_rhs1 (stmt);
	  if (TREE_CODE (rhs) != SSA_NAME
-	      && !is_gimple_min_invariant (rhs))
+	      && !is_gimple_min_invariant (rhs)
+	      && TREE_CODE (rhs) != CONSTRUCTOR)
	    {
	      if (!ref_may_be_aliased (rhs))
		mark_aliased_reaching_defs_necessary (stmt, rhs);
	      else
-		rhs_aliased = true;
+		mark_all_reaching_defs_necessary (stmt);
	    }
-	  if (rhs_aliased)
-	    mark_all_reaching_defs_necessary (stmt);
	}
       else if (gimple_code (stmt) == GIMPLE_RETURN)
	{
@@ -866,7 +864,8 @@ propagate_necessity (struct edge_list *e
	  /* A return statement may perform a load.  */
	  if (rhs
	      && TREE_CODE (rhs) != SSA_NAME
-	      && !is_gimple_min_invariant (rhs))
+	      && !is_gimple_min_invariant (rhs)
+	      && TREE_CODE (rhs) != CONSTRUCTOR)
	    {
	      if (!ref_may_be_aliased (rhs))
		mark_aliased_reaching_defs_necessary (stmt, rhs);
@@ -884,6 +883,7 @@ propagate_necessity (struct edge_list *e
	      tree op = TREE_VALUE (gimple_asm_input_op (stmt, i));
	      if (TREE_CODE (op) != SSA_NAME
		  && !is_gimple_min_invariant (op)
+		  && TREE_CODE (op) != CONSTRUCTOR
		  && !ref_may_be_aliased (op))
		mark_aliased_reaching_defs_necessary (stmt, op);
	    }
Index: gimple.c
===================================================================
--- gimple.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ gimple.c	2011-05-26 14:15:41.000000000 +0200
@@ -1425,7 +1425,9 @@ walk_gimple_op (gimple stmt, walk_tree_f
	{
	  /* If the RHS has more than 1 operand, it is not appropriate
	     for the memory.  */
-	  wi->val_only = !is_gimple_mem_rhs (gimple_assign_rhs1 (stmt))
+	  wi->val_only = !(is_gimple_mem_rhs (gimple_assign_rhs1 (stmt))
+			   || TREE_CODE (gimple_assign_rhs1 (stmt))
+			      == CONSTRUCTOR)
			 || !gimple_assign_single_p (stmt);
	  wi->is_lhs = true;
	}
Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ tree-ssa-structalias.c	2011-05-26 14:15:41.000000000 +0200
@@ -4355,7 +4355,12 @@ find_func_aliases (gimple origt)
       tree lhsop = gimple_assign_lhs (t);
       tree rhsop = (gimple_num_ops (t) == 2) ? gimple_assign_rhs1 (t) : NULL;

-      if (rhsop && AGGREGATE_TYPE_P (TREE_TYPE (lhsop)))
+      if (rhsop && TREE_CODE (rhsop) == CONSTRUCTOR
+	  && TREE_THIS_VOLATILE (rhsop))
+	/* Ignore clobbers, they don't actually store anything into
+	   the LHS.  */
+	;
+      else if (rhsop && AGGREGATE_TYPE_P (TREE_TYPE (lhsop)))
	do_structure_copy (lhsop, rhsop);
       else
	{
Index: tree-ssa-operands.c
===================================================================
--- tree-ssa-operands.c.orig	2011-05-26 14:15:01.000000000 +0200
+++ tree-ssa-operands.c	2011-05-26 14:15:41.000000000 +0200
@@ -955,6 +955,9 @@ get_expr_operands (gimple stmt, tree *ex
	constructor_elt *ce;
	unsigned HOST_WIDE_INT idx;

+	if (TREE_THIS_VOLATILE (expr))
+	  gimple_set_has_volatile_ops (stmt, true);
+
	for (idx = 0;
	     VEC_iterate (constructor_elt, CONSTRUCTOR_ELTS (expr), idx, ce);
	     idx++)
Index: tree-ssa-sccvn.c
===================================================================
--- tree-ssa-sccvn.c.orig	2011-05-26 14:15:39.000000000 +0200
+++ tree-ssa-sccvn.c	2011-05-26 14:15:41.000000000 +0200
@@ -1356,6 +1356,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
   else if (is_gimple_reg_type (vr->type)
	   && gimple_assign_single_p (def_stmt)
	   && gimple_assign_rhs_code (def_stmt) == CONSTRUCTOR
+	   && !TREE_THIS_VOLATILE (gimple_assign_rhs1 (def_stmt))
	   && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (def_stmt)) == 0)
     {
       tree base2;