From patchwork Mon May 13 20:16:30 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steve Ellcey X-Patchwork-Id: 243521 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 9C36E2C009E for ; Tue, 14 May 2013 06:17:09 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:from:to:cc:content-type:date:message-id:mime-version; q=dns; s=default; b=MnmZFWZdhKn1KjhaJix+K0eLQyETST67jwzaN0MPNbC CoDQBU7c/LFdJEvjjIeWjmoRoe4kDgWmhTFgoPUJCBt+WMZQNW30CeXQqyO9pcof MkY9TJrYa5LvxXVoum8c6UsdbC0rWm+pqjgjt8HD9NfAcy5Q7T0mHsBEQPM4lb+U = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:from:to:cc:content-type:date:message-id:mime-version; s=default; bh=vT862LN87jn4+Ix3cIhknSTJBG0=; b=ASPj4qn5LXR30iGzn ZIbEcgNcPBlhc7RePG2sMqTC1prvbmhgM7AVGkqI+V4me86V1Z4VQrS2G2WOJMPm hZZDJLr/eVv1Jv3ifVL5z5/OVS4dZc+JKo37EN/tNnKiIbwEKkUp8qB24wTxlv+t RmgCmJRYk7UcgjmZfIwl6Ew3X0= Received: (qmail 31232 invoked by alias); 13 May 2013 20:17:02 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 31222 invoked by uid 89); 13 May 2013 20:17:02 -0000 X-Spam-SWARE-Status: No, score=-2.8 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, TW_CF, TW_TM autolearn=ham version=3.3.1 Received: from multi.imgtec.com (HELO multi.imgtec.com) (194.200.65.239) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Mon, 13 May 2013 20:17:00 +0000 Subject: [PATCH] New switch optimization pass (PR tree-optimization/54742) From: Steve Ellcey To: CC: , Date: Mon, 13 May 2013 13:16:30 -0700 Message-ID: <1368476190.22602.109.camel@ubuntu-sellcey> MIME-Version: 1.0 X-SEF-Processed: 7_3_0_01181__2013_05_13_21_16_53 X-Virus-Found: No Here is the latest version of my SSA optimization pass to do the switch statement optimization described in PR 54742 (core_state_transition from coremark). I have tested this optimization with a x86 bootstrap and GCC test run with no errors and tested the MIPS cross compiler with no errors. Because of that I decided to submit it as a statically linked optimization pass instead of a dynamically loaded one, though I did keep the ifdefs for using it as a dynamic pass in the code. They could be removed if this patch is approved as a statically linked pass. Also, while this patch shows the optimization only being turned on with the -ftree-switch-shortcut flag, my bootstrap and testing had it turned on for all -O2 optimizations in order to maximize the testing. We may want to turn this on for -O3 and/or for -fexpensive-optimizations. I had to make one change to dominance.c in order to avoid some compiler aborts where it was dereferencing a null pointer. I believe this was happening because I am calling gimple_duplicate_sese_region with regions that are not really SESE. Because I am doing this, I regenerate the cfg and SSA information after each call, but I also had to change iterate_fix_dominators to fix the abort. Another way we might want to fix this would be to pass a flag to gimple_duplicate_sese_region that tells it whether or not we want it to recalculate the dominance information at all. If set to false, it would assume the caller will take care of it. Opinions? OK to checkin? Steve Ellcey sellcey@imgtec.com 2013-05-13 Steve Ellcey PR tree-optimization/54742 * Makefile.in (OBJS): Add tree-switch-shortcut.o. * common.opt (ftree-switch-shortcut): New. * dominance.c (iterate_fix_dominators): Add null check. * params.def (PARAM_MAX_SWITCH_INSNS): New. (PARAM_MAX_SWITCH_PATHS): New. * passes.c (init_optimization_passes): Add pass_switch_shortcut. * timevar.def (TV_SWITCH_SHORTCUT): New. * tree-pass.c (pass_switch_shortcut): New. * tree-switch-shortcut.c: New file. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 903125e..db0ffcb 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1399,6 +1399,7 @@ OBJS = \ tree-scalar-evolution.o \ tree-sra.o \ tree-switch-conversion.o \ + tree-switch-shortcut.o \ tree-ssa-address.o \ tree-ssa-alias.o \ tree-ssa-ccp.o \ diff --git a/gcc/common.opt b/gcc/common.opt index 4c7933e..e028e2d 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2160,6 +2160,10 @@ ftree-sra Common Report Var(flag_tree_sra) Optimization Perform scalar replacement of aggregates +ftree-switch-shortcut +Common Report Var(flag_tree_switch_shortcut) Init(0) Optimization +Do fancy switch statement shortcutting + ftree-ter Common Report Var(flag_tree_ter) Optimization Replace temporary expressions in the SSA->normal pass diff --git a/gcc/dominance.c b/gcc/dominance.c index 5c96dad..d858ad1 100644 --- a/gcc/dominance.c +++ b/gcc/dominance.c @@ -1251,6 +1251,7 @@ iterate_fix_dominators (enum cdi_direction dir, vec bbs, struct pointer_map_t *map; int *parent, *son, *brother; unsigned int dir_index = dom_convert_dir_to_idx (dir); + void **slot; /* We only support updating dominators. There are some problems with updating postdominators (need to add fake edges from infinite loops @@ -1357,7 +1358,10 @@ iterate_fix_dominators (enum cdi_direction dir, vec bbs, if (dom == bb) continue; - dom_i = (size_t) *pointer_map_contains (map, dom); + slot = pointer_map_contains (map, dom); + if (slot == NULL) + continue; + dom_i = (size_t) *slot; /* Do not include parallel edges to G. */ if (!bitmap_set_bit ((bitmap) g->vertices[dom_i].data, i)) diff --git a/gcc/params.def b/gcc/params.def index 3c52651..bdabe07 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1020,6 +1020,20 @@ DEFPARAM (PARAM_MAX_SLSR_CANDIDATE_SCAN, "strength reduction", 50, 1, 999999) +/* Maximum number of instructions to duplicate when shortcutting a switch. */ +DEFPARAM (PARAM_MAX_SWITCH_INSNS, + "max-switch-insns", + "Maximum number of instructions to duplicate when " + "shortcutting a switch statement", + 100, 1, 999999) + +/* Maximum number of paths to duplicate when shortcutting a switch. */ +DEFPARAM (PARAM_MAX_SWITCH_PATHS, + "max-switch-paths", + "Maximum number of new paths to create when" + " shortcutting a switch statement", + 50, 1, 999999) + /* Local variables: mode:c diff --git a/gcc/passes.c b/gcc/passes.c index fd67ee6..0fb826c 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1416,6 +1416,7 @@ init_optimization_passes (void) NEXT_PASS (pass_call_cdce); NEXT_PASS (pass_cselim); NEXT_PASS (pass_tree_ifcombine); + NEXT_PASS (pass_switch_shortcut); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_tail_recursion); NEXT_PASS (pass_ch); diff --git a/gcc/timevar.def b/gcc/timevar.def index 44f0eac..e859657 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -162,6 +162,7 @@ DEFTIMEVAR (TV_TREE_LOOP_IVCANON , "tree canonical iv") DEFTIMEVAR (TV_SCEV_CONST , "scev constant prop") DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH , "tree loop unswitching") DEFTIMEVAR (TV_COMPLETE_UNROLL , "complete unrolling") +DEFTIMEVAR (TV_SWITCH_SHORTCUT , "switch statement shortcuts") DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops") DEFTIMEVAR (TV_TREE_VECTORIZATION , "tree vectorization") DEFTIMEVAR (TV_TREE_SLP_VECTORIZATION, "tree slp vectorization") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index b8c59a7..219946a 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -489,6 +489,7 @@ extern struct gimple_opt_pass pass_early_inline; extern struct gimple_opt_pass pass_inline_parameters; extern struct gimple_opt_pass pass_update_address_taken; extern struct gimple_opt_pass pass_convert_switch; +extern struct gimple_opt_pass pass_switch_shortcut; /* The root of the compilation pass tree, once constructed. */ extern struct opt_pass *all_passes, *all_small_ipa_passes, *all_lowering_passes, diff --git a/gcc/tree-switch-shortcut.c b/gcc/tree-switch-shortcut.c new file mode 100644 index 0000000..6423454 --- /dev/null +++ b/gcc/tree-switch-shortcut.c @@ -0,0 +1,412 @@ +/* Switch shortcutting optimization for GNU C + Copyright (C) 2013 Free Software Foundation, Inc. + Contributed by Steve Ellcey (sellcey@imgtec.com). + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +/* This file implements an optimization where, when a variable is set + to a constant value and there is a path that leads from this definition + to a switch statement that uses that variable as its controlling expression + we duplicate the blocks on this path and change the switch goto to a + direct goto to the label of the switch block that control would goto based + on the value of the variable. */ + +#ifdef COMPILE_AS_PLUGIN +#include +#include +#else +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#endif + +#include "tree-pass.h" +#include "tree.h" +#include "tree-inline.h" +#include "tree-flow.h" +#include "tree-flow-inline.h" +#include "basic-block.h" +#include "pointer-set.h" +#include "gimple.h" +#include "cfgloop.h" +#include "params.h" + +#ifdef COMPILE_AS_PLUGIN +int plugin_is_GPL_compatible; +#endif + +/* Helper function for find_path, visited_bbs is used to make sure we don't + fall into an infinite loop. */ + +static int +find_path_1(basic_block start_bb, basic_block end_bb, struct pointer_set_t *visited_bbs) +{ + edge_iterator ei; + edge e; + + if (start_bb == end_bb) return 1; + + if (!pointer_set_insert (visited_bbs, start_bb)) + { + FOR_EACH_EDGE (e, ei, start_bb->succs) + if (find_path_1 (e->dest, end_bb, visited_bbs)) + return 1; + } + return 0; +} + +/* Return 1 if there is a path from start_bb to end_bb and 0 if there + is not. There may be multiple paths from start_bb to end_bb. */ + +static int +find_path(basic_block start_bb, basic_block end_bb) +{ + edge_iterator ei; + edge e; + struct pointer_set_t *visited_bbs; + int p = 0; + + if (start_bb == end_bb) return 1; + + visited_bbs = pointer_set_create (); + if (!pointer_set_insert (visited_bbs, start_bb)) + { + FOR_EACH_EDGE (e, ei, start_bb->succs) + if (find_path_1 (e->dest, end_bb, visited_bbs)) + { + p = 1; + break; + } + } + pointer_set_destroy (visited_bbs); + return p; +} + + +/* We save the paths we want to copy in bbs_list_array. n_bbs_list is the + number of paths saved, bbs_list_array[i] is the list of basic blocks in + one path. Each path starts with the block where a variable is assigned + a constant value (bbs_list_array[i][0]) and ends with the switch statement + block (bbs_list_array[i][bbs_list_size[i]-2]) and then the block that the + switch statement is going to go to given the constant value of the + variable (bbs_list_array[i][bbs_list_size[i]-1]). */ + +static basic_block **bbs_list_array; +static int *val_array; +static int *bbs_list_size; +static int max_path_count; +static int max_insn_count; +static int n_bbs_list; + +/* bbs_list[0] is the block with the switch statement, + bbs_list[n-1] is the block where the switch statement variable is assigned + a constant value, + The entries in between make a (reverse) path between the two. + + We don't want to change bb_list, we want to leave that alone and + and copy the path to bbs_list_array so that we wind up with a list (array) + of paths that we want to update. We also want to add the block that the + switch is going to go to on to the list so that we know which exit from + the switch statement is important. */ + +static void +save_new_path (basic_block *bbs_list, int n, tree val) +{ + int i; + int insn_count; + basic_block bb; + edge switch_taken_edge; + gimple_stmt_iterator gsi; + + if (n <= 1) return; + + if (n_bbs_list >= max_path_count) + return; + + /* Put the blocks in 'correct' order and add in where we want to go after + the switch statement, We want to leave bbs_list untouched for future + calls. */ + + bbs_list_array[n_bbs_list] = XNEWVEC (basic_block, n+1); + for (i = 0; i < n; i++) + bbs_list_array[n_bbs_list][i] = bbs_list[n-i-1]; + + switch_taken_edge = find_taken_edge (bbs_list[0], val); + bbs_list_array[n_bbs_list][n] = switch_taken_edge->dest; + + bbs_list_size[n_bbs_list] = n + 1; + val_array[n_bbs_list] = (int) TREE_INT_CST_LOW (val); + + /* Count how many instructions are in the blocks we are going to + duplicate and if there are too many do not save this path + (return without incrementing n_bbs_list). */ + + insn_count = 0; + for (i = 1; i < n; i++) + { + bb = bbs_list_array[n_bbs_list][i]; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + insn_count += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights); + } + + if (insn_count > max_insn_count) + return; + + n_bbs_list = n_bbs_list + 1; +} + +/* switch_stmt is a switch statement whose switch index expression + is the variable expr. We trace the value of the variable back + through any phi nodes looking for places where it gets a constant + value and save the path in bbs_list. Then we call save_new_path + to create a list of such paths. */ + +static void +process_switch (tree expr, gimple switch_stmt, + struct pointer_set_t *visited_phis, + basic_block *bbs_list, int n) +{ + gimple def_stmt; + tree var; + unsigned int i; + edge e; + edge_iterator ei; + basic_block bbx; + basic_block var_bb; + int e_count; + + gcc_assert (gimple_code (switch_stmt) == GIMPLE_SWITCH); + var = SSA_NAME_VAR (expr); + def_stmt = SSA_NAME_DEF_STMT (expr); + var_bb = gimple_bb (def_stmt); + + if (var == NULL || var_bb == NULL) return; + + /* We have a variable definition (var) that is defined in var_bb, + We want to put the path from var_bb to the current bb into the + bbs_list. If there is more then one path, skip this and don't + try to do the optimization. */ + + bbx = bbs_list[n-1]; + while (bbx != var_bb) + { + e_count = 0; + FOR_EACH_EDGE (e, ei, bbx->preds) + { + if (find_path (var_bb, e->src)) + { + bbs_list[n] = e->src; + n = n + 1; + e_count = e_count + 1; + } + } + if (e_count != 1) return; + bbx = bbs_list[n-1]; + } + + if ((gimple_code (def_stmt) == GIMPLE_PHI) + && !pointer_set_insert (visited_phis, def_stmt)) + { + for (i = 0; i < gimple_phi_num_args (def_stmt); i++) + { + tree arg = gimple_phi_arg_def (def_stmt, i); + if (arg && (TREE_CODE (arg) == INTEGER_CST)) + { + /* const char *name = IDENTIFIER_POINTER (DECL_NAME (var)); */ + bbs_list[n] = gimple_phi_arg_edge (def_stmt, i)->src; + save_new_path(bbs_list, n + 1, arg); + } + else if (arg && (TREE_CODE (arg) == SSA_NAME)) + { + bbs_list[n] = gimple_phi_arg_edge (def_stmt, i)->src; + process_switch (arg, switch_stmt, visited_phis, bbs_list, n+1); + } + } + } +} + +/* Find paths that lead from blocks where a variable is assigned a constant + value to a switch statement where that variable is used as the switch + index. Save the paths in bbs_list_array so that they can be processed + by copy_switch_paths. */ + +static unsigned int +find_switch_shortcuts (void) +{ + basic_block bb; + struct pointer_set_t *visited_phis; + basic_block *bbs_list; + int n = 1; + + bbs_list = XNEWVEC (basic_block, n_basic_blocks); + visited_phis = pointer_set_create (); + FOR_EACH_BB (bb) + { + gimple stmt = last_stmt (bb); + if (stmt && gimple_code (stmt) == GIMPLE_SWITCH) + { + tree op = gimple_switch_index (stmt); + tree var = SSA_NAME_VAR (op); + if (var) + { + bbs_list[0] = bb; + process_switch (op, stmt, visited_phis, bbs_list, n); + } + } + } + pointer_set_destroy (visited_phis); + XDELETEVEC (bbs_list); + return 0; +} + +/* Call gimple_duplicate_sese_region to douplicate the blocks in bb_list. + We free and recalculate all ssa and dominance information afterwords + because the regsion being copied is not really SESE and so we cannot + trust gimple_duplicate_sese_region to correctly update the dataflow + information. */ + +static void +duplicate_blocks (basic_block *bb_list, int bb_count) +{ + edge orig_edge, exit_edge; + + orig_edge = find_edge (bb_list[0], bb_list[1]); + exit_edge = find_edge (bb_list[bb_count-2], bb_list[bb_count-1]); + gimple_duplicate_sese_region (orig_edge, exit_edge, &bb_list[1], bb_count-2, NULL); + free_dominance_info (CDI_DOMINATORS); + update_ssa (TODO_update_ssa); + calculate_dominance_info (CDI_DOMINATORS); +} + +/* Go through the paths saved in bbs_list_array and make copies of them. */ + +static void +copy_switch_paths (void) +{ + int i; + + /* Process each path in bbs_list_size. */ + for (i = 0; i < n_bbs_list; i++) + { + /* For each path in bbs_list_size loop through and copy each block in + the path (except the first on where the constant is assigned and + the final one where the switch statement goes to. */ + + if (!single_pred_p (bbs_list_array[i][1])) + duplicate_blocks (bbs_list_array[i], bbs_list_size[i]); + } +} + +static unsigned int +do_switch_shortcut (void) +{ + int i; + + n_bbs_list = 0; +#ifdef COMPILE_AS_PLUGIN + max_insn_count = 100; + max_path_count = 20; +#else + max_insn_count = PARAM_VALUE (PARAM_MAX_SWITCH_INSNS); + max_path_count = PARAM_VALUE (PARAM_MAX_SWITCH_PATHS); +#endif + val_array = XNEWVEC (int, max_path_count); + bbs_list_size = XNEWVEC (int, max_path_count); + bbs_list_array = XNEWVEC (basic_block *, max_path_count); + find_switch_shortcuts (); + copy_switch_paths (); + XDELETEVEC (val_array); + XDELETEVEC (bbs_list_size); + for (i = 0; i < n_bbs_list; i++) + XDELETEVEC (bbs_list_array[i]); + XDELETEVEC (bbs_list_array); + return 0; +} + +/* The pass gate. */ + +static bool +gate_switch_shortcut (void) +{ +#ifdef COMPILE_AS_PLUGIN + return (optimize > 1); +#else + return flag_tree_switch_shortcut; +#endif +} + +#ifndef COMPILE_AS_PLUGIN +struct gimple_opt_pass pass_switch_shortcut = +{ + { + GIMPLE_PASS, + "switch_shortcut", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + gate_switch_shortcut, /* gate */ + do_switch_shortcut, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + TV_SWITCH_SHORTCUT, /* tv_id */ + PROP_cfg | PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_cleanup_cfg | TODO_verify_all, /* todo_flags_finish */ + } +}; +#else +struct opt_pass pass_switch_shortcut = + { + GIMPLE_PASS, + "switch_shortcut", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + gate_switch_shortcut, /* gate */ + do_switch_shortcut, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + (timevar_id_t) 0, /* tv_id */ + PROP_cfg | PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_update_ssa + | TODO_verify_ssa + | TODO_verify_stmts + | TODO_verify_flow /* todo_flags_finish */ +}; + +int +plugin_init (struct plugin_name_args *plugin_info, + struct plugin_gcc_version *version) +{ + struct register_pass_info pass_info; + + if (!plugin_default_version_check (version, &gcc_version)) + return 1; + + pass_info.pass = &pass_switch_shortcut; + pass_info.reference_pass_name = "ifcombine"; + pass_info.ref_pass_instance_number = 1; + pass_info.pos_op = PASS_POS_INSERT_AFTER; + + register_callback (plugin_info->base_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &pass_info); + return 0; +} +#endif