Patchwork New switch optimization pass (PR tree-optimization/54742)

login
register
mail settings
Submitter Steve Ellcey
Date May 13, 2013, 8:16 p.m.
Message ID <1368476190.22602.109.camel@ubuntu-sellcey>
Download mbox | patch
Permalink /patch/243521/
State New
Headers show

Comments

Steve Ellcey - May 13, 2013, 8:16 p.m.
Here is the latest version of my SSA optimization pass to do the switch
statement optimization described in PR 54742 (core_state_transition from
coremark).

I have tested this optimization with a x86 bootstrap and GCC test run
with no errors and tested the MIPS cross compiler with no errors.
Because of that I decided to submit it as a statically linked
optimization pass instead of a dynamically loaded one, though I did keep
the ifdefs for using it as a dynamic pass in the code.  They could be
removed if this patch is approved as a statically linked pass.  Also,
while this patch shows the optimization only being turned on with the
-ftree-switch-shortcut flag, my bootstrap and testing had it turned on
for all -O2 optimizations in order to maximize the testing.
We may want to turn this on for -O3 and/or for
-fexpensive-optimizations.

I had to make one change to dominance.c in order to avoid some compiler
aborts where it was dereferencing a null pointer.  I believe this was
happening because I am calling gimple_duplicate_sese_region with regions
that are not really SESE.  Because I am doing this, I regenerate the cfg
and SSA information after each call, but I also had to change
iterate_fix_dominators to fix the abort.  Another way we might want to
fix this would be to pass a flag to gimple_duplicate_sese_region that
tells it whether or not we want it to recalculate the dominance
information at all.  If set to false, it would assume the caller will
take care of it.

Opinions?  OK to checkin?

Steve Ellcey
sellcey@imgtec.com


2013-05-13  Steve Ellcey  <sellcey@imgtec.com>

	PR tree-optimization/54742
	* Makefile.in (OBJS): Add tree-switch-shortcut.o.
	* common.opt (ftree-switch-shortcut): New.
	* dominance.c (iterate_fix_dominators): Add null check.
	* params.def (PARAM_MAX_SWITCH_INSNS): New.
	(PARAM_MAX_SWITCH_PATHS): New.
	* passes.c (init_optimization_passes): Add pass_switch_shortcut.
	* timevar.def (TV_SWITCH_SHORTCUT): New.
	* tree-pass.c (pass_switch_shortcut): New.
	* tree-switch-shortcut.c: New file.
Jeff Law - May 13, 2013, 8:28 p.m.
On 05/13/2013 02:16 PM, Steve Ellcey wrote:
> Here is the latest version of my SSA optimization pass to do the switch
> statement optimization described in PR 54742 (core_state_transition from
> coremark).
>
> I have tested this optimization with a x86 bootstrap and GCC test run
> with no errors and tested the MIPS cross compiler with no errors.
> Because of that I decided to submit it as a statically linked
> optimization pass instead of a dynamically loaded one, though I did keep
> the ifdefs for using it as a dynamic pass in the code.  They could be
> removed if this patch is approved as a statically linked pass.  Also,
> while this patch shows the optimization only being turned on with the
> -ftree-switch-shortcut flag, my bootstrap and testing had it turned on
> for all -O2 optimizations in order to maximize the testing.
> We may want to turn this on for -O3 and/or for
> -fexpensive-optimizations.
>
> I had to make one change to dominance.c in order to avoid some compiler
> aborts where it was dereferencing a null pointer.  I believe this was
> happening because I am calling gimple_duplicate_sese_region with regions
> that are not really SESE.  Because I am doing this, I regenerate the cfg
> and SSA information after each call, but I also had to change
> iterate_fix_dominators to fix the abort.  Another way we might want to
> fix this would be to pass a flag to gimple_duplicate_sese_region that
> tells it whether or not we want it to recalculate the dominance
> information at all.  If set to false, it would assume the caller will
> take care of it.
>
> Opinions?  OK to checkin?
>
> Steve Ellcey
> sellcey@imgtec.com
>
>
> 2013-05-13  Steve Ellcey  <sellcey@imgtec.com>
>
> 	PR tree-optimization/54742
> 	* Makefile.in (OBJS): Add tree-switch-shortcut.o.
> 	* common.opt (ftree-switch-shortcut): New.
> 	* dominance.c (iterate_fix_dominators): Add null check.
> 	* params.def (PARAM_MAX_SWITCH_INSNS): New.
> 	(PARAM_MAX_SWITCH_PATHS): New.
> 	* passes.c (init_optimization_passes): Add pass_switch_shortcut.
> 	* timevar.def (TV_SWITCH_SHORTCUT): New.
> 	* tree-pass.c (pass_switch_shortcut): New.
> 	* tree-switch-shortcut.c: New file.
I was looking at this last week (stuck for hours on tarmac at BWI).

I think we should fix this in the threader rather than doing a special 
purpose pass.  This is primarily because we get it for free if we 
address one limitation in the threader (some of the other issues I was 
concerned about don't apply).

Specifically if we look at thread_around_empty_block we have:

   /* This block must have a single predecessor (E->dest).  */
   if (!single_pred_p (bb))
     return NULL;

   /* This block must have more than one successor.  */
   if (single_succ_p (bb))
     return NULL;

   /* This block can have no PHI nodes.  This is overly conservative.  */
   if (!gsi_end_p (gsi_start_phis (bb)))
     return NULL;


The test that the block have > 1 successor and no PHI nodes are the keys.

;;   basic block 17, loop depth 1
;;    pred:       9
;;                13
;;                16
;;                15
;;                4
;;                12
;;                7
;;                14
;;                10
;;                5
;;                6
;;                8
;;                11
   # state_1 = PHI <0(9), 2(13), 3(16), 3(15), state_36(4), 1(12), 0(7), 
2(14), 1(10), 1(5), 0(6), 2(8), 3(11)>
   # .MEM_4 = PHI <.MEM_29(9), .MEM_20(13), .MEM_24(16), .MEM_29(15), 
.MEM_29(4), .MEM_29(12), .MEM_12(7), .MEM_29(14), .MEM_16(10), 
.MEM_29(5), .MEM_29(6), .MEM_29(8), .MEM_29(11)>
<L32>:
   str_25 = str_37 + 1;
   # VUSE <.MEM_4>
   _8 = MEM[base: str_25, offset: 0B];
   if (_8 != 0)
     goto <bb 18>;
   else
     goto <bb 19>;
;;    succ:       18
;;                19

;;   basic block 18, loop depth 1
;;    pred:       17
   goto <bb 4>;
;;    succ:       4



bb will be block #18.  In theory we just do
if (single_succ_p (bb))
   bb = single_succ_edge (bb)->dest;

And allow PHI nodes in this specific instance.

That's enough to allow existing code to identify the potential jump 
threading candidates -- and more importantly, it's much more general.



The updater needs copy bb17 & bb4.  bb17' will transfer to bb4' which 
will directly transfer to the destination of the switch.

The advantage of doing this in the threader is it'll help more than just 
the switch statement case.  It's probably highly likely that we're 
missing cases to thread through empty blocks with PHIs.

I've started cobbling together some of the updating code.  So far it 
doesn't look too terrible.


Jeff
Richard Guenther - May 14, 2013, 9:08 a.m.
On Mon, May 13, 2013 at 10:28 PM, Jeff Law <law@redhat.com> wrote:
> On 05/13/2013 02:16 PM, Steve Ellcey wrote:
>>
>> Here is the latest version of my SSA optimization pass to do the switch
>> statement optimization described in PR 54742 (core_state_transition from
>> coremark).
>>
>> I have tested this optimization with a x86 bootstrap and GCC test run
>> with no errors and tested the MIPS cross compiler with no errors.
>> Because of that I decided to submit it as a statically linked
>> optimization pass instead of a dynamically loaded one, though I did keep
>> the ifdefs for using it as a dynamic pass in the code.  They could be
>> removed if this patch is approved as a statically linked pass.  Also,
>> while this patch shows the optimization only being turned on with the
>> -ftree-switch-shortcut flag, my bootstrap and testing had it turned on
>> for all -O2 optimizations in order to maximize the testing.
>> We may want to turn this on for -O3 and/or for
>> -fexpensive-optimizations.
>>
>> I had to make one change to dominance.c in order to avoid some compiler
>> aborts where it was dereferencing a null pointer.  I believe this was
>> happening because I am calling gimple_duplicate_sese_region with regions
>> that are not really SESE.  Because I am doing this, I regenerate the cfg
>> and SSA information after each call, but I also had to change
>> iterate_fix_dominators to fix the abort.  Another way we might want to
>> fix this would be to pass a flag to gimple_duplicate_sese_region that
>> tells it whether or not we want it to recalculate the dominance
>> information at all.  If set to false, it would assume the caller will
>> take care of it.
>>
>> Opinions?  OK to checkin?
>>
>> Steve Ellcey
>> sellcey@imgtec.com
>>
>>
>> 2013-05-13  Steve Ellcey  <sellcey@imgtec.com>
>>
>>         PR tree-optimization/54742
>>         * Makefile.in (OBJS): Add tree-switch-shortcut.o.
>>         * common.opt (ftree-switch-shortcut): New.
>>         * dominance.c (iterate_fix_dominators): Add null check.
>>         * params.def (PARAM_MAX_SWITCH_INSNS): New.
>>         (PARAM_MAX_SWITCH_PATHS): New.
>>         * passes.c (init_optimization_passes): Add pass_switch_shortcut.
>>         * timevar.def (TV_SWITCH_SHORTCUT): New.
>>         * tree-pass.c (pass_switch_shortcut): New.
>>         * tree-switch-shortcut.c: New file.
>
> I was looking at this last week (stuck for hours on tarmac at BWI).
>
> I think we should fix this in the threader rather than doing a special
> purpose pass.  This is primarily because we get it for free if we address
> one limitation in the threader (some of the other issues I was concerned
> about don't apply).
>
> Specifically if we look at thread_around_empty_block we have:
>
>   /* This block must have a single predecessor (E->dest).  */
>   if (!single_pred_p (bb))
>     return NULL;
>
>   /* This block must have more than one successor.  */
>   if (single_succ_p (bb))
>     return NULL;
>
>   /* This block can have no PHI nodes.  This is overly conservative.  */
>   if (!gsi_end_p (gsi_start_phis (bb)))
>     return NULL;
>
>
> The test that the block have > 1 successor and no PHI nodes are the keys.
>
> ;;   basic block 17, loop depth 1
> ;;    pred:       9
> ;;                13
> ;;                16
> ;;                15
> ;;                4
> ;;                12
> ;;                7
> ;;                14
> ;;                10
> ;;                5
> ;;                6
> ;;                8
> ;;                11
>   # state_1 = PHI <0(9), 2(13), 3(16), 3(15), state_36(4), 1(12), 0(7),
> 2(14), 1(10), 1(5), 0(6), 2(8), 3(11)>
>   # .MEM_4 = PHI <.MEM_29(9), .MEM_20(13), .MEM_24(16), .MEM_29(15),
> .MEM_29(4), .MEM_29(12), .MEM_12(7), .MEM_29(14), .MEM_16(10), .MEM_29(5),
> .MEM_29(6), .MEM_29(8), .MEM_29(11)>
> <L32>:
>   str_25 = str_37 + 1;
>   # VUSE <.MEM_4>
>   _8 = MEM[base: str_25, offset: 0B];
>   if (_8 != 0)
>     goto <bb 18>;
>   else
>     goto <bb 19>;
> ;;    succ:       18
> ;;                19
>
> ;;   basic block 18, loop depth 1
> ;;    pred:       17
>   goto <bb 4>;
> ;;    succ:       4
>
>
>
> bb will be block #18.  In theory we just do
> if (single_succ_p (bb))
>   bb = single_succ_edge (bb)->dest;
>
> And allow PHI nodes in this specific instance.
>
> That's enough to allow existing code to identify the potential jump
> threading candidates -- and more importantly, it's much more general.
>
>
>
> The updater needs copy bb17 & bb4.  bb17' will transfer to bb4' which will
> directly transfer to the destination of the switch.
>
> The advantage of doing this in the threader is it'll help more than just the
> switch statement case.  It's probably highly likely that we're missing cases
> to thread through empty blocks with PHIs.
>
> I've started cobbling together some of the updating code.  So far it doesn't
> look too terrible.

I agree - thanks for looking at this!

Richard.

>
> Jeff
>

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 903125e..db0ffcb 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1399,6 +1399,7 @@  OBJS = \
 	tree-scalar-evolution.o \
 	tree-sra.o \
 	tree-switch-conversion.o \
+	tree-switch-shortcut.o \
 	tree-ssa-address.o \
 	tree-ssa-alias.o \
 	tree-ssa-ccp.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 4c7933e..e028e2d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2160,6 +2160,10 @@  ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+ftree-switch-shortcut
+Common Report Var(flag_tree_switch_shortcut) Init(0) Optimization
+Do fancy switch statement shortcutting
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/dominance.c b/gcc/dominance.c
index 5c96dad..d858ad1 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -1251,6 +1251,7 @@  iterate_fix_dominators (enum cdi_direction dir, vec<basic_block> bbs,
   struct pointer_map_t *map;
   int *parent, *son, *brother;
   unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  void **slot;
 
   /* We only support updating dominators.  There are some problems with
      updating postdominators (need to add fake edges from infinite loops
@@ -1357,7 +1358,10 @@  iterate_fix_dominators (enum cdi_direction dir, vec<basic_block> bbs,
 	  if (dom == bb)
 	    continue;
 
-	  dom_i = (size_t) *pointer_map_contains (map, dom);
+	  slot = pointer_map_contains (map, dom);
+	  if (slot == NULL)
+	    continue;
+	  dom_i = (size_t) *slot;
 
 	  /* Do not include parallel edges to G.  */
 	  if (!bitmap_set_bit ((bitmap) g->vertices[dom_i].data, i))
diff --git a/gcc/params.def b/gcc/params.def
index 3c52651..bdabe07 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1020,6 +1020,20 @@  DEFPARAM (PARAM_MAX_SLSR_CANDIDATE_SCAN,
 	  "strength reduction",
 	  50, 1, 999999)
 
+/* Maximum number of instructions to duplicate when shortcutting a switch.  */
+DEFPARAM (PARAM_MAX_SWITCH_INSNS,
+	  "max-switch-insns",
+	  "Maximum number of instructions to duplicate when "
+	  "shortcutting a switch statement",
+	  100, 1, 999999)
+
+/* Maximum number of paths to duplicate when shortcutting a switch.  */
+DEFPARAM (PARAM_MAX_SWITCH_PATHS,
+	  "max-switch-paths",
+	  "Maximum number of new paths to create when"
+	  " shortcutting a switch statement",
+	  50, 1, 999999)
+
 /*
 Local variables:
 mode:c
diff --git a/gcc/passes.c b/gcc/passes.c
index fd67ee6..0fb826c 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1416,6 +1416,7 @@  init_optimization_passes (void)
       NEXT_PASS (pass_call_cdce);
       NEXT_PASS (pass_cselim);
       NEXT_PASS (pass_tree_ifcombine);
+      NEXT_PASS (pass_switch_shortcut);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_tail_recursion);
       NEXT_PASS (pass_ch);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 44f0eac..e859657 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -162,6 +162,7 @@  DEFTIMEVAR (TV_TREE_LOOP_IVCANON     , "tree canonical iv")
 DEFTIMEVAR (TV_SCEV_CONST            , "scev constant prop")
 DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH    , "tree loop unswitching")
 DEFTIMEVAR (TV_COMPLETE_UNROLL       , "complete unrolling")
+DEFTIMEVAR (TV_SWITCH_SHORTCUT       , "switch statement shortcuts")
 DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
 DEFTIMEVAR (TV_TREE_VECTORIZATION    , "tree vectorization")
 DEFTIMEVAR (TV_TREE_SLP_VECTORIZATION, "tree slp vectorization")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index b8c59a7..219946a 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -489,6 +489,7 @@  extern struct gimple_opt_pass pass_early_inline;
 extern struct gimple_opt_pass pass_inline_parameters;
 extern struct gimple_opt_pass pass_update_address_taken;
 extern struct gimple_opt_pass pass_convert_switch;
+extern struct gimple_opt_pass pass_switch_shortcut;
 
 /* The root of the compilation pass tree, once constructed.  */
 extern struct opt_pass *all_passes, *all_small_ipa_passes, *all_lowering_passes,
diff --git a/gcc/tree-switch-shortcut.c b/gcc/tree-switch-shortcut.c
new file mode 100644
index 0000000..6423454
--- /dev/null
+++ b/gcc/tree-switch-shortcut.c
@@ -0,0 +1,412 @@ 
+/* Switch shortcutting optimization for GNU C
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Steve Ellcey (sellcey@imgtec.com).
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* This file implements an optimization where, when a variable is set
+   to a constant value and there is a path that leads from this definition
+   to a switch statement that uses that variable as its controlling expression
+   we duplicate the blocks on this path and change the switch goto to a
+   direct goto to the label of the switch block that control would goto based
+   on the value of the variable.  */
+
+#ifdef COMPILE_AS_PLUGIN
+#include <gcc-plugin.h>
+#include <plugin-version.h>
+#else
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#endif
+
+#include "tree-pass.h"
+#include "tree.h"
+#include "tree-inline.h"
+#include "tree-flow.h"
+#include "tree-flow-inline.h"
+#include "basic-block.h"
+#include "pointer-set.h"
+#include "gimple.h"
+#include "cfgloop.h"
+#include "params.h"
+
+#ifdef COMPILE_AS_PLUGIN
+int plugin_is_GPL_compatible;
+#endif
+
+/* Helper function for find_path, visited_bbs is used to make sure we don't
+   fall into an infinite loop.  */
+
+static int
+find_path_1(basic_block start_bb, basic_block end_bb, struct pointer_set_t *visited_bbs)
+{
+  edge_iterator ei;
+  edge e;
+
+  if (start_bb == end_bb) return 1;
+
+  if (!pointer_set_insert (visited_bbs, start_bb))
+    {
+      FOR_EACH_EDGE (e, ei, start_bb->succs)
+	if (find_path_1 (e->dest, end_bb, visited_bbs))
+	  return 1;
+    }
+    return 0;
+}
+
+/* Return 1 if there is a path from start_bb to end_bb and 0 if there
+   is not.  There may be multiple paths from start_bb to end_bb.  */
+
+static int
+find_path(basic_block start_bb, basic_block end_bb)
+{
+  edge_iterator ei;
+  edge e;
+  struct pointer_set_t *visited_bbs;
+  int p = 0;
+
+  if (start_bb == end_bb) return 1;
+
+  visited_bbs = pointer_set_create ();
+  if (!pointer_set_insert (visited_bbs, start_bb))
+    {
+      FOR_EACH_EDGE (e, ei, start_bb->succs)
+	if (find_path_1 (e->dest, end_bb, visited_bbs))
+	  {
+	    p = 1;
+	    break;
+	  }
+    }
+  pointer_set_destroy (visited_bbs);
+  return p;
+}
+
+
+/* We save the paths we want to copy in bbs_list_array.  n_bbs_list is the
+   number of paths saved, bbs_list_array[i] is the list of basic blocks in
+   one path.  Each path starts with the block where a variable is assigned
+   a constant value (bbs_list_array[i][0]) and ends with the switch statement
+   block (bbs_list_array[i][bbs_list_size[i]-2]) and then the block that the
+   switch statement is going to go to given the constant value of the
+   variable (bbs_list_array[i][bbs_list_size[i]-1]).  */
+
+static basic_block **bbs_list_array;
+static int *val_array;
+static int *bbs_list_size;
+static int max_path_count;
+static int max_insn_count;
+static int n_bbs_list;
+
+/* bbs_list[0] is the block with the switch statement,
+   bbs_list[n-1] is the block where the switch statement variable is assigned
+     a constant value,
+   The entries in between make a (reverse) path between the two.
+
+   We don't want to change bb_list, we want to leave that alone and
+   and copy the path to bbs_list_array so that we wind up with a list (array)
+   of paths that we want to update.  We also want to add the block that the
+   switch is going to go to on to the list so that we know which exit from
+   the switch statement is important.  */
+
+static void
+save_new_path (basic_block *bbs_list, int n, tree val)
+{
+  int i;
+  int insn_count;
+  basic_block bb;
+  edge switch_taken_edge;
+  gimple_stmt_iterator gsi;
+
+  if (n <= 1) return;
+
+  if (n_bbs_list >= max_path_count)
+    return;
+
+  /* Put the blocks in 'correct' order and add in where we want to go after
+     the switch statement, We want to leave bbs_list untouched for future
+     calls.  */
+
+  bbs_list_array[n_bbs_list] = XNEWVEC (basic_block, n+1);
+  for (i = 0; i < n; i++)
+    bbs_list_array[n_bbs_list][i] = bbs_list[n-i-1];
+
+  switch_taken_edge = find_taken_edge (bbs_list[0], val);
+  bbs_list_array[n_bbs_list][n] = switch_taken_edge->dest;
+
+  bbs_list_size[n_bbs_list] = n + 1;
+  val_array[n_bbs_list] = (int) TREE_INT_CST_LOW (val);
+
+  /* Count how many instructions are in the blocks we are going to
+     duplicate and if there are too many do not save this path
+     (return without incrementing n_bbs_list).  */
+
+  insn_count = 0;
+  for (i = 1; i < n; i++)
+    {
+      bb = bbs_list_array[n_bbs_list][i];
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	insn_count += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);
+    }
+
+  if (insn_count > max_insn_count)
+    return;
+
+  n_bbs_list = n_bbs_list + 1;
+}
+
+/* switch_stmt is a switch statement whose switch index expression
+   is the variable expr.  We trace the value of the variable back
+   through any phi nodes looking for places where it gets a constant
+   value and save the path in bbs_list.  Then we call save_new_path
+   to create a list of such paths.  */
+
+static void
+process_switch (tree expr, gimple switch_stmt,
+	        struct pointer_set_t *visited_phis,
+	        basic_block *bbs_list, int n)
+{
+  gimple def_stmt;
+  tree var;
+  unsigned int i;
+  edge e;
+  edge_iterator ei;
+  basic_block bbx;
+  basic_block var_bb;
+  int e_count;
+
+  gcc_assert (gimple_code (switch_stmt) == GIMPLE_SWITCH);
+  var = SSA_NAME_VAR (expr);
+  def_stmt = SSA_NAME_DEF_STMT (expr);
+  var_bb = gimple_bb (def_stmt);
+
+  if (var == NULL || var_bb == NULL) return;
+
+  /* We have a variable definition (var) that is defined in var_bb,
+     We want to put the path from var_bb to the current bb into the
+     bbs_list.  If there is more then one path, skip this and don't
+     try to do the optimization.  */
+
+  bbx = bbs_list[n-1];
+  while (bbx != var_bb)
+   {
+     e_count = 0;
+     FOR_EACH_EDGE (e, ei, bbx->preds)
+       {
+         if (find_path (var_bb, e->src))
+	   {
+      	     bbs_list[n] = e->src;
+      	     n = n + 1;
+	     e_count = e_count + 1;
+    	   }
+       }
+     if (e_count != 1) return;
+     bbx = bbs_list[n-1];
+   }
+
+  if ((gimple_code (def_stmt) == GIMPLE_PHI)
+       && !pointer_set_insert (visited_phis, def_stmt))
+    {
+      for (i = 0; i < gimple_phi_num_args (def_stmt); i++)
+	{
+	  tree arg = gimple_phi_arg_def (def_stmt, i);
+	  if (arg && (TREE_CODE (arg) == INTEGER_CST))
+	    {
+	      /* const char *name = IDENTIFIER_POINTER (DECL_NAME (var)); */
+	      bbs_list[n] = gimple_phi_arg_edge (def_stmt, i)->src;
+	      save_new_path(bbs_list, n + 1, arg);
+	    }
+	  else if (arg && (TREE_CODE (arg) == SSA_NAME))
+	    {
+		  bbs_list[n] = gimple_phi_arg_edge (def_stmt, i)->src;
+		  process_switch (arg, switch_stmt, visited_phis, bbs_list, n+1);
+	    }
+	}
+    }
+}
+
+/* Find paths that lead from blocks where a variable is assigned a constant
+   value to a switch statement where that variable is used as the switch
+   index.  Save the paths in bbs_list_array so that they can be processed
+   by copy_switch_paths.  */
+
+static unsigned int
+find_switch_shortcuts (void)
+{
+  basic_block bb;
+  struct pointer_set_t *visited_phis;
+  basic_block *bbs_list;
+  int n = 1;
+
+  bbs_list = XNEWVEC (basic_block, n_basic_blocks);
+  visited_phis = pointer_set_create ();
+  FOR_EACH_BB (bb)
+    {
+      gimple stmt = last_stmt (bb);
+      if (stmt && gimple_code (stmt) == GIMPLE_SWITCH)
+	{
+	  tree op = gimple_switch_index (stmt);
+	  tree var = SSA_NAME_VAR (op);
+	  if (var)
+	    {
+	      bbs_list[0] = bb;
+	      process_switch (op, stmt, visited_phis, bbs_list, n);
+	    }
+	}
+    }
+  pointer_set_destroy (visited_phis);
+  XDELETEVEC (bbs_list);
+  return 0;
+}
+
+/* Call gimple_duplicate_sese_region to douplicate the blocks in bb_list.
+   We free and recalculate all ssa and dominance information afterwords
+   because the regsion being copied is not really SESE and so we cannot
+   trust gimple_duplicate_sese_region to correctly update the dataflow
+   information.  */
+
+static void
+duplicate_blocks (basic_block *bb_list, int bb_count)
+{
+  edge orig_edge, exit_edge;
+
+  orig_edge = find_edge (bb_list[0], bb_list[1]);
+  exit_edge = find_edge (bb_list[bb_count-2], bb_list[bb_count-1]);
+  gimple_duplicate_sese_region (orig_edge, exit_edge, &bb_list[1], bb_count-2, NULL);
+  free_dominance_info (CDI_DOMINATORS);
+  update_ssa (TODO_update_ssa);
+  calculate_dominance_info (CDI_DOMINATORS);
+}
+
+/* Go through the paths saved in bbs_list_array and make copies of them.  */
+
+static void
+copy_switch_paths (void)
+{
+  int i;
+
+  /* Process each path in bbs_list_size.  */
+  for (i = 0; i < n_bbs_list; i++)
+    {
+    /* For each path in bbs_list_size loop through and copy each block in
+       the path (except the first on where the constant is assigned and
+       the final one where the switch statement goes to.  */
+
+    if (!single_pred_p (bbs_list_array[i][1]))
+      duplicate_blocks (bbs_list_array[i], bbs_list_size[i]);
+    }
+}
+
+static unsigned int
+do_switch_shortcut (void)
+{
+  int i;
+
+  n_bbs_list = 0;
+#ifdef COMPILE_AS_PLUGIN
+  max_insn_count = 100;
+  max_path_count = 20;
+#else
+  max_insn_count = PARAM_VALUE (PARAM_MAX_SWITCH_INSNS);
+  max_path_count = PARAM_VALUE (PARAM_MAX_SWITCH_PATHS);
+#endif
+  val_array = XNEWVEC (int, max_path_count);
+  bbs_list_size = XNEWVEC (int, max_path_count); 
+  bbs_list_array = XNEWVEC (basic_block *, max_path_count);
+  find_switch_shortcuts ();
+  copy_switch_paths ();
+  XDELETEVEC (val_array);
+  XDELETEVEC (bbs_list_size);
+  for (i = 0; i < n_bbs_list; i++)
+    XDELETEVEC (bbs_list_array[i]); 
+  XDELETEVEC (bbs_list_array);
+  return 0;
+}
+
+/* The pass gate. */
+
+static bool
+gate_switch_shortcut (void)
+{
+#ifdef COMPILE_AS_PLUGIN
+   return (optimize > 1);
+#else
+   return flag_tree_switch_shortcut;
+#endif
+}
+
+#ifndef COMPILE_AS_PLUGIN
+struct gimple_opt_pass pass_switch_shortcut =
+{
+ {
+  GIMPLE_PASS,
+  "switch_shortcut",			/* name */
+  OPTGROUP_NONE,                        /* optinfo_flags */
+  gate_switch_shortcut,			/* gate */
+  do_switch_shortcut,			/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_SWITCH_SHORTCUT,			/* tv_id */
+  PROP_cfg | PROP_ssa,			/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  TODO_cleanup_cfg | TODO_verify_all,	/* todo_flags_finish */
+ }
+};
+#else
+struct opt_pass pass_switch_shortcut =
+ {
+  GIMPLE_PASS,
+  "switch_shortcut",			/* name */
+  OPTGROUP_NONE,                        /* optinfo_flags */
+  gate_switch_shortcut,			/* gate */
+  do_switch_shortcut,			/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  (timevar_id_t) 0,					/* tv_id */
+  PROP_cfg | PROP_ssa,	                /* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  TODO_update_ssa 
+  | TODO_verify_ssa
+  | TODO_verify_stmts
+  | TODO_verify_flow			/* todo_flags_finish */
+};
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+             struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  pass_info.pass = &pass_switch_shortcut;
+  pass_info.reference_pass_name = "ifcombine";
+  pass_info.ref_pass_instance_number = 1;
+  pass_info.pos_op = PASS_POS_INSERT_AFTER;
+
+  register_callback (plugin_info->base_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &pass_info);
+  return 0;
+}
+#endif