
[RFA/RFC]: Follow-up on type-demotion pass ...

Message ID 317948312.36384059.1384803974853.JavaMail.root@redhat.com
State New

Commit Message

Kai Tietz Nov. 18, 2013, 7:46 p.m. UTC
Hello,

this is the second approach for the patch I sent some months ago.  Since no other approach has been shown in the meantime, I will continue with this one.
The idea of this pass is to introduce statement "normalization" and "denormalization".  The "normalization" happens in forward propagation, and the "denormalization" is done by this type-demotion pass.  It is therefore recommended that the type-demotion pass runs just before the forwprop pass, as otherwise we might end up with a "denormalized" representation, which in some cases performs worse.

The current pass supports rewriting addition/subtraction into unsigned-integer arithmetic to prevent overflow issues.  With this we transform operations like:
int a, b; ... (unsigned int) (a + b) ... into ((unsigned int) a) + ((unsigned int) b).
This transformation needs to be done after the code has first been analyzed, as otherwise we would destroy overflow detection by this rewriting.  We could do the same for MULT_EXPR if there weren't that late analysis of its "possible" overflow; therefore I disabled the transformation for it.
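
To make this concrete, here is a hand-written sketch (the function name and
the exact output are made up for illustration; the real transformation
happens on GIMPLE):

  short sum (short a, short b)
  {
    /* C promotes the operands to int, so the addition is done in int
       and the result is truncated back to short.  */
    return (short) ((int) a + (int) b);
  }

  /* After demotion the addition is expected to be carried out in the
     unsigned variant of the result type, which wraps safely:

       return (short) ((unsigned short) a + (unsigned short) b);  */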

Then it supports type-sinking/type-raising for shift operations.  This seems to be the most interesting part of this patch - besides the constant-propagation optimization seen in the ts-add-3.c testcase - as this kind of optimization is new to GCC.  The part in forward propagation is required to define the "normal" notation for such shift expressions.  As a nice side effect of these transformations, type over-widening no longer occurs for some operations in vectorized code.
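
For illustration, a rough sketch of the shift type-sinking for a function
like f1 from the new ts-shift-1.c test (hand-written; the exact GIMPLE will
differ):

  unsigned char f1 (char x)
  {
    unsigned char x1 = (unsigned char) x;
    return (x1 << 5);   /* the shift is promoted to int by the C front end */
  }

  /* GIMPLE before:
       x1  = (unsigned char) x;
       t1  = (int) x1;
       t2  = t1 << 5;
       ret = (unsigned char) t2;
     and after the demotion the shift is expected to stay in unsigned char:
       x1  = (unsigned char) x;
       ret = x1 << 5;  */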

Additionally, this pass catches useless type conversions and simplifies them.  Therefore I kept the type-cast reduction code within this pass for now.  Of course we could move it out to some other place, like tree-ssa-gimple, or even gimple itself might be a place for it.  But for the beginning I would like to keep all those pieces together.
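
A trivial hand-written case of that cast reduction is a chain like the
following (function name made up for illustration):

  short drop_casts (short s)
  {
    int t = (int) s;    /* widening cast */
    return (short) t;   /* (short) (int) s simplifies to plain s */
  }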


ChangeLog gcc:

	* Makefile.in (OBJS): Add tree-ssa-te.o.
	* passes.def: Add typedemote passes.
	* tree-pass.h (make_pass_demote1): Add prototype.
	(make_pass_demote2): Likewise.
	* tree-ssa-forwprop.c (simplify_shift): New function.
	(ssa_forward_propagate_and_combine): Use it.
	* tree-ssa-te.c: New file.

ChangeLog gcc/testsuite:

	* gcc.dg/tree-ssa/scev-cast.c: Adjust test.
	* gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.
	* gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-1.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4.c: Likewise.
	* gcc.dg/tree-ssa/ts-add-1.c: New test.
	* gcc.dg/tree-ssa/ts-add-2.c: New test.
	* gcc.dg/tree-ssa/ts-add-3.c: New test.
	* gcc.dg/tree-ssa/ts-shift-1.c: New test.

Tested on x86_64-unknown-linux-gnu, i686-pc-cygwin, and x86_64-pc-cygwin.

OK to apply?

Regards,
Kai

Comments

Joseph Myers Nov. 18, 2013, 10:55 p.m. UTC | #1
This is not a review, but:

* What do you need from rtl.h?  It's generally best for GIMPLE passes to 
avoid rtl.h where possible (and if you can avoid it, the next question is 
whether you can also avoid tm.h).

* Going just on the general description of the pass and not looking at the 
details: does this do any of the things that are done by shorten_binary_op 
or shorten_compare in c-common.c?  If so, do you plan followup changes to 
remove as premature optimizations whatever those functions do that can be 
done by this pass?
Jeff Law Nov. 19, 2013, 6:09 a.m. UTC | #2
On 11/18/13 15:55, Joseph S. Myers wrote:
> * Going just on the general description of the pass and not looking at the
> details: does this do any of the things that are done by shorten_binary_op
> or shorten_compare in c-common.c?  If so, do you plan followup changes to
> remove as premature optimizations whatever those functions do that can be
> done by this pass?
I don't recall if we had explicitly called out shorten_compare and/or 
shorten_binary_op in any of our discussions, but I can confirm that this 
is one step in trying to do something sensible in this space.

I'm happy to put shorten_compare on the hit-list.  I haven't had to fix 
anything in shorten_compare in about 15 years and I'd largely forgotten 
about it.

jeff
Kai Tietz Nov. 19, 2013, 10:28 a.m. UTC | #3
----- Original Message -----
> This is not a review, but:
> 
> * What do you need from rtl.h?  It's generally best for GIMPLE passes to
> avoid rtl.h where possible (and if you can avoid it, the next question is
> whether you can also avoid tm.h).

Yes, the rtl.h, tm.h and tm_p.h headers can be avoided for now.  The first two are only required by the tm_p.h header.  As this patch doesn't require target hooks, I removed those three includes from my local patch.
 
> * Going just on the general description of the pass and not looking at the
> details: does this do any of the things that are done by shorten_binary_op
> or shorten_compare in c-common.c?  If so, do you plan followup changes to
> remove as premature optimizations whatever those functions do that can be
> done by this pass?

I didn't have shorten_compare and/or shorten_binary_op explicitly on the radar, but after a closer look into them, yes.  This patch is a step into this area.

> 
> --
> Joseph S. Myers
> joseph@codesourcery.com
> 

Kai
Richard Biener Nov. 20, 2013, 9:42 a.m. UTC | #4
On Mon, Nov 18, 2013 at 8:46 PM, Kai Tietz <ktietz@redhat.com> wrote:
> Hello,
>
> this is the second approach for the patch I sent some months ago.  Since no other approach has been shown in the meantime, I will continue with this one.
> The idea of this pass is to introduce statement "normalization" and "denormalization".  The "normalization" happens in forward propagation, and the "denormalization" is done by this type-demotion pass.  It is therefore recommended that the type-demotion pass runs just before the forwprop pass, as otherwise we might end up with a "denormalized" representation, which in some cases performs worse.
>
> The current pass supports rewriting addition/subtraction into unsigned-integer arithmetic to prevent overflow issues.  With this we transform operations like:
> int a, b; ... (unsigned int) (a + b) ... into ((unsigned int) a) + ((unsigned int) b).
> This transformation needs to be done after the code has first been analyzed, as otherwise we would destroy overflow detection by this rewriting.  We could do the same for MULT_EXPR if there weren't that late analysis of its "possible" overflow; therefore I disabled the transformation for it.
>
> Then it supports type-sinking/type-raising for shift operations.  This seems to be the most interesting part of this patch - besides the constant-propagation optimization seen in the ts-add-3.c testcase - as this kind of optimization is new to GCC.  The part in forward propagation is required to define the "normal" notation for such shift expressions.  As a nice side effect of these transformations, type over-widening no longer occurs for some operations in vectorized code.
>
> Additionally, this pass catches useless type conversions and simplifies them.  Therefore I kept the type-cast reduction code within this pass for now.  Of course we could move it out to some other place, like tree-ssa-gimple, or even gimple itself might be a place for it.  But for the beginning I would like to keep all those pieces together.

This is not a review of the pass either.

1) Can you please split out the forwprop pieces and motivate them
separately?
2) How did you decide on pass placement?  It looks like you do
it twice (eh?) and before forwprop passes.  What's the difference
between the passes?
3) You don't add a flag to disable this pass, please add one.
4) You compute post-dominators but apparently only to specify
basic-block walk order - instead use one of the basic-block
ordering functions.

Richard.

>
> ChangeLog gcc
>
>         * Makefile.in: Add tree-ssa-te.c to build.
>         * passes.def: Add typedemote passes.
>         * tree-pass.h (make_pass_demote1): Add prototype.
>         (make_pass_demote2): Likewise.
>         * tree-ssa-forwprop.c (simplify_shift):  New function.
>         (ssa_forward_propagate_and_combine): Use it.
>         *  tree-ssa-te.c: New pass.
>
> Changelog gcc/testsuite:
>
>         * gcc.dg/tree-ssa/scev-cast.c: Adjust test.
>         * gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
>         * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
>         * gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
>         * gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-1.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-3.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-4.c: Likewise.
>         * gcc.dg/tree-ssa/ts-add-1.c: New test.
>         * gcc.dg/tree-ssa/ts-add-2.c: New test.
>         * gcc.dg/tree-ssa/ts-add-3.c: New test.
>         * gcc.dg/tree-ssa/ts-shift-1.c: New test.
>
> Tested on x86_64-unknown-linux-gnu, i686-pc-cygwin, and x86_64-pc-cygwin.
>
> OK to apply?
>
> Regards,
> Kai
>
>
> Index: gcc-trunk/gcc/Makefile.in
> ===================================================================
> --- gcc-trunk.orig/gcc/Makefile.in
> +++ gcc-trunk/gcc/Makefile.in
> @@ -1430,6 +1430,7 @@ OBJS = \
>         tree-ssa-pre.o \
>         tree-ssa-propagate.o \
>         tree-ssa-reassoc.o \
> +       tree-ssa-te.o \
>         tree-ssa-sccvn.o \
>         tree-ssa-sink.o \
>         tree-ssa-strlen.o \
> Index: gcc-trunk/gcc/passes.def
> ===================================================================
> --- gcc-trunk.orig/gcc/passes.def
> +++ gcc-trunk/gcc/passes.def
> @@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
>           NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_ccp);
> +         NEXT_PASS (pass_demote1);
>           /* After CCP we rewrite no longer addressed locals into SSA
>              form if possible.  */
>           NEXT_PASS (pass_forwprop);
> @@ -141,6 +142,7 @@ along with GCC; see the file COPYING3.
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
>        NEXT_PASS (pass_phiprop);
> +      NEXT_PASS (pass_demote2);
>        NEXT_PASS (pass_forwprop);
>        NEXT_PASS (pass_object_sizes);
>        /* pass_build_alias is a dummy pass that ensures that we
> Index: gcc-trunk/gcc/tree-pass.h
> ===================================================================
> --- gcc-trunk.orig/gcc/tree-pass.h
> +++ gcc-trunk/gcc/tree-pass.h
> @@ -430,6 +430,8 @@ extern gimple_opt_pass *make_pass_vrp (g
>  extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_reassoc (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_demote1 (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_demote2 (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_rebuild_cgraph_edges (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_remove_cgraph_callee_edges (gcc::context
>                                                               *ctxt);
> @@ -447,6 +449,8 @@ extern gimple_opt_pass *make_pass_split_
>  extern gimple_opt_pass *make_pass_feedback_split_functions (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_strength_reduction (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
> +extern struct gimple_opt_pass pass_type_demote1;
> +extern struct gimple_opt_pass pass_type_demote2;
>
>  /* IPA Passes */
>  extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
> Index: gcc-trunk/gcc/tree-ssa-forwprop.c
> ===================================================================
> --- gcc-trunk.orig/gcc/tree-ssa-forwprop.c
> +++ gcc-trunk/gcc/tree-ssa-forwprop.c
> @@ -548,6 +548,107 @@ forward_propagate_into_gimple_cond (gimp
>    return 0;
>  }
>
> +/* Try to simplify the shift statement ((TYPE1) X) CODE Y for
> +   integral types.
> +   Returns non-zero if the stmt was changed.  */
> +
> +static int
> +simplify_shift (enum tree_code code, gimple_stmt_iterator *gsi_p)
> +{
> +  gimple stmt = gsi_stmt (*gsi_p);
> +  gimple def, newop;
> +  tree op1, opx, op2, t_opx, tem;
> +  int ret;
> +
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
> +    return 0;
> +
> +  op1 = gimple_assign_rhs1 (stmt);
> +  if (TREE_CODE (op1) != SSA_NAME
> +      || !(def = SSA_NAME_DEF_STMT (op1))
> +      || !is_gimple_assign (def)
> +      || !gimple_assign_cast_p (def)
> +      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op1)
> +      || !has_single_use (op1))
> +    return 0;
> +
> +  opx = gimple_assign_rhs1 (def);
> +  if (TREE_CODE (opx) != SSA_NAME)
> +    return 0;
> +
> +  t_opx = TREE_TYPE (opx);
> +
> +  if (!INTEGRAL_TYPE_P (t_opx))
> +    return 0;
> +
> +  op2 = gimple_assign_rhs2 (stmt);
> +
> +  if (code == LSHIFT_EXPR)
> +    {
> +      if (TYPE_PRECISION (TREE_TYPE (op1))
> +         > TYPE_PRECISION (t_opx))
> +        return 0;
> +    }
> +  else if (code == RSHIFT_EXPR)
> +    {
> +      /* If type of OPX and OP1 are compatible, we don't need to check OP2
> +         for validity as we don't change range of operation.
> +         Otherwise we need to check that right-hand operand of shift-right
> +        doesn't exceed type-precision of inner operand.  */
> +      if (!useless_type_conversion_p (TREE_TYPE (op1), t_opx))
> +        {
> +         if (TREE_CODE (op2) != INTEGER_CST)
> +           return 0;
> +         if (integer_zerop (fold_binary (LT_EXPR, boolean_type_node, op2,
> +                                         build_int_cst (TREE_TYPE (op2),
> +                                                        TYPE_PRECISION (t_opx)))))
> +           return 0;
> +
> +          /* See if cast can be moved out of the shift-right operation.  */
> +         if (TYPE_PRECISION (TREE_TYPE (op1)) <= TYPE_PRECISION (t_opx)
> +             || (!TYPE_UNSIGNED (t_opx) && TYPE_UNSIGNED (TREE_TYPE (op1))))
> +           return 0;
> +        }
> +    }
> +  else
> +    return 0;
> +
> +  /* Do transformation ((T1) X) shift-code N -> (T1) (X shift-code N).  */
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "  Replaced ");
> +      print_gimple_stmt (dump_file, stmt, 0, 0);
> +      fprintf (dump_file, "    ");
> +      print_gimple_stmt (dump_file, def, 0, 0);
> +      fprintf (dump_file, "  by ");
> +    }
> +
> +  tem = make_ssa_name (t_opx, NULL);
> +  newop = gimple_build_assign_with_ops (code, tem, opx, op2);
> +  gimple_set_location (newop, gimple_location (stmt));
> +  gsi_insert_before (gsi_p, newop, GSI_SAME_STMT);
> +  gimple_assign_set_rhs_with_ops_1 (gsi_p, NOP_EXPR, tem, NULL_TREE, NULL_TREE);
> +
> +  if (gimple_location (def) != UNKNOWN_LOCATION)
> +    gimple_set_location (stmt, gimple_location (def));
> +
> +  stmt = gsi_stmt (*gsi_p);
> +  update_stmt (stmt);
> +
> +  if (TREE_CODE (op1) == SSA_NAME)
> +    ret = remove_prop_source_from_use (op1) ? 2 : 1;
> +  else
> +    ret = 1;
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "   by ");
> +      print_gimple_stmt (dump_file, stmt, 0, 0);
> +      fprintf (dump_file, "    ");
> +      print_gimple_stmt (dump_file, newop, 0, 0);
> +    }
> +
> +  return ret;
> +}
>
>  /* Propagate from the ssa name definition statements of COND_EXPR
>     in the rhs of statement STMT into the conditional if that simplifies it.
> @@ -3429,6 +3530,15 @@ ssa_forward_propagate_and_combine (void)
>                      || code == NEGATE_EXPR)
>                     && TREE_CODE (rhs1) == SSA_NAME)
>                   changed = simplify_not_neg_expr (&gsi);
> +               else if (code == LSHIFT_EXPR
> +                        || code == RSHIFT_EXPR)
> +                 {
> +                   int did_something;
> +                   did_something = simplify_shift (code, &gsi);
> +                   if (did_something == 2)
> +                     cfg_changed = true;
> +                   changed = did_something != 0;
> +                 }
>                 else if (code == COND_EXPR
>                          || code == VEC_COND_EXPR)
>                   {
> Index: gcc-trunk/gcc/tree-ssa-te.c
> ===================================================================
> --- /dev/null
> +++ gcc-trunk/gcc/tree-ssa-te.c
> @@ -0,0 +1,885 @@
> +/* Type-demotion for trees
> +   Copyright (C) 2013
> +   Free Software Foundation, Inc.
> +   Contributed by Kai Tietz <ktietz@redhat.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/*
> +   This pass tries to sink casts into statements.  So statements like
> +       (TYPE) (OP1 code OP2) are transformed into ((TYPE) OP1) code ((TYPE) OP2).
> +
> +   These transformations go in the opposite direction of the ones done in
> +   tree-ssa-forwprop.c.  We cannot do both directions within one pass, as
> +   that would cause cycles.  The "normal" representation is generated by
> +   forward propagation, which instead tries to raise casts out of statements.
> +   Therefore it is essential that the forward-propagation pass runs after this
> +   type-demotion pass, so that the "normal" representation is kept.
> +
> +   By this we are able to exploit two optimization opportunities.  The first is
> +   some constant propagation within binary expressions.  The second is the
> +   reduction of useless type conversions within statements.
> +
> +   That we actually perform the full transformation here, instead of doing a
> +   "virtual" expression optimization and just writing out the optimized
> +   expression, is due to complexity concerns: such an analysis might become
> +   too complex, so it is more efficient and simpler to transform directly.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "hash-table.h"
> +#include "tm.h"
> +#include "rtl.h"
> +#include "tm_p.h"
> +#include "tree.h"
> +#include "flags.h"
> +#include "basic-block.h"
> +#include "gimple-pretty-print.h"
> +#include "tree-inline.h"
> +#include "gimple.h"
> +#include "gimple-iterator.h"
> +#include "gimplify-me.h"
> +#include "gimple-ssa.h"
> +#include "tree-cfg.h"
> +#include "tree-phinodes.h"
> +#include "ssa-iterators.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-loop-niter.h"
> +#include "tree-ssa-loop.h"
> +#include "tree-dfa.h"
> +#include "tree-ssa.h"
> +#include "tree-iterator.h"
> +#include "tree-pass.h"
> +#include "alloc-pool.h"
> +#include "vec.h"
> +#include "langhooks.h"
> +#include "pointer-set.h"
> +#include "cfgloop.h"
> +#include "target.h"
> +#include "params.h"
> +#include "diagnostic-core.h"
> +
> +/* Operation kinds.  */
> +#define TDK_NO_EXPAND_VIA_UNSIGNED  1
> +#define TDK_NO_EXPAND_WIDENING     2
> +
> +/* Returns true if statement S1 dominates statement S2.  Like
> +   stmt_dominates_stmt_p, but uses stmt UIDs to optimize.  */
> +
> +static bool
> +uid_stmt_dominates_stmt_p (gimple s1, gimple s2)
> +{
> +  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
> +
> +  /* If bb1 is NULL, it should be a GIMPLE_NOP def stmt of an (D)
> +     SSA_NAME.  Assume it lives at the beginning of function and
> +     thus dominates everything.  */
> +  if (!bb1 || s1 == s2)
> +    return true;
> +
> +  /* If bb2 is NULL, it doesn't dominate any stmt with a bb.  */
> +  if (!bb2)
> +    return false;
> +
> +  if (bb1 == bb2)
> +    {
> +      /* PHIs in the same basic block are assumed to be
> +         executed all in parallel, if only one stmt is a PHI,
> +         it dominates the other stmt in the same basic block.  */
> +      if (gimple_code (s1) == GIMPLE_PHI)
> +        return true;
> +
> +      if (gimple_code (s2) == GIMPLE_PHI)
> +        return false;
> +
> +      /* gcc_assert (gimple_uid (s1) && gimple_uid (s2)); */
> +
> +      if (gimple_uid (s1) < gimple_uid (s2))
> +        return true;
> +
> +      if (gimple_uid (s1) > gimple_uid (s2))
> +        return false;
> +
> +      gimple_stmt_iterator gsi = gsi_for_stmt (s1);
> +      unsigned int uid = gimple_uid (s1);
> +      for (gsi_next (&gsi); !gsi_end_p (gsi); gsi_next (&gsi))
> +        {
> +          gimple s = gsi_stmt (gsi);
> +          if (gimple_uid (s) != uid)
> +            break;
> +          if (s == s2)
> +            return true;
> +        }
> +
> +      return false;
> +    }
> +
> +  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
> +}
> +
> +/* Insert STMT after INSERT_POINT.  */
> +
> +static void
> +insert_stmt_after (gimple stmt, gimple insert_point)
> +{
> +  gimple_stmt_iterator gsi;
> +  basic_block bb;
> +
> +  if (gimple_code (insert_point) == GIMPLE_PHI)
> +    bb = gimple_bb (insert_point);
> +  else if (!stmt_ends_bb_p (insert_point))
> +    {
> +      gsi = gsi_for_stmt (insert_point);
> +      gimple_set_uid (stmt, gimple_uid (insert_point));
> +      gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
> +      return;
> +    }
> +  else
> +    /* We assume INSERT_POINT is a SSA_NAME_DEF_STMT of some SSA_NAME,
> +       thus if it must end a basic block, it should be a call that can
> +       throw, or some assignment that can throw.  If it throws, the LHS
> +       of it will not be initialized though, so only valid places using
> +       the SSA_NAME should be dominated by the fallthru edge.  */
> +    bb = find_fallthru_edge (gimple_bb (insert_point)->succs)->dest;
> +  gsi = gsi_after_labels (bb);
> +  if (gsi_end_p (gsi))
> +    {
> +      gimple_stmt_iterator gsi2 = gsi_last_bb (bb);
> +      gimple_set_uid (stmt,
> +                      gsi_end_p (gsi2) ? 1 : gimple_uid (gsi_stmt (gsi2)));
> +    }
> +  else
> +    gimple_set_uid (stmt, gimple_uid (gsi_stmt (gsi)));
> +  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
> +}
> +
> +/* Returns the basic-block which would be used if a statement
> +   were inserted at INSERT_POINT.  */
> +
> +static basic_block
> +get_insert_stmt_after_bb (gimple insert_point)
> +{
> +  basic_block bb = gimple_bb (insert_point);
> +
> +  if (gimple_code (insert_point) == GIMPLE_PHI
> +      || !stmt_ends_bb_p (insert_point))
> +    return bb;
> +  /* We assume INSERT_POINT is a SSA_NAME_DEF_STMT of some SSA_NAME,
> +      thus if it must end a basic block, it should be a call that can
> +      throw, or some assignment that can throw.  If it throws, the LHS
> +      of it will not be initialized though, so only valid places using
> +      the SSA_NAME should be dominated by the fallthru edge.  */
> +  bb = find_fallthru_edge (bb->succs)->dest;
> +  return gimple_bb (gsi_stmt (gsi_after_labels (bb)));
> +}
> +
> +/*  Returns the basic-block which would be used on insertion
> +    of a statement having OP1 (and OP2).  */
> +
> +static basic_block
> +get_insertation_bb (tree op1, tree op2)
> +{
> +  gimple op1def = NULL, op2def = NULL, insert_point;
> +  gimple_stmt_iterator gsi;
> +
> +  /* Find an insertion place and insert.  */
> +  if (op1 && TREE_CODE (op1) == SSA_NAME)
> +    op1def = SSA_NAME_DEF_STMT (op1);
> +  if (op2 && TREE_CODE (op2) == SSA_NAME)
> +    op2def = SSA_NAME_DEF_STMT (op2);
> +
> +  if ((!op1def || gimple_nop_p (op1def))
> +      && (!op2def || gimple_nop_p (op2def)))
> +    {
> +      gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR));
> +      return gimple_bb (gsi_stmt (gsi));
> +    }
> +
> +  if ((!op1def || gimple_nop_p (op1def))
> +      || (op2def && !gimple_nop_p (op2def)
> +          && uid_stmt_dominates_stmt_p (op1def, op2def)))
> +    insert_point = op2def;
> +  else
> +    insert_point = op1def;
> +
> +  return get_insert_stmt_after_bb (insert_point);
> +}
> +
> +/* Determine if a new statement dependent on OP1 (and OP2)
> +   would be inserted at a place with loop-depth greater than
> +   the loop-depth of statement STMT.
> +   If so, TRUE is returned, otherwise FALSE.  */
> +
> +static bool
> +sink_into_loop_p (gimple stmt, tree op1, tree op2)
> +{
> +  basic_block bb;
> +  unsigned stmt_depth, ins_deepth;
> +
> +  if (op1 && TREE_CODE (op1) != SSA_NAME)
> +    op1 = NULL_TREE;
> +  if (op2 && TREE_CODE (op2) != SSA_NAME)
> +    op2 = NULL_TREE;
> +  if (!op1 && !op2)
> +    return false;
> +
> +  bb = get_insertation_bb (op1, op2);
> +  ins_deepth = loop_depth (bb->loop_father);
> +
> +  if (!ins_deepth)
> +    return false;
> +
> +  stmt_depth = loop_depth (gimple_bb (stmt)->loop_father);
> +
> +  return (stmt_depth < ins_deepth);
> +}
> +
> +/* Builds one assign-statement performing OP1 OPCODE OP2 or OPCODE OP1,
> +   creating a new SSA name of type TYPE for the result.  Places the statement
> +   after the definition of either OP1 or OP2.  Returns the new statement's LHS.  */
> +
> +static tree
> +build_gimple_assign_loc (tree type, tree op1, tree op2, enum tree_code opcode,
> +                        location_t loc)
> +{
> +  gimple op1def = NULL, op2def = NULL;
> +  gimple_stmt_iterator gsi;
> +  tree op;
> +  gimple sum;
> +
> +  /* Create the addition statement.  */
> +  op = make_ssa_name (type, NULL);
> +  sum = gimple_build_assign_with_ops (opcode, op, op1, op2);
> +
> +  /* Find an insertion place and insert.  */
> +  if (op1 && TREE_CODE (op1) == SSA_NAME)
> +    op1def = SSA_NAME_DEF_STMT (op1);
> +  if (op2 && TREE_CODE (op2) == SSA_NAME)
> +    op2def = SSA_NAME_DEF_STMT (op2);
> +  if ((!op1def || gimple_nop_p (op1def))
> +      && (!op2def || gimple_nop_p (op2def)))
> +    {
> +      gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR));
> +      if (gsi_end_p (gsi))
> +        {
> +          gimple_stmt_iterator gsi2
> +            = gsi_last_bb (single_succ (ENTRY_BLOCK_PTR));
> +          gimple_set_uid (sum,
> +                          gsi_end_p (gsi2) ? 1 : gimple_uid (gsi_stmt (gsi2)));
> +        }
> +      else
> +        gimple_set_uid (sum, gimple_uid (gsi_stmt (gsi)));
> +      gsi_insert_before (&gsi, sum, GSI_NEW_STMT);
> +    }
> +  else
> +    {
> +      gimple insert_point;
> +      if ((!op1def || gimple_nop_p (op1def))
> +           || (op2def && !gimple_nop_p (op2def)
> +               && uid_stmt_dominates_stmt_p (op1def, op2def)))
> +        insert_point = op2def;
> +      else
> +        insert_point = op1def;
> +      insert_stmt_after (sum, insert_point);
> +    }
> +
> +  gimple_set_location (sum, loc);
> +  update_stmt (sum);
> +
> +  return gimple_assign_lhs (sum);
> +}
> +
> +/* Release the assign-statement chain of VAR, if it has zero uses.
> +   This function recurses on up to two operands.  */
> +
> +static void
> +remove_stmt_chain (tree var)
> +{
> +  tree var2;
> +  gimple stmt;
> +  gimple_stmt_iterator gsi;
> +
> +  while (1)
> +    {
> +      if (TREE_CODE (var) != SSA_NAME || !has_zero_uses (var))
> +        return;
> +      stmt = SSA_NAME_DEF_STMT (var);
> +      if (!is_gimple_assign (stmt))
> +        return;
> +      var = gimple_assign_rhs1 (stmt);
> +      if (gimple_assign_rhs_class (stmt) == GIMPLE_BINARY_RHS)
> +       var2 = gimple_assign_rhs2 (stmt);
> +      else
> +       var2 = NULL_TREE;
> +      gsi = gsi_for_stmt (stmt);
> +      gsi_remove (&gsi, true);
> +      release_defs (stmt);
> +      if (var2)
> +       remove_stmt_chain (var2);
> +    }
> +}
> +
> +/* Generate a gimple cast (NEWTYPE) NAME and return the
> +   resulting tree (an SSA name or a folded constant).  */
> +
> +static tree
> +gen_cast (tree newtype, tree name)
> +{
> +  tree ret;
> +
> +  if (INTEGRAL_TYPE_P (newtype) && TREE_CODE (name) == INTEGER_CST)
> +    {
> +      bool saved_flag_wrapv = flag_wrapv;
> +      flag_wrapv = 1;
> +      ret = fold_convert (newtype, name);
> +      flag_wrapv = saved_flag_wrapv;
> +
> +      return ret;
> +    }
> +
> +  /* We can't propagate through our defining statement, so emit
> +     a conversion explicitly.  */
> +  if (useless_type_conversion_p (newtype, TREE_TYPE (name)))
> +    return name;
> +
> +  ret = build_gimple_assign_loc (newtype, name, NULL_TREE, NOP_EXPR,
> +                                EXPR_LOCATION (name));
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "Create cast ");
> +      print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (ret), 0, 0);
> +    }
> +
> +  return ret;
> +}
> +
> +/* Check if simplification of pattern '(TYPE1) (TYPE2) X' to '(TYPE1) X' is
> +   possible.  If so this function returns true, otherwise false.
> + */
> +
> +static bool
> +demote_cast_p (tree t1, tree t2, tree tx)
> +{
> +  if (!INTEGRAL_TYPE_P (t1)
> +      || !INTEGRAL_TYPE_P (t2))
> +    return false;
> +
> +  /* A boolean-cast of a boolean-type can be optimized away for any width.
> +     A useless type-conversion can be optimized away, too.  */
> +  if (useless_type_conversion_p (t1, t2)
> +      || useless_type_conversion_p (t2, tx)
> +      /* Boolean types might have different width.  */
> +      || (TREE_CODE (t2) == BOOLEAN_TYPE
> +         && (TREE_CODE (t1) == BOOLEAN_TYPE
> +             || TREE_CODE (tx) == BOOLEAN_TYPE)))
> +    return true;
> +
> +  /* Boolean types are special ...
> +     They aren't real integer-scalar values, and a cast to a boolean-type
> +     implies a comparison.  Therefore it isn't allowed to perform simplification
> +     for any of the remaining operations if T2 is of Boolean kind.  */
> +  if (TREE_CODE (t2) == BOOLEAN_TYPE
> +      || !INTEGRAL_TYPE_P (tx))
> +    return false;
> +
> +  /* For (T1) (T2) X, if T1 <= X && T2 >= X we can transform to
> +     (T1) X.  */
> +  if (TYPE_PRECISION (t1) <= TYPE_PRECISION (tx))
> +    return (TYPE_PRECISION (t1) <= TYPE_PRECISION (t2));
> +
> +  /* If T2 < X, then there is a truncation and we
> +     can't do anything.   */
> +  if (TYPE_PRECISION (t2) < TYPE_PRECISION (tx))
> +    return false;
> +
> +  /* For (T1) (T2) X, if T1 > X && T1 <= T2,  we can
> +     transform to (T1) X.  */
> +  if (TYPE_PRECISION (t1) <= TYPE_PRECISION (t2))
> +    return true;
> +
> +  /* For (T1) (T2) X, if T1 > X && T2 >= X && SIGN(X) == SIGN(T2),
> +     we can transform to (T1) X.  */
> +  return (TYPE_UNSIGNED (t2) == TYPE_UNSIGNED (tx));
> +}
> +
> +
> +/* Try to sink the cast expression (TYPE1) (TYPE2) X to (TYPE1) X, if
> +   TYPE1, TYPE2, and the type of X are of integral kind and meet the
> +   conditions checked in demote_cast_p.
> +   If the cast was sunk, TRUE is returned, otherwise FALSE.  */
> +
> +static bool
> +demote_cast (gimple def, tree name, tree c, gimple def2)
> +{
> +  tree x;
> +
> +  if (!gimple_assign_cast_p (def2))
> +    return false;
> +
> +  x = gimple_assign_rhs1 (def2);
> +
> +  if (TREE_CODE (x) != SSA_NAME
> +      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (x)
> +      || !demote_cast_p (TREE_TYPE (name), TREE_TYPE (c), TREE_TYPE (x)))
> +    return false;
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "Remove cast ");
> +      print_gimple_stmt (dump_file, def, 0, 0);
> +      fprintf (dump_file, "  ");
> +      print_gimple_stmt (dump_file, def2, 0, 0);
> +    }
> +
> +  gimple_assign_set_rhs1 (def, x);
> +
> +  if (useless_type_conversion_p (TREE_TYPE (name), TREE_TYPE (x)))
> +    gimple_assign_set_rhs_code (def, SSA_NAME);
> +
> +  if (gimple_location (def2) != UNKNOWN_LOCATION
> +      && gimple_location (def) == UNKNOWN_LOCATION)
> +    gimple_set_location (def, gimple_location (def2));
> +
> +  update_stmt (def);
> +  remove_stmt_chain (c);
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, " by cast ");
> +      print_gimple_stmt (dump_file, def, 0, 0);
> +    }
> +
> +  return true;
> +}
> +
> +/* Helper routine to check if OP is a gimple cast statement of
> +   an integral value.
> +   On success the gimple statement referenced by PDEF is set,
> +   if PDEF isn't zero.
> +   On success the right-hand operand of the cast statement is stored
> +   in PINNER, if PINNER isn't zero.
> +   On success TRUE is returned, otherwise FALSE.  */
> +
> +static bool
> +is_integral_cast (tree op, gimple *pdef, tree *pinner)
> +{
> +  gimple def = NULL;
> +  tree inner;
> +
> +  if (TREE_CODE (op) != SSA_NAME
> +      || !(def = SSA_NAME_DEF_STMT (op))
> +      || !is_gimple_assign (def)
> +      || !gimple_assign_cast_p (def)
> +      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> +    return false;
> +
> +  inner = gimple_assign_rhs1 (def);
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (inner)))
> +    return false;
> +
> +  if (pdef)
> +    *pdef = def;
> +  if (pinner)
> +    *pinner = inner;
> +  return true;
> +}
> +
> +/* Try to demote a cast into an assignment.  We try to do this for
> +   MULT_EXPR, PLUS_EXPR, MINUS_EXPR, LROTATE_EXPR, RROTATE_EXPR,
> +   LSHIFT_EXPR, RSHIFT_EXPR, BIT_IOR_EXPR, BIT_XOR_EXPR, BIT_AND_EXPR,
> +   MIN_EXPR, and MAX_EXPR.
> +   On success the function modifies the incoming STMT and stores the new
> +   statement in PSTMT.  TRUE is returned on success, otherwise FALSE.  */
> +
> +static bool
> +demote_into (int flags, gimple stmt, tree lhs, tree r, gimple r_stmt, gimple *pstmt)
> +{
> +  gimple_stmt_iterator gsi;
> +  enum tree_code code;
> +  tree l_type = TREE_TYPE (lhs);
> +  tree r_type = TREE_TYPE (r);
> +  tree a1, a2, i1 = NULL_TREE;
> +  gimple def = NULL;
> +
> +  /* We can't do type-sinking if only one side is Boolean-typed.  */
> +  if ((TREE_CODE (r_type) == BOOLEAN_TYPE
> +       || TREE_CODE (l_type) == BOOLEAN_TYPE)
> +      && TREE_CODE (r_type) != TREE_CODE (l_type))
> +    return false;
> +
> +  if ((flags & TDK_NO_EXPAND_WIDENING) != 0
> +      && TYPE_PRECISION (l_type) > TYPE_PRECISION (r_type))
> +    return false;
> +
> +  code = gimple_assign_rhs_code (r_stmt);
> +
> +  switch (code)
> +    {
> +    case MULT_EXPR:
> +      /* We don't want to sink casts into multiply-expressions
> +        if the outer type and inner type don't have the same signedness.  */
> +      if (TYPE_UNSIGNED (l_type) != TYPE_UNSIGNED (r_type))
> +       return false;
> +
> +      /* If inner-type isn't unsigned, we don't want to convert.  */
> +      if (!TYPE_UNSIGNED (l_type)
> +          && !useless_type_conversion_p (l_type, r_type))
> +       return false;
> +      /* Fall through.  */
> +
> +    case PLUS_EXPR: case MINUS_EXPR:
> +      a1 = gimple_assign_rhs1 (r_stmt);
> +      a2 = gimple_assign_rhs2 (r_stmt);
> +
> +      if (TYPE_PRECISION (l_type) < TYPE_PRECISION (r_type)
> +         && (!TYPE_UNSIGNED (l_type) && !flag_wrapv))
> +       {
> +         tree uns;
> +
> +         if ((flags & TDK_NO_EXPAND_VIA_UNSIGNED) != 0
> +             || sink_into_loop_p (stmt, a1, a2))
> +           return false;
> +
> +         uns = unsigned_type_for (l_type);
> +         a1 = gen_cast (uns, a1);
> +         a2 = gen_cast (uns, a2);
> +         a1 = build_gimple_assign_loc (uns, a1, a2, code,
> +                                       gimple_location (r_stmt));
> +         code = NOP_EXPR;
> +         a2 = NULL_TREE;
> +         break;
> +       }
> +      else if ((flags & TDK_NO_EXPAND_VIA_UNSIGNED) == 0
> +              && TYPE_PRECISION (l_type) < TYPE_PRECISION (r_type)
> +              && (TYPE_UNSIGNED (l_type) || flag_wrapv))
> +       {
> +         if (sink_into_loop_p (stmt, a1, a2))
> +           return false;
> +         a1 = gen_cast (l_type, a1);
> +         a2 = gen_cast (l_type, a2);
> +         break;
> +       }
> +      else if (TYPE_PRECISION (l_type) == TYPE_PRECISION (r_type))
> +       {
> +         if (TYPE_UNSIGNED (r_type) == TYPE_UNSIGNED (l_type)
> +             || !TYPE_UNSIGNED (r_type))
> +           {
> +             a1 = gen_cast (l_type, a1);
> +             a2 = gen_cast (l_type, a2);
> +             break;
> +           }
> +       }
> +      return false;
> +
> +    case LROTATE_EXPR: case RROTATE_EXPR:
> +      /* Any cast with the same precision as the inner left-hand value
> +        can be sunk.  */
> +      if (TYPE_PRECISION (l_type) == TYPE_PRECISION (r_type))
> +       {
> +         a1 = gimple_assign_rhs1 (r_stmt);
> +         if (!useless_type_conversion_p (l_type, r_type))
> +           a1 = gen_cast (l_type, a1);
> +         a2 = gimple_assign_rhs2 (r_stmt);
> +         break;
> +       }
> +
> +      return false;
> +
> +    case RSHIFT_EXPR:
> +      /* For the shift-right operation (T1) (X >> N) we can sink the type into
> +        the shift statement if the types of X and T1 are compatible, or if the
> +        precision of T1 is greater than the precision of the inner operand and
> +         T1 is signed or the inner operand is unsigned.
> +        As this doesn't lower the precision of the resulting type, we don't
> +        need to check N against the bounds of type T1.  */
> +
> +      if (useless_type_conversion_p (l_type, r_type))
> +       {
> +         a1 = gimple_assign_rhs1 (r_stmt);
> +         a2 = gimple_assign_rhs2 (r_stmt);
> +         break;
> +       }
> +
> +      if (TYPE_PRECISION (l_type) > TYPE_PRECISION (r_type)
> +         && (!TYPE_UNSIGNED (l_type) || TYPE_UNSIGNED (r_type)))
> +       {
> +         a1 = gimple_assign_rhs1 (r_stmt);
> +         /* We do this transformation only if the left-hand operand
> +            is a cast operation.  It is valid for any such construct,
> +            but it might introduce a non-reversible pattern.
> +            TODO: Add the reverse pattern to forward propagation.  Then
> +            this check can additionally verify the validity of N on reversal.  */
> +         if (is_integral_cast (a1, &def, &i1)
> +             && demote_cast_p (l_type, r_type, TREE_TYPE (i1)))
> +           {
> +             a1 = gen_cast (l_type, a1);
> +             a2 = gimple_assign_rhs2 (r_stmt);
> +             break;
> +           }
> +       }
> +
> +      return false;
> +
> +    case LSHIFT_EXPR:
> +      /* For the shift-left operation '(T1) (X << N)', we can sink T1 into the
> +        shift-left statement if X and T1 are of compatible type, or if N
> +        doesn't exceed the value range of type T1 and the precision of T1 is
> +        smaller than or equal to the precision of X.  */
> +      /* If the outer type's precision is wider than the inner one's, the inner
> +        shift truncates bits that the wider shift would keep; we can't handle that.  */
> +      if (TYPE_PRECISION (l_type) > TYPE_PRECISION (r_type))
> +       return false;
> +
> +      /* Trivial case.  */
> +      if (useless_type_conversion_p (l_type, r_type))
> +       {
> +         a1 = gimple_assign_rhs1 (r_stmt);
> +         a2 = gimple_assign_rhs2 (r_stmt);
> +         break;
> +       }
> +
> +      a1 = gimple_assign_rhs1 (r_stmt);
> +      a2 = gimple_assign_rhs2 (r_stmt);
> +
> +      /* If A2 isn't an integral constant, or A1 isn't a cast-assignment,
> +        or the precision of L_TYPE is greater than the precision of R_TYPE,
> +        then we can't do anything.  */
> +      if (TREE_CODE (a2) != INTEGER_CST
> +         || !is_integral_cast (a1, &def, &i1))
> +       return false;
> +
> +      /* Check if right-hand constant operand A2 is overflowing
> +        precision of L_TYPE on shift.  */
> +      if (integer_zerop (fold_binary (LT_EXPR, boolean_type_node, a2,
> +                        build_int_cst (TREE_TYPE (a2),
> +                                       TYPE_PRECISION (l_type)))))
> +       return false;
> +
> +      /* We only sink the cast if the pattern can be simplified further.
> +         TODO: Add the reverse pattern to the forward-propagation pass.  Then
> +        this check can be removed.  */
> +      if (!demote_cast_p (l_type, r_type, TREE_TYPE (i1)))
> +       return false;
> +
> +      /* Sink cast into shift-statement.  */
> +      a1 = gen_cast (l_type, a1);
> +
> +      break;
> +
> +    case MIN_EXPR: case MAX_EXPR:
> +
> +      /* Truncation can't be handled.  */
> +      if (TYPE_PRECISION (l_type) < TYPE_PRECISION (r_type))
> +       return false;
> +
> +      if (TYPE_UNSIGNED (l_type) != TYPE_UNSIGNED (r_type))
> +       {
> +         /* Make sure we don't introduce an implicit sign change here.  */
> +         if (!TYPE_UNSIGNED (r_type)
> +             || TYPE_PRECISION (l_type) == TYPE_PRECISION (r_type))
> +           return false;
> +       }
> +
> +      a1 = gimple_assign_rhs1 (r_stmt);
> +      a2 = gimple_assign_rhs2 (r_stmt);
> +
> +      if (useless_type_conversion_p (l_type, r_type))
> +       break;
> +
> +      a1 = gen_cast (l_type, a1);
> +      a2 = gen_cast (l_type, a2);
> +
> +      break;
> +
> +    case BIT_AND_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR:
> +      a1 = gimple_assign_rhs1 (r_stmt);
> +      a2 = gimple_assign_rhs2 (r_stmt);
> +
> +      if (useless_type_conversion_p (l_type, r_type))
> +       break;
> +      a1 = gen_cast (l_type, a1);
> +      a2 = gen_cast (l_type, a2);
> +      break;
> +    default:
> +      return false;
> +    }
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "Sink cast ");
> +      print_gimple_stmt (dump_file, stmt, 0, 0);
> +      fprintf (dump_file, " of stmt ");
> +      print_gimple_stmt (dump_file, r_stmt, 0, 0);
> +    }
> +
> +  if (gimple_location (r_stmt) != UNKNOWN_LOCATION
> +      && gimple_location (stmt) == UNKNOWN_LOCATION)
> +    gimple_set_location (stmt, gimple_location (r_stmt));
> +
> +  gsi = gsi_for_stmt (stmt);
> +  gimple_assign_set_rhs_with_ops (&gsi, code, a1, a2);
> +  stmt = gsi_stmt (gsi);
> +  update_stmt (stmt);
> +
> +  remove_stmt_chain (r);
> +  *pstmt = stmt;
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, " into ");
> +      print_gimple_stmt (dump_file, stmt, 0, 0);
> +    }
> +
> +  return true;
> +}
> +
> +/* Process each statement of BB, then recurse into its post-dominator sons.  */
> +
> +static void
> +execute_type_demote_int (int flags, basic_block bb)
> +{
> +  gimple_stmt_iterator gsi;
> +  basic_block son;
> +
> +  for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
> +    {
> +      gimple stmt, def2;
> +      tree lhs, inner;
> +
> +      stmt = gsi_stmt (gsi);
> +      gsi_prev (&gsi);
> +
> +      if (!is_gimple_assign (stmt)
> +         || (lhs = gimple_assign_lhs (stmt)) == NULL_TREE
> +         || TREE_CODE (lhs) != SSA_NAME
> +         || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> +         || !gimple_assign_cast_p (stmt))
> +       continue;
> +
> +      inner = gimple_assign_rhs1 (stmt);
> +
> +      if (TREE_CODE (inner) != SSA_NAME
> +         || !INTEGRAL_TYPE_P (TREE_TYPE (inner))
> +         || !(def2 = SSA_NAME_DEF_STMT (inner))
> +          || !is_gimple_assign (def2)
> +         || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (inner)
> +         || !has_single_use (inner))
> +       continue;
> +
> +      if (demote_cast (stmt, lhs, inner, def2))
> +       gsi = gsi_for_stmt (stmt);
> +      else if (demote_into (flags, stmt, lhs, inner, def2, &stmt))
> +        {
> +         gsi = gsi_for_stmt (stmt);
> +         gsi_prev (&gsi);
> +        }
> +    }
> +
> +  for (son = first_dom_son (CDI_POST_DOMINATORS, bb);
> +       son;
> +       son = next_dom_son (CDI_POST_DOMINATORS, son))
> +    execute_type_demote_int (flags, son);
> +
> +}
> +
> +/* Entry point for type demotion.  FLAG controls the mode of operation of this pass.  */
> +
> +static unsigned int
> +execute_type_demote (int flag)
> +{
> +  /* We calculate CDI_DOMINATORS dominance info to make
> +     sure we can insert newly generated statements at the proper
> +     basic-block.  */
> +  calculate_dominance_info (CDI_DOMINATORS);
> +  calculate_dominance_info (CDI_POST_DOMINATORS);
> +
> +  execute_type_demote_int (flag, EXIT_BLOCK_PTR);
> +
> +  free_dominance_info (CDI_POST_DOMINATORS);
> +
> +  return 0;
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_demote1 =
> +{
> +  GIMPLE_PASS,
> +  "typedemote1",                      /* name */
> +  OPTGROUP_NONE,
> +  false, /* has gate */
> +  true, /* has execute */
> +  TV_NONE, /* tv_id */
> +  ( PROP_cfg | PROP_ssa ), /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  (TODO_verify_ssa | TODO_update_ssa), /* todo_flags_finish */
> +};
> +
> +const pass_data pass_data_demote2 =
> +{
> +  GIMPLE_PASS,
> +  "typedemote2",                      /* name */
> +  OPTGROUP_NONE,
> +  false, /* has gate */
> +  true, /* has execute */
> +  TV_NONE, /* tv_id */
> +  ( PROP_cfg | PROP_ssa ), /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  (TODO_verify_ssa | TODO_update_ssa), /* todo_flags_finish */
> +};
> +
> +class pass_demote1 : public gimple_opt_pass
> +{
> +public:
> +  pass_demote1 (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_demote1, ctxt)
> +  {}
> +  /* opt_pass methods: */
> +  opt_pass * clone() { return new pass_demote1 (m_ctxt); }
> +  unsigned int execute () { return execute_type_demote (TDK_NO_EXPAND_VIA_UNSIGNED); }
> +}; // class pass_demote1
> +
> +class pass_demote2 : public gimple_opt_pass
> +{
> +public:
> +  pass_demote2 (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_demote2, ctxt)
> +  {}
> +  /* opt_pass methods: */
> +  opt_pass * clone() { return new pass_demote2 (m_ctxt); }
> +  unsigned int execute () { return execute_type_demote (0); }
> +}; // class pass_demote2
> +
> +}
> +
> +gimple_opt_pass *
> +make_pass_demote1 (gcc::context *ctxt)
> +{
> +  return new pass_demote1 (ctxt);
> +}
> +
> +gimple_opt_pass *
> +make_pass_demote2 (gcc::context *ctxt)
> +{
> +  return new pass_demote2 (ctxt);
> +}
> Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
> @@ -22,7 +22,7 @@ void tst(void)
>      blau ((unsigned char) i);
>  }
>
> -/* { dg-final { scan-tree-dump-times "& 255" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "& 255" 2 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "= \\(signed char\\)" 1 "optimized" } } */
>
>  /* { dg-final { cleanup-tree-dump "optimized" } } */
> Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
> @@ -14,5 +14,5 @@ foo (long a)
>    h = c;
>  }
>
> -/* { dg-final { scan-tree-dump "Replaced \\\(short int\\\) c_.*with b_" "fre1" } } */
> +/* { dg-final { scan-tree-dump "Replaced \\\(short int\\\) c_.*with g\\\." "fre1" } } */
>  /* { dg-final { cleanup-tree-dump "fre1" } } */
> Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
> @@ -18,5 +18,5 @@ foo (int a, int b)
>    return aa + bb;
>  }
>
> -/* { dg-final { scan-tree-dump "Replaced \\\(int\\\) aa_.*with a_" "fre1" } } */
> +/* { dg-final { scan-tree-dump-times "Replaced \\\(int\\\) aa_.*with a_" 0 "fre1" } } */
>  /* { dg-final { cleanup-tree-dump "fre1" } } */
> Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
> @@ -11,5 +11,5 @@ char bar(char f)
>          return wrap(f);
>  }
>
> -/* { dg-final { scan-tree-dump "Replaced \\\(char\\\) .*with " "fre1" } } */
> +/* { dg-final { scan-tree-dump-times "Replaced \\\(char\\\) .*with " 0 "fre1" } } */
>  /* { dg-final { cleanup-tree-dump "fre1" } } */
> Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
> @@ -10,5 +10,5 @@ bar (unsigned int t)
>    return a == t;
>  }
>
> -/* { dg-final { scan-tree-dump "Replaced \\\(unsigned int\\\) a_.*with t_" "fre1" } } */
> +/* { dg-final { scan-tree-dump-times "Replaced \\\(unsigned int\\\) a_.*with t_" 0 "fre1" } } */
>  /* { dg-final { cleanup-tree-dump "fre1" } } */
> Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
> @@ -58,9 +58,8 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 4 "vect" { target vect_widen_shift } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
>
> Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
> @@ -58,10 +58,8 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } } } } */
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 8 "vect" { target vect_sizes_32B_16B } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 3 "vect" { target vect_widen_shift } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
>
> Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
> @@ -58,7 +58,7 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
>
> Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-3.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3.c
> @@ -58,7 +58,7 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump "vect_recog_over_widening_pattern: detected" "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
>
> Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
> ===================================================================
> --- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
> +++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
> @@ -62,9 +62,8 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
> -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */

Patch

Index: gcc-trunk/gcc/Makefile.in
===================================================================
--- gcc-trunk.orig/gcc/Makefile.in
+++ gcc-trunk/gcc/Makefile.in
@@ -1430,6 +1430,7 @@  OBJS = \
 	tree-ssa-pre.o \
 	tree-ssa-propagate.o \
 	tree-ssa-reassoc.o \
+	tree-ssa-te.o \
 	tree-ssa-sccvn.o \
 	tree-ssa-sink.o \
 	tree-ssa-strlen.o \
Index: gcc-trunk/gcc/passes.def
===================================================================
--- gcc-trunk.orig/gcc/passes.def
+++ gcc-trunk/gcc/passes.def
@@ -64,6 +64,7 @@  along with GCC; see the file COPYING3.
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
 	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_ccp);
+	  NEXT_PASS (pass_demote1);
 	  /* After CCP we rewrite no longer addressed locals into SSA
 	     form if possible.  */
 	  NEXT_PASS (pass_forwprop);
@@ -141,6 +142,7 @@  along with GCC; see the file COPYING3.
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
       NEXT_PASS (pass_phiprop);
+      NEXT_PASS (pass_demote2);
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_object_sizes);
       /* pass_build_alias is a dummy pass that ensures that we
Index: gcc-trunk/gcc/tree-pass.h
===================================================================
--- gcc-trunk.orig/gcc/tree-pass.h
+++ gcc-trunk/gcc/tree-pass.h
@@ -430,6 +430,8 @@  extern gimple_opt_pass *make_pass_vrp (g
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_reassoc (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_demote1 (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_demote2 (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_rebuild_cgraph_edges (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_remove_cgraph_callee_edges (gcc::context
 							      *ctxt);
@@ -447,6 +449,8 @@  extern gimple_opt_pass *make_pass_split_
 extern gimple_opt_pass *make_pass_feedback_split_functions (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_strength_reduction (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
+extern struct gimple_opt_pass pass_type_demote1;
+extern struct gimple_opt_pass pass_type_demote2;
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
Index: gcc-trunk/gcc/tree-ssa-forwprop.c
===================================================================
--- gcc-trunk.orig/gcc/tree-ssa-forwprop.c
+++ gcc-trunk/gcc/tree-ssa-forwprop.c
@@ -548,6 +548,107 @@  forward_propagate_into_gimple_cond (gimp
   return 0;
 }
 
+/* Try to simplify the shift statement ((TYPE1) X) CODE Y for
+   integral types.
+   Returns non-zero if the stmt was changed.  */
+
+static int
+simplify_shift (enum tree_code code, gimple_stmt_iterator *gsi_p)
+{
+  gimple stmt = gsi_stmt (*gsi_p);
+  gimple def, newop;
+  tree op1, opx, op2, t_opx, tem;
+  int ret;
+
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
+    return 0;
+
+  op1 = gimple_assign_rhs1 (stmt);
+  if (TREE_CODE (op1) != SSA_NAME
+      || !(def = SSA_NAME_DEF_STMT (op1))
+      || !is_gimple_assign (def)
+      || !gimple_assign_cast_p (def)
+      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op1)
+      || !has_single_use (op1))
+    return 0;
+
+  opx = gimple_assign_rhs1 (def);
+  if (TREE_CODE (opx) != SSA_NAME)
+    return 0;
+
+  t_opx = TREE_TYPE (opx);
+
+  if (!INTEGRAL_TYPE_P (t_opx))
+    return 0;
+
+  op2 = gimple_assign_rhs2 (stmt);
+
+  if (code == LSHIFT_EXPR)
+    {
+      if (TYPE_PRECISION (TREE_TYPE (op1))
+	  > TYPE_PRECISION (t_opx))
+        return 0;
+    }
+  else if (code == RSHIFT_EXPR)
+    {
+      /* If the types of OPX and OP1 are compatible, we don't need to check
+         OP2 for validity, as we don't change the range of the operation.
+         Otherwise we need to check that the right-hand operand of the
+         shift-right doesn't exceed the type-precision of the inner operand.  */
+      if (!useless_type_conversion_p (TREE_TYPE (op1), t_opx))
+        {
+	  if (TREE_CODE (op2) != INTEGER_CST)
+	    return 0;
+	  if (integer_zerop (fold_binary (LT_EXPR, boolean_type_node, op2,
+	                                  build_int_cst (TREE_TYPE (op2),
+					                 TYPE_PRECISION (t_opx)))))
+	    return 0;
+
+          /* See if cast can be moved out of the shift-right operation.  */
+	  if (TYPE_PRECISION (TREE_TYPE (op1)) <= TYPE_PRECISION (t_opx)
+	      || (!TYPE_UNSIGNED (t_opx) && TYPE_UNSIGNED (TREE_TYPE (op1))))
+	    return 0;
+        }
+    }
+  else
+    return 0;
+
+  /* Do transformation ((T1) X) shift-code N -> (T1) (X shift-code N).  */
+  if (dump_file)
+    {
+      fprintf (dump_file, "  Replaced ");
+      print_gimple_stmt (dump_file, stmt, 0, 0);
+      fprintf (dump_file, "    ");
+      print_gimple_stmt (dump_file, def, 0, 0);
+      fprintf (dump_file, "  by ");
+    }
+
+  tem = make_ssa_name (t_opx, NULL);
+  newop = gimple_build_assign_with_ops (code, tem, opx, op2);
+  gimple_set_location (newop, gimple_location (stmt));
+  gsi_insert_before (gsi_p, newop, GSI_SAME_STMT);
+  gimple_assign_set_rhs_with_ops_1 (gsi_p, NOP_EXPR, tem, NULL_TREE, NULL_TREE);
+
+  if (gimple_location (def) != UNKNOWN_LOCATION)
+    gimple_set_location (stmt, gimple_location (def));
+
+  stmt = gsi_stmt (*gsi_p);
+  update_stmt (stmt);
+
+  if (TREE_CODE (op1) == SSA_NAME)
+    ret = remove_prop_source_from_use (op1) ? 2 : 1;
+  else
+    ret = 1;
+  if (dump_file)
+    {
+      fprintf (dump_file, "   by ");
+      print_gimple_stmt (dump_file, stmt, 0, 0);
+      fprintf (dump_file, "    ");
+      print_gimple_stmt (dump_file, newop, 0, 0);
+    }
+
+  return ret;
+}
 
 /* Propagate from the ssa name definition statements of COND_EXPR
    in the rhs of statement STMT into the conditional if that simplifies it.
@@ -3429,6 +3530,15 @@  ssa_forward_propagate_and_combine (void)
 		     || code == NEGATE_EXPR)
 		    && TREE_CODE (rhs1) == SSA_NAME)
 		  changed = simplify_not_neg_expr (&gsi);
+		else if (code == LSHIFT_EXPR
+		         || code == RSHIFT_EXPR)
+		  {
+		    int did_something;
+		    did_something = simplify_shift (code, &gsi);
+		    if (did_something == 2)
+		      cfg_changed = true;
+		    changed = did_something != 0;
+		  }
 		else if (code == COND_EXPR
 			 || code == VEC_COND_EXPR)
 		  {
Index: gcc-trunk/gcc/tree-ssa-te.c
===================================================================
--- /dev/null
+++ gcc-trunk/gcc/tree-ssa-te.c
@@ -0,0 +1,885 @@ 
+/* Type-demotion for trees
+   Copyright (C) 2013
+   Free Software Foundation, Inc.
+   Contributed by Kai Tietz <ktietz@redhat.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/*
+   This pass tries to sink casts into statements.  Statements like
+       (TYPE) (OP1 code OP2) are transformed into ((TYPE) OP1) code ((TYPE) OP2).
+
+   These transformations go in the opposite direction of the ones done in
+   tree-ssa-forwprop, which instead tries to raise casts out of statements;
+   doing both within a single pass would lead to cycles.  The "normal"
+   representation is the one generated by forward-propagation.
+   Therefore it is essential that the forward-propagation pass runs after this
+   type-demotion pass, so that the "normal" representation is kept.
+
+   This enables two kinds of optimization.  The first is some constant
+   propagation within binary expressions.  The second is the reduction of
+   useless type-conversions within statements.
+
+   We actually perform the full transformation here instead of doing a
+   "virtual" expression optimization and only writing out the optimized
+   expression, because such analysis would be too complex.  It is more
+   efficient and simpler to perform the transformation directly.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "hash-table.h"
+#include "tm.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "tree.h"
+#include "flags.h"
+#include "basic-block.h"
+#include "gimple-pretty-print.h"
+#include "tree-inline.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimplify-me.h"
+#include "gimple-ssa.h"
+#include "tree-cfg.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-loop-niter.h"
+#include "tree-ssa-loop.h"
+#include "tree-dfa.h"
+#include "tree-ssa.h"
+#include "tree-iterator.h"
+#include "tree-pass.h"
+#include "alloc-pool.h"
+#include "vec.h"
+#include "langhooks.h"
+#include "pointer-set.h"
+#include "cfgloop.h"
+#include "target.h"
+#include "params.h"
+#include "diagnostic-core.h"
+
+/* Flags controlling the mode of operation of the pass.  */
+#define TDK_NO_EXPAND_VIA_UNSIGNED  1
+#define TDK_NO_EXPAND_WIDENING	    2
+
+/* Returns true if statement S1 dominates statement S2.  Like
+   stmt_dominates_stmt_p, but uses stmt UIDs to optimize.  */
+
+static bool
+uid_stmt_dominates_stmt_p (gimple s1, gimple s2)
+{
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  /* If bb1 is NULL, it should be a GIMPLE_NOP def stmt of an (D)
+     SSA_NAME.  Assume it lives at the beginning of function and
+     thus dominates everything.  */
+  if (!bb1 || s1 == s2)
+    return true;
+
+  /* If bb2 is NULL, it doesn't dominate any stmt with a bb.  */
+  if (!bb2)
+    return false;
+
+  if (bb1 == bb2)
+    {
+      /* PHIs in the same basic block are assumed to be
+         executed all in parallel, if only one stmt is a PHI,
+         it dominates the other stmt in the same basic block.  */
+      if (gimple_code (s1) == GIMPLE_PHI)
+        return true;
+
+      if (gimple_code (s2) == GIMPLE_PHI)
+        return false;
+
+      /* gcc_assert (gimple_uid (s1) && gimple_uid (s2)); */
+
+      if (gimple_uid (s1) < gimple_uid (s2))
+        return true;
+
+      if (gimple_uid (s1) > gimple_uid (s2))
+        return false;
+
+      gimple_stmt_iterator gsi = gsi_for_stmt (s1);
+      unsigned int uid = gimple_uid (s1);
+      for (gsi_next (&gsi); !gsi_end_p (gsi); gsi_next (&gsi))
+        {
+          gimple s = gsi_stmt (gsi);
+          if (gimple_uid (s) != uid)
+            break;
+          if (s == s2)
+            return true;
+        }
+
+      return false;
+    }
+
+  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
+}
+
+/* Insert STMT after INSERT_POINT.  */
+
+static void
+insert_stmt_after (gimple stmt, gimple insert_point)
+{
+  gimple_stmt_iterator gsi;
+  basic_block bb;
+
+  if (gimple_code (insert_point) == GIMPLE_PHI)
+    bb = gimple_bb (insert_point);
+  else if (!stmt_ends_bb_p (insert_point))
+    {
+      gsi = gsi_for_stmt (insert_point);
+      gimple_set_uid (stmt, gimple_uid (insert_point));
+      gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+      return;
+    }
+  else
+    /* We assume INSERT_POINT is a SSA_NAME_DEF_STMT of some SSA_NAME,
+       thus if it must end a basic block, it should be a call that can
+       throw, or some assignment that can throw.  If it throws, the LHS
+       of it will not be initialized though, so only valid places using
+       the SSA_NAME should be dominated by the fallthru edge.  */
+    bb = find_fallthru_edge (gimple_bb (insert_point)->succs)->dest;
+  gsi = gsi_after_labels (bb);
+  if (gsi_end_p (gsi))
+    {
+      gimple_stmt_iterator gsi2 = gsi_last_bb (bb);
+      gimple_set_uid (stmt,
+                      gsi_end_p (gsi2) ? 1 : gimple_uid (gsi_stmt (gsi2)));
+    }
+  else
+    gimple_set_uid (stmt, gimple_uid (gsi_stmt (gsi)));
+  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+}
+
+/* Returns the basic-block that would be used if a statement
+   were inserted at INSERT_POINT.  */
+
+static basic_block
+get_insert_stmt_after_bb (gimple insert_point)
+{
+  basic_block bb = gimple_bb (insert_point);
+
+  if (gimple_code (insert_point) == GIMPLE_PHI
+      || !stmt_ends_bb_p (insert_point))
+    return bb;
+  /* We assume INSERT_POINT is a SSA_NAME_DEF_STMT of some SSA_NAME,
+      thus if it must end a basic block, it should be a call that can
+      throw, or some assignment that can throw.  If it throws, the LHS
+      of it will not be initialized though, so only valid places using
+      the SSA_NAME should be dominated by the fallthru edge.  */
+  bb = find_fallthru_edge (bb->succs)->dest;
+  return gimple_bb (gsi_stmt (gsi_after_labels (bb)));
+}
+
+/* Returns the basic-block that would be used on insertion
+   of a statement having OP1 (and OP2).  */
+
+static basic_block
+get_insertation_bb (tree op1, tree op2)
+{
+  gimple op1def = NULL, op2def = NULL, insert_point;
+  gimple_stmt_iterator gsi;
+
+  /* Find an insertion place and insert.  */
+  if (op1 && TREE_CODE (op1) == SSA_NAME)
+    op1def = SSA_NAME_DEF_STMT (op1);
+  if (op2 && TREE_CODE (op2) == SSA_NAME)
+    op2def = SSA_NAME_DEF_STMT (op2);
+
+  if ((!op1def || gimple_nop_p (op1def))
+      && (!op2def || gimple_nop_p (op2def)))
+    {
+      gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR));
+      return gimple_bb (gsi_stmt (gsi));
+    }
+
+  if ((!op1def || gimple_nop_p (op1def))
+      || (op2def && !gimple_nop_p (op2def)
+          && uid_stmt_dominates_stmt_p (op1def, op2def)))
+    insert_point = op2def;
+  else
+    insert_point = op1def;
+
+  return get_insert_stmt_after_bb (insert_point);
+}
+
+/* Determine whether a new statement dependent on OP1 (and OP2)
+   would be inserted at a place with greater loop-depth than the
+   loop-depth of statement STMT.
+   Returns TRUE if so, otherwise FALSE.  */
+
+static bool
+sink_into_loop_p (gimple stmt, tree op1, tree op2)
+{
+  basic_block bb;
+  unsigned stmt_depth, ins_deepth;
+
+  if (op1 && TREE_CODE (op1) != SSA_NAME)
+    op1 = NULL_TREE;
+  if (op2 && TREE_CODE (op2) != SSA_NAME)
+    op2 = NULL_TREE;
+  if (!op1 && !op2)
+    return false;
+
+  bb = get_insertation_bb (op1, op2);
+  ins_deepth = loop_depth (bb->loop_father);
+
+  if (!ins_deepth)
+    return false;
+
+  stmt_depth = loop_depth (gimple_bb (stmt)->loop_father);
+
+  return (stmt_depth < ins_deepth);
+}
+
+/* Builds one assign-statement performing OP1 OPCODE OP2 or OPCODE OP1,
+   with the result in a new SSA name of type TYPE.  Places the statement
+   after the definition of either OP1 or OP2.  Returns the new lhs tree.  */
+
+static tree
+build_gimple_assign_loc (tree type, tree op1, tree op2, enum tree_code opcode,
+			 location_t loc)
+{
+  gimple op1def = NULL, op2def = NULL;
+  gimple_stmt_iterator gsi;
+  tree op;
+  gimple sum;
+
+  /* Create the new statement.  */
+  op = make_ssa_name (type, NULL);
+  sum = gimple_build_assign_with_ops (opcode, op, op1, op2);
+
+  /* Find an insertion place and insert.  */
+  if (op1 && TREE_CODE (op1) == SSA_NAME)
+    op1def = SSA_NAME_DEF_STMT (op1);
+  if (op2 && TREE_CODE (op2) == SSA_NAME)
+    op2def = SSA_NAME_DEF_STMT (op2);
+  if ((!op1def || gimple_nop_p (op1def))
+      && (!op2def || gimple_nop_p (op2def)))
+    {
+      gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR));
+      if (gsi_end_p (gsi))
+        {
+          gimple_stmt_iterator gsi2
+            = gsi_last_bb (single_succ (ENTRY_BLOCK_PTR));
+          gimple_set_uid (sum,
+                          gsi_end_p (gsi2) ? 1 : gimple_uid (gsi_stmt (gsi2)));
+        }
+      else
+        gimple_set_uid (sum, gimple_uid (gsi_stmt (gsi)));
+      gsi_insert_before (&gsi, sum, GSI_NEW_STMT);
+    }
+  else
+    {
+      gimple insert_point;
+      if ((!op1def || gimple_nop_p (op1def))
+           || (op2def && !gimple_nop_p (op2def)
+               && uid_stmt_dominates_stmt_p (op1def, op2def)))
+        insert_point = op2def;
+      else
+        insert_point = op1def;
+      insert_stmt_after (sum, insert_point);
+    }
+
+  gimple_set_location (sum, loc);
+  update_stmt (sum);
+
+  return gimple_assign_lhs (sum);
+}
+
+/* Release the assign-statement chain of VAR, if it has zero uses.
+   This function recurses on up to two operands.  */
+
+static void
+remove_stmt_chain (tree var)
+{
+  tree var2;
+  gimple stmt;
+  gimple_stmt_iterator gsi;
+
+  while (1)
+    {
+      if (TREE_CODE (var) != SSA_NAME || !has_zero_uses (var))
+        return;
+      stmt = SSA_NAME_DEF_STMT (var);
+      if (!is_gimple_assign (stmt))
+        return;
+      var = gimple_assign_rhs1 (stmt);
+      if (gimple_assign_rhs_class (stmt) == GIMPLE_BINARY_RHS)
+	var2 = gimple_assign_rhs2 (stmt);
+      else
+	var2 = NULL_TREE;
+      gsi = gsi_for_stmt (stmt);
+      gsi_remove (&gsi, true);
+      release_defs (stmt);
+      if (var2)
+	remove_stmt_chain (var2);
+    }
+}
+
+/* Generate a gimple cast of NAME to NEWTYPE, and return
+   the resulting tree.  */
+
+static tree
+gen_cast (tree newtype, tree name)
+{
+  tree ret;
+
+  if (INTEGRAL_TYPE_P (newtype) && TREE_CODE (name) == INTEGER_CST)
+    {
+      bool saved_flag_wrapv = flag_wrapv;
+      flag_wrapv = 1;
+      ret = fold_convert (newtype, name);
+      flag_wrapv = saved_flag_wrapv;
+
+      return ret;
+    }
+
+  /* We can't propagate through our defining statement, so emit
+     a conversion explicitly.  */
+  if (useless_type_conversion_p (newtype, TREE_TYPE (name)))
+    return name;
+
+  ret = build_gimple_assign_loc (newtype, name, NULL_TREE, NOP_EXPR,
+				 EXPR_LOCATION (name));
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "Create cast ");
+      print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (ret), 0, 0);
+    }
+
+  return ret;
+}
+
+/* Check if simplification of the pattern '(TYPE1) (TYPE2) X' to
+   '(TYPE1) X' is possible.
+   If so this function returns true, otherwise false.  */
+
+static bool
+demote_cast_p (tree t1, tree t2, tree tx)
+{
+  if (!INTEGRAL_TYPE_P (t1)
+      || !INTEGRAL_TYPE_P (t2))
+    return false;
+
+  /* A boolean-cast of a boolean-type can be optimized away for any width.
+     A useless type-conversion can be optimized away, too.  */
+  if (useless_type_conversion_p (t1, t2)
+      || useless_type_conversion_p (t2, tx)
+      /* Boolean types might have different width.  */
+      || (TREE_CODE (t2) == BOOLEAN_TYPE
+	  && (TREE_CODE (t1) == BOOLEAN_TYPE
+	      || TREE_CODE (tx) == BOOLEAN_TYPE)))
+    return true;
+
+  /* Boolean types are special ...
+     They aren't real integer-scalar values, and a cast to a boolean type
+     implies a comparison.  Therefore we must not perform the simplification
+     for any of the remaining cases if T2 is of Boolean kind.  */
+  if (TREE_CODE (t2) == BOOLEAN_TYPE
+      || !INTEGRAL_TYPE_P (tx))
+    return false;
+
+  /* For (T1) (T2) X, if T1 <= X && T1 <= T2, we can transform to
+     (T1) X.  */
+  if (TYPE_PRECISION (t1) <= TYPE_PRECISION (tx))
+    return (TYPE_PRECISION (t1) <= TYPE_PRECISION (t2));
+
+  /* If T2 < X, then there is a truncation and we
+     can't do anything.   */
+  if (TYPE_PRECISION (t2) < TYPE_PRECISION (tx))
+    return false;
+
+  /* For (T1) (T2) X, if T1 > X && T1 <= T2,  we can
+     transform to (T1) X.  */
+  if (TYPE_PRECISION (t1) <= TYPE_PRECISION (t2))
+    return true;
+
+  /* For (T1) (T2) X, if T1 > X && T2 >= X && SIGN(X) == SIGN(T2),
+     we can transform to (T1) X.  */
+  return (TYPE_UNSIGNED (t2) == TYPE_UNSIGNED (tx));
+}
+
+
+/* Try to simplify the cast expression (TYPE1) (TYPE2) X to (TYPE1) X, if
+   TYPE1, TYPE2, and the type of X are of integral kind and meet the
+   conditions checked in demote_cast_p.
+   Returns TRUE if the cast was demoted, otherwise FALSE.  */
+
+static bool
+demote_cast (gimple def, tree name, tree c, gimple def2)
+{
+  tree x;
+
+  if (!gimple_assign_cast_p (def2))
+    return false;
+
+  x = gimple_assign_rhs1 (def2);
+
+  if (TREE_CODE (x) != SSA_NAME
+      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (x)
+      || !demote_cast_p (TREE_TYPE (name), TREE_TYPE (c), TREE_TYPE (x)))
+    return false;
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "Remove cast ");
+      print_gimple_stmt (dump_file, def, 0, 0);
+      fprintf (dump_file, "  ");
+      print_gimple_stmt (dump_file, def2, 0, 0);
+    }
+
+  gimple_assign_set_rhs1 (def, x);
+
+  if (useless_type_conversion_p (TREE_TYPE (name), TREE_TYPE (x)))
+    gimple_assign_set_rhs_code (def, SSA_NAME);
+
+  if (gimple_location (def2) != UNKNOWN_LOCATION
+      && gimple_location (def) == UNKNOWN_LOCATION)
+    gimple_set_location (def, gimple_location (def2));
+
+  update_stmt (def);
+  remove_stmt_chain (c);
+
+  if (dump_file)
+    {
+      fprintf (dump_file, " by cast ");
+      print_gimple_stmt (dump_file, def, 0, 0);
+    }
+
+  return true;
+}
+
+/* Helper routine to check if OP is defined by a gimple cast statement
+   of an integral value.
+   On success the defining statement is stored in *PDEF, if PDEF
+   isn't NULL.
+   On success the right-hand operand of the cast statement is stored
+   in *PINNER, if PINNER isn't NULL.
+   Returns TRUE on success, otherwise FALSE.  */
+
+static bool
+is_integral_cast (tree op, gimple *pdef, tree *pinner)
+{
+  gimple def = NULL;
+  tree inner;
+
+  if (TREE_CODE (op) != SSA_NAME
+      || !(def = SSA_NAME_DEF_STMT (op))
+      || !is_gimple_assign (def)
+      || !gimple_assign_cast_p (def)
+      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
+    return false;
+
+  inner = gimple_assign_rhs1 (def);
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (inner)))
+    return false;
+
+  if (pdef)
+    *pdef = def;
+  if (pinner)
+    *pinner = inner;
+  return true;
+}
+
+/* Try to demote a cast into the defining assignment.  We try to do this for
+   MULT_EXPR, PLUS_EXPR, MINUS_EXPR, LROTATE_EXPR, RROTATE_EXPR,
+   LSHIFT_EXPR, RSHIFT_EXPR, BIT_IOR_EXPR, BIT_XOR_EXPR, BIT_AND_EXPR,
+   MIN_EXPR, and MAX_EXPR.
+   On success the function modifies the incoming STMT and stores the new
+   statement in *PSTMT.  Returns TRUE on success, otherwise FALSE.  */
+
+static bool
+demote_into (int flags, gimple stmt, tree lhs, tree r, gimple r_stmt, gimple *pstmt)
+{
+  gimple_stmt_iterator gsi;
+  enum tree_code code;
+  tree l_type = TREE_TYPE (lhs);
+  tree r_type = TREE_TYPE (r);
+  tree a1, a2, i1 = NULL_TREE;
+  gimple def = NULL;
+
+  /* We can't do type-sinking if only one side is Boolean-typed.  */
+  if ((TREE_CODE (r_type) == BOOLEAN_TYPE
+       || TREE_CODE (l_type) == BOOLEAN_TYPE)
+      && TREE_CODE (r_type) != TREE_CODE (l_type))
+    return false;
+
+  if ((flags & TDK_NO_EXPAND_WIDENING) != 0
+      && TYPE_PRECISION (l_type) > TYPE_PRECISION (r_type))
+    return false;
+
+  code = gimple_assign_rhs_code (r_stmt);
+
+  switch (code)
+    {
+    case MULT_EXPR:
+      /* We don't want to sink casts into multiply-expressions
+	 if the outer type and the inner type don't have the same sign.  */
+      if (TYPE_UNSIGNED (l_type) != TYPE_UNSIGNED (r_type))
+	return false;
+
+      /* If inner-type isn't unsigned, we don't want to convert.  */
+      if (!TYPE_UNSIGNED (l_type)
+          && !useless_type_conversion_p (l_type, r_type))
+	return false;
+      /* Fall through.  */
+
+    case PLUS_EXPR: case MINUS_EXPR:
+      a1 = gimple_assign_rhs1 (r_stmt);
+      a2 = gimple_assign_rhs2 (r_stmt);
+
+      if (TYPE_PRECISION (l_type) < TYPE_PRECISION (r_type)
+	  && (!TYPE_UNSIGNED (l_type) && !flag_wrapv))
+	{
+	  tree uns;
+
+	  if ((flags & TDK_NO_EXPAND_VIA_UNSIGNED) != 0
+	      || sink_into_loop_p (stmt, a1, a2))
+	    return false;
+
+	  uns = unsigned_type_for (l_type);
+	  a1 = gen_cast (uns, a1);
+	  a2 = gen_cast (uns, a2);
+	  a1 = build_gimple_assign_loc (uns, a1, a2, code,
+					gimple_location (r_stmt));
+	  code = NOP_EXPR;
+	  a2 = NULL_TREE;
+	  break;
+	}
+      else if ((flags & TDK_NO_EXPAND_VIA_UNSIGNED) == 0
+	       && TYPE_PRECISION (l_type) < TYPE_PRECISION (r_type)
+	       && (TYPE_UNSIGNED (l_type) || flag_wrapv))
+	{
+	  if (sink_into_loop_p (stmt, a1, a2))
+	    return false;
+	  a1 = gen_cast (l_type, a1);
+	  a2 = gen_cast (l_type, a2);
+	  break;
+	}
+      else if (TYPE_PRECISION (l_type) == TYPE_PRECISION (r_type))
+	{
+	  if (TYPE_UNSIGNED (r_type) == TYPE_UNSIGNED (l_type)
+	      || !TYPE_UNSIGNED (r_type))
+	    {
+	      a1 = gen_cast (l_type, a1);
+	      a2 = gen_cast (l_type, a2);
+	      break;
+	    }
+	}
+      return false;
+
+    case LROTATE_EXPR: case RROTATE_EXPR:
+      /* Any cast with the same precision as the inner value can
+	 be sunk.  */
+      if (TYPE_PRECISION (l_type) == TYPE_PRECISION (r_type))
+	{
+	  a1 = gimple_assign_rhs1 (r_stmt);
+	  if (!useless_type_conversion_p (l_type, r_type))
+	    a1 = gen_cast (l_type, a1);
+	  a2 = gimple_assign_rhs2 (r_stmt);
+	  break;
+	}
+
+      return false;
+
+    case RSHIFT_EXPR:
+      /* For a shift-right operation (T1) (X >> N) we can sink the type into
+	 the shift-statement if the types of X and T1 are compatible, or if
+	 the type-precision of T1 is greater than the type-precision of X
+	 and either T1 is signed or the type of X is unsigned.
+	 As this doesn't lower the precision of the resulting type, we don't
+	 need to check for an out-of-bounds N for type T1.  */
+
+      if (useless_type_conversion_p (l_type, r_type))
+	{
+	  a1 = gimple_assign_rhs1 (r_stmt);
+	  a2 = gimple_assign_rhs2 (r_stmt);
+	  break;
+	}
+
+      if (TYPE_PRECISION (l_type) > TYPE_PRECISION (r_type)
+	  && (!TYPE_UNSIGNED (l_type) || TYPE_UNSIGNED (r_type)))
+	{
+	  a1 = gimple_assign_rhs1 (r_stmt);
+	  /* We do this transformation only if the left-hand operand
+	     is a cast operation.  It is valid for any such construct,
+	     nevertheless it might introduce a non-reversible pattern.
+	     TODO: Add the reverse pattern to forward-propagation.  Then
+	     this check can additionally check validity of N on reversal.  */
+	  if (is_integral_cast (a1, &def, &i1)
+	      && demote_cast_p (l_type, r_type, TREE_TYPE (i1)))
+	    {
+	      a1 = gen_cast (l_type, a1);
+	      a2 = gimple_assign_rhs2 (r_stmt);
+	      break;
+	    }
+	}
+
+      return false;
+
+    case LSHIFT_EXPR:
+      /* For a shift-left operation '(T1) (X << N)', we can sink T1 into the
+	 shift-left statement if X and T1 are of compatible type, or if N
+	 doesn't exceed the precision of type T1 and the type-precision of T1
+	 is smaller than or equal to the type-precision of X.  */
+      /* If the outer type's precision is wider than the inner one, the
+	 inner shift truncates and this can't be handled.  */
+      if (TYPE_PRECISION (l_type) > TYPE_PRECISION (r_type))
+	return false;
+
+      /* Trivial case.  */
+      if (useless_type_conversion_p (l_type, r_type))
+	{
+	  a1 = gimple_assign_rhs1 (r_stmt);
+	  a2 = gimple_assign_rhs2 (r_stmt);
+	  break;
+	}
+
+      a1 = gimple_assign_rhs1 (r_stmt);
+      a2 = gimple_assign_rhs2 (r_stmt);
+
+      /* If A2 isn't an integral constant, or A1 isn't defined by a
+	 cast assignment, we can't do anything (the precision check was
+	 already done above).  */
+      if (TREE_CODE (a2) != INTEGER_CST
+	  || !is_integral_cast (a1, &def, &i1))
+	return false;
+
+      /* Check that the constant right-hand operand A2 doesn't exceed
+	 the precision of L_TYPE.  */
+      if (integer_zerop (fold_binary (LT_EXPR, boolean_type_node, a2,
+			 build_int_cst (TREE_TYPE (a2),
+					TYPE_PRECISION (l_type)))))
+	return false;
+
+      /* We only sink the cast if the pattern can be simplified further.
+         TODO: Add the reverse pattern to the forward-propagation pass.  Then
+	 this check can be removed.  */
+      if (!demote_cast_p (l_type, r_type, TREE_TYPE (i1)))
+	return false;
+
+      /* Sink cast into shift-statement.  */
+      a1 = gen_cast (l_type, a1);
+
+      break;
+
+    case MIN_EXPR: case MAX_EXPR:
+
+      /* Truncation can't be handled.  */
+      if (TYPE_PRECISION (l_type) < TYPE_PRECISION (r_type))
+	return false;
+
+      if (TYPE_UNSIGNED (l_type) != TYPE_UNSIGNED (r_type))
+	{
+	  /* Make sure we don't introduce an implicit sign-change here.  */
+	  if (!TYPE_UNSIGNED (r_type)
+	      || TYPE_PRECISION (l_type) == TYPE_PRECISION (r_type))
+	    return false;
+	}
+
+      a1 = gimple_assign_rhs1 (r_stmt);
+      a2 = gimple_assign_rhs2 (r_stmt);
+
+      if (useless_type_conversion_p (l_type, r_type))
+	break;
+
+      a1 = gen_cast (l_type, a1);
+      a2 = gen_cast (l_type, a2);
+
+      break;
+
+    case BIT_AND_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR:
+      a1 = gimple_assign_rhs1 (r_stmt);
+      a2 = gimple_assign_rhs2 (r_stmt);
+
+      if (useless_type_conversion_p (l_type, r_type))
+	break;
+      a1 = gen_cast (l_type, a1);
+      a2 = gen_cast (l_type, a2);
+      break;
+    default:
+      return false;
+    }
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "Sink cast ");
+      print_gimple_stmt (dump_file, stmt, 0, 0);
+      fprintf (dump_file, " of stmt ");
+      print_gimple_stmt (dump_file, r_stmt, 0, 0);
+    }
+
+  if (gimple_location (r_stmt) != UNKNOWN_LOCATION
+      && gimple_location (stmt) == UNKNOWN_LOCATION)
+    gimple_set_location (stmt, gimple_location (r_stmt));
+
+  gsi = gsi_for_stmt (stmt);
+  gimple_assign_set_rhs_with_ops (&gsi, code, a1, a2);
+  stmt = gsi_stmt (gsi);
+  update_stmt (stmt);
+
+  remove_stmt_chain (r);
+  *pstmt = stmt;
+
+  if (dump_file)
+    {
+      fprintf (dump_file, " into ");
+      print_gimple_stmt (dump_file, stmt, 0, 0);
+    }
+
+  return true;
+}
+
+/* Process the statements of BB, then recurse into its post-dominator sons.  */
+
+static void
+execute_type_demote_int (int flags, basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  basic_block son;
+
+  for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
+    {
+      gimple stmt, def2;
+      tree lhs, inner;
+
+      stmt = gsi_stmt (gsi);
+      gsi_prev (&gsi);
+
+      if (!is_gimple_assign (stmt)
+	  || (lhs = gimple_assign_lhs (stmt)) == NULL_TREE
+	  || TREE_CODE (lhs) != SSA_NAME
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+	  || !gimple_assign_cast_p (stmt))
+	continue;
+
+      inner = gimple_assign_rhs1 (stmt);
+
+      if (TREE_CODE (inner) != SSA_NAME
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (inner))
+	  || !(def2 = SSA_NAME_DEF_STMT (inner))
+          || !is_gimple_assign (def2)
+	  || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (inner)
+	  || !has_single_use (inner))
+	continue;
+
+      if (demote_cast (stmt, lhs, inner, def2))
+	gsi = gsi_for_stmt (stmt);
+      else if (demote_into (flags, stmt, lhs, inner, def2, &stmt))
+        {
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_prev (&gsi);
+        }
+    }
+
+  for (son = first_dom_son (CDI_POST_DOMINATORS, bb);
+       son;
+       son = next_dom_son (CDI_POST_DOMINATORS, son))
+    execute_type_demote_int (flags, son);
+
+}
+
+/* Entry point for type demotion.  FLAG controls the mode of operation.  */
+
+static unsigned int
+execute_type_demote (int flag)
+{
+  /* We calculate CDI_DOMINATORS dominance info to make
+     sure we can insert newly generated statements in the proper
+     basic-block.  */
+  calculate_dominance_info (CDI_DOMINATORS);
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+
+  execute_type_demote_int (flag, EXIT_BLOCK_PTR);
+
+  free_dominance_info (CDI_POST_DOMINATORS);
+
+  return 0;
+}
+
+namespace {
+
+const pass_data pass_data_demote1 =
+{
+  GIMPLE_PASS,
+  "typedemote1",                      /* name */
+  OPTGROUP_NONE,
+  false, /* has gate */
+  true, /* has execute */
+  TV_NONE, /* tv_id */
+  ( PROP_cfg | PROP_ssa ), /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_verify_ssa | TODO_update_ssa), /* todo_flags_finish */
+};
+
+const pass_data pass_data_demote2 =
+{
+  GIMPLE_PASS,
+  "typedemote2",                      /* name */
+  OPTGROUP_NONE,
+  false, /* has gate */
+  true, /* has execute */
+  TV_NONE, /* tv_id */
+  ( PROP_cfg | PROP_ssa ), /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_verify_ssa | TODO_update_ssa), /* todo_flags_finish */
+};
+
+class pass_demote1 : public gimple_opt_pass
+{
+public:
+  pass_demote1 (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_demote1, ctxt)
+  {}
+  /* opt_pass methods: */
+  opt_pass * clone() { return new pass_demote1 (m_ctxt); }
+  unsigned int execute () { return execute_type_demote (TDK_NO_EXPAND_VIA_UNSIGNED); }
+}; // class pass_demote1
+
+class pass_demote2 : public gimple_opt_pass
+{
+public:
+  pass_demote2 (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_demote2, ctxt)
+  {}
+  /* opt_pass methods: */
+  opt_pass * clone() { return new pass_demote2 (m_ctxt); }
+  unsigned int execute () { return execute_type_demote (0); }
+}; // class pass_demote2
+
+}
+
+gimple_opt_pass *
+make_pass_demote1 (gcc::context *ctxt)
+{
+  return new pass_demote1 (ctxt);
+}
+
+gimple_opt_pass *
+make_pass_demote2 (gcc::context *ctxt)
+{
+  return new pass_demote2 (ctxt);
+}
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
@@ -22,7 +22,7 @@  void tst(void)
     blau ((unsigned char) i);
 }
 
-/* { dg-final { scan-tree-dump-times "& 255" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "& 255" 2 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "= \\(signed char\\)" 1 "optimized" } } */
 
 /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
@@ -14,5 +14,5 @@  foo (long a)
   h = c;
 }
 
-/* { dg-final { scan-tree-dump "Replaced \\\(short int\\\) c_.*with b_" "fre1" } } */
+/* { dg-final { scan-tree-dump "Replaced \\\(short int\\\) c_.*with g\\\." "fre1" } } */
 /* { dg-final { cleanup-tree-dump "fre1" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
@@ -18,5 +18,5 @@  foo (int a, int b)
   return aa + bb;
 }
 
-/* { dg-final { scan-tree-dump "Replaced \\\(int\\\) aa_.*with a_" "fre1" } } */
+/* { dg-final { scan-tree-dump-times "Replaced \\\(int\\\) aa_.*with a_" 0 "fre1" } } */
 /* { dg-final { cleanup-tree-dump "fre1" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
@@ -11,5 +11,5 @@  char bar(char f)
         return wrap(f);
 }
 
-/* { dg-final { scan-tree-dump "Replaced \\\(char\\\) .*with " "fre1" } } */
+/* { dg-final { scan-tree-dump-times "Replaced \\\(char\\\) .*with " 0 "fre1" } } */
 /* { dg-final { cleanup-tree-dump "fre1" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
@@ -10,5 +10,5 @@  bar (unsigned int t)
   return a == t;
 }
 
-/* { dg-final { scan-tree-dump "Replaced \\\(unsigned int\\\) a_.*with t_" "fre1" } } */
+/* { dg-final { scan-tree-dump-times "Replaced \\\(unsigned int\\\) a_.*with t_" 0 "fre1" } } */
 /* { dg-final { cleanup-tree-dump "fre1" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
@@ -58,9 +58,8 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 4 "vect" { target vect_widen_shift } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
@@ -58,10 +58,8 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 8 "vect" { target vect_sizes_32B_16B } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 3 "vect" { target vect_widen_shift } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
@@ -58,7 +58,7 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-3.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-3.c
@@ -58,7 +58,7 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "vect_recog_over_widening_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
@@ -62,9 +62,8 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 4 "vect" { target vect_widen_shift } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
===================================================================
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
@@ -62,10 +62,8 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 8 "vect" { target vect_sizes_32B_16B } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 4 "vect" { target vect_widen_shift } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-add-1.c
===================================================================
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-add-1.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+short foo (unsigned char a, unsigned char b)
+{
+  int c = (int) a + (int) b;
+  return (short) c;
+}
+
+/* { dg-final { scan-tree-dump-times "= \\\(int\\\)" 0 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-add-2.c
===================================================================
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-add-2.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+short foo (unsigned char a, unsigned short b)
+{
+  int c = (int) a + (int) b;
+  return (short) c;
+}
+
+/* { dg-final { scan-tree-dump-times "= \\\(int\\\)" 0 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-add-3.c
===================================================================
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-add-3.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+signed char a[1024], b[1024];
+
+void
+foo (void)
+{
+  int i, s, t;
+
+  for (i = 0; i < 1024; i++)
+    {
+      s = a[i];
+      t = b[i];
+      s += t + 0x12345600;
+      a[i] = s;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "305419776" 0 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-shift-1.c
===================================================================
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ts-shift-1.c
@@ -0,0 +1,33 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned char f1 (char x)
+{
+  unsigned char x1 = (unsigned char) x;
+  return (x1 << 5);
+}
+
+unsigned short f2 (unsigned int x)
+{
+  unsigned short x1 = (unsigned short) x;
+  return (x1 << 15);
+}
+
+unsigned int f3 (unsigned short x)
+{
+  unsigned long long x1 = (unsigned long long) x;
+  return (unsigned int) (x1 >> 15);
+}
+
+unsigned long long f4 (unsigned short x)
+{
+  unsigned int x1 = (unsigned int) x;
+  return (unsigned long long) (x1 >> 7);
+}
+
+/* { dg-final { scan-tree-dump-times "= \\\(long long unsigned int\\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\\(short unsigned int\\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\\(unsigned int\\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\\(unsigned char\\\)" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+