diff mbox

ifcvt cond_exec support rewrite

Message ID 4E846069.4050007@codesourcery.com
State New
Headers show

Commit Message

Bernd Schmidt Sept. 29, 2011, 12:11 p.m. UTC
Our current cond_exec support in ifcvt is essentially limited to single
basic blocks. There is a mechanism to detect a sequence of && or ||
blocks, but it has a number of limitations:

- it requires backend support: IFCVT_MODIFY_MULTIPLE_TESTS, only
  present on FRV. See below on a question what to do about FRV.
- it does not support arbitrary if-statements, but only those with
  multiple test blocks all ending in one of a fixed choice of one
  then or one else block.
- the conditions it calculates are sometimes nonsensical

The last point may need explanation. Consider an "or_or" block,

if (cond1) goto then;
if (cond2) goto then;
if (cond3) goto else; [*]
then: stuff;
else: more stuff;

then we'd construct a predicate for the then block by twice going through
         f = gen_rtx_AND (GET_MODE (t), false_expr, f);
which comes out after the first time as (!cond1 && !cond2), correct for
the test block marked [*], but then the final result (!cond1 && !cond2
&& !cond3) isn't useful for anything.  Maybe frv.c adjusts for it -
difficult to tell from the code.

The following patch rewrites essentially all the cond_exec support in
ifcvt; reviewing is probably easier if it's thought of as new code. A
large overview comment is added to basic-block.h.

The motivation for it was a benchmark in a popular embedded testsuite
that probably mustn't be named, where we have the following inside a loop:

    7   9
   / \ / \
  6 - 8   \
 /     \   \
5       10  \
 \        \  |
  11 ------ 12

Modulo scheduling only works on fully if-converted loops, so we need to
flatten that to a single basic block. 5, 6 and 8 are the test blocks.
This implies the following code generation:

C5 = test5
[!C5] C6 = test6
[C5] C6 = 1;		[*]
[!C6] block 7
[!C5] C8 = test8
[C5] C8 = 1		[*]
[!C8] block 9
[C5] C8 = 0		[*]
[C8] block 10
[C5] block 11
block 12

The trick is to modify inner conditions based on outer ones; these extra
instructions are marked [*]. If we only had the blocks 5-8, we could
alternatively try to ensure that C5 and C6 use the same register, and
avoid the extra modification by just reusing the condition.

So, features of this patch:
- detect complicated if-block structures where then, else and join
  blocks may themselves be if blocks
- Rename condition registers so that they aren't clobbered in the
  predicated blocks, and so that conditions may be reused if
  possible
- rewrite condition registers in cases where reuse isn't possible
- support matching heads/tails in then and else blocks as before, but
  also within nested conditions (using the predicate from one level
  higher)

Unfortunately this no longer works on FRV, which needs special code to
move condition registers to predication registers. The problem I have is
that I've been unabled to find an instruction set manual for the chip
(there is a manual for something called FR, but that appears different),
and the code in frv.c is somewhat nontrivial. Experiments show that the
existing multi-if-block support isn't terribly effective on FRV;
before-after comparisons show that by turning it off, there are three
spots in gcc that are meaningfully changed, and below 20 in the C
benchmarks of SPEC2k.

FRV also doesn't build in mainline, and it looks as if it likely hasn't
in a while, so I haven't tried testing this patch on it. If no public
manual can be found, is it still my responsibility to keep this support
working, or can we put the burden on the FRV maintainers?

This patch was bootstrapped on ia64-linux; a slightly earlier version
was regression tested and had a random guality failure that goes away
with this one (gcc & g++ tested again). It's also been tested on c6x-elf
along the way (currently testing again for the latest version and
looking OK so far), and it's in our 4.5 c6x tree.

Comments, or approvals?


Bernd
gcc/
	* ifcvt.c (block_fallthru, block_jumps_and_fallthru): Delete
	functions.
	(merge_if_block): Don't declare.
	(struct ce_added_insn, DEF_VEC_O and heap DEF_VEC_ALLOC_O for it):
	New.
	(ce_added_insns): New static variable.
	(ce_if_block_p): New typedef.
	(DEF_VEC_P and heap DEF_VEC_ALLOC_P for it): New.
	(struct ce_blocks_info, ce_blocks_info_t): New.
	(cond_exec_record_insn, cond_exec_finalize_added_insns,
	cond_exec_discard_added_insns, find_set_in_block, free_sub_if_blocks,
	cond_exec_compute_blocks_info, find_block_in_vec,
	cond_exec_last_cond_user, cond_exec_reg_valid_for_cond_p,
	cond_exec_assign_registers, cond_exec_limit_convert,
	cond_exec_assign_single_block, cond_exec_assign_conditions,
	record_rewrite, skip-cond_rewrite_p, cond_exec_modify_conditions,
	make_insns_for_implied_conds, cond_exec_make_extra_insns,
	cond_exec_limit_costs, cond_exec_finalize_if_tree,
	delete_unnecessary_matching_code, cond_exec_convert): New static
	functions.
	(merge_if_block): Adjust for new data structures.
	(cond_exec_process_if_block): Remove second arg.  All callers
	changed.  Rewrite for new data structures, and use the new
	functions.
	(discover_if_header): New function, broken out of...
	(find_if_header): ... here.
	(ce_discover_matches, ce_discover_if_structure): New static
	functions.
	(cond_exec_find_if_block): Use new functions.
	(if_convert): Also compute dominator info.
	* basic-block.h (struct ce_rewrite, struct ce_if_branch): New.
	(struct ce_if_block): Embed ce_if_branch structures for then and
	else.  Remove support for old and_and and or_or tests, add fields
	for describing nested if structures.

Comments

Nick Clifton Sept. 30, 2011, 3:22 p.m. UTC | #1
Hi Bernd,

> Experiments show that the
> existing multi-if-block support isn't terribly effective on FRV;
> before-after comparisons show that by turning it off, there are three
> spots in gcc that are meaningfully changed, and below 20 in the C
> benchmarks of SPEC2k.
>
> FRV also doesn't build in mainline, and it looks as if it likely hasn't
> in a while, so I haven't tried testing this patch on it. If no public
> manual can be found, is it still my responsibility to keep this support
> working, or can we put the burden on the FRV maintainers?

You can put in on the FRV maintainers.  Although now that the FRV port 
does build it would be very much appreciated if you could provide a 
patch for this backend as well.

Cheers
   Nick
Bernd Schmidt Sept. 30, 2011, 3:33 p.m. UTC | #2
Hi Nick,

>> Experiments show that the
>> existing multi-if-block support isn't terribly effective on FRV;
>> before-after comparisons show that by turning it off, there are three
>> spots in gcc that are meaningfully changed, and below 20 in the C
>> benchmarks of SPEC2k.
>>
>> FRV also doesn't build in mainline, and it looks as if it likely hasn't
>> in a while, so I haven't tried testing this patch on it. If no public
>> manual can be found, is it still my responsibility to keep this support
>> working, or can we put the burden on the FRV maintainers?
> 
> You can put in on the FRV maintainers.

Thanks. That sounds like the manual you have was under NDA after all :(

> Although now that the FRV port
> does build it would be very much appreciated if you could provide a
> patch for this backend as well.

Ok. I'll wait for general review and if it looks like this will go in
I'll make an effort to at least make sure it generates correct code on
FRV (although it won't do multiple-block conversion).


Bernd
Bernd Schmidt Oct. 13, 2011, 9:03 p.m. UTC | #3
Ping.  Better support for nested if-then-else structures:

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01935.html


Bernd
Bernd Schmidt Nov. 9, 2011, 11:48 a.m. UTC | #4
Ping^2.  Better support for nested if-then-else structures:

> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01935.html
> 
> 
> Bernd
>
Bernd Schmidt March 2, 2012, 3:02 p.m. UTC | #5
Ping^3.  Better support for nested if-then-else structures:
> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01935.html


Bernd
Maxim Kuvyrkov March 3, 2012, 12:55 a.m. UTC | #6
On 30/09/2011, at 1:11 AM, Bernd Schmidt wrote:
...
> 
> The following patch rewrites essentially all the cond_exec support in
> ifcvt; reviewing is probably easier if it's thought of as new code.

Kudos for improving if-conversion!

I reviewed this patch to the extent I know ifcvt.c, which is below your level of understanding.  The new implementation looks good to me, and my review mostly focuses on making the code more understandable to someone without the in-depth knowledge of optimization.

> A
> large overview comment is added to basic-block.h.

I suggest moving this comment to ifcvt.c as it really describes the optimization, not the underlying data structures.

Also, are there any publications describing the new implementation or is this your own masterpiece?  If there are papers relevant to your new implementation, please add references to them.

...
> 
> This patch was bootstrapped on ia64-linux; a slightly earlier version
> was regression tested and had a random guality failure that goes away
> with this one (gcc & g++ tested again). It's also been tested on c6x-elf
> along the way (currently testing again for the latest version and
> looking OK so far), and it's in our 4.5 c6x tree.

For which targets is this optimization enabled for (as in "is not a no-op")?  Is it only IA64, FRV and C6X?  [I think at least ARM should also be affected.]

If it's only ia64, frv and c6x, then you are in the clear with testing.  If the optimization can change code for other targets (arm, x86, ppc, mips), then we need to do more testing to make sure correctness and performance do not regress.  I'm ready to help with the testing and benchmarking.


> Index: gcc/ifcvt.c
> ===================================================================
> *** gcc/ifcvt.c	(revision 178854)
> --- gcc/ifcvt.c	(working copy)
...
> +   cond_exec_discard_added_insns ();
>     cancel_changes (0);
> !   return false;
> ! }
> ! 
> ! /* Given an IF-THEN or IF-THEN-ELSE block with possibly nested
> !    sub-blocks in CE_ROOT, attempt to convert as much of the tree as
> !    possible to conditional execution.  Return TRUE if we were
> !    successful at converting the block.  */
> ! 
> ! static int
> ! cond_exec_process_if_block (ce_if_block_t *ce_root)
> ! {
> !   bool retval = false;
> !   basic_block bb, prev_bb = NULL;
> !   ce_blocks_info_t *blkinfo;
> !   int n_blocks;
> !   unsigned i;
> !   HARD_REG_SET set_anywhere;
> ! 
> !   /* Verify that all blocks are adjacent.  This isn't really a
> !      requirement, but only because rtl_move_block is unimplemented.
> !      Later.  */

What does this 'Later' mean?

> !   FOR_EACH_VEC_ELT (basic_block, ce_root->targets, i, bb)
> !     {
> !       if (prev_bb && prev_bb->next_bb != bb)
> ! 	return false;
> ! 
> !       prev_bb = bb;
> !     }

...

> ! /* After finding an if header CE_INFO with discover_if_header, this function
> !    can be used to recursively examine the then, else, and join blocks to see
> !    if they are themselves if blocks.  OUTER_JOIN should be NULL for the
> !    outermost if-block we're examining; on recursive calls it holds the join_bb
> !    of the parent if-block.
> !    Return true if we found a recognizable blocks structure.  */
>   
> ! static bool
> ! ce_discover_if_structure (ce_if_block_t *ce_info, basic_block outer_join)

This is a HUGE function.  Please try splitting it up into several smaller functions or making (its structure || steps that it makes) more apparent in other ways.

...
> 
> !   /* Try to avoid building up extremely large if trees only to find that
> !      they would be too expensive to convert.  The numbers are arbitrarily
> !      chosen to ensure reasonable compile times for extreme test cases without
> !      preventing useful optimizations.  */
> !   if (n_basic_blocks > 100
> !       /* Avoid the special case from above.  */
> !       && (then_bb == NULL || EDGE_COUNT (then_bb->succs) > 0))
> !     {
> !       VEC (basic_block, heap) *dom_vec;
> !       basic_block bb;
> !       int count = 0;
> !       dom_vec = get_dominated_to_depth (CDI_DOMINATORS, test_bb, 0);
> !       FOR_EACH_VEC_ELT (basic_block, dom_vec, i, bb)
> ! 	if (dominated_by_p (CDI_POST_DOMINATORS, bb, join_bb))
> ! 	  count++;
> !       VEC_free (basic_block, heap, dom_vec);
> !       if (count > 40)
> ! 	return false;
> !     }

Please replace the magic numbers '100' and '40' above with parameters or, at least, constants with descriptive names.  

...
> 
> !   if (sub_info != NULL && sub_info->join_bb == join_bb
> !       /* We cannot do this if we have matching pieces of code, as these must be
> ! 	 predicated with the parent's condition (which, hence, must be in a
> ! 	 different register).  */

"We cannot do this ... " -- what "this" refers to?

...

> Index: gcc/basic-block.h
> ===================================================================
> *** gcc/basic-block.h	(revision 178854)
> --- gcc/basic-block.h	(working copy)

  The placement of data structure definitions to basic-block.h seems historic, and, I think, it would be better to move them to a new stand-alone header ifcvt.h.  I didn't see anything among ifcvt definitions in basic-block.h that has to be intertwined with the rest of the header.

...

> +    === Conversion
> + 
> +    In order to convert such a structure to conditional execution, we work from
> +    the leaves (then or else blocks which are not themselves tests) upwards
> +    through the tree, assigning registers for the conditions, verifying costs,
> +    etc.

References to specific routines where we do the above core steps of algorithm (the upward walk, register assignment, etc.) would be very helpful here for someone trying to understand the implementation.  Similarly, code references in other parts of this and other comment would be greatly appreciated.


>  Ideally we can convert the entire structure, but in complicated cases
> +    it's more likely that We end up with a forest of convertible sub-trees.

s/We/we.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Maxim Kuvyrkov March 3, 2012, 8:39 a.m. UTC | #7
On 3/03/2012, at 1:55 PM, Maxim Kuvyrkov wrote:

> On 30/09/2011, at 1:11 AM, Bernd Schmidt wrote:
> ...
>> 
>> The following patch rewrites essentially all the cond_exec support in
>> ifcvt; reviewing is probably easier if it's thought of as new code.
> 
> Kudos for improving if-conversion!
> 
> I reviewed this patch to the extent I know ifcvt.c, which is below your level of understanding.

I realized this sentence can be read in two different ways.  Obviously I meant that you have a superior understanding of if-conversion than I :-).

>  The new implementation looks good to me, and my review mostly focuses on making the code more understandable to someone without the in-depth knowledge of optimization.


--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
diff mbox

Patch

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 178854)
+++ gcc/ifcvt.c	(working copy)
@@ -90,15 +90,12 @@  static rtx first_active_insn (basic_bloc
 static rtx last_active_insn (basic_block, int);
 static rtx find_active_insn_before (basic_block, rtx);
 static rtx find_active_insn_after (basic_block, rtx);
-static basic_block block_fallthru (basic_block);
 static int cond_exec_process_insns (ce_if_block_t *, rtx, rtx, rtx, rtx, int);
 static rtx cond_exec_get_condition (rtx);
 static rtx noce_get_condition (rtx, rtx *, bool);
 static int noce_operand_ok (const_rtx);
-static void merge_if_block (ce_if_block_t *);
 static int find_cond_trap (basic_block, edge, edge);
 static basic_block find_if_header (basic_block, int);
-static int block_jumps_and_fallthru_p (basic_block, basic_block);
 static int noce_find_if_block (basic_block, edge, edge, int);
 static int cond_exec_find_if_block (ce_if_block_t *);
 static int find_if_case_1 (basic_block, edge, edge);
@@ -272,17 +269,1073 @@  find_active_insn_after (basic_block curr
 
   return insn;
 }
+
+/* A structure to record insns that must be added if we succeed
+   converting insns to COND_EXEC form.  They are emitted in the order
+   they were added; this means that if several have the same other
+   insn and before_p is true for all of them, the first recorded insn
+   will occur first in the stream, while the final order will be the
+   reverse if before_p is false.  */
+typedef struct ce_added_insn
+{
+  /* The insn to be added.  */
+  rtx insn;
+  /* Another insn, giving the location for INSN to be added.  */
+  rtx other;
+  /* Whether to add before or after OTHER.  */
+  bool before_p;
+} ce_added_insn_t;
 
-/* Return the basic block reached by falling though the basic block BB.  */
+DEF_VEC_O(ce_added_insn_t);
+DEF_VEC_ALLOC_O(ce_added_insn_t, heap);
 
-static basic_block
-block_fallthru (basic_block bb)
+/* A vector of these structures.  */
+static VEC(ce_added_insn_t, heap) *ce_added_insns;
+
+/* Record that INSN must be emitted before or after OTHER if we
+   succeed in converting the current block to COND_EXEC.  */
+
+static void
+cond_exec_record_insn (rtx insn, rtx other, bool before_p)
 {
-  edge e = find_fallthru_edge (bb->succs);
+  ce_added_insn_t cai;
 
-  return (e) ? e->dest : NULL_BLOCK;
+  if (ce_added_insns == NULL)
+    ce_added_insns = VEC_alloc (ce_added_insn_t, heap, 5);
+
+  cai.insn = insn;
+  cai.other = other;
+  cai.before_p = before_p;
+  
+  VEC_safe_push (ce_added_insn_t, heap, ce_added_insns, &cai);
 }
-
+
+/* Emit the insns we recorded earlier.  */
+
+static void
+cond_exec_finalize_added_insns (void)
+{
+  int i;
+  ce_added_insn_t *pcai;
+
+  if (ce_added_insns == NULL)
+    return;
+ 
+  FOR_EACH_VEC_ELT (ce_added_insn_t, ce_added_insns, i, pcai)
+    {
+      if (pcai->before_p)
+	emit_insn_before (pcai->insn, pcai->other);
+      else
+	emit_insn_after (pcai->insn, pcai->other);
+    }
+  VEC_free (ce_added_insn_t, heap, ce_added_insns);
+  ce_added_insns = NULL;
+}
+
+/* Called when we fail to convert a block to COND_EXEC, this discards the
+   previously recorded insns.  */
+
+static void
+cond_exec_discard_added_insns (void)
+{
+  if (ce_added_insns != NULL)
+    VEC_free (ce_added_insn_t, heap, ce_added_insns);
+  ce_added_insns = NULL;
+}
+
+/* Walk the insns between START and END, and record in *PSET the registers
+   set in this block, excluding the ones set in the last insn, which are
+   stored in *PSET_LAST.  These pointers may be identical if the caller
+   doesn't care about the distinction.  */
+
+static void
+find_set_in_block (const_rtx start, const_rtx end, HARD_REG_SET *pset,
+		   HARD_REG_SET *pset_last)
+{
+  const_rtx insn;
+
+  CLEAR_HARD_REG_SET (*pset);
+  CLEAR_HARD_REG_SET (*pset_last);
+
+  end = NEXT_INSN (end);
+  for (insn = start; insn != end; insn = NEXT_INSN (insn))
+    {
+      df_ref *def_rec;
+
+      if (LABEL_P (insn) || NOTE_P (insn) || DEBUG_INSN_P (insn))
+	continue;
+
+      gcc_assert (NONJUMP_INSN_P (insn) || CALL_P (insn));
+
+      /* Remove USE insns that get in the way.  */
+      if (GET_CODE (PATTERN (insn)) == USE)
+	continue;
+
+      if (pset != pset_last)
+	{
+	  IOR_HARD_REG_SET (*pset, *pset_last);
+	  CLEAR_HARD_REG_SET (*pset_last);
+	}
+      for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++)
+	{
+	  rtx dreg = DF_REF_REG (*def_rec);
+
+	  if (!REG_P (dreg))
+	    continue;
+
+	  add_to_hard_reg_set (pset_last, GET_MODE (dreg),
+			       REGNO (dreg));
+	}
+    }
+}
+
+
+typedef ce_if_block_t *ce_if_block_p;
+DEF_VEC_P (ce_if_block_p);
+DEF_VEC_ALLOC_P (ce_if_block_p, heap);
+
+/* Free all data associated with CE_INFO, including its sub-blocks, if any.  */ 
+
+static void
+free_sub_if_blocks (struct ce_if_block *ce_info)
+{
+  if (ce_info == NULL)
+    return;
+
+  free_sub_if_blocks (ce_info->then_br.ifblock);
+  free (ce_info->then_br.ifblock);
+
+  free_sub_if_blocks (ce_info->else_br.ifblock);
+  free (ce_info->else_br.ifblock);
+
+  free_sub_if_blocks (ce_info->join_ifblock);
+  free (ce_info->join_ifblock);
+
+  while (ce_info->then_br.rewrite_list)
+    {
+      struct ce_rewrite *t = ce_info->then_br.rewrite_list;
+      ce_info->then_br.rewrite_list = t->next_same_block;
+      free (t);
+    }
+  while (ce_info->else_br.rewrite_list)
+    {
+      struct ce_rewrite *t = ce_info->else_br.rewrite_list;
+      ce_info->else_br.rewrite_list = t->next_same_block;
+      free (t);
+    }
+}
+
+/* This structure holds additional information about every basic block
+   in a tree defined by ce_if_block_t.  Primarily, this is information
+   about the registers set in the block, and the condition chosen for
+   it.  */
+typedef struct ce_blocks_info
+{
+  /* The if block that computes the condition that will be used to
+     predicate this block.  */
+  ce_if_block_t *controlling;
+  /* Set if the controlling block's true condition should be used for this
+     basic block, clear if the block's false condtion should be used.  */
+  bool controlling_true_p;
+
+  /* Boundaries of the block in RTL.  START and END mark the block
+     proper; there may be a head and a tail if we found matching
+     sequences.  */
+  rtx all_start, last_head, start, end, first_tail, all_end;
+
+  /* Registers set in the matching head part of the block, if any.  */
+  HARD_REG_SET set_in_head;
+  /* Registers set in the block proper, excluding the last insn.  */
+  HARD_REG_SET set_in_block;
+  /* Registers set in the last insn of the block proper.  */
+  HARD_REG_SET set_last_insn;
+  /* Registers set in the matching tail part of the block, if any.  */
+  HARD_REG_SET set_in_tail;
+  /* Used to compute information about registers that are live due to
+     the flattening of the block structure.  */
+  HARD_REG_SET live_at_start;
+} ce_blocks_info_t;
+
+/* Go through all the basic blocks that are candidates for conversion.
+   The basic block vector is the one found in CE_ROOT, the
+   corresponding blocks information is in INFO.  We collect
+   information about block boundaries (taking matching sequences into
+   account), and compute set registers.
+   The function cond_exec_assign_conditions must have run prior to this.  */
+
+static void
+cond_exec_compute_blocks_info (ce_if_block_t *ce_root, ce_blocks_info_t *info)
+{
+  int i;
+  basic_block bb;
+
+  FOR_EACH_VEC_ELT (basic_block, ce_root->targets, i, bb)
+    {
+      ce_if_block_t *controller = info[i].controlling;
+      struct ce_if_branch *br = NULL;
+      rtx all_start = first_active_insn (bb);
+      rtx all_end = last_active_insn (bb, TRUE);
+      rtx start = all_start;
+      rtx end = all_end;
+
+      /* Since we'll be laying out all these blocks in a linear
+	 sequence without jumps, new instructions (disabled by
+	 predication) will be executed between an exit of one of the
+	 original blocks and its destination.  This means we have to
+	 be careful later when inserting new instructions for
+	 modifying conditions: they are present not only in the bb
+	 where we are inserting them, but conceptually also on these
+	 edges.
+	 Compute an extra live_at_start set based on the linear layout of
+	 the blocks.  */
+      if (bb != ce_root->join_bb)
+	{
+	  HARD_REG_SET live_out;
+	  VEC (basic_block, heap) *succ_blocks = NULL;
+	  edge e;
+	  edge_iterator ei;
+	  int j;
+	  int len = VEC_length (basic_block, ce_root->targets);
+
+	  CLEAR_HARD_REG_SET (live_out);
+	  reg_set_to_hard_reg_set (&live_out, df_get_live_out (bb));
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    VEC_safe_push (basic_block, heap, succ_blocks, e->dest);
+	  for (j = i + 1; j < len; j++)
+	    {
+	      HARD_REG_SET live;
+	      basic_block next_bb, test_bb;
+	      int found = -1;
+	      int k;
+
+	      CLEAR_HARD_REG_SET (live);
+	      next_bb = VEC_index (basic_block, ce_root->targets, j);
+	      FOR_EACH_VEC_ELT (basic_block, succ_blocks, k, test_bb)
+		{
+		  if (test_bb == next_bb)
+		    found = k;
+		  reg_set_to_hard_reg_set (&live, df_get_live_in (test_bb));
+		}
+	      if (found >= 0)
+		VEC_unordered_remove (basic_block, succ_blocks, found);
+	      AND_HARD_REG_SET (live, live_out);
+	      IOR_HARD_REG_SET (info[j].live_at_start, live);
+	    }
+	  if (!VEC_empty (basic_block, succ_blocks))
+	    {
+	      basic_block bb1 = VEC_pop (basic_block, succ_blocks);
+	      gcc_assert (VEC_empty (basic_block, succ_blocks));
+	      gcc_assert (bb1 == ce_root->join_bb);
+	    }
+	  VEC_free (basic_block, heap, succ_blocks);
+	}
+
+      if (start == NULL_RTX || end == NULL_RTX)
+	continue;
+
+      info[i].all_start = all_start;
+      info[i].all_end = all_end;
+
+      /* Find out if there were matching insns.  */
+      if (controller && bb == controller->then_bb)
+	br = &controller->then_br;
+      else if (controller && bb == controller->else_bb)
+	br = &controller->else_br;
+      if (br)
+	{
+	  rtx last_head = br->last_head;
+	  rtx first_tail = br->first_tail;
+
+	  if (last_head && bb == controller->then_bb)
+	    find_set_in_block (start, last_head, &info[i].set_in_head,
+			       &info[i].set_in_head);
+	  if (first_tail && bb == controller->else_bb)
+	    find_set_in_block (first_tail, end, &info[i].set_in_tail,
+			       &info[i].set_in_tail);
+
+	  if (first_tail == BB_HEAD (bb))
+	    start = end = NULL_RTX;
+	  else if (first_tail)
+	    end = find_active_insn_before (bb, first_tail);
+	  if (start != NULL && last_head)
+	    {
+	      if (last_head == end)
+		start = end = NULL_RTX;
+	      else
+		start = find_active_insn_after (bb, last_head);
+	    }
+	  if (first_tail && !INSN_P (first_tail))
+	    first_tail = find_active_insn_after (bb, first_tail);
+
+	  info[i].last_head = last_head;
+	  info[i].first_tail = first_tail;
+	  if (start == NULL)
+	    continue;
+	}
+
+      info[i].start = start;
+      info[i].end = end;
+
+      find_set_in_block (start, end, &info[i].set_in_block,
+			 &info[i].set_last_insn);
+    }
+}
+
+/* Find SEARCH_BB in VEC, and return its index.  */
+
+static int
+find_block_in_vec (VEC (basic_block, heap) *vec, basic_block search_bb)
+{
+  int i;
+  basic_block bb;
+  FOR_EACH_VEC_ELT (basic_block, vec, i, bb)
+    if (bb == search_bb)
+      return i;
+  gcc_unreachable ();
+}
+
+/* Return the index of the last basic block using the condition set in CE_INFO's
+   test block.  If the condition is reused by a subblock, this index is that
+   of the reuser's test block.  Otherwise, it is simply the last block at the
+   level below CE_INFO, i.e. the last block in its targets vector, unless
+   that is the join block.  If ASSUME_NOT_REUSED is true, we ignore the
+   possibility of reuse.  */
+
+static int
+cond_exec_last_cond_user (ce_if_block_t *ce_info, bool assume_not_reused)
+{
+  if (!assume_not_reused
+      && ce_info->reuser
+      && ce_info->reuser->cond_regno == ce_info->cond_regno)
+    return ce_info->reuser->blocks_offset;
+
+  return ce_info->blocks_offset + ce_info->blocks_length - 1;
+}
+
+/* Check if REGNO is valid for use in the condition set by the test block
+   associated with CE_INFO.  CLOBBERED is the set of all registers clobbered in
+   basic blocks at the level below CE_INFO, while CLOBBERED_IF_REUSED is
+   computed from a limited set of blocks, ending at the test block that can
+   possibly reuse CE_INFO's condition.  */
+
+static bool
+cond_exec_reg_valid_for_cond_p (ce_if_block_t *ce_info, HARD_REG_SET clobbered,
+				HARD_REG_SET clobbered_if_reused, int regno)
+{
+  if (!df_regs_ever_live_p (regno) && !call_used_regs[regno])
+    return false;
+
+  if (fixed_regs[regno])
+    return false;
+
+#ifdef PREDICATE_REG_CLASS
+  if (!TEST_HARD_REG_BIT (reg_class_contents[PREDICATE_REG_CLASS], regno))
+    return false;
+#endif
+
+  if (regno != ce_info->cond_regno
+      && (overlaps_hard_reg_set_p (ce_info->live_at_end,
+				   ce_info->cond_reg_mode, regno)
+	  || overlaps_hard_reg_set_p (ce_info->live_at_end,
+				      ce_info->cond_reg_mode,
+				      ce_info->cond_regno)))
+    return false;
+
+  if (ce_info->reuser && ce_info->reuser->cond_regno == regno)
+    return (!overlaps_hard_reg_set_p (ce_info->sub_nonreusable,
+				      ce_info->cond_reg_mode, regno)
+	    && !overlaps_hard_reg_set_p (clobbered_if_reused,
+					 ce_info->cond_reg_mode, regno));
+  else
+    return (!overlaps_hard_reg_set_p (ce_info->sub_nonreusable,
+				      ce_info->cond_reg_mode, regno)
+	    && !overlaps_hard_reg_set_p (clobbered, ce_info->cond_reg_mode,
+					 regno));
+}
+
+/* Recursively process if blocks, starting with innermost ones first,
+   and assign registers for the conditions they set.  The block to be
+   processed is CE_INFO, while the root of the tree is at ROOT.
+   BLOCKS_VEC contains auxiliary information about the basic blocks
+   involved in the if-block tree.  SET_ANYWHERE contains the registers
+   that are set in any of these blocks, it is used to give slight
+   preference to other registers when allocating.  */
+
+static bool
+cond_exec_assign_registers (ce_if_block_t *root, ce_if_block_t *ce_info,
+			    ce_blocks_info_t *blocks_vec,
+			    HARD_REG_SET set_anywhere)
+{
+  ce_if_block_t *sub_info;
+  int i, last_user, reuser_idx, best_reg;
+  enum machine_mode mode;
+  HARD_REG_SET clobbered;
+  HARD_REG_SET clobbered_if_reused;
+  rtx condreg, newreg, op1;
+  bool success = true;
+
+  if (ce_info->then_br.ifblock)
+    {
+      success &= cond_exec_assign_registers (root, ce_info->then_br.ifblock,
+					     blocks_vec, set_anywhere);
+      sub_info = ce_info->then_br.ifblock->join_ifblock;
+      while (sub_info)
+	{
+	  success &= cond_exec_assign_registers (root, sub_info, blocks_vec,
+						 set_anywhere);
+	  sub_info = sub_info->join_ifblock;
+	}
+    }
+  if (ce_info->else_br.ifblock)
+    {
+      success &= cond_exec_assign_registers (root, ce_info->else_br.ifblock,
+					     blocks_vec, set_anywhere);
+      sub_info = ce_info->else_br.ifblock->join_ifblock;
+      while (sub_info)
+	{
+	  success &= cond_exec_assign_registers (root, sub_info, blocks_vec,
+						 set_anywhere);
+	  sub_info = sub_info->join_ifblock;
+	}
+    }
+
+  /* We can't handle this case yet as we don't assign enough registers.  */
+  if (ce_info->then_br.ifblock && ce_info->else_br.ifblock)
+    success = false;
+
+  ce_info->will_convert = success;
+  if (!success)
+    return false;
+
+  last_user = cond_exec_last_cond_user (ce_info, true);
+  reuser_idx = last_user;
+  if (ce_info->reuser)
+    reuser_idx = ce_info->reuser->blocks_offset;
+
+  CLEAR_HARD_REG_SET (clobbered);
+  CLEAR_HARD_REG_SET (clobbered_if_reused);
+  for (i = ce_info->blocks_offset + 1; i <= last_user; i++)
+    {
+      basic_block bb = VEC_index (basic_block, root->targets, i);
+      IOR_HARD_REG_SET (clobbered, blocks_vec[i].set_in_head);
+      IOR_HARD_REG_SET (clobbered, blocks_vec[i].set_in_block);
+      /* Normally, all insns in a block must be checked whether they clobber
+	 the condition.  The only exception is that the last instruction that
+	 uses the condition.  However, if there are basic blocks that have
+	 no successor, we can treat them as having "last" instructions as
+	 well.  */
+      if (i != last_user
+	  && (EDGE_COUNT (bb->succs) > 0
+	      || blocks_vec[i].first_tail != NULL_RTX))
+	{
+	  IOR_HARD_REG_SET (clobbered, blocks_vec[i].set_last_insn);
+	  IOR_HARD_REG_SET (clobbered, blocks_vec[i].set_in_tail);
+	}
+      if (i <= reuser_idx)
+	{
+	  IOR_HARD_REG_SET (clobbered_if_reused, blocks_vec[i].set_in_head);
+	  IOR_HARD_REG_SET (clobbered_if_reused, blocks_vec[i].set_in_block);
+	}
+      if (i < reuser_idx)
+	{
+	  IOR_HARD_REG_SET (clobbered_if_reused, blocks_vec[i].set_last_insn);
+	  IOR_HARD_REG_SET (clobbered_if_reused, blocks_vec[i].set_in_tail);
+	}
+    }
+
+  if (dump_file)
+    fprintf (dump_file, "assigning reg for test block %d: last user index %d\n",
+	     ce_info->test_bb->index, last_user);
+
+  condreg = XEXP (ce_info->else_br.cond, 0);
+  if (ce_info->set_reg == -1)
+    {
+      bool retval = cond_exec_reg_valid_for_cond_p (ce_info, clobbered,
+						    clobbered_if_reused,
+						    REGNO (condreg));
+      if (dump_file)
+	fprintf (dump_file,
+		 "  unable to change reg; %ssuccess with reg %d\n",
+		 retval ? "" : "no ", ce_info->cond_regno);
+      if (!retval)
+	{
+	  ce_info->will_convert = false;
+	  return false;
+	}
+      best_reg = REGNO (condreg);
+      newreg = gen_rtx_REG (GET_MODE (condreg), REGNO (condreg));
+    }
+  else
+    {
+      ce_blocks_info_t *test_info;
+      rtx set = single_set (ce_info->end);
+      int best_cost = INT_MAX;
+
+      best_reg = -1;
+      newreg = NULL_RTX;
+      for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+	{
+	  rtx dest = SET_DEST (set);
+	  int cost;
+
+	  if (!cond_exec_reg_valid_for_cond_p (ce_info, clobbered,
+					       clobbered_if_reused, i))
+	    continue;
+
+	  if (ce_info->reuser != NULL && ce_info->reuser->cond_regno != i)
+	    cost = 2;
+	  else
+	    cost = 0;
+	  if (TEST_HARD_REG_BIT (set_anywhere, i))
+	    cost++;
+	  if (cost >= best_cost)
+	    continue;
+
+	  if (newreg == NULL_RTX)
+	    newreg = gen_raw_REG (GET_MODE (condreg), i);
+	  else
+	    SET_REGNO_RAW (newreg, i);
+	  validate_change (ce_info->end, &SET_DEST (set), newreg,
+			   true);
+	  if (validate_replace_rtx (dest, newreg, BB_END (ce_info->test_bb)))
+	    {
+	      best_reg = i;
+	      best_cost = cost;
+	      newreg = NULL_RTX;
+	      if (cost == 0)
+		break;
+	    }
+	}
+
+      if (dump_file)
+	fprintf (dump_file, "  decided on best reg %d\n", best_reg);
+      if (best_reg == -1)
+	{
+	  ce_info->will_convert = false;
+	  return false;
+	}
+      newreg = SET_DEST (set);
+      ce_info->cond_regno = best_reg;
+      test_info = blocks_vec + ce_info->blocks_offset;
+      remove_from_hard_reg_set (&test_info->set_last_insn,
+				ce_info->cond_reg_mode, ce_info->set_reg);
+      add_to_hard_reg_set (&test_info->set_last_insn,
+			   ce_info->cond_reg_mode, best_reg);
+      ce_info->set_reg = best_reg;
+    }
+
+  if (ce_info->parent && ce_info->parent->reuser != ce_info)
+    SET_HARD_REG_BIT (ce_info->parent->sub_nonreusable, ce_info->cond_regno);
+  if (ce_info->parent)
+    IOR_HARD_REG_SET (ce_info->parent->sub_nonreusable, ce_info->sub_nonreusable);
+
+  /* Regenerate the conditions.  */
+  mode = GET_MODE (ce_info->else_br.cond);
+  op1 = XEXP (ce_info->else_br.cond, 1);
+
+  if (ce_info->then_bb)
+    {
+      ce_blocks_info_t *then_bb_info;
+      then_bb_info = blocks_vec + find_block_in_vec (root->targets,
+						     ce_info->then_bb);
+      ce_info->then_br.cond = gen_rtx_fmt_ee (ce_info->then_br.cond_code, mode,
+					      newreg, op1);
+      ce_info->then_br.may_rewrite
+	= !overlaps_hard_reg_set_p (then_bb_info->live_at_start,
+				    ce_info->cond_reg_mode, best_reg);
+			      
+    }
+  if (ce_info->else_bb)
+    {
+      ce_blocks_info_t *else_bb_info;
+      else_bb_info = blocks_vec + find_block_in_vec (root->targets,
+						     ce_info->else_bb);
+      ce_info->else_br.cond = gen_rtx_fmt_ee (ce_info->else_br.cond_code, mode,
+					      newreg, op1);
+      ce_info->else_br.may_rewrite
+	= !overlaps_hard_reg_set_p (else_bb_info->live_at_start,
+				    ce_info->cond_reg_mode, best_reg);
+    }
+  return true;
+}
+
+/* This function should be called after the will_convert member has first been
+   computed for individual if-blocks, or if subsequent steps in the analysis
+   may have cleared it for some of them.
+
+   It limits the will_convert flag for each if-block so that
+   afterwards, it is true only for blocks we are really going to try to
+   convert.  It assumes that on entry, if a block cannot be converted, all its
+   parents also have will_convert set to false.  The function then pushes this
+   information downward in the tree, so that on any path to a leaf, the first
+   node seen with will_convert has unaccounted == 0 (indicating that it is
+   self-contained), and all following nodes in such a path also have
+   will_convert == true (i.e. the entire subtree is consistent).
+
+   CE_INFO is an if-block to be examined; PARENT_OK indicates whether the
+   parent if-block will be converted.  Return true if there is something
+   left in the tree that can be converted, false otherwise.  */
+
+static bool
+cond_exec_limit_convert (ce_if_block_t *ce_info, bool parent_ok)
+{
+  bool any_success = false;
+
+  if (ce_info->join_ifblock)
+    any_success |= cond_exec_limit_convert (ce_info->join_ifblock, parent_ok);
+
+  /* A value of will_convert means that all child nodes recursively had a register
+     assigned successfully.  If no nodes are unaccounted, the sub-if-block
+     represented by ce_info can be converted.  */
+  if (ce_info->unaccounted == 0 && ce_info->will_convert)
+    parent_ok = true;
+
+  ce_info->will_convert &= parent_ok;
+
+  if (!ce_info->then_br.ifblock && !ce_info->else_br.ifblock)
+    return ce_info->will_convert;
+
+  if (ce_info->then_br.ifblock)
+    any_success |= cond_exec_limit_convert (ce_info->then_br.ifblock,
+					   ce_info->will_convert);
+  if (ce_info->else_br.ifblock)
+    any_success |= cond_exec_limit_convert (ce_info->else_br.ifblock,
+					   ce_info->will_convert);
+
+  return any_success;
+}
+
+/* A subroutine of cond_exec_assign_conditions.  This function does
+   the actual work, setting up the data in BLOCKS_VEC about which
+   condition to use for each block BB.  The condition is described by
+   CE_INFO, giving the if-block that computes it, and TRUE_P,
+   indicating whether the condition must be true or false for BB to be
+   executed.  ROOT is the root of the if-block tree.
+
+   This is done only once per block, the first time we are called for it.
+   The precise structure of the recursive walk in cond_exec_assign_conditions
+   ensures that this leads to the correct choice of condition.  */
+
+static void
+cond_exec_assign_single_block (ce_if_block_t *root, ce_if_block_t *ce_info,
+			       basic_block bb, bool true_p,
+			       ce_blocks_info_t *blocks_vec)
+{
+  int i;
+  
+  if (bb == NULL)
+    return;
+  
+  i = find_block_in_vec (root->targets, bb);
+  gcc_assert (bb == VEC_index (basic_block, root->targets, i));
+
+  /* We process the blocks from the leaves of the tree upwards, which means
+     that we get the right condition the first time this function is called
+     for a given block.  */
+  if (blocks_vec[i].controlling != NULL)
+    return;
+
+  blocks_vec[i].controlling = ce_info;
+  blocks_vec[i].controlling_true_p = true_p;
+
+  if (dump_file)
+    fprintf (dump_file, "assigning cond to block %d: condition for block %d, %s\n",
+	     VEC_index (basic_block, root->targets, i)->index,
+	     ce_info->test_bb->index, true_p ? "true" : "false");
+}
+
+/* Recursively assign conditions to blocks, starting with innermost ones
+   first.  ROOT is the root of the tree, CE_INFO the if-block we're currently
+   processing.  BLOCKS_VEC holds information about all basic blocks we're
+   considering in the entire tree.
+   This function by itself does little more than calling itself recursively,
+   and then calling cond_exec_assign_single_block for all nodes at the same
+   level.  */
+
+static void
+cond_exec_assign_conditions (ce_if_block_t *root, ce_if_block_t *ce_info,
+			     ce_blocks_info_t *blocks_vec)
+{
+  ce_if_block_t *sub_info;
+  if (ce_info->then_br.ifblock)
+    cond_exec_assign_conditions (root, ce_info->then_br.ifblock, blocks_vec);
+  if (ce_info->else_br.ifblock)
+    cond_exec_assign_conditions (root, ce_info->else_br.ifblock, blocks_vec);
+  if (ce_info->join_ifblock)
+    cond_exec_assign_conditions (root, ce_info->join_ifblock, blocks_vec);
+
+  cond_exec_assign_single_block (root, ce_info, ce_info->then_bb, false,
+				 blocks_vec);
+  if (ce_info->then_br.ifblock)
+    {
+      sub_info = ce_info->then_br.ifblock;
+      while (sub_info && sub_info->join_bb != ce_info->join_bb)
+	{
+	  cond_exec_assign_single_block (root, ce_info, sub_info->join_bb,
+					 false, blocks_vec);
+	  sub_info = sub_info->join_ifblock;
+	}
+    }
+  cond_exec_assign_single_block (root, ce_info, ce_info->else_bb, true,
+				 blocks_vec);  
+  if (ce_info->else_br.ifblock)
+    {
+      sub_info = ce_info->else_br.ifblock;
+      while (sub_info && sub_info->join_bb != ce_info->join_bb)
+	{
+	  cond_exec_assign_single_block (root, ce_info, sub_info->join_bb,
+					 true, blocks_vec);
+	  sub_info = sub_info->join_ifblock;
+	}
+    }
+}
+
+/* Record that before the if-branch BRANCH, which has a condition set by
+   CONTROLLER, we must rewrite its condition.
+   The new insn that rewrites the condition should either make the
+   condition for BRANCH true or false, depending on TRUE_P, and it will
+   itself be conditional on CONTROLLING_COND.  */
+
+static void
+record_rewrite (struct ce_if_branch *branch, ce_if_block_t *controller,
+		rtx controlling_cond, bool true_p)
+{
+  struct ce_rewrite *cei = XNEW (struct ce_rewrite);
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "recording rewrite: ");
+      print_rtl_single (dump_file, controlling_cond);
+      fprintf (dump_file, " implies condition ");
+      print_rtl_single (dump_file, branch->cond);
+      fprintf (dump_file, " is %s\n", true_p ? "true" : "false");
+    }
+  cei->controller = controller;
+  cei->controlling_cond = controlling_cond;
+  cei->implied_true_p = true_p;
+  cei->next_same_block = branch->rewrite_list;
+  branch->rewrite_list = cei;
+}
+
+/* Consider a structure such as
+
+                           F
+                          /
+                   B2 - C2
+                  /       \
+           B1 - C1         T0
+          /       \ 
+   B0 - C0         \
+          \-------- T1
+
+   where Bn stands for a basic block, and Cn at the condition set at its end.
+   B2 uses !C1 as its condition, and we record a ce_rewrite struct for B2:
+   "C0 implies C1" (i.e. the instruction generated would be "[C0] C1 = 1").
+   This instruction, together with the fact that B0 and B1 have the same else
+   branch, means that when creating ce_rewrite structures for F and T0, we can
+   ignore C0, and just use "[C1] C2 = 1" for F and "[C1] C2 = 0" for T0.  */
+
+static bool
+skip_cond_rewrite_p (ce_if_block_t *parent, ce_if_block_t *sub_info)
+{
+  return (parent->else_bb == sub_info->else_bb
+	  || parent->then_bb == sub_info->then_bb);
+}
+
+/* Find the cases where we must modify condition registers based on conditions
+   at a higher level in the tree.  See the section "Rewriting conditions" in
+   the comment in basic-block.h.
+   ROOT is the root of the if-block tree, while CE_INFO is the if-block we
+   should be examining.  We descend into CE_INFO recursively.  BLOCKS_VEC
+   holds side information about all the basic blocks involved in the if-block
+   tree.  */
+
+static void
+cond_exec_modify_conditions (ce_if_block_t *root, ce_if_block_t *ce_info,
+			     ce_blocks_info_t *blocks_vec)
+{
+  basic_block then_bb, else_bb;
+  rtx this_block_false_cond;
+  ce_if_block_t *parent = ce_info->parent;
+  ce_if_block_t *first_recorded_parent = NULL;
+  ce_blocks_info_t *test_bb_info;
+  ce_blocks_info_t *then_bb_info = NULL, *else_bb_info = NULL;
+
+  if (ce_info->then_br.ifblock)
+    cond_exec_modify_conditions (root, ce_info->then_br.ifblock, blocks_vec);
+  if (ce_info->else_br.ifblock)
+    cond_exec_modify_conditions (root, ce_info->else_br.ifblock, blocks_vec);
+  if (ce_info->join_ifblock)
+    cond_exec_modify_conditions (root, ce_info->join_ifblock, blocks_vec);
+
+  if (!ce_info->will_convert)
+    return;
+
+  if (parent == NULL || !parent->will_convert)
+    return;
+
+  test_bb_info = blocks_vec + ce_info->blocks_offset;
+  gcc_assert (test_bb_info->controlling == ce_info->parent);
+
+  then_bb = ce_info->then_bb;
+  else_bb = ce_info->else_bb;
+
+  /* There are two cases we must distinguish.  If this is the last test block
+     in a sequence (i.e. then_bb and else_bb are not themselves test blocks),
+     then we must walk upwards towards the topmost condition and adjust the
+     condition register for each of the conditions encountered.
+     If this is a test block that is followed by at least one more test block,
+     we have ensured in ce_discover_if_structure that the child test block is
+     reachable only from here.  Hence, we need to update the condition with at
+     most one instruction.  */
+  if (ce_info->then_br.ifblock != NULL || ce_info->else_br.ifblock != NULL)
+    {
+      /* This is the case where there is at least one sub-ifblock.  */
+      gcc_assert (!(ce_info->then_br.ifblock && ce_info->else_br.ifblock));
+
+      if (parent->reuser != ce_info || parent->cond_regno != ce_info->cond_regno)
+	{
+	  this_block_false_cond = (test_bb_info->controlling_true_p
+				   ? parent->then_br.cond : parent->else_br.cond);
+
+	  if (ce_info->then_br.ifblock)
+	    record_rewrite (&ce_info->then_br, parent, this_block_false_cond, false);
+	  if (ce_info->else_br.ifblock)
+	    record_rewrite (&ce_info->else_br, parent, this_block_false_cond, false);
+	  if (ce_info->then_br.ifblock || ce_info->else_br.ifblock)
+	    first_recorded_parent = parent;
+	}
+    }
+
+  /* Deal with the case where then or else blocks are terminal,
+     i.e. not themselves nested ifblocks, and also not controlled by
+     any nested ifblocks.  We must walk backwards and ensure the early
+     decisions in upper test blocks take the right path.  */
+  if (then_bb && ce_info->then_br.ifblock == NULL)
+    then_bb_info = blocks_vec + find_block_in_vec (root->targets,
+						   ce_info->then_bb);
+  if (else_bb && ce_info->else_br.ifblock == NULL)
+    else_bb_info = blocks_vec + find_block_in_vec (root->targets,
+						   ce_info->else_bb);
+  if ((then_bb_info && then_bb_info->controlling == ce_info)
+      || (else_bb_info && else_bb_info->controlling == ce_info))
+    {
+      ce_if_block_t *cur = ce_info;
+
+      while (parent && parent->will_convert)
+	{
+	  if ((ce_info == cur || !skip_cond_rewrite_p (parent, cur))
+	      && (parent->reuser != ce_info
+		  || parent->cond_regno != ce_info->cond_regno))
+	    {
+	      basic_block parent_other_branch;
+
+	      test_bb_info = blocks_vec + cur->blocks_offset;
+	      this_block_false_cond = (test_bb_info->controlling_true_p
+				       ? parent->then_br.cond
+				       : parent->else_br.cond);
+
+	      parent_other_branch = (test_bb_info->controlling_true_p
+				     ? parent->then_bb
+				     : parent->else_bb);
+
+	      if (else_bb_info && else_bb_info->controlling == ce_info)
+		record_rewrite (&ce_info->else_br, parent, this_block_false_cond,
+				parent_other_branch == else_bb);
+	      if (then_bb_info && then_bb_info->controlling == ce_info)
+		record_rewrite (&ce_info->then_br, parent, this_block_false_cond,
+				parent_other_branch == then_bb);
+
+	      if (first_recorded_parent == NULL)
+		first_recorded_parent = parent;
+	    }
+	  cur = parent;
+	  parent = cur->parent;
+	}
+    }
+
+  /* See if we've recorded modifications for a live register.  We can't clobber
+     the value, so all the parent blocks that caused such modifications must
+     be marked as non-convertable.  */
+
+  if (first_recorded_parent != NULL
+      && (TEST_HARD_REG_BIT (ce_info->live_at_end, ce_info->cond_regno)
+	  || (ce_info->then_br.rewrite_list != NULL
+	      && !ce_info->then_br.may_rewrite)
+	  || (ce_info->else_br.rewrite_list != NULL
+	      && !ce_info->else_br.may_rewrite)))
+    {
+      parent = first_recorded_parent;
+      if (dump_file)
+	fprintf (dump_file, "would have to clobber reg %d after block %d\n",
+		 ce_info->cond_regno, ce_info->test_bb->index);
+      while (parent)
+	{
+	  parent->will_convert = false;
+	  if (dump_file)
+	    fprintf (dump_file, "disabling conversion of %d\n",
+		     parent->test_bb->index);
+		   
+	  parent = parent->parent;
+	}
+    }
+}
+
+/* For blocks nested inside more than one condition, we may have to
+   modify the condition register based on the conditions higher up in
+   the tree.  Input code of the form
+   
+   if (C0 && C1 && C2) { A } else { B }
+
+   will end up using C2 as the predicate for A, and !C2 as the
+   predicate for B.  We must insert two insns,
+   [!C1] C2 = 1
+   [!C0] C2 = 1
+   in exactly that order, before A, and two opposite insns
+   before B.
+   TODO: Equivalent forms using logical operations would also be valid.
+
+   THIS_BRANCH and OPPOSITE_BRANCH point to the then/else branches of the
+   same if block; BB corresponds to THIS_BRANCH.  */
+   
+static bool
+make_insns_for_implied_conds (struct ce_if_branch *this_branch,
+			      struct ce_if_branch *opposite_branch,
+			      basic_block bb)
+{
+  struct ce_rewrite *list = this_branch->rewrite_list;
+  rtx head;
+
+  if (!list)
+    return true;
+
+  head = first_active_insn (bb);
+  if (head == NULL)
+    {
+      /* If a block is really empty, we don't need to compute the
+	 correct condition, since we won't use it anyway.  */
+      if (this_branch->ifblock == NULL)
+	return true;
+      head = BB_END (bb);
+      gcc_assert (JUMP_P (head));
+    }
+
+  while (list)
+    {
+      rtx new_src, new_insn, new_cond, to_be_made_true;
+      struct ce_rewrite *this_elt = list;
+
+      list = list->next_same_block;
+      if (!this_elt->controller->will_convert)
+	continue;
+
+      if (this_elt->implied_true_p)
+	to_be_made_true = this_branch->cond;
+      else
+	to_be_made_true = opposite_branch->cond;
+      if (GET_CODE (to_be_made_true) == EQ)
+	{
+	  new_src = XEXP (to_be_made_true, 1);
+	}
+      else if (GET_CODE (to_be_made_true) == NE)
+	{
+	  if (XEXP (to_be_made_true, 1) == const0_rtx)
+	    new_src = const1_rtx;
+	  else
+	    new_src = const0_rtx;
+	}
+      else
+	return false;
+      new_insn = gen_move_insn (XEXP (to_be_made_true, 0), new_src);
+      gcc_assert (INSN_P (new_insn));
+      if (NEXT_INSN (new_insn) != NULL_RTX)
+	return false;
+      new_cond = copy_rtx (this_elt->controlling_cond);
+      validate_change (new_insn, &PATTERN (new_insn),
+		       gen_rtx_COND_EXEC (VOIDmode, new_cond,
+					  PATTERN (new_insn)),
+		       true);
+      cond_exec_record_insn (new_insn, PREV_INSN (head), false);
+    }
+  return true;
+}
+
+/* Recursively descend through the if-blocks in CE_INFO, and call
+   make_insns_for_implied_conds.  TOPLEVEL is true only if CE_INFO is
+   the root of the tree.  */
+
+static bool
+cond_exec_make_extra_insns (ce_if_block_t *ce_info, bool toplevel)
+{
+  if (ce_info->then_br.ifblock)
+    cond_exec_make_extra_insns (ce_info->then_br.ifblock, false);
+  if (ce_info->else_br.ifblock)
+    cond_exec_make_extra_insns (ce_info->else_br.ifblock, false);
+  if (!toplevel && ce_info->join_ifblock)
+    cond_exec_make_extra_insns (ce_info->join_ifblock, false);
+
+  if (!make_insns_for_implied_conds (&ce_info->then_br, &ce_info->else_br,
+				     ce_info->then_bb)
+      || !make_insns_for_implied_conds (&ce_info->else_br, &ce_info->then_br,
+					ce_info->else_bb))
+    return false;
+  return true;
+}
+
+/* This function is called after cond_exec_modify_conditions.  At this point,
+   we know exactly how many instructions would be converted, and we can further
+   limit the blocks we convert by taking costs into account.  We recursively
+   descend into the if block tree, starting with CE_INFO.  After treating all
+   the subblocks, compute the number of insns in them and compare against the
+   maximum.  Update the will_convert member, and return it.  */
+
+static bool
+cond_exec_limit_costs (ce_if_block_t *ce_info)
+{
+  ce_if_block_t *sub_info;
+  bool success = true;
+  int total;
+
+  if (ce_info->then_br.ifblock)
+    {
+      success |= cond_exec_limit_costs (ce_info->then_br.ifblock);
+      total = ce_info->then_br.ifblock->num_insns;
+      sub_info = ce_info->then_br.ifblock->join_ifblock;
+      while (sub_info)
+	{
+	  success &= cond_exec_limit_costs (sub_info);
+	  total += sub_info->num_insns;
+	  sub_info = sub_info->join_ifblock;
+	}
+    }
+  else
+    total = ce_info->then_br.num_insns;
+  if (total > MAX_CONDITIONAL_EXECUTE)
+    success = false;
+  ce_info->num_insns += total;
+
+  if (ce_info->else_br.ifblock)
+    {
+      success &= cond_exec_limit_costs (ce_info->else_br.ifblock);
+      total = ce_info->else_br.ifblock->num_insns;
+      sub_info = ce_info->else_br.ifblock->join_ifblock;
+      while (sub_info)
+	{
+	  success &= cond_exec_limit_costs (sub_info);
+	  total += sub_info->num_insns;
+	  sub_info = sub_info->join_ifblock;
+	}
+    }
+  else
+    total += ce_info->else_br.num_insns;
+  if (total > MAX_CONDITIONAL_EXECUTE)
+    success = false;
+  ce_info->num_insns += total;
+
+  ce_info->will_convert &= success;
+  return ce_info->will_convert;
+}
+
 /* Go through a bunch of insns, converting them to conditional
    execution format if possible.  Return TRUE if all of the non-note
    insns were processed.  */
@@ -312,7 +1365,7 @@  cond_exec_process_insns (ce_if_block_t *
       if (NOTE_P (insn) || DEBUG_INSN_P (insn))
 	goto insn_done;
 
-      gcc_assert(NONJUMP_INSN_P (insn) || CALL_P (insn));
+      gcc_assert (NONJUMP_INSN_P (insn) || CALL_P (insn));
 
       /* Remove USE insns that get in the way.  */
       if (reload_completed && GET_CODE (PATTERN (insn)) == USE)
@@ -405,258 +1458,336 @@  cond_exec_get_condition (rtx jump)
   return cond;
 }
 
-/* Given a simple IF-THEN or IF-THEN-ELSE block, attempt to convert it
-   to conditional execution.  Return TRUE if we were successful at
-   converting the block.  */
+/* Walk a tree of if-blocks, the root of which is ROOT, and finalize
+   some of the structure members: blocks_offset, and parent.  The current
+   block is CE_INFO and PARENT is its parent.
+   This function also prints the structure of the if block to the dump
+   file; ROLE is used as a descriptive string for the current block.
+   LEVEL is the indentation level to use for this output.  */
 
-static int
-cond_exec_process_if_block (ce_if_block_t * ce_info,
-			    /* if block information */int do_multiple_p)
+static void
+cond_exec_finalize_if_tree (ce_if_block_t *root, ce_if_block_t *ce_info,
+			  ce_if_block_t *parent, int level, const char *role)
 {
-  basic_block test_bb = ce_info->test_bb;	/* last test block */
-  basic_block then_bb = ce_info->then_bb;	/* THEN */
-  basic_block else_bb = ce_info->else_bb;	/* ELSE or NULL */
-  rtx test_expr;		/* expression in IF_THEN_ELSE that is tested */
-  rtx then_start;		/* first insn in THEN block */
-  rtx then_end;			/* last insn + 1 in THEN block */
-  rtx else_start = NULL_RTX;	/* first insn in ELSE block or NULL */
-  rtx else_end = NULL_RTX;	/* last insn + 1 in ELSE block */
-  int max;			/* max # of insns to convert.  */
-  int then_mod_ok;		/* whether conditional mods are ok in THEN */
-  rtx true_expr;		/* test for else block insns */
-  rtx false_expr;		/* test for then block insns */
-  rtx true_prob_val;		/* probability of else block */
-  rtx false_prob_val;		/* probability of then block */
-  rtx then_last_head = NULL_RTX;	/* Last match at the head of THEN */
-  rtx else_last_head = NULL_RTX;	/* Last match at the head of ELSE */
-  rtx then_first_tail = NULL_RTX;	/* First match at the tail of THEN */
-  rtx else_first_tail = NULL_RTX;	/* First match at the tail of ELSE */
-  int then_n_insns, else_n_insns, n_insns;
-  enum rtx_code false_code;
+  ce_if_block_t *sub_info;
 
-  /* If test is comprised of && or || elements, and we've failed at handling
-     all of them together, just use the last test if it is the special case of
-     && elements without an ELSE block.  */
-  if (!do_multiple_p && ce_info->num_multiple_test_blocks)
+  ce_info->parent = parent;
+  gcc_assert (ce_info->test_bb == VEC_index (basic_block, root->targets,
+					     ce_info->blocks_offset));
+  if (dump_file)
     {
-      if (else_bb || ! ce_info->and_and_p)
-	return FALSE;
+      int i;
+      for (i = 0; i < level; i++)
+	fputc (' ', dump_file);
+      fprintf (dump_file, "Test block %d (%s for parent %d)\n",
+	       ce_info->test_bb->index, role,
+	       ce_info == root ? -1 : ce_info->parent->test_bb->index);
+    }
+  if (ce_info->then_br.ifblock)
+    {
+      ce_info->then_br.ifblock->blocks_offset += ce_info->blocks_offset;
+      cond_exec_finalize_if_tree (root, ce_info->then_br.ifblock, ce_info,
+				level + 2, "then");
+      sub_info = ce_info->then_br.ifblock;
+      while (sub_info->join_ifblock)
+	{
+	  sub_info = sub_info->join_ifblock;
+	  sub_info->blocks_offset += ce_info->blocks_offset;
+	  cond_exec_finalize_if_tree (root, sub_info, ce_info, level + 2,
+				    "join continue for then");
+	}
+    }
+  else if (dump_file && ce_info->then_bb)
+    {
+      int i;
+      for (i = 0; i < level + 2; i++)
+	fputc (' ', dump_file);
+      fprintf (dump_file, "Then block %d\n", ce_info->then_bb->index);
+    }
 
-      ce_info->test_bb = test_bb = ce_info->last_test_bb;
-      ce_info->num_multiple_test_blocks = 0;
-      ce_info->num_and_and_blocks = 0;
-      ce_info->num_or_or_blocks = 0;
+  if (ce_info->else_br.ifblock)
+    {
+      ce_info->else_br.ifblock->blocks_offset += ce_info->blocks_offset;
+      cond_exec_finalize_if_tree (root, ce_info->else_br.ifblock,
+				ce_info, level + 2, "else");
+      sub_info = ce_info->else_br.ifblock;
+      while (sub_info->join_ifblock)
+	{
+	  sub_info = sub_info->join_ifblock;
+	  sub_info->blocks_offset += ce_info->blocks_offset;
+	  cond_exec_finalize_if_tree (root, sub_info, ce_info, level + 2,
+				    "join continue for else");
+	}
+    }
+  else if (dump_file && ce_info->else_bb)
+    {
+      int i;
+      for (i = 0; i < level + 2; i++)
+	fputc (' ', dump_file);
+      fprintf (dump_file, "Else block %d\n", ce_info->else_bb->index);
     }
 
-  /* Find the conditional jump to the ELSE or JOIN part, and isolate
-     the test.  */
-  test_expr = cond_exec_get_condition (BB_END (test_bb));
-  if (! test_expr)
-    return FALSE;
+  if (dump_file)
+    {
+      int i;
+      bool not_really;
+      not_really = ce_info->parent && ce_info->parent->join_bb == ce_info->join_bb;
+      for (i = 0; i < level; i++)
+	fputc (' ', dump_file);
+      fprintf (dump_file, "%sJoin block %d%s\n", not_really ? "[ " : "",
+	       ce_info->join_bb->index, not_really ? " ]" : "");
+    }
+}
 
-  /* If the conditional jump is more than just a conditional jump,
-     then we can not do conditional execution conversion on this block.  */
-  if (! onlyjump_p (BB_END (test_bb)))
-    return FALSE;
+/* Merge the blocks controlled by the if-block in CE_INFO, and mark
+   for local life update.  */
 
-  /* Collect the bounds of where we're to search, skipping any labels, jumps
-     and notes at the beginning and end of the block.  Then count the total
-     number of insns and see if it is small enough to convert.  */
-  then_start = first_active_insn (then_bb);
-  then_end = last_active_insn (then_bb, TRUE);
-  then_n_insns = ce_info->num_then_insns = count_bb_insns (then_bb);
-  n_insns = then_n_insns;
-  max = MAX_CONDITIONAL_EXECUTE;
+static void
+merge_if_block (ce_if_block_t *root, ce_if_block_t *ce_info)
+{
+  basic_block test_bb = ce_info->test_bb;	/* last test block */
+  basic_block join_bb = ce_info->join_bb;	/* join block */
+  basic_block combo_bb, bb;
+  edge e;
+  bool join_adjacent;
+  int i;
+  int off = ce_info->blocks_offset;
+  int len = ce_info->blocks_length;
 
-  if (else_bb)
-    {
-      int n_matching;
+  /* It can be significantly cheaper to free and recompute this information
+     than having merge_blocks try to update it.  */
+  free_dominance_info (CDI_DOMINATORS);
+  free_dominance_info (CDI_POST_DOMINATORS);
 
-      max *= 2;
-      else_start = first_active_insn (else_bb);
-      else_end = last_active_insn (else_bb, TRUE);
-      else_n_insns = ce_info->num_else_insns = count_bb_insns (else_bb);
-      n_insns += else_n_insns;
+  /* All block merging is done into the lower block numbers.  */
+  combo_bb = test_bb;
+  df_set_bb_dirty (test_bb);
 
-      /* Look for matching sequences at the head and tail of the two blocks,
-	 and limit the range of insns to be converted if possible.  */
-      n_matching = flow_find_cross_jump (then_bb, else_bb,
-					 &then_first_tail, &else_first_tail,
-					 NULL);
-      if (then_first_tail == BB_HEAD (then_bb))
-	then_start = then_end = NULL_RTX;
-      if (else_first_tail == BB_HEAD (else_bb))
-	else_start = else_end = NULL_RTX;
+  bb = VEC_index (basic_block, root->targets, off + len - 1);
+  e = find_edge (bb, join_bb);
+  /* discover_if_structure ensures that in the special case of a then
+     block without a successor, the join block is adjacent to the then
+     block.  */
+  join_adjacent = e == NULL || (e->flags & EDGE_FALLTHRU) != 0;
 
-      if (n_matching > 0)
-	{
-	  if (then_end)
-	    then_end = find_active_insn_before (then_bb, then_first_tail);
-	  if (else_end)
-	    else_end = find_active_insn_before (else_bb, else_first_tail);
-	  n_insns -= 2 * n_matching;
-	}
+  for (i = off + 1; i < off + len; i++)
+    {
+      bb = VEC_index (basic_block, root->targets, i);
+      merge_blocks (combo_bb, bb);
+      num_true_changes++;
+    }
 
-      if (then_start && else_start)
-	{
-	  int longest_match = MIN (then_n_insns - n_matching,
-				   else_n_insns - n_matching);
-	  n_matching
-	    = flow_find_head_matching_sequence (then_bb, else_bb,
-						&then_last_head,
-						&else_last_head,
-						longest_match);
+  /* The JOIN block may have had quite a number of other predecessors too.
+     Since we've already merged the TEST, THEN and ELSE blocks, we should
+     have only one remaining edge from our if-then-else diamond.  If there
+     is more than one remaining edge, it must come from elsewhere.  There
+     may be zero incoming edges if the THEN block didn't actually join
+     back up (as with a call to a non-return function).  */
+  if (join_bb != EXIT_BLOCK_PTR && join_adjacent
+      && EDGE_COUNT (join_bb->preds) < 2)
+    {
+      /* We can merge the JOIN cleanly and update the dataflow try
+	 again on this pass.*/
+      merge_blocks (combo_bb, join_bb);
+      num_true_changes++;
+    }
+  else if (join_adjacent)
+    {
+      /* We cannot merge the JOIN.  */
 
-	  if (n_matching > 0)
-	    {
-	      rtx insn;
+      /* The outgoing edge for the current COMBO block should already
+	 be correct.  Verify this.  */
+      gcc_assert (single_succ_p (combo_bb)
+		  && single_succ (combo_bb) == join_bb);
 
-	      /* We won't pass the insns in the head sequence to
-		 cond_exec_process_insns, so we need to test them here
-		 to make sure that they don't clobber the condition.  */
-	      for (insn = BB_HEAD (then_bb);
-		   insn != NEXT_INSN (then_last_head);
-		   insn = NEXT_INSN (insn))
-		if (!LABEL_P (insn) && !NOTE_P (insn)
-		    && !DEBUG_INSN_P (insn)
-		    && modified_in_p (test_expr, insn))
-		  return FALSE;
-	    }
+      /* Remove the jump and cruft from the end of the COMBO block.  */
+      if (join_bb != EXIT_BLOCK_PTR)
+	tidy_fallthru_edge (single_succ_edge (combo_bb));
+    }
+  else
+    {
+      rtx last = BB_END (combo_bb);
 
-	  if (then_last_head == then_end)
-	    then_start = then_end = NULL_RTX;
-	  if (else_last_head == else_end)
-	    else_start = else_end = NULL_RTX;
+      /* The outgoing edge for the current COMBO block should already
+	 be correct.  Verify this.  */
+      if (EDGE_COUNT (combo_bb->succs) == 0)
+	gcc_assert (find_reg_note (last, REG_NORETURN, NULL)
+		    || (NONJUMP_INSN_P (last)
+			&& GET_CODE (PATTERN (last)) == TRAP_IF
+			&& (TRAP_CONDITION (PATTERN (last))
+			    == const_true_rtx)));
 
-	  if (n_matching > 0)
-	    {
-	      if (then_start)
-		then_start = find_active_insn_after (then_bb, then_last_head);
-	      if (else_start)
-		else_start = find_active_insn_after (else_bb, else_last_head);
-	      n_insns -= 2 * n_matching;
-	    }
-	}
+      else
+      /* There should still be something at the end of the THEN or ELSE
+         blocks taking us to our final destination.  */
+	gcc_assert (JUMP_P (last)
+		    || (EDGE_SUCC (combo_bb, 0)->dest == EXIT_BLOCK_PTR
+			&& CALL_P (last)
+			&& SIBLING_CALL_P (last))
+		    || ((EDGE_SUCC (combo_bb, 0)->flags & EDGE_EH)
+			&& can_throw_internal (last)));
     }
 
-  if (n_insns > max)
-    return FALSE;
+  calculate_dominance_info (CDI_DOMINATORS);
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+  num_updated_if_blocks++;
+}
 
-  /* Map test_expr/test_jump into the appropriate MD tests to use on
-     the conditionally executed code.  */
+/* After if-converting the if-block tree starting at CE_INFO, remove duplicate
+   versions of matching sequences.  We use the head of the then block and
+   the tail of the else block, and remove the two others.  BLKINFO holds
+   the auxiliary information about the basic blocks.  ROOT is the root of the
+   tree.  We recurse into the sub-ifblocks of CE_INFO.  */
 
-  true_expr = test_expr;
+static void
+delete_unnecessary_matching_code (ce_if_block_t *root, ce_if_block_t *ce_info,
+				  ce_blocks_info_t *blkinfo)
+{
+  ce_blocks_info_t *then_bb_info, *else_bb_info;
 
-  false_code = reversed_comparison_code (true_expr, BB_END (test_bb));
-  if (false_code != UNKNOWN)
-    false_expr = gen_rtx_fmt_ee (false_code, GET_MODE (true_expr),
-				 XEXP (true_expr, 0), XEXP (true_expr, 1));
-  else
-    false_expr = NULL_RTX;
+  if (ce_info->then_br.ifblock)
+    delete_unnecessary_matching_code (root, ce_info->then_br.ifblock, blkinfo);
+  if (ce_info->else_br.ifblock)
+    delete_unnecessary_matching_code (root, ce_info->else_br.ifblock, blkinfo);
 
-#ifdef IFCVT_MODIFY_TESTS
-  /* If the machine description needs to modify the tests, such as setting a
-     conditional execution register from a comparison, it can do so here.  */
-  IFCVT_MODIFY_TESTS (ce_info, true_expr, false_expr);
+  if (ce_info->then_br.ifblock != NULL || ce_info->else_br.ifblock != NULL
+      || ce_info->then_bb == NULL || ce_info->else_bb == NULL)
+    return;
 
-  /* See if the conversion failed.  */
-  if (!true_expr || !false_expr)
-    goto fail;
-#endif
+  then_bb_info = blkinfo + find_block_in_vec (root->targets,
+					      ce_info->then_bb);
+  else_bb_info = blkinfo + find_block_in_vec (root->targets,
+					      ce_info->else_bb);
 
-  true_prob_val = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX);
-  if (true_prob_val)
+  gcc_assert (then_bb_info->controlling == else_bb_info->controlling);
+  if (then_bb_info->controlling == NULL
+      || !then_bb_info->controlling->will_convert)
+    return;
+  
+  if (else_bb_info->last_head)
     {
-      true_prob_val = XEXP (true_prob_val, 0);
-      false_prob_val = GEN_INT (REG_BR_PROB_BASE - INTVAL (true_prob_val));
+      if (dump_file)
+	fprintf (dump_file,
+		 "    found an else branch with matching head, deleting %d to %d.\n",
+		 INSN_UID (else_bb_info->all_start),
+		 INSN_UID (else_bb_info->last_head));
+	  delete_insn_chain (else_bb_info->all_start, else_bb_info->last_head,
+			     false);
+	}
+  if (then_bb_info->first_tail)
+    {
+      rtx from = then_bb_info->first_tail;
+      if (dump_file)
+	fprintf (dump_file,
+		 "    found a then branch with matching tail, deleting %d to %d.\n",
+		 INSN_UID (from),
+		 INSN_UID (then_bb_info->all_end));
+      delete_insn_chain (from, then_bb_info->all_end, false);
     }
-  else
-    false_prob_val = NULL_RTX;
+}
 
-  /* If we have && or || tests, do them here.  These tests are in the adjacent
-     blocks after the first block containing the test.  */
-  if (ce_info->num_multiple_test_blocks > 0)
-    {
-      basic_block bb = test_bb;
-      basic_block last_test_bb = ce_info->last_test_bb;
+/* Examine CE_INFO to see whether it can be converted and do so if possible.
+   If not, recursively examine its subblocks.  BLKINFO holds the auxiliary
+   information about the basic blocks in the tree; its root is ROOT. */
 
-      if (! false_expr)
-	goto fail;
+static bool
+cond_exec_convert (ce_if_block_t *root, ce_if_block_t *ce_info,
+		   ce_blocks_info_t *blkinfo)
+{
+  int i;
+  int off = ce_info->blocks_offset;
+  int len = ce_info->blocks_length;
 
-      do
-	{
-	  rtx start, end;
-	  rtx t, f;
-	  enum rtx_code f_code;
+  if (ce_info->unaccounted != 0 || !ce_info->will_convert)
+    {
+      bool any_success = false;
+      if (ce_info->then_br.ifblock)
+	any_success |= cond_exec_convert (root, ce_info->then_br.ifblock, blkinfo);
+      if (ce_info->else_br.ifblock)
+	any_success |= cond_exec_convert (root, ce_info->else_br.ifblock, blkinfo);
+      if (ce_info->join_ifblock)
+	any_success |= cond_exec_convert (root, ce_info->join_ifblock, blkinfo);
+      return any_success;
+    }
 
-	  bb = block_fallthru (bb);
-	  start = first_active_insn (bb);
-	  end = last_active_insn (bb, TRUE);
-	  if (start
-	      && ! cond_exec_process_insns (ce_info, start, end, false_expr,
-					    false_prob_val, FALSE))
-	    goto fail;
+  if (dump_file)
+    fprintf (dump_file, "Converting if-block with head %d\n",
+	     ce_info->test_bb->index);
+  for (i = off + 1; i < off + len; i++)
+    {
+      basic_block bb = VEC_index (basic_block, root->targets, i);
+      bool may_clobber_p;
+      ce_if_block_t *controller, *upper;
+      bool upper_true_p = false;
+      struct ce_if_branch *branch;
 
-	  /* If the conditional jump is more than just a conditional jump, then
-	     we can not do conditional execution conversion on this block.  */
-	  if (! onlyjump_p (BB_END (bb)))
-	    goto fail;
+      gcc_assert (bb != ce_info->test_bb && bb != ce_info->join_bb);
 
-	  /* Find the conditional jump and isolate the test.  */
-	  t = cond_exec_get_condition (BB_END (bb));
-	  if (! t)
-	    goto fail;
+      controller = blkinfo[i].controlling;
+      upper = controller->parent;
+      if (upper)
+	upper_true_p = upper->else_br.ifblock == controller;
+      branch = (blkinfo[i].controlling_true_p ? &controller->else_br
+		: &controller->then_br);
 
-	  f_code = reversed_comparison_code (t, BB_END (bb));
-	  if (f_code == UNKNOWN)
-	    goto fail;
+      if (dump_file)
+	fprintf (dump_file, " ... converting bb %d\n", bb->index);
 
-	  f = gen_rtx_fmt_ee (f_code, GET_MODE (t), XEXP (t, 0), XEXP (t, 1));
-	  if (ce_info->and_and_p)
+      if (bb == controller->then_bb && blkinfo[i].last_head)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "    is a then branch with matching head%s.\n",
+		     upper && upper->will_convert ? " (to be converted)" : "");
+	  if (upper && upper->will_convert)
 	    {
-	      t = gen_rtx_AND (GET_MODE (t), true_expr, t);
-	      f = gen_rtx_IOR (GET_MODE (t), false_expr, f);
+	      struct ce_if_branch *upper_br = (upper_true_p
+					       ? &upper->else_br
+					       : &upper->then_br);
+	      gcc_assert (upper->reuser != controller);
+	      may_clobber_p = i == cond_exec_last_cond_user (upper, false);
+	      if (!cond_exec_process_insns (ce_info, blkinfo[i].all_start,
+					    blkinfo[i].last_head,
+					    upper_br->cond, branch->prob_val,
+					    may_clobber_p))
+		goto fail;
 	    }
-	  else
+	}
+      if (bb == controller->else_bb && blkinfo[i].first_tail)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "    is an else branch with matching tail%s.\n",
+		     upper && upper->will_convert ? " (to be converted)" : "");
+	  if (upper && upper->will_convert)
 	    {
-	      t = gen_rtx_IOR (GET_MODE (t), true_expr, t);
-	      f = gen_rtx_AND (GET_MODE (t), false_expr, f);
+	      struct ce_if_branch *upper_br = (upper_true_p
+					       ? &upper->else_br
+					       : &upper->then_br);
+	      gcc_assert (upper->reuser != controller);
+	      may_clobber_p = i == cond_exec_last_cond_user (upper, false);
+	      if (!cond_exec_process_insns (ce_info, blkinfo[i].first_tail,
+					    blkinfo[i].all_end,
+					    upper_br->cond, branch->prob_val,
+					    may_clobber_p))
+		goto fail;
 	    }
-
-	  /* If the machine description needs to modify the tests, such as
-	     setting a conditional execution register from a comparison, it can
-	     do so here.  */
-#ifdef IFCVT_MODIFY_MULTIPLE_TESTS
-	  IFCVT_MODIFY_MULTIPLE_TESTS (ce_info, bb, t, f);
-
-	  /* See if the conversion failed.  */
-	  if (!t || !f)
-	    goto fail;
-#endif
-
-	  true_expr = t;
-	  false_expr = f;
 	}
-      while (bb != last_test_bb);
-    }
-
-  /* For IF-THEN-ELSE blocks, we don't allow modifications of the test
-     on then THEN block.  */
-  then_mod_ok = (else_bb == NULL_BLOCK);
 
-  /* Go through the THEN and ELSE blocks converting the insns if possible
-     to conditional execution.  */
+      if (blkinfo[i].start == NULL_RTX || blkinfo[i].end == NULL_RTX)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, " ... bb %d is empty\n", bb->index);
+	  continue;
+	}
 
-  if (then_end
-      && (! false_expr
-	  || ! cond_exec_process_insns (ce_info, then_start, then_end,
-					false_expr, false_prob_val,
-					then_mod_ok)))
-    goto fail;
+      gcc_assert (controller->will_convert);
+      may_clobber_p = i == cond_exec_last_cond_user (controller, false);
+      if (!cond_exec_process_insns (ce_info, blkinfo[i].start, blkinfo[i].end,
+				    branch->cond, branch->prob_val,
+				    may_clobber_p))
+	goto fail;
+    }
 
-  if (else_bb && else_end
-      && ! cond_exec_process_insns (ce_info, else_start, else_end,
-				    true_expr, true_prob_val, TRUE))
+  if (!cond_exec_make_extra_insns (ce_info, true))
     goto fail;
 
   /* If we cannot apply the changes, fail.  Do not go through the normal fail
@@ -667,9 +1798,11 @@  cond_exec_process_if_block (ce_if_block_
       /* Cancel any machine dependent changes.  */
       IFCVT_MODIFY_CANCEL (ce_info);
 #endif
-      return FALSE;
+      cond_exec_discard_added_insns ();
+      return false;
     }
 
+  cond_exec_finalize_added_insns ();
 #ifdef IFCVT_MODIFY_FINAL
   /* Do any machine dependent final modifications.  */
   IFCVT_MODIFY_FINAL (ce_info);
@@ -677,36 +1810,86 @@  cond_exec_process_if_block (ce_if_block_
 
   /* Conversion succeeded.  */
   if (dump_file)
-    fprintf (dump_file, "%d insn%s converted to conditional execution.\n",
-	     n_insns, (n_insns == 1) ? " was" : "s were");
+    fprintf (dump_file, "  Converted to conditional execution.\n");
 
-  /* Merge the blocks!  If we had matching sequences, make sure to delete one
-     copy at the appropriate location first: delete the copy in the THEN branch
-     for a tail sequence so that the remaining one is executed last for both
-     branches, and delete the copy in the ELSE branch for a head sequence so
-     that the remaining one is executed first for both branches.  */
-  if (then_first_tail)
-    {
-      rtx from = then_first_tail;
-      if (!INSN_P (from))
-	from = find_active_insn_after (then_bb, from);
-      delete_insn_chain (from, BB_END (then_bb), false);
-    }
-  if (else_last_head)
-    delete_insn_chain (first_active_insn (else_bb), else_last_head, false);
+  delete_unnecessary_matching_code (root, ce_info, blkinfo);
 
-  merge_if_block (ce_info);
+  merge_if_block (root, ce_info);
   cond_exec_changed_p = TRUE;
-  return TRUE;
+
+  return true;
 
  fail:
+  if (dump_file)
+    fprintf (dump_file, "  failed to convert\n");
 #ifdef IFCVT_MODIFY_CANCEL
   /* Cancel any machine dependent changes.  */
   IFCVT_MODIFY_CANCEL (ce_info);
 #endif
 
+  cond_exec_discard_added_insns ();
   cancel_changes (0);
-  return FALSE;
+  return false;
+}
+
+/* Given an IF-THEN or IF-THEN-ELSE block with possibly nested
+   sub-blocks in CE_ROOT, attempt to convert as much of the tree as
+   possible to conditional execution.  Return TRUE if we were
+   successful at converting the block.  */
+
+static int
+cond_exec_process_if_block (ce_if_block_t *ce_root)
+{
+  bool retval = false;
+  basic_block bb, prev_bb = NULL;
+  ce_blocks_info_t *blkinfo;
+  int n_blocks;
+  unsigned i;
+  HARD_REG_SET set_anywhere;
+
+  /* Verify that all blocks are adjacent.  This isn't really a
+     requirement, but only because rtl_move_block is unimplemented.
+     Later.  */
+  FOR_EACH_VEC_ELT (basic_block, ce_root->targets, i, bb)
+    {
+      if (prev_bb && prev_bb->next_bb != bb)
+	return false;
+
+      prev_bb = bb;
+    }
+  cond_exec_finalize_if_tree (ce_root, ce_root, NULL, 0, "root");
+  if (dump_file)
+    {
+      fprintf (dump_file, "flattened:");
+      FOR_EACH_VEC_ELT (basic_block, ce_root->targets, i, bb)
+	{
+	  fprintf (dump_file, " %d", bb->index);
+	}
+      fprintf (dump_file, "\n");
+    }
+  n_blocks = VEC_length (basic_block, ce_root->targets);
+  blkinfo = XCNEWVEC (ce_blocks_info_t, n_blocks);
+  cond_exec_assign_conditions (ce_root, ce_root, blkinfo);
+  cond_exec_compute_blocks_info (ce_root, blkinfo);
+  CLEAR_HARD_REG_SET (set_anywhere);
+  for (i = 0; i < VEC_length (basic_block, ce_root->targets); i++)
+    {
+      IOR_HARD_REG_SET (set_anywhere, blkinfo[i].set_in_block);
+      IOR_HARD_REG_SET (set_anywhere, blkinfo[i].set_last_insn);
+    }
+  cond_exec_assign_registers (ce_root, ce_root, blkinfo, set_anywhere);
+  if (!cond_exec_limit_convert (ce_root, false))
+    goto fail;
+  cond_exec_modify_conditions (ce_root, ce_root, blkinfo);
+  cond_exec_limit_costs (ce_root);
+  if (!cond_exec_limit_convert (ce_root, false))
+    goto fail;
+
+  retval = cond_exec_convert (ce_root, ce_root, blkinfo);
+
+ fail:
+  free (blkinfo);
+  return retval;
 }
 
 /* Used by noce_process_if_block to communicate with its subroutines.
@@ -3082,151 +4265,36 @@  noce_find_if_block (basic_block test_bb,
   return FALSE;
 }
 
+/* Determine if TEST_BB could be the test block of an if statement.  If so,
+   fill in the minimum amount of information known about it (such as then and else
+   blocks) in CE_INFO, and return true.  */
 
-/* Merge the blocks and mark for local life update.  */
-
-static void
-merge_if_block (struct ce_if_block * ce_info)
-{
-  basic_block test_bb = ce_info->test_bb;	/* last test block */
-  basic_block then_bb = ce_info->then_bb;	/* THEN */
-  basic_block else_bb = ce_info->else_bb;	/* ELSE or NULL */
-  basic_block join_bb = ce_info->join_bb;	/* join block */
-  basic_block combo_bb;
-
-  /* All block merging is done into the lower block numbers.  */
-
-  combo_bb = test_bb;
-  df_set_bb_dirty (test_bb);
-
-  /* Merge any basic blocks to handle && and || subtests.  Each of
-     the blocks are on the fallthru path from the predecessor block.  */
-  if (ce_info->num_multiple_test_blocks > 0)
-    {
-      basic_block bb = test_bb;
-      basic_block last_test_bb = ce_info->last_test_bb;
-      basic_block fallthru = block_fallthru (bb);
-
-      do
-	{
-	  bb = fallthru;
-	  fallthru = block_fallthru (bb);
-	  merge_blocks (combo_bb, bb);
-	  num_true_changes++;
-	}
-      while (bb != last_test_bb);
-    }
-
-  /* Merge TEST block into THEN block.  Normally the THEN block won't have a
-     label, but it might if there were || tests.  That label's count should be
-     zero, and it normally should be removed.  */
-
-  if (then_bb)
-    {
-      merge_blocks (combo_bb, then_bb);
-      num_true_changes++;
-    }
-
-  /* The ELSE block, if it existed, had a label.  That label count
-     will almost always be zero, but odd things can happen when labels
-     get their addresses taken.  */
-  if (else_bb)
-    {
-      merge_blocks (combo_bb, else_bb);
-      num_true_changes++;
-    }
-
-  /* If there was no join block reported, that means it was not adjacent
-     to the others, and so we cannot merge them.  */
-
-  if (! join_bb)
-    {
-      rtx last = BB_END (combo_bb);
-
-      /* The outgoing edge for the current COMBO block should already
-	 be correct.  Verify this.  */
-      if (EDGE_COUNT (combo_bb->succs) == 0)
-	gcc_assert (find_reg_note (last, REG_NORETURN, NULL)
-		    || (NONJUMP_INSN_P (last)
-			&& GET_CODE (PATTERN (last)) == TRAP_IF
-			&& (TRAP_CONDITION (PATTERN (last))
-			    == const_true_rtx)));
-
-      else
-      /* There should still be something at the end of the THEN or ELSE
-         blocks taking us to our final destination.  */
-	gcc_assert (JUMP_P (last)
-		    || (EDGE_SUCC (combo_bb, 0)->dest == EXIT_BLOCK_PTR
-			&& CALL_P (last)
-			&& SIBLING_CALL_P (last))
-		    || ((EDGE_SUCC (combo_bb, 0)->flags & EDGE_EH)
-			&& can_throw_internal (last)));
-    }
-
-  /* The JOIN block may have had quite a number of other predecessors too.
-     Since we've already merged the TEST, THEN and ELSE blocks, we should
-     have only one remaining edge from our if-then-else diamond.  If there
-     is more than one remaining edge, it must come from elsewhere.  There
-     may be zero incoming edges if the THEN block didn't actually join
-     back up (as with a call to a non-return function).  */
-  else if (EDGE_COUNT (join_bb->preds) < 2
-	   && join_bb != EXIT_BLOCK_PTR)
-    {
-      /* We can merge the JOIN cleanly and update the dataflow try
-	 again on this pass.*/
-      merge_blocks (combo_bb, join_bb);
-      num_true_changes++;
-    }
-  else
-    {
-      /* We cannot merge the JOIN.  */
-
-      /* The outgoing edge for the current COMBO block should already
-	 be correct.  Verify this.  */
-      gcc_assert (single_succ_p (combo_bb)
-		  && single_succ (combo_bb) == join_bb);
-
-      /* Remove the jump and cruft from the end of the COMBO block.  */
-      if (join_bb != EXIT_BLOCK_PTR)
-	tidy_fallthru_edge (single_succ_edge (combo_bb));
-    }
-
-  num_updated_if_blocks++;
-}
-
-/* Find a block ending in a simple IF condition and try to transform it
-   in some way.  When converting a multi-block condition, put the new code
-   in the first such block and delete the rest.  Return a pointer to this
-   first block if some transformation was done.  Return NULL otherwise.  */
-
-static basic_block
-find_if_header (basic_block test_bb, int pass)
+static bool
+discover_if_header (basic_block test_bb, ce_if_block_t *ce_info)
 {
-  ce_if_block_t ce_info;
-  edge then_edge;
-  edge else_edge;
+  edge then_edge, else_edge;
 
   /* The kind of block we're looking for has exactly two successors.  */
   if (EDGE_COUNT (test_bb->succs) != 2)
-    return NULL;
+    return false;
 
   then_edge = EDGE_SUCC (test_bb, 0);
   else_edge = EDGE_SUCC (test_bb, 1);
 
   if (df_get_bb_dirty (then_edge->dest))
-    return NULL;
+    return false;
   if (df_get_bb_dirty (else_edge->dest))
-    return NULL;
+    return false;
 
   /* Neither edge should be abnormal.  */
   if ((then_edge->flags & EDGE_COMPLEX)
       || (else_edge->flags & EDGE_COMPLEX))
-    return NULL;
+    return false;
 
   /* Nor exit the loop.  */
   if ((then_edge->flags & EDGE_LOOP_EXIT)
       || (else_edge->flags & EDGE_LOOP_EXIT))
-    return NULL;
+    return false;
 
   /* The THEN edge is canonically the one that falls through.  */
   if (then_edge->flags & EDGE_FALLTHRU)
@@ -3239,23 +4307,42 @@  find_if_header (basic_block test_bb, int
     }
   else
     /* Otherwise this must be a multiway branch of some sort.  */
-    return NULL;
+    return false;
 
-  memset (&ce_info, 0, sizeof (ce_info));
-  ce_info.test_bb = test_bb;
-  ce_info.then_bb = then_edge->dest;
-  ce_info.else_bb = else_edge->dest;
-  ce_info.pass = pass;
+  memset (ce_info, 0, sizeof *ce_info);
+  ce_info->test_bb = test_bb;
+  ce_info->then_bb = then_edge->dest;
+  ce_info->else_bb = else_edge->dest;
 
 #ifdef IFCVT_INIT_EXTRA_FIELDS
-  IFCVT_INIT_EXTRA_FIELDS (&ce_info);
+  IFCVT_INIT_EXTRA_FIELDS (ce_info);
 #endif
+  return true;
+}
+
+/* Find a block ending in a simple IF condition and try to transform it
+   in some way.  When converting a multi-block condition, put the new code
+   in the first such block and delete the rest.  Return a pointer to this
+   first block if some transformation was done.  Return NULL otherwise.  */
+
+static basic_block
+find_if_header (basic_block test_bb, int pass)
+{
+  edge then_edge, else_edge;
+  ce_if_block_t ce_info;
+
+  if (!discover_if_header (test_bb, &ce_info))
+    return NULL;
+
+  then_edge = find_edge (test_bb, ce_info.then_bb);
+  else_edge = find_edge (test_bb, ce_info.else_bb);
 
   if (!reload_completed
       && noce_find_if_block (test_bb, then_edge, else_edge, pass))
     goto success;
 
   if (reload_completed
+      && dom_info_state (CDI_POST_DOMINATORS) > DOM_NO_FAST_QUERY
       && targetm.have_conditional_execution ()
       && cond_exec_find_if_block (&ce_info))
     goto success;
@@ -3284,308 +4371,433 @@  find_if_header (basic_block test_bb, int
   return ce_info.test_bb;
 }
 
-/* Return true if a block has two edges, one of which falls through to the next
-   block, and the other jumps to a specific block, so that we can tell if the
-   block is part of an && test or an || test.  Returns either -1 or the number
-   of non-note, non-jump, non-USE/CLOBBER insns in the block.  */
+/* Look for identical sequences in the then and else blocks.  Called only
+   for cases where CE_INFO describes a simple if, without nested sub-if
+   structures.  */
 
-static int
-block_jumps_and_fallthru_p (basic_block cur_bb, basic_block target_bb)
+static void
+ce_discover_matches (ce_if_block_t *ce_info)
 {
-  edge cur_edge;
-  int fallthru_p = FALSE;
-  int jump_p = FALSE;
-  rtx insn;
-  rtx end;
-  int n_insns = 0;
-  edge_iterator ei;
-
-  if (!cur_bb || !target_bb)
-    return -1;
-
-  /* If no edges, obviously it doesn't jump or fallthru.  */
-  if (EDGE_COUNT (cur_bb->succs) == 0)
-    return FALSE;
-
-  FOR_EACH_EDGE (cur_edge, ei, cur_bb->succs)
-    {
-      if (cur_edge->flags & EDGE_COMPLEX)
-	/* Anything complex isn't what we want.  */
-	return -1;
-
-      else if (cur_edge->flags & EDGE_FALLTHRU)
-	fallthru_p = TRUE;
+  basic_block then_bb = ce_info->then_bb;
+  basic_block else_bb = ce_info->else_bb;  
+  int n_matching, longest_match;
 
-      else if (cur_edge->dest == target_bb)
-	jump_p = TRUE;
+  if (ce_info->then_br.ifblock != NULL || ce_info->else_br.ifblock != NULL)
+    /* We cannot do this if either block is a nested if - we might end up
+       discovering the same match twice if a sub-ifblock also jumps to one
+       of our destinations.  */
+    return;
 
-      else
-	return -1;
-    }
+  if (EDGE_COUNT (then_bb->preds) != 1 || EDGE_COUNT (else_bb->preds) != 1)
+    return;
 
-  if ((jump_p & fallthru_p) == 0)
-    return -1;
+  n_matching = flow_find_cross_jump (then_bb, else_bb,
+				     &ce_info->then_br.first_tail,
+				     &ce_info->else_br.first_tail, NULL);
 
-  /* Don't allow calls in the block, since this is used to group && and ||
-     together for conditional execution support.  ??? we should support
-     conditional execution support across calls for IA-64 some day, but
-     for now it makes the code simpler.  */
-  end = BB_END (cur_bb);
-  insn = BB_HEAD (cur_bb);
+  if (n_matching > 0 && dump_file)
+    fprintf (dump_file, "discovered tail matches, %d insns\n", n_matching);
+  ce_info->then_br.num_insns -= n_matching;
+  ce_info->else_br.num_insns -= n_matching;
+  ce_info->num_unconverted += n_matching;
 
-  while (insn != NULL_RTX)
+  longest_match = MIN (ce_info->then_br.num_insns, ce_info->else_br.num_insns);
+  if (longest_match > 0)
     {
-      if (CALL_P (insn))
-	return -1;
-
-      if (INSN_P (insn)
-	  && !JUMP_P (insn)
-	  && !DEBUG_INSN_P (insn)
-	  && GET_CODE (PATTERN (insn)) != USE
-	  && GET_CODE (PATTERN (insn)) != CLOBBER)
-	n_insns++;
-
-      if (insn == end)
-	break;
-
-      insn = NEXT_INSN (insn);
+      n_matching
+	= flow_find_head_matching_sequence (then_bb, else_bb,
+					    &ce_info->then_br.last_head,
+					    &ce_info->else_br.last_head,
+					    longest_match);
+      if (n_matching > 0 && dump_file)
+	fprintf (dump_file, "discovered head matches, %d insns\n", n_matching);
+      ce_info->then_br.num_insns -= n_matching;
+      ce_info->else_br.num_insns -= n_matching;
+      ce_info->num_unconverted += n_matching;
     }
+}
 
-  return n_insns;
+/* Determine if BB is suitable as a then or else branch in an if-block that
+   will be predicated.  */
+static bool
+suitable_for_if_branch (basic_block bb)
+{
+  if (epilogue_completed && tablejump_p (BB_END (bb), NULL, NULL))
+    return false;
+  return true;
 }
 
-/* Determine if a given basic block heads a simple IF-THEN or IF-THEN-ELSE
-   block.  If so, we'll try to convert the insns to not require the branch.
-   Return TRUE if we were successful at converting the block.  */
+/* After finding an if header CE_INFO with discover_if_header, this function
+   can be used to recursively examine the then, else, and join blocks to see
+   if they are themselves if blocks.  OUTER_JOIN should be NULL for the
+   outermost if-block we're examining; on recursive calls it holds the join_bb
+   of the parent if-block.
+   Return true if we found a recognizable blocks structure.  */
 
-static int
-cond_exec_find_if_block (struct ce_if_block * ce_info)
+static bool
+ce_discover_if_structure (ce_if_block_t *ce_info, basic_block outer_join)
 {
   basic_block test_bb = ce_info->test_bb;
   basic_block then_bb = ce_info->then_bb;
   basic_block else_bb = ce_info->else_bb;
-  basic_block join_bb = NULL_BLOCK;
-  edge cur_edge;
-  basic_block next;
-  edge_iterator ei;
-
-  ce_info->last_test_bb = test_bb;
-
-  /* We only ever should get here after reload,
-     and if we have conditional execution.  */
-  gcc_assert (reload_completed && targetm.have_conditional_execution ());
+  basic_block join_bb, tmp_bb;
+  int i;
+  edge then_ss = single_succ_p (then_bb) ? single_succ_edge (then_bb) : NULL;
+  edge else_ss = single_succ_p (else_bb) ? single_succ_edge (else_bb) : NULL;
+  edge join_ss;
+  ce_if_block_t sub_ce_info;
+  ce_if_block_t *sub_info;
+  bool then_seen, else_seen;
+  rtx tmp;
+  int unaccounted;
 
-  /* Discover if any fall through predecessors of the current test basic block
-     were && tests (which jump to the else block) or || tests (which jump to
-     the then block).  */
-  if (single_pred_p (test_bb)
-      && single_pred_edge (test_bb)->flags == EDGE_FALLTHRU)
+  if (dump_file)
     {
-      basic_block bb = single_pred (test_bb);
-      basic_block target_bb;
-      int max_insns = MAX_CONDITIONAL_EXECUTE;
-      int n_insns;
+      fprintf (dump_file, "Examining structure for bb %d", test_bb->index);
+      if (outer_join)
+	fprintf (dump_file, ", outer join %d", outer_join->index);
+      fprintf (dump_file, "\n");
+    }
 
-      /* Determine if the preceding block is an && or || block.  */
-      if ((n_insns = block_jumps_and_fallthru_p (bb, else_bb)) >= 0)
-	{
-	  ce_info->and_and_p = TRUE;
-	  target_bb = else_bb;
-	}
-      else if ((n_insns = block_jumps_and_fallthru_p (bb, then_bb)) >= 0)
-	{
-	  ce_info->and_and_p = FALSE;
-	  target_bb = then_bb;
-	}
-      else
-	target_bb = NULL_BLOCK;
+  /* If the conditional jump is more than just a conditional jump,
+     then we can not do conditional execution conversion on this block.  */
+  if (!onlyjump_p (BB_END (test_bb)))
+    return false;
 
-      if (target_bb && n_insns <= max_insns)
-	{
-	  int total_insns = 0;
-	  int blocks = 0;
+  if (df_get_bb_dirty (test_bb))
+    return false;
 
-	  ce_info->last_test_bb = test_bb;
+  if (dominated_by_p (CDI_POST_DOMINATORS, then_bb, else_bb)
+      /* As a special case, allow a simple if-then where the then branch
+	 has no outgoing edge.  See also merge_if_block.  */
+      || (EDGE_COUNT (then_bb->succs) == 0 && then_bb->next_bb == else_bb
+	  && EDGE_COUNT (else_bb->preds) == 1))
+    {
+      join_bb = else_bb;
+      else_bb = NULL;
+      else_ss = NULL;
+    }
+  else if (dominated_by_p (CDI_POST_DOMINATORS, else_bb, then_bb))
+    {
+      join_bb = then_bb;
+      then_bb = NULL;
+      then_ss = NULL;
+    }
+  else
+    join_bb = nearest_common_dominator (CDI_POST_DOMINATORS, then_bb, else_bb);
 
-	  /* Found at least one && or || block, look for more.  */
-	  do
-	    {
-	      ce_info->test_bb = test_bb = bb;
-	      total_insns += n_insns;
-	      blocks++;
+  join_ss = single_succ_p (join_bb) ? single_succ_edge (join_bb) : NULL;
+  if ((then_bb && !suitable_for_if_branch (then_bb))
+      || (else_bb && !suitable_for_if_branch (else_bb))
+      || (then_ss && ((then_ss->flags & EDGE_COMPLEX) != 0
+		      || then_ss->dest != join_bb))
+      || (else_ss && ((else_ss->flags & EDGE_COMPLEX) != 0
+		      || else_ss->dest != join_bb))
+      || (outer_join && !dominated_by_p (CDI_POST_DOMINATORS, join_bb,
+					 outer_join))
+      || (outer_join && join_bb != outer_join
+	  && !dominated_by_p (CDI_DOMINATORS, join_bb, test_bb))
+      || (outer_join && join_bb != outer_join
+	  && join_ss && join_ss->dest != outer_join))
+    {
+      if (dump_file)
+	fprintf (dump_file, "failed (join %d)\n", join_bb->index);
 
-	      if (!single_pred_p (bb))
-		break;
+      return false;
+    }
 
-	      bb = single_pred (bb);
-	      n_insns = block_jumps_and_fallthru_p (bb, target_bb);
-	    }
-	  while (n_insns >= 0 && (total_insns + n_insns) <= max_insns);
+  /* Try to avoid building up extremely large if trees only to find that
+     they would be too expensive to convert.  The numbers are arbitrarily
+     chosen to ensure reasonable compile times for extreme test cases without
+     preventing useful optimizations.  */
+  if (n_basic_blocks > 100
+      /* Avoid the special case from above.  */
+      && (then_bb == NULL || EDGE_COUNT (then_bb->succs) > 0))
+    {
+      VEC (basic_block, heap) *dom_vec;
+      basic_block bb;
+      int count = 0;
+      dom_vec = get_dominated_to_depth (CDI_DOMINATORS, test_bb, 0);
+      FOR_EACH_VEC_ELT (basic_block, dom_vec, i, bb)
+	if (dominated_by_p (CDI_POST_DOMINATORS, bb, join_bb))
+	  count++;
+      VEC_free (basic_block, heap, dom_vec);
+      if (count > 40)
+	return false;
+    }
+  
+  ce_info->then_bb = then_bb;
+  ce_info->else_bb = else_bb;
+  ce_info->join_bb = join_bb;
 
-	  ce_info->num_multiple_test_blocks = blocks;
-	  ce_info->num_multiple_test_insns = total_insns;
+  /* Find the conditional jump to the ELSE or JOIN part, and isolate
+     the test.  */
+  ce_info->else_br.cond = cond_exec_get_condition (BB_END (test_bb));
+  if (!ce_info->else_br.cond
+      || !COMPARISON_P (ce_info->else_br.cond)
+      || !REG_P (XEXP (ce_info->else_br.cond, 0)))
+    return false;
+  ce_info->else_br.cond_code = GET_CODE (ce_info->else_br.cond);
+  ce_info->then_br.cond_code = reversed_comparison_code (ce_info->else_br.cond,
+							 BB_END (test_bb));
+  if (ce_info->then_br.cond_code == UNKNOWN)
+    return false;
+  ce_info->cond_regno = REGNO (XEXP (ce_info->else_br.cond, 0));
+  ce_info->cond_reg_mode = GET_MODE (XEXP (ce_info->else_br.cond, 0));
 
-	  if (ce_info->and_and_p)
-	    ce_info->num_and_and_blocks = blocks;
-	  else
-	    ce_info->num_or_or_blocks = blocks;
-	}
+  /* See if the last insn before the jump sets the condition register
+     in a way we can change.  */
+  ce_info->end = last_active_insn (test_bb, TRUE);
+  ce_info->set_reg = -1;
+  if (ce_info->end)
+    {
+      rtx set = single_set (ce_info->end);
+      if (set && REG_P (SET_DEST (set))
+	  && rtx_equal_p (SET_DEST (set), XEXP (ce_info->else_br.cond, 0)))
+	ce_info->set_reg = REGNO (SET_DEST (set));
     }
+  if (ce_info->set_reg != ce_info->cond_regno
+      || REGNO_REG_SET_P (df_get_live_out (test_bb), ce_info->cond_regno))
+    ce_info->set_reg = -1;
 
-  /* The THEN block of an IF-THEN combo must have exactly one predecessor,
-     other than any || blocks which jump to the THEN block.  */
-  if ((EDGE_COUNT (then_bb->preds) - ce_info->num_or_or_blocks) != 1)
-    return FALSE;
+  CLEAR_HARD_REG_SET (ce_info->live_at_end);
+  reg_set_to_hard_reg_set (&ce_info->live_at_end,
+			   df_get_live_out (test_bb));
 
-  /* The edges of the THEN and ELSE blocks cannot have complex edges.  */
-  FOR_EACH_EDGE (cur_edge, ei, then_bb->preds)
+  /* Now, look for nested if-blocks.  */
+  if (then_bb && EDGE_COUNT (then_bb->succs) > 1)
     {
-      if (cur_edge->flags & EDGE_COMPLEX)
-	return FALSE;
+      if (EDGE_COUNT (then_bb->preds) != 1
+	  || !discover_if_header (then_bb, &sub_ce_info)
+	  || !ce_discover_if_structure (&sub_ce_info, join_bb))
+	goto out_fail;
+      ce_info->then_br.ifblock = XNEW (struct ce_if_block);
+      *ce_info->then_br.ifblock = sub_ce_info;
     }
-
-  FOR_EACH_EDGE (cur_edge, ei, else_bb->preds)
+  else if (then_bb && EDGE_COUNT (then_bb->succs) != 0 && then_ss == NULL)
+    goto out_fail;
+  if (else_bb && EDGE_COUNT (else_bb->succs) > 1)
     {
-      if (cur_edge->flags & EDGE_COMPLEX)
-	return FALSE;
+      if (EDGE_COUNT (else_bb->preds) != 1
+	  || !discover_if_header (else_bb, &sub_ce_info)
+	  || !ce_discover_if_structure (&sub_ce_info, join_bb))
+	goto out_fail;
+      ce_info->else_br.ifblock = XNEW (struct ce_if_block);
+      *ce_info->else_br.ifblock = sub_ce_info;
     }
-
-  /* The THEN block of an IF-THEN combo must have zero or one successors.  */
-  if (EDGE_COUNT (then_bb->succs) > 0
-      && (!single_succ_p (then_bb)
-          || (single_succ_edge (then_bb)->flags & EDGE_COMPLEX)
-	  || (epilogue_completed
-	      && tablejump_p (BB_END (then_bb), NULL, NULL))))
-    return FALSE;
-
-  /* If the THEN block has no successors, conditional execution can still
-     make a conditional call.  Don't do this unless the ELSE block has
-     only one incoming edge -- the CFG manipulation is too ugly otherwise.
-     Check for the last insn of the THEN block being an indirect jump, which
-     is listed as not having any successors, but confuses the rest of the CE
-     code processing.  ??? we should fix this in the future.  */
-  if (EDGE_COUNT (then_bb->succs) == 0)
+  else if (else_bb && EDGE_COUNT (else_bb->succs) != 0 && else_ss == NULL)
+    goto out_fail;
+  if (join_bb && outer_join && join_bb != outer_join && !join_ss)
     {
-      if (single_pred_p (else_bb))
-	{
-	  rtx last_insn = BB_END (then_bb);
+      if (!discover_if_header (join_bb, &sub_ce_info)
+	  || !ce_discover_if_structure (&sub_ce_info, outer_join)
+	  || sub_ce_info.unaccounted != 0)
+	goto out_fail;
+      ce_info->join_ifblock = XNEW (struct ce_if_block);
+      *ce_info->join_ifblock = sub_ce_info;
+    }
 
-	  while (last_insn
-		 && NOTE_P (last_insn)
-		 && last_insn != BB_HEAD (then_bb))
-	    last_insn = PREV_INSN (last_insn);
+  /* Now, see if the condition is reusable in one of the sub-tests.  This can
+     only occur if there's exactly one sub-test.  */
+  sub_info = ce_info->then_br.ifblock;
+  if (sub_info == NULL)
+    sub_info = ce_info->else_br.ifblock;
 
-	  if (last_insn
-	      && JUMP_P (last_insn)
-	      && ! simplejump_p (last_insn))
-	    return FALSE;
+  else while (sub_info->join_ifblock)
+    sub_info = sub_info->join_ifblock;
 
-	  join_bb = else_bb;
-	  else_bb = NULL_BLOCK;
+  if (sub_info != NULL && sub_info->join_bb == join_bb
+      /* We cannot do this if we have matching pieces of code, as these must be
+	 predicated with the parent's condition (which, hence, must be in a
+	 different register).  */
+      && sub_info->then_br.last_head == NULL
+      && sub_info->then_br.first_tail == NULL)
+    {
+      bool reusable = false;
+      rtx this_op1 = XEXP (ce_info->else_br.cond, 1);
+      rtx sub_op1 = XEXP (sub_info->else_br.cond, 1);
+
+      if (rtx_equal_p (this_op1, sub_op1))
+	{
+	  if (ce_info->then_bb == sub_info->then_bb
+	      || ce_info->else_bb == sub_info->else_bb)
+	    reusable = sub_info->else_br.cond_code == ce_info->else_br.cond_code;
+	  else if (ce_info->then_bb == sub_info->else_bb
+		   || ce_info->else_bb == sub_info->then_bb)
+	    reusable = sub_info->then_br.cond_code == ce_info->else_br.cond_code;
+	}
+      if (reusable)
+	{
+	  ce_info->reuser = sub_info;
+	  if (dump_file)
+	    fprintf (dump_file, "sub-block %d can reuse %d's condition\n",
+		     sub_info->test_bb->index, ce_info->test_bb->index);
 	}
-      else
-	return FALSE;
     }
 
-  /* If the THEN block's successor is the other edge out of the TEST block,
-     then we have an IF-THEN combo without an ELSE.  */
-  else if (single_succ (then_bb) == else_bb)
+  /* If there's one side of the if statement with unaccounted sub-blocks,
+     try to find our then and else blocks in that side's vector.  If so,
+     it affects accounting, and we won't push the same block twice to the
+     blocks vector.  */
+  sub_info = NULL;
+  if (ce_info->then_br.ifblock && ce_info->then_br.ifblock->unaccounted > 0)
+    sub_info = ce_info->then_br.ifblock;
+  if (ce_info->else_br.ifblock && ce_info->else_br.ifblock->unaccounted > 0)
     {
-      join_bb = else_bb;
-      else_bb = NULL_BLOCK;
+      if (sub_info != NULL)
+	goto out_fail;
+      sub_info = ce_info->else_br.ifblock;
     }
 
-  /* If the THEN and ELSE block meet in a subsequent block, and the ELSE
-     has exactly one predecessor and one successor, and the outgoing edge
-     is not complex, then we have an IF-THEN-ELSE combo.  */
-  else if (single_succ_p (else_bb)
-	   && single_succ (then_bb) == single_succ (else_bb)
-	   && single_pred_p (else_bb)
-	   && !(single_succ_edge (else_bb)->flags & EDGE_COMPLEX)
-	   && !(epilogue_completed
-		&& tablejump_p (BB_END (else_bb), NULL, NULL)))
-    join_bb = single_succ (else_bb);
+  then_seen = else_seen = false;
+  unaccounted = 0;
+  if (sub_info)
+    {
+      unaccounted = sub_info->unaccounted;
+      FOR_EACH_VEC_ELT (basic_block, sub_info->targets, i, tmp_bb)
+	{
+	  if (tmp_bb == then_bb && !ce_info->then_br.ifblock)
+	    then_seen = true;
+	  else if (tmp_bb == else_bb && !ce_info->else_br.ifblock)
+	    else_seen = true;
+	}
+    }
 
-  /* Otherwise it is not an IF-THEN or IF-THEN-ELSE combination.  */
-  else
-    return FALSE;
+  ce_info->num_insns = count_bb_insns (test_bb);
+  if (then_bb && !then_seen)
+    ce_info->then_br.num_insns = count_bb_insns (then_bb);
+  if (else_bb && !else_seen)
+    ce_info->else_br.num_insns = count_bb_insns (else_bb);
 
-  num_possible_if_blocks++;
+  if (then_bb && else_bb)
+    ce_discover_matches (ce_info);
 
-  if (dump_file)
+  VEC_safe_push (basic_block, heap, ce_info->targets, test_bb);
+  
+  if (then_seen)
+    unaccounted--;
+  else if (ce_info->then_br.ifblock)
     {
-      fprintf (dump_file,
-	       "\nIF-THEN%s block found, pass %d, start block %d "
-	       "[insn %d], then %d [%d]",
-	       (else_bb) ? "-ELSE" : "",
-	       ce_info->pass,
-	       test_bb->index,
-	       BB_HEAD (test_bb) ? (int)INSN_UID (BB_HEAD (test_bb)) : -1,
-	       then_bb->index,
-	       BB_HEAD (then_bb) ? (int)INSN_UID (BB_HEAD (then_bb)) : -1);
-
-      if (else_bb)
-	fprintf (dump_file, ", else %d [%d]",
-		 else_bb->index,
-		 BB_HEAD (else_bb) ? (int)INSN_UID (BB_HEAD (else_bb)) : -1);
+      ce_info->then_br.ifblock->blocks_offset
+	= VEC_length (basic_block, ce_info->targets);
+      FOR_EACH_VEC_ELT (basic_block, ce_info->then_br.ifblock->targets, i, tmp_bb)
+	VEC_safe_push (basic_block, heap, ce_info->targets, tmp_bb);
+      VEC_free (basic_block, heap, ce_info->then_br.ifblock->targets);
+      sub_info = ce_info->then_br.ifblock;
+      while (sub_info->join_ifblock)
+	{
+	  sub_info = sub_info->join_ifblock;
+	  sub_info->blocks_offset = VEC_length (basic_block, ce_info->targets);
+	  FOR_EACH_VEC_ELT (basic_block, sub_info->targets, i, tmp_bb)
+	    VEC_safe_push (basic_block, heap, ce_info->targets, tmp_bb);
+	  VEC_free (basic_block, heap, sub_info->targets);
+	}
+      if (sub_info->join_bb != join_bb)
+	VEC_safe_push (basic_block, heap, ce_info->targets, sub_info->join_bb);
+    }
+  else if (then_bb)
+    {
+      VEC_safe_push (basic_block, heap, ce_info->targets, then_bb);
+      unaccounted += EDGE_COUNT (then_bb->preds) - 1;
+    }
 
-      fprintf (dump_file, ", join %d [%d]",
-	       join_bb->index,
-	       BB_HEAD (join_bb) ? (int)INSN_UID (BB_HEAD (join_bb)) : -1);
+  if (else_seen)
+    unaccounted--;
+  else if (ce_info->else_br.ifblock)
+    {
+      ce_info->else_br.ifblock->blocks_offset
+	= VEC_length (basic_block, ce_info->targets);
+      FOR_EACH_VEC_ELT (basic_block, ce_info->else_br.ifblock->targets, i, tmp_bb)
+	VEC_safe_push (basic_block, heap, ce_info->targets, tmp_bb);
+      VEC_free (basic_block, heap, ce_info->else_br.ifblock->targets);
+      sub_info = ce_info->else_br.ifblock;
+      while (sub_info->join_ifblock)
+	{
+	  sub_info = sub_info->join_ifblock;
+	  sub_info->blocks_offset = VEC_length (basic_block, ce_info->targets);
+	  FOR_EACH_VEC_ELT (basic_block, sub_info->targets, i, tmp_bb)
+	    VEC_safe_push (basic_block, heap, ce_info->targets, tmp_bb);
+	  VEC_free (basic_block, heap, sub_info->targets);
+	}
+      if (sub_info->join_bb != join_bb)
+	VEC_safe_push (basic_block, heap, ce_info->targets, sub_info->join_bb);
+    }
+  else if (else_bb)
+    {
+      VEC_safe_push (basic_block, heap, ce_info->targets, else_bb);
+      unaccounted += EDGE_COUNT (else_bb->preds) - 1;
+    }
 
-      if (ce_info->num_multiple_test_blocks > 0)
-	fprintf (dump_file, ", %d %s block%s last test %d [%d]",
-		 ce_info->num_multiple_test_blocks,
-		 (ce_info->and_and_p) ? "&&" : "||",
-		 (ce_info->num_multiple_test_blocks == 1) ? "" : "s",
-		 ce_info->last_test_bb->index,
-		 ((BB_HEAD (ce_info->last_test_bb))
-		  ? (int)INSN_UID (BB_HEAD (ce_info->last_test_bb))
-		  : -1));
+  ce_info->unaccounted = unaccounted;
+  gcc_assert (ce_info->unaccounted >= 0);
 
-      fputc ('\n', dump_file);
+  tmp = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX);
+  if (tmp)
+    {
+      tmp = XEXP (tmp, 0);
+      ce_info->else_br.prob_val = tmp;
+      ce_info->then_br.prob_val = GEN_INT (REG_BR_PROB_BASE - INTVAL (tmp));
     }
 
-  /* Make sure IF, THEN, and ELSE, blocks are adjacent.  Actually, we get the
-     first condition for free, since we've already asserted that there's a
-     fallthru edge from IF to THEN.  Likewise for the && and || blocks, since
-     we checked the FALLTHRU flag, those are already adjacent to the last IF
-     block.  */
-  /* ??? As an enhancement, move the ELSE block.  Have to deal with
-     BLOCK notes, if by no other means than backing out the merge if they
-     exist.  Sticky enough I don't want to think about it now.  */
-  next = then_bb;
-  if (else_bb && (next = next->next_bb) != else_bb)
-    return FALSE;
-  if ((next = next->next_bb) != join_bb && join_bb != EXIT_BLOCK_PTR)
+  ce_info->blocks_length = VEC_length (basic_block, ce_info->targets);
+  if (VEC_last (basic_block, ce_info->targets) == join_bb)
+    ce_info->blocks_length--;
+  /* Find the last basic block before the join bb.  */
+  tmp_bb = VEC_index (basic_block, ce_info->targets,
+		      ce_info->blocks_length - 1);
+  if (EDGE_COUNT (tmp_bb->succs) == 0)
     {
-      if (else_bb)
-	join_bb = NULL;
-      else
-	return FALSE;
+      /* Determine if we'll merge the last bb with the join bb.  If not,
+	 we'll fail later if the block does not have outgoing edges.  */
+      if (tmp_bb->next_bb != join_bb || EDGE_COUNT (join_bb->preds) > 1
+	  || join_bb == EXIT_BLOCK_PTR)
+	{
+	  if (dump_file)
+	    fprintf (dump_file,
+		     "failed test block %d, last block has no successors\n",
+		     test_bb->index);
+	  goto out_fail;
+	}
     }
+  else
+    gcc_assert (single_succ_p (tmp_bb) && single_succ (tmp_bb) == join_bb);
 
-  /* Do the real work.  */
-
-  ce_info->else_bb = else_bb;
-  ce_info->join_bb = join_bb;
+  if (dump_file)
+    fprintf (dump_file, "succeeded test block %d\n", test_bb->index);
 
-  /* If we have && and || tests, try to first handle combining the && and ||
-     tests into the conditional code, and if that fails, go back and handle
-     it without the && and ||, which at present handles the && case if there
-     was no ELSE block.  */
-  if (cond_exec_process_if_block (ce_info, TRUE))
-    return TRUE;
+  return outer_join != NULL || ce_info->unaccounted == 0;
 
-  if (ce_info->num_multiple_test_blocks)
-    {
-      cancel_changes (0);
+ out_fail:
+  free_sub_if_blocks (ce_info->then_br.ifblock);
+  free_sub_if_blocks (ce_info->else_br.ifblock);
+  free_sub_if_blocks (ce_info->join_ifblock);
+  if (dump_file)
+    fprintf (dump_file, "failed test block %d\n", test_bb->index);
+  return false;
+}
 
-      if (cond_exec_process_if_block (ce_info, FALSE))
-	return TRUE;
-    }
+/* Determine if a given basic block heads a simple IF-THEN or IF-THEN-ELSE
+   block.  If so, we'll try to convert the insns to not require the branch.
+   Return TRUE if we were successful at converting the block.  */
 
-  return FALSE;
+static int
+cond_exec_find_if_block (struct ce_if_block * ce_info)
+{
+  bool success;
+  /* We only ever should get here after reload,
+     and only if we have conditional execution.  */
+  gcc_assert (targetm.have_conditional_execution () && reload_completed);
+  
+  if (!ce_discover_if_structure (ce_info, NULL))
+    return FALSE;
+  if (ce_info->unaccounted != 0)
+    return FALSE;
+  if (!dbg_cnt (if_after_reload))
+    return FALSE;
+  num_possible_if_blocks++;
+  success = cond_exec_process_if_block (ce_info);
+  free_sub_if_blocks (ce_info);
+  return success;
 }
 
 /* Convert a branch over a trap, or a branch
@@ -4268,10 +5480,6 @@  if_convert (void)
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
   mark_loop_exit_edges ();
   loop_optimizer_finalize ();
-  free_dominance_info (CDI_DOMINATORS);
-
-  /* Compute postdominators.  */
-  calculate_dominance_info (CDI_POST_DOMINATORS);
 
   df_set_flags (DF_LR_RUN_DCE);
 
@@ -4281,6 +5489,10 @@  if_convert (void)
   pass = 0;
   do
     {
+      free_dominance_info (CDI_DOMINATORS);
+      free_dominance_info (CDI_POST_DOMINATORS);
+      calculate_dominance_info (CDI_DOMINATORS);
+      calculate_dominance_info (CDI_POST_DOMINATORS);
       df_analyze ();
       /* Only need to do dce on the first pass.  */
       df_clear_flags (DF_LR_RUN_DCE);
@@ -4318,6 +5530,7 @@  if_convert (void)
 #endif
 
   free_dominance_info (CDI_POST_DOMINATORS);
+  free_dominance_info (CDI_DOMINATORS);
 
   if (dump_file)
     fflush (dump_file);
@@ -4434,8 +5647,7 @@  struct rtl_opt_pass pass_if_after_combin
 static bool
 gate_handle_if_after_reload (void)
 {
-  return optimize > 0 && flag_if_conversion2
-    && dbg_cnt (if_after_reload);
+  return optimize > 0 && flag_if_conversion2;
 }
 
 static unsigned int
Index: gcc/basic-block.h
===================================================================
--- gcc/basic-block.h	(revision 178854)
+++ gcc/basic-block.h	(working copy)
@@ -464,6 +464,200 @@  extern void scale_bbs_frequencies_int (b
 extern void scale_bbs_frequencies_gcov_type (basic_block *, int, gcov_type,
 					     gcov_type);
 
+#ifdef RTX_CODE
+
+/* Definitions for structures used in ifcvt.c to convert if-blocks to
+   conditional execution.
+
+   === Terminology
+
+   A general comment about terminology.  It is assumed that then
+   blocks occur before else blocks, and that conditions and branch
+   directions should be thought of at an assembly language level, with
+   tests and jumps.  Together, this may lead to confusion when
+   thinking about it at a source level, since a "false" condition in an
+   if-block means that the "then" block is executed (the branch is
+   not-taken and the test block falls through into the "then" block).
+   Likewise, a "true" condition means that the "else" block is executed.
+   If there are diagrams in comments, the false branch (then block) is
+   always assumed to go up.
+
+   === Structure
+
+   An if-then-else block consists of a test block, a then ior an else
+   branch, and finally a join block where the two branches meet.  We
+   are able to recursively discover the structure of nested
+   if-then-else blocks; for each ce_if_block, the then and else blocks
+   can themselves be an if-then-else, and for nodes below the root, so
+   can the join block.  A (complicated) example would be
+
+                       T0        T1 
+                      /  \      /   \
+                 B2 C2    JB3 C3     outer join
+                /     \  /      \   / |
+           B1 C1       E0        E1   |
+	  /     \                     /
+	 /       E2................../
+        /                           /
+   B0 C0       --------T2          /
+        \     /       /  \        /
+	 B4 C4       /    J6 .....
+	      \     /    /
+	       B5 C5    /
+	            \  /
+		     E3
+
+   Here, Bn are test blocks, Cn the conditions they set; Tn are then blocks,
+   En else blocks, and Jn plain join blocks, and JBn join blocks which
+   themselves are if-then-else test blocks.  The outer join block is not
+   part of the if-block proper; in fact it may have other incoming edges
+   than shown in this diagram.  This is not true for the inner join blocks.
+   If they are different from their parent's join block, they must be
+   a) dominated by, and postdominate, the corresponding test block
+   b) have a single outgoing edge leading to the parent's join block.
+
+   Note that we ensure that every non-join condition block only has one
+   incoming edge. That condition ensures that every condition block (including
+   joins) is controlled only by its immediate parent.  Conceptually, in the
+   example above, we have this condition tree:
+        C0
+       /  \
+      C4  C1
+     /   /  \
+   C5   C2  C3
+
+   C2 and C3 are siblings, the blocks in which they are set (B2 and JB3) are
+   controlled by condition C1.
+
+   Note how T2 is used as the then branch from two test blocks, B4 and B5,
+   setting up a situation where condition register reuse (see below) may be
+   possible.
+   
+   === Conversion
+
+   In order to convert such a structure to conditional execution, we work from
+   the leaves (then or else blocks which are not themselves tests) upwards
+   through the tree, assigning registers for the conditions, verifying costs,
+   etc.  Ideally we can convert the entire structure, but in complicated cases
+   it's more likely that We end up with a forest of convertible sub-trees.
+
+   Note the case of B5/T2/E3/J6.  This is almost a complete if-then-else block,
+   but it has one more incoming edge from an if block (B4) at an upper level.
+   This is kept track of in the "unaccounted" member of ce_if_block_t, and it
+   means that if B5/T2/E3/J6 by itself would be ok to convert, we still must
+   prevent the conversion if B4 cannot be handled.
+
+   === Reuse of conditions
+
+   In some cases, it is possible to reuse conditions from a parent block
+   by conditionally overwriting them.  As an example,
+
+               F
+              /
+         B2 C2
+        /     \
+   B1 C1 ----- T
+
+   Let us assume that C1 has the from (condreg1 == 0) and C2 has the
+   form (condreg2) == 0.  If we can ensure that condreg1 == condreg2, and
+   only the last instruction in B2 sets condreg2, then we can use the same
+   condition (reg != 0) in both B2 and F, and (reg == 0) in T.  If the
+   conditions had a different structure, or if B1 and B2 had no common
+   branch targets, or if we could not use the same register, then we would
+   need to modify condreg2 at the start of F and T, based on the result
+   of C1.  See below.
+   (Note that the same optimization applies in one additional case: if the
+   two conditions here have opposite codes, and the opposite branch directions
+   in B1 and B2 go to the same block).
+   
+   === Rewriting conditions
+
+   In the previous example, we would use condition 2 to predicate F
+   and T, with !C2 and C2 respectively.  However, if C1 is true, C2
+   will not be computed, so (unless we manage to reuse the condition as
+   described above), we must insert additional instructions to ensure that
+   C2 has the correct value correct during F and T.
+
+   We say that C1 implies C2, meaning that if C1 is set, C2 must be made true.
+   If C2 is (reg == 0), then this can be done with an instruction of the form
+     [C1] C2 = 0
+   before F.  Likewise, !C1 implies !C2, causing an instruction to be emitted
+   before T.
+
+   This means that in all cases where reuse of conditions is not possible, two
+   conditions on the same path must use different registers.  We try to
+   reallocate condition registers to achieve the best possible results.
+   
+   Some targets have better support for combining conditions; this is an
+   area where further improvement may be possible.
+
+   === Merging identical code on opposite branches
+
+   Consider
+    /B0 T B1\
+   C         J
+    \B0 E B1/
+    
+   where the then branch has the instructions B0, T, and B1, while the
+   else branch has the instructions B0, E and B1.  We should not emit
+   two copies of the same code with opposite predicates; it would be
+   better to emit a single copy, using the predicate from one level
+   higher up.  */
+
+struct ce_if_block;
+
+/* Describe a rewrite of a condition.  See the big comment above, and the
+   example
+               F
+              /
+         B2 C2
+        /     \
+   B1 C1 ----- T  */
+
+struct ce_rewrite {
+  /* Chains together structures for a given if block.  */
+  struct ce_rewrite *next_same_block;
+  struct ce_if_block *controller;
+  /* In the example above, this would be C1.  */
+  rtx controlling_cond;
+  bool implied_true_p;
+};
+
+/* A structure to record data for then/else branches.  */
+struct ce_if_branch
+{
+  /* A structure describing a nested if-block, or NULL.  */
+  struct ce_if_block *ifblock;
+
+  /* The condition to be used for predicating the block corresponding
+     to this structure, and possibly also others at the same level
+     (such as the separate join block of a nested if block).  */
+  rtx cond;
+
+  /* The condition rewrites necessary to make the condition correct before
+     use.  */
+  struct ce_rewrite *rewrite_list;
+  /* False if rewrites are disallowed for this condition; set in
+     cond_exec_assign_registers based on life information.  */
+  bool may_rewrite;
+
+  /* The code of the condition.  */
+  enum rtx_code cond_code;
+
+  /* Computed from a REG_BR_PROB note on the branch that jumps
+     to/across this block.  */
+  rtx prob_val;
+  /* The number of insns to be converted.  Used to estimate costs.  */
+  int num_insns;
+
+  /* Information about identical code sequences at the start and end
+     of this block and its sibling.  LAST_HEAD is either NULL, or the
+     last instruction that matched at the start of the block.  Similarly,
+     FIRST_TAIL may be the first instruction that matched at the end of the
+     block.  */
+  rtx last_head, first_tail;
+};
+
 /* Structure to group all of the information to process IF-THEN and
    IF-THEN-ELSE blocks for the conditional execution support.  This
    needs to be in a public file in case the IFCVT macros call
@@ -471,25 +665,75 @@  extern void scale_bbs_frequencies_gcov_t
 
 typedef struct ce_if_block
 {
-  basic_block test_bb;			/* First test block.  */
-  basic_block then_bb;			/* THEN block.  */
-  basic_block else_bb;			/* ELSE block or NULL.  */
-  basic_block join_bb;			/* Join THEN/ELSE blocks.  */
-  basic_block last_test_bb;		/* Last bb to hold && or || tests.  */
-  int num_multiple_test_blocks;		/* # of && and || basic blocks.  */
-  int num_and_and_blocks;		/* # of && blocks.  */
-  int num_or_or_blocks;			/* # of || blocks.  */
-  int num_multiple_test_insns;		/* # of insns in && and || blocks.  */
-  int and_and_p;			/* Complex test is &&.  */
-  int num_then_insns;			/* # of insns in THEN block.  */
-  int num_else_insns;			/* # of insns in ELSE block.  */
-  int pass;				/* Pass number.  */
+  basic_block test_bb;
+  basic_block then_bb, else_bb;
+  basic_block join_bb;
+
+  /* When nesting if blocks, a pointer to the parent.  */
+  struct ce_if_block *parent;
+
+  /* Record information about the then/else branches of this if, including
+     sub-if-blocks.  */
+  struct ce_if_branch then_br, else_br;
+
+  /* If this is not the top-level if block, and its join block is another
+     recognizable if-block, record its structure here.  */
+  struct ce_if_block *join_ifblock;
+
+  /* Nonnull if one of the sub-ifblocks can reuse the condition by
+     modifying the condition register.  */
+  struct ce_if_block *reuser;
+
+  /* The last active insn before the jump in the test block.  */
+  rtx end;
+  /* The register set in the last insn.  */
+  int set_reg;
+  /* The condition register used in the test block's jump instruction,
+     and its mode.  */
+  int cond_regno;
+  enum machine_mode cond_reg_mode;
+
+  /* The number of instructions in this if-then-else block that are not
+     in the converted parts of the then/else blocks.  This is used to
+     compute the number of instructions to be converted in an outer if
+     block.  */
+  int num_unconverted;
+
+  /* A vector containing the basic blocks which are controlled (reachable
+     without going past the join block) by this if-block.  This information
+     is computed for all if-blocks in the tree, but freed soon.  It is
+     retained only for the root if-block; the indices of the blocks in this
+     vector in the root node correspond to the ce_blocks_info_t array we
+     compute.  */
+  VEC (basic_block, heap) *targets;
+  /* An offset into the top-most ce_if_block's targets vector which points to
+     the list of blocks associated with this ce_if_block.  Useful to recover
+     the information from this if-block's targets vector after it has been
+     freed.  */
+  int blocks_offset;
+  /* Likewise, the length of the targets vector, kept around after the
+     vector has been freed.  The length is shortened by one if necessary
+     to make sure the join block is not included.  */
+  int blocks_length;
+  /* The number of edges coming into basic blocks in the subtree controlled by
+     this test.  We can only convert entire subtrees for which the root node
+     has zero unaccounted in-edges.  */
+  int unaccounted;
+  bool will_convert;
+  /* A set of registers used as condition registers at lower levels,
+     not reusable as a condition register here.  */
+  HARD_REG_SET sub_nonreusable;
+  /* The hard registers live at the end of the test block.  */
+  HARD_REG_SET live_at_end;
+
+  /* The number of insns in this block after it has been converted.  */
+  int num_insns;
 
 #ifdef IFCVT_EXTRA_FIELDS
   IFCVT_EXTRA_FIELDS			/* Any machine dependent fields.  */
 #endif
-
 } ce_if_block_t;
+#endif
 
 /* This structure maintains an edge list vector.  */
 struct edge_list