[lra] patch mostly implementing pseudo live range split

Message ID 4ED556AE.70306@redhat.com
State New

Commit Message

Vladimir Makarov Nov. 29, 2011, 10:03 p.m. UTC
This patch contains part of my work on LRA over the last several weeks.

1. The major change adds a pseudo live range split transformation to
    the existing inheritance transformations.  The idea is as follows:
    for a pseudo that got a hard register and whose live range
    intersects the live ranges of a few reload pseudos under high
    register pressure, we create a new split pseudo, add save/restore
    code using the split pseudo within the EBB, and assign the original
    pseudo's hard register to the split pseudo.  The transformation is
    undone if the split pseudo neither changed its hard register nor
    was spilled in the assignment pass, which runs between the passes
    doing and undoing the inheritance/split transformations.

    This new functionality of LRA (pseudo range splitting) makes it
    possible to move the separate caller-saves pass into the
    inheritance/split transformation pass.  This makes LRA smaller and
    a bit faster.

2. The patch adds code to correctly update debug insns for
    inheritance/split transformations.  So I believe that LRA now
    handles debug info no worse than the reload pass.

3. The patch rewrites the handling of secondary memory moves in the
    constraint pass of LRA.  Previously we generated secondary moves
    whenever the macro SECONDARY_MEMORY_NEEDED said so.  Unfortunately,
    the macro is usually defined inaccurately.  Therefore we now emit
    secondary moves only if the insn constraints are not satisfied (in
    other words, if the given move needs reloads).  This change
    resolved the last degradations on the GCC testsuite for x86-64/x86
    in comparison with the reload pass.

4. The patch also rewrites the code dealing with secondary memory
    moves in PPC.  The PPC port is the only one that explicitly
    allocates a stack slot for secondary memory of a given mode, using
    the macro SECONDARY_MEMORY_NEEDED_RTX.  The patch drops this
    approach for LRA and instead uses a pseudo for secondary memory;
    this pseudo gets memory through the standard LRA spilling
    mechanism.  This permits reusing the slots allocated for secondary
    memory for other spilled pseudos, decreasing the allocated stack
    size in some cases.  This change will make it possible to remove
    SECONDARY_MEMORY_NEEDED_RTX (like many other macros) if LRA is
    finally included in GCC.

5. There are other small changes in the patch, mostly to fix some bugs
    and testsuite degradations.
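
The split-and-undo idea of item 1 can be sketched roughly as below.  This is a toy model and not code from the patch: `struct toy_pseudo`, `toy_split_pseudo`, and `toy_undo_split_p` are hypothetical names, and the undo rule encodes one reading of the description (the split is kept only if the assignment pass spilled the split pseudo or moved it to a different hard register).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a pseudo register.  All names are hypothetical and do
   not come from the patch.  */
struct toy_pseudo
{
  int regno;         /* Pseudo register number.  */
  int hard_regno;    /* Assigned hard register, or -1 if spilled.  */
  int restore_regno; /* Original pseudo of a split pseudo, else -1.  */
};

/* Create a split pseudo NEW_REGNO for ORIG.  The split pseudo starts
   out with the hard register of the original pseudo.  */
struct toy_pseudo
toy_split_pseudo (const struct toy_pseudo *orig, int new_regno)
{
  struct toy_pseudo split;

  assert (orig->hard_regno >= 0); /* Only allocated pseudos are split.  */
  split.regno = new_regno;
  split.hard_regno = orig->hard_regno;
  split.restore_regno = orig->regno;
  return split;
}

/* Return true if the split should be undone after the assignment
   pass: the split pseudo was neither spilled nor moved to another
   hard register, so the save/restore code bought nothing.  */
bool
toy_undo_split_p (const struct toy_pseudo *orig,
                  const struct toy_pseudo *split)
{
  bool spilled_p = split->hard_regno < 0;
  bool changed_p = split->hard_regno != orig->hard_regno;

  return ! spilled_p && ! changed_p;
}
```

In this model a caller would create the split pseudo, let some assignment step update `hard_regno`, and then remove the save/restore code whenever `toy_undo_split_p` returns true.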

The patch was successfully bootstrapped on x86/x86-64, ppc64, and
itanium.  The ARM bootstrap is still running.

Committed to LRA branch as rev. 181820.

2011-11-29  Vladimir Makarov  <vmakarov@redhat.com>

	* config/rs6000/rs6000.h (SECONDARY_MEMORY_NEEDED_MODE): New
	macro.

	* config/rs6000/rs6000-protos.h
	(rs6000_secondary_memory_needed_mode): New prototype.

	* config/rs6000/rs6000.c (rs6000_emit_move): Rewrite the code for
	LRA for emitting secondary memory moves based on class of the
	pseudo denoting the secondary memory.
	(rs6000_secondary_memory_needed_mode): New function.
	(rs6000_check_sdmode): Do nothing for LRA.

	* config/i386/i386.c (inline_secondary_memory_needed): Switch off
	an assert for LRA.

	* lra-int.h (lra_split_pseudos): New external.

	* lra-assigns.c (lra_setup_reg_renumber): Add code for printing
	split pseudos.
	(assign_candidates_bitmap, assign_candidates, assigned_pseudos):
	Remove.
	(spill_for): Add code for dealing with the split pseudos.
	(assign_by_spills): Ditto.

	* lra.c (lra_split_pseudos): New bitmap.
	(lra): Initialize and finalize lra_split_pseudos.  Remove call of
	lra_save_restore.  Always do inheritance/split pass on the first
	iteration when caller_save_needed.

	* lra-constraints.c (lra_secondary_memory): Remove.
	(get_secondary_mem): Ditto.
	(get_reload_reg): Make value of output reload pseudo unique.
	(emit_secondary_memory_move): Rename to emit_spill_move.  Process
	subregs.
	(check_and_process_move): Add new argument.  Don't emit secondary
	memory moves, only report them.  Make values of scratch and
	secondary pseudos unique.
	(curr_insn_transform): Emit secondary memory moves only if insn
	does not satisfy all constraints.
	(MAX_RELOAD_INSNS_NUMBER): Increase it to LRA_MAX_INSN_RELOADS.
	(lra_contraints_init): Remove code initializing
	lra_secondary_memory.
	(usage_insns_check): Remove.
	(reloads_num, calls_num): New.
	(struct usage_insns): New.
	(usage_insns): Change the type.
	(inherit_reload_reg): Improve formatting.  Add more asserts.
	(need_for_call_save_p, need_for_split_p, choose_split_class): New
	functions.
	(split_pseudo): New function.
	(update_ebb_live_info): Process insns even if EBB contains one BB.
	(get_live_on_other_edges, get_non_debug_insn): New functions.
	(temp_bitmap): New bitmap.
	(add_next_usage_insn): New function.
	(inherit_in_ebb): Change the prototype.  Add code for pseudo range
	splitting.
	(lra_inheritance): Call update_ebb_live_info if the changes were
	made in inherit_in_ebb.
	(get_pseudo_regno): New function.
	(remove_inheritance_pseudos): Add code for undoing pseudo live
	range split and dealing with subregs.  Update debug info too.
	(lra_undo_inheritance): Add code for undoing pseudo live range
	split.  Add printing more debug info.

	* lra-saves.c: Remove.

	* Makefile.in (OBJS): Remove lra-saves.o
	(lra-saves.o): Remove the entry.

Comments

Jeff Law Nov. 30, 2011, 5:09 a.m. UTC | #1
BTW, have you noticed the regressions creeping up on the lra branch
relative to the mainline?  We also need to merge in the api change to
pass in address spaces to base_reg_class.  I've avoided merging in
that specific change during my weekly merges.

I haven't looked at the overall framework you've built for splitting,
but the obvious question is: can your framework handle splitting
unallocated pseudos on EBB boundaries?

jeff
Vladimir Makarov Nov. 30, 2011, 2:46 p.m. UTC | #2
On 11/30/2011 12:09 AM, Jeff Law wrote:
> BTW, have you noticed the regressions creeping up on the lra branch
> relative to the mainline?  We also need to merge in the api change to
> pass in address spaces to base_reg_class.  I've avoided merging in
> that specific change during my weekly merges.
>
Yes, even now the trunk is not stable.  I was struggling with itanium
and the shrink-wrapping optimizations.  I guess there were a lot of
last-minute stage1 changes on the trunk which still need to be
addressed.  Thanks for your regular work on merging the branch, Jeff.
> I haven't looked at the overall framework you've built for splitting,
> but the obvious question, can your framework handle splitting
> unallocated pseudos on EBB boundaries?
>
LRA can now split pseudos assigned to hard registers within an EBB,
including on EBB borders.  The analogous effect of splitting spilled
pseudos (and then assigning a hard register to the part within the
EBB) can be achieved by the already existing inheritance in EBBs.
Hans-Peter Nilsson Dec. 2, 2011, 1:35 p.m. UTC | #3
On Tue, 29 Nov 2011, Vladimir Makarov wrote:
> 3. The patch rewrites the handling of secondary memory moves in the
>    constraint pass of LRA.  Previously we generated secondary moves
>    whenever the macro SECONDARY_MEMORY_NEEDED said so.  Unfortunately,
>    the macro is usually defined inaccurately.

I do not doubt that, but I think it would help if you mentioned
what you see that is wrong, in particular if it's consistent
among targets.

For example, for MIPS (and probably for other targets too, if I
looked) with an older gcc, I've seen calls with class ==
NO_REGS to the related function mips_secondary_reload_class due
to MEMORY_MOVE_COST applied to a constant, which becomes a bit
of a problem if it's used as-is as a first argument to
reg_class_subset_p (the empty class being a subset of every
class).
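
The pitfall with the empty class can be illustrated with a small sketch.  It is a toy model only: register classes are represented as plain bit sets, and `toy_class_subset_p`, `toy_nonempty_class_subset_p`, and the `TOY_*` values are hypothetical names that merely mimic the subset semantics described above, not GCC's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register classes modeled as bit sets of hard registers.  */
typedef uint32_t toy_reg_set;

enum
{
  TOY_NO_REGS = 0x0000, /* The empty class.  */
  TOY_GR_REGS = 0x00ff, /* Eight general registers.  */
  TOY_FP_REGS = 0xff00  /* Eight floating-point registers.  */
};

/* Return true if every register of A is also in B.  For an empty A
   this is vacuously true, whatever B is.  */
bool
toy_class_subset_p (toy_reg_set a, toy_reg_set b)
{
  return (a & ~b) == 0;
}

/* Guarded variant: an empty class is never treated as a match.  */
bool
toy_nonempty_class_subset_p (toy_reg_set a, toy_reg_set b)
{
  return a != 0 && toy_class_subset_p (a, b);
}

/* Usage example: the empty class passes the plain test against an
   unrelated class but fails the guarded one.  */
void
toy_subset_demo (void)
{
  assert (toy_class_subset_p (TOY_NO_REGS, TOY_FP_REGS));
  assert (! toy_nonempty_class_subset_p (TOY_NO_REGS, TOY_FP_REGS));
  assert (toy_nonempty_class_subset_p (0x000f, TOY_GR_REGS));
}
```

A caller that may receive the empty class as the first argument would need a guard of this kind (or an early return) before drawing conclusions from the subset test.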

brgds, H-P
Vladimir Makarov Dec. 5, 2011, 7:18 p.m. UTC | #4
On 12/02/2011 08:35 AM, Hans-Peter Nilsson wrote:
> On Tue, 29 Nov 2011, Vladimir Makarov wrote:
>> 3. The patch rewrites the handling of secondary memory moves in the
>>     constraint pass of LRA.  Previously we generated secondary moves
>>     whenever the macro SECONDARY_MEMORY_NEEDED said so.  Unfortunately,
>>     the macro is usually defined inaccurately.
> I do not doubt that, but I think it would help if you mentioned
> what you see that is wrong, in particular if it's consistent
> among targets.
>
The last testsuite degradations solved for x86-64 were about moves
between xmm and general regs.
SECONDARY_MEMORY_NEEDED returned NO_REGS.  The code for this macro
would have to be expanded a lot to be accurate (there are too many
elements in the set CLASSxCLASSxMODE).  Currently, the mode is ignored,
and not every mode can be moved between xmm and general regs.  The
right source of this info is the insn constraints.  So instead of
expanding the code for the macro, I decided to use the constraint info
first.
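
The gap between a class-only predicate and per-mode information can be shown with a toy model.  Nothing here is GCC code: the classes, the modes, and the `toy_direct_move_ok` table are invented for illustration, and the table entries are assumptions, not facts about x86-64.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_class { TOY_GENERAL, TOY_XMM, TOY_NCLASSES };
enum toy_mode { TOY_SI, TOY_DI, TOY_TI, TOY_NMODES };

/* Coarse predicate in the style of a class-only macro: it ignores
   MODE and asks for memory on every move between different classes.  */
bool
toy_coarse_memory_needed (enum toy_class from, enum toy_class to,
                          enum toy_mode mode)
{
  (void) mode;
  return from != to;
}

/* Stand-in for insn constraints: which (from, to, mode) moves have a
   direct pattern.  The entries are assumptions for this sketch.  */
static const bool
toy_direct_move_ok[TOY_NCLASSES][TOY_NCLASSES][TOY_NMODES] = {
  [TOY_GENERAL][TOY_GENERAL] = { true, true, true },
  [TOY_XMM][TOY_XMM] = { true, true, true },
  [TOY_GENERAL][TOY_XMM] = { true, true, false },
  [TOY_XMM][TOY_GENERAL] = { true, true, false },
};

/* Precise predicate: memory is needed only when no direct move
   exists for this exact class pair and mode.  */
bool
toy_precise_memory_needed (enum toy_class from, enum toy_class to,
                           enum toy_mode mode)
{
  assert (from < TOY_NCLASSES && to < TOY_NCLASSES && mode < TOY_NMODES);
  return ! toy_direct_move_ok[from][to][mode];
}
```

In this model the coarse predicate requests a secondary memory move for an SImode move between the two classes, while the table-driven one does not; the two agree only for the widest mode.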
> For example, for MIPS (and probably for other targets too, if I
> looked) with an older gcc, I've seen calls with class ==
> NO_REGS to the related function mips_secondary_reload_class due
> to MEMORY_MOVE_COST applied to a constant, which becomes a bit
> of a problem if it's used as-is as a first argument to
> reg_class_subset_p (the empty class being a subset of every
> class).
>
> brgds, H-P

Patch

Index: lra-assigns.c
===================================================================
--- lra-assigns.c	(revision 181810)
+++ lra-assigns.c	(working copy)
@@ -537,8 +537,9 @@  lra_setup_reg_renumber (int regno, int h
     fprintf (lra_dump_file, "      Assign %d to %sr%d (freq=%d)\n",
 	     reg_renumber[regno],
 	     regno < lra_constraint_new_regno_start
-	     ? "" : bitmap_bit_p (&lra_inheritance_pseudos, regno)
-	     ? "inheritance " : "reload ",
+	     ? ""
+	     : bitmap_bit_p (&lra_inheritance_pseudos, regno) ? "inheritance "
+	     : bitmap_bit_p (&lra_split_pseudos, regno) ? "split " : "reload ",
 	     regno, lra_reg_info[regno].freq);
   if (hard_regno >= 0)
     {
@@ -555,17 +556,6 @@  static bitmap_head ignore_pseudos_bitmap
    and best spill pseudos for given pseudo (and best hard regno).  */
 static bitmap_head spill_pseudos_bitmap, best_spill_pseudos_bitmap;
 
-/* Bitmap used to contain all posible candidates which might get a
-   hard register after spilling for given pseudo hard regno.  */
-static bitmap_head assign_candidates_bitmap;
-/* Array used to contain and sort the possible candidates (see
-   above).  */
-static int *assign_candidates;
-
-/* Array used to contain assigned pseudos during a try of spilling
-   pseudos.  */
-static int *assigned_pseudos;
-
 /* Current pseudo check for validity of elements in
    TRY_HARD_REG_PSEUDOS.  */
 static int curr_pseudo_check;
@@ -692,7 +682,8 @@  spill_for (int regno, bitmap spilled_pse
       EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi)
 	if ((int) spill_regno >= lra_constraint_new_regno_start
 	    /* ??? */
-	    && ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno))
+	    && ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno)
+	    && ! bitmap_bit_p (&lra_split_pseudos, spill_regno))
 	  goto fail;
       insn_pseudos_num = 0;
       if (lra_dump_file != NULL)
@@ -804,7 +795,9 @@  spill_for (int regno, bitmap spilled_pse
 		 ((int) spill_regno < lra_constraint_new_regno_start
 		  ? ""
 		  : bitmap_bit_p (&lra_inheritance_pseudos, spill_regno)
-		  ? "inheritance " : "reload "),
+		  ? "inheritance "
+		  : bitmap_bit_p (&lra_split_pseudos, spill_regno)
+		  ? "split " : "reload "),
 		 spill_regno, reg_renumber[spill_regno],
 		 lra_reg_info[spill_regno].freq, regno);
       update_lives (spill_regno, true);
@@ -990,9 +983,6 @@  assign_by_spills (void)
   bitmap_initialize (&ignore_pseudos_bitmap, &reg_obstack);
   bitmap_initialize (&spill_pseudos_bitmap, &reg_obstack);
   bitmap_initialize (&best_spill_pseudos_bitmap, &reg_obstack);
-  bitmap_initialize (&assign_candidates_bitmap, &reg_obstack);
-  assign_candidates = (int *) xmalloc (sizeof (int) * max_reg_num ());
-  assigned_pseudos = (int *) xmalloc (sizeof (int) * max_reg_num ());
   update_hard_regno_preference_check = (int *) xmalloc (sizeof (int)
 							* max_reg_num ());
   memset (update_hard_regno_preference_check, 0,
@@ -1019,11 +1009,13 @@  assign_by_spills (void)
 		     regno_assign_info[regno_assign_info[regno].first].freq);
 	  hard_regno = find_hard_regno_for (regno, &cost, -1);
 	  if (hard_regno < 0
-	      && ! bitmap_bit_p (&lra_inheritance_pseudos, regno))
+	      && ! bitmap_bit_p (&lra_inheritance_pseudos, regno)
+	      && ! bitmap_bit_p (&lra_split_pseudos, regno))
 	    hard_regno = spill_for (regno, &all_spilled_pseudos);
 	  if (hard_regno < 0)
 	    {
-	      if (! bitmap_bit_p (&lra_inheritance_pseudos, regno))
+	      if (! bitmap_bit_p (&lra_inheritance_pseudos, regno)
+		  && ! bitmap_bit_p (&lra_split_pseudos, regno))
 		sorted_pseudos[nfails++] = regno;
 	    }
 	  else
@@ -1084,17 +1076,24 @@  assign_by_spills (void)
     }
   improve_inheritance ();
   bitmap_clear (&changed_insns);
-  /* Inheritance pseudo can be assigned and after that spilled.  We
-     should look at the final result.  */
+  /* We can not assign to inherited pseudos if any its inheritance
+     pseudo did not get hard register because undo inheritance pass
+     will extend live range of such inherited pseudos.  */
   bitmap_initialize (&do_not_assign_nonreload_pseudos, &reg_obstack);
   EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, u, bi)
-    if (reg_renumber[u] < 0
-	&& (restore_regno = lra_reg_info[u].restore_regno) >= 0)
+    if ((restore_regno = lra_reg_info[u].restore_regno) >= 0
+	&& ((reg_renumber[u] < 0
+	     && bitmap_bit_p (&lra_inheritance_pseudos, u))
+	    || (reg_renumber[u] >= 0
+		&& bitmap_bit_p (&lra_split_pseudos, u))))
       bitmap_set_bit (&do_not_assign_nonreload_pseudos, restore_regno);
   for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_reg_num (); i++)
     if (((i < lra_constraint_new_regno_start
 	  && ! bitmap_bit_p (&do_not_assign_nonreload_pseudos, i))
-	 || bitmap_bit_p (&lra_inheritance_pseudos, i))
+	 || (bitmap_bit_p (&lra_inheritance_pseudos, i)
+	     && lra_reg_info[i].restore_regno >= 0)
+	 || (bitmap_bit_p (&lra_split_pseudos, i)
+	     && lra_reg_info[i].restore_regno >= 0))
 	&& reg_renumber[i] < 0 && lra_reg_info[i].nrefs != 0
 	&& regno_allocno_class_array[i] != NO_REGS)
       sorted_pseudos[n++] = i;
@@ -1117,9 +1116,6 @@  assign_by_spills (void)
 	}
     }
   free (update_hard_regno_preference_check);
-  free (assigned_pseudos);
-  free (assign_candidates);
-  bitmap_clear (&assign_candidates_bitmap);
   bitmap_clear (&best_spill_pseudos_bitmap);
   bitmap_clear (&spill_pseudos_bitmap);
   bitmap_clear (&ignore_pseudos_bitmap);
Index: lra-int.h
===================================================================
--- lra-int.h	(revision 181810)
+++ lra-int.h	(working copy)
@@ -276,6 +276,7 @@  extern bool lra_former_scratch_operand_p
 
 extern int lra_constraint_new_regno_start;
 extern bitmap_head lra_inheritance_pseudos;
+extern bitmap_head lra_split_pseudos;
 extern int lra_constraint_new_insn_uid_start;
 
 /* lra-constraints.c: */
Index: lra.c
===================================================================
--- lra.c	(revision 181810)
+++ lra.c	(working copy)
@@ -2065,9 +2065,12 @@  int lra_in_progress;
 /* Start of reload pseudo regnos before the new spill pass.  */ 
 int lra_constraint_new_regno_start;
 
-/* Inheritance pseudo regnos breore the new spill pass.  */ 
+/* Inheritance pseudo regnos before the new spill pass.  */ 
 bitmap_head lra_inheritance_pseudos;
 
+/* Split pseudo regnos before the new spill pass.  */ 
+bitmap_head lra_split_pseudos;
+
 /* First UID of insns generated before a new spill pass.  */
 int lra_constraint_new_insn_uid_start;
 
@@ -2122,12 +2125,6 @@  lra (FILE *f)
       if (! call_used_regs[i] && ! fixed_regs[i] && ! LOCAL_REGNO (i))
 	df_set_regs_ever_live (i, true);
 
-  if (flag_caller_saves)
-    {
-      if (lra_save_restore ())
-	df_analyze ();
-    }
-
   /* We don't DF from now and avoid its using because it is to
      expensive when a lot of RTL changes are made.  */
   df_set_flags (DF_NO_INSN_RESCAN);
@@ -2140,6 +2137,7 @@  lra (FILE *f)
   /* It is needed for the 1st coalescing.  */
   lra_constraint_new_insn_uid_start = get_max_uid ();
   bitmap_initialize (&lra_inheritance_pseudos, &reg_obstack);
+  bitmap_initialize (&lra_split_pseudos, &reg_obstack);
   live_p = false;
   for (;;)
     {
@@ -2149,7 +2147,8 @@  lra (FILE *f)
 	     if there were no RTL transformations in
 	     lra_constraints.  */
 	  if (! lra_constraints (lra_constraint_iter == 0)
-	      && (lra_constraint_iter > 1 || ! scratch_p))
+	      && (lra_constraint_iter > 1
+		  || (! scratch_p && ! caller_save_needed)))
 	    break;
 	  /* Constraint transformations may result in that eliminable
 	     hard regs become uneliminable and pseudos which use them
@@ -2175,6 +2174,7 @@  lra (FILE *f)
 	    live_p = false;
 	}
       bitmap_clear (&lra_inheritance_pseudos);
+      bitmap_clear (&lra_split_pseudos);
       if (! lra_need_for_spills_p ())
 	break;
       if (! live_p)
Index: lra-eliminations.c
===================================================================
--- lra-eliminations.c	(revision 181810)
+++ lra-eliminations.c	(working copy)
@@ -1259,9 +1259,6 @@  lra_eliminate (bool final_p)
   bitmap_head insns_with_changed_offsets;
   struct elim_table *ep;
   int regs_num = max_reg_num ();
-#ifdef SECONDARY_MEMORY_NEEDED
-  int mode;
-#endif
 
   bitmap_initialize (&insns_with_changed_offsets, &reg_obstack);
   if (final_p)
@@ -1307,13 +1304,6 @@  lra_eliminate (bool final_p)
 	  fprintf (lra_dump_file,
 		   "Updating elimination of equiv for reg %d\n", i);
       }
-#ifdef SECONDARY_MEMORY_NEEDED
-  for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
-    if (lra_secondary_memory[mode] != NULL_RTX)
-      lra_secondary_memory[mode]
-	= lra_eliminate_regs_1 (lra_secondary_memory[mode],
-				VOIDmode, final_p, ! final_p, false);
-#endif
   FOR_EACH_BB (bb)
     FOR_BB_INSNS_SAFE (bb, insn, temp)
       {
Index: lra-saves.c
===================================================================
--- lra-saves.c	(revision 181810)
+++ lra-saves.c	(working copy)
@@ -1,490 +0,0 @@ 
-/* Save/restore placement optimization based on register allocation.
-   Copyright (C) 2010, 2011
-   Free Software Foundation, Inc.
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify it under
-the terms of the GNU General Public License as published by the Free
-Software Foundation; either version 3, or (at your option) any later
-version.
-
-GCC is distributed in the hope that it will be useful, but WITHOUT ANY
-WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
-for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-
-/* This file contains code for placement save/restore insns.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hard-reg-set.h"
-#include "rtl.h"
-#include "tm_p.h"
-#include "regs.h"
-#include "insn-config.h"
-#include "recog.h"
-#include "output.h"
-#include "regs.h"
-#include "target.h"
-#include "function.h"
-#include "expr.h"
-#include "basic-block.h"
-#include "except.h"
-#include "cfgloop.h"
-#include "df.h"
-#include "ira.h"
-#include "lra-int.h"
-
-/* Pseudos living through calls and assigned to call used hard
-   registers.  */
-static bitmap consideration_pseudos;
-/* Currently live pseudos which are subset of the consideration
-   pseudos.  */
-static bitmap live_pseudos;
-
-
-
-/* The number elements in arrays correspondingly REFERENCED_PSEUDOS
-   and MODIFIED_PSEUDOS.  */
-static int referenced_pseudos_num, modified_pseudos_num;
-
-/* Enough space to keep all operands and two address registers for an
-   operand.  */
-#define MAX_INSN_PSEUDOS (MAX_RECOG_OPERANDS * 3)
-
-/* Arrays used to store pseudos referenced and modified in the current
-   insn.  */
-static int referenced_pseudos[MAX_INSN_PSEUDOS];
-static int modified_pseudos[MAX_INSN_PSEUDOS];
-
-/* Add pseudos referenced and modified in OP to ones referenced and
-   modified in the current insn.  */
-static void
-mark_pseudos (rtx op, bool out_p)
-{
-  enum rtx_code code = GET_CODE (op);
-  const char *fmt;
-  int i, j, regno;
-
-  switch (code)
-    {
-    case SUBREG:
-      op = SUBREG_REG (op);
-      if (! REG_P (op))
-	{
-	  mark_pseudos (op, out_p);
-	  return;
-	}
-      /* Fall through:  */
-
-    case REG:
-      if ((regno = REGNO (op)) >= FIRST_PSEUDO_REGISTER
-	  && bitmap_bit_p (consideration_pseudos, regno))
-	{
-	  gcc_assert (referenced_pseudos_num < MAX_INSN_PSEUDOS);
-	  if (out_p)
-	    modified_pseudos[modified_pseudos_num++] = regno;
-	  referenced_pseudos[referenced_pseudos_num++] = regno;
-	}
-      return;
-     
-    case PRE_INC:
-    case POST_INC:
-    case PRE_DEC:
-    case POST_DEC:
-      mark_pseudos (XEXP (op, 0), true);
-      return;
-
-    case PRE_MODIFY:
-    case POST_MODIFY:
-      mark_pseudos (XEXP (op, 0), true);
-      mark_pseudos (XEXP (op, 1), false);
-      return;
-    default:
-      fmt = GET_RTX_FORMAT (code);
-      for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
-	{
-	  if (fmt[i] == 'e')
-	    mark_pseudos (XEXP (op, i), false);
-	  else if (fmt[i] == 'E')
-	    for (j = XVECLEN (op, i) - 1; j >= 0; j--)
-	      mark_pseudos (XVECEXP (op, i, j), false);
-	}
-    }
-}
-
-/* Process INSN and record all referenced registers in
-   REFERENCED_PSEUDOS and modified registers in MODIFIED_PSEUDOS.
-   Only pseudos in CONSIDERATION_PSEUDOS are considered.  */
-static void
-mark_insn_pseudos (rtx insn)
-{
-  int i;
-  lra_insn_recog_data_t id;
-  struct lra_static_insn_data *static_id;
-
-  modified_pseudos_num = referenced_pseudos_num = 0;
-  id = lra_get_insn_recog_data (insn);
-  static_id = id->insn_static_data;
-  for (i = static_id->n_operands - 1; i >= 0; i--)
-    mark_pseudos (*id->operand_loc[i], static_id->operand[i].type != OP_IN);
-}
-
-/* Process storing of REG to update live info.  */
-static void
-mark_pseudo_store (rtx reg, const_rtx setter ATTRIBUTE_UNUSED,
-		   void *data ATTRIBUTE_UNUSED)
-{
-  int regno;
-
-  if (GET_CODE (reg) == SUBREG)
-    reg = SUBREG_REG (reg);
-
-  if (! REG_P (reg))
-    return;
-
-  regno = REGNO (reg);
-
-  if (regno >= FIRST_PSEUDO_REGISTER
-      && bitmap_bit_p (consideration_pseudos, regno))
-    bitmap_set_bit (live_pseudos, regno);
-}
-
-/* Info about save/restore code for a pseudo.  */
-struct save_regs
-{
-  /* True if MEM_REG is actually an equivalence of the corresponding
-     saved pseudo.  */
-  bool equiv_p;
-  /* A spilled pseudo or equivalence which hold value of the
-     corresponding saved/restored pseudo around calls.  */
-  rtx saved_value;
-  /* Saving/restoring of a pseudo could be done in a mode different
-     from the pseudo mode.  The following is the pseudo or a subreg of
-     the pseudo and is used in save/restore code.  */
-  rtx saved_reg;
-  /* Last insn in EBB referencing the pseudo.  */
-  rtx last_referencing_insn;
-};
-
-/* Map: regno -> info about save/restore for REGNO.  */
-static struct save_regs *save_regs;
-
-/* Return equivalence value we can use for restoring.  NULL,
-   otherwise.  ??? Should we use and is it worth to invariants with
-   caller saved hard registers.  */
-static rtx
-equiv_for_save (int regno)
-{
-  return ira_reg_equiv[regno].constant;
-}
-
-/* Set to true if changes in live info are too complex to update it
-   here.  */
-static bool update_live_info_p;
-
-/* Set up if necessary equiv_p, saved_value, and saved_reg for REGNO.  */
-static void
-setup_save_regs (int regno)
-{
-  if (save_regs[regno].saved_value == NULL_RTX)
-    {
-      enum machine_mode mode;
-      int hard_regno;
-      rtx equiv, saved_reg = regno_reg_rtx[regno];
-
-      equiv = ira_reg_equiv[regno].defined_p ? equiv_for_save (regno) : NULL_RTX;
-      if (equiv != NULL_RTX)
-	{
-	  save_regs[regno].saved_value = equiv;
-	  update_live_info_p = true;
-	}
-      else
-	{
-	  hard_regno = reg_renumber[regno];
-	  gcc_assert (hard_regno >= 0);
-	  mode = (HARD_REGNO_CALLER_SAVE_MODE
-		  (hard_regno,
-		   hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (regno)],
-		   PSEUDO_REGNO_MODE (regno)));
-	  if (mode != PSEUDO_REGNO_MODE (regno))
-	    saved_reg = gen_rtx_SUBREG (mode, saved_reg, 0);
-	  save_regs[regno].saved_value
-	    = lra_create_new_reg (VOIDmode, saved_reg, NO_REGS, NULL);
-	}
-      save_regs[regno].saved_reg = saved_reg;
-      save_regs[regno].equiv_p = equiv != NULL_RTX;
-    }
-}
-
-/* Insert save (if SAVE_P) or restore code for REGNO before (if
-   BEFORE_P) or after INSN.  Use equivalence for restoring if it is
-   possible.  */
-static void
-insert_save_restore (rtx insn, bool before_p, int regno, bool save_p)
-{
-  rtx x, saved_value;
-  int to_regno, from_regno;
-  rtx saved_reg;
-
-  setup_save_regs (regno);
-  saved_value = save_regs[regno].saved_value;
-  saved_reg = save_regs[regno].saved_reg;
-  if (save_regs[regno].equiv_p)
-    {
-      if (save_p)
-	/* Do nothing -- the pseudo always holds the same value.  */
-	return;
-      start_sequence ();
-      emit_move_insn (saved_reg, saved_value);
-      x = get_insns ();
-      end_sequence ();
-      if (lra_dump_file != NULL)
-	{
-	  fprintf (lra_dump_file, "Inserting %s i%u bb%d\n",
-		   before_p ? "before" : "after", INSN_UID (insn),
-		   BLOCK_FOR_INSN (insn)->index);
-	  print_rtl_slim (lra_dump_file, x, x, -1, 0);
-	}
-      ira_reg_equiv[regno].init_insns
-	= gen_rtx_INSN_LIST (VOIDmode, x, ira_reg_equiv[regno].init_insns);
-      if (before_p)
-	emit_insn_before (x, insn); 
-      else
-	emit_insn_after (x, insn); 
-      return;
-    }
-  start_sequence ();
-  if (save_p)
-    {
-      gcc_assert (REG_P (saved_value));
-      to_regno = REGNO (saved_value);
-      from_regno = regno;
-      emit_move_insn (saved_value, saved_reg);
-    }
-  else
-    {
-      to_regno = regno;
-      from_regno = REGNO (saved_value);
-      emit_move_insn (saved_reg, saved_value);
-    }
-  x = get_insns ();
-  end_sequence ();
-  if (before_p || ! save_p || ! CALL_P (insn))
-    {
-      if (! update_live_info_p)
-	add_reg_note (x, REG_DEAD, regno_reg_rtx[from_regno]);
-    }
-  else
-    {
-      /* A special case: saving a pseudo used in a call.  Put saving
-	 insn before the call and attach the note to the call.  */
-      before_p = true;
-      if (! update_live_info_p)
-	add_reg_note (insn, REG_DEAD, regno_reg_rtx[from_regno]);
-    }
-  ira_update_equiv_info_by_shuffle_insn (to_regno, from_regno, x);
-  if (lra_dump_file != NULL)
-    fprintf (lra_dump_file, "Inserting insn %u %d:=%d %s i%u bb%d\n",
-	     INSN_UID (x),
-	     save_p ? (int) REGNO (saved_value) : regno,
-	     save_p ? regno : (int) REGNO (saved_value),
-	     before_p ? "before" : "after", INSN_UID (insn),
-	     BLOCK_FOR_INSN (insn)->index);
-  if (before_p)
-    emit_insn_before (x, insn); 
-  else
-    emit_insn_after (x, insn); 
-}
-
-/* Insert save/restore code usinge solution of the data-flow
-   equations.  */
-static void
-insert_saves_restores (void)
-{
-  int i;
-  unsigned int regno;
-  rtx insn, curr, link, first;
-  rtx where = NULL_RTX;
-  basic_block bb, prev_bb, start_bb, end_bb;
-  edge e;
-  edge_iterator ei;
-  bitmap pseudos_saved, pseudos_to_restore;
-  reg_set_iterator rsi;
-  
-  pseudos_saved = BITMAP_ALLOC (&reg_obstack);
-  pseudos_to_restore = BITMAP_ALLOC (&reg_obstack);
-  FOR_EACH_BB (bb)
-    {
-      /* Find EBB.  */
-      if (lra_dump_file != NULL)
-	fprintf (lra_dump_file, "EBB");
-      /* Form a EBB starting with BB.  */
-      for (start_bb = bb;; bb = bb->next_bb)
-	{
-	  end_bb = bb;
-	  if (lra_dump_file != NULL)
-	    fprintf (lra_dump_file, " %d", bb->index);
-	  if (bb->next_bb == EXIT_BLOCK_PTR || LABEL_P (BB_HEAD (bb->next_bb)))
-	    break;
-	  e = find_fallthru_edge (bb->succs);
-	  if (! e)
-	    break;
-	  /* We are not interesting in frequencies, we need the
-	     longest EBB because the farther BB in EBB, the smaller
-	     execution frequency of restore insns.  */
-	}
-      if (lra_dump_file != NULL)
-	fprintf (lra_dump_file, "\n");
-      
-      bitmap_clear (pseudos_saved);
-      bitmap_and (live_pseudos, DF_LR_IN (start_bb), consideration_pseudos);
-
-      for (bb = start_bb;;)
-	{
-	  first = NULL_RTX;
-	  FOR_BB_INSNS (bb, insn)
-	    {
-	      if (NONDEBUG_INSN_P (insn))
-		break;
-	      first = insn;
-	    }
-	  EXECUTE_IF_AND_COMPL_IN_BITMAP (live_pseudos, pseudos_saved,
-					  FIRST_PSEUDO_REGISTER, regno, rsi)
-	    save_regs[regno].last_referencing_insn = first;
-
-	  FOR_BB_INSNS_SAFE (bb, insn, curr)
-	    {
-	      enum rtx_code code = GET_CODE (insn);
-	      
-	      if (! NONDEBUG_INSN_P (insn))
-		continue;
-	      
-	      mark_insn_pseudos (insn);
-	      for (i = 0; i < referenced_pseudos_num; i++)
-		{
-		  regno = referenced_pseudos[i];
-		  if (bitmap_bit_p (pseudos_saved, regno)
-		      && bitmap_bit_p (live_pseudos, regno))
-		    {
-		      insert_save_restore
-			(save_regs[regno].last_referencing_insn,
-			 false, regno, true);
-		      insert_save_restore (insn, true, regno, false);
-		      bitmap_clear_bit (pseudos_saved, regno);
-		    }
-		  save_regs[regno].last_referencing_insn = insn;
-		}
-	      for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
-		if (REG_NOTE_KIND (link) == REG_DEAD)
-		  bitmap_clear_bit (live_pseudos, REGNO (XEXP (link, 0)));
-	      
-	      if (code == CALL_INSN && ! SIBLING_CALL_P (insn)
-		  && ! find_reg_note (insn, REG_NORETURN, NULL))
-		bitmap_ior_into (pseudos_saved, live_pseudos);
-	      
-	      /* Mark any registers set in INSN as live.  */
-	      note_stores (PATTERN (insn), mark_pseudo_store, NULL);
-	      
-#ifdef AUTO_INC_DEC
-	      for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
-		if (REG_NOTE_KIND (link) == REG_INC)
-		  mark_pseudo_store (XEXP (link, 0), NULL_RTX, NULL);
-#endif
-	      for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
-		if (REG_NOTE_KIND (link) == REG_UNUSED)
-		  bitmap_clear_bit (live_pseudos, REGNO (XEXP (link, 0)));
-	      
-	      where = insn;
-	    }
-	  /* At the end of the basic block, we must restore any
-	     registers living at all non-fall through BB.  */
-	  if (bb == end_bb)
-	    bitmap_copy (pseudos_to_restore, pseudos_saved);
-	  else
-	    {
-	      bitmap_clear (pseudos_to_restore);
-	      FOR_EACH_EDGE (e, ei, bb->succs)
-		if (e->dest != bb->next_bb)
-		  bitmap_ior_and_into (pseudos_to_restore,
-				       DF_LR_IN (e->dest), pseudos_saved);
-	    }
-	      
-	  EXECUTE_IF_SET_IN_REG_SET (pseudos_to_restore,
-				     FIRST_PSEUDO_REGISTER, regno, rsi)
-	    {
-	      insert_save_restore (save_regs[regno].last_referencing_insn,
-				   false, regno, true);
-	      insert_save_restore (where, JUMP_P (where), regno, false);
-	      bitmap_clear_bit (pseudos_saved, regno);
-	    }
-	  if (bb == end_bb)
-	    break;
-	  prev_bb = bb;
-	  bb = bb->next_bb;
-	  bitmap_and (live_pseudos, DF_LR_IN (bb), consideration_pseudos);
-	  EXECUTE_IF_SET_IN_REG_SET (pseudos_saved,
-				     FIRST_PSEUDO_REGISTER, regno, rsi)
-	    {
-	      bitmap_clear_bit (DF_LR_IN (bb), regno);
-	      bitmap_clear_bit (DF_LR_OUT (prev_bb), regno);
-	      setup_save_regs (regno);
-	      if (update_live_info_p)
-		break;
-	      bitmap_set_bit (DF_LR_IN (bb),
-			      REGNO (save_regs[regno].saved_value));
-	      bitmap_set_bit (DF_LR_OUT (prev_bb),
-			      REGNO (save_regs[regno].saved_value));
-	    }
-	}
-    }
-  BITMAP_FREE (pseudos_to_restore);
-  BITMAP_FREE (pseudos_saved);
-}
-
-/* Major function to insert save/restore code.  The function needs
-   correct DFA info and REG_N_REFS & REG_N_CALLS_CROSSED before the
-   function work.  It keeps correct bb live info and live related insn
-   notes.  Return true if we need to update live info because the
-   changes are too complex to do it here.  */
-bool
-lra_save_restore (void)
-{
-  int i, hard_regno, max_regno_before;
-
-  /* We ignore created scratches.  The don't need to be saved.  */
-  max_regno_before = lra_constraint_new_regno_start;
-
-  if (lra_dump_file != NULL)
-    fprintf (lra_dump_file, "\n********** Caller saves: **********\n\n");
-
-  update_live_info_p = false;
-  consideration_pseudos = BITMAP_ALLOC (&reg_obstack);
-  for (i = FIRST_PSEUDO_REGISTER; i < max_regno_before; i++)
-    if (REG_N_REFS (i) != 0
-	&& (hard_regno = lra_get_regno_hard_regno (i)) >= 0
-	&& REG_N_CALLS_CROSSED (i) != 0
-	&& lra_hard_reg_set_intersection_p (hard_regno, PSEUDO_REGNO_MODE (i),
-					    call_used_reg_set))
-      bitmap_set_bit (consideration_pseudos, i);
-  if (! bitmap_empty_p (consideration_pseudos))
-    {
-      live_pseudos = BITMAP_ALLOC (&reg_obstack);
-      save_regs = (struct save_regs *) xmalloc (sizeof (struct save_regs)
-						* max_regno_before);
-      memset (save_regs, 0, sizeof (struct save_regs) * max_regno_before);
-      insert_saves_restores ();
-      free (save_regs);
-      BITMAP_FREE (live_pseudos);
-    }
-  BITMAP_FREE (consideration_pseudos);
-  return update_live_info_p;
-}
Index: lra-constraints.c
===================================================================
--- lra-constraints.c	(revision 181810)
+++ lra-constraints.c	(working copy)
@@ -19,7 +19,7 @@ 
    <http://www.gnu.org/licenses/>.  */
 
 
-/* This code selects alternatives for insnsy based on register
+/* This code selects alternatives for insns based on register
    allocation.  It has also a mode to do some simple code
    transformations.  */
 
@@ -94,52 +94,6 @@  static struct lra_static_insn_data *curr
 
 
 
-/* The page contains code to deal with the secondary memory.  */
-
-#ifdef SECONDARY_MEMORY_NEEDED
-
-/* Cached result of function get_secondary_mem.  */
-rtx lra_secondary_memory[NUM_MACHINE_MODES];
-
-/* Return a memory location that will be used to copy values from
-   registers in mode MODE.  */
-static rtx
-get_secondary_mem (enum machine_mode mode)
-{
-  rtx x;
-
-  /* By default, if MODE is narrower than a word, widen it to a word.
-     This is required because most machines that require these memory
-     locations do not support short load and stores from all registers
-     (e.g., FP registers).  */
-
-#ifdef SECONDARY_MEMORY_NEEDED_MODE
-  mode = SECONDARY_MEMORY_NEEDED_MODE (mode);
-#else
-  if (GET_MODE_BITSIZE (mode) < BITS_PER_WORD && INTEGRAL_MODE_P (mode))
-    mode = mode_for_size (BITS_PER_WORD, GET_MODE_CLASS (mode), 0);
-#endif
-
-  if (lra_secondary_memory[(int) mode] == NULL_RTX)
-    {
-      /* If this is the first time we've tried to get a MEM for this
-	  mode, allocate a new one.  `something_changed' in reload will
-	  get set by noticing that the frame size has changed.  */
-#ifdef SECONDARY_MEMORY_NEEDED_RTX
-      x = SECONDARY_MEMORY_NEEDED_RTX (mode);
-#else
-      x = assign_stack_local (mode, GET_MODE_SIZE (mode), 0);
-#endif
-      lra_secondary_memory[(int) mode]
-	 = lra_eliminate_regs_1 (x, GET_MODE (x), false, false, true);
-    }
-
-  return copy_rtx (lra_secondary_memory[(int) mode]);
-}
-#endif
-
-
-
 /* Start numbers for new registers and insns at the current constraints
    pass start.  */
 static int new_regno_start;
@@ -300,7 +254,12 @@  get_reload_reg (enum op_type type, enum 
 
   if (type == OP_OUT)
     {
-      *result_reg = lra_create_new_reg (mode, original, rclass, title);
+      /* A unique value is needed when a pseudo occurs as both an
+	 early clobber output and an input operand: it guarantees that
+	 the two reload pseudos have different values and can not be
+	 assigned to the same hard register.  */
+      *result_reg
+	= lra_create_new_reg_with_unique_value (mode, original, rclass, title);
       return true;
     }
   for (i = 0; i < curr_insn_input_reloads_num; i++)
@@ -997,10 +956,10 @@  get_op_class (rtx op)
   return NO_REGS;
 }
 
-/* Return generated insn sec_mem:=val if TO_P or val:=sec_mem
-   otherwise.  If modes of SEC_MEM and VAL are different, use SUBREG
-   for VAL to make them equal.  Assign CODE to the insn if it is not
-   recognized.
+/* Return generated insn mem_pseudo:=val if TO_P or val:=mem_pseudo
+   otherwise.  If modes of MEM_PSEUDO and VAL are different, use
+   SUBREG for VAL to make them equal.  Assign CODE to the insn if it
+   is not recognized.
 
    We can not use emit_move_insn in some cases because of used bad
    practice in machine descriptions.  For example, power can use only
@@ -1014,17 +973,19 @@  get_op_class (rtx op)
    explicitly because the generated move can be unrecognizable because
    of the predicates.  */
 static rtx
-emit_secondary_memory_move (bool to_p, rtx sec_mem, rtx val, int code)
+emit_spill_move (bool to_p, rtx mem_pseudo, rtx val, int code)
 {
   rtx insn, after;
 
   start_sequence ();
-  if (GET_MODE (sec_mem) != GET_MODE (val))
-    val = gen_rtx_SUBREG (GET_MODE (sec_mem), val, 0);
+  if (GET_MODE (mem_pseudo) != GET_MODE (val))
+    val = gen_rtx_SUBREG (GET_MODE (mem_pseudo),
+			  GET_CODE (val) == SUBREG ? SUBREG_REG (val) : val,
+			  0);
   if (to_p)
-    insn = gen_move_insn (sec_mem, val);
+    insn = gen_move_insn (mem_pseudo, val);
   else
-    insn = gen_move_insn (val, sec_mem);
+    insn = gen_move_insn (val, mem_pseudo);
   if (recog_memoized (insn) < 0)
     INSN_CODE (insn) = code;
   emit_insn (insn);
@@ -1035,9 +996,10 @@  emit_secondary_memory_move (bool to_p, r
 
 /* Process a special case insn (register move), return true if we
    don't need to process it anymore.  Return that RTL was changed
-   through CHANGE_P.  */
+   through CHANGE_P and that macro SECONDARY_MEMORY_NEEDED says to
+   use secondary memory through SEC_MEM_P.  */
 static bool
-check_and_process_move (bool *change_p)
+check_and_process_move (bool *change_p, bool *sec_mem_p)
 {
   int regno;
   rtx set, dest, src, dreg, sr, dr, sreg, new_reg, before, x, scratch_reg;
@@ -1045,7 +1007,7 @@  check_and_process_move (bool *change_p)
   secondary_reload_info sri;
   bool in_p, temp_assign_p;
 
-  *change_p = false;
+  *sec_mem_p = *change_p = false;
   if ((set = single_set (curr_insn)) == NULL || side_effects_p (set))
     return false;
   dreg = dest = SET_DEST (set);
@@ -1087,27 +1049,11 @@  check_and_process_move (bool *change_p)
     /* See comments above.  */
     return false;
 #ifdef SECONDARY_MEMORY_NEEDED
-  if (dclass != NO_REGS && sclass != NO_REGS)
+  if (dclass != NO_REGS && sclass != NO_REGS
+      && SECONDARY_MEMORY_NEEDED (sclass, dclass, GET_MODE (src)))
     {
-      rtx after;
-
-      if (SECONDARY_MEMORY_NEEDED (sclass, dclass, GET_MODE (src)))
-	{
-	  new_reg = get_secondary_mem (GET_MODE (dest));
-	  /* If the mode is changed, it should be wider.  */
-	  gcc_assert (GET_MODE_SIZE (GET_MODE (new_reg))
-		      >= GET_MODE_SIZE (GET_MODE (src)));
-	  after = emit_secondary_memory_move (false, new_reg, dest,
-					      INSN_CODE (curr_insn));
-	  lra_process_new_insns (curr_insn, NULL_RTX, after,
-				 "Inserting the sec. move");
-	  before = emit_secondary_memory_move (true, new_reg, src,
-					       INSN_CODE (curr_insn));
-	  lra_process_new_insns (curr_insn, before, NULL_RTX, "Changing on");
-	  lra_set_insn_deleted (curr_insn);
-	  *change_p = true;
-	  return true;
-	}
+      *sec_mem_p = true;
+      return false;
     }
 #endif
   sri.prev_sri = NULL;
@@ -1150,8 +1096,9 @@  check_and_process_move (bool *change_p)
   *change_p = true;
   new_reg = NULL_RTX;
   if (secondary_class != NO_REGS)
-    new_reg = lra_create_new_reg (GET_MODE (sreg), NULL_RTX,
-				  secondary_class, "secondary");
+    new_reg = lra_create_new_reg_with_unique_value (GET_MODE (sreg), NULL_RTX,
+						    secondary_class,
+						    "secondary");
   start_sequence ();
   if (sri.icode == CODE_FOR_nothing)
     lra_emit_move (new_reg, sreg);
@@ -1161,8 +1108,9 @@  check_and_process_move (bool *change_p)
 
       scratch_class = (reg_class_from_constraints
 		       (insn_data[sri.icode].operand[2].constraint));
-      scratch_reg = lra_create_new_reg (insn_data[sri.icode].operand[2].mode,
-					NULL_RTX, scratch_class, "scratch");
+      scratch_reg = (lra_create_new_reg_with_unique_value
+		     (insn_data[sri.icode].operand[2].mode, NULL_RTX,
+		      scratch_class, "scratch"));
       emit_insn (GEN_FCN (sri.icode) (new_reg != NULL_RTX ? new_reg : dest,
 				      sreg, scratch_reg));
     }
@@ -2551,7 +2499,7 @@  process_address (int nop, rtx *before, r
     {
       /* We don't use transformation 'base + disp => base + new index'
 	 because of some bad practice used in machine descriptions
-	 (see comments for emit_secondary_memory_move).  */
+	 (see comments for emit_spill_move).  */
       /* base + disp => new base  */
       new_reg = base_plus_disp_to_reg (mode, &ad);
       *addr_loc = new_reg;
@@ -2714,13 +2662,17 @@  curr_insn_transform (void)
   bool alt_p = false;
   /* Flag that the insn has been changed through a transformation.  */
   bool change_p;
+  bool sec_mem_p;
+#ifdef SECONDARY_MEMORY_NEEDED
+  bool use_sec_mem_p;
+#endif
   int max_regno_before;
   int reused_alternative_num;
 
   no_input_reloads_p = no_output_reloads_p = false;
-  goal_alt_number = 0;
+  goal_alt_number = -1;
 
-  if (check_and_process_move (&change_p))
+  if (check_and_process_move (&change_p, &sec_mem_p))
     return change_p;
 
   /* JUMP_INSNs and CALL_INSNs are not allowed to have any output
@@ -2861,7 +2813,7 @@  curr_insn_transform (void)
      alternative that we could reach by reloading the fewest operands.
      Reload so as to fit it.  */
 
-  if (! alt_p)
+  if (! alt_p && ! sec_mem_p)
     {
       /* No alternative works with reloads??  */
       if (INSN_CODE (curr_insn) >= 0)
@@ -2910,6 +2862,52 @@  curr_insn_transform (void)
       change_p = true;
     }
 
+#ifdef SECONDARY_MEMORY_NEEDED
+  /* Some target macros SECONDARY_MEMORY_NEEDED (e.g. x86) are defined
+     too conservatively.  So we use the secondary memory only if there
+     is no alternative without reloads.  */
+  use_sec_mem_p = false;
+  if (! alt_p)
+    use_sec_mem_p = true;
+  else if (sec_mem_p)
+    {
+      for (i = 0; i < n_operands; i++)
+	if (! goal_alt_win[i] && ! goal_alt_match_win[i])
+	  break;
+      use_sec_mem_p = i < n_operands;
+    }
+
+  if (use_sec_mem_p)
+    {
+      rtx new_reg, set, src, dest;
+      enum machine_mode sec_mode;
+
+      gcc_assert (sec_mem_p);
+      set = single_set (curr_insn);
+      gcc_assert (set != NULL_RTX && ! side_effects_p (set));
+      dest = SET_DEST (set);
+      src = SET_SRC (set);
+#ifdef SECONDARY_MEMORY_NEEDED_MODE
+      sec_mode = SECONDARY_MEMORY_NEEDED_MODE (GET_MODE (src));
+#else
+      sec_mode = GET_MODE (src);
+#endif
+      new_reg = lra_create_new_reg (sec_mode, NULL_RTX,
+				    NO_REGS, "secondary");
+      /* If the mode is changed, it should be wider.  */
+      gcc_assert (GET_MODE_SIZE (GET_MODE (new_reg))
+		  >= GET_MODE_SIZE (GET_MODE (src)));
+      after = emit_spill_move (false, new_reg, dest, INSN_CODE (curr_insn));
+      lra_process_new_insns (curr_insn, NULL_RTX, after,
+			     "Inserting the sec. move");
+      before = emit_spill_move (true, new_reg, src, INSN_CODE (curr_insn));
+      lra_process_new_insns (curr_insn, before, NULL_RTX, "Changing on");
+      lra_set_insn_deleted (curr_insn);
+      return true;
+    }
+#endif
+
+  gcc_assert (goal_alt_number >= 0);
   lra_set_used_insn_alternative (curr_insn, goal_alt_number);
 
   if (lra_dump_file != NULL)
@@ -3310,7 +3308,7 @@  debug_loc_equivalence_change_p (rtx *loc
 
 /* Maximum number of generated reload insns per an insn.  It is for
    preventing this pass cycling.  */
-#define MAX_RELOAD_INSNS_NUMBER 20
+#define MAX_RELOAD_INSNS_NUMBER LRA_MAX_INSN_RELOADS
 
 /* The current iteration number of this LRA pass.  */
 int lra_constraint_iter;
@@ -3489,12 +3487,6 @@  lra_constraints (bool first_p)
 void
 lra_contraints_init (void)
 {
-#ifdef SECONDARY_MEMORY_NEEDED
-  int mode;
-
-  for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
-    lra_secondary_memory[mode] = NULL_RTX;
-#endif
   init_indirect_mem ();
   bitmap_initialize (&lra_matched_pseudos, &reg_obstack);
   bitmap_initialize (&lra_bound_pseudos, &reg_obstack);
@@ -3511,21 +3503,47 @@  lra_contraints_finish (void)
 
 
 
-/* This page contains code to do inheritance transformations.  */
+/* This page contains code to do inheritance/split
+   transformations.  */
+
+/* Number of reloads passed so far in current EBB.  */
+static int reloads_num;
+
+/* Number of calls passed so far in current EBB.  */
+static int calls_num;
 
 /* Current reload pseudo check for validity of elements in
    USAGE_INSNS.  */
 static int curr_usage_insns_check;
-/* If an element value is equal to the above variable value, then the
-   corresponding element in USAGE_INSNS is valid.  */
-static int *usage_insns_check;
-/* Map: pseudo regno -> next insns in the current EBB which use the
-   original pseudo and the original pseudo value is not changed
-   between the current insn and the next insns.  In order words, if we
-   need to use the original pseudo value again in the next insns we
-   can try to use the value in a hard register from a reload insn of
-   the current insn.  */
-static rtx *usage_insns;
+
+/* Info about last usage of pseudos in EBB to do inheritance/split
+   transformation.  Inheritance transformation is done from a spilled
+   pseudo and split transformation from a pseudo assigned to a hard
+   register.  */
+struct usage_insns
+{
+  /* If the value is equal to CURR_USAGE_INSNS_CHECK, then the member
+     INSNS is valid.  INSNS is a chain of optional debug insns and a
+     finishing non-debug insn using the corresponding pseudo.  */
+  int check;
+  /* Value of global reloads_num at the corresponding next insns.  */
+  int reloads_num;
+  /* Value of global calls_num at the corresponding next insns.  */
+  int calls_num;
+  /* It can be true only for splitting.  It means that the restore
+     insn should be put after the insn given by the following member.  */
+  bool after_p;
+  /* Next insns in the current EBB which use the original pseudo and
+     the original pseudo value is not changed between the current insn
+     and the next insns.  In other words, if we need to use the
+     original pseudo value again in the next insns we can try to use
+     the value in a hard register from a reload insn of the current
+     insn.  */
+  rtx insns;
+};
+
+/* Map: pseudo regno -> corresponding pseudo usage insns.  */
+static struct usage_insns *usage_insns;
 
 /* Process all regs OLD_REGNO in location *LOC and change them on the
    reload pseudo NEW_REG.  Return true if any change was done.  */
@@ -3567,8 +3585,8 @@  substitute_pseudo (rtx *loc, int old_reg
   return result;
 }
 
-/* Pseudos involved in inheritance in the current EBB (inheritance and
-   original pseudos).  */
+/* Pseudos involved in inheritance/split in the current EBB
+   (inheritance/split and original pseudos).  */
 static bitmap_head check_only_pseudos;
 
 /* Do inheritance transformation for insn INSN defining (if DEF_P) or
@@ -3591,7 +3609,8 @@  static bitmap_head check_only_pseudos;
    class of ORIGINAL REGNO.  It will have unique value if UNIQ_P.  The
    unique value is necessary for correct assignment to inheritance
    pseudo for input of an insn which should be the same as output
-   (bound pseudos).  */
+   (bound pseudos).  Return true if we succeed in such
+   transformation.  */
 static bool
 inherit_reload_reg (bool def_p, bool uniq_p, int original_regno,
 		    enum reg_class cl, rtx insn, rtx next_usage_insns)
@@ -3600,18 +3619,21 @@  inherit_reload_reg (bool def_p, bool uni
   rtx original_reg = regno_reg_rtx[original_regno];
   rtx new_reg, new_insns, usage_insn;
 
+  gcc_assert (! usage_insns[original_regno].after_p);
   if (lra_dump_file != NULL)
     fprintf (lra_dump_file,
-	     "    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<\n");
+	     "    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<\n");
   if (! ira_reg_classes_intersect_p[cl][rclass])
     {
       if (lra_dump_file != NULL)
 	{
 	  fprintf (lra_dump_file,
-		   "    Rejecting inheritance for %d because of too different classes %s and %s\n",
-		   original_regno, reg_class_names[cl], reg_class_names[rclass]);
+		   "    Rejecting inheritance for %d "
+		   "because of too different classes %s and %s\n",
+		   original_regno, reg_class_names[cl],
+		   reg_class_names[rclass]);
 	  fprintf (lra_dump_file,
-		   "    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
+		   "    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
 	}
       return false;
     }
@@ -3643,11 +3665,12 @@  inherit_reload_reg (bool def_p, bool uni
       if (lra_dump_file != NULL)
 	{
 	  fprintf (lra_dump_file,
-		   "    Rejecting inheritance %d->%d as it results in 2 or more insns:\n",
+		   "    Rejecting inheritance %d->%d "
+		   "as it results in 2 or more insns:\n",
 		   original_regno, REGNO (new_reg));
 	  print_rtl_slim (lra_dump_file, new_insns, NULL_RTX, -1, 0);
 	  fprintf (lra_dump_file,
-		   "    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
+		   "    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
 	}
       return false;
     }
@@ -3655,8 +3678,12 @@  inherit_reload_reg (bool def_p, bool uni
   lra_update_insn_regno_info (insn);
   if (! def_p)
     {
-      usage_insns_check[original_regno] = curr_usage_insns_check;
-      usage_insns[original_regno] = new_insns;
+      /* We now have a new usage insn for original regno.  */
+      usage_insns[original_regno].check = curr_usage_insns_check;
+      usage_insns[original_regno].insns = new_insns;
+      usage_insns[original_regno].reloads_num = reloads_num;
+      usage_insns[original_regno].calls_num = calls_num;
+      usage_insns[original_regno].after_p = false;
     }
   if (lra_dump_file != NULL)
     fprintf (lra_dump_file, "    Original reg change %d->%d:\n",
@@ -3669,17 +3696,20 @@  inherit_reload_reg (bool def_p, bool uni
     lra_process_new_insns (insn, NULL_RTX, new_insns,
 			   "Add original<-inheritance");
   else
-    lra_process_new_insns (insn, new_insns, NULL_RTX, "Add inheritance<-pseudo");
+    lra_process_new_insns (insn, new_insns, NULL_RTX,
+			   "Add inheritance<-pseudo");
   while (next_usage_insns != NULL_RTX)
     {
       if (GET_CODE (next_usage_insns) != INSN_LIST)
 	{
 	  usage_insn = next_usage_insns;
+	  gcc_assert (NONDEBUG_INSN_P (usage_insn));
 	  next_usage_insns = NULL;
 	}
       else
 	{
 	  usage_insn = XEXP (next_usage_insns, 0);
+	  gcc_assert (DEBUG_INSN_P (usage_insn));
 	  next_usage_insns = XEXP (next_usage_insns, 1);
 	}
       substitute_pseudo (&usage_insn, original_regno, new_reg);
@@ -3698,12 +3728,226 @@  inherit_reload_reg (bool def_p, bool uni
   return true;
 }
 
+/* Return true if we need a caller save/restore for pseudo REGNO which
+   was assigned to a hard register.  */
+static inline bool
+need_for_call_save_p (int regno)
+{
+  gcc_assert (reg_renumber[regno] >= 0);
+  return (usage_insns[regno].calls_num < calls_num
+	  && (lra_hard_reg_set_intersection_p
+	      (reg_renumber[regno], PSEUDO_REGNO_MODE (regno),
+	       call_used_reg_set)));
+}
+
+/* Return true if we need a split for pseudo REGNO which was assigned
+   to a hard register.  POTENTIAL_RELOAD_HARD_REGS contains hard
+   registers which might be used for reloads since the EBB end.  It is
+   an approximation of the used hard registers in the split range.
+   The exact value would require expensive calculations.  If we are
+   too aggressive with splitting because of the approximation, the
+   split pseudo will keep the same hard register assignment and will
+   be removed in the undo pass.  We still need the approximation
+   because too aggressive splitting would make cost calculation in
+   the assignment pass too inaccurate: it generates many moves which
+   will probably be removed in the undo pass.  */
+static inline bool
+need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
+{
+  gcc_assert (reg_renumber[regno] >= 0);
+  return ((TEST_HARD_REG_BIT (potential_reload_hard_regs, reg_renumber[regno])
+	   && usage_insns[regno].reloads_num + 1 < reloads_num)
+	  || need_for_call_save_p (regno));
+}
+
+/* Return class for the split pseudo created from original pseudo with
+   ALLOCNO_CLASS and MODE which got a hard register with
+   HARD_REG_CLASS.  We choose subclass of ALLOCNO_CLASS which results
+   in no secondary memory movements.  */
+static enum reg_class
+choose_split_class (enum reg_class allocno_class,
+		    enum reg_class hard_reg_class ATTRIBUTE_UNUSED,
+		    enum machine_mode mode ATTRIBUTE_UNUSED)
+{
+#ifndef SECONDARY_MEMORY_NEEDED
+  return allocno_class;
+#else
+  int i;
+  enum reg_class cl, best_cl = NO_REGS;
+
+  if (allocno_class == hard_reg_class
+      && ! SECONDARY_MEMORY_NEEDED (hard_reg_class, hard_reg_class, mode))
+    best_cl = hard_reg_class;
+  for (i = 0;
+       (cl = reg_class_subclasses[allocno_class][i]) != LIM_REG_CLASSES;
+       i++)
+    if (! SECONDARY_MEMORY_NEEDED (cl, hard_reg_class, mode)
+	&& ! SECONDARY_MEMORY_NEEDED (hard_reg_class, cl, mode)
+	&& (best_cl == NO_REGS
+	    || (hard_reg_set_subset_p (reg_class_contents[best_cl],
+				       reg_class_contents[cl])
+		&& ! hard_reg_set_equal_p (reg_class_contents[best_cl],
+					   reg_class_contents[cl]))))
+      best_cl = cl;
+  return best_cl;
+#endif
+}
+
+/* Do split transformation for insn INSN defining (if DEF_P) or
+   using ORIGINAL_REGNO where the subsequent insn(s) in EBB (remember
+   we traverse insns in the backward direction) for the original regno
+   is NEXT_USAGE_INSNS.  The transformations look like
+
+     p <- ...             p <- ...
+     ...                  s <- p    (new insn -- save)
+     ...             =>
+     ...                  p <- s    (new insn -- restore)
+     <- ... p ...         <- ... p ...
+   or
+     <- ... p ...         <- ... p ...
+     ...                  s <- p    (new insn -- save)
+     ...             =>
+     ...                  p <- s    (new insn -- restore)
+     <- ... p ...         <- ... p ...
+
+   where p is an original pseudo which got a hard register and s is
+   a new split pseudo.  Return true if we succeed in such
+   transformation.  */
+static bool
+split_pseudo (bool def_p, int original_regno, rtx insn, rtx next_usage_insns)
+{
+  enum reg_class rclass = lra_get_allocno_class (original_regno);
+  rtx original_reg = regno_reg_rtx[original_regno];
+  rtx new_reg, save, restore, usage_insn;
+  bool after_p;
+  bool call_save_p = need_for_call_save_p (original_regno);
+
+  gcc_assert (reg_renumber[original_regno] >= 0);
+  if (lra_dump_file != NULL)
+    fprintf (lra_dump_file,
+	     "    ((((((((((((((((((((((((((((((((((((((((((((((((\n");
+  rclass = choose_split_class (rclass,
+			       REGNO_REG_CLASS (reg_renumber[original_regno]),
+			       GET_MODE (original_reg));
+  if (call_save_p)
+    {
+      enum machine_mode sec_mode;
+      
+#ifdef SECONDARY_MEMORY_NEEDED_MODE
+      sec_mode = SECONDARY_MEMORY_NEEDED_MODE (GET_MODE (original_reg));
+#else
+      sec_mode = GET_MODE (original_reg);
+#endif
+      new_reg = lra_create_new_reg (sec_mode, NULL_RTX,
+				    NO_REGS, "save");
+    }
+  else
+    {
+      new_reg = lra_create_new_reg (GET_MODE (original_reg), original_reg,
+				    rclass, "split");
+      reg_renumber[REGNO (new_reg)] = reg_renumber[original_regno];
+    }
+  if (call_save_p)
+    save = emit_spill_move (true, new_reg, original_reg, -1);
+  else
+    {
+      start_sequence ();
+      emit_move_insn (new_reg, original_reg);
+      save = get_insns ();
+      end_sequence ();
+    }
+  if (NEXT_INSN (save) != NULL_RTX)
+    {
+      if (lra_dump_file != NULL)
+	{
+	  fprintf (lra_dump_file,
+		   "    Rejecting split %d->%d resulting in > 1 %s save insns:\n",
+		   original_regno, REGNO (new_reg), call_save_p ? "call" : "");
+	  print_rtl_slim (lra_dump_file, save, NULL_RTX, -1, 0);
+	  fprintf (lra_dump_file,
+		   "    ))))))))))))))))))))))))))))))))))))))))))))))))\n");
+	}
+      return false;
+    }
+  if (call_save_p)
+    restore = emit_spill_move (false, new_reg, original_reg, -1);
+  else
+    {
+      start_sequence ();
+      emit_move_insn (original_reg, new_reg);
+      restore = get_insns ();
+      end_sequence ();
+    }
+  if (NEXT_INSN (restore) != NULL_RTX)
+    {
+      if (lra_dump_file != NULL)
+	{
+	  fprintf (lra_dump_file,
+		   "    Rejecting split %d->%d "
+		   "resulting in > 1 %s restore insns:\n",
+		   original_regno, REGNO (new_reg), call_save_p ? "call" : "");
+	  print_rtl_slim (lra_dump_file, restore, NULL_RTX, -1, 0);
+	  fprintf (lra_dump_file,
+		   "    ))))))))))))))))))))))))))))))))))))))))))))))))\n");
+	}
+      return false;
+    }
+  after_p = usage_insns[original_regno].after_p;
+  if (! def_p)
+    {
+      /* We now have a new usage insn for original regno.  */
+      usage_insns[original_regno].check = curr_usage_insns_check;
+      usage_insns[original_regno].insns = save;
+      usage_insns[original_regno].reloads_num = reloads_num;
+      usage_insns[original_regno].calls_num = calls_num;
+      usage_insns[original_regno].after_p = false;
+    }
+  lra_reg_info[REGNO (new_reg)].restore_regno = original_regno;
+  bitmap_set_bit (&check_only_pseudos, REGNO (new_reg));
+  bitmap_set_bit (&check_only_pseudos, original_regno);
+  bitmap_set_bit (&lra_split_pseudos, REGNO (new_reg));
+  for (;;)
+    {
+      if (GET_CODE (next_usage_insns) != INSN_LIST)
+	{
+	  usage_insn = next_usage_insns;
+	  break;
+	}
+      usage_insn = XEXP (next_usage_insns, 0);
+      gcc_assert (DEBUG_INSN_P (usage_insn));
+      next_usage_insns = XEXP (next_usage_insns, 1);
+      substitute_pseudo (&usage_insn, original_regno, new_reg);
+      lra_update_insn_regno_info (usage_insn);
+      if (lra_dump_file != NULL)
+	{
+	  fprintf (lra_dump_file, "    Split reuse change %d->%d:\n",
+		   original_regno, REGNO (new_reg));
+	  print_rtl_slim (lra_dump_file, usage_insn, usage_insn,
+			  -1, 0);
+	}
+    }
+  gcc_assert ((usage_insn != insn || (after_p && ! def_p))
+	      && NONDEBUG_INSN_P (usage_insn));
+  lra_process_new_insns (usage_insn, after_p ? NULL_RTX : restore,
+			 after_p ? restore : NULL_RTX,
+			 call_save_p
+			 ?  "Add pseudo<-save" : "Add pseudo<-split");
+  lra_process_new_insns (insn, def_p ? NULL_RTX : save,
+			 def_p ? save : NULL_RTX,
+			 call_save_p
+			 ?  "Add save<-pseudo" : "Add split<-pseudo");
+  if (lra_dump_file != NULL)
+    fprintf (lra_dump_file,
+	     "    ))))))))))))))))))))))))))))))))))))))))))))))))\n");
+  return true;
+}
+
 /* Check only pseudos living at the current program point in the
    current EBB.  */
 static bitmap_head live_pseudos;
 
 /* Update live info in EBB given by its HEAD and TAIL insns after
-   inheritance transformation.  The function can remove dead moves
+   inheritance/split transformation.  The function removes dead moves
    too.  */
 static void
 update_ebb_live_info (rtx head, rtx tail)
@@ -3713,17 +3957,13 @@  update_ebb_live_info (rtx head, rtx tail
   bool live_p;
   rtx prev_insn, set;
   bool remove_p;
-  basic_block first_bb, last_bb, prev_bb, curr_bb;
+  basic_block last_bb, prev_bb, curr_bb;
   bitmap_iterator bi;
   struct lra_insn_reg *reg;
   edge e;
   edge_iterator ei;
 
-  first_bb = BLOCK_FOR_INSN (head);
   last_bb = BLOCK_FOR_INSN (tail);
-  if (first_bb == last_bb)
-    /* It is a BB.  No need to update liveness info.  */
-    return;
   prev_bb = NULL;
   for (curr_insn = tail;
        curr_insn != PREV_INSN (head);
@@ -3763,8 +4003,6 @@  update_ebb_live_info (rtx head, rtx tail
 		    bitmap_clear_bit (DF_LR_OUT (curr_bb), j);
 		}
 	    }
-	  if (curr_bb == first_bb)
-	    break;
 	  prev_bb = curr_bb;
 	  bitmap_and (&live_pseudos,
 		      &check_only_pseudos, DF_LR_OUT (curr_bb));
@@ -3791,6 +4029,11 @@  update_ebb_live_info (rtx head, rtx tail
       for (reg = curr_id->regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT && reg->early_clobber && ! reg->subreg_p)
 	  bitmap_clear_bit (&live_pseudos, reg->regno);
+      /* It is quite important to remove dead move insns because it
+	 means removing dead stores, we don't need to process them for
+	 constraints, and unfortunately some subsequent optimizations
+	 (like shrink-wrapping) are currently based on the assumption
+	 that there are no trivial dead insns.  */
       if (remove_p)
 	{
 	  if (lra_dump_file != NULL)
@@ -3804,7 +4047,9 @@  update_ebb_live_info (rtx head, rtx tail
 }
 
 /* The structure describes info to do an inheritance for the current
-   insns.  */
+   insn.  We need to collect such info first before doing the
+   transformations because the transformations change the insn
+   internal representation.  */
 struct to_inherit
 {
   /* Original regno.  */
@@ -3813,7 +4058,8 @@  struct to_inherit
   rtx insns;
 };
 
-/* Array containing all info for doing inheritance from the current insn.  */
+/* Array containing all info for doing inheritance from the current
+   insn.  */
 static struct to_inherit to_inherit[LRA_MAX_INSN_RELOADS];
 
 /* Number elements in the previous array.  */
@@ -3833,26 +4079,150 @@  add_to_inherit (int regno, rtx insns)
   to_inherit[to_inherit_num++].insns = insns;
 }
 
-/* Do inheritance transformations in EBB starting with HEAD and
-   finishing on TAIL.  We process EBB insns in the reverse order.  */
+/* Set up RES by registers living on edges FROM except edge (FROM,
+   TO).  */
+static void
+get_live_on_other_edges (basic_block from, basic_block to, bitmap res)
+{
+  edge e;
+  edge_iterator ei;
+
+  gcc_assert (to != NULL);
+  bitmap_clear (res);
+  FOR_EACH_EDGE (e, ei, from->succs)
+    if (e->dest != to)
+      bitmap_ior_into (res, DF_LR_IN (e->dest));
+}
+	
+/* Return first (if FIRST_P) or last non-debug insn in basic block BB.
+   Return null if there are no non-debug insns in the block.  */
+static rtx
+get_non_debug_insn (bool first_p, basic_block bb)
+{
+  rtx insn;
+
+  for (insn = first_p ? BB_HEAD (bb) : BB_END (bb);
+       insn != NULL_RTX && ! NONDEBUG_INSN_P (insn);
+       insn = first_p ? NEXT_INSN (insn) : PREV_INSN (insn))
+    ;
+  if (insn != NULL_RTX && BLOCK_FOR_INSN (insn) != bb)
+    insn = NULL_RTX;
+  return insn;
+}
+
+/* Used to hold temporary results of some bitmap calculations.  */
+static bitmap_head temp_bitmap;
+
+/* The function is used to form the list of REGNO usages, which consists
+   of optional debug insns finished by a non-debug insn using REGNO.
+   RELOADS_NUM is the current number of reload insns processed so far.  */
 static void
+add_next_usage_insn (int regno, rtx insn, int reloads_num)
+{
+  rtx next_usage_insns;
+  
+  if (usage_insns[regno].check == curr_usage_insns_check
+      && (next_usage_insns = usage_insns[regno].insns) != NULL_RTX
+      && DEBUG_INSN_P (insn))
+    {
+      /* Check that we did not add the debug insn yet.  */
+      if (next_usage_insns != insn
+	  && (GET_CODE (next_usage_insns) != INSN_LIST
+	      || XEXP (next_usage_insns, 0) != insn))
+	usage_insns[regno].insns = gen_rtx_INSN_LIST (VOIDmode, insn,
+						      next_usage_insns);
+    }
+  else if (NONDEBUG_INSN_P (insn))
+    {
+      usage_insns[regno].check = curr_usage_insns_check;
+      usage_insns[regno].insns = insn;
+      usage_insns[regno].reloads_num = reloads_num;
+      usage_insns[regno].calls_num = calls_num;
+      usage_insns[regno].after_p = false;
+    }
+  else
+    usage_insns[regno].check = 0;
+}
+  
+/* Do inheritance/split transformations in EBB starting with HEAD and
+   finishing on TAIL.  We process EBB insns in the reverse order.
+   Return true if we did any inheritance/split transformation in the
+   EBB.
+
+   We should avoid excessive splitting which results in worse code
+   because of inaccurate cost calculations for spilling new split
+   pseudos in such a case.  To achieve this we do splitting only if
+   register pressure is high in the given basic block and there are
+   reload pseudos requiring hard registers.  We could do more register
+   pressure calculations at any given program point to avoid
+   unnecessary splitting even more, but it is too expensive and the
+   current approach works well enough.  */
+static bool
 inherit_in_ebb (rtx head, rtx tail)
 {
   int i, src_regno, dst_regno;
-  bool succ_p;
-  rtx prev_insn, next_usage_insns, set;
+  bool change_p, succ_p;
+  rtx prev_insn, next_usage_insns, set, first_insn, last_insn;
   enum reg_class cl;
   struct lra_insn_reg *reg;
+  basic_block last_processed_bb, curr_bb = NULL;
+  HARD_REG_SET potential_reload_hard_regs, live_hard_regs;
+  bitmap to_process;
+  unsigned int j;
+  bitmap_iterator bi;
+  bool head_p, after_p;
+
 
+  change_p = false;
   curr_usage_insns_check++;
+  reloads_num = calls_num = 0;
   /* Remeber: we can remove the current insn.  */
   bitmap_clear (&check_only_pseudos);
+  last_processed_bb = NULL;
+  CLEAR_HARD_REG_SET (potential_reload_hard_regs);
+  CLEAR_HARD_REG_SET (live_hard_regs);
+  /* We don't process new insns generated in the loop.  */
   for (curr_insn = tail; curr_insn != PREV_INSN (head); curr_insn = prev_insn)
     {
       prev_insn = PREV_INSN (curr_insn);
-      if (! INSN_P (curr_insn))
-	continue;
-      curr_id = lra_get_insn_recog_data (curr_insn);
+      if (BLOCK_FOR_INSN (curr_insn) != NULL)
+	curr_bb = BLOCK_FOR_INSN (curr_insn);
+      if (last_processed_bb != curr_bb)
+	{
+	  /* We are at the end of BB.  Add qualified living
+	     pseudos for potential splitting.  */
+	  to_process = DF_LR_OUT (curr_bb);
+	  if (last_processed_bb != NULL)
+	    {
+	      /* We are somewhere in the middle of EBB.  */
+	      get_live_on_other_edges (curr_bb, last_processed_bb, &temp_bitmap);
+	      to_process = &temp_bitmap;
+	    }
+	  last_processed_bb = curr_bb;
+	  last_insn = get_non_debug_insn (false, curr_bb);
+	  after_p = (last_insn != NULL_RTX && ! JUMP_P (last_insn)
+		     && (! CALL_P (last_insn)
+			 || find_reg_note (last_insn,
+					   REG_NORETURN, NULL) == NULL_RTX));
+	  REG_SET_TO_HARD_REG_SET (live_hard_regs, DF_LR_OUT (curr_bb));
+	  IOR_HARD_REG_SET (live_hard_regs, eliminable_regset);
+	  IOR_HARD_REG_SET (live_hard_regs, lra_no_alloc_regs);
+	  CLEAR_HARD_REG_SET (potential_reload_hard_regs);
+	  EXECUTE_IF_SET_IN_BITMAP (to_process,
+				    FIRST_PSEUDO_REGISTER, j, bi)
+	    if ((int) j < lra_constraint_new_regno_start
+		&& reg_renumber[j] >= 0)
+	      {
+		lra_add_hard_reg_set (reg_renumber[j],
+				      PSEUDO_REGNO_MODE (j),
+				      &live_hard_regs);
+		usage_insns[j].check = curr_usage_insns_check;
+		usage_insns[j].insns = last_insn;
+		usage_insns[j].reloads_num = reloads_num;
+		usage_insns[j].calls_num = calls_num;
+		usage_insns[j].after_p = after_p;
+	      }
+	}
       src_regno = dst_regno = -1;
       if (NONDEBUG_INSN_P (curr_insn)
 	  && (set = single_set (curr_insn)) != NULL_RTX
@@ -3868,108 +4238,212 @@  inherit_in_ebb (rtx head, rtx tail)
 	  && (cl = lra_get_allocno_class (dst_regno)) != NO_REGS)
 	{
 	  /* 'reload_pseudo <- original_pseudo'.  */
+	  reloads_num++;
 	  succ_p = false;
-	  if (usage_insns_check[src_regno] == curr_usage_insns_check
-	      && (next_usage_insns = usage_insns[src_regno]) != NULL_RTX)
+	  if (usage_insns[src_regno].check == curr_usage_insns_check
+	      && (next_usage_insns = usage_insns[src_regno].insns) != NULL_RTX)
 	    succ_p = inherit_reload_reg (false,
 					 bitmap_bit_p (&lra_matched_pseudos,
 						       dst_regno),
 					 src_regno, cl,
 					 curr_insn, next_usage_insns);
-	  if (! succ_p)
+	  if (succ_p)
+	    change_p = true;
+	  else
 	    {
-	      usage_insns_check[src_regno] = curr_usage_insns_check;
-	      usage_insns[src_regno] = curr_insn;
+	      usage_insns[src_regno].check = curr_usage_insns_check;
+	      usage_insns[src_regno].insns = curr_insn;
+	      usage_insns[src_regno].reloads_num = reloads_num;
+	      usage_insns[src_regno].calls_num = calls_num;
+	      usage_insns[src_regno].after_p = false;
 	    }
+	  if (cl != NO_REGS
+	      && hard_reg_set_subset_p (reg_class_contents[cl],
+					live_hard_regs))
+	    IOR_HARD_REG_SET (potential_reload_hard_regs, reg_class_contents[cl]);
 	}
       else if (src_regno >= lra_constraint_new_regno_start
 	       && dst_regno < lra_constraint_new_regno_start
 	       && dst_regno >= FIRST_PSEUDO_REGISTER
 	       && reg_renumber[dst_regno] < 0
 	       && (cl = lra_get_allocno_class (src_regno)) != NO_REGS
-	       && usage_insns_check[dst_regno] == curr_usage_insns_check
-	       && (next_usage_insns = usage_insns[dst_regno]) != NULL_RTX)
+	       && usage_insns[dst_regno].check == curr_usage_insns_check
+	       && (next_usage_insns = usage_insns[dst_regno].insns) != NULL_RTX)
 	{
+	  reloads_num++;
 	  /* 'original_pseudo <- reload_pseudo'.  */
-	  inherit_reload_reg (true, false, dst_regno, cl,
-			      curr_insn, next_usage_insns);
+	  if (inherit_reload_reg (true, false, dst_regno, cl,
+				  curr_insn, next_usage_insns))
+	    change_p = true;
 	  /* Invalidate.  */
-	  usage_insns_check[dst_regno] = 0;
+	  usage_insns[dst_regno].check = 0;
+	  if (cl != NO_REGS
+	      && hard_reg_set_subset_p (reg_class_contents[cl], live_hard_regs))
+	    IOR_HARD_REG_SET (potential_reload_hard_regs, reg_class_contents[cl]);
 	}
-      else
+      else if (INSN_P (curr_insn))
 	{
+	  int max_uid = get_max_uid ();
+
+	  curr_id = lra_get_insn_recog_data (curr_insn);
 	  to_inherit_num = 0;
+	  /* Process insn definitions.  */
 	  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
 	    if (reg->type != OP_IN
 		&& (dst_regno = reg->regno) >= FIRST_PSEUDO_REGISTER
-		&& dst_regno < lra_constraint_new_regno_start)
+		&& dst_regno < lra_constraint_new_regno_start
+		&& usage_insns[dst_regno].check == curr_usage_insns_check
+		&& (next_usage_insns
+		    = usage_insns[dst_regno].insns) != NULL_RTX)
 	      {
-		if (reg->type == OP_OUT && ! reg->subreg_p
-		    && reg_renumber[dst_regno] < 0
-		    && usage_insns_check[dst_regno] == curr_usage_insns_check
-		    && (next_usage_insns = usage_insns[dst_regno]) != NULL_RTX)
+		if (reg->type == OP_OUT
+		    && reg_renumber[dst_regno] < 0 && ! reg->subreg_p)
 		  {
-		    struct lra_insn_reg *reg2;
-
-		    for (reg2 = curr_id->regs; reg2 != NULL; reg2 = reg2->next)
-		      if (reg2->type != OP_OUT && reg2->regno == dst_regno)
+		    struct lra_insn_reg *r;
+		    
+		    for (r = curr_id->regs; r != NULL; r = r->next)
+		      if (r->type != OP_OUT && r->regno == dst_regno)
 			break;
-		    if (reg2 == NULL)
-		      /* We can not do inheritance right now because
-			 the current insn reg info (chain regs) can
-			 change after that.  */
+		    /* Don't do inheritance if the pseudo is also
+		       used in the insn.  */
+		    if (r == NULL)
+		      /* We can not do inheritance right now
+			 because the current insn reg info (chain
+			 regs) can change after that.  */
 		      add_to_inherit (dst_regno, next_usage_insns);
 		  }
+		/* We can not process one pseudo twice here
+		   because of usage_insns invalidation.  */
+		if (reg_renumber[dst_regno] >= 0)
+		  {
+		    if (need_for_split_p (potential_reload_hard_regs,
+					  dst_regno)
+			&& split_pseudo (true, dst_regno, curr_insn,
+					 next_usage_insns))
+		      change_p = true;
+		    if (! reg->subreg_p)
+		      {
+			HARD_REG_SET s;
+			
+			CLEAR_HARD_REG_SET (s);
+			lra_add_hard_reg_set (reg_renumber[dst_regno],
+					      PSEUDO_REGNO_MODE (dst_regno),
+					      &s);
+			AND_COMPL_HARD_REG_SET (live_hard_regs, s);
+		      }
+		  }
 		/* Invalidate.  */
-		usage_insns_check[dst_regno] = 0;
+		usage_insns[dst_regno].check = 0;
 	      }
 	  for (i = 0; i < to_inherit_num; i++)
-	    inherit_reload_reg (true, false, to_inherit[i].regno, ALL_REGS,
-				curr_insn, to_inherit[i].insns);
+	    if (inherit_reload_reg (true, false, to_inherit[i].regno, ALL_REGS,
+				    curr_insn, to_inherit[i].insns))
+	      change_p = true;
+	  if (CALL_P (curr_insn))
+	    calls_num++;
 	  to_inherit_num = 0;
+	  /* Process insn usages.  */
 	  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
-	    if (reg->type == OP_IN
+	    if (reg->type != OP_OUT
 		&& (src_regno = reg->regno) >= FIRST_PSEUDO_REGISTER
-		&& src_regno < lra_constraint_new_regno_start
-		&& reg_renumber[src_regno] < 0)
+		&& src_regno < lra_constraint_new_regno_start)
 	      {
-		if (usage_insns_check[src_regno] == curr_usage_insns_check
-		    && (next_usage_insns = usage_insns[src_regno]) != NULL_RTX
-		    && NONDEBUG_INSN_P (curr_insn))
-		  add_to_inherit (src_regno, next_usage_insns);
-		/* Set usages.  */
-		else if (usage_insns_check[src_regno] == curr_usage_insns_check
-			 && (next_usage_insns = usage_insns[src_regno]) != NULL_RTX
-			 && DEBUG_INSN_P (curr_insn))
+		if (reg_renumber[src_regno] < 0 && reg->type == OP_IN)
 		  {
-		    /* Check that we did not add the debug insn yet.  */
-		    if (next_usage_insns != curr_insn
-			&& (GET_CODE (next_usage_insns) != INSN_LIST
-			    || XEXP (next_usage_insns, 0) != curr_insn))
-		      usage_insns[src_regno]
-			= gen_rtx_INSN_LIST (VOIDmode, curr_insn,
-					     next_usage_insns);
+		    if (usage_insns[src_regno].check == curr_usage_insns_check
+			&& (next_usage_insns
+			    = usage_insns[src_regno].insns) != NULL_RTX
+			&& NONDEBUG_INSN_P (curr_insn))
+		      add_to_inherit (src_regno, next_usage_insns);
+		    /* Add usages.  */
+		    else
+		      add_next_usage_insn (src_regno, curr_insn, reloads_num);
 		  }
-		else if (NONDEBUG_INSN_P (curr_insn))
+		else if (reg_renumber[src_regno] >= 0)
 		  {
-		    usage_insns_check[src_regno] = curr_usage_insns_check;
-		    usage_insns[src_regno] = curr_insn;
+		    bool ok_p = false;
+		    
+		    if (usage_insns[src_regno].check == curr_usage_insns_check
+			&& (next_usage_insns
+			    = usage_insns[src_regno].insns) != NULL_RTX
+			&& reg->type == OP_IN
+			/* To avoid processing the pseudo twice or
+			   more.  */
+			&& ((GET_CODE (next_usage_insns) != INSN_LIST
+			     && INSN_UID (next_usage_insns) < max_uid)
+			    || (GET_CODE (next_usage_insns) == INSN_LIST
+				&& (INSN_UID (XEXP (next_usage_insns, 0))
+				    < max_uid)))
+			&& need_for_split_p (potential_reload_hard_regs,
+					     src_regno)
+			&& NONDEBUG_INSN_P (curr_insn)
+			&& split_pseudo (false, src_regno, curr_insn,
+					 next_usage_insns))
+		      ok_p = change_p = true;
+		    if (NONDEBUG_INSN_P (curr_insn))
+		      lra_add_hard_reg_set (reg_renumber[src_regno],
+					    PSEUDO_REGNO_MODE (src_regno),
+					    &live_hard_regs);
+		    if (! ok_p)
+		      add_next_usage_insn (src_regno, curr_insn, reloads_num);
 		  }
-		else
-		  usage_insns_check[src_regno] = 0;
 	      }
 	  for (i = 0; i < to_inherit_num; i++)
 	    {
 	      src_regno = to_inherit[i].regno;
-	      if (! inherit_reload_reg (false, false, src_regno, ALL_REGS,
-					curr_insn, to_inherit[i].insns))
+	      if (inherit_reload_reg (false, false, src_regno, ALL_REGS,
+				      curr_insn, to_inherit[i].insns))
+		change_p = true;
+	      else
 		{
-		  usage_insns_check[src_regno] = curr_usage_insns_check;
-		  usage_insns[src_regno] = curr_insn;
+		  usage_insns[src_regno].check = curr_usage_insns_check;
+		  usage_insns[src_regno].insns = curr_insn;
+		  usage_insns[src_regno].reloads_num = reloads_num;
+		  usage_insns[src_regno].calls_num = calls_num;
+		  usage_insns[src_regno].after_p = false;
 		}
 	    }
 	}
+      /* We reached the start of the current basic block.  */
+      if (prev_insn == NULL_RTX || prev_insn == PREV_INSN (head)
+	  || BLOCK_FOR_INSN (prev_insn) != curr_bb)
+	{
+	  /* We reached the beginning of the current block -- do
+	     the rest of splitting in the current BB.  */
+	  first_insn = get_non_debug_insn (true, curr_bb);
+	  to_process = DF_LR_IN (curr_bb);
+	  if (BLOCK_FOR_INSN (head) != curr_bb)
+	    {	
+	      /* We are somewhere in the middle of EBB.  */
+	      get_live_on_other_edges (EDGE_PRED (curr_bb, 0)->src,
+				       curr_bb, &temp_bitmap);
+	      to_process = &temp_bitmap;
+	    }
+	  head_p = true;
+	  EXECUTE_IF_SET_IN_BITMAP (to_process,
+				    FIRST_PSEUDO_REGISTER, j, bi)
+	    if ((int) j < lra_constraint_new_regno_start
+		&& reg_renumber[j] >= 0
+		&& usage_insns[j].check == curr_usage_insns_check
+		&& (next_usage_insns = usage_insns[j].insns) != NULL_RTX)
+	      {
+		if (first_insn != NULL_RTX
+		    && need_for_split_p (potential_reload_hard_regs, j))
+		  {
+		    if (lra_dump_file != NULL && head_p)
+		      {
+			fprintf (lra_dump_file,
+				 "  ----------------------------------\n");
+			head_p = false;
+		      }
+		    if (split_pseudo (false, j, first_insn, next_usage_insns))
+		      change_p = true;
+		  }
+		usage_insns[j].check = 0;
+	      }
+	}
     }
+  return change_p;
 }
 
 /* This value affects EBB forming.  If probability of edge from EBB to
@@ -3977,15 +4451,15 @@  inherit_in_ebb (rtx head, rtx tail)
    to EBB.  */ 
 #define EBB_PROBABILITY_CUTOFF (REG_BR_PROB_BASE / 2)
 
-/* Current number of inheritance iteration.  */
+/* Current number of inheritance/split iteration.  */
 int lra_inheritance_iter;
 
-/* Entry function for inheritance pass.  */
+/* Entry function for inheritance/split pass.  */
 void
 lra_inheritance (void)
 {
-  basic_block bb;
-  rtx head, tail;
+  int i;
+  basic_block bb, start_bb;
   edge e;
 
   lra_inheritance_iter++;
@@ -3993,17 +4467,17 @@  lra_inheritance (void)
     fprintf (lra_dump_file, "\n********** Inheritance #%d: **********\n\n",
 	     lra_inheritance_iter);
   curr_usage_insns_check = 0;
-  usage_insns_check = (int *) xmalloc (sizeof (int)
-				       * lra_constraint_new_regno_start);
-  memset (usage_insns_check, 0,
-	  sizeof (int) * lra_constraint_new_regno_start);
   usage_insns
-    = (rtx *) xmalloc (sizeof (rtx) * lra_constraint_new_regno_start);
+    = (struct usage_insns *) xmalloc (sizeof (struct usage_insns)
+				      * lra_constraint_new_regno_start);
+  for (i = 0; i < lra_constraint_new_regno_start; i++)
+    usage_insns[i].check = 0;
   bitmap_initialize (&check_only_pseudos, &reg_obstack);
   bitmap_initialize (&live_pseudos, &reg_obstack);
+  bitmap_initialize (&temp_bitmap, &reg_obstack);
   FOR_EACH_BB (bb)
     {
-      head = BB_HEAD (bb);
+      start_bb = bb;
       if (lra_dump_file != NULL)
 	fprintf (lra_dump_file, "EBB");
       /* Form a EBB starting with BB.  */
@@ -4011,7 +4485,6 @@  lra_inheritance (void)
 	{
 	  if (lra_dump_file != NULL)
 	    fprintf (lra_dump_file, " %d", bb->index);
-	  tail = BB_END (bb);
 	  if (bb->next_bb == EXIT_BLOCK_PTR || LABEL_P (BB_HEAD (bb->next_bb)))
 	    break;
 	  e = find_fallthru_edge (bb->succs);
@@ -4023,28 +4496,30 @@  lra_inheritance (void)
 	}
       if (lra_dump_file != NULL)
 	fprintf (lra_dump_file, "\n");
-      inherit_in_ebb (head, tail);
-      update_ebb_live_info (head, tail);
+      if (inherit_in_ebb (BB_HEAD (start_bb), BB_END (bb)))
+	/* Remember that the EBB head and tail can change in
+	   inherit_in_ebb.  */
+	update_ebb_live_info (BB_HEAD (start_bb), BB_END (bb));
     }
+  bitmap_clear (&temp_bitmap);
   bitmap_clear (&live_pseudos);
   bitmap_clear (&check_only_pseudos);
-  free (usage_insns_check);
   free (usage_insns);
 }
 
 
 
-/* This page contains code to undo failed inheritance
+/* This page contains code to undo failed inheritance/split
    transformations.  */
 
-/* Current number of iteration undoing inheritance.  */
+/* Current number of iteration undoing inheritance/split.  */
 int lra_undo_inheritance_iter;
 
 /* Temporary bitmaps used during calls of FIX_BB_LIVE_INFO.  */
 static bitmap_head temp_bitmap_head;
 
 /* Fix BB live info LIVE after removing pseudos created on pass doing
-   inheritance which are REMOVED_PSEUDOS.  */
+   inheritance/split which are REMOVED_PSEUDOS.  */
 static void
 fix_bb_live_info (bitmap live, bitmap removed_pseudos)
 {
@@ -4059,17 +4534,33 @@  fix_bb_live_info (bitmap live, bitmap re
     }
 }
 
-/* Remove inheritance pseudos which are in REMOVE_PSEUDOS and return
-   true if we did any change.  The undo transformations looks like
+/* Return regno of the (subreg of) pseudo REG.  Otherwise, return
+   a negative number.  */
+static int
+get_pseudo_regno (rtx reg)
+{
+  int regno;
+
+  if (GET_CODE (reg) == SUBREG)
+    reg = SUBREG_REG (reg);
+  if (REG_P (reg)
+      && (regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER)
+    return regno;
+  return -1;
+}
+
+/* Remove inheritance/split pseudos which are in REMOVE_PSEUDOS and
+   return true if we did any change.  The undo transformations for
+   inheritance look like
       i <- i2
       p <- i      =>   p <- i2
    or removing
       p <- i, i <- p, and i <- i3
    where p is original pseudo from which inheritance pseudo i was
    created, i and i3 are removed inheritance pseudos, i2 is another
-   not removed inheritance pseudo.  All other occurrences of removed
-   inheritance pseudos are changed on the corresponding original
-   pseudos.  */
+   not removed inheritance pseudo.  All split pseudos or other
+   occurrences of removed inheritance pseudos are changed to the
+   corresponding original pseudos.  */
 static bool
 remove_inheritance_pseudos (bitmap remove_pseudos)
 {
@@ -4089,12 +4580,15 @@  remove_inheritance_pseudos (bitmap remov
 	  if (! INSN_P (curr_insn))
 	    continue;
 	  done_p = false;
-	  if (change_p
-	      && NONDEBUG_INSN_P (curr_insn)
-	      && (set = single_set (curr_insn)) != NULL_RTX
-	      && REG_P (SET_DEST (set)) && REG_P (SET_SRC (set))
-	      && (sregno = REGNO (SET_SRC (set))) >= FIRST_PSEUDO_REGISTER
-	      && (dregno = REGNO (SET_DEST (set))) >= FIRST_PSEUDO_REGISTER)
+	  sregno = dregno = -1;
+	  if (change_p && NONDEBUG_INSN_P (curr_insn)
+	      && (set = single_set (curr_insn)) != NULL_RTX)
+	    {
+	      dregno = get_pseudo_regno (SET_DEST (set));
+	      sregno = get_pseudo_regno (SET_SRC (set));
+	    }
+	  
+	  if (sregno >= 0 && dregno >= 0)
 	    {
 	      if ((bitmap_bit_p (remove_pseudos, sregno)
 		   && (lra_reg_info[sregno].restore_regno == dregno
@@ -4105,35 +4599,46 @@  remove_inheritance_pseudos (bitmap remov
 		      && lra_reg_info[dregno].restore_regno == sregno))
 		/* One of the following cases:
 		     original <- removed inheritance pseudo
-                     removed inheritance pseudo <- another removed inherit pseudo
-                     removed inheritance pseudo <- original pseudo */
+		     removed inherit pseudo <- another removed inherit pseudo
+		     removed inherit pseudo <- original pseudo
+		   Or
+		     removed_split_pseudo <- original_pseudo
+		     original_pseudo <- removed_split_pseudo */
 		{
 		  if (lra_dump_file != NULL)
 		    {
-		      fprintf (lra_dump_file,
-			       "    Removing inheritance:\n");
+		      fprintf (lra_dump_file, "    Removing %s:\n",
+			       bitmap_bit_p (&lra_split_pseudos, sregno)
+			       || bitmap_bit_p (&lra_split_pseudos, dregno)
+			       ? "split" : "inheritance");
 		      print_rtl_slim (lra_dump_file,
 				      curr_insn, curr_insn, -1, 0);
 		    }
 		  lra_set_insn_deleted (curr_insn);
 		  done_p = true;
 		}
-	      else if (bitmap_bit_p (remove_pseudos, sregno))
+	      else if (bitmap_bit_p (remove_pseudos, sregno)
+		       && bitmap_bit_p (&lra_inheritance_pseudos, sregno))
 		{
 		  /* Search the following pattern:
-		       inherit_pseudo1 <- inherit_pseudo2
-                       reload_pseudo <- inherit_pseudo1
+		       inherit_or_split_pseudo1 <- inherit_or_split_pseudo2
+		       original_pseudo <- inherit_or_split_pseudo1
 		    where the 2nd insn is the current insn and
-		    inherit_pseudo2 is not removed.  If it is found,
+		    inherit_or_split_pseudo2 is not removed.  If it is found,
 		    change the current insn onto:
-		       reload_pseudo1 <- inherit_pseudo2.  */
+		       original_pseudo <- inherit_or_split_pseudo2.  */
 		  for (prev_insn = PREV_INSN (curr_insn);
 		       prev_insn != NULL_RTX && ! NONDEBUG_INSN_P (prev_insn);
 		       prev_insn = PREV_INSN (prev_insn))
 		    ;
 		  if (prev_insn != NULL_RTX && BLOCK_FOR_INSN (prev_insn) == bb
 		      && (prev_set = single_set (prev_insn)) != NULL_RTX
-		      && REG_P (SET_DEST (prev_set)) && REG_P (SET_SRC (prev_set))
+		      /* There should be no subregs in the insn we are
+			 searching for because only the original reg
+			 might be in a subreg when we changed the mode
+			 of load/store for splitting.  */
+		      && REG_P (SET_DEST (prev_set))
+		      && REG_P (SET_SRC (prev_set))
 		      && (int) REGNO (SET_DEST (prev_set)) == sregno
 		      && ((prev_sregno = REGNO (SET_SRC (prev_set)))
 			  >= FIRST_PSEUDO_REGISTER)
@@ -4141,8 +4646,14 @@  remove_inheritance_pseudos (bitmap remov
 			  == lra_reg_info[prev_sregno].restore_regno)
 		      && ! bitmap_bit_p (remove_pseudos, prev_sregno))
 		    {
-		      SET_SRC (set) = SET_SRC (prev_set);
-		      lra_update_insn_regno_info (curr_insn);
+		      gcc_assert (GET_MODE (SET_SRC (prev_set))
+				  == GET_MODE (regno_reg_rtx[sregno]));
+		      if (GET_CODE (SET_SRC (set)) == SUBREG)
+			SUBREG_REG (SET_SRC (set)) = SET_SRC (prev_set);
+		      else
+			SET_SRC (set) = SET_SRC (prev_set);
+		      lra_push_insn_and_update_insn_regno_info (curr_insn);
+		      lra_set_used_insn_alternative_by_uid (INSN_UID (curr_insn), -1);
 		      done_p = true;
 		      if (lra_dump_file != NULL)
 			{
@@ -4163,7 +4674,6 @@  remove_inheritance_pseudos (bitmap remov
 		if ((regno = reg->regno) >= lra_constraint_new_regno_start
 		    && lra_reg_info[regno].restore_regno >= 0)
 		  {
-		    gcc_assert (bitmap_bit_p (&lra_inheritance_pseudos, regno));
 		    if (change_p && bitmap_bit_p (remove_pseudos, regno))
 		      {
 			restore_regno = lra_reg_info[regno].restore_regno;
@@ -4201,7 +4711,9 @@  remove_inheritance_pseudos (bitmap remov
 bool
 lra_undo_inheritance (void)
 {
-  unsigned int inherit_regno;
+  unsigned int regno;
+  int restore_regno;
+  int n_all_inherit, n_inherit, n_all_split, n_split;
   bitmap_head remove_pseudos;
   bitmap_iterator bi;
   bool change_p;
@@ -4212,14 +4724,46 @@  lra_undo_inheritance (void)
 	     "\n********** Undoing inheritance #%d: **********\n\n",
 	     lra_undo_inheritance_iter);
   bitmap_initialize (&remove_pseudos, &reg_obstack);
-  EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, inherit_regno, bi)
-    if (lra_reg_info[inherit_regno].restore_regno >= 0
-	&& reg_renumber[inherit_regno] < 0)
-      bitmap_set_bit (&remove_pseudos, inherit_regno);
+  n_inherit = n_all_inherit = 0;
+  EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, regno, bi)
+    if (lra_reg_info[regno].restore_regno >= 0)
+      {
+	n_all_inherit++;
+	if (reg_renumber[regno] < 0)
+	  bitmap_set_bit (&remove_pseudos, regno);
+	else
+	  n_inherit++;
+      }
+  if (lra_dump_file != NULL && n_all_inherit != 0)
+    fprintf (lra_dump_file, "Inherit %d out of %d (%.2f%%)\n",
+	     n_inherit, n_all_inherit,
+	     (double) n_inherit / n_all_inherit * 100);
+  n_split = n_all_split = 0;
+  EXECUTE_IF_SET_IN_BITMAP (&lra_split_pseudos, 0, regno, bi)
+    if ((restore_regno = lra_reg_info[regno].restore_regno) >= 0)
+      {
+	n_all_split++;
+	if (reg_renumber[restore_regno] < 0
+	    || reg_renumber[regno] == reg_renumber[restore_regno])
+	  bitmap_set_bit (&remove_pseudos, regno);
+	else
+	  {
+	    n_split++;
+	    if (lra_dump_file != NULL)
+	      fprintf (lra_dump_file, "      Keep split r%d (orig=r%d)\n",
+		       regno, restore_regno);
+	  }
+      }
+  if (lra_dump_file != NULL && n_all_split != 0)
+    fprintf (lra_dump_file, "Split %d out of %d (%.2f%%)\n",
+	     n_split, n_all_split,
+	     (double) n_split / n_all_split * 100);
   change_p = remove_inheritance_pseudos (&remove_pseudos);
   bitmap_clear (&remove_pseudos);
   /* Clear restore_regnos.  */
-  EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, inherit_regno, bi)
-    lra_reg_info[inherit_regno].restore_regno = -1;
+  EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, regno, bi)
+    lra_reg_info[regno].restore_regno = -1;
+  EXECUTE_IF_SET_IN_BITMAP (&lra_split_pseudos, 0, regno, bi)
+    lra_reg_info[regno].restore_regno = -1;
   return change_p;
 }
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 181810)
+++ Makefile.in	(working copy)
@@ -1292,7 +1292,6 @@  OBJS = \
 	lra-constraints.o \
 	lra-eliminations.o \
 	lra-lives.o \
-	lra-saves.o \
 	lra-spills.o \
 	lto-cgraph.o \
 	lto-streamer.o \
@@ -3340,11 +3339,6 @@  lra-lives.o : lra-lives.c $(CONFIG_H) $(
    $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
    $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \
    $(LRA_INT_H)
-lra-saves.o : lra-saves.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
-   $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
-   $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
-   $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \
-   ira.h $(LRA_INT_H)
 lra-spills.o : lra-spills.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
    $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 181810)
+++ config/i386/i386.c	(working copy)
@@ -31278,7 +31278,7 @@  inline_secondary_memory_needed (enum reg
       || MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1)
       || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2))
     {
-      gcc_assert (!strict);
+      gcc_assert (!strict || lra_in_progress);
       return true;
     }
 
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 181810)
+++ config/rs6000/rs6000.c	(working copy)
@@ -6956,8 +6956,8 @@  rs6000_emit_move (rtx dest, rtx source, 
 
   if (lra_in_progress
       && mode == SDmode
-      && MEM_P (operands[0])
-      && rtx_equal_p (operands[0], cfun->machine->sdmode_stack_slot)
+      && REG_P (operands[0]) && REGNO (operands[0]) >= FIRST_PSEUDO_REGISTER
+      && reg_preferred_class (REGNO (operands[0])) == NO_REGS
       && (REG_P (operands[1])
 	  || (GET_CODE (operands[1]) == SUBREG
 	      && REG_P (SUBREG_REG (operands[1])))))
@@ -6973,19 +6973,9 @@  rs6000_emit_move (rtx dest, rtx source, 
 	  regno = ira_class_hard_regs[cl][0];
 	}
       if (FP_REGNO_P (regno))
-	{
-	  rtx mem = adjust_address_nv (operands[0], DDmode, 0);
-
-	  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
-	  emit_insn (gen_movsd_store (mem, operands[1]));
-	}
+	emit_insn (gen_movsd_store (operands[0], operands[1]));
       else if (INT_REGNO_P (regno))
-	{
-	  rtx mem = adjust_address_nv (operands[0], mode, 4);
-
-	  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
-	  emit_insn (gen_movsd_hardfloat (mem, operands[1]));
-	}
+	emit_insn (gen_movsd_hardfloat (operands[0], operands[1]));
       else
 	gcc_unreachable();
       return;
@@ -6995,8 +6985,8 @@  rs6000_emit_move (rtx dest, rtx source, 
       && (REG_P (operands[0])
 	  || (GET_CODE (operands[0]) == SUBREG
 	      && REG_P (SUBREG_REG (operands[0]))))
-      && MEM_P (operands[1])
-      && rtx_equal_p (operands[1], cfun->machine->sdmode_stack_slot))
+      && REG_P (operands[1]) && REGNO (operands[1]) >= FIRST_PSEUDO_REGISTER
+      && reg_preferred_class (REGNO (operands[1])) == NO_REGS)
     {
       int regno = REGNO (GET_CODE (operands[0]) == SUBREG
 			 ? SUBREG_REG (operands[0]) : operands[0]);
@@ -7009,19 +6999,9 @@  rs6000_emit_move (rtx dest, rtx source, 
 	  regno = ira_class_hard_regs[cl][0];
 	}
       if (FP_REGNO_P (regno))
-	{
-	  rtx mem = adjust_address_nv (operands[1], DDmode, 0);
-
-	  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
-	  emit_insn (gen_movsd_load (operands[0], mem));
-	}
+	emit_insn (gen_movsd_load (operands[0], operands[1]));
       else if (INT_REGNO_P (regno))
-	{
-	  rtx mem = adjust_address_nv (operands[1], mode, 4);
-
-	  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
-	  emit_insn (gen_movsd_hardfloat (operands[0], mem));
-	}
+	emit_insn (gen_movsd_hardfloat (operands[0], operands[1]));
       else
 	gcc_unreachable();
       return;
@@ -14156,6 +14136,17 @@  rs6000_secondary_memory_needed_rtx (enum
   return ret;
 }
 
+/* Return the mode to be used for memory when a secondary memory
+   location is needed.  For SDmode values we need to use DDmode; in
+   all other cases we can use the same mode.  */
+enum machine_mode
+rs6000_secondary_memory_needed_mode (enum machine_mode mode)
+{
+  if (mode == SDmode)
+    return DDmode;
+  return mode;
+}
+
 static tree
 rs6000_check_sdmode (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
 {
@@ -14699,6 +14690,10 @@  rs6000_alloc_sdmode_stack_slot (void)
   gimple_stmt_iterator gsi;
 
   gcc_assert (cfun->machine->sdmode_stack_slot == NULL_RTX);
+  /* We use a different approach for dealing with the secondary
+     memory in LRA.  */
+  if (flag_lra)
+    return;
 
   FOR_EACH_BB (bb)
     for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
Index: config/rs6000/rs6000.h
===================================================================
--- config/rs6000/rs6000.h	(revision 181810)
+++ config/rs6000/rs6000.h	(working copy)
@@ -1328,6 +1328,13 @@  extern enum reg_class rs6000_constraints
 #define SECONDARY_MEMORY_NEEDED_RTX(MODE) \
   rs6000_secondary_memory_needed_rtx (MODE)
 
+/* Specify the mode to be used for memory when a secondary memory
+   location is needed.  For cpus that cannot load/store SDmode values
+   from the 64-bit FP registers without using a full 64-bit
+   load/store, we need a wider mode.  */
+#define SECONDARY_MEMORY_NEEDED_MODE(MODE)		\
+  rs6000_secondary_memory_needed_mode (MODE)
+
 /* Return the maximum number of consecutive registers
    needed to represent mode MODE in a register of class CLASS.
 
Index: config/rs6000/rs6000-protos.h
===================================================================
--- config/rs6000/rs6000-protos.h	(revision 181810)
+++ config/rs6000/rs6000-protos.h	(working copy)
@@ -115,6 +115,8 @@  extern rtx create_TOC_reference (rtx, rt
 extern void rs6000_split_multireg_move (rtx, rtx);
 extern void rs6000_emit_move (rtx, rtx, enum machine_mode);
 extern rtx rs6000_secondary_memory_needed_rtx (enum machine_mode);
+extern enum machine_mode rs6000_secondary_memory_needed_mode (enum
+							      machine_mode);
 extern rtx (*rs6000_legitimize_reload_address_ptr) (rtx, enum machine_mode,
 						    int, int, int, int *);
 extern bool rs6000_legitimate_offset_address_p (enum machine_mode, rtx, int);