diff mbox

patch to fix a wrong code generation with LRA when SDmode is used on ppc.

Message ID 53E3EB6C.7080302@redhat.com
State New
Headers show

Commit Message

Vladimir Makarov Aug. 7, 2014, 9:11 p.m. UTC
Hi, Mike.

Here is the patch which solves LRA SDmode problem which we discussed on
IRC on Friday.

The patch was successfully tested and bootstrapped on ppc64 with -mlra
switched on.

David, is it ok to commit it to the trunk?

2014-08-07  Vladimir Makarov  <vmakarov@redhat.com>

        * config/rs6000/rs6000.c (rs6000_emit_move): Use SDmode to
        load/store from/to non-floating class pseudo.

Comments

David Edelsohn Aug. 8, 2014, 6:53 p.m. UTC | #1
Hi, Vlad

Why does rs6000_emit_move need special support for SDmode with LRA and
not with reload?  In other words, why is this a specific fix for
rs6000 instead of a general improvement for reload?

SDmode has some bizarre restrictions on PPC, but one of the historical
advantages of GCC was reload's ability to support strange addressing
modes of processors. Why fix this in a target-specific way and then
possibly replicated this in other architectures instead of fixing it
more generally?

Thanks, David


On Thu, Aug 7, 2014 at 5:11 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> Hi, Mike.
>
> Here is the patch which solves LRA SDmode problem which we discussed on
> IRC on Friday.
>
> The patch was successfully tested and bootstrapped on ppc64 with -mlra
> switched on.
>
> David, is it ok to commit it to the trunk?
>
> 2014-08-07  Vladimir Makarov  <vmakarov@redhat.com>
>
>         * config/rs6000/rs6000.c (rs6000_emit_move): Use SDmode to
>         load/store from/to non-floating class pseudo.
>
Vladimir Makarov Aug. 8, 2014, 9:26 p.m. UTC | #2
On 2014-08-08, 2:53 PM, David Edelsohn wrote:
> Hi, Vlad
>
> Why does rs6000_emit_move need special support for SDmode with LRA and
> not with reload?

   rs6000_emit_move has also special SDmode support for *reload too*. 
It is not implemented generically in reload as you wrote.  Please, look 
at the code with reload_in_progress nearby.  So I am trying to do 
basically the same as done by rs6000 back end for reload but by means of 
LRA.  The difference between code for reload and LRA is in that a 
special stack slot is created for reload.  Reload through a hook 
SECONDARY_MEMORY_NEEDED_RTX provided by rs6000 back-end generates a move 
using the stack slot and that triggers a special code generation in 
rs6000_emit_move.  So it is a trick, not a general support in reload.

LRA uses a spilled pseudo for this and don't use the hook at all.  It 
means that the stack slot created for reload is used only for SD moves. 
  LRA uses a common stack slot allocation techniques for spilled pseudos 
which permits to share this slot for other purposes, generates a better 
code, and uses less one hook.  But basically code for LRA in 
rs600_emit_move do the same as analogous code for reload.

But if you mean that general support you mentioned is the usage of 
absolutely the same trick for LRA as for reload (through 
SECONDARY_MEMORY_NEEDED_RTX), I could do this.  But as I wrote it would 
be worse code generation than LRA currently produces and using an 
additional hook out of many already used by LRA.

> In other words, why is this a specific fix for
> rs6000 instead of a general improvement for reload?
>

Because it is too specific.  It is about how rs6000 too specifically 
implements sd load/store.  For example, SECONDARY_MEMORY_NEEDED_RTX is a 
creation only for rs6000.  No other port uses it because the hook is 
used for the trick I wrote above.  I don't think other port will use the 
trick in future.

> SDmode has some bizarre restrictions on PPC, but one of the historical
> advantages of GCC was reload's ability to support strange addressing
> modes of processors. Why fix this in a target-specific way and then
> possibly replicated this in other architectures instead of fixing it
> more generally?
>

It is not about addressing.  It is about machine description of insns. 
If movsd_{load,store} insns were part of common move insns, we would 
have no problem at all.  But I guess, achieving this would be not 
possible or easy.  I believe Mike had serious reasons to do this such 
way.  I am not such a specialist in md writing as Mike.

If I could solve this problem only in LRA, I'd prefer to do this because 
it is simpler for me as I don't need an approval.
David Edelsohn Aug. 9, 2014, 12:49 p.m. UTC | #3
Hi, Vlad

Thanks for the explanation.

The patch is okay.

Thanks, David


On Fri, Aug 8, 2014 at 5:26 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> On 2014-08-08, 2:53 PM, David Edelsohn wrote:
>>
>> Hi, Vlad
>>
>> Why does rs6000_emit_move need special support for SDmode with LRA and
>> not with reload?
>
>
>   rs6000_emit_move has also special SDmode support for *reload too*. It is
> not implemented generically in reload as you wrote.  Please, look at the
> code with reload_in_progress nearby.  So I am trying to do basically the
> same as done by rs6000 back end for reload but by means of LRA.  The
> difference between code for reload and LRA is in that a special stack slot
> is created for reload.  Reload through a hook SECONDARY_MEMORY_NEEDED_RTX
> provided by rs6000 back-end generates a move using the stack slot and that
> triggers a special code generation in rs6000_emit_move.  So it is a trick,
> not a general support in reload.
>
> LRA uses a spilled pseudo for this and don't use the hook at all.  It means
> that the stack slot created for reload is used only for SD moves.  LRA uses
> a common stack slot allocation techniques for spilled pseudos which permits
> to share this slot for other purposes, generates a better code, and uses
> less one hook.  But basically code for LRA in rs600_emit_move do the same as
> analogous code for reload.
>
> But if you mean that general support you mentioned is the usage of
> absolutely the same trick for LRA as for reload (through
> SECONDARY_MEMORY_NEEDED_RTX), I could do this.  But as I wrote it would be
> worse code generation than LRA currently produces and using an additional
> hook out of many already used by LRA.
>
>
>> In other words, why is this a specific fix for
>> rs6000 instead of a general improvement for reload?
>>
>
> Because it is too specific.  It is about how rs6000 too specifically
> implements sd load/store.  For example, SECONDARY_MEMORY_NEEDED_RTX is a
> creation only for rs6000.  No other port uses it because the hook is used
> for the trick I wrote above.  I don't think other port will use the trick in
> future.
>
>
>> SDmode has some bizarre restrictions on PPC, but one of the historical
>> advantages of GCC was reload's ability to support strange addressing
>> modes of processors. Why fix this in a target-specific way and then
>> possibly replicated this in other architectures instead of fixing it
>> more generally?
>>
>
> It is not about addressing.  It is about machine description of insns. If
> movsd_{load,store} insns were part of common move insns, we would have no
> problem at all.  But I guess, achieving this would be not possible or easy.
> I believe Mike had serious reasons to do this such way.  I am not such a
> specialist in md writing as Mike.
>
> If I could solve this problem only in LRA, I'd prefer to do this because it
> is simpler for me as I don't need an approval.
>
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 213506)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -8271,6 +8271,30 @@  rs6000_emit_move (rtx dest, rtx source,
       eliminate_regs (cfun->machine->sdmode_stack_slot, VOIDmode, NULL_RTX);
 
 
+  /* Transform (p0:DD, (SUBREG:DD p1:SD)) to ((SUBREG:SD p0:DD),
+     p1:SD) if p1 is not of floating point class and p0 is spilled as
+     we can have no analogous movsd_store for this.  */
+  if (lra_in_progress && mode == DDmode
+      && REG_P (operands[0]) && REGNO (operands[0]) >= FIRST_PSEUDO_REGISTER
+      && reg_preferred_class (REGNO (operands[0])) == NO_REGS
+      && GET_CODE (operands[1]) == SUBREG && REG_P (SUBREG_REG (operands[1]))
+      && GET_MODE (SUBREG_REG (operands[1])) == SDmode)
+    {
+      enum reg_class cl;
+      int regno = REGNO (SUBREG_REG (operands[1]));
+
+      if (regno >= FIRST_PSEUDO_REGISTER)
+	{
+	  cl = reg_preferred_class (regno);
+	  regno = cl == NO_REGS ? -1 : ira_class_hard_regs[cl][1];
+	}
+      if (regno >= 0 && ! FP_REGNO_P (regno))
+	{
+	  mode = SDmode;
+	  operands[0] = gen_lowpart_SUBREG (SDmode, operands[0]);
+	  operands[1] = SUBREG_REG (operands[1]);
+	}
+    }
   if (lra_in_progress
       && mode == SDmode
       && REG_P (operands[0]) && REGNO (operands[0]) >= FIRST_PSEUDO_REGISTER
@@ -8301,6 +8325,30 @@  rs6000_emit_move (rtx dest, rtx source,
 	gcc_unreachable();
       return;
     }
+  /* Transform ((SUBREG:DD p0:SD), p1:DD) to (p0:SD, (SUBREG:SD
+     p:DD)) if p0 is not of floating point class and p1 is spilled as
+     we can have no analogous movsd_load for this.  */
+  if (lra_in_progress && mode == DDmode
+      && GET_CODE (operands[0]) == SUBREG && REG_P (SUBREG_REG (operands[0]))
+      && GET_MODE (SUBREG_REG (operands[0])) == SDmode
+      && REG_P (operands[1]) && REGNO (operands[1]) >= FIRST_PSEUDO_REGISTER
+      && reg_preferred_class (REGNO (operands[1])) == NO_REGS)
+    {
+      enum reg_class cl;
+      int regno = REGNO (SUBREG_REG (operands[0]));
+
+      if (regno >= FIRST_PSEUDO_REGISTER)
+	{
+	  cl = reg_preferred_class (regno);
+	  regno = cl == NO_REGS ? -1 : ira_class_hard_regs[cl][0];
+	}
+      if (regno >= 0 && ! FP_REGNO_P (regno))
+	{
+	  mode = SDmode;
+	  operands[0] = SUBREG_REG (operands[0]);
+	  operands[1] = gen_lowpart_SUBREG (SDmode, operands[1]);
+	}
+    }
   if (lra_in_progress
       && mode == SDmode
       && (REG_P (operands[0])