diff mbox

: Fix PR middle-end/56382 -- Only move MODE_COMPLEX_FLOAT by parts if we can create pseudos

Message ID CABu31nO4UDdKTLKW=-14SqUpvXP06jV4tMeem5RsyzNG9YkfVA@mail.gmail.com
State New
Headers show

Commit Message

Steven Bosscher Aug. 24, 2013, 10:43 a.m. UTC
On Fri, Aug 23, 2013 at 2:47 AM, John David Anglin wrote:
> Ping.
>
>
> On 28-Jul-13, at 12:17 PM, John David Anglin wrote:
>
>> This patch fixes PR middle-end/56382 on hppa64-hp-hpux11.11.  The patch
>> prevents moving a complex float by parts if we can't
>> create pseudos.  On a big endian 64-bit target, we need a psuedo to move a
>> complex float and this fails during reload.
>>
>> OK for trunk?
>>

I'm trying to understand how the patch would help...

The code you're patching is:

  /* Move floating point as parts.  */
  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
+    && can_create_pseudo_p ()
      && optab_handler (mov_optab, GET_MODE_INNER (mode)) != CODE_FOR_nothing)
    try_int = false;
  /* Not possible if the values are inherently not adjacent.  */
  else if (GET_CODE (x) == CONCAT || GET_CODE (y) == CONCAT)
    try_int = false;
  /* Is possible if both are registers (or subregs of registers).  */
  else if (register_operand (x, mode) && register_operand (y, mode))
    try_int = true;
  /* If one of the operands is a memory, and alignment constraints
     are friendly enough, we may be able to do combined memory operations.
     We do not attempt this if Y is a constant because that combination is
     usually better with the by-parts thing below.  */
  else if ((MEM_P (x) ? !CONSTANT_P (y) : MEM_P (y))
           && (!STRICT_ALIGNMENT
               || get_mode_alignment (mode) == BIGGEST_ALIGNMENT))
    try_int = true;
  else
    try_int = false;

With the new test for can_create_pseudo_p, you're trying to make
"try_int" be false. Apparently your failure happens if one of the
operands is a MEM? Otherwise the second "else if " test would find x
and y be registers and "try_int" still ends up being true.

It seems to me that can_create_pseudo_p is not the right test anyway.
There many be other targets that can take this path just fine without
needing new registers. In the PR audit trail you say: "The problem is
SCmode is the same size as DImode on this target, so the subreg can't
be extracted by a move." Using can_create_pseudo_p is too big a hammer
to solve this problem. The right test would be to see if you end up
needing extra registers to perform the move. But emit_move_change_mode
already handles that, AFAICT, so can you please try and test if the
following patch solves the PR for you?

Ciao!
Steven

Comments

John David Anglin Aug. 24, 2013, 2:37 p.m. UTC | #1
On 24-Aug-13, at 6:43 AM, Steven Bosscher wrote:

> I'm trying to understand how the patch would help...
>
> The code you're patching is:
>
>  /* Move floating point as parts.  */
>  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
> +    && can_create_pseudo_p ()
>      && optab_handler (mov_optab, GET_MODE_INNER (mode)) !=  
> CODE_FOR_nothing)
>    try_int = false;
>  /* Not possible if the values are inherently not adjacent.  */
>  else if (GET_CODE (x) == CONCAT || GET_CODE (y) == CONCAT)
>    try_int = false;
>  /* Is possible if both are registers (or subregs of registers).  */
>  else if (register_operand (x, mode) && register_operand (y, mode))
>    try_int = true;
>  /* If one of the operands is a memory, and alignment constraints
>     are friendly enough, we may be able to do combined memory  
> operations.
>     We do not attempt this if Y is a constant because that  
> combination is
>     usually better with the by-parts thing below.  */
>  else if ((MEM_P (x) ? !CONSTANT_P (y) : MEM_P (y))
>           && (!STRICT_ALIGNMENT
>               || get_mode_alignment (mode) == BIGGEST_ALIGNMENT))
>    try_int = true;
>  else
>    try_int = false;
>
> With the new test for can_create_pseudo_p, you're trying to make
> "try_int" be false. Apparently your failure happens if one of the
> operands is a MEM? Otherwise the second "else if " test would find x
> and y be registers and "try_int" still ends up being true.

I was trying to prevent "try_int" from being set to false in the "if"  
if we
can't create a pseudo.  If this is done, try_int gets set to true in  
the second
"else if" in the failing testcase.  As a result, we don't directly use
"emit_move_complex_parts" which currently needs a register on hppa64.

>
> It seems to me that can_create_pseudo_p is not the right test anyway.
> There many be other targets that can take this path just fine without
> needing new registers. In the PR audit trail you say: "The problem is
> SCmode is the same size as DImode on this target, so the subreg can't
> be extracted by a move." Using can_create_pseudo_p is too big a hammer

> to solve this problem. The right test would be to see if you end up
> needing extra registers to perform the move. But emit_move_change_mode
> already handles that, AFAICT, so can you please try and test if the
> following patch solves the PR for you?

I'll give your patch a try.

>
> Ciao!
> Steven
>
>
> Index: expr.c
> ===================================================================
> --- expr.c      (revision 201887)
> +++ expr.c      (working copy)
> @@ -3268,7 +3268,7 @@ emit_move_complex (enum machine_mode mode, rtx  
> x,
>          return get_last_insn ();
>        }
>
> -      ret = emit_move_via_integer (mode, x, y, true);
> +      ret = emit_move_via_integer (mode, x, y, can_create_pseudo_p  
> ());
>       if (ret)
>        return ret;
>     }
>


Thanks,
Dave
--
John David Anglin	dave.anglin@bell.net
John David Anglin Aug. 24, 2013, 2:46 p.m. UTC | #2
On 24-Aug-13, at 10:37 AM, John David Anglin wrote:

>
> On 24-Aug-13, at 6:43 AM, Steven Bosscher wrote:
>
>> I'm trying to understand how the patch would help...
>>
>> The code you're patching is:
>>
>> /* Move floating point as parts.  */
>> if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
>> +    && can_create_pseudo_p ()
>>     && optab_handler (mov_optab, GET_MODE_INNER (mode)) !=  
>> CODE_FOR_nothing)
>>   try_int = false;
>> /* Not possible if the values are inherently not adjacent.  */
>> else if (GET_CODE (x) == CONCAT || GET_CODE (y) == CONCAT)
>>   try_int = false;
>> /* Is possible if both are registers (or subregs of registers).  */
>> else if (register_operand (x, mode) && register_operand (y, mode))
>>   try_int = true;
>> /* If one of the operands is a memory, and alignment constraints
>>    are friendly enough, we may be able to do combined memory  
>> operations.
>>    We do not attempt this if Y is a constant because that  
>> combination is
>>    usually better with the by-parts thing below.  */
>> else if ((MEM_P (x) ? !CONSTANT_P (y) : MEM_P (y))
>>          && (!STRICT_ALIGNMENT
>>              || get_mode_alignment (mode) == BIGGEST_ALIGNMENT))
>>   try_int = true;
>> else
>>   try_int = false;
>>
>> With the new test for can_create_pseudo_p, you're trying to make
>> "try_int" be false. Apparently your failure happens if one of the
>> operands is a MEM? Otherwise the second "else if " test would find x
>> and y be registers and "try_int" still ends up being true.
>
> I was trying to prevent "try_int" from being set to false in the  
> "if" if we
> can't create a pseudo.  If this is done, try_int gets set to true in  
> the second
> "else if" in the failing testcase.  As a result, we don't directly use
> "emit_move_complex_parts" which currently needs a register on hppa64.
>
>>
>> It seems to me that can_create_pseudo_p is not the right test anyway.
>> There many be other targets that can take this path just fine without
>> needing new registers. In the PR audit trail you say: "The problem is
>> SCmode is the same size as DImode on this target, so the subreg can't
>> be extracted by a move." Using can_create_pseudo_p is too big a  
>> hammer
>
>> to solve this problem. The right test would be to see if you end up
>> needing extra registers to perform the move. But  
>> emit_move_change_mode
>> already handles that, AFAICT, so can you please try and test if the
>> following patch solves the PR for you?
>
> I'll give your patch a try.
>
>>
>> Ciao!
>> Steven
>>
>>
>> Index: expr.c
>> ===================================================================
>> --- expr.c      (revision 201887)
>> +++ expr.c      (working copy)
>> @@ -3268,7 +3268,7 @@ emit_move_complex (enum machine_mode mode,  
>> rtx x,
>>         return get_last_insn ();
>>       }
>>
>> -      ret = emit_move_via_integer (mode, x, y, true);
>> +      ret = emit_move_via_integer (mode, x, y, can_create_pseudo_p  
>> ());
>>      if (ret)
>>       return ret;
>>    }
>>


Actually, I don't think it will work because "try_int" gets set to  
false and the code isn't used.

Dave
--
John David Anglin	dave.anglin@bell.net
John David Anglin Aug. 29, 2013, 12:14 a.m. UTC | #3
On 24-Aug-13, at 6:43 AM, Steven Bosscher wrote:

> On Fri, Aug 23, 2013 at 2:47 AM, John David Anglin wrote:
>> Ping.
>>
>>
>> On 28-Jul-13, at 12:17 PM, John David Anglin wrote:
>>
>>> This patch fixes PR middle-end/56382 on hppa64-hp-hpux11.11.  The  
>>> patch
>>> prevents moving a complex float by parts if we can't
>>> create pseudos.  On a big endian 64-bit target, we need a psuedo  
>>> to move a
>>> complex float and this fails during reload.
>>>
>>> OK for trunk?
>>>
>
> I'm trying to understand how the patch would help...
>
> The code you're patching is:
>
>  /* Move floating point as parts.  */
>  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
> +    && can_create_pseudo_p ()
>      && optab_handler (mov_optab, GET_MODE_INNER (mode)) !=  
> CODE_FOR_nothing)
>    try_int = false;
>  /* Not possible if the values are inherently not adjacent.  */
>  else if (GET_CODE (x) == CONCAT || GET_CODE (y) == CONCAT)
>    try_int = false;
>  /* Is possible if both are registers (or subregs of registers).  */
>  else if (register_operand (x, mode) && register_operand (y, mode))
>    try_int = true;
>  /* If one of the operands is a memory, and alignment constraints
>     are friendly enough, we may be able to do combined memory  
> operations.
>     We do not attempt this if Y is a constant because that  
> combination is
>     usually better with the by-parts thing below.  */
>  else if ((MEM_P (x) ? !CONSTANT_P (y) : MEM_P (y))
>           && (!STRICT_ALIGNMENT
>               || get_mode_alignment (mode) == BIGGEST_ALIGNMENT))
>    try_int = true;
>  else
>    try_int = false;
>
> With the new test for can_create_pseudo_p, you're trying to make
> "try_int" be false. Apparently your failure happens if one of the
> operands is a MEM? Otherwise the second "else if " test would find x
> and y be registers and "try_int" still ends up being true.

With the patch, I'm trying to prevent "try_int" from being set  
directly to false
when the mode class is MODE_COMPLEX_FLOAT.  The 64-bit PA target
has move instructions to handle the inner mode.  So, "optab_handler  
(mov_optab,
GET_MODE_INNER (mode)) != CODE_FOR_nothing" is always true.

>
> It seems to me that can_create_pseudo_p is not the right test anyway.
> There many be other targets that can take this path just fine without
> needing new registers. In the PR audit trail you say: "The problem is
> SCmode is the same size as DImode on this target, so the subreg can't
> be extracted by a move." Using can_create_pseudo_p is too big a hammer
> to solve this problem. The right test would be to see if you end up
> needing extra registers to perform the move. But emit_move_change_mode
> already handles that, AFAICT, so can you please try and test if the
> following patch solves the PR for you?

As expected, your patch doesn't fix the PR.

The bug lies in using "emit_move_complex_parts" when we can't create  
pseudos.
There are several places in the functions that it calls where  
"gen_reg_rtx" can be
called (e.g., "store_bit_field_using_insv").  In debugging, I found  
that fixing these
didn't help.  In the end, it fails to find a way move the complex parts.

In gcc.c-torture/compile/pr55921.c, we have two register operands so  
try_int
can be true.  "emit_move_via_integer" is successful in this case.   
Your patch
might be correct.

I'm not sure that can_create_pseudo_p is that big a hammer as it appears
emit_move_complex is mainly called prior to reload.  The proposed change
only affects the code when we can't create pseudos.  Even then, we  
fall back
to emit_move_complex_parts.

As you say, some other check might be more appropriate to determine
whether a call to gen_reg_rtx might be needed in  
emit_move_complex_parts.
For the PA, it would be something like "GET_MODE_BITSIZE (mode) >  
BITS_PER_WORD".
But, we still need to check can_create_pseudo_p as we probably still  
want to use
emit_move_complex_parts before reload.

Dave
--
John David Anglin	dave.anglin@bell.net
Steven Bosscher Aug. 29, 2013, 10:52 p.m. UTC | #4
On Thu, Aug 29, 2013 at 2:14 AM, John David Anglin wrote:
> As expected, your patch doesn't fix the PR.

Hmm, unfortunate. The reason why I proposed it is because it would
revert to the way this code worked before http://gcc.gnu.org/r104371

The idea was to make "force" false, and let the normal back-up plans
work after failure in emit_move_via_integer.


> The bug lies in using "emit_move_complex_parts" when we can't create
> pseudos.
> There are several places in the functions that it calls where "gen_reg_rtx"
> can be
> called (e.g., "store_bit_field_using_insv").  In debugging, I found that
> fixing these
> didn't help.  In the end, it fails to find a way move the complex parts.
>
> In gcc.c-torture/compile/pr55921.c, we have two register operands so try_int
> can be true.  "emit_move_via_integer" is successful in this case.  Your
> patch might be correct.
>
> I'm not sure that can_create_pseudo_p is that big a hammer as it appears
> emit_move_complex is mainly called prior to reload.  The proposed change
> only affects the code when we can't create pseudos.  Even then, we fall back
> to emit_move_complex_parts.
>
> As you say, some other check might be more appropriate to determine
> whether a call to gen_reg_rtx might be needed in emit_move_complex_parts.
> For the PA, it would be something like "GET_MODE_BITSIZE (mode) >
> BITS_PER_WORD".
> But, we still need to check can_create_pseudo_p as we probably still want to
> use emit_move_complex_parts before reload.

I haven't tried to see what goes on in the compiler, but it feels like
your.patch just fixes a symptom. Might be just reload behaving like
reload, but it just seems to me that the problem is not that you
cannot create new pseudos at the point of the ICE. The test on whether
the optab exists is obviously not enough, you somehow need to make
sure that the move doesn't require new registers. That's something I
would expect reload to check: Can the target make the move I'm asking
for without introducing new registers...

I realize I'm not being very helpful... just thinking out loud ;-)

Ciao!
Steven
John David Anglin Aug. 29, 2013, 11:02 p.m. UTC | #5
On 29-Aug-13, at 6:52 PM, Steven Bosscher wrote:

> Can the target make the move I'm asking
> for without introducing new registers...

Yes, it can execute extract and insert instructions using the same  
source and target register.
Of course, this clobbers the source.

Dave
--
John David Anglin	dave.anglin@bell.net
John David Anglin Aug. 29, 2013, 11:20 p.m. UTC | #6
On 29-Aug-13, at 6:52 PM, Steven Bosscher wrote:

> I haven't tried to see what goes on in the compiler, but it feels like
> your.patch just fixes a symptom. Might be just reload behaving like
> reload, but it just seems to me that the problem is not that you
> cannot create new pseudos at the point of the ICE. The test on whether
> the optab exists is obviously not enough, you somehow need to make
> sure that the move doesn't require new registers. That's something I
> would expect reload to check: Can the target make the move I'm asking
> for without introducing new registers...

I believe that reload does not handle complex moves.  The PA backend  
only
has reload patterns for integer and floating point moves, and the  
documentation
indicates that the backend doesn't have to provide complex support.

Maybe I'm missing something, but I don't think I have a way to trigger  
a reload
pattern for this case in the backend.  There are no reloads patterns  
for insert
and extract operations.  The middle is supposed to handle it.

Dave
--
John David Anglin	dave.anglin@bell.net
Eric Botcazou Aug. 30, 2013, 10:38 a.m. UTC | #7
> As you say, some other check might be more appropriate to determine
> whether a call to gen_reg_rtx might be needed in
> emit_move_complex_parts.
> For the PA, it would be something like "GET_MODE_BITSIZE (mode) >
> BITS_PER_WORD".
> But, we still need to check can_create_pseudo_p as we probably still
> want to use
> emit_move_complex_parts before reload.

Let's avoid trying to do something general since this seems to be really a 
corner case.  Can't we simply deal with hard registers specially?

  /* Move floating point as parts if splitting is easy.  */
  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
      && optab_handler (mov_optab, GET_MODE_INNER (mode)) != CODE_FOR_nothing
      && !(REG_P (x)
           && HARD_REGISTER_P (x)
           && hard_regno_nregs[REGNO(x)][mode] == 1)
      && !(REG_P (y)
           && HARD_REGISTER_P (y)
           && hard_regno_nregs[REGNO(y)][mode] == 1))
    try_int = false;
diff mbox

Patch

Index: expr.c
===================================================================
--- expr.c      (revision 201887)
+++ expr.c      (working copy)
@@ -3268,7 +3268,7 @@  emit_move_complex (enum machine_mode mode, rtx x,
          return get_last_insn ();
        }

-      ret = emit_move_via_integer (mode, x, y, true);
+      ret = emit_move_via_integer (mode, x, y, can_create_pseudo_p ());
       if (ret)
        return ret;
     }