diff mbox

[LRA] Don't generate reload for output scratch operand from reload instruction.

Message ID 56D04AF1.9030000@foss.arm.com
State New
Headers show

Commit Message

Renlin Li Feb. 26, 2016, 12:54 p.m. UTC
Hi all,

I admit that, the title looks a little bit confusing.

The situation is like this,
To make insn_1 strict, lra generates a new insn_1_reload insn.
In insn_1_reload, there is a scratch operand with this form
clobber (match_scratch:MODE x "=&r")

When lra tries to reload insn_1_reload in later iteration, a new pseudo
register (let say RXX) is created to replace this scratch operand in-place.
Additionally, a new insn will be generated and inserted after insn_1_reload to
finish the reload. It's in this form:
(set scratch, RXX)

And this instruction is illegal. no target implements this kind of pattern.
LRA will ICE because of this.
"internal compiler error: in lra_set_insn_recog_data, at lra.c:964"

And indeed, this pattern has no side-effect. The scratch operand should
stay inside the pattern.

Normally, at the very beginning of LRA reload, all scratch operands will be
replaced by newly created pseudo register. However, this is a problem when
generated reload insn has output scratch operand.

I have checked, x86, arm, aarch64, mips, arc all have such patterns. But it's
not triggered. In my case, it's triggered by compiling glibc with local change.

So a simple change is made in this patch. The output operand is reloaded only
when it's not a scratch operand and it's not unused since then.


aarch64-none-linux-gnu bootstrap and regression test OK.
x86_64-linux bootstrap and regression test OK.
OK for trunk?

Regards,
Renlin Li


gcc/ChangeLog:

2016-02-26  Renlin Li<renlin.li@arm.com>

	* lra-constraints.c (curr_insn_transform): Don't generate reload for
	output scratch operand.

Comments

Richard Biener Feb. 26, 2016, 12:57 p.m. UTC | #1
On Fri, Feb 26, 2016 at 1:54 PM, Renlin Li <renlin.li@foss.arm.com> wrote:
> Hi all,
>
> I admit that, the title looks a little bit confusing.
>
> The situation is like this,
> To make insn_1 strict, lra generates a new insn_1_reload insn.
> In insn_1_reload, there is a scratch operand with this form
> clobber (match_scratch:MODE x "=&r")
>
> When lra tries to reload insn_1_reload in later iteration, a new pseudo
> register (let say RXX) is created to replace this scratch operand in-place.
> Additionally, a new insn will be generated and inserted after insn_1_reload
> to
> finish the reload. It's in this form:
> (set scratch, RXX)
>
> And this instruction is illegal. no target implements this kind of pattern.
> LRA will ICE because of this.
> "internal compiler error: in lra_set_insn_recog_data, at lra.c:964"
>
> And indeed, this pattern has no side-effect. The scratch operand should
> stay inside the pattern.
>
> Normally, at the very beginning of LRA reload, all scratch operands will be
> replaced by newly created pseudo register. However, this is a problem when
> generated reload insn has output scratch operand.
>
> I have checked, x86, arm, aarch64, mips, arc all have such patterns. But
> it's
> not triggered. In my case, it's triggered by compiling glibc with local
> change.
>
> So a simple change is made in this patch. The output operand is reloaded
> only
> when it's not a scratch operand and it's not unused since then.
>
>
> aarch64-none-linux-gnu bootstrap and regression test OK.
> x86_64-linux bootstrap and regression test OK.
> OK for trunk?

Please extract a testcase from your modified glibc sources.

Richard.

> Regards,
> Renlin Li
>
>
> gcc/ChangeLog:
>
> 2016-02-26  Renlin Li<renlin.li@arm.com>
>
>         * lra-constraints.c (curr_insn_transform): Don't generate reload for
>         output scratch operand.
>
Renlin Li Feb. 26, 2016, 1:10 p.m. UTC | #2
Hi Richard,

On 26/02/16 12:57, Richard Biener wrote:
> On Fri, Feb 26, 2016 at 1:54 PM, Renlin Li <renlin.li@foss.arm.com> wrote:
>>
>> I have checked, x86, arm, aarch64, mips, arc all have such patterns. But
>> it's
>> not triggered. In my case, it's triggered by compiling glibc with local
>> change.
>>
>>
> Please extract a testcase from your modified glibc sources.
>
> Richard.

The change is to the compiler. I change the compiler and tries to build 
a linux toolchain.

I tried to create a test case. But no luck.

It needs to make LRA generate a reload pattern with a non-input scratch 
register operand.
This type of pattern actually is quite common in the target backends.

I also searched the GCC bugzilla website, there is no similar bug reported.

Regards,
Renlin
>
>> Regards,
>> Renlin Li
>>
>>
>> gcc/ChangeLog:
>>
>> 2016-02-26  Renlin Li<renlin.li@arm.com>
>>
>>          * lra-constraints.c (curr_insn_transform): Don't generate reload for
>>          output scratch operand.
>>
Vladimir Makarov Feb. 26, 2016, 11:54 p.m. UTC | #3
On 02/26/2016 07:54 AM, Renlin Li wrote:
> Hi all,
>
> I admit that, the title looks a little bit confusing.
>
> The situation is like this,
> To make insn_1 strict, lra generates a new insn_1_reload insn.
> In insn_1_reload, there is a scratch operand with this form
> clobber (match_scratch:MODE x "=&r")
>
> When lra tries to reload insn_1_reload in later iteration, a new pseudo
> register (let say RXX) is created to replace this scratch operand 
> in-place.
> Additionally, a new insn will be generated and inserted after 
> insn_1_reload to
> finish the reload. It's in this form:
> (set scratch, RXX)
>
> And this instruction is illegal. no target implements this kind of 
> pattern.
> LRA will ICE because of this.
> "internal compiler error: in lra_set_insn_recog_data, at lra.c:964"
>
> And indeed, this pattern has no side-effect. The scratch operand should
> stay inside the pattern.
>
> Normally, at the very beginning of LRA reload, all scratch operands 
> will be
> replaced by newly created pseudo register. However, this is a problem 
> when
> generated reload insn has output scratch operand.
>
> I have checked, x86, arm, aarch64, mips, arc all have such patterns. 
> But it's
> not triggered. In my case, it's triggered by compiling glibc with 
> local change.
>
> So a simple change is made in this patch. The output operand is 
> reloaded only
> when it's not a scratch operand and it's not unused since then.
>
>
> aarch64-none-linux-gnu bootstrap and regression test OK.
> x86_64-linux bootstrap and regression test OK.
> OK for trunk?
>
>
Thanks for working on this and providing a good description of the 
problem.  Could you fill a PR and provide a test even if you can not 
reduce it.

It is always hard to describe a bound of responsibilities for correct 
behaviour between LRA/reload and machine-dependent code. For example, 
reload can work when you describe a register class constraint in an insn 
definition and actually a subclass of the class rejects the operand 
mode.  I could provide the same behaviour for LRA but it is wrong as 
global register allocator will generate inefficient code.

As for the scratch.  As I understand the scratch was introduced for 
operands which will not require any resources (memory or a new register) 
for some insn alternatives.  If we use pseudo for this, it will always 
need memory or a register.  The typical constraint for scratch is "r,X" 
or "0r".  So I guess using just "&r" for scratch is a bad practice.  
Still for compatibility I think we should implement the same reload 
behaviour for this case too.

I believe we should use the same technique -- changing scratches to 
pseudo and back at the end of LRA if they don't need a register.  It 
will solve also a possible problem for correct scratch generation during 
LRA.

I am going to work on this problem on the next week.  A test case would 
be a help for me.
>
> gcc/ChangeLog:
>
> 2016-02-26  Renlin Li<renlin.li@arm.com>
>
>     * lra-constraints.c (curr_insn_transform): Don't generate reload for
>     output scratch operand.
>
Sorry, I can not accept the patch as I'd like to provide a better 
solution I described above.  The patch is also wrong for unused 
non-scratch operands.  They still should be reloaded if they do not 
satisfy their constraints even if they are not used later.
Renlin Li Feb. 29, 2016, 4:11 p.m. UTC | #4
Hi Vladimir,

Thank you for explain it.
I have a few comments inlined.

On 26/02/16 23:54, Vladimir Makarov wrote:
>
> Thanks for working on this and providing a good description of the 
> problem.  Could you fill a PR and provide a test even if you can not 
> reduce it.

I will fill a PR. Try to reduce a test case. As it's triggered by my 
local change to gcc, I cannot guarantee it.
Anyway, I am quite happy to test your fix when you have one.

> As for the scratch.  As I understand the scratch was introduced for 
> operands which will not require any resources (memory or a new 
> register) for some insn alternatives.  If we use pseudo for this, it 
> will always need memory or a register.  The typical constraint for 
> scratch is "r,X" or "0r".  So I guess using just "&r" for scratch is a 
> bad practice.  Still for compatibility I think we should implement the 
> same reload behaviour for this case too.

Actually (clobber (match_scratch:MODE x "=r")) also triggers this ICE. 
The early clobber modifier here doesn't really matter.
the purpose of this pattern is to reserve a pseudo register for use as a 
temporary.
The "=" modifier is required for MATCH_SCRATCH expression. Otherwise, it 
will error "missing output reload"
That why (set scratch, RXX) is generated.

>
> I believe we should use the same technique -- changing scratches to 
> pseudo and back at the end of LRA if they don't need a register.  It 
> will solve also a possible problem for correct scratch generation 
> during LRA.
>
> I am going to work on this problem on the next week.  A test case 
> would be a help for me.
>>
>> gcc/ChangeLog:
>>
>> 2016-02-26  Renlin Li<renlin.li@arm.com>
>>
>>     * lra-constraints.c (curr_insn_transform): Don't generate reload for
>>     output scratch operand.
>>
> Sorry, I can not accept the patch as I'd like to provide a better 
> solution I described above.  The patch is also wrong for unused 
> non-scratch operands.  They still should be reloaded if they do not 
> satisfy their constraints even if they are not used later.
>

I think still it will be reload according to the code logic here:

if (get_reload_reg (type, mode, old, goal_alt[i],
                   loc != curr_id->operand_loc[i], "", &new_reg)
           && type != OP_OUT)
         {
           push_to_sequence (before);
           lra_emit_move (new_reg, old);
           before = get_insns ();
           end_sequence ();
         }
       *loc = new_reg; ------------->>>>>>>>> the original operand will 
be replaced by a reload reg.
       if (type != OP_IN
           /* Don't generate reload for output scratch operand.  */
           && GET_CODE (old) != SCRATCH
           && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)


a reload register will be generated to replace the old operand in the 
original rtx pattern to satisfy their constraints.
Later, it will check, if this operand is an ouput operand which will be 
used later, another insn will be generated to
move newly generate pseudo into old operand.

The patch is to add one more condition to this final insn generation.

Regards,
Renlin
diff mbox

Patch

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 08cf0aa6c4208bb60ba5071bad1255d587f1cb4a..ef5809ff226cca69bb711bfc5dab55e24caba01a 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3882,6 +3882,8 @@  curr_insn_transform (bool check_only_p)
 	    }
 	  *loc = new_reg;
 	  if (type != OP_IN
+	      /* Don't generate reload for output scratch operand.  */
+	      && GET_CODE (old) != SCRATCH
 	      && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
 	    {
 	      start_sequence ();