Patchwork avx512 mask register spilling

login
register
mail settings
Submitter Richard Henderson
Date Aug. 6, 2013, 7:11 p.m.
Message ID <52014A44.6010306@redhat.com>
Download mbox | patch
Permalink /patch/265203/
State New
Headers show

Comments

Richard Henderson - Aug. 6, 2013, 7:11 p.m.
On 08/06/2013 03:57 AM, Kirill Yukhin wrote:
> On 05 Aug 09:55, Richard Henderson wrote:
>> On 08/05/2013 08:07 AM, Kirill Yukhin wrote:
>>> Hello Richard, Vlad,
>>>
>>> On 31 Jul 06:26, Richard Henderson wrote:
>>>> On 07/31/2013 05:02 AM, Kirill Yukhin wrote:
>>>>>
>>>>> There's ICE (max. number of generated reload insns per insn is achieved (90)),
>>>>> when LRA tried to save mask register before call. This was caused by fact that  split_reg function
>>>>> in lra-constraints.c allocates memory for saved reg in SECONDARY_MEMORY_NEEDED_MODE which 
>>>>
>>>> I've told you before that it's not SECONDARY_MEMORY that you want, but
>>>> a secondary reload.  You should be patching ix86_secondary_reload, right
>>>> below where we handle QImode spills from non-Q registers for x86-32.
>>>
>>> Trying to do that with no success so far.
>>> Could you pls correct me if I am wrong.
>>> What I am trying to do is to introduce 2 new `define_expand' for load and store.
>>
>> Huh?  You shouldn't need this.
>>
>> Give me a test case and I can have a look at it.
> 
> Hello,
> 
> I've squashed part 1 and 2 + rebased on recent trunk.
> Testcase is attached.
> To reproduce: build-x86_64-linux/gcc/xgcc -Bbuild-x86_64-linux/gcc -Ofast -mavx512f -march=core-avx2 repro.c  -S -o-  -ffixed-rsi  -ffixed-rdi  -ffixed-rbx -ffixed-rbp -m32
> 
> Thanks a lot for help!

You've found what I believe to be a bug in LRA.

Specifically, lra-constraints.c split_reg uses SECONDARY_MEMORY_NEEDED_MODE to
choose what mode to spill a caller-save register.  Given the existing
definition in i386.h, this tries to spill a MASK_CLASS register in SImode.  But
MASK_CLASS does not support SImode, only QI/HImode.  Which leads to substantial
confusion in the allocator trying to satisfy the move.

I believe the use of SECONDARY_MEMORY_NEEDED_MODE in split_reg is wrong.
What's the history behind that, Vlad?  Surely we can spill the value in
its current mode?

Certainly this patch fixes the crash from Kirill's reproducer...


r~
* lra-constraints.c (split_reg): Always spill in the current mode.
Vladimir Makarov - Aug. 7, 2013, 3:54 p.m.
On 13-08-06 3:11 PM, Richard Henderson wrote:
> On 08/06/2013 03:57 AM, Kirill Yukhin wrote:
>> On 05 Aug 09:55, Richard Henderson wrote:
>>> On 08/05/2013 08:07 AM, Kirill Yukhin wrote:
>>>> Hello Richard, Vlad,
>>>>
>>>> On 31 Jul 06:26, Richard Henderson wrote:
>>>>> On 07/31/2013 05:02 AM, Kirill Yukhin wrote:
>>>>>> There's ICE (max. number of generated reload insns per insn is achieved (90)),
>>>>>> when LRA tried to save mask register before call. This was caused by fact that  split_reg function
>>>>>> in lra-constraints.c allocates memory for saved reg in SECONDARY_MEMORY_NEEDED_MODE which
>>>>> I've told you before that it's not SECONDARY_MEMORY that you want, but
>>>>> a secondary reload.  You should be patching ix86_secondary_reload, right
>>>>> below where we handle QImode spills from non-Q registers for x86-32.
>>>> Trying to do that with no success so far.
>>>> Could you pls correct me if I am wrong.
>>>> What I am trying to do is to introduce 2 new `define_expand' for load and store.
>>> Huh?  You shouldn't need this.
>>>
>>> Give me a test case and I can have a look at it.
>> Hello,
>>
>> I've squashed part 1 and 2 + rebased on recent trunk.
>> Testcase is attached.
>> To reproduce: build-x86_64-linux/gcc/xgcc -Bbuild-x86_64-linux/gcc -Ofast -mavx512f -march=core-avx2 repro.c  -S -o-  -ffixed-rsi  -ffixed-rdi  -ffixed-rbx -ffixed-rbp -m32
>>
>> Thanks a lot for help!
> You've found what I believe to be a bug in LRA.
>
> Specifically, lra-constraints.c split_reg uses SECONDARY_MEMORY_NEEDED_MODE to
> choose what mode to spill a caller-save register.  Given the existing
> definition in i386.h, this tries to spill a MASK_CLASS register in SImode.  But
> MASK_CLASS does not support SImode, only QI/HImode.  Which leads to substantial
> confusion in the allocator trying to satisfy the move.
>
> I believe the use of SECONDARY_MEMORY_NEEDED_MODE in split_reg is wrong.
> What's the history behind that, Vlad?  Surely we can spill the value in
> its current mode?

As I remember I tried to decrease number of macros used for LRA.

Just using mode of reg might not work for general case.

Reload (caller-saves.c) uses HARD_REGNO_CALLER_SAVE_MODE.   I guess we 
should use it.

I'll try to implement this and after some testing and checking on a few 
platform I'll commit it.

I guess we will have a solution at the end of this week.
> Certainly this patch fixes the crash from Kirill's reproducer...
>
Thanks for working on this, Richard.
>

Patch

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 728c058..924d588 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4388,11 +4388,7 @@  split_reg (bool before_p, int original_regno, rtx insn, rtx next_usage_insns)
     {
       enum machine_mode sec_mode;
 
-#ifdef SECONDARY_MEMORY_NEEDED_MODE
-      sec_mode = SECONDARY_MEMORY_NEEDED_MODE (GET_MODE (original_reg));
-#else
       sec_mode = GET_MODE (original_reg);
-#endif
       new_reg = lra_create_new_reg (sec_mode, NULL_RTX,
 				    NO_REGS, "save");
     }