diff mbox

fix hardreg_cprop to honor HARD_REGNO_MODE_OK.

Message ID 20140811134937.GA95937@msticlxl7.ims.intel.com
State New
Headers show

Commit Message

Ilya Tocar Aug. 11, 2014, 1:49 p.m. UTC
Hi,

I've observed SPEC2006 failure on avx512-vlbwdq branch.
It was caused by  hardreg_cprop. In maybe_mode_change it was
assumed, that all values of the same register class and same mode.
are ok. This is not the case for i386/avx512. We need to honor
HARD_REGNO_MODE_OK.
Patch bellow does it.
Ok for trunk?

2014-08-11  Ilya Tocar  <ilya.tocar@intel.com>

	* regcprop.c (maybe_mode_change): Honor HARD_REGNO_MODE_OK.


---
 gcc/regcprop.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jeff Law Aug. 13, 2014, 8:31 p.m. UTC | #1
On 08/11/14 07:49, Ilya Tocar wrote:
> Hi,
>
> I've observed SPEC2006 failure on avx512-vlbwdq branch.
> It was caused by  hardreg_cprop. In maybe_mode_change it was
> assumed, that all values of the same register class and same mode.
> are ok. This is not the case for i386/avx512. We need to honor
> HARD_REGNO_MODE_OK.
> Patch bellow does it.
> Ok for trunk?
>
> 2014-08-11  Ilya Tocar  <ilya.tocar@intel.com>
>
> 	* regcprop.c (maybe_mode_change): Honor HARD_REGNO_MODE_OK.
One could argue that having a class where some members are OK for being 
used in a particular mode, but other members are not is the core issue 
here.

Can you describe a bit about why you've got a class of that nature? 
Background on that would be useful.

jeff
Ilya Tocar Aug. 14, 2014, 1:16 p.m. UTC | #2
> >I've observed SPEC2006 failure on avx512-vlbwdq branch.
> >It was caused by  hardreg_cprop. In maybe_mode_change it was
> >assumed, that all values of the same register class and same mode.
> >are ok. This is not the case for i386/avx512. We need to honor
> >HARD_REGNO_MODE_OK.

> One could argue that having a class where some members are OK for being used
> in a particular mode, but other members are not is the core issue here.
> 
> Can you describe a bit about why you've got a class of that nature?
> Background on that would be useful.
> 

AVX512 added new 16 xmm registers (xmm16-xmm31).
Those registers require evex encoding.
Only 512-bit wide versions of instructions have evex encoding with
avx512f, but all versions have it with avx512vl.
Most instructions have same macroized pattern for 128/256/512 vector
length. They all use constraint 'v', which corresponds to 
class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
256-bit case (avx512f) and allow it only in avx512vl case we have
HARD_REGNO_MODE_OK checking for regno being evex-only and
disallowing it if mode is not 512-bit.
Jeff Law Aug. 29, 2014, 9:03 p.m. UTC | #3
On 08/14/14 07:16, Ilya Tocar wrote:
>>> I've observed SPEC2006 failure on avx512-vlbwdq branch.
>>> It was caused by  hardreg_cprop. In maybe_mode_change it was
>>> assumed, that all values of the same register class and same mode.
>>> are ok. This is not the case for i386/avx512. We need to honor
>>> HARD_REGNO_MODE_OK.
>
>> One could argue that having a class where some members are OK for being used
>> in a particular mode, but other members are not is the core issue here.
>>
>> Can you describe a bit about why you've got a class of that nature?
>> Background on that would be useful.
>>
>
> AVX512 added new 16 xmm registers (xmm16-xmm31).
> Those registers require evex encoding.
> Only 512-bit wide versions of instructions have evex encoding with
> avx512f, but all versions have it with avx512vl.
> Most instructions have same macroized pattern for 128/256/512 vector
> length. They all use constraint 'v', which corresponds to
> class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
> 256-bit case (avx512f) and allow it only in avx512vl case we have
> HARD_REGNO_MODE_OK checking for regno being evex-only and
> disallowing it if mode is not 512-bit.
Generally this kind of thing has been handled by splitting the register 
class into two classes.  I strongly suspect there are numerous places 
where we assume that two regs in the same class are interchangeable.

I realize that's going to require some work in the x86 machine 
description, but I think that's going to be a much better approach and 
save you work in the long run.

jeff
Ilya Tocar Sept. 1, 2014, 10:29 a.m. UTC | #4
> >
> >AVX512 added new 16 xmm registers (xmm16-xmm31).
> >Those registers require evex encoding.
> >Only 512-bit wide versions of instructions have evex encoding with
> >avx512f, but all versions have it with avx512vl.
> >Most instructions have same macroized pattern for 128/256/512 vector
> >length. They all use constraint 'v', which corresponds to
> >class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
> >256-bit case (avx512f) and allow it only in avx512vl case we have
> >HARD_REGNO_MODE_OK checking for regno being evex-only and
> >disallowing it if mode is not 512-bit.
> Generally this kind of thing has been handled by splitting the register
> class into two classes.  I strongly suspect there are numerous places where
> we assume that two regs in the same class are interchangeable.
I'm not sure that there are many places where we replace hard regs
without checks. E. g. in regrename we have HARD_REGNO_RENAME_OK.
As far as I understand, idea behind HARD_REGNO_RENAME_OK is that we
should always check when substituting hard reg. Why is regcprop
different, and what's the point of HARD_REGNO_MODE_OK if it is ignored
by some passes?

> 
> I realize that's going to require some work in the x86 machine description,
> but I think that's going to be a much better approach and save you work in
> the long run.
>

This will approximately double sse.md, as we will need to split all
patterns with 512-bit versions in 2 (512 and 128/256 cases) and play
games with enabling/disabling alternatives depending on flags.
Are you sure that this better than honoring HARD_REGNO_MODE_OK?
As far as I understand, honoring  HARD_REGNO_MODE_OK shouldn't produce
worse code.
Jeff Law Sept. 25, 2014, 7:14 p.m. UTC | #5
On 09/01/14 04:29, Ilya Tocar wrote:
>>>
>>> AVX512 added new 16 xmm registers (xmm16-xmm31).
>>> Those registers require evex encoding.
>>> Only 512-bit wide versions of instructions have evex encoding with
>>> avx512f, but all versions have it with avx512vl.
>>> Most instructions have same macroized pattern for 128/256/512 vector
>>> length. They all use constraint 'v', which corresponds to
>>> class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
>>> 256-bit case (avx512f) and allow it only in avx512vl case we have
>>> HARD_REGNO_MODE_OK checking for regno being evex-only and
>>> disallowing it if mode is not 512-bit.
>> Generally this kind of thing has been handled by splitting the register
>> class into two classes.  I strongly suspect there are numerous places where
>> we assume that two regs in the same class are interchangeable.
> I'm not sure that there are many places where we replace hard regs
> without checks. E. g. in regrename we have HARD_REGNO_RENAME_OK.
> As far as I understand, idea behind HARD_REGNO_RENAME_OK is that we
> should always check when substituting hard reg. Why is regcprop
> different, and what's the point of HARD_REGNO_MODE_OK if it is ignored
> by some passes?
>
>>
>> I realize that's going to require some work in the x86 machine description,
>> but I think that's going to be a much better approach and save you work in
>> the long run.
>>
>
> This will approximately double sse.md, as we will need to split all
> patterns with 512-bit versions in 2 (512 and 128/256 cases) and play
> games with enabling/disabling alternatives depending on flags.
> Are you sure that this better than honoring HARD_REGNO_MODE_OK?
> As far as I understand, honoring  HARD_REGNO_MODE_OK shouldn't produce
> worse code.
I don't see how it doubles the size.  You split the class into two 
classes.  Whatever letter your second class has, you use it in 
conjunction with 'v' that you're already using.  Note you do not need 
different alternatives, you use them in the same alternative.

It's not a question of performance, but of design.  I suspect you're 
really just at the tip of the iceberg with this stuff if you continue to 
go down the path of having registers in the same class, some of which 
are allocatable and some of which are not.

The other approach that I believe has been taken has been to mark the 
new registers as fixed when compiling for hardware where they're not 
available.  But I'm not sure offhand if that would be sufficient to fix 
this problem.


Jeff
Ilya Tocar Sept. 26, 2014, 3:31 p.m. UTC | #6
On 25 Sep 13:14, Jeff Law wrote:
> On 09/01/14 04:29, Ilya Tocar wrote:
> >>>
> >>>AVX512 added new 16 xmm registers (xmm16-xmm31).
> >>>Those registers require evex encoding.
> >>>Only 512-bit wide versions of instructions have evex encoding with
> >>>avx512f, but all versions have it with avx512vl.
> >>>Most instructions have same macroized pattern for 128/256/512 vector
> >>>length. They all use constraint 'v', which corresponds to
> >>>class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
> >>>256-bit case (avx512f) and allow it only in avx512vl case we have
> >>>HARD_REGNO_MODE_OK checking for regno being evex-only and
> >>>disallowing it if mode is not 512-bit.
> >>Generally this kind of thing has been handled by splitting the register
> >>class into two classes.  I strongly suspect there are numerous places where
> >>we assume that two regs in the same class are interchangeable.
> >I'm not sure that there are many places where we replace hard regs
> >without checks. E. g. in regrename we have HARD_REGNO_RENAME_OK.
> >As far as I understand, idea behind HARD_REGNO_RENAME_OK is that we
> >should always check when substituting hard reg. Why is regcprop
> >different, and what's the point of HARD_REGNO_MODE_OK if it is ignored
> >by some passes?
> >
> >>
> >>I realize that's going to require some work in the x86 machine description,
> >>but I think that's going to be a much better approach and save you work in
> >>the long run.
> >>
> >
> >This will approximately double sse.md, as we will need to split all
> >patterns with 512-bit versions in 2 (512 and 128/256 cases) and play
> >games with enabling/disabling alternatives depending on flags.
> >Are you sure that this better than honoring HARD_REGNO_MODE_OK?
> >As far as I understand, honoring  HARD_REGNO_MODE_OK shouldn't produce
> >worse code.
> I don't see how it doubles the size.  You split the class into two classes.
> Whatever letter your second class has, you use it in conjunction with 'v'
> that you're already using.  Note you do not need different alternatives, you
> use them in the same alternative.
I'm not sure how will this help. Consider
add<V2DF,V4DF,V8DF>, right now they are described in one pattern.
Now in AVX512F (without AVX512VL) case we can use xmm16 for V8DF, but not for
V2DF,V4DF. If we keep them in one pattern, they will have same
alternatives for all modes. So we will need to either
split V2DF,V4DF into separate pattern (doubling number of patterns), or
disallow particular modes depending on flags (what we do now).

> 
> It's not a question of performance, but of design.
Obviously, but I still fail to see why honoring HARD_REGNO_MODE_OK is
bad design. I suspect that even without avx512 changes not honoring it will
bite us sooner or later.
> I suspect you're really
> just at the tip of the iceberg with this stuff if you continue to go down
> the path of having registers in the same class, some of which are
> allocatable and some of which are not.
Having class where some registers are not available is an old approach:
Consider SSE_REGS class, where half of registers is not available in
32-bit case. Problem is with different modes being valid in those
registers, depending on flags. And it worked fine for previous
~year in gcc 4.9. In my opinion if we check in original patch we will
harm no one, and fix correctness problem. If we later discover some new
problem, that is not fixable by simple patch, we may rework all of avx512
implementation. As all bugs of this kind will never generate incorrect
code (all error will be caught by assembler), I see no reason not to
check it in.
> 
> The other approach that I believe has been taken has been to mark the new
> registers as fixed when compiling for hardware where they're not available.
> But I'm not sure offhand if that would be sufficient to fix this problem.
It will not help. Registers are available. Just some modes are not
supported.
Jeff Law Oct. 10, 2014, 10:52 p.m. UTC | #7
On 09/26/14 09:31, Ilya Tocar wrote:



>>
>> It's not a question of performance, but of design.
> Obviously, but I still fail to see why honoring HARD_REGNO_MODE_OK is
> bad design. I suspect that even without avx512 changes not honoring it will
> bite us sooner or later.
If we look at maybe_mode_change:

/* Register REGNO was originally set in ORIG_MODE.  It - or a copy of it -
    was copied in COPY_MODE to COPY_REGNO, and then COPY_REGNO was accessed
    in NEW_MODE.
    Return a NEW_MODE rtx for REGNO if that's OK, otherwise return 
NULL_RTX.  */

static rtx
maybe_mode_change (enum machine_mode orig_mode, enum machine_mode copy_mode,
                    enum machine_mode new_mode, unsigned int regno,
                    unsigned int copy_regno ATTRIBUTE_UNUSED)
{
   if (GET_MODE_SIZE (copy_mode) < GET_MODE_SIZE (orig_mode)
       && GET_MODE_SIZE (copy_mode) < GET_MODE_SIZE (new_mode))
     return NULL_RTX;

   if (orig_mode == new_mode)
     return gen_rtx_raw_REG (new_mode, regno);

REGNO was originally set in ORIG_MODE, that's an invariant of this 
fucntion.  ORIG_MODE == NEW_MODE in the conditional which generates the 
new reg.  How in the world can HARD_REGNO_MODE_OK be false in that case?

If it was false in that case, then how in the world did a value get put 
in REGNO using ORIG_MODE/NEW_MODE to start with?!?

It really looks like you're papering over a problem.

Jeff
diff mbox

Patch

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 932037d..694deb2 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -410,7 +410,7 @@  maybe_mode_change (enum machine_mode orig_mode, enum machine_mode copy_mode,
       && GET_MODE_SIZE (copy_mode) < GET_MODE_SIZE (new_mode))
     return NULL_RTX;
 
-  if (orig_mode == new_mode)
+  if (orig_mode == new_mode && HARD_REGNO_MODE_OK (regno, new_mode))
     return gen_rtx_raw_REG (new_mode, regno);
   else if (mode_change_ok (orig_mode, new_mode, regno))
     {