diff mbox series

i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL

Message ID 20190207211057.19933-1-hjl.tools@gmail.com
State New
Headers show
Series i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL | expand

Commit Message

H.J. Lu Feb. 7, 2019, 9:10 p.m. UTC
OImode and TImode moves must be done in XImode to access upper 16
vector registers without AVX512VL.  With AVX512VL, we can access
upper 16 vector registers in OImode and TImode.

	PR target/89229
	* config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
	upper 16 vector registers without TARGET_AVX512VL.
	(*movti_internal): Likewise.
---
 gcc/config/i386/i386.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Uros Bizjak Feb. 8, 2019, 9:51 a.m. UTC | #1
On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> OImode and TImode moves must be done in XImode to access upper 16
> vector registers without AVX512VL.  With AVX512VL, we can access
> upper 16 vector registers in OImode and TImode.
>
>         PR target/89229
>         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
>         upper 16 vector registers without TARGET_AVX512VL.
>         (*movti_internal): Likewise.

Please use (not (match_test "...")) instead of (match_test "!...") and
put the new test as the first argument of the AND rtx.

LGTM with the above change.

Uros.

> ---
>  gcc/config/i386/i386.md | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c1492363bca..e7f4b9a9c8d 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1933,8 +1933,9 @@
>     (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>     (set_attr "prefix" "vex")
>     (set (attr "mode")
> -       (cond [(ior (match_operand 0 "ext_sse_reg_operand")
> -                   (match_operand 1 "ext_sse_reg_operand"))
> +       (cond [(and (ior (match_operand 0 "ext_sse_reg_operand")
> +                        (match_operand 1 "ext_sse_reg_operand"))
> +                   (match_test "!TARGET_AVX512VL"))
>                  (const_string "XI")
>                (and (eq_attr "alternative" "1")
>                     (match_test "TARGET_AVX512VL"))
> @@ -2012,8 +2013,9 @@
>     (set (attr "mode")
>         (cond [(eq_attr "alternative" "0,1")
>                  (const_string "DI")
> -              (ior (match_operand 0 "ext_sse_reg_operand")
> -                   (match_operand 1 "ext_sse_reg_operand"))
> +              (and (ior (match_operand 0 "ext_sse_reg_operand")
> +                        (match_operand 1 "ext_sse_reg_operand"))
> +                   (match_test "!TARGET_AVX512VL"))
>                  (const_string "XI")
>                (and (eq_attr "alternative" "3")
>                     (match_test "TARGET_AVX512VL"))
> --
> 2.20.1
>
H.J. Lu Feb. 8, 2019, 11:28 a.m. UTC | #2
On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > OImode and TImode moves must be done in XImode to access upper 16
> > vector registers without AVX512VL.  With AVX512VL, we can access
> > upper 16 vector registers in OImode and TImode.
> >
> >         PR target/89229
> >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> >         upper 16 vector registers without TARGET_AVX512VL.
> >         (*movti_internal): Likewise.
>
> Please use (not (match_test "...")) instead of (match_test "!...") and
> put the new test as the first argument of the AND rtx.
>
> LGTM with the above change.

This is the patch I am checking in.

Thanks.

H.J.
---
OImode and TImode moves must be done in XImode to access upper 16
vector registers without AVX512VL.  With AVX512VL, we can access
upper 16 vector registers in OImode and TImode.

PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
upper 16 vector registers without TARGET_AVX512VL.
(*movti_internal): Likewise.
---
 gcc/config/i386/i386.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c1492363bca..3d9141ae450 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1933,8 +1933,9 @@
    (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
    (set_attr "prefix" "vex")
    (set (attr "mode")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
-     (match_operand 1 "ext_sse_reg_operand"))
+ (cond [(and (not (match_test "TARGET_AVX512VL"))
+     (ior (match_operand 0 "ext_sse_reg_operand")
+ (match_operand 1 "ext_sse_reg_operand")))
  (const_string "XI")
         (and (eq_attr "alternative" "1")
      (match_test "TARGET_AVX512VL"))
@@ -2012,8 +2013,9 @@
    (set (attr "mode")
  (cond [(eq_attr "alternative" "0,1")
  (const_string "DI")
-        (ior (match_operand 0 "ext_sse_reg_operand")
-     (match_operand 1 "ext_sse_reg_operand"))
+        (and (not (match_test "TARGET_AVX512VL"))
+     (ior (match_operand 0 "ext_sse_reg_operand")
+ (match_operand 1 "ext_sse_reg_operand")))
  (const_string "XI")
         (and (eq_attr "alternative" "3")
      (match_test "TARGET_AVX512VL"))
--
H.J. Lu Feb. 9, 2019, 12:29 a.m. UTC | #3
On Fri, Feb 8, 2019 at 3:28 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > OImode and TImode moves must be done in XImode to access upper 16
> > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > upper 16 vector registers in OImode and TImode.
> > >
> > >         PR target/89229
> > >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > >         upper 16 vector registers without TARGET_AVX512VL.
> > >         (*movti_internal): Likewise.
> >
> > Please use (not (match_test "...")) instead of (match_test "!...") and
> > put the new test as the first argument of the AND rtx.
> >
> > LGTM with the above change.
>
> This is the patch I am checking in.
>
> Thanks.
>
> H.J.
> ---
> OImode and TImode moves must be done in XImode to access upper 16
> vector registers without AVX512VL.  With AVX512VL, we can access
> upper 16 vector registers in OImode and TImode.
>
> PR target/89229
> * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> upper 16 vector registers without TARGET_AVX512VL.
> (*movti_internal): Likewise.
> ---
>  gcc/config/i386/i386.md | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c1492363bca..3d9141ae450 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1933,8 +1933,9 @@
>     (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>     (set_attr "prefix" "vex")
>     (set (attr "mode")
> - (cond [(ior (match_operand 0 "ext_sse_reg_operand")
> -     (match_operand 1 "ext_sse_reg_operand"))
> + (cond [(and (not (match_test "TARGET_AVX512VL"))
> +     (ior (match_operand 0 "ext_sse_reg_operand")
> + (match_operand 1 "ext_sse_reg_operand")))
>   (const_string "XI")
>          (and (eq_attr "alternative" "1")
>       (match_test "TARGET_AVX512VL"))
> @@ -2012,8 +2013,9 @@
>     (set (attr "mode")
>   (cond [(eq_attr "alternative" "0,1")
>   (const_string "DI")
> -        (ior (match_operand 0 "ext_sse_reg_operand")
> -     (match_operand 1 "ext_sse_reg_operand"))
> +        (and (not (match_test "TARGET_AVX512VL"))
> +     (ior (match_operand 0 "ext_sse_reg_operand")
> + (match_operand 1 "ext_sse_reg_operand")))
>   (const_string "XI")
>          (and (eq_attr "alternative" "3")
>       (match_test "TARGET_AVX512VL"))
> --

Also need this patch since we no longer set MODE_XI for
AVX512VL.
Uros Bizjak Feb. 9, 2019, 9:50 a.m. UTC | #4
On 2/9/19, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Feb 8, 2019 at 3:28 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>> >
>> > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> > >
>> > > OImode and TImode moves must be done in XImode to access upper 16
>> > > vector registers without AVX512VL.  With AVX512VL, we can access
>> > > upper 16 vector registers in OImode and TImode.
>> > >
>> > >         PR target/89229
>> > >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI
>> > > for
>> > >         upper 16 vector registers without TARGET_AVX512VL.
>> > >         (*movti_internal): Likewise.
>> >
>> > Please use (not (match_test "...")) instead of (match_test "!...") and
>> > put the new test as the first argument of the AND rtx.
>> >
>> > LGTM with the above change.
>>
>> This is the patch I am checking in.
>>
>> Thanks.
>>
>> H.J.
>> ---
>> OImode and TImode moves must be done in XImode to access upper 16
>> vector registers without AVX512VL.  With AVX512VL, we can access
>> upper 16 vector registers in OImode and TImode.
>>
>> PR target/89229
>> * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
>> upper 16 vector registers without TARGET_AVX512VL.
>> (*movti_internal): Likewise.
>> ---
>>  gcc/config/i386/i386.md | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index c1492363bca..3d9141ae450 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -1933,8 +1933,9 @@
>>     (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>>     (set_attr "prefix" "vex")
>>     (set (attr "mode")
>> - (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>> -     (match_operand 1 "ext_sse_reg_operand"))
>> + (cond [(and (not (match_test "TARGET_AVX512VL"))
>> +     (ior (match_operand 0 "ext_sse_reg_operand")
>> + (match_operand 1 "ext_sse_reg_operand")))
>>   (const_string "XI")
>>          (and (eq_attr "alternative" "1")
>>       (match_test "TARGET_AVX512VL"))
>> @@ -2012,8 +2013,9 @@
>>     (set (attr "mode")
>>   (cond [(eq_attr "alternative" "0,1")
>>   (const_string "DI")
>> -        (ior (match_operand 0 "ext_sse_reg_operand")
>> -     (match_operand 1 "ext_sse_reg_operand"))
>> +        (and (not (match_test "TARGET_AVX512VL"))
>> +     (ior (match_operand 0 "ext_sse_reg_operand")
>> + (match_operand 1 "ext_sse_reg_operand")))
>>   (const_string "XI")
>>          (and (eq_attr "alternative" "3")
>>       (match_test "TARGET_AVX512VL"))
>> --
>
> Also need this patch since we no longer set MODE_XI for
> AVX512VL.

No. Please figure out correct condition to set mode attribute to XImode instead.

Uros.
Jakub Jelinek Feb. 9, 2019, 9:56 a.m. UTC | #5
On Sat, Feb 09, 2019 at 10:50:43AM +0100, Uros Bizjak wrote:
> > Also need this patch since we no longer set MODE_XI for
> > AVX512VL.
> 
> No. Please figure out correct condition to set mode attribute to XImode instead.

If it is AVX512VL, isn't MODE_OI or MODE_TI correct in those cases though?
While the instructions need EVEX encoding if they have [xy]mm{16,...31}
operands, they operate just on 256 or 128 bits.

	Jakub
Jakub Jelinek Feb. 9, 2019, 10:40 a.m. UTC | #6
On Sat, Feb 09, 2019 at 10:56:38AM +0100, Jakub Jelinek wrote:
> On Sat, Feb 09, 2019 at 10:50:43AM +0100, Uros Bizjak wrote:
> > > Also need this patch since we no longer set MODE_XI for
> > > AVX512VL.
> > 
> > No. Please figure out correct condition to set mode attribute to XImode instead.
> 
> If it is AVX512VL, isn't MODE_OI or MODE_TI correct in those cases though?
> While the instructions need EVEX encoding if they have [xy]mm{16,...31}
> operands, they operate just on 256 or 128 bits.

That said, mov{oi,ti}_internal is severely broken for avx512f without
avx512vl even after this patch.

I think the following patch, incremental to H.J.'s patch, should fix that.
It is pretty much a copy of what sse.md (*mov<mode>_internal) pattern does,
just specialized to the particular instructions (i.e. that it is integral,
not floating, and always 32-byte or always 16-byte).  sse.md has:
      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
         in avx512f, so we need to use workarounds, to access sse registers
         16-31, which are evex-only. In avx512vl we don't need workarounds.  */
      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
          && (EXT_REX_SSE_REG_P (operands[0])
              || EXT_REX_SSE_REG_P (operands[1])))
        {
          if (memory_operand (operands[0], <MODE>mode))
            {
              if (<MODE_SIZE> == 32)
                return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
              else if (<MODE_SIZE> == 16)
                return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
              else
                gcc_unreachable ();
            }
          else if (memory_operand (operands[1], <MODE>mode))
            {
              if (<MODE_SIZE> == 32)
                return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
              else if (<MODE_SIZE> == 16)
                return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
              else
                gcc_unreachable ();
            }
          else
            /* Reg -> reg move is always aligned.  Just use wider move.  */
            switch (get_attr_mode (insn))
              {
              case MODE_V8SF:
              case MODE_V4SF:
                return "vmovaps\t{%g1, %g0|%g0, %g1}";
              case MODE_V4DF:
              case MODE_V2DF:
                return "vmovapd\t{%g1, %g0|%g0, %g1}";
              case MODE_OI:
              case MODE_TI:
                return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
              default:
                gcc_unreachable ();
              }
        }
before it tries to handle the normal cases.  Ok for trunk if it passes
bootstrap/regtest?

2019-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR target/89229
	* config/i386/i386.md (*movoi_internal_avx, *movti_internal): Handle
	MODE_XI properly.

--- gcc/config/i386/i386.md.jj	2019-02-09 11:18:53.995450055 +0100
+++ gcc/config/i386/i386.md	2019-02-09 11:26:04.364342306 +0100
@@ -1905,6 +1905,18 @@ (define_insn "*movoi_internal_avx"
       return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
+      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
+	 in avx512f, so we need to use workarounds to access sse registers
+	 16-31, which are evex-only. In avx512vl we don't need workarounds.  */
+      if (get_attr_mode (insn) == MODE_XI)
+	{
+	  if (memory_operand (operands[0], OImode))
+	    return "vextracti64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
+	  else if (memory_operand (operands[1], OImode))
+	    return "vbroadcasti64x4\t{%1, %g0|%g0, %1}";
+	  else
+	    return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
+	}
       if (misaligned_operand (operands[0], OImode)
 	  || misaligned_operand (operands[1], OImode))
 	{
@@ -1968,6 +1980,18 @@ (define_insn "*movti_internal"
       return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
+      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
+	 in avx512f, so we need to use workarounds to access sse registers
+	 16-31, which are evex-only. In avx512vl we don't need workarounds.  */
+      if (get_attr_mode (insn) == MODE_XI)
+	{
+	  if (memory_operand (operands[0], TImode))
+	    return "vextracti32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
+	  else if (memory_operand (operands[1], TImode))
+	    return "vbroadcasti32x4\t{%1, %g0|%g0, %1}";
+	  else
+	    return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
+	}
       /* TDmode values are passed as TImode on the stack.  Moving them
 	 to stack may result in unaligned memory access.  */
       if (misaligned_operand (operands[0], TImode)


	Jakub
Jakub Jelinek Feb. 9, 2019, 10:50 a.m. UTC | #7
On Sat, Feb 09, 2019 at 11:40:49AM +0100, Jakub Jelinek wrote:
> 2019-02-09  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/89229
> 	* config/i386/i386.md (*movoi_internal_avx, *movti_internal): Handle
> 	MODE_XI properly.

Actually, I believe this shouldn't be needed, basically I think MODE_XI
should never be the case for these instructions, because hard_regno_mode_ok
shouldn't allow that:

      /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
      if (TARGET_AVX512VL
          && (mode == OImode
              || mode == TImode
              || VALID_AVX256_REG_MODE (mode)
              || VALID_AVX512VL_128_REG_MODE (mode)))
        return true;

      /* xmm16-xmm31 are only available for AVX-512.  */
      if (EXT_REX_SSE_REGNO_P (regno))
        return false;

but then the question is if we really need:
(and (not (match_test "TARGET_AVX512VL"))
                    (ior (match_operand 0 "ext_sse_reg_operand")
                         (match_operand 1 "ext_sse_reg_operand")))
                 (const_string "XI")
on both of the instructions, not avx512vl, the above shouldn't allow
ext_sse_reg_operand through with OImode or TImode.
We still need the MODE_XI -> EXT_REX_SSE_REGNO_P patch H.J. posted.

	Jakub
H.J. Lu Feb. 9, 2019, 12:11 p.m. UTC | #8
On Sat, Feb 9, 2019 at 2:50 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Sat, Feb 09, 2019 at 11:40:49AM +0100, Jakub Jelinek wrote:
> > 2019-02-09  Jakub Jelinek  <jakub@redhat.com>
> >
> >       PR target/89229
> >       * config/i386/i386.md (*movoi_internal_avx, *movti_internal): Handle
> >       MODE_XI properly.
>
> Actually, I believe this shouldn't be needed, basically I think MODE_XI
> should never be the case for these instructions, because hard_regno_mode_ok
> shouldn't allow that:
>
>       /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
>       if (TARGET_AVX512VL
>           && (mode == OImode
>               || mode == TImode
>               || VALID_AVX256_REG_MODE (mode)
>               || VALID_AVX512VL_128_REG_MODE (mode)))
>         return true;
>
>       /* xmm16-xmm31 are only available for AVX-512.  */
>       if (EXT_REX_SSE_REGNO_P (regno))
>         return false;
>
> but then the question is if we really need:
> (and (not (match_test "TARGET_AVX512VL"))
>                     (ior (match_operand 0 "ext_sse_reg_operand")
>                          (match_operand 1 "ext_sse_reg_operand")))
>                  (const_string "XI")
> on both of the instructions, not avx512vl, the above shouldn't allow
> ext_sse_reg_operand through with OImode or TImode.
> We still need the MODE_XI -> EXT_REX_SSE_REGNO_P patch H.J. posted.
>
>         Jakub

I believe all usages of

(ior (match_operand 0 "ext_sse_reg_operand")
      (match_operand 1 "ext_sse_reg_operand"))

should be checked.  I am not sure if they should be there at all.
Jakub Jelinek Feb. 9, 2019, 12:22 p.m. UTC | #9
On Sat, Feb 09, 2019 at 04:11:43AM -0800, H.J. Lu wrote:
> I believe all usages of
> 
> (ior (match_operand 0 "ext_sse_reg_operand")
>       (match_operand 1 "ext_sse_reg_operand"))
> 
> should be checked.  I am not sure if they should be there at all.

E.g. in i386.md all the other spots look fine, because {DI,SI,DF,SF}mode
is allowed in ext sse regs even with -mavx512f.  And sse.md doesn't use this
at all.  What I'm wondering is if we need the sse.md (*mov<mode>_internal)
code I've cited earlier, doing bootstrap/regtest now with gcc_unreachable in
there (and in *mov{o,x}i_internal* for MODE_XI too) too see if it ever
triggers.

	Jakub
Jakub Jelinek Feb. 9, 2019, 1:39 p.m. UTC | #10
On Sat, Feb 09, 2019 at 01:22:30PM +0100, Jakub Jelinek wrote:
> On Sat, Feb 09, 2019 at 04:11:43AM -0800, H.J. Lu wrote:
> > I believe all usages of
> > 
> > (ior (match_operand 0 "ext_sse_reg_operand")
> >       (match_operand 1 "ext_sse_reg_operand"))
> > 
> > should be checked.  I am not sure if they should be there at all.
> 
> E.g. in i386.md all the other spots look fine, because {DI,SI,DF,SF}mode
> is allowed in ext sse regs even with -mavx512f.  And sse.md doesn't use this
> at all.  What I'm wondering is if we need the sse.md (*mov<mode>_internal)
> code I've cited earlier, doing bootstrap/regtest now with gcc_unreachable in
> there (and in *mov{o,x}i_internal* for MODE_XI too) too see if it ever
> triggers.

The following didn't ICE on anything, which is not a proof, but given that
hard_regno_mode_ok should return false for ext_sse_reg_operand regs for
avx512f && !avx512vl, it matches my expectations, on the other hand, it was
a normal defaults bootstrap, don't have a knl which might be best for this
to test -mavx512f -mno-avx512vl on everything.
So perhaps we can also nuke the large if from mov<mode>_internal.

--- gcc/config/i386/i386.md.jj	2019-02-09 12:35:57.971475641 +0100
+++ gcc/config/i386/i386.md	2019-02-09 12:37:40.776802962 +0100
@@ -1905,6 +1905,7 @@ (define_insn "*movoi_internal_avx"
       return standard_sse_constant_opcode (insn, operands);
 
     case TYPE_SSEMOV:
+      gcc_assert (get_attr_mode (insn) != MODE_XI);
       if (misaligned_operand (operands[0], OImode)
 	  || misaligned_operand (operands[1], OImode))
 	{
@@ -1970,6 +1971,7 @@ (define_insn "*movti_internal"
     case TYPE_SSEMOV:
       /* TDmode values are passed as TImode on the stack.  Moving them
 	 to stack may result in unaligned memory access.  */
+      gcc_assert (get_attr_mode (insn) != MODE_XI);
       if (misaligned_operand (operands[0], TImode)
 	  || misaligned_operand (operands[1], TImode))
 	{
--- gcc/config/i386/sse.md.jj	2019-01-28 21:57:39.301110220 +0100
+++ gcc/config/i386/sse.md	2019-02-09 12:36:45.863696416 +0100
@@ -989,6 +989,7 @@ (define_insn "mov<mode>_internal"
 	  && (EXT_REX_SSE_REG_P (operands[0])
 	      || EXT_REX_SSE_REG_P (operands[1])))
 	{
+	  gcc_unreachable ();
 	  if (memory_operand (operands[0], <MODE>mode))
 	    {
 	      if (<MODE_SIZE> == 32)

	Jakub
Alan Modra Feb. 11, 2019, 2:35 a.m. UTC | #11
On Fri, Feb 08, 2019 at 10:51:34AM +0100, Uros Bizjak wrote:
> On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > OImode and TImode moves must be done in XImode to access upper 16
> > vector registers without AVX512VL.  With AVX512VL, we can access
> > upper 16 vector registers in OImode and TImode.
> >
> >         PR target/89229
> >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> >         upper 16 vector registers without TARGET_AVX512VL.
> >         (*movti_internal): Likewise.
> 
> Please use (not (match_test "...")) instead of (match_test "!...") and

I'm curious.  Is there a reason other than style to ask for this
change?
Uros Bizjak Feb. 11, 2019, 7:23 a.m. UTC | #12
On Mon, Feb 11, 2019 at 3:35 AM Alan Modra <amodra@gmail.com> wrote:
>
> On Fri, Feb 08, 2019 at 10:51:34AM +0100, Uros Bizjak wrote:
> > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > OImode and TImode moves must be done in XImode to access upper 16
> > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > upper 16 vector registers in OImode and TImode.
> > >
> > >         PR target/89229
> > >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > >         upper 16 vector registers without TARGET_AVX512VL.
> > >         (*movti_internal): Likewise.
> >
> > Please use (not (match_test "...")) instead of (match_test "!...") and
>
> I'm curious.  Is there a reason other than style to ask for this
> change?

It is style that we want to keep throughout i386 *.md files,
otherwise, it should result in identical code.

Uros.
H.J. Lu Feb. 11, 2019, 1:10 p.m. UTC | #13
On Sat, Feb 09, 2019 at 02:39:30PM +0100, Jakub Jelinek wrote:
> On Sat, Feb 09, 2019 at 01:22:30PM +0100, Jakub Jelinek wrote:
> > On Sat, Feb 09, 2019 at 04:11:43AM -0800, H.J. Lu wrote:
> > > I believe all usages of
> > > 
> > > (ior (match_operand 0 "ext_sse_reg_operand")
> > >       (match_operand 1 "ext_sse_reg_operand"))
> > > 
> > > should be checked.  I am not sure if they should be there at all.
> > 
> > E.g. in i386.md all the other spots look fine, because {DI,SI,DF,SF}mode
> > is allowed in ext sse regs even with -mavx512f.  And sse.md doesn't use this
> > at all.  What I'm wondering is if we need the sse.md (*mov<mode>_internal)
> > code I've cited earlier, doing bootstrap/regtest now with gcc_unreachable in
> > there (and in *mov{o,x}i_internal* for MODE_XI too) too see if it ever
> > triggers.
> 
> The following didn't ICE on anything, which is not a proof, but given that
> hard_regno_mode_ok should return false for ext_sse_reg_operand regs for
> avx512f && !avx512vl, it matches my expectations, on the other hand, it was
> a normal defaults bootstrap, don't have a knl which might be best for this
> to test -mavx512f -mno-avx512vl on everything.
> So perhaps we can also nuke the large if from mov<mode>_internal.
> 
> --- gcc/config/i386/i386.md.jj	2019-02-09 12:35:57.971475641 +0100
> +++ gcc/config/i386/i386.md	2019-02-09 12:37:40.776802962 +0100
> @@ -1905,6 +1905,7 @@ (define_insn "*movoi_internal_avx"
>        return standard_sse_constant_opcode (insn, operands);
>  
>      case TYPE_SSEMOV:
> +      gcc_assert (get_attr_mode (insn) != MODE_XI);
>        if (misaligned_operand (operands[0], OImode)
>  	  || misaligned_operand (operands[1], OImode))
>  	{
> @@ -1970,6 +1971,7 @@ (define_insn "*movti_internal"
>      case TYPE_SSEMOV:
>        /* TDmode values are passed as TImode on the stack.  Moving them
>  	 to stack may result in unaligned memory access.  */
> +      gcc_assert (get_attr_mode (insn) != MODE_XI);
>        if (misaligned_operand (operands[0], TImode)
>  	  || misaligned_operand (operands[1], TImode))
>  	{
> --- gcc/config/i386/sse.md.jj	2019-01-28 21:57:39.301110220 +0100
> +++ gcc/config/i386/sse.md	2019-02-09 12:36:45.863696416 +0100
> @@ -989,6 +989,7 @@ (define_insn "mov<mode>_internal"
>  	  && (EXT_REX_SSE_REG_P (operands[0])
>  	      || EXT_REX_SSE_REG_P (operands[1])))
>  	{
> +	  gcc_unreachable ();
>  	  if (memory_operand (operands[0], <MODE>mode))
>  	    {
>  	      if (<MODE_SIZE> == 32)
> 

Here is the updated patch to remove ext_sse_reg_operand check with a
testcase.

OK for trunk?

Thanks.

H.J.
---
Since hard_regno_mode_ok only allows xmm16-xmm31 in OImode or TImode
with TARGET_AVX512VL:

      /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
      if (TARGET_AVX512VL
          && (mode == OImode
              || mode == TImode
              || VALID_AVX256_REG_MODE (mode)
              || VALID_AVX512VL_128_REG_MODE (mode)))
        return true;

      /* xmm16-xmm31 are only available for AVX-512.  */
      if (EXT_REX_SSE_REGNO_P (regno))
        return false;

there is no need to check ext_sse_reg_operand in *movoi_internal_avx nor
*movti_internal.  Instead, we should check EXT_REX_SSE_REG_P for upper 16
vector registers.

2019-02-11  H.J. Lu  <hongjiu.lu@intel.com>
	    Jakub Jelinek  <jakub@redhat.com>

gcc/

	PR target/89229
	* config/i386/i386.md (*movoi_internal_avx): Check
	EXT_REX_SSE_REG_P instead of MODE_XI for upper 16 vector
	registers.
	(*movti_internal): Likewise.

gcc/testsuite/

	PR target/89229
	* gcc.target/i386/pr89229-1.c: New test.
---
 gcc/config/i386/i386.md                   | 22 +++++------
 gcc/testsuite/gcc.target/i386/pr89229-1.c | 47 +++++++++++++++++++++++
 2 files changed, 56 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-1.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3d9141ae450..5b89e52493e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1910,7 +1910,8 @@
 	{
 	  if (get_attr_mode (insn) == MODE_V8SF)
 	    return "vmovups\t{%1, %0|%0, %1}";
-	  else if (get_attr_mode (insn) == MODE_XI)
+	  else if (EXT_REX_SSE_REG_P (operands[0])
+		   || EXT_REX_SSE_REG_P (operands[1]))
 	    return "vmovdqu32\t{%1, %0|%0, %1}";
 	  else
 	    return "vmovdqu\t{%1, %0|%0, %1}";
@@ -1919,7 +1920,8 @@
 	{
 	  if (get_attr_mode (insn) == MODE_V8SF)
 	    return "vmovaps\t{%1, %0|%0, %1}";
-	  else if (get_attr_mode (insn) == MODE_XI)
+	  else if (EXT_REX_SSE_REG_P (operands[0])
+		   || EXT_REX_SSE_REG_P (operands[1]))
 	    return "vmovdqa32\t{%1, %0|%0, %1}";
 	  else
 	    return "vmovdqa\t{%1, %0|%0, %1}";
@@ -1933,11 +1935,7 @@
    (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
    (set_attr "prefix" "vex")
    (set (attr "mode")
-	(cond [(and (not (match_test "TARGET_AVX512VL"))
-		    (ior (match_operand 0 "ext_sse_reg_operand")
-			 (match_operand 1 "ext_sse_reg_operand")))
-		 (const_string "XI")
-	       (and (eq_attr "alternative" "1")
+	(cond [(and (eq_attr "alternative" "1")
 		    (match_test "TARGET_AVX512VL"))
 		 (const_string "OI")
 	       (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
@@ -1973,7 +1971,8 @@
 	{
 	  if (get_attr_mode (insn) == MODE_V4SF)
 	    return "%vmovups\t{%1, %0|%0, %1}";
-	  else if (get_attr_mode (insn) == MODE_XI)
+	  else if (EXT_REX_SSE_REG_P (operands[0])
+		   || EXT_REX_SSE_REG_P (operands[1]))
 	    return "vmovdqu32\t{%1, %0|%0, %1}";
 	  else
 	    return "%vmovdqu\t{%1, %0|%0, %1}";
@@ -1982,7 +1981,8 @@
 	{
 	  if (get_attr_mode (insn) == MODE_V4SF)
 	    return "%vmovaps\t{%1, %0|%0, %1}";
-	  else if (get_attr_mode (insn) == MODE_XI)
+	  else if (EXT_REX_SSE_REG_P (operands[0])
+		   || EXT_REX_SSE_REG_P (operands[1]))
 	    return "vmovdqa32\t{%1, %0|%0, %1}";
 	  else
 	    return "%vmovdqa\t{%1, %0|%0, %1}";
@@ -2013,10 +2013,6 @@
    (set (attr "mode")
 	(cond [(eq_attr "alternative" "0,1")
 		 (const_string "DI")
-	       (and (not (match_test "TARGET_AVX512VL"))
-		    (ior (match_operand 0 "ext_sse_reg_operand")
-			 (match_operand 1 "ext_sse_reg_operand")))
-		 (const_string "XI")
 	       (and (eq_attr "alternative" "3")
 		    (match_test "TARGET_AVX512VL"))
 		 (const_string "TI")
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-1.c b/gcc/testsuite/gcc.target/i386/pr89229-1.c
new file mode 100644
index 00000000000..cce95350bf2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-1.c
@@ -0,0 +1,47 @@
+/* { dg-do assemble { target { avx512bw && avx512vl } } } */
+/* { dg-options "-O1 -mavx512bw -mavx512vl -mtune=skylake-avx512" } */
+
+extern void abort (void);
+extern void exit (int);
+struct s { unsigned char a[256]; };
+union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; };
+static union u v;
+static union u v0;
+static struct s *p = &v.d.b;
+static struct s *q = &v.e.b;
+
+static inline struct s rp (void) { return *p; }
+static inline struct s rq (void) { return *q; }
+static void pq (void) { *p = rq(); }
+static void qp (void) { *q = rp(); }
+
+static void
+init (struct s *sp)
+{
+  int i;
+  for (i = 0; i < 256; i++)
+    sp->a[i] = i;
+}
+
+static void
+check (struct s *sp)
+{
+  int i;
+  for (i = 0; i < 256; i++)
+    if (sp->a[i] != i)
+      abort ();
+}
+
+void
+main_test (void)
+{
+  v = v0;
+  init (p);
+  qp ();
+  check (q);
+  v = v0;
+  init (q);
+  pq ();
+  check (p);
+  exit (0);
+}
Uros Bizjak Feb. 11, 2019, 1:15 p.m. UTC | #14
On Mon, Feb 11, 2019 at 2:10 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sat, Feb 09, 2019 at 02:39:30PM +0100, Jakub Jelinek wrote:
> > On Sat, Feb 09, 2019 at 01:22:30PM +0100, Jakub Jelinek wrote:
> > > On Sat, Feb 09, 2019 at 04:11:43AM -0800, H.J. Lu wrote:
> > > > I believe all usages of
> > > >
> > > > (ior (match_operand 0 "ext_sse_reg_operand")
> > > >       (match_operand 1 "ext_sse_reg_operand"))
> > > >
> > > > should be checked.  I am not sure if they should be there at all.
> > >
> > > E.g. in i386.md all the other spots look fine, because {DI,SI,DF,SF}mode
> > > is allowed in ext sse regs even with -mavx512f.  And sse.md doesn't use this
> > > at all.  What I'm wondering is if we need the sse.md (*mov<mode>_internal)
> > > code I've cited earlier, doing bootstrap/regtest now with gcc_unreachable in
> > > there (and in *mov{o,x}i_internal* for MODE_XI too) too see if it ever
> > > triggers.
> >
> > The following didn't ICE on anything, which is not a proof, but given that
> > hard_regno_mode_ok should return false for ext_sse_reg_operand regs for
> > avx512f && !avx512vl, it matches my expectations, on the other hand, it was
> > a normal defaults bootstrap, don't have a knl which might be best for this
> > to test -mavx512f -mno-avx512vl on everything.
> > So perhaps we can also nuke the large if from mov<mode>_internal.
> >
> > --- gcc/config/i386/i386.md.jj        2019-02-09 12:35:57.971475641 +0100
> > +++ gcc/config/i386/i386.md   2019-02-09 12:37:40.776802962 +0100
> > @@ -1905,6 +1905,7 @@ (define_insn "*movoi_internal_avx"
> >        return standard_sse_constant_opcode (insn, operands);
> >
> >      case TYPE_SSEMOV:
> > +      gcc_assert (get_attr_mode (insn) != MODE_XI);
> >        if (misaligned_operand (operands[0], OImode)
> >         || misaligned_operand (operands[1], OImode))
> >       {
> > @@ -1970,6 +1971,7 @@ (define_insn "*movti_internal"
> >      case TYPE_SSEMOV:
> >        /* TDmode values are passed as TImode on the stack.  Moving them
> >        to stack may result in unaligned memory access.  */
> > +      gcc_assert (get_attr_mode (insn) != MODE_XI);
> >        if (misaligned_operand (operands[0], TImode)
> >         || misaligned_operand (operands[1], TImode))
> >       {
> > --- gcc/config/i386/sse.md.jj 2019-01-28 21:57:39.301110220 +0100
> > +++ gcc/config/i386/sse.md    2019-02-09 12:36:45.863696416 +0100
> > @@ -989,6 +989,7 @@ (define_insn "mov<mode>_internal"
> >         && (EXT_REX_SSE_REG_P (operands[0])
> >             || EXT_REX_SSE_REG_P (operands[1])))
> >       {
> > +       gcc_unreachable ();
> >         if (memory_operand (operands[0], <MODE>mode))
> >           {
> >             if (<MODE_SIZE> == 32)
> >
>
> Here is the updated patch to remove ext_sse_reg_operand check with a
> testcase.
>
> OK for trunk?

No. As said, please correctly set mode to XImode in mode attribute calculation.

Uros.

> Thanks.
>
> H.J.
> ---
> Since hard_regno_mode_ok only allows xmm16-xmm31 in OImode or TImode
> with TARGET_AVX512VL:
>
>       /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
>       if (TARGET_AVX512VL
>           && (mode == OImode
>               || mode == TImode
>               || VALID_AVX256_REG_MODE (mode)
>               || VALID_AVX512VL_128_REG_MODE (mode)))
>         return true;
>
>       /* xmm16-xmm31 are only available for AVX-512.  */
>       if (EXT_REX_SSE_REGNO_P (regno))
>         return false;
>
> there is no need to check ext_sse_reg_operand in *movoi_internal_avx nor
> *movti_internal.  Instead, we should check EXT_REX_SSE_REG_P for upper 16
> vector registers.
>
> 2019-02-11  H.J. Lu  <hongjiu.lu@intel.com>
>             Jakub Jelinek  <jakub@redhat.com>
>
> gcc/
>
>         PR target/89229
>         * config/i386/i386.md (*movoi_internal_avx): Check
>         EXT_REX_SSE_REG_P instead of MODE_XI for upper 16 vector
>         registers.
>         (*movti_internal): Likewise.
>
> gcc/testsuite/
>
>         PR target/89229
>         * gcc.target/i386/pr89229-1.c: New test.
> ---
>  gcc/config/i386/i386.md                   | 22 +++++------
>  gcc/testsuite/gcc.target/i386/pr89229-1.c | 47 +++++++++++++++++++++++
>  2 files changed, 56 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-1.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 3d9141ae450..5b89e52493e 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1910,7 +1910,8 @@
>         {
>           if (get_attr_mode (insn) == MODE_V8SF)
>             return "vmovups\t{%1, %0|%0, %1}";
> -         else if (get_attr_mode (insn) == MODE_XI)
> +         else if (EXT_REX_SSE_REG_P (operands[0])
> +                  || EXT_REX_SSE_REG_P (operands[1]))
>             return "vmovdqu32\t{%1, %0|%0, %1}";
>           else
>             return "vmovdqu\t{%1, %0|%0, %1}";
> @@ -1919,7 +1920,8 @@
>         {
>           if (get_attr_mode (insn) == MODE_V8SF)
>             return "vmovaps\t{%1, %0|%0, %1}";
> -         else if (get_attr_mode (insn) == MODE_XI)
> +         else if (EXT_REX_SSE_REG_P (operands[0])
> +                  || EXT_REX_SSE_REG_P (operands[1]))
>             return "vmovdqa32\t{%1, %0|%0, %1}";
>           else
>             return "vmovdqa\t{%1, %0|%0, %1}";
> @@ -1933,11 +1935,7 @@
>     (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>     (set_attr "prefix" "vex")
>     (set (attr "mode")
> -       (cond [(and (not (match_test "TARGET_AVX512VL"))
> -                   (ior (match_operand 0 "ext_sse_reg_operand")
> -                        (match_operand 1 "ext_sse_reg_operand")))
> -                (const_string "XI")
> -              (and (eq_attr "alternative" "1")
> +       (cond [(and (eq_attr "alternative" "1")
>                     (match_test "TARGET_AVX512VL"))
>                  (const_string "OI")
>                (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
> @@ -1973,7 +1971,8 @@
>         {
>           if (get_attr_mode (insn) == MODE_V4SF)
>             return "%vmovups\t{%1, %0|%0, %1}";
> -         else if (get_attr_mode (insn) == MODE_XI)
> +         else if (EXT_REX_SSE_REG_P (operands[0])
> +                  || EXT_REX_SSE_REG_P (operands[1]))
>             return "vmovdqu32\t{%1, %0|%0, %1}";
>           else
>             return "%vmovdqu\t{%1, %0|%0, %1}";
> @@ -1982,7 +1981,8 @@
>         {
>           if (get_attr_mode (insn) == MODE_V4SF)
>             return "%vmovaps\t{%1, %0|%0, %1}";
> -         else if (get_attr_mode (insn) == MODE_XI)
> +         else if (EXT_REX_SSE_REG_P (operands[0])
> +                  || EXT_REX_SSE_REG_P (operands[1]))
>             return "vmovdqa32\t{%1, %0|%0, %1}";
>           else
>             return "%vmovdqa\t{%1, %0|%0, %1}";
> @@ -2013,10 +2013,6 @@
>     (set (attr "mode")
>         (cond [(eq_attr "alternative" "0,1")
>                  (const_string "DI")
> -              (and (not (match_test "TARGET_AVX512VL"))
> -                   (ior (match_operand 0 "ext_sse_reg_operand")
> -                        (match_operand 1 "ext_sse_reg_operand")))
> -                (const_string "XI")
>                (and (eq_attr "alternative" "3")
>                     (match_test "TARGET_AVX512VL"))
>                  (const_string "TI")
> diff --git a/gcc/testsuite/gcc.target/i386/pr89229-1.c b/gcc/testsuite/gcc.target/i386/pr89229-1.c
> new file mode 100644
> index 00000000000..cce95350bf2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr89229-1.c
> @@ -0,0 +1,47 @@
> +/* { dg-do assemble { target { avx512bw && avx512vl } } } */
> +/* { dg-options "-O1 -mavx512bw -mavx512vl -mtune=skylake-avx512" } */
> +
> +extern void abort (void);
> +extern void exit (int);
> +struct s { unsigned char a[256]; };
> +union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; };
> +static union u v;
> +static union u v0;
> +static struct s *p = &v.d.b;
> +static struct s *q = &v.e.b;
> +
> +static inline struct s rp (void) { return *p; }
> +static inline struct s rq (void) { return *q; }
> +static void pq (void) { *p = rq(); }
> +static void qp (void) { *q = rp(); }
> +
> +static void
> +init (struct s *sp)
> +{
> +  int i;
> +  for (i = 0; i < 256; i++)
> +    sp->a[i] = i;
> +}
> +
> +static void
> +check (struct s *sp)
> +{
> +  int i;
> +  for (i = 0; i < 256; i++)
> +    if (sp->a[i] != i)
> +      abort ();
> +}
> +
> +void
> +main_test (void)
> +{
> +  v = v0;
> +  init (p);
> +  qp ();
> +  check (q);
> +  v = v0;
> +  init (q);
> +  pq ();
> +  check (p);
> +  exit (0);
> +}
> --
> 2.20.1
>
H.J. Lu Feb. 11, 2019, 1:29 p.m. UTC | #15
On Mon, Feb 11, 2019 at 5:15 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Mon, Feb 11, 2019 at 2:10 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sat, Feb 09, 2019 at 02:39:30PM +0100, Jakub Jelinek wrote:
> > > On Sat, Feb 09, 2019 at 01:22:30PM +0100, Jakub Jelinek wrote:
> > > > On Sat, Feb 09, 2019 at 04:11:43AM -0800, H.J. Lu wrote:
> > > > > I believe all usages of
> > > > >
> > > > > (ior (match_operand 0 "ext_sse_reg_operand")
> > > > >       (match_operand 1 "ext_sse_reg_operand"))
> > > > >
> > > > > should be checked.  I am not sure if they should be there at all.
> > > >
> > > > E.g. in i386.md all the other spots look fine, because {DI,SI,DF,SF}mode
> > > > is allowed in ext sse regs even with -mavx512f.  And sse.md doesn't use this
> > > > at all.  What I'm wondering is if we need the sse.md (*mov<mode>_internal)
> > > > code I've cited earlier, doing bootstrap/regtest now with gcc_unreachable in
> > > > there (and in *mov{o,x}i_internal* for MODE_XI too) too see if it ever
> > > > triggers.
> > >
> > > The following didn't ICE on anything, which is not a proof, but given that
> > > hard_regno_mode_ok should return false for ext_sse_reg_operand regs for
> > > avx512f && !avx512vl, it matches my expectations, on the other hand, it was
> > > a normal defaults bootstrap, don't have a knl which might be best for this
> > > to test -mavx512f -mno-avx512vl on everything.
> > > So perhaps we can also nuke the large if from mov<mode>_internal.
> > >
> > > --- gcc/config/i386/i386.md.jj        2019-02-09 12:35:57.971475641 +0100
> > > +++ gcc/config/i386/i386.md   2019-02-09 12:37:40.776802962 +0100
> > > @@ -1905,6 +1905,7 @@ (define_insn "*movoi_internal_avx"
> > >        return standard_sse_constant_opcode (insn, operands);
> > >
> > >      case TYPE_SSEMOV:
> > > +      gcc_assert (get_attr_mode (insn) != MODE_XI);
> > >        if (misaligned_operand (operands[0], OImode)
> > >         || misaligned_operand (operands[1], OImode))
> > >       {
> > > @@ -1970,6 +1971,7 @@ (define_insn "*movti_internal"
> > >      case TYPE_SSEMOV:
> > >        /* TDmode values are passed as TImode on the stack.  Moving them
> > >        to stack may result in unaligned memory access.  */
> > > +      gcc_assert (get_attr_mode (insn) != MODE_XI);
> > >        if (misaligned_operand (operands[0], TImode)
> > >         || misaligned_operand (operands[1], TImode))
> > >       {
> > > --- gcc/config/i386/sse.md.jj 2019-01-28 21:57:39.301110220 +0100
> > > +++ gcc/config/i386/sse.md    2019-02-09 12:36:45.863696416 +0100
> > > @@ -989,6 +989,7 @@ (define_insn "mov<mode>_internal"
> > >         && (EXT_REX_SSE_REG_P (operands[0])
> > >             || EXT_REX_SSE_REG_P (operands[1])))
> > >       {
> > > +       gcc_unreachable ();
> > >         if (memory_operand (operands[0], <MODE>mode))
> > >           {
> > >             if (<MODE_SIZE> == 32)
> > >
> >
> > Here is the updated patch to remove ext_sse_reg_operand check with a
> > testcase.
> >
> > OK for trunk?
>
> No. As said, please correctly set mode to XImode in mode attribute calculation.

There is

 switch (get_attr_type (insn))
    {
    case TYPE_SSELOG1:
      return standard_sse_constant_opcode (insn, operands);

standard_sse_constant_opcode has

else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
    {
      enum attr_mode insn_mode = get_attr_mode (insn);

      switch (insn_mode)
        {
        case MODE_XI:
        case MODE_V8DF:
        case MODE_V16SF:
          gcc_assert (TARGET_AVX512F);
          return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";

What mode should be used to set %xmm23 to -1 with AVX512VL?  What mode
should be used to load %xmm23 with AVX512VL? There is no need to
check ext_sse_reg_operand here the same as in

(define_insn "mov<mode>_internal"
  [(set (match_operand:VMOVE 0 "nonimmediate_operand"
         "=v,v ,v ,m")
        (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
         " C,BC,vm,v"))]
  "TARGET_SSE
   && (register_operand (operands[0], <MODE>mode)
       || register_operand (operands[1], <MODE>mode))"
{

> Uros.
>
> > Thanks.
> >
> > H.J.
> > ---
> > Since hard_regno_mode_ok only allows xmm16-xmm31 in OImode or TImode
> > with TARGET_AVX512VL:
> >
> >       /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
> >       if (TARGET_AVX512VL
> >           && (mode == OImode
> >               || mode == TImode
> >               || VALID_AVX256_REG_MODE (mode)
> >               || VALID_AVX512VL_128_REG_MODE (mode)))
> >         return true;
> >
> >       /* xmm16-xmm31 are only available for AVX-512.  */
> >       if (EXT_REX_SSE_REGNO_P (regno))
> >         return false;
> >
> > there is no need to check ext_sse_reg_operand in *movoi_internal_avx nor
> > *movti_internal.  Instead, we should check EXT_REX_SSE_REG_P for upper 16
> > vector registers.
> >
> > 2019-02-11  H.J. Lu  <hongjiu.lu@intel.com>
> >             Jakub Jelinek  <jakub@redhat.com>
> >
> > gcc/
> >
> >         PR target/89229
> >         * config/i386/i386.md (*movoi_internal_avx): Check
> >         EXT_REX_SSE_REG_P instead of MODE_XI for upper 16 vector
> >         registers.
> >         (*movti_internal): Likewise.
> >
> > gcc/testsuite/
> >
> >         PR target/89229
> >         * gcc.target/i386/pr89229-1.c: New test.
> > ---
> >  gcc/config/i386/i386.md                   | 22 +++++------
> >  gcc/testsuite/gcc.target/i386/pr89229-1.c | 47 +++++++++++++++++++++++
> >  2 files changed, 56 insertions(+), 13 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-1.c
> >
> > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > index 3d9141ae450..5b89e52493e 100644
> > --- a/gcc/config/i386/i386.md
> > +++ b/gcc/config/i386/i386.md
> > @@ -1910,7 +1910,8 @@
> >         {
> >           if (get_attr_mode (insn) == MODE_V8SF)
> >             return "vmovups\t{%1, %0|%0, %1}";
> > -         else if (get_attr_mode (insn) == MODE_XI)
> > +         else if (EXT_REX_SSE_REG_P (operands[0])
> > +                  || EXT_REX_SSE_REG_P (operands[1]))
> >             return "vmovdqu32\t{%1, %0|%0, %1}";
> >           else
> >             return "vmovdqu\t{%1, %0|%0, %1}";
> > @@ -1919,7 +1920,8 @@
> >         {
> >           if (get_attr_mode (insn) == MODE_V8SF)
> >             return "vmovaps\t{%1, %0|%0, %1}";
> > -         else if (get_attr_mode (insn) == MODE_XI)
> > +         else if (EXT_REX_SSE_REG_P (operands[0])
> > +                  || EXT_REX_SSE_REG_P (operands[1]))
> >             return "vmovdqa32\t{%1, %0|%0, %1}";
> >           else
> >             return "vmovdqa\t{%1, %0|%0, %1}";
> > @@ -1933,11 +1935,7 @@
> >     (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
> >     (set_attr "prefix" "vex")
> >     (set (attr "mode")
> > -       (cond [(and (not (match_test "TARGET_AVX512VL"))
> > -                   (ior (match_operand 0 "ext_sse_reg_operand")
> > -                        (match_operand 1 "ext_sse_reg_operand")))
> > -                (const_string "XI")
> > -              (and (eq_attr "alternative" "1")
> > +       (cond [(and (eq_attr "alternative" "1")
> >                     (match_test "TARGET_AVX512VL"))
> >                  (const_string "OI")
> >                (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
> > @@ -1973,7 +1971,8 @@
> >         {
> >           if (get_attr_mode (insn) == MODE_V4SF)
> >             return "%vmovups\t{%1, %0|%0, %1}";
> > -         else if (get_attr_mode (insn) == MODE_XI)
> > +         else if (EXT_REX_SSE_REG_P (operands[0])
> > +                  || EXT_REX_SSE_REG_P (operands[1]))
> >             return "vmovdqu32\t{%1, %0|%0, %1}";
> >           else
> >             return "%vmovdqu\t{%1, %0|%0, %1}";
> > @@ -1982,7 +1981,8 @@
> >         {
> >           if (get_attr_mode (insn) == MODE_V4SF)
> >             return "%vmovaps\t{%1, %0|%0, %1}";
> > -         else if (get_attr_mode (insn) == MODE_XI)
> > +         else if (EXT_REX_SSE_REG_P (operands[0])
> > +                  || EXT_REX_SSE_REG_P (operands[1]))
> >             return "vmovdqa32\t{%1, %0|%0, %1}";
> >           else
> >             return "%vmovdqa\t{%1, %0|%0, %1}";
> > @@ -2013,10 +2013,6 @@
> >     (set (attr "mode")
> >         (cond [(eq_attr "alternative" "0,1")
> >                  (const_string "DI")
> > -              (and (not (match_test "TARGET_AVX512VL"))
> > -                   (ior (match_operand 0 "ext_sse_reg_operand")
> > -                        (match_operand 1 "ext_sse_reg_operand")))
> > -                (const_string "XI")
> >                (and (eq_attr "alternative" "3")
> >                     (match_test "TARGET_AVX512VL"))
> >                  (const_string "TI")
> > diff --git a/gcc/testsuite/gcc.target/i386/pr89229-1.c b/gcc/testsuite/gcc.target/i386/pr89229-1.c
> > new file mode 100644
> > index 00000000000..cce95350bf2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr89229-1.c
> > @@ -0,0 +1,47 @@
> > +/* { dg-do assemble { target { avx512bw && avx512vl } } } */
> > +/* { dg-options "-O1 -mavx512bw -mavx512vl -mtune=skylake-avx512" } */
> > +
> > +extern void abort (void);
> > +extern void exit (int);
> > +struct s { unsigned char a[256]; };
> > +union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; };
> > +static union u v;
> > +static union u v0;
> > +static struct s *p = &v.d.b;
> > +static struct s *q = &v.e.b;
> > +
> > +static inline struct s rp (void) { return *p; }
> > +static inline struct s rq (void) { return *q; }
> > +static void pq (void) { *p = rq(); }
> > +static void qp (void) { *q = rp(); }
> > +
> > +static void
> > +init (struct s *sp)
> > +{
> > +  int i;
> > +  for (i = 0; i < 256; i++)
> > +    sp->a[i] = i;
> > +}
> > +
> > +static void
> > +check (struct s *sp)
> > +{
> > +  int i;
> > +  for (i = 0; i < 256; i++)
> > +    if (sp->a[i] != i)
> > +      abort ();
> > +}
> > +
> > +void
> > +main_test (void)
> > +{
> > +  v = v0;
> > +  init (p);
> > +  qp ();
> > +  check (q);
> > +  v = v0;
> > +  init (q);
> > +  pq ();
> > +  check (p);
> > +  exit (0);
> > +}
> > --
> > 2.20.1
> >
Jakub Jelinek Feb. 11, 2019, 1:47 p.m. UTC | #16
On Mon, Feb 11, 2019 at 02:15:18PM +0100, Uros Bizjak wrote:
> > Here is the updated patch to remove ext_sse_reg_operand check with a
> > testcase.
> >
> > OK for trunk?
> 
> No. As said, please correctly set mode to XImode in mode attribute calculation.

The instructions in question are
vmovdqu32 mem, %{x,y}mm{1[6-9],2[0-9],3[01]}
vmovdqu32 %{x,y}mm{1[6-9],2[0-9],3[01]}, mem
vmovdqa32 mem, %{x,y}mm{1[6-9],2[0-9],3[01]}
vmovdqa32 %{x,y}mm{1[6-9],2[0-9],3[01]}, mem
vmovdqa32 %{x,y}mm{[0-9],[12][0-9],3[01]}, %{x,y}mm{1[6-9],2[0-9],3[01]}
vmovdqa32 %{x,y}mm{1[6-9],2[0-9],3[01]}, %{x,y}mm{[0-9],[12][0-9],3[01]}
Why should those instructions be XImode?  They have 16 or 32 byte operands,
never 64 byte.

Using EXT_REX_SSE_REG_P is what *movdi_internal uses too:
        case MODE_TI:
          /* Handle AVX512 registers set.  */
          if (EXT_REX_SSE_REG_P (operands[0])
              || EXT_REX_SSE_REG_P (operands[1]))
            return "vmovdqa64\t{%1, %0|%0, %1}";
          return "%vmovdqa\t{%1, %0|%0, %1}";
Sure, in that case it is not MODE_DI, but MODE_TI, because the move is
actually 128-bit, not 64-bit, but we do not claim it is 512-bit.

*movsi_internal is incorrect (and inefficient):
        case MODE_TI:
          return "%vmovdqa\t{%1, %0|%0, %1}";
        case MODE_XI:
          return "vmovdqa32\t{%g1, %g0|%g0, %g1}";
...
            (eq_attr "alternative" "8,9")
              (cond [(ior (match_operand 0 "ext_sse_reg_operand")
                          (match_operand 1 "ext_sse_reg_operand"))
                       (const_string "XI")
                     (ior (not (match_test "TARGET_SSE2"))
                          (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
                       (const_string "V4SF")
                     (match_test "TARGET_AVX")
                       (const_string "TI")
                     (match_test "optimize_function_for_size_p (cfun)")
                       (const_string "V4SF")
                    ]
                    (const_string "TI"))
In my reading, for (set (reg:SI xmm16) (reg:SI xmm17)) the above will
emit vmovdqa32	%zmm17, %zmm16 even for -mavx512vl, which looks wrong.
So, I'd suggest (and (not (match_test "TARGET_AVX512VL"))
		     (ior (match_operand 0 "ext_sse_reg_operand")
                          (match_operand 1 "ext_sse_reg_operand"))
                       (const_string "XI")

I see other wierdo stuff e.g. in *movdf_internal:
               /* movaps is one byte shorter for non-AVX targets.  */
               (eq_attr "alternative" "13,17")
                 (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
                                  (not (match_test "TARGET_AVX512VL")))
                             (ior (match_operand 0 "ext_sse_reg_operand")
                                  (match_operand 1 "ext_sse_reg_operand")))
                          (const_string "V8DF")
If !TARGET_AVX512VL and one or both of the operands is (are)
ext_sse_reg_operand, obviously MODE_V8DF needs to be used.  But doesn't the
above force use of vmovapd	%zmm16, %zmm17 even if just -mavx512vl
-mprefer-vector-width=512?  I don't see any reason not to use
vmovapd	%xmm16, %xmm17 in that case.  -mprefer-vector-width=512 is not you
must use ZMM all the time, but it is fine to use even EVEX instructions with
512-bit width.  Ditto *movsf_internal.

	Jakub
Uros Bizjak Feb. 11, 2019, 1:50 p.m. UTC | #17
On Mon, Feb 11, 2019 at 2:29 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > No. As said, please correctly set mode to XImode in mode attribute calculation.
>
> There is
>
>  switch (get_attr_type (insn))
>     {
>     case TYPE_SSELOG1:
>       return standard_sse_constant_opcode (insn, operands);
>
> standard_sse_constant_opcode has
>
> else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
>     {
>       enum attr_mode insn_mode = get_attr_mode (insn);
>
>       switch (insn_mode)
>         {
>         case MODE_XI:
>         case MODE_V8DF:
>         case MODE_V16SF:
>           gcc_assert (TARGET_AVX512F);
>           return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";

If there is something wrong with standard_sse_constant_opcode, then
fix the problem in the function itself. With your previous patch, you
introduced a regression, and the presented fix is another kludge to
fix a stack of kludges inside standard_sse_constant_opcode.

Please take your time and propose some acceptable solution that would
put some logic into const_0/const_1 handling. The situation is not OK
and your patch makes it even worse.

Uros.
H.J. Lu Feb. 11, 2019, 2:31 p.m. UTC | #18
On Mon, Feb 11, 2019 at 5:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Mon, Feb 11, 2019 at 2:29 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> > > No. As said, please correctly set mode to XImode in mode attribute calculation.
> >
> > There is
> >
> >  switch (get_attr_type (insn))
> >     {
> >     case TYPE_SSELOG1:
> >       return standard_sse_constant_opcode (insn, operands);
> >
> > standard_sse_constant_opcode has
> >
> > else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
> >     {
> >       enum attr_mode insn_mode = get_attr_mode (insn);
> >
> >       switch (insn_mode)
> >         {
> >         case MODE_XI:
> >         case MODE_V8DF:
> >         case MODE_V16SF:
> >           gcc_assert (TARGET_AVX512F);
> >           return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";
>
> If there is something wrong with standard_sse_constant_opcode, then
> fix the problem in the function itself. With your previous patch, you
> introduced a regression, and the presented fix is another kludge to
> fix a stack of kludges inside standard_sse_constant_opcode.
>
> Please take your time and propose some acceptable solution that would
> put some logic into const_0/const_1 handling. The situation is not OK
> and your patch makes it even worse.
>

Let's first define what MODE_XI means in standard_sse_constant_opcode
as well as in all these mov patterns for with and without AVX512VL.   Without
a clear definition, we can't get out of this mess.
Uros Bizjak Feb. 11, 2019, 3:56 p.m. UTC | #19
On Mon, Feb 11, 2019 at 3:32 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Mon, Feb 11, 2019 at 5:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Mon, Feb 11, 2019 at 2:29 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > > > No. As said, please correctly set mode to XImode in mode attribute calculation.
> > >
> > > There is
> > >
> > >  switch (get_attr_type (insn))
> > >     {
> > >     case TYPE_SSELOG1:
> > >       return standard_sse_constant_opcode (insn, operands);
> > >
> > > standard_sse_constant_opcode has
> > >
> > > else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
> > >     {
> > >       enum attr_mode insn_mode = get_attr_mode (insn);
> > >
> > >       switch (insn_mode)
> > >         {
> > >         case MODE_XI:
> > >         case MODE_V8DF:
> > >         case MODE_V16SF:
> > >           gcc_assert (TARGET_AVX512F);
> > >           return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";
> >
> > If there is something wrong with standard_sse_constant_opcode, then
> > fix the problem in the function itself. With your previous patch, you
> > introduced a regression, and the presented fix is another kludge to
> > fix a stack of kludges inside standard_sse_constant_opcode.
> >
> > Please take your time and propose some acceptable solution that would
> > put some logic into const_0/const_1 handling. The situation is not OK
> > and your patch makes it even worse.
> >
>
> Let's first define what MODE_XI means in standard_sse_constant_opcode
> as well as in all these mov patterns for with and without AVX512VL.   Without
> a clear definition, we can't get out of this mess.

INT_MODE (OI, 32);
INT_MODE (XI, 64);

So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
in case of const_1, all 512 bits set.

We can load zeros with narrower instruction, (e.g. 256 bit by inherent
zeroing of highpart in case of 128 bit xor), so TImode in this case.

Some targets prefer V4SF mode, so they will emit float xorps for zeroing

Then the introduction of AVX512F fubared everything by overloading the
meaning of insn mode.

Uros.
H.J. Lu Feb. 11, 2019, 4:02 p.m. UTC | #20
In Mon, Feb 11, 2019 at 7:56 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Mon, Feb 11, 2019 at 3:32 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Mon, Feb 11, 2019 at 5:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > >
> > > On Mon, Feb 11, 2019 at 2:29 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > > > No. As said, please correctly set mode to XImode in mode attribute calculation.
> > > >
> > > > There is
> > > >
> > > >  switch (get_attr_type (insn))
> > > >     {
> > > >     case TYPE_SSELOG1:
> > > >       return standard_sse_constant_opcode (insn, operands);
> > > >
> > > > standard_sse_constant_opcode has
> > > >
> > > > else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
> > > >     {
> > > >       enum attr_mode insn_mode = get_attr_mode (insn);
> > > >
> > > >       switch (insn_mode)
> > > >         {
> > > >         case MODE_XI:
> > > >         case MODE_V8DF:
> > > >         case MODE_V16SF:
> > > >           gcc_assert (TARGET_AVX512F);
> > > >           return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";
> > >
> > > If there is something wrong with standard_sse_constant_opcode, then
> > > fix the problem in the function itself. With your previous patch, you
> > > introduced a regression, and the presented fix is another kludge to
> > > fix a stack of kludges inside standard_sse_constant_opcode.
> > >
> > > Please take your time and propose some acceptable solution that would
> > > put some logic into const_0/const_1 handling. The situation is not OK
> > > and your patch makes it even worse.
> > >
> >
> > Let's first define what MODE_XI means in standard_sse_constant_opcode
> > as well as in all these mov patterns for with and without AVX512VL.   Without
> > a clear definition, we can't get out of this mess.
>
> INT_MODE (OI, 32);
> INT_MODE (XI, 64);
>
> So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
> in case of const_1, all 512 bits set.
>
> We can load zeros with narrower instruction, (e.g. 256 bit by inherent
> zeroing of highpart in case of 128 bit xor), so TImode in this case.
>
> Some targets prefer V4SF mode, so they will emit float xorps for zeroing
>
> Then the introduction of AVX512F fubared everything by overloading the
> meaning of insn mode.

Exactly.

How should we use INSN mode,  MODE_XI, in standard_sse_constant_opcode
and patterns which use standard_sse_constant_opcode? 2 options:

1.  MODE_XI should only used to check if EXT_REX_SSE_REG_P is true
in any register operand.  The operand size must be determined by operand
itself , not by MODE_XI.  The operand encoding size should be determined
by the operand size, EXT_REX_SSE_REG_P and AVX512VL.
2. MODE_XI should be used to determine the operand encoding size.
EXT_REX_SSE_REG_P and AVX512VL should be checked for encoding
instructions.

Which way should we go?
Jakub Jelinek Feb. 11, 2019, 4:24 p.m. UTC | #21
On Mon, Feb 11, 2019 at 04:56:45PM +0100, Uros Bizjak wrote:
> > Let's first define what MODE_XI means in standard_sse_constant_opcode
> > as well as in all these mov patterns for with and without AVX512VL.   Without
> > a clear definition, we can't get out of this mess.
> 
> INT_MODE (OI, 32);
> INT_MODE (XI, 64);
> 
> So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
> in case of const_1, all 512 bits set.
> 
> We can load zeros with narrower instruction, (e.g. 256 bit by inherent
> zeroing of highpart in case of 128 bit xor), so TImode in this case.
> 
> Some targets prefer V4SF mode, so they will emit float xorps for zeroing
> 
> Then the introduction of AVX512F fubared everything by overloading the
> meaning of insn mode.

I don't see much changes in AVX512F here, most of the behavior has been
there already in AVX.
Most of the SSE/AVX/AVX512 instructions affect the whole register,
usually there is DEST[MAX_VL-1:VL] <- 0 at the end of each instruction.
But, using the MAX_VL to determine get_attr_mode doesn't seem really useful,
because that changes dynamically at runtime based on the actual hw, not on
what we've been compiled for.
So, I believe we want to use that VL value to determine the bitsize of the
mode corresponding to get_attr_mode.  And in that case, for
*movoi_internal_avx and *movti_internal, I believe the right mode is MODE_OI
resp. MODE_TI for AVX512VL, because e.g.
vmovdqa32 %ymm12, %ymm23
is a VL = 256 instruction, not VL = 512.  Similarly, if we want to set
%ymm25 to all ones, i.e. movoi_internal_avx, we use
vpternlogd	$0xFF, %ymm25, %ymm25, %ymm25
which is again VL = 256 instruction, so should use MODE_OI.
We'd need to use
vmovdqa32 %zmm12, %zmm23
or
vpternlogd	$0xFF, %zmm25, %zmm25, %zmm25
instructions for AVX512F without AVX512VL, but as has been discussed, this
won't really happen, because hard_regno_mode_ok refuses to allocate 256-bit
or 128-bit modes in ext sse registers.

	Jakub
Uros Bizjak Feb. 12, 2019, 6:02 p.m. UTC | #22
On Fri, Feb 8, 2019 at 12:29 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > OImode and TImode moves must be done in XImode to access upper 16
> > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > upper 16 vector registers in OImode and TImode.
> > >
> > >         PR target/89229
> > >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > >         upper 16 vector registers without TARGET_AVX512VL.
> > >         (*movti_internal): Likewise.
> >
> > Please use (not (match_test "...")) instead of (match_test "!...") and
> > put the new test as the first argument of the AND rtx.
> >
> > LGTM with the above change.
>
> This is the patch I am checking in.

HJ,

please revert two PR89229 patches as they introduce a regression.

Uros.

> Thanks.
>
> H.J.
> ---
> OImode and TImode moves must be done in XImode to access upper 16
> vector registers without AVX512VL.  With AVX512VL, we can access
> upper 16 vector registers in OImode and TImode.
>
> PR target/89229
> * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> upper 16 vector registers without TARGET_AVX512VL.
> (*movti_internal): Likewise.
> ---
>  gcc/config/i386/i386.md | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c1492363bca..3d9141ae450 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1933,8 +1933,9 @@
>     (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>     (set_attr "prefix" "vex")
>     (set (attr "mode")
> - (cond [(ior (match_operand 0 "ext_sse_reg_operand")
> -     (match_operand 1 "ext_sse_reg_operand"))
> + (cond [(and (not (match_test "TARGET_AVX512VL"))
> +     (ior (match_operand 0 "ext_sse_reg_operand")
> + (match_operand 1 "ext_sse_reg_operand")))
>   (const_string "XI")
>          (and (eq_attr "alternative" "1")
>       (match_test "TARGET_AVX512VL"))
> @@ -2012,8 +2013,9 @@
>     (set (attr "mode")
>   (cond [(eq_attr "alternative" "0,1")
>   (const_string "DI")
> -        (ior (match_operand 0 "ext_sse_reg_operand")
> -     (match_operand 1 "ext_sse_reg_operand"))
> +        (and (not (match_test "TARGET_AVX512VL"))
> +     (ior (match_operand 0 "ext_sse_reg_operand")
> + (match_operand 1 "ext_sse_reg_operand")))
>   (const_string "XI")
>          (and (eq_attr "alternative" "3")
>       (match_test "TARGET_AVX512VL"))
> --
H.J. Lu Feb. 12, 2019, 7:01 p.m. UTC | #23
On Tue, Feb 12, 2019 at 10:02 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Fri, Feb 8, 2019 at 12:29 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > >
> > > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > OImode and TImode moves must be done in XImode to access upper 16
> > > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > > upper 16 vector registers in OImode and TImode.
> > > >
> > > >         PR target/89229
> > > >         * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > > >         upper 16 vector registers without TARGET_AVX512VL.
> > > >         (*movti_internal): Likewise.
> > >
> > > Please use (not (match_test "...")) instead of (match_test "!...") and
> > > put the new test as the first argument of the AND rtx.
> > >
> > > LGTM with the above change.
> >
> > This is the patch I am checking in.
>
> HJ,
>
> please revert two PR89229 patches as they introduce a regression.
>

This is what I checked in.
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c1492363bca..e7f4b9a9c8d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1933,8 +1933,9 @@ 
    (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
    (set_attr "prefix" "vex")
    (set (attr "mode")
-	(cond [(ior (match_operand 0 "ext_sse_reg_operand")
-		    (match_operand 1 "ext_sse_reg_operand"))
+	(cond [(and (ior (match_operand 0 "ext_sse_reg_operand")
+			 (match_operand 1 "ext_sse_reg_operand"))
+		    (match_test "!TARGET_AVX512VL"))
 		 (const_string "XI")
 	       (and (eq_attr "alternative" "1")
 		    (match_test "TARGET_AVX512VL"))
@@ -2012,8 +2013,9 @@ 
    (set (attr "mode")
 	(cond [(eq_attr "alternative" "0,1")
 		 (const_string "DI")
-	       (ior (match_operand 0 "ext_sse_reg_operand")
-		    (match_operand 1 "ext_sse_reg_operand"))
+	       (and (ior (match_operand 0 "ext_sse_reg_operand")
+			 (match_operand 1 "ext_sse_reg_operand"))
+		    (match_test "!TARGET_AVX512VL"))
 		 (const_string "XI")
 	       (and (eq_attr "alternative" "3")
 		    (match_test "TARGET_AVX512VL"))