Message ID | 20190711071716.GO2125@tucnak |
---|---|
State | New |
Headers | show |
Series | Fix recent avx512vl-vcvttpd2*dq-2.c FAILs (PR target/91124) | expand |
On Thu, Jul 11, 2019 at 9:17 AM Jakub Jelinek <jakub@redhat.com> wrote: > > Hi! > > The same SCCVN change also introduced: > FAIL: gcc.target/i386/avx512vl-vcvttpd2dq-2.c execution test > FAIL: gcc.target/i386/avx512vl-vcvttpd2udq-2.c execution test > failures, the reason in the generic code is similar, more constants > propagated to builtins, but the actual bug is very different. > The 128-bit instructions like vcvtpd2dqx when using non-{z} masking > don't behave like their RTL pattern describes. > The instructions only use low 2 bits of the mask register and the > upper 2 elements of the 128-bit destination register are always cleared > regardless of what the bit 2 and 3 in the mask register say and what is in > the destination before, so it is incorrect to use: > (set (match_operand:V4SI 0 "register_operand" "=v") > (vec_merge:V4SI (vec_concat:V4SI (fix:V2SI (match_operand:V2DF 1 "vector_operand" "vBm")) > (const_vector:V2SI [ > (const_int 0 [0]) > (const_int 0 [0]) > ])) > (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > (match_operand:QI 3 "register_operand" "Yk"))) > which is correct for the low 2 elements, but for the upper 2 elements > says that if the mask bit 2 (or 3) is set, then the element will be zero > (correct), but if it is clear, then it will be operands[2] element (correct > if it is const0_operand, but if it is some nonimmediate and doesn't have > zeros there, it is incorrect). > The RTL has been correct already for similar instructions, e.g. > *float<floatunssuffix>v2div2sf2{,_mask{,_1}} or some narrowing movs, this > patch just uses similar RTL for those. The _mask_1 patterns are added > as PR69671 will trigger on these new patterns as well, for {z} masking > RTL simplification can fold that second vec_merge operand into CONST0_RTX. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2019-07-11 Jakub Jelinek <jakub@redhat.com> > > PR target/91124 > * config/i386/sse.md (sse2_cvtpd2dq<mask_name>): Change into ... > (sse2_cvtpd2dq): ... this. Remove mask substitution macros. > (sse2_cvtpd2dq_mask, sse2_cvtpd2dq_mask_1): New define_insns. > (ufix_notruncv2dfv2si2<mask_name>): Change into ... > (ufix_notruncv2dfv2si2): ... this. Remove mask substitution macros. > (ufix_notruncv2dfv2si2_mask, ufix_notruncv2dfv2si2_mask_1): New > define_insns. > (ufix_truncv2dfv2si2<mask_name>): Change into ... > (ufix_truncv2dfv2si2): ... this. Remove mask substitution macros. > (ufix_truncv2dfv2si2_mask, ufix_truncv2dfv2si2_mask_1): New > define_insns. > (sse2_cvttpd2dq<mask_name>): Change into ... > (sse2_cvttpd2dq): ... this. Remove mask substitution macros. > (sse2_cvttpd2dq_mask, sse2_cvttpd2dq_mask_1): New define_insns. > (*sse2_cvtpd2dq<mask_name>): Change into ... > (*sse2_cvtpd2dq): ... this. Remove mask substitution macros. > Add "C" constraint to const0_operand. > (*sse2_cvtpd2dq_mask, *sse2_cvtpd2dq_mask_1): New define_insns. > (sse2_cvtpd2ps_mask): Adjust expand to match *sse2_cvtpd2ps_mask > changes. OK. Thanks, Uros. > --- gcc/config/i386/sse.md.jj 2019-07-08 23:52:40.239757801 +0200 > +++ gcc/config/i386/sse.md 2019-07-10 22:50:15.021381492 +0200 > @@ -5927,16 +5927,16 @@ (define_insn "*avx_cvtpd2dq256_2" > (set_attr "btver2_decode" "vector") > (set_attr "mode" "OI")]) > > -(define_insn "sse2_cvtpd2dq<mask_name>" > +(define_insn "sse2_cvtpd2dq" > [(set (match_operand:V4SI 0 "register_operand" "=v") > (vec_concat:V4SI > (unspec:V2SI [(match_operand:V2DF 1 "vector_operand" "vBm")] > UNSPEC_FIX_NOTRUNC) > (const_vector:V2SI [(const_int 0) (const_int 0)])))] > - "TARGET_SSE2 && <mask_avx512vl_condition>" > + "TARGET_SSE2" > { > if (TARGET_AVX) > - return "vcvtpd2dq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"; > + return "vcvtpd2dq{x}\t{%1, %0|%0, %1}"; > else > return "cvtpd2dq\t{%1, %0|%0, %1}"; > } > @@ -5949,6 +5949,38 @@ (define_insn "sse2_cvtpd2dq<mask_name>" > (set_attr "athlon_decode" "vector") > (set_attr "bdver1_decode" "double")]) > > +(define_insn "sse2_cvtpd2dq_mask" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] > + UNSPEC_FIX_NOTRUNC) > + (vec_select:V2SI > + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvtpd2dq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "*sse2_cvtpd2dq_mask_1" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] > + UNSPEC_FIX_NOTRUNC) > + (const_vector:V2SI [(const_int 0) (const_int 0)]) > + (match_operand:QI 2 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvtpd2dq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > ;; For ufix_notrunc* insn patterns > (define_mode_attr pd2udqsuff > [(V8DF "") (V4DF "{y}")]) > @@ -5964,15 +5996,49 @@ (define_insn "ufix_notrunc<mode><si2dfmo > (set_attr "prefix" "evex") > (set_attr "mode" "<sseinsnmode>")]) > > -(define_insn "ufix_notruncv2dfv2si2<mask_name>" > +(define_insn "ufix_notruncv2dfv2si2" > [(set (match_operand:V4SI 0 "register_operand" "=v") > (vec_concat:V4SI > (unspec:V2SI > [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] > - UNSPEC_UNSIGNED_FIX_NOTRUNC) > + UNSPEC_UNSIGNED_FIX_NOTRUNC) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvtpd2udq{x}\t{%1, %0|%0, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "ufix_notruncv2dfv2si2_mask" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unspec:V2SI > + [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] > + UNSPEC_UNSIGNED_FIX_NOTRUNC) > + (vec_select:V2SI > + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvtpd2udq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "*ufix_notruncv2dfv2si2_mask_1" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unspec:V2SI > + [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] > + UNSPEC_UNSIGNED_FIX_NOTRUNC) > + (const_vector:V2SI [(const_int 0) (const_int 0)]) > + (match_operand:QI 2 "register_operand" "Yk")) > (const_vector:V2SI [(const_int 0) (const_int 0)])))] > "TARGET_AVX512VL" > - "vcvtpd2udq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}" > + "vcvtpd2udq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" > [(set_attr "type" "ssecvt") > (set_attr "prefix" "evex") > (set_attr "mode" "TI")]) > @@ -5987,13 +6053,43 @@ (define_insn "fix<fixunssuffix>_truncv8d > (set_attr "prefix" "evex") > (set_attr "mode" "OI")]) > > -(define_insn "ufix_truncv2dfv2si2<mask_name>" > +(define_insn "ufix_truncv2dfv2si2" > [(set (match_operand:V4SI 0 "register_operand" "=v") > (vec_concat:V4SI > (unsigned_fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > (const_vector:V2SI [(const_int 0) (const_int 0)])))] > "TARGET_AVX512VL" > - "vcvttpd2udq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}" > + "vcvttpd2udq{x}\t{%1, %0|%0, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "ufix_truncv2dfv2si2_mask" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unsigned_fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > + (vec_select:V2SI > + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvttpd2udq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "*ufix_truncv2dfv2si2_mask_1" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unsigned_fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > + (const_vector:V2SI [(const_int 0) (const_int 0)]) > + (match_operand:QI 2 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvttpd2udq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" > [(set_attr "type" "ssecvt") > (set_attr "prefix" "evex") > (set_attr "mode" "TI")]) > @@ -6138,15 +6234,15 @@ (define_expand "avx_cvttpd2dq256_2" > "TARGET_AVX" > "operands[2] = CONST0_RTX (V4SImode);") > > -(define_insn "sse2_cvttpd2dq<mask_name>" > +(define_insn "sse2_cvttpd2dq" > [(set (match_operand:V4SI 0 "register_operand" "=v") > (vec_concat:V4SI > (fix:V2SI (match_operand:V2DF 1 "vector_operand" "vBm")) > (const_vector:V2SI [(const_int 0) (const_int 0)])))] > - "TARGET_SSE2 && <mask_avx512vl_condition>" > + "TARGET_SSE2" > { > if (TARGET_AVX) > - return "vcvttpd2dq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"; > + return "vcvttpd2dq{x}\t{%1, %0|%0, %1}"; > else > return "cvttpd2dq\t{%1, %0|%0, %1}"; > } > @@ -6157,6 +6253,36 @@ (define_insn "sse2_cvttpd2dq<mask_name>" > (set_attr "prefix" "maybe_vex") > (set_attr "mode" "TI")]) > > +(define_insn "sse2_cvttpd2dq_mask" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > + (vec_select:V2SI > + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvttpd2dq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "*sse2_cvttpd2dq_mask_1" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > + (const_vector:V2SI [(const_int 0) (const_int 0)]) > + (match_operand:QI 2 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvttpd2dq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > (define_insn "sse2_cvtsd2ss<round_name>" > [(set (match_operand:V4SF 0 "register_operand" "=x,x,v") > (vec_merge:V4SF > @@ -6276,26 +6402,28 @@ (define_expand "sse2_cvtpd2ps" > > (define_expand "sse2_cvtpd2ps_mask" > [(set (match_operand:V4SF 0 "register_operand") > - (vec_merge:V4SF > - (vec_concat:V4SF > + (vec_concat:V4SF > + (vec_merge:V2SF > (float_truncate:V2SF > (match_operand:V2DF 1 "vector_operand")) > - (match_dup 4)) > - (match_operand:V4SF 2 "register_operand") > - (match_operand:QI 3 "register_operand")))] > + (vec_select:V2SF > + (match_operand:V4SF 2 "nonimm_or_0_operand") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand")) > + (match_dup 4)))] > "TARGET_SSE2" > "operands[4] = CONST0_RTX (V2SFmode);") > > -(define_insn "*sse2_cvtpd2ps<mask_name>" > +(define_insn "*sse2_cvtpd2ps" > [(set (match_operand:V4SF 0 "register_operand" "=v") > (vec_concat:V4SF > (float_truncate:V2SF > (match_operand:V2DF 1 "vector_operand" "vBm")) > - (match_operand:V2SF 2 "const0_operand")))] > - "TARGET_SSE2 && <mask_avx512vl_condition>" > + (match_operand:V2SF 2 "const0_operand" "C")))] > + "TARGET_SSE2" > { > if (TARGET_AVX) > - return "vcvtpd2ps{x}\t{%1, %0<mask_operand3>|%0<mask_operand3>, %1}"; > + return "vcvtpd2ps{x}\t{%1, %0|%0, %1}"; > else > return "cvtpd2ps\t{%1, %0|%0, %1}"; > } > @@ -6307,6 +6435,38 @@ (define_insn "*sse2_cvtpd2ps<mask_name>" > (set_attr "prefix" "maybe_vex") > (set_attr "mode" "V4SF")]) > > +(define_insn "*sse2_cvtpd2ps_mask" > + [(set (match_operand:V4SF 0 "register_operand" "=v") > + (vec_concat:V4SF > + (vec_merge:V2SF > + (float_truncate:V2SF > + (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > + (vec_select:V2SF > + (match_operand:V4SF 2 "nonimm_or_0_operand" "0C") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand" "Yk")) > + (match_operand:V2SF 4 "const0_operand" "C")))] > + "TARGET_AVX512VL" > + "vcvtpd2ps{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "V4SF")]) > + > +(define_insn "*sse2_cvtpd2ps_mask_1" > + [(set (match_operand:V4SF 0 "register_operand" "=v") > + (vec_concat:V4SF > + (vec_merge:V2SF > + (float_truncate:V2SF > + (match_operand:V2DF 1 "nonimmediate_operand" "vm")) > + (match_operand:V2SF 3 "const0_operand" "C") > + (match_operand:QI 2 "register_operand" "Yk")) > + (match_operand:V2SF 4 "const0_operand" "C")))] > + "TARGET_AVX512VL" > + "vcvtpd2ps{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "V4SF")]) > + > ;; For <sse2_avx_avx512f>_cvtps2pd<avxsizesuffix> insn pattern > (define_mode_attr sf2dfmode > [(V8DF "V8SF") (V4DF "V4SF")]) > > Jakub
--- gcc/config/i386/sse.md.jj 2019-07-08 23:52:40.239757801 +0200 +++ gcc/config/i386/sse.md 2019-07-10 22:50:15.021381492 +0200 @@ -5927,16 +5927,16 @@ (define_insn "*avx_cvtpd2dq256_2" (set_attr "btver2_decode" "vector") (set_attr "mode" "OI")]) -(define_insn "sse2_cvtpd2dq<mask_name>" +(define_insn "sse2_cvtpd2dq" [(set (match_operand:V4SI 0 "register_operand" "=v") (vec_concat:V4SI (unspec:V2SI [(match_operand:V2DF 1 "vector_operand" "vBm")] UNSPEC_FIX_NOTRUNC) (const_vector:V2SI [(const_int 0) (const_int 0)])))] - "TARGET_SSE2 && <mask_avx512vl_condition>" + "TARGET_SSE2" { if (TARGET_AVX) - return "vcvtpd2dq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"; + return "vcvtpd2dq{x}\t{%1, %0|%0, %1}"; else return "cvtpd2dq\t{%1, %0|%0, %1}"; } @@ -5949,6 +5949,38 @@ (define_insn "sse2_cvtpd2dq<mask_name>" (set_attr "athlon_decode" "vector") (set_attr "bdver1_decode" "double")]) +(define_insn "sse2_cvtpd2dq_mask" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] + UNSPEC_FIX_NOTRUNC) + (vec_select:V2SI + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") + (parallel [(const_int 0) (const_int 1)])) + (match_operand:QI 3 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvtpd2dq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + +(define_insn "*sse2_cvtpd2dq_mask_1" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] + UNSPEC_FIX_NOTRUNC) + (const_vector:V2SI [(const_int 0) (const_int 0)]) + (match_operand:QI 2 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvtpd2dq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + ;; For ufix_notrunc* insn patterns (define_mode_attr pd2udqsuff [(V8DF "") (V4DF "{y}")]) @@ -5964,15 +5996,49 @@ (define_insn "ufix_notrunc<mode><si2dfmo (set_attr "prefix" "evex") (set_attr "mode" "<sseinsnmode>")]) -(define_insn "ufix_notruncv2dfv2si2<mask_name>" +(define_insn "ufix_notruncv2dfv2si2" [(set (match_operand:V4SI 0 "register_operand" "=v") (vec_concat:V4SI (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] - UNSPEC_UNSIGNED_FIX_NOTRUNC) + UNSPEC_UNSIGNED_FIX_NOTRUNC) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvtpd2udq{x}\t{%1, %0|%0, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + +(define_insn "ufix_notruncv2dfv2si2_mask" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (unspec:V2SI + [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] + UNSPEC_UNSIGNED_FIX_NOTRUNC) + (vec_select:V2SI + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") + (parallel [(const_int 0) (const_int 1)])) + (match_operand:QI 3 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvtpd2udq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + +(define_insn "*ufix_notruncv2dfv2si2_mask_1" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (unspec:V2SI + [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] + UNSPEC_UNSIGNED_FIX_NOTRUNC) + (const_vector:V2SI [(const_int 0) (const_int 0)]) + (match_operand:QI 2 "register_operand" "Yk")) (const_vector:V2SI [(const_int 0) (const_int 0)])))] "TARGET_AVX512VL" - "vcvtpd2udq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}" + "vcvtpd2udq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") (set_attr "mode" "TI")]) @@ -5987,13 +6053,43 @@ (define_insn "fix<fixunssuffix>_truncv8d (set_attr "prefix" "evex") (set_attr "mode" "OI")]) -(define_insn "ufix_truncv2dfv2si2<mask_name>" +(define_insn "ufix_truncv2dfv2si2" [(set (match_operand:V4SI 0 "register_operand" "=v") (vec_concat:V4SI (unsigned_fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) (const_vector:V2SI [(const_int 0) (const_int 0)])))] "TARGET_AVX512VL" - "vcvttpd2udq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}" + "vcvttpd2udq{x}\t{%1, %0|%0, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + +(define_insn "ufix_truncv2dfv2si2_mask" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (unsigned_fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) + (vec_select:V2SI + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") + (parallel [(const_int 0) (const_int 1)])) + (match_operand:QI 3 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvttpd2udq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + +(define_insn "*ufix_truncv2dfv2si2_mask_1" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (unsigned_fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) + (const_vector:V2SI [(const_int 0) (const_int 0)]) + (match_operand:QI 2 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvttpd2udq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") (set_attr "mode" "TI")]) @@ -6138,15 +6234,15 @@ (define_expand "avx_cvttpd2dq256_2" "TARGET_AVX" "operands[2] = CONST0_RTX (V4SImode);") -(define_insn "sse2_cvttpd2dq<mask_name>" +(define_insn "sse2_cvttpd2dq" [(set (match_operand:V4SI 0 "register_operand" "=v") (vec_concat:V4SI (fix:V2SI (match_operand:V2DF 1 "vector_operand" "vBm")) (const_vector:V2SI [(const_int 0) (const_int 0)])))] - "TARGET_SSE2 && <mask_avx512vl_condition>" + "TARGET_SSE2" { if (TARGET_AVX) - return "vcvttpd2dq{x}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"; + return "vcvttpd2dq{x}\t{%1, %0|%0, %1}"; else return "cvttpd2dq\t{%1, %0|%0, %1}"; } @@ -6157,6 +6253,36 @@ (define_insn "sse2_cvttpd2dq<mask_name>" (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) +(define_insn "sse2_cvttpd2dq_mask" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) + (vec_select:V2SI + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") + (parallel [(const_int 0) (const_int 1)])) + (match_operand:QI 3 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvttpd2dq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + +(define_insn "*sse2_cvttpd2dq_mask_1" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (vec_concat:V4SI + (vec_merge:V2SI + (fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vm")) + (const_vector:V2SI [(const_int 0) (const_int 0)]) + (match_operand:QI 2 "register_operand" "Yk")) + (const_vector:V2SI [(const_int 0) (const_int 0)])))] + "TARGET_AVX512VL" + "vcvttpd2dq{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "TI")]) + (define_insn "sse2_cvtsd2ss<round_name>" [(set (match_operand:V4SF 0 "register_operand" "=x,x,v") (vec_merge:V4SF @@ -6276,26 +6402,28 @@ (define_expand "sse2_cvtpd2ps" (define_expand "sse2_cvtpd2ps_mask" [(set (match_operand:V4SF 0 "register_operand") - (vec_merge:V4SF - (vec_concat:V4SF + (vec_concat:V4SF + (vec_merge:V2SF (float_truncate:V2SF (match_operand:V2DF 1 "vector_operand")) - (match_dup 4)) - (match_operand:V4SF 2 "register_operand") - (match_operand:QI 3 "register_operand")))] + (vec_select:V2SF + (match_operand:V4SF 2 "nonimm_or_0_operand") + (parallel [(const_int 0) (const_int 1)])) + (match_operand:QI 3 "register_operand")) + (match_dup 4)))] "TARGET_SSE2" "operands[4] = CONST0_RTX (V2SFmode);") -(define_insn "*sse2_cvtpd2ps<mask_name>" +(define_insn "*sse2_cvtpd2ps" [(set (match_operand:V4SF 0 "register_operand" "=v") (vec_concat:V4SF (float_truncate:V2SF (match_operand:V2DF 1 "vector_operand" "vBm")) - (match_operand:V2SF 2 "const0_operand")))] - "TARGET_SSE2 && <mask_avx512vl_condition>" + (match_operand:V2SF 2 "const0_operand" "C")))] + "TARGET_SSE2" { if (TARGET_AVX) - return "vcvtpd2ps{x}\t{%1, %0<mask_operand3>|%0<mask_operand3>, %1}"; + return "vcvtpd2ps{x}\t{%1, %0|%0, %1}"; else return "cvtpd2ps\t{%1, %0|%0, %1}"; } @@ -6307,6 +6435,38 @@ (define_insn "*sse2_cvtpd2ps<mask_name>" (set_attr "prefix" "maybe_vex") (set_attr "mode" "V4SF")]) +(define_insn "*sse2_cvtpd2ps_mask" + [(set (match_operand:V4SF 0 "register_operand" "=v") + (vec_concat:V4SF + (vec_merge:V2SF + (float_truncate:V2SF + (match_operand:V2DF 1 "nonimmediate_operand" "vm")) + (vec_select:V2SF + (match_operand:V4SF 2 "nonimm_or_0_operand" "0C") + (parallel [(const_int 0) (const_int 1)])) + (match_operand:QI 3 "register_operand" "Yk")) + (match_operand:V2SF 4 "const0_operand" "C")))] + "TARGET_AVX512VL" + "vcvtpd2ps{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "V4SF")]) + +(define_insn "*sse2_cvtpd2ps_mask_1" + [(set (match_operand:V4SF 0 "register_operand" "=v") + (vec_concat:V4SF + (vec_merge:V2SF + (float_truncate:V2SF + (match_operand:V2DF 1 "nonimmediate_operand" "vm")) + (match_operand:V2SF 3 "const0_operand" "C") + (match_operand:QI 2 "register_operand" "Yk")) + (match_operand:V2SF 4 "const0_operand" "C")))] + "TARGET_AVX512VL" + "vcvtpd2ps{x}\t{%1, %0%{%2%}%{z%}|%0%{%2%}%{z%}, %1}" + [(set_attr "type" "ssecvt") + (set_attr "prefix" "evex") + (set_attr "mode" "V4SF")]) + ;; For <sse2_avx_avx512f>_cvtps2pd<avxsizesuffix> insn pattern (define_mode_attr sf2dfmode [(V8DF "V8SF") (V4DF "V4SF")])