[00/40] V6: Emulate MMX intrinsics with SSE

Message ID 20190215135812.32306-1-hjl.tools@gmail.com

Message

H.J. Lu Feb. 15, 2019, 1:57 p.m. UTC
On x86-64, since __m64 is returned and passed in XMM registers, we can
emulate MMX intrinsics with SSE instructions. To support it, we added

 #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)

;; Define instruction set of MMX instructions
(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
  (const_string "base"))

         (eq_attr "mmx_isa" "native")
           (symbol_ref "!TARGET_MMX_WITH_SSE")
         (eq_attr "mmx_isa" "x64")
           (symbol_ref "TARGET_MMX_WITH_SSE")
         (eq_attr "mmx_isa" "x64_avx")
           (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
         (eq_attr "mmx_isa" "x64_noavx")
           (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")

We added SSE emulation to MMX patterns and disabled MMX alternatives with
TARGET_MMX_WITH_SSE.
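
As a rough illustration of how the "mmx_isa" values select alternatives,
the conditions above can be modelled in plain C.  This is only a sketch:
struct target_flags, enum mmx_isa and both functions are hypothetical
names standing in for GCC's TARGET_* macros and the machine-description
attribute, and treating "base" as always enabled is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GCC's TARGET_64BIT, TARGET_SSE2 and TARGET_AVX
   macros, which in the compiler are derived from the command line.  */
struct target_flags {
  bool is_64bit;
  bool sse2;
  bool avx;
};

/* Mirrors the values of the "mmx_isa" attribute.  */
enum mmx_isa { MMX_ISA_BASE, MMX_ISA_NATIVE, MMX_ISA_X64,
               MMX_ISA_X64_NOAVX, MMX_ISA_X64_AVX };

/* Model of: #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)  */
static bool target_mmx_with_sse(const struct target_flags *t)
{
  return t->is_64bit && t->sse2;
}

/* Returns true when an alternative tagged with ISA would be enabled.  */
static bool mmx_isa_enabled(enum mmx_isa isa, const struct target_flags *t)
{
  bool emulate = target_mmx_with_sse(t);

  switch (isa) {
  case MMX_ISA_NATIVE:    return !emulate;
  case MMX_ISA_X64:       return emulate;
  case MMX_ISA_X64_AVX:   return emulate && t->avx;
  case MMX_ISA_X64_NOAVX: return emulate && !t->avx;
  case MMX_ISA_BASE:      return true;  /* Assumption: no extra condition.  */
  }
  assert(!"unknown mmx_isa value");
  return false;
}
```

With these definitions, a 64-bit target with SSE2 but without AVX enables
the "x64" and "x64_noavx" alternatives and disables the "native" ones; a
32-bit target keeps only "native" (and "base").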

Most MMX instructions have equivalent SSE versions, although the results
of some SSE versions need to be reshuffled into the right order for MMX.
There are a few tricky cases:

1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
mask operand, and we handle possibly unmapped bits 64:127 at the memory
address by adjusting the source and mask operands together with the
memory address.

2. MMX movntq is emulated with SSE2 DImode movnti, which is available
in 64-bit mode.

3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
SSE emulation must clear bit 3 (the fourth index bit) of each byte in the
shuffle control mask.

4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
the upper 64 bits of the destination XMM register.
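
The pshufb case can be illustrated with a scalar model in portable C.
This is a sketch of the instruction semantics only, not the code in the
patches: all function names are hypothetical, and the 0xAA filler stands
for whatever unrelated data the upper half of the XMM register holds.
Clearing the fourth index bit (bit 3 when counting from zero, i.e. ANDing
each control byte with 0xf7) keeps the 128-bit shuffle from selecting
bytes 8..15, while bit 7 (the "write zero" flag) is left untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the 64-bit (MMX) pshufb: 8 lanes, 3-bit index.  */
static void pshufb_mmx_ref(uint8_t dst[8], const uint8_t src[8],
                           const uint8_t ctrl[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x07];
}

/* Scalar model of the 128-bit (SSE) pshufb: 16 lanes, 4-bit index.  */
static void pshufb_sse_model(uint8_t dst[16], const uint8_t src[16],
                             const uint8_t ctrl[16])
{
  for (int i = 0; i < 16; i++)
    dst[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x0f];
}

/* Emulates the 64-bit shuffle with the 128-bit model.  */
static void pshufb_mmx_emulated(uint8_t dst[8], const uint8_t src[8],
                                const uint8_t ctrl[8])
{
  uint8_t src128[16], ctrl128[16], out128[16];

  memcpy(src128, src, 8);
  memset(src128 + 8, 0xAA, 8);  /* Unrelated upper-half contents.  */
  memcpy(ctrl128, ctrl, 8);
  memset(ctrl128 + 8, 0, 8);

  /* Clear index bit 3 in every control byte; keep bit 7.  */
  for (int i = 0; i < 16; i++)
    ctrl128[i] &= 0xf7;

  pshufb_sse_model(out128, src128, ctrl128);
  memcpy(dst, out128, 8);       /* Only the low 64 bits are the result.  */
}
```

Without the masking step, a control byte such as 0x0b would select byte 11
of the 128-bit source instead of byte 3 of the 64-bit one.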

Tests are also added to check each SSE emulation of MMX intrinsics.
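
A check of this kind compares the emulated result with a reference
result.  The sketch below does so for packsswb, one of the instructions
whose SSE result has to be reshuffled.  It is a portable scalar model
under stated assumptions, not one of the testsuite files: the function
names are hypothetical, and the upper source lanes are zero-filled here
although in a real register they may hold anything (they only affect
bytes that are discarded).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Signed saturation of a 16-bit value to 8 bits.  */
static int8_t sat8(int16_t v)
{
  return v > 127 ? 127 : v < -128 ? -128 : (int8_t) v;
}

/* Reference: 64-bit packsswb packs 4 + 4 words into 8 bytes.  */
static void packsswb_mmx_ref(int8_t dst[8], const int16_t a[4],
                             const int16_t b[4])
{
  for (int i = 0; i < 4; i++) {
    dst[i] = sat8(a[i]);
    dst[4 + i] = sat8(b[i]);
  }
}

/* Model of 128-bit packsswb: packs 8 + 8 words into 16 bytes.  */
static void packsswb_sse_model(int8_t dst[16], const int16_t a[8],
                               const int16_t b[8])
{
  for (int i = 0; i < 8; i++) {
    dst[i] = sat8(a[i]);
    dst[8 + i] = sat8(b[i]);
  }
}

/* Emulation: pack in 128 bits, then move bits 64:95 down to 32:63.  */
static void packsswb_mmx_emulated(int8_t dst[8], const int16_t a[4],
                                  const int16_t b[4])
{
  int16_t a128[8] = { 0 }, b128[8] = { 0 };
  int8_t packed[16];

  memcpy(a128, a, 4 * sizeof a128[0]);
  memcpy(b128, b, 4 * sizeof b128[0]);
  packsswb_sse_model(packed, a128, b128);

  memcpy(dst, packed, 4);          /* Bits 0:31 hold the packed A.  */
  memcpy(dst + 4, packed + 8, 4);  /* Bits 64:95 hold the packed B.  */
}
```

The reshuffle is needed because the 128-bit pack places the second
operand's bytes in the upper half of the register, whereas the 64-bit
result expects them directly after the first operand's four bytes.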

There are no regressions on i686 and x86-64.  For x86-64, GCC is also
tested with

--with-arch=native --with-cpu=native

on AVX2 and AVX512F machines.

H.J. Lu (41):
  i386: Allow MMX register modes in SSE registers
  i386: Add mmx_nonimmediate_operand
  i386: Emulate MMX packsswb/packssdw/packuswb with SSE2
  i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX
  i386: Emulate MMX plusminus/sat_plusminus with SSE
  i386: Emulate MMX mulv4hi3 with SSE
  i386: Emulate MMX smulv4hi3_highpart with SSE
  i386: Emulate MMX mmx_pmaddwd with SSE
  i386: Emulate MMX ashr<mode>3/<shift_insn><mode>3 with SSE
  i386: Emulate MMX <any_logic><mode>3 with SSE
  i386: Emulate MMX mmx_andnot<mode>3 with SSE
  i386: Emulate MMX mmx_eq/mmx_gt<mode>3 with SSE
  i386: Emulate MMX vec_dupv2si with SSE
  i386: Emulate MMX pshufw with SSE
  i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE
  i386: Emulate MMX sse_cvtpi2ps with SSE
  i386: Emulate MMX mmx_pextrw with SSE
  i386: Emulate MMX mmx_pinsrw with SSE
  i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE
  i386: Emulate MMX mmx_pmovmskb with SSE
  i386: Emulate MMX mmx_umulv4hi3_highpart with SSE
  i386: Emulate MMX maskmovq with SSE2 maskmovdqu
  i386: Emulate MMX mmx_uavgv8qi3 with SSE
  i386: Emulate MMX mmx_uavgv4hi3 with SSE
  i386: Emulate MMX mmx_psadbw with SSE
  i386: Emulate MMX movntq with SSE2 movntidi
  i386: Emulate MMX umulv1siv1di3 with SSE2
  i386: Make _mm_empty () as NOP when MMX is disabled
  i386: Emulate MMX ssse3_ph<plusminus_mnemonic>wv4hi3 with SSE
  i386: Emulate MMX ssse3_ph<plusminus_mnemonic>dv2si3 with SSE
  i386: Emulate MMX ssse3_pmaddubsw with SSE
  i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE
  i386: Emulate MMX pshufb with SSE version
  i386: Emulate MMX ssse3_psign<mode>3 with SSE
  i386: Emulate MMX ssse3_palignrdi with SSE
  i386: Emulate MMX abs<mode>2 with SSE
  i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE
  i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE
  i386: Allow MMX intrinsic emulation with SSE
  i386: Enable TM MMX intrinsics with SSE2
  i386: Add tests for MMX intrinsic emulations with SSE

Uros Bizjak (1):
  Prevent allocation of MMX registers with TARGET_MMX_WITH_SSE

 gcc/config/i386/constraints.md                |   6 +
 gcc/config/i386/i386-builtin.def              | 126 +--
 gcc/config/i386/i386-c.c                      |   2 +
 gcc/config/i386/i386-protos.h                 |   4 +
 gcc/config/i386/i386.c                        | 189 +++-
 gcc/config/i386/i386.h                        |   2 +
 gcc/config/i386/i386.md                       |  17 +
 gcc/config/i386/mmintrin.h                    |  12 +-
 gcc/config/i386/mmx.md                        | 903 ++++++++++++------
 gcc/config/i386/predicates.md                 |   7 +
 gcc/config/i386/sse.md                        | 353 +++++--
 gcc/config/i386/xmmintrin.h                   |  61 ++
 gcc/testsuite/gcc.target/i386/mmx-vals.h      |  77 ++
 gcc/testsuite/gcc.target/i386/pr82483-1.c     |   2 +-
 gcc/testsuite/gcc.target/i386/pr82483-2.c     |   2 +-
 gcc/testsuite/gcc.target/i386/sse2-mmx-10.c   |  44 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-11.c   |  39 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-12.c   |  43 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-13.c   |  40 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-14.c   |  32 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-15.c   |  37 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-16.c   |  41 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-17.c   |  52 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-18a.c  |  14 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-18b.c  |   7 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-18c.c  |   7 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-19a.c  |  14 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-19b.c  |   7 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-19c.c  |   7 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-19d.c  |   7 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-19e.c  |   7 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-2.c    |  12 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-20.c   |  12 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-21.c   |  13 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-3.c    |  13 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-4.c    |   4 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-5.c    |  11 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-6.c    |  11 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-7.c    |  13 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-8.c    |   4 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-9.c    |  79 ++
 .../gcc.target/i386/sse2-mmx-cvtpi2ps.c       |  44 +
 .../gcc.target/i386/sse2-mmx-cvtps2pi.c       |  37 +
 .../gcc.target/i386/sse2-mmx-cvttps2pi.c      |  37 +
 .../gcc.target/i386/sse2-mmx-maskmovq.c       | 100 ++
 .../gcc.target/i386/sse2-mmx-packssdw.c       |  53 +
 .../gcc.target/i386/sse2-mmx-packsswb.c       |  53 +
 .../gcc.target/i386/sse2-mmx-packuswb.c       |  53 +
 .../gcc.target/i386/sse2-mmx-paddb.c          |  49 +
 .../gcc.target/i386/sse2-mmx-paddd.c          |  49 +
 .../gcc.target/i386/sse2-mmx-paddq.c          |  44 +
 .../gcc.target/i386/sse2-mmx-paddsb.c         |  49 +
 .../gcc.target/i386/sse2-mmx-paddsw.c         |  49 +
 .../gcc.target/i386/sse2-mmx-paddusb.c        |  49 +
 .../gcc.target/i386/sse2-mmx-paddusw.c        |  49 +
 .../gcc.target/i386/sse2-mmx-paddw.c          |  49 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-pand.c |  45 +
 .../gcc.target/i386/sse2-mmx-pandn.c          |  45 +
 .../gcc.target/i386/sse2-mmx-pavgb.c          |  53 +
 .../gcc.target/i386/sse2-mmx-pavgw.c          |  53 +
 .../gcc.target/i386/sse2-mmx-pcmpeqb.c        |  49 +
 .../gcc.target/i386/sse2-mmx-pcmpeqd.c        |  49 +
 .../gcc.target/i386/sse2-mmx-pcmpeqw.c        |  49 +
 .../gcc.target/i386/sse2-mmx-pcmpgtb.c        |  49 +
 .../gcc.target/i386/sse2-mmx-pcmpgtd.c        |  49 +
 .../gcc.target/i386/sse2-mmx-pcmpgtw.c        |  49 +
 .../gcc.target/i386/sse2-mmx-pextrw.c         |  60 ++
 .../gcc.target/i386/sse2-mmx-pinsrw.c         |  62 ++
 .../gcc.target/i386/sse2-mmx-pmaddwd.c        |  48 +
 .../gcc.target/i386/sse2-mmx-pmaxsw.c         |  49 +
 .../gcc.target/i386/sse2-mmx-pmaxub.c         |  49 +
 .../gcc.target/i386/sse2-mmx-pminsw.c         |  49 +
 .../gcc.target/i386/sse2-mmx-pminub.c         |  49 +
 .../gcc.target/i386/sse2-mmx-pmovmskb.c       |  47 +
 .../gcc.target/i386/sse2-mmx-pmulhuw.c        |  52 +
 .../gcc.target/i386/sse2-mmx-pmulhw.c         |  54 ++
 .../gcc.target/i386/sse2-mmx-pmullw.c         |  53 +
 .../gcc.target/i386/sse2-mmx-pmuludq.c        |  48 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-por.c  |  45 +
 .../gcc.target/i386/sse2-mmx-psadbw.c         |  59 ++
 .../gcc.target/i386/sse2-mmx-pshufw.c         | 249 +++++
 .../gcc.target/i386/sse2-mmx-pslld.c          |  53 +
 .../gcc.target/i386/sse2-mmx-pslldi.c         | 154 +++
 .../gcc.target/i386/sse2-mmx-psllq.c          |  48 +
 .../gcc.target/i386/sse2-mmx-psllqi.c         | 246 +++++
 .../gcc.target/i386/sse2-mmx-psllw.c          |  53 +
 .../gcc.target/i386/sse2-mmx-psllwi.c         | 106 ++
 .../gcc.target/i386/sse2-mmx-psrad.c          |  53 +
 .../gcc.target/i386/sse2-mmx-psradi.c         | 154 +++
 .../gcc.target/i386/sse2-mmx-psraw.c          |  53 +
 .../gcc.target/i386/sse2-mmx-psrawi.c         | 106 ++
 .../gcc.target/i386/sse2-mmx-psrld.c          |  53 +
 .../gcc.target/i386/sse2-mmx-psrldi.c         | 154 +++
 .../gcc.target/i386/sse2-mmx-psrlq.c          |  48 +
 .../gcc.target/i386/sse2-mmx-psrlqi.c         | 246 +++++
 .../gcc.target/i386/sse2-mmx-psrlw.c          |  53 +
 .../gcc.target/i386/sse2-mmx-psrlwi.c         | 106 ++
 .../gcc.target/i386/sse2-mmx-psubb.c          |  49 +
 .../gcc.target/i386/sse2-mmx-psubd.c          |  49 +
 .../gcc.target/i386/sse2-mmx-psubq.c          |  44 +
 .../gcc.target/i386/sse2-mmx-psubusb.c        |  49 +
 .../gcc.target/i386/sse2-mmx-psubusw.c        |  49 +
 .../gcc.target/i386/sse2-mmx-psubw.c          |  49 +
 .../gcc.target/i386/sse2-mmx-punpckhbw.c      |  54 ++
 .../gcc.target/i386/sse2-mmx-punpckhdq.c      |  48 +
 .../gcc.target/i386/sse2-mmx-punpckhwd.c      |  50 +
 .../gcc.target/i386/sse2-mmx-punpcklbw.c      |  54 ++
 .../gcc.target/i386/sse2-mmx-punpckldq.c      |  48 +
 .../gcc.target/i386/sse2-mmx-punpcklwd.c      |  50 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-pxor.c |  45 +
 gcc/testsuite/gcc.target/i386/sse2-mmx.c      |   1 -
 111 files changed, 6422 insertions(+), 463 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/mmx-vals.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-14.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-15.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-16.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-17.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-18a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-18b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-18c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19d.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19e.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-20.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-21.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-9.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-cvtpi2ps.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-cvtps2pi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-cvttps2pi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-maskmovq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-packssdw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-packsswb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-packuswb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddsb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddsw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddusb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddusw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pand.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pandn.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pavgb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pavgw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpeqb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpeqd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpeqw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpgtb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpgtd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpgtw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pinsrw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmaddwd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmaxsw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmaxub.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pminsw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pminub.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmovmskb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmulhuw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmulhw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmullw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmuludq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-por.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psadbw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pshufw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pslld.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pslldi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllqi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllwi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrad.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psradi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psraw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrawi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrld.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrldi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlqi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlwi.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubusb.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubusw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhbw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhdq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhwd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpcklbw.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckldq.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpcklwd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pxor.c

Comments

Uros Bizjak Feb. 15, 2019, 5:49 p.m. UTC | #1
On Fri, Feb 15, 2019 at 2:58 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On x86-64, since __m64 is returned and passed in XMM registers, we can
> emulate MMX intrinsics with SSE instructions. To support it, we added
>
>  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
>
> ;; Define instruction set of MMX instructions
> (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
>   (const_string "base"))
>
>          (eq_attr "mmx_isa" "native")
>            (symbol_ref "!TARGET_MMX_WITH_SSE")
>          (eq_attr "mmx_isa" "x64")
>            (symbol_ref "TARGET_MMX_WITH_SSE")
>          (eq_attr "mmx_isa" "x64_avx")
>            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
>          (eq_attr "mmx_isa" "x64_noavx")
>            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
>
> We added SSE emulation to MMX patterns and disabled MMX alternatives with
> TARGET_MMX_WITH_SSE.
>
> Most MMX instructions have equivalent SSE versions, although the results
> of some SSE versions need to be reshuffled into the right order for MMX.
> There are a few tricky cases:
>
> 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> mask operand, and we handle possibly unmapped bits 64:127 at the memory
> address by adjusting the source and mask operands together with the
> memory address.
>
> 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> in 64-bit mode.
>
> 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> SSE emulation must clear bit 3 (the fourth index bit) of each byte in the
> shuffle control mask.
>
> 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> the upper 64 bits of the destination XMM register.
>
> Tests are also added to check each SSE emulation of MMX intrinsics.
>
> There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> tested with
>
> --with-arch=native --with-cpu=native
>
> on AVX2 and AVX512F machines.

I went through the code again, and it looks OK in general, modulo the
mmx_nonimmediate_operand issue and a couple of minor issues.

Please replace the nonimmediate_operand predicate with
mmx_nonimmediate_operand in expanders and insn patterns. Please note
that the proposed convention is to name the operand
register_mmxmem_operand (cf. register_ssemem_operand), so I suggest
we name the predicate that way.

There is an issue with a change to the emms pattern.

And let's remove _mm_empty () calls from testcases; they complicate
things too much for no apparent benefit.

With those issues fixed, the patchset is OK for gcc-10 when it opens.

Uros.

>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubw.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhbw.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhdq.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhwd.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpcklbw.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckldq.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpcklwd.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pxor.c
>
> --
> 2.20.1
>
H.J. Lu Feb. 15, 2019, 6:19 p.m. UTC | #2
On Fri, Feb 15, 2019 at 9:50 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Fri, Feb 15, 2019 at 2:58 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > emulate MMX intrinsics with SSE instructions. To support it, we added
> >
> >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> >
> > ;; Define instruction set of MMX instructions
> > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> >   (const_string "base"))
> >
> >          (eq_attr "mmx_isa" "native")
> >            (symbol_ref "!TARGET_MMX_WITH_SSE")
> >          (eq_attr "mmx_isa" "x64")
> >            (symbol_ref "TARGET_MMX_WITH_SSE")
> >          (eq_attr "mmx_isa" "x64_avx")
> >            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> >          (eq_attr "mmx_isa" "x64_noavx")
> >            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> >
> > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > TARGET_MMX_WITH_SSE.
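
As a rough illustration of how the mmx_isa values quoted above select
alternatives, here is a small stand-alone C model.  It is only a sketch:
struct target_flags, enum mmx_isa and mmx_isa_enabled are hypothetical
names that do not exist in GCC, and treating "base" as always enabled is
an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical mirror of the "mmx_isa" attribute values quoted above.  */
enum mmx_isa {
  MMX_ISA_BASE,
  MMX_ISA_NATIVE,
  MMX_ISA_X64,
  MMX_ISA_X64_NOAVX,
  MMX_ISA_X64_AVX
};

/* Hypothetical stand-in for the target flags tested by the macros.  */
struct target_flags {
  bool is_64bit;   /* models TARGET_64BIT */
  bool sse2;       /* models TARGET_SSE2 */
  bool avx;        /* models TARGET_AVX */
};

/* Same condition as the TARGET_MMX_WITH_SSE definition in the cover letter.  */
static bool target_mmx_with_sse(const struct target_flags *t)
{
  return t->is_64bit && t->sse2;
}

/* Return true when an alternative tagged with ISA would be enabled.
   Assumption: "base" alternatives are always enabled in this model.  */
static bool mmx_isa_enabled(enum mmx_isa isa, const struct target_flags *t)
{
  bool emulate = target_mmx_with_sse(t);

  switch (isa) {
  case MMX_ISA_NATIVE:    return !emulate;
  case MMX_ISA_X64:       return emulate;
  case MMX_ISA_X64_AVX:   return emulate && t->avx;
  case MMX_ISA_X64_NOAVX: return emulate && !t->avx;
  default:                return true;
  }
}
```

With an x86-64 AVX target in this model only the x64 and x64_avx
alternatives are enabled; on a 32-bit target only the native one is.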
> >
> > Most MMX instructions have equivalent SSE versions, and the results of
> > some SSE versions need to be reshuffled to the right order for MMX.
> > There are a couple of tricky cases:
> >
> > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > mask operand and handle unmapped bits 64:127 at memory address by
> > adjusting source and mask operands together with memory address.
> >
> > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > in 64-bit mode.
> >
> > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > SSE emulation must clear the bit 4 in the shuffle control mask.
> >
> > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > the upper 64 bits of the destination XMM register.
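
To make case 3 concrete, here is a scalar C model of the index masking.
It is a sketch, not the code from the patch: the three function names are
made up for illustration, and the 0xf7 mask (which clears the 0x08 bit of
each control byte, i.e. the fourth bit counting from one) is my reading of
"clear the bit 4".

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of 128-bit pshufb: if bit 7 of a control byte is set the
   result byte is zero, otherwise the low 4 bits index the 16 source bytes.  */
static void pshufb128_model(uint8_t dst[16], const uint8_t src[16],
                            const uint8_t ctl[16])
{
  uint8_t out[16];
  for (int i = 0; i < 16; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
  memcpy(dst, out, sizeof out);
}

/* Scalar model of 64-bit MMX pshufb: same zeroing rule, but only the low
   3 bits index the 8 source bytes.  */
static void pshufb64_model(uint8_t dst[8], const uint8_t src[8],
                           const uint8_t ctl[8])
{
  uint8_t out[8];
  for (int i = 0; i < 8; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  memcpy(dst, out, sizeof out);
}

/* Emulation sketch: widen the operands to 128 bits, clear the 0x08 bit in
   every control byte so that only source bytes 0..7 can be selected, run
   the 128-bit model and keep the low 64 bits of the result.  */
static void pshufb64_via_128(uint8_t dst[8], const uint8_t src[8],
                             const uint8_t ctl[8])
{
  uint8_t wide_src[16] = { 0 };
  uint8_t wide_ctl[16] = { 0 };
  uint8_t wide_dst[16];

  memcpy(wide_src, src, 8);
  for (int i = 0; i < 8; i++)
    wide_ctl[i] = ctl[i] & 0xf7;   /* keep bit 7, drop the 0x08 index bit */

  pshufb128_model(wide_dst, wide_src, wide_ctl);
  memcpy(dst, wide_dst, 8);
}
```

Without the mask, a control byte such as 0x0f would select byte 15 of the
widened source instead of byte 7 of the original 64-bit operand.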
> >
> > Tests are also added to check each SSE emulation of MMX intrinsics.
> >
> > There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> > tested with
> >
> > --with-arch=native --with-cpu=native
> >
> > on AVX2 and AVX512F machines.
>
> I went through the code again, and it looks OK in general, modulo the
> mmx_nonimmediate_operand issue and a couple of minor issues.
>
> Please substitute nonimmediate_operand predicate with
> mmx_nonimmediate_operand in expanders and insn patterns. Please note

Can we keep nonimmediate_operand in expanders, like

(define_expand "<plusminus_insn><mode>3"
  [(set (match_operand:MMXMODEI 0 "register_operand")
        (plusminus:MMXMODEI
          (match_operand:MMXMODEI 1 "nonimmediate_operand")
          (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
  "TARGET_MMX_WITH_SSE"
  "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")

(define_insn "*mmx_<plusminus_insn><mode>3"
  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,x,Yv")
        (plusminus:MMXMODEI8
          (match_operand:MMXMODEI8 1 "register_mmxmem_operand" "<comm>0,0,Yv")
          (match_operand:MMXMODEI8 2 "register_mmxmem_operand" "ym,x,Yv")))]
  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
   && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
  "@
   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
   vp<plusminus_mnemonic><mmxvecsize>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
   (set_attr "type" "mmxadd,sseadd,sseadd")
   (set_attr "mode" "DI,TI,TI")])

Can RA do the right thing?

> that the proposed convention is to name the operand
> register_mmxmem_operand (c.f. register_ssemem_operand), so I suggest
> we name the predicate in this way.

I will rename it to register_mmxmem_operand.

> There is an issue with a change to emms pattern.
>
> And let's remove _mm_empty () calls from testcases; they complicate
> things too much for no apparent benefit.

Will do.

> With those issues fixed, the patchset is OK for gcc-10 when it opens.
>
> Uros.
Uros Bizjak Feb. 15, 2019, 6:31 p.m. UTC | #3
On Fri, Feb 15, 2019 at 7:20 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > I went through the code again, and it looks OK in general, modulo the
> > mmx_nonimmediate_operand issue and a couple of minor issues.
> >
> > Please substitute nonimmediate_operand predicate with
> > mmx_nonimmediate_operand in expanders and insn patterns. Please note
>
> Can we keep nonimmediate_operand in expanders, like

No, the expander should also be changed. The way expanders work is
that if an operand doesn't satisfy the predicate, it is moved to a
register. So, for TARGET_MMX_WITH_SSE, an expander that keeps
nonimmediate_operand would let through a memory operand that isn't
allowed by the relevant insn pattern -> ICE.

There is nothing RA can do here. The operand type produced by the
expander must match the predicate in the insn pattern, otherwise the
insn is not recognized and the compiler will ICE well before RA comes
into play. Also, in the insn pattern, the constraints must allow a
subset of what the operand predicate allows if we want RA to fix up
the operand.
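
A toy C model of the mismatch, as a sketch only.  None of these functions
are GCC code: the model_* names are made up, and the behaviour of
register_mmxmem_operand (register always accepted, memory accepted only
without TARGET_MMX_WITH_SSE) is assumed from the description above.

```c
#include <stdbool.h>

/* Only the operand kinds that matter for this discussion.  */
enum operand_kind { OP_REG, OP_MEM };

/* Toy nonimmediate_operand: accepts a register or a memory operand.  */
static bool model_nonimmediate_operand(enum operand_kind op)
{
  return op == OP_REG || op == OP_MEM;
}

/* Toy register_mmxmem_operand (assumed semantics): memory is accepted
   only when MMX is not being emulated with SSE.  */
static bool model_register_mmxmem_operand(enum operand_kind op,
                                          bool mmx_with_sse)
{
  return op == OP_REG || (op == OP_MEM && !mmx_with_sse);
}

/* Expander step: an operand that fails the expander's predicate is
   copied to a register, otherwise it is passed through unchanged.  */
static enum operand_kind model_expand_operand(enum operand_kind op,
                                              bool predicate_ok)
{
  return predicate_ok ? op : OP_REG;
}

/* Return true when the operand produced by the expander satisfies the
   insn pattern's predicate.  STRICT_EXPANDER selects whether the
   expander uses the same predicate as the insn pattern.  */
static bool model_insn_recognized(enum operand_kind op, bool strict_expander,
                                  bool mmx_with_sse)
{
  bool ok = strict_expander
            ? model_register_mmxmem_operand(op, mmx_with_sse)
            : model_nonimmediate_operand(op);
  enum operand_kind expanded = model_expand_operand(op, ok);

  return model_register_mmxmem_operand(expanded, mmx_with_sse);
}
```

In this model a memory operand under TARGET_MMX_WITH_SSE is recognized
only when the expander uses the stricter predicate; with the loose
predicate the memory operand reaches the insn pattern unchanged and is
rejected there.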

Uros.

> (define_expand "<plusminus_insn><mode>3"
>   [(set (match_operand:MMXMODEI 0 "register_operand")
>         (plusminus:MMXMODEI
>           (match_operand:MMXMODEI 1 "nonimmediate_operand")
>           (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
>   "TARGET_MMX_WITH_SSE"
>   "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
>
> (define_insn "*mmx_<plusminus_insn><mode>3"
>   [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,x,Yv")
>         (plusminus:MMXMODEI8
>           (match_operand:MMXMODEI8 1 "register_mmxmem_operand" "<comm>0,0,Yv")
>           (match_operand:MMXMODEI8 2 "register_mmxmem_operand" "ym,x,Yv")))]
>   "(TARGET_MMX || TARGET_MMX_WITH_SSE)
>    && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
>   "@
>    p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
>    p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
>    vp<plusminus_mnemonic><mmxvecsize>\t{%2, %1, %0|%0, %1, %2}"
>   [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
>    (set_attr "type" "mmxadd,sseadd,sseadd")
>    (set_attr "mode" "DI,TI,TI")])
>
> Can RA do the right thing?
>
> > that the proposed convention is to name the operand
> > register_mmxmem_operand (c.f. register_ssemem_operand), so I suggest
> > we name the predicate in this way.
>
> I will rename it to register_mmxmem_operand.
>
> > There is an issue with a change to emms pattern.
> >
> > And let's remove _mm_empty () calls from testcases; they complicate
> > things too much for no apparent benefit.
>
> Will do.
>
> > With those issues fixed, the patchset is OK for gcc-10 when it opens.
> >
> > Uros.
> >
> > > H.J. Lu (41):
> > >   i386: Allow MMX register modes in SSE registers
> > >   i386: Add mmx_nonimmediate_operand
> > >   i386: Emulate MMX packsswb/packssdw/packuswb with SSE2
> > >   i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX
> > >   i386: Emulate MMX plusminus/sat_plusminus with SSE
> > >   i386: Emulate MMX mulv4hi3 with SSE
> > >   i386: Emulate MMX smulv4hi3_highpart with SSE
> > >   i386: Emulate MMX mmx_pmaddwd with SSE
> > >   i386: Emulate MMX ashr<mode>3/<shift_insn><mode>3 with SSE
> > >   i386: Emulate MMX <any_logic><mode>3 with SSE
> > >   i386: Emulate MMX mmx_andnot<mode>3 with SSE
> > >   i386: Emulate MMX mmx_eq/mmx_gt<mode>3 with SSE
> > >   i386: Emulate MMX vec_dupv2si with SSE
> > >   i386: Emulate MMX pshufw with SSE
> > >   i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE
> > >   i386: Emulate MMX sse_cvtpi2ps with SSE
> > >   i386: Emulate MMX mmx_pextrw with SSE
> > >   i386: Emulate MMX mmx_pinsrw with SSE
> > >   i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE
> > >   i386: Emulate MMX mmx_pmovmskb with SSE
> > >   i386: Emulate MMX mmx_umulv4hi3_highpart with SSE
> > >   i386: Emulate MMX maskmovq with SSE2 maskmovdqu
> > >   i386: Emulate MMX mmx_uavgv8qi3 with SSE
> > >   i386: Emulate MMX mmx_uavgv4hi3 with SSE
> > >   i386: Emulate MMX mmx_psadbw with SSE
> > >   i386: Emulate MMX movntq with SSE2 movntidi
> > >   i386: Emulate MMX umulv1siv1di3 with SSE2
> > >   i386: Make _mm_empty () as NOP when MMX is disabled
> > >   i386: Emulate MMX ssse3_ph<plusminus_mnemonic>wv4hi3 with SSE
> > >   i386: Emulate MMX ssse3_ph<plusminus_mnemonic>dv2si3 with SSE
> > >   i386: Emulate MMX ssse3_pmaddubsw with SSE
> > >   i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE
> > >   i386: Emulate MMX pshufb with SSE version
> > >   i386: Emulate MMX ssse3_psign<mode>3 with SSE
> > >   i386: Emulate MMX ssse3_palignrdi with SSE
> > >   i386: Emulate MMX abs<mode>2 with SSE
> > >   i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE
> > >   i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE
> > >   i386: Allow MMX intrinsic emulation with SSE
> > >   i386: Enable TM MMX intrinsics with SSE2
> > >   i386: Add tests for MMX intrinsic emulations with SSE
> > >
> > > Uros Bizjak (1):
> > >   Prevent allocation of MMX registers with TARGET_MMX_WITH_SSE
> > >
> > >  gcc/config/i386/constraints.md                |   6 +
> > >  gcc/config/i386/i386-builtin.def              | 126 +--
> > >  gcc/config/i386/i386-c.c                      |   2 +
> > >  gcc/config/i386/i386-protos.h                 |   4 +
> > >  gcc/config/i386/i386.c                        | 189 +++-
> > >  gcc/config/i386/i386.h                        |   2 +
> > >  gcc/config/i386/i386.md                       |  17 +
> > >  gcc/config/i386/mmintrin.h                    |  12 +-
> > >  gcc/config/i386/mmx.md                        | 903 ++++++++++++------
> > >  gcc/config/i386/predicates.md                 |   7 +
> > >  gcc/config/i386/sse.md                        | 353 +++++--
> > >  gcc/config/i386/xmmintrin.h                   |  61 ++
> > >  gcc/testsuite/gcc.target/i386/mmx-vals.h      |  77 ++
> > >  gcc/testsuite/gcc.target/i386/pr82483-1.c     |   2 +-
> > >  gcc/testsuite/gcc.target/i386/pr82483-2.c     |   2 +-
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-10.c   |  44 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-11.c   |  39 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-12.c   |  43 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-13.c   |  40 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-14.c   |  32 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-15.c   |  37 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-16.c   |  41 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-17.c   |  52 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-18a.c  |  14 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-18b.c  |   7 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-18c.c  |   7 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-19a.c  |  14 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-19b.c  |   7 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-19c.c  |   7 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-19d.c  |   7 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-19e.c  |   7 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-2.c    |  12 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-20.c   |  12 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-21.c   |  13 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-3.c    |  13 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-4.c    |   4 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-5.c    |  11 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-6.c    |  11 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-7.c    |  13 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-8.c    |   4 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-9.c    |  79 ++
> > >  .../gcc.target/i386/sse2-mmx-cvtpi2ps.c       |  44 +
> > >  .../gcc.target/i386/sse2-mmx-cvtps2pi.c       |  37 +
> > >  .../gcc.target/i386/sse2-mmx-cvttps2pi.c      |  37 +
> > >  .../gcc.target/i386/sse2-mmx-maskmovq.c       | 100 ++
> > >  .../gcc.target/i386/sse2-mmx-packssdw.c       |  53 +
> > >  .../gcc.target/i386/sse2-mmx-packsswb.c       |  53 +
> > >  .../gcc.target/i386/sse2-mmx-packuswb.c       |  53 +
> > >  .../gcc.target/i386/sse2-mmx-paddb.c          |  49 +
> > >  .../gcc.target/i386/sse2-mmx-paddd.c          |  49 +
> > >  .../gcc.target/i386/sse2-mmx-paddq.c          |  44 +
> > >  .../gcc.target/i386/sse2-mmx-paddsb.c         |  49 +
> > >  .../gcc.target/i386/sse2-mmx-paddsw.c         |  49 +
> > >  .../gcc.target/i386/sse2-mmx-paddusb.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-paddusw.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-paddw.c          |  49 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-pand.c |  45 +
> > >  .../gcc.target/i386/sse2-mmx-pandn.c          |  45 +
> > >  .../gcc.target/i386/sse2-mmx-pavgb.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-pavgw.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-pcmpeqb.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pcmpeqd.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pcmpeqw.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pcmpgtb.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pcmpgtd.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pcmpgtw.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pextrw.c         |  60 ++
> > >  .../gcc.target/i386/sse2-mmx-pinsrw.c         |  62 ++
> > >  .../gcc.target/i386/sse2-mmx-pmaddwd.c        |  48 +
> > >  .../gcc.target/i386/sse2-mmx-pmaxsw.c         |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pmaxub.c         |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pminsw.c         |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pminub.c         |  49 +
> > >  .../gcc.target/i386/sse2-mmx-pmovmskb.c       |  47 +
> > >  .../gcc.target/i386/sse2-mmx-pmulhuw.c        |  52 +
> > >  .../gcc.target/i386/sse2-mmx-pmulhw.c         |  54 ++
> > >  .../gcc.target/i386/sse2-mmx-pmullw.c         |  53 +
> > >  .../gcc.target/i386/sse2-mmx-pmuludq.c        |  48 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-por.c  |  45 +
> > >  .../gcc.target/i386/sse2-mmx-psadbw.c         |  59 ++
> > >  .../gcc.target/i386/sse2-mmx-pshufw.c         | 249 +++++
> > >  .../gcc.target/i386/sse2-mmx-pslld.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-pslldi.c         | 154 +++
> > >  .../gcc.target/i386/sse2-mmx-psllq.c          |  48 +
> > >  .../gcc.target/i386/sse2-mmx-psllqi.c         | 246 +++++
> > >  .../gcc.target/i386/sse2-mmx-psllw.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-psllwi.c         | 106 ++
> > >  .../gcc.target/i386/sse2-mmx-psrad.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-psradi.c         | 154 +++
> > >  .../gcc.target/i386/sse2-mmx-psraw.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-psrawi.c         | 106 ++
> > >  .../gcc.target/i386/sse2-mmx-psrld.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-psrldi.c         | 154 +++
> > >  .../gcc.target/i386/sse2-mmx-psrlq.c          |  48 +
> > >  .../gcc.target/i386/sse2-mmx-psrlqi.c         | 246 +++++
> > >  .../gcc.target/i386/sse2-mmx-psrlw.c          |  53 +
> > >  .../gcc.target/i386/sse2-mmx-psrlwi.c         | 106 ++
> > >  .../gcc.target/i386/sse2-mmx-psubb.c          |  49 +
> > >  .../gcc.target/i386/sse2-mmx-psubd.c          |  49 +
> > >  .../gcc.target/i386/sse2-mmx-psubq.c          |  44 +
> > >  .../gcc.target/i386/sse2-mmx-psubusb.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-psubusw.c        |  49 +
> > >  .../gcc.target/i386/sse2-mmx-psubw.c          |  49 +
> > >  .../gcc.target/i386/sse2-mmx-punpckhbw.c      |  54 ++
> > >  .../gcc.target/i386/sse2-mmx-punpckhdq.c      |  48 +
> > >  .../gcc.target/i386/sse2-mmx-punpckhwd.c      |  50 +
> > >  .../gcc.target/i386/sse2-mmx-punpcklbw.c      |  54 ++
> > >  .../gcc.target/i386/sse2-mmx-punpckldq.c      |  48 +
> > >  .../gcc.target/i386/sse2-mmx-punpcklwd.c      |  50 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx-pxor.c |  45 +
> > >  gcc/testsuite/gcc.target/i386/sse2-mmx.c      |   1 -
> > >  111 files changed, 6422 insertions(+), 463 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/mmx-vals.h
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-10.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-11.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-12.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-13.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-14.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-15.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-16.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-17.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-18a.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-18b.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-18c.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19a.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19b.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19c.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19d.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-19e.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-20.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-21.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-5.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-6.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-7.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-8.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-9.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-cvtpi2ps.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-cvtps2pi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-cvttps2pi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-maskmovq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-packssdw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-packsswb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-packuswb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddsb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddsw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddusb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddusw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-paddw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pand.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pandn.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pavgb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pavgw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpeqb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpeqd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpeqw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpgtb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpgtd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pcmpgtw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pinsrw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmaddwd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmaxsw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmaxub.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pminsw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pminub.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmovmskb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmulhuw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmulhw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmullw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pmuludq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-por.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psadbw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pshufw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pslld.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pslldi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllqi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psllwi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrad.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psradi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psraw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrawi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrld.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrldi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlqi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psrlwi.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubusb.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubusw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-psubw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhbw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhdq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckhwd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpcklbw.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpckldq.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-punpcklwd.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-mmx-pxor.c
> > >
> > > --
> > > 2.20.1
> > >
>
>
>
> --
> H.J.
H.J. Lu Feb. 16, 2019, 12:53 a.m. UTC | #4
On Fri, Feb 15, 2019 at 9:50 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Fri, Feb 15, 2019 at 2:58 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > emulate MMX intrinsics with SSE instructions. To support it, we added
> >
> >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> >
> > ;; Define instruction set of MMX instructions
> > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> >   (const_string "base"))
> >
> >          (eq_attr "mmx_isa" "native")
> >            (symbol_ref "!TARGET_MMX_WITH_SSE")
> >          (eq_attr "mmx_isa" "x64")
> >            (symbol_ref "TARGET_MMX_WITH_SSE")
> >          (eq_attr "mmx_isa" "x64_avx")
> >            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> >          (eq_attr "mmx_isa" "x64_noavx")
> >            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> >
> > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > TARGET_MMX_WITH_SSE.
> >
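The enablement rules quoted above can be read as a small truth table. The sketch below is only a C model of that table, not code from the patch: the enum, the struct and the function names are hypothetical, and treating "base" as always enabled is an assumption (the quoted conditions do not list it).

```c
#include <stdbool.h>

/* Hypothetical mirror of the "mmx_isa" attribute values.  */
enum mmx_isa
{
  MMX_ISA_BASE,
  MMX_ISA_NATIVE,
  MMX_ISA_X64,
  MMX_ISA_X64_NOAVX,
  MMX_ISA_X64_AVX
};

/* Hypothetical stand-ins for TARGET_64BIT, TARGET_SSE2 and TARGET_AVX.  */
struct target_flags
{
  bool is_64bit;
  bool sse2;
  bool avx;
};

/* Models: #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)  */
static bool
target_mmx_with_sse (const struct target_flags *t)
{
  return t->is_64bit && t->sse2;
}

/* Returns true when an alternative tagged with ISA would be enabled
   under the quoted conditions.  */
bool
mmx_alternative_enabled (enum mmx_isa isa, const struct target_flags *t)
{
  bool with_sse = target_mmx_with_sse (t);

  switch (isa)
    {
    case MMX_ISA_NATIVE:
      return !with_sse;
    case MMX_ISA_X64:
      return with_sse;
    case MMX_ISA_X64_AVX:
      return with_sse && t->avx;
    case MMX_ISA_X64_NOAVX:
      return with_sse && !t->avx;
    case MMX_ISA_BASE:
    default:
      /* Assumption: "base" alternatives are not restricted here.  */
      return true;
    }
}
```

For example, on a 64-bit SSE2 target without AVX this model enables the x64 and x64_noavx alternatives and disables the native and x64_avx ones.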
> > Most MMX instructions have equivalent SSE versions, and the results of some
> > SSE versions need to be reshuffled into the right order for MMX.  There are
> > a couple of tricky cases:
> >
> > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > mask operand and handling unmapped bits 64:127 at the memory address by
> > adjusting the source and mask operands together with the memory address.
> >
> > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > in 64-bit mode.
> >
> > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > SSE emulation must clear bit 3 (the fourth index bit) of each byte in the
> > shuffle control mask.
> >
> > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > the upper 64 bits of the destination XMM register.
> >
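The pshufb case in item 3 can be checked with a scalar model. The sketch below does not use the real instructions or the patch's code; all function names are hypothetical. It models the documented byte-select behaviour (bit 7 of a control byte zeroes the result, the low bits select a source byte) and shows that clearing the fourth index bit of each control byte (mask 0xf7, i.e. bit 3 counting from zero) makes the low 64 bits of the 128-bit shuffle agree with the 64-bit one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of 64-bit pshufb: bit 7 zeroes the byte, otherwise
   bits 0-2 of the control byte select a source byte.  */
void
mmx_pshufb_model (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t out[8];
  for (int i = 0; i < 8; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  memcpy (dst, out, 8);
}

/* Scalar model of 128-bit pshufb: same rule, but bits 0-3 select.  */
void
sse_pshufb_model (uint8_t dst[16], const uint8_t src[16],
                  const uint8_t ctl[16])
{
  uint8_t out[16];
  for (int i = 0; i < 16; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
  memcpy (dst, out, 16);
}

/* Emulate the 64-bit shuffle with the 128-bit model.  The upper source
   half is filled with a marker value so that any stray selection from
   it would show up in the result.  */
void
emulated_mmx_pshufb (uint8_t dst[8], const uint8_t src[8],
                     const uint8_t ctl[8])
{
  uint8_t src128[16], ctl128[16], out128[16];

  memcpy (src128, src, 8);
  memset (src128 + 8, 0xa5, 8);
  memset (ctl128 + 8, 0, 8);
  for (int i = 0; i < 8; i++)
    ctl128[i] = ctl[i] & 0xf7;	/* Clear index bit 3.  */

  sse_pshufb_model (out128, src128, ctl128);
  memcpy (dst, out128, 8);
}

/* Returns 1 if the emulation matches the 64-bit model for every
   possible control byte value in every byte position.  */
int
pshufb_emulation_matches (void)
{
  const uint8_t src[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };

  for (int c = 0; c < 256; c++)
    {
      uint8_t ctl[8], want[8], got[8];
      for (int i = 0; i < 8; i++)
        ctl[i] = (uint8_t) (c + 37 * i);
      mmx_pshufb_model (want, src, ctl);
      emulated_mmx_pshufb (got, src, ctl);
      if (memcmp (want, got, 8) != 0)
        return 0;
    }
  return 1;
}
```

Without the 0xf7 mask, a control byte such as 0x08 would select byte 8 of the 128-bit source, which lies in the upper half and has no counterpart in the 64-bit operand.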
> > Tests are also added to check each SSE emulation of MMX intrinsics.
> >
> > There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> > tested with
> >
> > --with-arch=native --with-cpu=native
> >
> > on AVX2 and AVX512F machines.
>
> I went through the code again, and it looks OK in general, modulo the
> mmx_nonimmediate_operand issue and a couple of minor issues.
>
> Please replace the nonimmediate_operand predicate with
> mmx_nonimmediate_operand in expanders and insn patterns. Please note
> that the proposed convention is to name the operand
> register_mmxmem_operand (cf. register_ssemem_operand), so I suggest
> we name the predicate in this way.
>
> There is an issue with the change to the emms pattern.
>
> And let's remove _mm_empty () calls from testcases; they complicate
> things too much for no apparent benefit.
>
> With those issues fixed, the patchset is OK for gcc-10 when it opens.

The new patch set starts at

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01275.html

including

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01271.html

for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89372