Message ID | 20181013214949.16880-1-hjl.tools@gmail.com |
---|---|
State | New |
Headers | show |
Series | i386: Add register source to movddup | expand |
On Sat, 13 Oct 2018, H.J. Lu wrote: > Add register source to movddup so that IRA will allow register source > for *vec_dupv2di when SSE3 is enabled. > > gcc/ > > PR target/87599 > * config/i386/sse.md (*vec_dupv2di): Add register source to > movddup. > > gcc/testsuite/ > > PR target/87599 > * gcc.target/i386/pr87599.c: New test. I doubt this is a correct fix, and I think the issue merits more investigation. Please see comment #5 in the PR. Alexander
On Sun, 14 Oct 2018, Alexander Monakov wrote: > > I doubt this is a correct fix, and I think the issue merits more investigation. > Please see comment #5 in the PR. Sorry, it seems I was misunderstanding how constraints interact with cost calculation. I withdraw my objection to the patch. Alexander
On Sat, Oct 13, 2018 at 11:54 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > Add register source to movddup so that IRA will allow register source > for *vec_dupv2di when SSE3 is enabled. > > gcc/ > > PR target/87599 > * config/i386/sse.md (*vec_dupv2di): Add register source to > movddup. > > gcc/testsuite/ > > PR target/87599 > * gcc.target/i386/pr87599.c: New test. OK with a small scan-string fix, described below. Uros. > --- > gcc/config/i386/sse.md | 2 +- > gcc/testsuite/gcc.target/i386/pr87599.c | 12 ++++++++++++ > 2 files changed, 13 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr87599.c > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index af4d80d8c9b..4b2193e0462 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -18185,7 +18185,7 @@ > (define_insn "*vec_dupv2di" > [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x") > (vec_duplicate:V2DI > - (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,m,0")))] > + (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))] > "TARGET_SSE" > "@ > punpcklqdq\t%0, %0 > diff --git a/gcc/testsuite/gcc.target/i386/pr87599.c b/gcc/testsuite/gcc.target/i386/pr87599.c > new file mode 100644 > index 00000000000..abbeb7a41af > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr87599.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-march=corei7 -O2" } */ > +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ \\t\]+%xmm\[0-9\]" 1 } } */ You need to scan for %xmm\[0-9\]+, otherwise xmm10 is already out of luck. > + > +#include <immintrin.h> > + > +__m128i > +foo (long long val) > +{ > + __m128i rval = {val, val}; > + return rval; > +} > -- > 2.17.2 >
On Sun, 14 Oct 2018, Uros Bizjak wrote: > > +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ \\t\]+%xmm\[0-9\]" 1 } } */ > > You need to scan for %xmm\[0-9\]+, otherwise xmm10 is already out of luck. I think it would be preferable to scan for 'punpcklqdq xmm0, xmm0' exactly, because the result (rval) must be in xmm0 for function return anyway. Alexander > > + > > +#include <immintrin.h> > > + > > +__m128i > > +foo (long long val) > > +{ > > + __m128i rval = {val, val}; > > + return rval; > > +} > > -- > > 2.17.2 > > >
On 10/14/18, Alexander Monakov <amonakov@ispras.ru> wrote: > On Sun, 14 Oct 2018, Uros Bizjak wrote: >> > +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ >> > \\t\]+%xmm\[0-9\]" 1 } } */ >> >> You need to scan for %xmm\[0-9\]+, otherwise xmm10 is already out of >> luck. > > I think it would be preferable to scan for 'punpcklqdq xmm0, xmm0' exactly, > because the result (rval) must be in xmm0 for function return anyway. I checked in /* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\]+,\[ \\t\]+%xmm\[0-9\]+" 1 } } */ It covers 'punpcklqdq xmm0, xmm0' and 'punpcklqdq xmm10, xmm0' . There should be only one punpcklqdq. > Alexander > >> > + >> > +#include <immintrin.h> >> > + >> > +__m128i >> > +foo (long long val) >> > +{ >> > + __m128i rval = {val, val}; >> > + return rval; >> > +} >> > -- >> > 2.17.2 >> > >> >
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index af4d80d8c9b..4b2193e0462 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -18185,7 +18185,7 @@ (define_insn "*vec_dupv2di" [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x") (vec_duplicate:V2DI - (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,m,0")))] + (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))] "TARGET_SSE" "@ punpcklqdq\t%0, %0 diff --git a/gcc/testsuite/gcc.target/i386/pr87599.c b/gcc/testsuite/gcc.target/i386/pr87599.c new file mode 100644 index 00000000000..abbeb7a41af --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr87599.c @@ -0,0 +1,12 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-march=corei7 -O2" } */ +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ \\t\]+%xmm\[0-9\]" 1 } } */ + +#include <immintrin.h> + +__m128i +foo (long long val) +{ + __m128i rval = {val, val}; + return rval; +}