diff mbox series

i386: Add register source to movddup

Message ID 20181013214949.16880-1-hjl.tools@gmail.com
State New
Headers show
Series i386: Add register source to movddup | expand

Commit Message

H.J. Lu Oct. 13, 2018, 9:49 p.m. UTC
Add register source to movddup so that IRA will allow register source
for *vec_dupv2di when SSE3 is enabled.

gcc/

	PR target/87599
	* config/i386/sse.md (*vec_dupv2di): Add register source to
	movddup.

gcc/testsuite/

	PR target/87599
	* gcc.target/i386/pr87599.c: New test.
---
 gcc/config/i386/sse.md                  |  2 +-
 gcc/testsuite/gcc.target/i386/pr87599.c | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87599.c

Comments

Alexander Monakov Oct. 13, 2018, 10:05 p.m. UTC | #1
On Sat, 13 Oct 2018, H.J. Lu wrote:

> Add register source to movddup so that IRA will allow register source
> for *vec_dupv2di when SSE3 is enabled.
> 
> gcc/
> 
> 	PR target/87599
> 	* config/i386/sse.md (*vec_dupv2di): Add register source to
> 	movddup.
> 
> gcc/testsuite/
> 
> 	PR target/87599
> 	* gcc.target/i386/pr87599.c: New test.

I doubt this is a correct fix, and I think the issue merits more investigation.
Please see comment #5 in the PR.

Alexander
Alexander Monakov Oct. 14, 2018, 8:05 a.m. UTC | #2
On Sun, 14 Oct 2018, Alexander Monakov wrote:
> 
> I doubt this is a correct fix, and I think the issue merits more investigation.
> Please see comment #5 in the PR.

Sorry, it seems I was misunderstanding how constraints interact with cost
calculation. I withdraw my objection to the patch.

Alexander
Uros Bizjak Oct. 14, 2018, 7:28 p.m. UTC | #3
On Sat, Oct 13, 2018 at 11:54 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Add register source to movddup so that IRA will allow register source
> for *vec_dupv2di when SSE3 is enabled.
>
> gcc/
>
>         PR target/87599
>         * config/i386/sse.md (*vec_dupv2di): Add register source to
>         movddup.
>
> gcc/testsuite/
>
>         PR target/87599
>         * gcc.target/i386/pr87599.c: New test.

OK with a small scan-string fix, described below.

Uros.

> ---
>  gcc/config/i386/sse.md                  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr87599.c | 12 ++++++++++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr87599.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index af4d80d8c9b..4b2193e0462 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -18185,7 +18185,7 @@
>  (define_insn "*vec_dupv2di"
>    [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,x")
>         (vec_duplicate:V2DI
> -         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,m,0")))]
> +         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
>    "TARGET_SSE"
>    "@
>     punpcklqdq\t%0, %0
> diff --git a/gcc/testsuite/gcc.target/i386/pr87599.c b/gcc/testsuite/gcc.target/i386/pr87599.c
> new file mode 100644
> index 00000000000..abbeb7a41af
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr87599.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-march=corei7 -O2" } */
> +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ \\t\]+%xmm\[0-9\]" 1 } } */

You need to scan for %xmm\[0-9\]+, otherwise xmm10 is already out of luck.

> +
> +#include <immintrin.h>
> +
> +__m128i
> +foo (long long val)
> +{
> +  __m128i rval = {val, val};
> +  return rval;
> +}
> --
> 2.17.2
>
Alexander Monakov Oct. 14, 2018, 8:11 p.m. UTC | #4
On Sun, 14 Oct 2018, Uros Bizjak wrote:
> > +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ \\t\]+%xmm\[0-9\]" 1 } } */
> 
> You need to scan for %xmm\[0-9\]+, otherwise xmm10 is already out of luck.

I think it would be preferable to scan for 'punpcklqdq xmm0, xmm0' exactly,
because the result (rval) must be in xmm0 for function return anyway.

Alexander

> > +
> > +#include <immintrin.h>
> > +
> > +__m128i
> > +foo (long long val)
> > +{
> > +  __m128i rval = {val, val};
> > +  return rval;
> > +}
> > --
> > 2.17.2
> >
>
H.J. Lu Oct. 14, 2018, 8:48 p.m. UTC | #5
On 10/14/18, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Sun, 14 Oct 2018, Uros Bizjak wrote:
>> > +/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[
>> > \\t\]+%xmm\[0-9\]" 1 } } */
>>
>> You need to scan for %xmm\[0-9\]+, otherwise xmm10 is already out of
>> luck.
>
> I think it would be preferable to scan for 'punpcklqdq xmm0, xmm0' exactly,
> because the result (rval) must be in xmm0 for function return anyway.

I checked in

/* { dg-final { scan-assembler-times "punpcklqdq\[
\\t\]+%xmm\[0-9\]+,\[ \\t\]+%xmm\[0-9\]+" 1 } } */

It covers 'punpcklqdq xmm0, xmm0' and 'punpcklqdq xmm10, xmm0' .
There should be
only one punpcklqdq.

> Alexander
>
>> > +
>> > +#include <immintrin.h>
>> > +
>> > +__m128i
>> > +foo (long long val)
>> > +{
>> > +  __m128i rval = {val, val};
>> > +  return rval;
>> > +}
>> > --
>> > 2.17.2
>> >
>>
>
diff mbox series

Patch

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index af4d80d8c9b..4b2193e0462 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -18185,7 +18185,7 @@ 
 (define_insn "*vec_dupv2di"
   [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,x")
 	(vec_duplicate:V2DI
-	  (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,m,0")))]
+	  (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
   "TARGET_SSE"
   "@
    punpcklqdq\t%0, %0
diff --git a/gcc/testsuite/gcc.target/i386/pr87599.c b/gcc/testsuite/gcc.target/i386/pr87599.c
new file mode 100644
index 00000000000..abbeb7a41af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr87599.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-march=corei7 -O2" } */
+/* { dg-final { scan-assembler-times "punpcklqdq\[ \\t\]+%xmm\[0-9\],\[ \\t\]+%xmm\[0-9\]" 1 } } */
+
+#include <immintrin.h>
+
+__m128i
+foo (long long val)
+{
+  __m128i rval = {val, val};
+  return rval;
+}