diff mbox

[i386] Fix emitting of prefetch instructions

Message ID CAFULd4YJyA88BoX9=s_pSDngVN_-3ObgaQJVF1t69xQugt10Ew@mail.gmail.com
State New
Headers show

Commit Message

Uros Bizjak March 4, 2014, 12:13 a.m. UTC
On Tue, Mar 4, 2014 at 12:31 AM, Uros Bizjak <ubizjak@gmail.com> wrote:

>>> The new gcc.target/i386/prefetchwt1-1.c test currently FAILs on Solaris 9/x86:
>>>
>>> FAIL: gcc.target/i386/prefetchwt1-1.c (test for excess errors)
>>> Excess errors:
>>> /var/gcc/regression/trunk/9-gcc-gas/build/gcc/include/xmmintrin.h:1195:1: error:
>>>  inlining failed in call to always_inline '_mm_prefetch': target specific option
>>>  mismatch
>>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/prefetchwt1-1.c:12:5:
>>> error: called from here
>>>
>>> gcc.target/i386/prefetchwt1-1.c: output file does not exist
>>> UNRESOLVED: gcc.target/i386/prefetchwt1-1.c scan-assembler [ \\t]+prefetchwt1[ \
>>> \t]+
>>>
>>> This can be fixed by compiling with -msse2.
>>
>> Actually, we should take prefetch instructions out of various GCC
>> target pragmas. Patterns that emit these instructions are designed to
>> (depending on selected ISA) always emit  the most optimal prefetch
>> instruction.
>>
>> The patch also changes the compiler to emit prefetchwt1 only for
>> _MM_HINT_T1, while for _MM_HINT_T0, it still emits prefetchw. In
>> addition, the patch corrects wrong MM_HINT_T0 value.
>>
>> Patch was bootstrapped and tested on x86_64-pc-linux-gnu {,-m32}  and
>> committed to mainline SVN.
>>
>> 2014-03-03  Uros Bizjak  <ubizjak@gmail.com>
>>
>>     * config/i386/xmmintrin.h (enum _mm_hint) <_MM_HINT_ET0>: Correct
>>     hint value.
>>     (_mm_prefetch): Move out of GCC target("sse") pragma.
>>     * config/i386/prfchwintrin.h (_m_prefetchw): Move out of
>>     GCC target("prfchw") pragma.
>>     * config/i386/i386.md (prefetch): Emit prefetchwt1 only
>>     for locality <= 2.
>>     * config/i386/i386.c (ix86_option_override_internal): Enable
>>     -mprfchw with -mprefetchwt1.
>
> BTW: There are a couple of new testsuite failures:
>
> FAIL: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c
> scan-assembler-times vscatterpf0dpd[ \\\\t]+[^\\n]*%ymm[0-9] 2
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c
> scan-assembler-times vscatterpf0dpd[ \\\\t]+[^\\n]*{%k[1-7] 1
> FAIL: gcc.target/i386/avx512pf-vscatterpf0dps-1.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dps-1.c
> scan-assembler-times vscatterpf0dps[ \\\\t]+[^\\n]*%zmm[0-9] 2
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dps-1.c
> scan-assembler-times vscatterpf0dps[ \\\\t]+[^\\n]*{%k[1-7] 1
> FAIL: gcc.target/i386/avx512pf-vscatterpf0qpd-1.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qpd-1.c
> scan-assembler-times vscatterpf0qpd[ \\\\t]+[^\\n]*%zmm[0-9] 2
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qpd-1.c
> scan-assembler-times vscatterpf0qpd[ \\\\t]+[^\\n]*{%k[1-7] 1
> FAIL: gcc.target/i386/avx512pf-vscatterpf0qps-1.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qps-1.c
> scan-assembler-times vscatterpf0qps[ \\\\t]+[^\\n]*%zmm[0-9] 2
> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qps-1.c
> scan-assembler-times vscatterpf0qps[ \\\\t]+[^\\n]*{%k[1-7] 1
>
> They are all:
>
> FAIL: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c (test for excess errors)
> Excess errors:
> /ssd/uros/gcc-build/gcc/include/avx512pfintrin.h:108:3: error: the
> last argument must be hint 0 or 1
>
> They are due to _MM_HINT_ET0 fix, and probably show that the pattern
> was not updated when hint constants were adjusted to 2 and 3.
>
> Kirill, can you please look at this inconsistency?

Attached patch fixes these failures, and also fixes and improves error message.

Uros.

Comments

Kirill Yukhin March 4, 2014, 9:04 a.m. UTC | #1
Hello Uroš,
On 04 Mar 01:13, Uros Bizjak wrote:
> On Tue, Mar 4, 2014 at 12:31 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> > They are all:
> >
> > FAIL: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c (test for excess errors)
> > Excess errors:
> > /ssd/uros/gcc-build/gcc/include/avx512pfintrin.h:108:3: error: the
> > last argument must be hint 0 or 1
> >
> > They are due to _MM_HINT_ET0 fix, and probably show that the pattern
> > was not updated when hint constants were adjusted to 2 and 3.
> >
> > Kirill, can you please look at this inconsistency?
> 
> Attached patch fixes these failures, and also fixes and improves error message.
> 
> Uros.

> Index: i386.c
> ===================================================================
> --- i386.c	(revision 208296)
> +++ i386.c	(working copy)
> @@ -36022,7 +36022,7 @@ addcarryx:
>  
>        if (!insn_data[icode].operand[4].predicate (op4, mode4))
>  	{
> -	  error ("the last argument must be hint 0 or 1");
> +	  error ("incorrect hint operand");
>  	  return const0_rtx;
>  	}
>  
> Index: predicates.md
> ===================================================================
> --- predicates.md	(revision 208295)
> +++ predicates.md	(working copy)
> @@ -660,12 +660,12 @@
>    return i == 2 || i == 4 || i == 8;
>  })
>  
> -;; Match 2, 3, 5, or 6
> -(define_predicate "const2356_operand"
> +;; Match 2, 3, 6, or 7
> +(define_predicate "const2367_operand"
>    (match_code "const_int")
>  {
>    HOST_WIDE_INT i = INTVAL (op);
> -  return i == 2 || i == 3 || i == 5 || i == 6;
> +  return i == 2 || i == 3 || i == 6 || i == 7;
>  })
This will break `immediate' compatibility w/ ICC, since
ICC using ET0=5.
But as far as IMHO using immediates instead of literals
is ugly, I think this is minor issue.

--
Thanks, K
Uros Bizjak March 4, 2014, 2:45 p.m. UTC | #2
On Tue, Mar 4, 2014 at 10:04 AM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> Hello Uroš,
> On 04 Mar 01:13, Uros Bizjak wrote:
>> On Tue, Mar 4, 2014 at 12:31 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> > They are all:
>> >
>> > FAIL: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c (test for excess errors)
>> > Excess errors:
>> > /ssd/uros/gcc-build/gcc/include/avx512pfintrin.h:108:3: error: the
>> > last argument must be hint 0 or 1
>> >
>> > They are due to _MM_HINT_ET0 fix, and probably show that the pattern
>> > was not updated when hint constants were adjusted to 2 and 3.
>> >
>> > Kirill, can you please look at this inconsistency?
>>
>> Attached patch fixes these failures, and also fixes and improves error message.
>>
>> Uros.
>
>> Index: i386.c
>> ===================================================================
>> --- i386.c    (revision 208296)
>> +++ i386.c    (working copy)
>> @@ -36022,7 +36022,7 @@ addcarryx:
>>
>>        if (!insn_data[icode].operand[4].predicate (op4, mode4))
>>       {
>> -       error ("the last argument must be hint 0 or 1");
>> +       error ("incorrect hint operand");
>>         return const0_rtx;
>>       }
>>
>> Index: predicates.md
>> ===================================================================
>> --- predicates.md     (revision 208295)
>> +++ predicates.md     (working copy)
>> @@ -660,12 +660,12 @@
>>    return i == 2 || i == 4 || i == 8;
>>  })
>>
>> -;; Match 2, 3, 5, or 6
>> -(define_predicate "const2356_operand"
>> +;; Match 2, 3, 6, or 7
>> +(define_predicate "const2367_operand"
>>    (match_code "const_int")
>>  {
>>    HOST_WIDE_INT i = INTVAL (op);
>> -  return i == 2 || i == 3 || i == 5 || i == 6;
>> +  return i == 2 || i == 3 || i == 6 || i == 7;
>>  })
> This will break `immediate' compatibility w/ ICC, since
> ICC using ET0=5.
> But as far as IMHO using immediates instead of literals
> is ugly, I think this is minor issue.

I don't think this is an issue at all, as said earlier - builtins are
considered internal implementation detail, published interface is in
relevant *.h files (this is also why wrong error message was fixed).

Uros.
Uros Bizjak March 4, 2014, 6:14 p.m. UTC | #3
On Tue, Mar 4, 2014 at 1:13 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Mar 4, 2014 at 12:31 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>
>>>> The new gcc.target/i386/prefetchwt1-1.c test currently FAILs on Solaris 9/x86:
>>>>
>>>> FAIL: gcc.target/i386/prefetchwt1-1.c (test for excess errors)
>>>> Excess errors:
>>>> /var/gcc/regression/trunk/9-gcc-gas/build/gcc/include/xmmintrin.h:1195:1: error:
>>>>  inlining failed in call to always_inline '_mm_prefetch': target specific option
>>>>  mismatch
>>>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/prefetchwt1-1.c:12:5:
>>>> error: called from here
>>>>
>>>> gcc.target/i386/prefetchwt1-1.c: output file does not exist
>>>> UNRESOLVED: gcc.target/i386/prefetchwt1-1.c scan-assembler [ \\t]+prefetchwt1[ \
>>>> \t]+
>>>>
>>>> This can be fixed by compiling with -msse2.
>>>
>>> Actually, we should take prefetch instructions out of various GCC
>>> target pragmas. Patterns that emit these instructions are designed to
>>> (depending on selected ISA) always emit  the most optimal prefetch
>>> instruction.
>>>
>>> The patch also changes the compiler to emit prefetchwt1 only for
>>> _MM_HINT_T1, while for _MM_HINT_T0, it still emits prefetchw. In
>>> addition, the patch corrects wrong MM_HINT_T0 value.
>>>
>>> Patch was bootstrapped and tested on x86_64-pc-linux-gnu {,-m32}  and
>>> committed to mainline SVN.
>>>
>>> 2014-03-03  Uros Bizjak  <ubizjak@gmail.com>
>>>
>>>     * config/i386/xmmintrin.h (enum _mm_hint) <_MM_HINT_ET0>: Correct
>>>     hint value.
>>>     (_mm_prefetch): Move out of GCC target("sse") pragma.
>>>     * config/i386/prfchwintrin.h (_m_prefetchw): Move out of
>>>     GCC target("prfchw") pragma.
>>>     * config/i386/i386.md (prefetch): Emit prefetchwt1 only
>>>     for locality <= 2.
>>>     * config/i386/i386.c (ix86_option_override_internal): Enable
>>>     -mprfchw with -mprefetchwt1.
>>
>> BTW: There are a couple of new testsuite failures:
>>
>> FAIL: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c (test for excess errors)
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c
>> scan-assembler-times vscatterpf0dpd[ \\\\t]+[^\\n]*%ymm[0-9] 2
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c
>> scan-assembler-times vscatterpf0dpd[ \\\\t]+[^\\n]*{%k[1-7] 1
>> FAIL: gcc.target/i386/avx512pf-vscatterpf0dps-1.c (test for excess errors)
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dps-1.c
>> scan-assembler-times vscatterpf0dps[ \\\\t]+[^\\n]*%zmm[0-9] 2
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0dps-1.c
>> scan-assembler-times vscatterpf0dps[ \\\\t]+[^\\n]*{%k[1-7] 1
>> FAIL: gcc.target/i386/avx512pf-vscatterpf0qpd-1.c (test for excess errors)
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qpd-1.c
>> scan-assembler-times vscatterpf0qpd[ \\\\t]+[^\\n]*%zmm[0-9] 2
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qpd-1.c
>> scan-assembler-times vscatterpf0qpd[ \\\\t]+[^\\n]*{%k[1-7] 1
>> FAIL: gcc.target/i386/avx512pf-vscatterpf0qps-1.c (test for excess errors)
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qps-1.c
>> scan-assembler-times vscatterpf0qps[ \\\\t]+[^\\n]*%zmm[0-9] 2
>> UNRESOLVED: gcc.target/i386/avx512pf-vscatterpf0qps-1.c
>> scan-assembler-times vscatterpf0qps[ \\\\t]+[^\\n]*{%k[1-7] 1
>>
>> They are all:
>>
>> FAIL: gcc.target/i386/avx512pf-vscatterpf0dpd-1.c (test for excess errors)
>> Excess errors:
>> /ssd/uros/gcc-build/gcc/include/avx512pfintrin.h:108:3: error: the
>> last argument must be hint 0 or 1
>>
>> They are due to _MM_HINT_ET0 fix, and probably show that the pattern
>> was not updated when hint constants were adjusted to 2 and 3.
>>
>> Kirill, can you please look at this inconsistency?
>
> Attached patch fixes these failures, and also fixes and improves error message.

2014-03-04  Uros Bizjak  <ubizjak@gmail.com>

    * config/i386/predicates.md (const2356_operand): Change to ...
    (const2367_operand): ... this.
    * config/i386/sse.md (avx512pf_scatterpf<mode>sf): Use
    const2367_operand.
    (*avx512pf_scatterpf<mode>sf_mask): Ditto.
    (*avx512pf_scatterpf<mode>sf): Ditto.
    (avx512pf_scatterpf<mode>df): Ditto.
    (*avx512pf_scatterpf<mode>df_mask): Ditto.
    (*avx512pf_scatterpf<mode>df): Ditto.
    * config/i386/i386.c (ix86_expand_builtin): Update
    incorrect hint operand error message.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} and
committed to mainline SVN.

Uros.
diff mbox

Patch

Index: i386.c
===================================================================
--- i386.c	(revision 208296)
+++ i386.c	(working copy)
@@ -36022,7 +36022,7 @@  addcarryx:
 
       if (!insn_data[icode].operand[4].predicate (op4, mode4))
 	{
-	  error ("the last argument must be hint 0 or 1");
+	  error ("incorrect hint operand");
 	  return const0_rtx;
 	}
 
Index: predicates.md
===================================================================
--- predicates.md	(revision 208295)
+++ predicates.md	(working copy)
@@ -660,12 +660,12 @@ 
   return i == 2 || i == 4 || i == 8;
 })
 
-;; Match 2, 3, 5, or 6
-(define_predicate "const2356_operand"
+;; Match 2, 3, 6, or 7
+(define_predicate "const2367_operand"
   (match_code "const_int")
 {
   HOST_WIDE_INT i = INTVAL (op);
-  return i == 2 || i == 3 || i == 5 || i == 6;
+  return i == 2 || i == 3 || i == 6 || i == 7;
 })
 
 ;; Match 1, 2, 4, or 8
Index: sse.md
===================================================================
--- sse.md	(revision 208295)
+++ sse.md	(working copy)
@@ -12652,7 +12652,7 @@ 
 	  [(match_operand 2 "vsib_address_operand")
 	   (match_operand:VI48_512 1 "register_operand")
 	   (match_operand:SI 3 "const1248_operand")]))
-      (match_operand:SI 4 "const2356_operand")]
+      (match_operand:SI 4 "const2367_operand")]
      UNSPEC_SCATTER_PREFETCH)]
   "TARGET_AVX512PF"
 {
@@ -12670,7 +12670,7 @@ 
 	    (match_operand:VI48_512 1 "register_operand" "v")
 	    (match_operand:SI 3 "const1248_operand" "n")]
 	   UNSPEC_VSIBADDR)])
-      (match_operand:SI 4 "const2356_operand" "n")]
+      (match_operand:SI 4 "const2367_operand" "n")]
      UNSPEC_SCATTER_PREFETCH)]
   "TARGET_AVX512PF"
 {
@@ -12677,7 +12677,7 @@ 
   switch (INTVAL (operands[4]))
     {
     case 3:
-    case 5:
+    case 7:
       return "vscatterpf0<ssemodesuffix>ps\t{%5%{%0%}|%5%{%0%}}";
     case 2:
     case 6:
@@ -12699,7 +12699,7 @@ 
 	    (match_operand:VI48_512 0 "register_operand" "v")
 	    (match_operand:SI 2 "const1248_operand" "n")]
 	   UNSPEC_VSIBADDR)])
-      (match_operand:SI 3 "const2356_operand" "n")]
+      (match_operand:SI 3 "const2367_operand" "n")]
      UNSPEC_SCATTER_PREFETCH)]
   "TARGET_AVX512PF"
 {
@@ -12706,7 +12706,7 @@ 
   switch (INTVAL (operands[3]))
     {
     case 3:
-    case 5:
+    case 7:
       return "vscatterpf0<ssemodesuffix>ps\t{%4|%4}";
     case 2:
     case 6:
@@ -12728,7 +12728,7 @@ 
 	  [(match_operand 2 "vsib_address_operand")
 	   (match_operand:VI4_256_8_512 1 "register_operand")
 	   (match_operand:SI 3 "const1248_operand")]))
-      (match_operand:SI 4 "const2356_operand")]
+      (match_operand:SI 4 "const2367_operand")]
      UNSPEC_SCATTER_PREFETCH)]
   "TARGET_AVX512PF"
 {
@@ -12746,7 +12746,7 @@ 
 	    (match_operand:VI4_256_8_512 1 "register_operand" "v")
 	    (match_operand:SI 3 "const1248_operand" "n")]
 	   UNSPEC_VSIBADDR)])
-      (match_operand:SI 4 "const2356_operand" "n")]
+      (match_operand:SI 4 "const2367_operand" "n")]
      UNSPEC_SCATTER_PREFETCH)]
   "TARGET_AVX512PF"
 {
@@ -12753,7 +12753,7 @@ 
   switch (INTVAL (operands[4]))
     {
     case 3:
-    case 5:
+    case 7:
       return "vscatterpf0<ssemodesuffix>pd\t{%5%{%0%}|%5%{%0%}}";
     case 2:
     case 6:
@@ -12775,7 +12775,7 @@ 
 	    (match_operand:VI4_256_8_512 0 "register_operand" "v")
 	    (match_operand:SI 2 "const1248_operand" "n")]
 	   UNSPEC_VSIBADDR)])
-      (match_operand:SI 3 "const2356_operand" "n")]
+      (match_operand:SI 3 "const2367_operand" "n")]
      UNSPEC_SCATTER_PREFETCH)]
   "TARGET_AVX512PF"
 {
@@ -12782,7 +12782,7 @@ 
   switch (INTVAL (operands[3]))
     {
     case 3:
-    case 5:
+    case 7:
       return "vscatterpf0<ssemodesuffix>pd\t{%4|%4}";
     case 2:
     case 6: