[i386]: Fine tune prefetchw emission (PR 77270)

Message ID CAFULd4ak+PNDZa3edAs-jPF5QUnH_8WwDAhvhD893Dr4LEuMrw@mail.gmail.com
State New

Commit Message

Uros Bizjak Aug. 21, 2016, 7:06 p.m. UTC
Hello!

The attached patch fine-tunes the condition under which prefetchw
write-prefetch insns are emitted. prefetchw is preferred for non-SSE2
K7 Athlons (this is covered by the i386-prefetch.exp tests). On the
other hand, SSE prefetches are preferred for K8 targets, as measured
and reported in PR 77270.
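
One way to observe the behaviour described above is a small test case
built around GCC's __builtin_prefetch with its rw argument set to 1
(write prefetch). The sketch below is illustrative only: the function
name fill_with_prefetch and the prefetch distance of 16 elements are
assumptions made for this example and are not part of the patch or of
the i386-prefetch.exp tests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill dst[0..n) with value while hinting a
   write prefetch a fixed distance ahead of the current store.  */
void
fill_with_prefetch (int *dst, size_t n, int value)
{
  assert (dst != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    {
      /* rw = 1 requests a write prefetch, locality = 3 requests the
         highest temporal locality.  The guard keeps the address
         computation inside the array.  */
      if (i + 16 < n)
        __builtin_prefetch (&dst[i + 16], 1, 3);
      dst[i] = value;
    }
}
```

Compiling this with -O2 -S and comparing, for example, -m32 -march=athlon
against -march=k8 should show which prefetch insn is selected. Going by
the description above, one would expect prefetchw in the first case and
an SSE prefetch in the second; that is an expectation derived from this
message, not verified compiler output.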

For newer targets, the PRFCHW cpuid bit is respected, and -march=native
correctly emits prefetchw when the PRFCHW cpuid bit is set. (On a
related note, PTA_PRFCHW should probably be set for amdfam10+ targets;
Venkataramanan is looking into this issue.)

2016-08-21  Uros Bizjak  <ubizjak@gmail.com>

    PR target/77270
    * config/i386/i386.md (prefetch): When TARGET_PRFCHW or
    TARGET_PREFETCHWT1 are disabled, emit 3dNOW! write prefetches for
    non-SSE2 athlons only, otherwise prefer SSE prefetches.
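
The selection order described in this ChangeLog entry can be modelled
outside the compiler roughly as follows. This is a minimal sketch, not
GCC code: struct target_flags, struct prefetch_ops and
adjust_write_prefetch are hypothetical stand-ins for the TARGET_*
macros and the operands[1] (write) and operands[2] (locality) values
used by the prefetch expander.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TARGET_* macros.  */
struct target_flags
{
  bool prefetchwt1;   /* TARGET_PREFETCHWT1 */
  bool prfchw;        /* TARGET_PRFCHW */
  bool has_3dnow;     /* TARGET_3DNOW */
  bool sse2;          /* TARGET_SSE2 */
  bool prefetch_sse;  /* TARGET_PREFETCH_SSE */
};

/* Models operands[1] (write) and operands[2] (locality).  */
struct prefetch_ops
{
  int write;
  int locality;
};

/* Adjust a write-prefetch request in the order used by the patch.  */
static struct prefetch_ops
adjust_write_prefetch (const struct target_flags *t, int locality)
{
  struct prefetch_ops ops = { 1, locality };

  assert (locality >= 0 && locality <= 3);

  if (t->prefetchwt1)
    ops.locality = locality > 2 ? locality : 2;
  else if (t->prfchw)
    ops.locality = 3;
  else if (t->has_3dnow && !t->sse2)
    ops.locality = 3;           /* non-SSE2 Athlon: 3dNOW! write prefetch */
  else if (t->prefetch_sse)
    ops.write = 0;              /* fall back to an SSE (read) prefetch */
  else
    {
      assert (t->has_3dnow);    /* K6-style target without SSE prefetch */
      ops.locality = 3;
    }
  return ops;
}
```

With flags resembling a K8 (3dNOW!, SSE2 and SSE prefetch, but no
PRFCHW), the model clears the write flag and keeps the requested
locality; with flags resembling a non-SSE2 Athlon it keeps the write
flag and forces locality 3.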

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Comments

Kumar, Venkataramanan Aug. 23, 2016, 6:58 a.m. UTC | #1
Hi Uros,

> -----Original Message-----
> From: Uros Bizjak [mailto:ubizjak@gmail.com]
> Sent: Monday, August 22, 2016 12:36 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; NightStrike
> StrikeNight <nightstrike@gmail.com>
> Subject: [PATCH, i386]: Fine tune prefetchw emission (PR 77270)
>
> Hello!
>
> Attached patch fine-tunes the condition when prefetchw write prefetch insns
> are emitted. prefetchw is preferred for non-SSE2 K7 athlons (this is covered by
> i386-prefetch.exp tests), on the other hand, SSE prefetches are preferred for K8
> targets, as measured and reported in PR 77270.
>
> For newer targets, PRFCHW cpuid bit is respected, and -march=native correctly
> emits prefetchw, when PRFCHW cpuid bit is set. (on a related note,
> PTA_PRFCHW should probably be set for amdfam10+ targets, Venkataramanan
> is looking into this issue).


Yes, AMD family 10 targets support both 3DNowPrefetch (PTA_PRFCHW) and 3DNOW!.
We should set PTA_PRFCHW for these targets.

A patch for this was already posted a few years back:
https://gcc.gnu.org/ml/gcc-patches/2012-09/msg00670.html

Now that we no longer give priority to prefetches via the 3DNOW! ISA, we have to add PTA_PRFCHW for AMD fam10 targets.
I have pushed the changes:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=239682

>
> 2016-08-21  Uros Bizjak  <ubizjak@gmail.com>
>
>     PR target/77270
>     * config/i386/i386.md (prefetch): When TARGET_PRFCHW or
>     TARGET_PREFETCHWT1 are disabled, emit 3dNOW! write prefetches for
>     non-SSE2 athlons only, otherwise prefer SSE prefetches.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Committed to mainline SVN.
>
> Uros.


Regards,
Venkat.

Patch

Index: i386.md
===================================================================
--- i386.md	(revision 239642)
+++ i386.md	(working copy)
@@ -18634,20 +18634,24 @@ 
   gcc_assert (IN_RANGE (locality, 0, 3));
 
   /* Use 3dNOW prefetch in case we are asking for write prefetch not
-     supported by SSE counterpart or the SSE prefetch is not available
-     (K6 machines).  Otherwise use SSE prefetch as it allows specifying
-     of locality.  */
+     supported by SSE counterpart (non-SSE2 athlon machines) or the
+     SSE prefetch is not available (K6 machines).  Otherwise use SSE
+     prefetch as it allows specifying of locality.  */
 
   if (write)
     {
       if (TARGET_PREFETCHWT1)
 	operands[2] = GEN_INT (MAX (locality, 2)); 
-      else if (TARGET_3DNOW || TARGET_PRFCHW)
+      else if (TARGET_PRFCHW)
 	operands[2] = GEN_INT (3);
+      else if (TARGET_3DNOW && !TARGET_SSE2)
+	operands[2] = GEN_INT (3);
+      else if (TARGET_PREFETCH_SSE)
+	operands[1] = const0_rtx;
       else
 	{
-	  gcc_assert (TARGET_PREFETCH_SSE);
-	  operands[1] = const0_rtx;
+	  gcc_assert (TARGET_3DNOW);
+	  operands[2] = GEN_INT (3);
 	}
     }
   else