Message ID | CAFULd4ak+PNDZa3edAs-jPF5QUnH_8WwDAhvhD893Dr4LEuMrw@mail.gmail.com |
---|---|
State | New |
Hi Uros,

> -----Original Message-----
> From: Uros Bizjak [mailto:ubizjak@gmail.com]
> Sent: Monday, August 22, 2016 12:36 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; NightStrike
> StrikeNight <nightstrike@gmail.com>
> Subject: [PATCH, i386]: Fine tune prefetchw emission (PR 77270)
>
> Hello!
>
> Attached patch fine-tunes the condition when prefetchw write prefetch insns
> are emitted. prefetchw is preferred for non-SSE2 K7 athlons (this is covered by
> i386-prefetch.exp tests), on the other hand, SSE prefetches are preferred for K8
> targets, as measured and reported in PR 77270.
>
> For newer targets, PRFCHW cpuid bit is respected, and -march=native correctly
> emits prefetchw, when PRFCHW cpuid bit is set. (on a related note,
> PTA_PRFCHW should probably be set for amdfam10+ targets, Venkataramanan
> is looking into this issue).

Yes, AMD family10 targets support both 3DNowPrefetch (PTA_PRFCHW) and 3DNOW!.
We should set PTA_PRFCHW for these targets.

A patch was already posted a few years back:
https://gcc.gnu.org/ml/gcc-patches/2012-09/msg00670.html

Now that we no longer give priority to prefetches via the 3DNOW! ISA, we have to add PTA_PRFCHW for AMD fam10 targets.

I have pushed the changes:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=239682

>
> 2016-08-21  Uros Bizjak  <ubizjak@gmail.com>
>
>     PR target/77270
>     * config/i386/i386.md (prefetch): When TARGET_PRFCHW or
>     TARGET_PREFETCHWT1 are disabled, emit 3dNOW! write prefetches for
>     non-SSE2 athlons only, otherwise prefer SSE prefetches.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Committed to mainline SVN.
>
> Uros.

Regards,
Venkat.
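For readers who want to exercise the condition discussed above: in C, a write prefetch is requested through GCC's `__builtin_prefetch` with its second argument set to 1. The following is only a minimal sketch; the function name `scale_in_place` and the distance `PREFETCH_DISTANCE` are hypothetical and not taken from the patch or from PR 77270.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in elements; not taken from the patch.  */
#define PREFETCH_DISTANCE 16

/* Multiply every element by k, hinting that elements a little further
   ahead are about to be written.  */
void
scale_in_place (float *data, size_t n, float k)
{
  assert (data != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    {
      /* Second argument 1 = prefetch for write, third argument 3 =
         highest temporal locality.  Both must be compile-time constants.  */
      if (i + PREFETCH_DISTANCE < n)
        __builtin_prefetch (&data[i + PREFETCH_DISTANCE], 1, 3);

      data[i] *= k;
    }
}
```

Which instruction the compiler picks for the hint depends on the target flags as described in the quoted message. Compiling with `-S`, with and without `-mprfchw`, and comparing the assembly is one way to observe the difference.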
Index: i386.md
===================================================================
--- i386.md (revision 239642)
+++ i386.md (working copy)
@@ -18634,20 +18634,24 @@
   gcc_assert (IN_RANGE (locality, 0, 3));
 
   /* Use 3dNOW prefetch in case we are asking for write prefetch not
-     supported by SSE counterpart or the SSE prefetch is not available
-     (K6 machines).  Otherwise use SSE prefetch as it allows specifying
-     of locality.  */
+     supported by SSE counterpart (non-SSE2 athlon machines) or the
+     SSE prefetch is not available (K6 machines).  Otherwise use SSE
+     prefetch as it allows specifying of locality.  */
   if (write)
     {
       if (TARGET_PREFETCHWT1)
         operands[2] = GEN_INT (MAX (locality, 2));
-      else if (TARGET_3DNOW || TARGET_PRFCHW)
+      else if (TARGET_PRFCHW)
         operands[2] = GEN_INT (3);
+      else if (TARGET_3DNOW && !TARGET_SSE2)
+        operands[2] = GEN_INT (3);
+      else if (TARGET_PREFETCH_SSE)
+        operands[1] = const0_rtx;
       else
         {
-          gcc_assert (TARGET_3DNOW);
-          operands[2] = GEN_INT (3);
+          gcc_assert (TARGET_3DNOW);
+          operands[2] = GEN_INT (3);
         }
     }
   else
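The order of checks in the new write-prefetch branch can be modelled outside the compiler as a small stand-alone C function. This is a sketch for illustration only: `struct target_flags`, the enum, and `choose_write_prefetch` are hypothetical names, and the boolean fields merely stand in for the `TARGET_*` macros that GCC derives from its target options.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TARGET_* macros tested in the patch.  */
struct target_flags
{
  bool prefetchwt1;   /* TARGET_PREFETCHWT1 */
  bool prfchw;        /* TARGET_PRFCHW */
  bool three_dnow;    /* TARGET_3DNOW */
  bool sse2;          /* TARGET_SSE2 */
  bool prefetch_sse;  /* TARGET_PREFETCH_SSE */
};

/* What the expander does to its operands in each case (names are
   illustrative, not GCC's).  */
enum write_prefetch_action
{
  KEEP_WRITE_LOCALITY_AT_LEAST_2,  /* operands[2] = MAX (locality, 2) */
  KEEP_WRITE_LOCALITY_3,           /* operands[2] = 3 */
  DROP_WRITE_USE_SSE               /* operands[1] = 0 */
};

/* Mirrors the if/else chain of the patched write branch, top to bottom.  */
enum write_prefetch_action
choose_write_prefetch (const struct target_flags *t)
{
  if (t->prefetchwt1)
    return KEEP_WRITE_LOCALITY_AT_LEAST_2;
  else if (t->prfchw)
    return KEEP_WRITE_LOCALITY_3;
  else if (t->three_dnow && !t->sse2)
    return KEEP_WRITE_LOCALITY_3;
  else if (t->prefetch_sse)
    return DROP_WRITE_USE_SSE;
  else
    {
      /* Last resort: only 3dNOW! prefetch is left.  */
      assert (t->three_dnow);
      return KEEP_WRITE_LOCALITY_3;
    }
}
```

With a hypothetical K8-like combination (3dNOW!, SSE2 and SSE prefetch set, PRFCHW clear) the function returns `DROP_WRITE_USE_SSE`, whereas clearing `sse2` in the same combination gives `KEEP_WRITE_LOCALITY_3`. That matches the behaviour described in the ChangeLog entry.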