Patchwork [PR,49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

Submitter Fang, Changpeng
Date June 20, 2011, 6:03 p.m.
Message ID <D4C76825A6780047854A11E93CDE84D005980DC701@SAUSEXMBP01.amd.com>
Permalink /patch/101174/
State New

Comments

Fang, Changpeng - June 20, 2011, 6:03 p.m.
Hi,

  I modified the patch as H.J. suggested (patch attached).

Is it OK to commit to trunk now?

Thanks,

Changpeng
Uros Bizjak - June 20, 2011, 6:38 p.m.
On Mon, Jun 20, 2011 at 8:03 PM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:

>  I modified the patch as H.J. suggested (patch attached).
>
> Is it OK to commit to trunk now?

Yes, this is OK for trunk.

Thanks,
Uros.
Fang, Changpeng - June 20, 2011, 10:07 p.m.
Thanks,
Patch has been committed to trunk as revision 175230.

Changpeng
Eric Botcazou - June 29, 2011, 4:06 p.m.
> Thanks,

Note that there is no "i386" component in Bugzilla, only "target", so this
should have been PR target/49089.  As a result, there are no xrefs in the PR,
which is still open btw.  So please add the xrefs to the commits in the PR
manually and close it if you are done with it.

Patch

From 50310fc367348b406fc88d54c3ab54d1a304ad52 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Mon, 13 Jun 2011 13:13:32 -0700
Subject: [PATCH 2/2] pr49089: enable avx256 splitting unaligned load/store only when beneficial

	* config/i386/i386.c (x86_avx256_split_unaligned_load): New definition.
	  (x86_avx256_split_unaligned_store): New definition.
	  (ix86_option_override_internal): Enable avx256 unaligned load(store)
	  splitting only when x86_avx256_split_unaligned_load(store) is set
	  for the selected tuning.
---
 gcc/config/i386/i386.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7b266b9..3bc0b53 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2121,6 +2121,12 @@  static const unsigned int x86_arch_always_fancy_math_387
   = m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4
     | m_NOCONA | m_CORE2I7 | m_GENERIC;
 
+static const unsigned int x86_avx256_split_unaligned_load
+  = m_COREI7 | m_GENERIC;
+
+static const unsigned int x86_avx256_split_unaligned_store
+  = m_COREI7 | m_BDVER1 | m_GENERIC;
+
 /* In case the average insn count for single function invocation is
    lower than this constant, emit fast (but longer) prologue and
    epilogue code.  */
@@ -4194,9 +4200,11 @@  ix86_option_override_internal (bool main_args_p)
 	  if (flag_expensive_optimizations
 	      && !(target_flags_explicit & MASK_VZEROUPPER))
 	    target_flags |= MASK_VZEROUPPER;
-	  if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
+	  if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
+	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
-	  if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
+	  if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
+	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
 	}
     }
-- 
1.7.0.4
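
For readers who want to see the effect of the two new masks in isolation, here is a minimal standalone sketch of the defaulting logic in the second hunk. It is not GCC code: the enum, the M() macro, the FLAG_* bits and apply_split_defaults() are hypothetical stand-ins for ix86_tune_mask, the MASK_AVX256_SPLIT_UNALIGNED_* bits, target_flags and target_flags_explicit. Only the contents of the two tuning masks are taken from the patch.

```c
/* Hypothetical tuning targets; GCC's real processor enum is different.  */
enum tune { TUNE_COREI7, TUNE_BDVER1, TUNE_GENERIC, TUNE_OTHER };

/* One bit per tuning target, in the spirit of GCC's m_* masks.  */
#define M(t) (1u << (t))

/* Same membership as the two masks added by the patch.  */
static const unsigned int split_unaligned_load_tunes
  = M (TUNE_COREI7) | M (TUNE_GENERIC);
static const unsigned int split_unaligned_store_tunes
  = M (TUNE_COREI7) | M (TUNE_BDVER1) | M (TUNE_GENERIC);

/* Hypothetical stand-ins for the MASK_AVX256_SPLIT_UNALIGNED_* bits.  */
#define FLAG_SPLIT_LOAD  (1u << 0)
#define FLAG_SPLIT_STORE (1u << 1)

/* Return FLAGS with the split defaults applied for tuning target T.
   EXPLICIT_FLAGS marks the options the user set on the command line;
   those are left exactly as they are in FLAGS.  */
static unsigned int
apply_split_defaults (enum tune t, unsigned int flags,
                      unsigned int explicit_flags)
{
  unsigned int tune_mask = M (t);

  /* Default to splitting loads only where the tuning asks for it.  */
  if ((split_unaligned_load_tunes & tune_mask)
      && !(explicit_flags & FLAG_SPLIT_LOAD))
    flags |= FLAG_SPLIT_LOAD;

  /* Likewise for stores.  */
  if ((split_unaligned_store_tunes & tune_mask)
      && !(explicit_flags & FLAG_SPLIT_STORE))
    flags |= FLAG_SPLIT_STORE;

  return flags;
}
```

Under these assumptions, apply_split_defaults (TUNE_BDVER1, 0, 0) turns on only the store split, while a tuning target that appears in neither mask gets no splitting by default. An option the user set explicitly is never overridden, in either direction.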