[PR,49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

Submitted by Fang, Changpeng on June 20, 2011, 6:03 p.m.

Details

Message ID D4C76825A6780047854A11E93CDE84D005980DC701@SAUSEXMBP01.amd.com
State New

Commit Message

Fang, Changpeng June 20, 2011, 6:03 p.m.
Hi,

  I modified the patch as H.J. suggested (patch attached).

Is it OK to commit to trunk now?

Thanks,

Changpeng

Comments

Uros Bizjak June 20, 2011, 6:38 p.m.
On Mon, Jun 20, 2011 at 8:03 PM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:

>  I modified the patch as H.J. suggested (patch attached).
>
> Is it OK to commit to trunk now?

Yes, this is OK for trunk.

Thanks,
Uros.

Fang, Changpeng June 20, 2011, 10:07 p.m.
Thanks,
Patch has been committed to trunk as revision 175230.

Changpeng

Eric Botcazou June 29, 2011, 4:06 p.m.
> Thanks,

Note that there is no "i386" component in Bugzilla, only "target", so this
should have been PR target/49089.  As a result, there are no xrefs in the PR,
which is still open, by the way.  So please add the xrefs to the commits in
the PR manually and close it if you are done with it.

Patch

From 50310fc367348b406fc88d54c3ab54d1a304ad52 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Mon, 13 Jun 2011 13:13:32 -0700
Subject: [PATCH 2/2] pr49089: enable avx256 splitting unaligned load/store only when beneficial

	* config/i386/i386.c (avx256_split_unaligned_load): New definition.
	  (avx256_split_unaligned_store): New definition.
	  (ix86_option_override_internal): Enable avx256 unaligned load(store)
	  splitting only when avx256_split_unaligned_load(store) is set.
---
 gcc/config/i386/i386.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7b266b9..3bc0b53 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2121,6 +2121,12 @@  static const unsigned int x86_arch_always_fancy_math_387
   = m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4
     | m_NOCONA | m_CORE2I7 | m_GENERIC;
 
+static const unsigned int x86_avx256_split_unaligned_load
+  = m_COREI7 | m_GENERIC;
+
+static const unsigned int x86_avx256_split_unaligned_store
+  = m_COREI7 | m_BDVER1 | m_GENERIC;
+
 /* In case the average insn count for single function invocation is
    lower than this constant, emit fast (but longer) prologue and
    epilogue code.  */
@@ -4194,9 +4200,11 @@  ix86_option_override_internal (bool main_args_p)
 	  if (flag_expensive_optimizations
 	      && !(target_flags_explicit & MASK_VZEROUPPER))
 	    target_flags |= MASK_VZEROUPPER;
-	  if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
+	  if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
+	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
-	  if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
+	  if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
+	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
 	}
     }
-- 
1.7.0.4
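
For readers skimming the hunk in ix86_option_override_internal, the gating it introduces can be sketched as a small standalone C function. This is only an illustration under stated assumptions: the TUNE_* and FLAG_* constants and the function apply_split_defaults are hypothetical stand-ins, not the real m_* masks, MASK_* flags, or any function in i386.c. Only the shape of the two conditions and the membership of the two tuning sets are taken from the patch.

```c
/* Hypothetical tuning bits; the real m_* masks in i386.c have
   different values.  */
enum
{
  TUNE_COREI7 = 1 << 0,
  TUNE_BDVER1 = 1 << 1,
  TUNE_GENERIC = 1 << 2
};

/* Hypothetical stand-ins for MASK_AVX256_SPLIT_UNALIGNED_LOAD and
   MASK_AVX256_SPLIT_UNALIGNED_STORE.  */
enum
{
  FLAG_SPLIT_LOAD = 1 << 0,
  FLAG_SPLIT_STORE = 1 << 1
};

/* Tunings for which splitting is on by default, mirroring the two
   tables added by the patch.  */
static const unsigned int split_unaligned_load
  = TUNE_COREI7 | TUNE_GENERIC;
static const unsigned int split_unaligned_store
  = TUNE_COREI7 | TUNE_BDVER1 | TUNE_GENERIC;

/* Return FLAGS with the default split flags added.  A flag is turned
   on only if the selected tuning is in the corresponding table and the
   user did not set that flag explicitly on the command line.  */
unsigned int
apply_split_defaults (unsigned int flags, unsigned int flags_explicit,
                      unsigned int tune_mask)
{
  if ((split_unaligned_load & tune_mask)
      && !(flags_explicit & FLAG_SPLIT_LOAD))
    flags |= FLAG_SPLIT_LOAD;
  if ((split_unaligned_store & tune_mask)
      && !(flags_explicit & FLAG_SPLIT_STORE))
    flags |= FLAG_SPLIT_STORE;
  return flags;
}
```

In this sketch, calling apply_split_defaults (0, 0, TUNE_BDVER1) sets only the store flag, while passing TUNE_COREI7 or TUNE_GENERIC sets both. An explicitly given flag is never overridden, which matches the target_flags_explicit checks in the patch.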