From patchwork Mon Jun 20 18:03:27 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Fang, Changpeng"
X-Patchwork-Id: 101174
From: "Fang, Changpeng"
To: "H.J. Lu", "gcc-patches@gcc.gnu.org"
CC: "hubicka@ucw.cz", "ubizjak@gmail.com", "rguenther@suse.de"
Date: Mon, 20 Jun 2011 13:03:27 -0500
Subject: RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

Hi,

I modified the patch as H.J. suggested (patch attached).

Is it OK to commit to trunk now?

Thanks,

Changpeng

From 50310fc367348b406fc88d54c3ab54d1a304ad52 Mon Sep 17 00:00:00 2001
From: Changpeng Fang
Date: Mon, 13 Jun 2011 13:13:32 -0700
Subject: [PATCH 2/2] pr49089: enable avx256 splitting unaligned load/store only when beneficial

	* config/i386/i386.c (avx256_split_unaligned_load): New definition.
	(avx256_split_unaligned_store): New definition.
	(ix86_option_override_internal): Enable avx256 unaligned load(store)
	splitting only when avx256_split_unaligned_load(store) is set.
---
 gcc/config/i386/i386.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7b266b9..3bc0b53 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2121,6 +2121,12 @@ static const unsigned int x86_arch_always_fancy_math_387
   = m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4
     | m_NOCONA | m_CORE2I7 | m_GENERIC;
 
+static const unsigned int x86_avx256_split_unaligned_load
+  = m_COREI7 | m_GENERIC;
+
+static const unsigned int x86_avx256_split_unaligned_store
+  = m_COREI7 | m_BDVER1 | m_GENERIC;
+
 /* In case the average insn count for single function invocation is
    lower than this constant, emit fast (but longer) prologue and
    epilogue code.  */
@@ -4194,9 +4200,11 @@ ix86_option_override_internal (bool main_args_p)
           if (flag_expensive_optimizations
               && !(target_flags_explicit & MASK_VZEROUPPER))
             target_flags |= MASK_VZEROUPPER;
-          if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
+          if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
+              && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
             target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
-          if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
+          if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
+              && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
             target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
         }
     }
-- 
1.7.0.4
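
For readers who want to try the gating logic of the second hunk outside of GCC, here is a minimal standalone sketch. It is not GCC code: the bit values, the upper-case M_* and SPLIT_* names, and the function apply_split_defaults are all made up for illustration and only stand in for the m_* tuning masks and MASK_AVX256_SPLIT_UNALIGNED_* flags in i386.c. It assumes, as in the patch, that a default is turned on only when the selected tuning is in the per-feature mask and the user did not set the option explicitly.

```c
/* Hypothetical tuning bits; the real m_* values in i386.c differ.  */
enum
{
  M_COREI7 = 1 << 0,
  M_BDVER1 = 1 << 1,
  M_GENERIC = 1 << 2
};

/* Hypothetical option bits standing in for
   MASK_AVX256_SPLIT_UNALIGNED_LOAD and MASK_AVX256_SPLIT_UNALIGNED_STORE.  */
enum
{
  SPLIT_LOAD = 1 << 0,
  SPLIT_STORE = 1 << 1
};

/* Tunings for which splitting is assumed beneficial, mirroring the two
   masks added by the patch.  */
static const unsigned int split_load_tunings = M_COREI7 | M_GENERIC;
static const unsigned int split_store_tunings
  = M_COREI7 | M_BDVER1 | M_GENERIC;

/* Return FLAGS with the split defaults applied for TUNE_MASK.  A bit
   present in EXPLICIT_FLAGS was chosen by the user and is left alone.  */
static unsigned int
apply_split_defaults (unsigned int flags, unsigned int explicit_flags,
                      unsigned int tune_mask)
{
  if ((split_load_tunings & tune_mask)
      && !(explicit_flags & SPLIT_LOAD))
    flags |= SPLIT_LOAD;
  if ((split_store_tunings & tune_mask)
      && !(explicit_flags & SPLIT_STORE))
    flags |= SPLIT_STORE;
  return flags;
}
```

With these made-up values, apply_split_defaults (0, 0, M_BDVER1) returns SPLIT_STORE only, while apply_split_defaults (0, SPLIT_STORE, M_BDVER1) returns 0 because the store option is treated as explicitly chosen by the user.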