From patchwork Fri Feb 11 00:20:41 2011
X-Patchwork-Submitter: "Fang, Changpeng"
X-Patchwork-Id: 82705
From: "Fang, Changpeng"
To: "gcc-patches@gcc.gnu.org"
CC: "hjl.tools@gmail.com", "hubicka@ucw.cz", "rguenther@suse.de", "rth@redhat.com"
Date: Thu, 10 Feb 2011 18:20:41 -0600
Subject: [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1

Hi,

Attached is the patch to force gcc to generate 128-bit AVX instructions for
bdver1. We found that on the current Bulldozer processors, AVX128 performs
better than AVX256. For example, AVX128 is 3% faster than AVX256 on CFP2006,
and 2-3% faster than AVX256 on polyhedron. As a result, we prefer that
gcc 4.6 generate only 128-bit AVX instructions for bdver1.

The patch passed bootstrapping on x86_64-unknown-linux-gnu with
"-O3 -g -march=bdver1" and the necessary correctness and performance tests.

Is it OK to commit to trunk?

Thanks,

Changpeng

From b2587889e4c8016f8bc4dde53fa0d59c1a9074da Mon Sep 17 00:00:00 2001
From: Changpeng Fang
Date: Thu, 10 Feb 2011 16:11:55 -0800
Subject: [PATCH] Generate 128-bit AVX instructions by default for bdver1

	* config/i386/i386.h (enum ix86_tune_indices): Introduce
	X86_PREFER_AVX128 feature entry.
	(ix86_tune_features): Define TARGET_PREFER_AVX128.
	* config/i386/i386.c (initial_ix86_tune_features): Set
	X86_PREFER_AVX128 for bdver1.
	(ix86_preferred_simd_mode): Set the appropriate modes when
	X86_PREFER_AVX128 is set (for bdver1).
---
 gcc/config/i386/i386.c |    7 +++++--
 gcc/config/i386/i386.h |    3 +++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 12c7062..5c8346e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2082,6 +2082,9 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
   /* X86_TUNE_VECTORIZE_DOUBLE: Enable double precision vector
      instructions.  */
   ~m_ATOM,
+
+  /* X86_PREFER_AVX128: Generate AVX 128 instead of AVX 256.  */
+  m_BDVER1,
 };
 
 /* Feature tests against the various architecture variations.  */
@@ -34698,9 +34701,9 @@ ix86_preferred_simd_mode (enum machine_mode mode)
   switch (mode)
     {
     case SFmode:
-      return TARGET_AVX ? V8SFmode : V4SFmode;
+      return TARGET_AVX ? (TARGET_PREFER_AVX128 ? V4SFmode : V8SFmode) : V4SFmode;
     case DFmode:
-      return TARGET_AVX ? V4DFmode : V2DFmode;
+      return TARGET_AVX ? (TARGET_PREFER_AVX128 ? V2DFmode : V4DFmode) : V2DFmode;
     case DImode:
       return V2DImode;
     case SImode:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index f14a95d..b84e6ed 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -322,6 +322,7 @@ enum ix86_tune_indices {
   X86_TUNE_FUSE_CMP_AND_BRANCH,
   X86_TUNE_OPT_AGU,
   X86_TUNE_VECTORIZE_DOUBLE,
+  X86_PREFER_AVX128,
 
   X86_TUNE_LAST
 };
@@ -418,6 +419,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_OPT_AGU		ix86_tune_features[X86_TUNE_OPT_AGU]
 #define TARGET_VECTORIZE_DOUBLE \
 	ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE]
+#define TARGET_PREFER_AVX128 \
+	ix86_tune_features[X86_PREFER_AVX128]
 
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices {
-- 
1.6.3.3
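
For reviewers who want to check the mode selection outside of a GCC build, here is a minimal standalone sketch of the logic in the patch. It is not GCC code: every sk_* / SK_* name is hypothetical, the CPU list is cut down to three entries, and the per-CPU bitmask table only imitates the shape of initial_ix86_tune_features. It assumes a tuning knob counts as enabled when the bit for the selected CPU is set in that knob's mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down CPU list; the real m_* masks live in i386.c.  */
enum sk_cpu { SK_CPU_GENERIC, SK_CPU_ATOM, SK_CPU_BDVER1, SK_CPU_LAST };
#define SK_M(cpu) (1u << (cpu))

/* Hypothetical tuning knobs, mirroring the two entries touched above.  */
enum sk_tune { SK_TUNE_VECTORIZE_DOUBLE, SK_PREFER_AVX128, SK_TUNE_LAST };

/* One bitmask per knob: bit N set means the knob is on for CPU N.  */
const unsigned int sk_initial_tune[SK_TUNE_LAST] = {
  ~SK_M (SK_CPU_ATOM),    /* SK_TUNE_VECTORIZE_DOUBLE: all but Atom.  */
  SK_M (SK_CPU_BDVER1),   /* SK_PREFER_AVX128: bdver1 only.  */
};

/* Scalar element types and the vector modes they can map to.  */
enum sk_scalar { SK_SF, SK_DF };
enum sk_vector { SK_V4SF, SK_V8SF, SK_V2DF, SK_V4DF };

/* Return true if KNOB is enabled when tuning for CPU.  */
bool
sk_tune_enabled (enum sk_tune knob, enum sk_cpu cpu)
{
  assert (knob < SK_TUNE_LAST && cpu < SK_CPU_LAST);
  return (sk_initial_tune[knob] & SK_M (cpu)) != 0;
}

/* Same decision as the nested conditionals in the patch: use 256-bit
   modes only if AVX is available and the CPU does not prefer AVX128.  */
enum sk_vector
sk_preferred_simd_mode (enum sk_scalar mode, bool have_avx, enum sk_cpu cpu)
{
  bool wide = have_avx && !sk_tune_enabled (SK_PREFER_AVX128, cpu);

  if (mode == SK_SF)
    return wide ? SK_V8SF : SK_V4SF;
  return wide ? SK_V4DF : SK_V2DF;
}
```

With AVX available, sk_preferred_simd_mode (SK_SF, true, SK_CPU_BDVER1) gives SK_V4SF, while the same call with SK_CPU_GENERIC gives SK_V8SF. Without AVX the 128-bit mode is chosen regardless of the knob, which matches the unchanged fallback arm in the patch.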