From patchwork Thu Apr 5 00:07:00 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Teresa Johnson
X-Patchwork-Id: 150828
To: reply@codereview.appspotmail.com, gcc-patches@gcc.gnu.org
Subject: [Patch, i386] Avoid LCP stalls (issue5975045)
Message-Id: <20120405000700.5020E61583@tjsboxrox.mtv.corp.google.com>
Date: Wed, 4 Apr 2012 17:07:00 -0700 (PDT)
From: tejohnson@google.com (Teresa Johnson)

New patch to avoid LCP stalls based on feedback from earlier patch. I modified
H.J.'s old patch to perform the peephole2 to split immediate moves to HImode
memory. This is now enabled for Core2, Corei7 and Generic.

I verified that this enables the splitting to occur in the case that originally
motivated the optimization. If we subsequently find situations where LCP stalls
are hurting performance but an extra register is required to perform the
splitting, then we can revisit whether this should be performed earlier.

I also measured SPEC 2000/2006 performance using Generic64 on an AMD Opteron
and the results were neutral.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk?

Thanks,
Teresa

2012-04-04  Teresa Johnson

	* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_LCP_STALL.
	* config/i386/i386.md (move immediate to memory peephole2):
	Add cases for HImode move when LCP stall avoidance is needed.
	* config/i386/i386.c (initial_ix86_tune_features): Initialize
	X86_TUNE_LCP_STALL entry.

---
This patch is available for review at http://codereview.appspot.com/5975045

Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h	(revision 185920)
+++ config/i386/i386.h	(working copy)
@@ -262,6 +262,7 @@ enum ix86_tune_indices {
   X86_TUNE_MOVX,
   X86_TUNE_PARTIAL_REG_STALL,
   X86_TUNE_PARTIAL_FLAG_REG_STALL,
+  X86_TUNE_LCP_STALL,
   X86_TUNE_USE_HIMODE_FIOP,
   X86_TUNE_USE_SIMODE_FIOP,
   X86_TUNE_USE_MOV0,
@@ -340,6 +341,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L
 #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL]
 #define TARGET_PARTIAL_FLAG_REG_STALL \
 	ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL]
+#define TARGET_LCP_STALL \
+	ix86_tune_features[X86_TUNE_LCP_STALL]
 #define TARGET_USE_HIMODE_FIOP	ix86_tune_features[X86_TUNE_USE_HIMODE_FIOP]
 #define TARGET_USE_SIMODE_FIOP	ix86_tune_features[X86_TUNE_USE_SIMODE_FIOP]
 #define TARGET_USE_MOV0		ix86_tune_features[X86_TUNE_USE_MOV0]
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 185920)
+++ config/i386/i386.md	(working copy)
@@ -16977,9 +16977,11 @@
    (set (match_operand:SWI124 0 "memory_operand")
         (const_int 0))]
   "optimize_insn_for_speed_p ()
-   && !TARGET_USE_MOV0
-   && TARGET_SPLIT_LONG_MOVES
-   && get_attr_length (insn) >= ix86_cur_cost ()->large_insn
+   && ((TARGET_LCP_STALL
+       && GET_MODE (operands[0]) == HImode)
+       || (!TARGET_USE_MOV0
+          && TARGET_SPLIT_LONG_MOVES
+          && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))
    && peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 2) (const_int 0))
 	      (clobber (reg:CC FLAGS_REG))])
@@ -16991,8 +16993,10 @@
    (set (match_operand:SWI124 0 "memory_operand")
         (match_operand:SWI124 1 "immediate_operand"))]
   "optimize_insn_for_speed_p ()
-   && TARGET_SPLIT_LONG_MOVES
-   && get_attr_length (insn) >= ix86_cur_cost ()->large_insn"
+   && ((TARGET_LCP_STALL
+       && GET_MODE (operands[0]) == HImode)
+       || (TARGET_SPLIT_LONG_MOVES
+          && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))"
   [(set (match_dup 2) (match_dup 1))
    (set (match_dup 0) (match_dup 2))])
 
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 185920)
+++ config/i386/i386.c	(working copy)
@@ -1964,6 +1964,10 @@ static unsigned int initial_ix86_tune_features[X86
   /* X86_TUNE_PARTIAL_FLAG_REG_STALL */
   m_CORE2I7 | m_GENERIC,
 
+  /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall
+   * on 16-bit immediate moves into memory on Core2 and Corei7.  */
+  m_CORE2I7 | m_GENERIC,
+
   /* X86_TUNE_USE_HIMODE_FIOP */
   m_386 | m_486 | m_K6_GEODE,
 
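
For reviewers who want a concrete picture of the stores this patch targets, here is a minimal C sketch. The struct and function names are hypothetical and are not taken from the patch or from the code that motivated it. The assembly in the comments is illustrative only; actual output depends on the compiler version, the tuning flags, and whether a scratch register (and, for the zero case, a dead flags register) is available at peephole2 time.

```c
#include <stdint.h>

/* Hypothetical record with two 16-bit (HImode) fields. */
struct rec {
    int16_t tag;
    int16_t count;
};

/*
 * Store of a nonzero 16-bit immediate to memory.
 *
 * Unsplit form (assumed typical x86-64 output):
 *     movw  $1234, (%rdi)        # bytes: 66 c7 07 d2 04
 * The 0x66 operand-size prefix changes the immediate from 4 bytes to 2,
 * which is the length-changing-prefix case the patch tries to avoid.
 *
 * Split form that the second peephole2 aims for:
 *     movl  $1234, %eax          # 32-bit immediate, no 0x66 prefix
 *     movw  %ax, (%rdi)          # 0x66 prefix, but no immediate
 */
void set_tag(struct rec *r)
{
    r->tag = 1234;
}

/*
 * Store of zero to a 16-bit field.
 *
 * Unsplit form (assumed):
 *     movw  $0, 2(%rdi)          # bytes: 66 c7 47 02 00 00
 *
 * Split form that the first peephole2 aims for; it clobbers the flags,
 * which is why the pattern checks that FLAGS_REG is dead:
 *     xorl  %eax, %eax
 *     movw  %ax, 2(%rdi)
 */
void clear_count(struct rec *r)
{
    r->count = 0;
}
```

Compiling a file like this with and without the patch applied, with optimization on and tuning set to one of the enabled targets, and comparing the two assembly listings is one way to check whether the split fires.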