From patchwork Sat Dec 1 05:50:59 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xinliang David Li
X-Patchwork-Id: 203112
Date: Fri, 30 Nov 2012 21:50:59 -0800
Subject: [PATCH i386] Allow cltd/cqto etc on modern CPUs
From: Xinliang David Li
To: GCC Patches

Compiling the following code with -O2

    typedef unsigned long ulong;
    typedef __SIZE_TYPE__ size_t;

    long woo_i(long a, long b)
    {
      return a/b;
    }

GCC generates:

    .LFB0:
            .cfi_startproc
            movq    %rdi, %rdx
            movq    %rdi, %rax
            sarq    $63, %rdx
            idivq   %rsi
            ret

but both ICC and LLVM generate a smaller and faster version:

            movq    %rdi, %rax
            cqto
            idivq   %rsi
            ret

For reference, see http://www.agner.org/optimize/instruction_tables.pdf.
On Pentium, the latency of the sign-extension instruction is 3 cycles,
while on modern CPUs it is a single uOp with 1 cycle latency.

The proposed patch below fixes the problem. Note that for Atom, only
the CWD instruction is slow, with 5 cycle latency; the remaining sign
extension instructions are fast. The fix for Atom needs finer-grained
control and can be done separately.

Ok to install after testing?

2012-11-30  Xinliang David Li

	* config/i386/i386.c: Allow sign extend instructions (cltd etc)
	on modern CPUs.
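
As a sanity check on why the two sequences are interchangeable, here is
a minimal C sketch (not part of the patch). It models the value each
sequence leaves in %rdx before idivq. The helper names are made up for
illustration, and the first helper assumes that >> on a negative
int64_t is an arithmetic shift, which is implementation-defined in C
but is what GCC does on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models "movq %rdi, %rdx; sarq $63, %rdx".
   Assumption: >> on a negative int64_t is an arithmetic shift. */
static int64_t high_half_sar(int64_t a)
{
  return a >> 63;
}

/* Models cqto: %rdx is filled with copies of the sign bit of %rax. */
static int64_t high_half_cqto(int64_t a)
{
  return a < 0 ? -1 : 0;
}

/* Compares both models on a few boundary values. */
static void check_high_halves(void)
{
  const int64_t samples[] = { 0, 1, -1, 42, -42, INT64_MAX, INT64_MIN };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert(high_half_sar(samples[i]) == high_half_cqto(samples[i]));
}
```

Both helpers yield 0 for non-negative inputs and -1 for negative ones,
so the dividend pair rdx:rax handed to idivq is the same either way.
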
Thanks,

David

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 193861)
+++ config/i386/i386.c	(working copy)
@@ -1822,7 +1822,7 @@ static unsigned int initial_ix86_tune_fe
   m_K6,
 
   /* X86_TUNE_USE_CLTD */
-  ~(m_PENT | m_CORE2I7 | m_ATOM | m_K6 | m_GENERIC),
+  ~(m_PENT | m_ATOM | m_K6),
 
   /* X86_TUNE_USE_XCHGB: Use xchgb %rh,%rl instead of rolw/rorw $8,rx.  */
   m_PENT4,
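
For readers not familiar with the tune table, the effect of the mask
change can be sketched as below. This is an illustration only: it
assumes each m_* value is a single-bit processor mask and that a
feature is enabled when the bit of the processor being tuned for is set
in the feature's entry. The enum, the M() macro, the bit positions and
tune_enabled() are hypothetical stand-ins, not the definitions in
i386.c.

```c
#include <stdbool.h>

/* Hypothetical processor numbering; the real values live in GCC. */
enum cpu { CPU_PENT, CPU_K6, CPU_ATOM, CPU_CORE2I7, CPU_GENERIC };

/* Hypothetical one-bit-per-processor mask, standing in for m_*. */
#define M(cpu) (1u << (cpu))

/* X86_TUNE_USE_CLTD entry before the patch. */
static const unsigned old_use_cltd =
  ~(M(CPU_PENT) | M(CPU_CORE2I7) | M(CPU_ATOM) | M(CPU_K6) | M(CPU_GENERIC));

/* X86_TUNE_USE_CLTD entry after the patch. */
static const unsigned new_use_cltd =
  ~(M(CPU_PENT) | M(CPU_ATOM) | M(CPU_K6));

/* A feature is on when the tuned processor's bit is set in its mask. */
static bool tune_enabled(unsigned feature_mask, enum cpu tune)
{
  return (feature_mask & M(tune)) != 0;
}
```

Under these assumptions, tune_enabled(old_use_cltd, CPU_CORE2I7) is
false and tune_enabled(new_use_cltd, CPU_CORE2I7) is true, and likewise
for CPU_GENERIC. Pentium, Atom and K6 stay disabled in both versions.
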