From: Gerald Pfeifer
Date: Mon, 22 Nov 2010 00:11:32 +0100 (CET)
To: Jan Hubicka
cc: Ralf Wildenhues, Diego Novillo, gcc-patches@gcc.gnu.org
Subject: Re: [wwwdocs] IPA and LTO updates

On Sun, 21 Nov 2010, Jan Hubicka wrote:
> here is updated version.
Thanks, Jan. I threw in a couple of linguistic changes and markup fixes ( instead of and some real one) and committed the thusly updated version. Please find the patch as committed below. Good stuff!

Gerald

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/changes.html,v
retrieving revision 1.63
retrieving revision 1.66
diff -u -3 -p -r1.63 -r1.66
--- changes.html 20 Nov 2010 20:00:13 -0000 1.63
+++ changes.html 21 Nov 2010 23:08:45 -0000 1.66
@@ -51,21 +51,57 @@

General Optimizer Improvements

-  • A new general optimization level, -Ofast has been
+  • A new general optimization level, -Ofast has been
     introduced. It combines the existing optimization level -O3 with options that can affect standards compliance but result in better optimized code. For example -Ofast enables -ffast-math.
+  • Link-time optimization improvements:
+    • The Scalable Whole Program Optimizer (WHOPR) project has stabilized
+      to the point of being usable. It has become the default mode when
+      using the LTO optimization model. Link time optimization can
+      now split itself into multiple parallel compilations. Parallelism
+      is controlled with -flto=n (where n specifies the number of
+      compilations to execute in parallel). GCC can also cooperate with
+      a GNU make job server by specifying the -flto=jobserver option and
+      adding + to the beginning of the Makefile rule executing the linker.
+    • A large number of bugs were fixed. GCC itself, Mozilla
+      Firefox and other large applications can be built with
+      LTO enabled.
+    • Resolution information from the linker plugin is used to drive
+      whole program assumptions. Use of the linker plugin results in
+      more aggressive optimization on binaries and on shared libraries
+      that use the hidden visibility attribute.
+    • Hidden symbols used from non-link time objects now have to be
+      explicitly annotated with externally_visible when
+      the linker plugin is not used.
+    • C++ inline functions and virtual tables are now privatized more
+      aggressively, leading to better inter-procedural optimization
+      and faster dynamic linking.
+    • Memory usage and intermediate language streaming performance
+      have been improved.
+    • Static constructors and destructors from individual units are
+      inlined into a single function.
+      This can significantly improve startup times of large C++
+      applications where static constructors are very common. For
+      example, static constructors are used when including the
+      iostream header.
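As a minimal sketch of the externally_visible item above, assuming a function that has hidden visibility but is still called from an object built without -flto and without the linker plugin: the name plugin_counter is hypothetical and only serves the illustration.

```c
/* Hidden visibility keeps the symbol out of the dynamic export table.
   externally_visible additionally tells whole-program analysis that code
   it cannot see (for example an object compiled without -flto) may still
   reference the symbol, so it must not be localized or removed.
   plugin_counter is a made-up name for this sketch. */
__attribute__((visibility("hidden"), externally_visible))
int plugin_counter(void)
{
    static int calls;   /* zero-initialized, counts invocations */
    return ++calls;
}
```

Whether the annotation is needed depends on the link setup; with the linker plugin in use, the resolution information described above should make it unnecessary.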
  • Interprocedural optimization improvements
     • The interprocedural framework was re-tuned for link time
-      optimization.
+      optimization. Several scalability issues were solved.
     • Improved auto-detection of const and pure functions. noreturn functions are now auto-detected as well.

       The -Wsuggest-attribute=[const|pure|noreturn] flag is available; it informs users when adding attributes to headers might improve code generation.

-    • Inlining heuristics were improved:
+    • A number of inlining heuristic improvements. In particular:
       • Partial inlining is now supported and enabled by default at -O2 and greater. The feature can be
@@ -79,12 +115,27 @@

       • Scalability for large compilation units was improved
-        significantly.
+        significantly.
      • Inlining of callbacks is now more aggressive.
       • Virtual methods are considered for inlining when the caller is inlined and devirtualization then becomes possible.
+      • Inlining when optimizing for size (either in cold
+        regions of a program or when compiling with
+        -Os) was improved to better handle C++
+        programs with larger abstraction penalty, leading
+        to smaller and faster code.
+    • The IPA reference optimization pass detecting global
+      variables used or modified by functions was strengthened
+      and sped up.
+    • Functions whose address was taken are now optimized out
+      when all references to them are dead.
+    • A new inter-procedural static profile estimation pass detects
+      functions that are executed once or unlikely to be executed.
+      Unlikely executed functions are optimized for size. Functions
+      executed once are optimized for size except for the inner
+      loops.
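The const, pure and noreturn attributes that the auto-detection and -Wsuggest-attribute items refer to can be sketched in C as follows. This is only an illustration under assumptions: the function names are hypothetical, and whether GCC would suggest or infer each attribute for a given function depends on the optimization level and on what it can see.

```c
#include <stddef.h>
#include <stdlib.h>

/* const: the result depends only on the argument values and no memory
   is read, so repeated calls with the same argument can be merged. */
__attribute__((const)) int square(int x)
{
    return x * x;
}

/* pure: no side effects, but the function may read memory, here the
   string passed in by the caller. */
__attribute__((pure)) size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* noreturn: control never comes back to the caller. */
__attribute__((noreturn)) void fatal(int status)
{
    exit(status);
}
```

Removing the attributes and compiling with, for example, -O2 -Wsuggest-attribute=pure is one way to see which annotations GCC proposes for such code.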
   • A new switch -fstack-usage has been added. It makes
@@ -124,6 +175,11 @@
     float is implicitly promoted to double. This is especially helpful for CPUs that handle the former in hardware, but emulate the latter in software.
+  • A new function attribute leaf was introduced.
+    This attribute allows better inter-procedural optimization across
+    calls to functions that return to the current unit only via returning
+    or exception handling. This is the case for most library functions
+    that have no callbacks.
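A rough C sketch of what the leaf attribute permits, assuming a library function that never calls back into the calling unit. The names lib_scale, cached_factor and use_lib are hypothetical, and the definition of lib_scale is included only so the example links on its own; in real use it would live in another translation unit, since the attribute is meant for external functions.

```c
/* Hypothetical library entry point, declared as it might appear in a
   header. leaf promises that a call returns to this unit only by
   returning (or via exception handling), never through a callback. */
__attribute__((leaf)) int lib_scale(int x);

/* File-local state: only code in this unit can modify it. */
static int cached_factor = 3;

int use_lib(int v)
{
    cached_factor = 3;
    int r = lib_scale(v);
    /* Since lib_scale cannot re-enter this unit, the compiler may assume
       cached_factor still holds 3 here instead of reloading it. */
    return r * cached_factor;
}

/* Stand-in definition so the sketch is self-contained. */
int lib_scale(int x)
{
    return 2 * x;
}
```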

C