Message ID | 20131118161757.GF11338@kam.mff.cuni.cz |
---|---|
State | New |
On Mon, 18 Nov 2013, Jan Hubicka wrote:
> there were many changes in this area. The following are ones I can think
> of. Please feel free to suggest more changes. We probably should mention
> Teresa's splitting work once it is complete and new micro-architectures
> targeted by x86 backend.

Yes, those definitely are good additional changes. Thanks for taking
the time to write these up (apart from the time to actually hack much
of it). :-)

> Index: changes.html
> ===================================================================
> +      <li>Type merging was rewritten. New implementation is significantly faster
> +          and use less memory.

"The new implementation"
"uses less memory"

> +      <li>Better partitioning algorithm resulting in less streaming during
> +          link-time.</li>

"link time"

> +      <li>Early removal of virtual methods reduce size of object files and
> +          improve link-time memory usage and compile time.</li>

"reduces the size of object files and improves"

> +      <li>Functions are no longer pointlessly renamed.</li>

Readers may struggle a bit with this. What does it refer to?

> +      <li>Function bodies are now loaded on-demand and released early improving
> +          overall memory usage at link-time.</li>

"link time"

> +      </ul>
> +      Memory usage of Firefox build with debug enabled was reduced from 15GB to
> +      3.5GB. Link time from 1700 seconds to 350 seconds.

"Memory usage building Firefox"

Perhaps specify the version number?

If you want to write the second sentence without a verb, I suggest
"; link time..."

> +      <li>New type inheritance analysis module improving devirtualization.
> +          Devirtualization now take into account anonymous name-spaces and the
> +          C++11 <code>final</code> keyword.</li>

"takes into account"

> +      <li>Calls that was speculatively made direct are turned back to indirect
> +          when doing so does not bring any noticeable benefits.</li>

"was" -> "were"

Also this sentence is a bit confusing. It can be read that turning
back the calls does not bring noticeable benefits.

> +      <li>Local aliases are introduced for symbols that are known to be
> +          semantically equivalent across shared libraries improving dynamic
> +          linking times.</li>

", improving"

> +      <li>New time profiling determine typical order in which functions are executed.</li>

"determines", and can you break the long line?

> +      <li>New function reordering pass (controlled by
> +          <code>-freorder-functions</code>) significantly reduces
> +          startup time of large applications. Until binutils support is
> +          completed, it is effective only with link time optimization.</li>

"A new function reordering..."
"link-time optimization"

> +    <li>Better inlining of <code>memcpy</code> and <code>memset</code>
> +        that is avare of value ranges and produce shorter alignment prologues.

"aware"
"produces"

> +        for portions of program optimized for size.</li>

"of programs"

Cheers,
Gerald
> > +      <li>Functions are no longer pointlessly renamed.</li>
>
> Readers may struggle a bit with this. What does it refer to?

We previously renamed every static function foo into foo.1234 (just as a
precaution, because another compilation unit may also have a function foo).
This confuses many things, so now we rename only when we see a conflict.

I am attaching the changes I committed. I dropped this item from the news
changes.

In the meantime we merged in the change enabling slim LTO files by default;
what about:

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.41
diff -c -p -r1.41 changes.html
*** changes.html    28 Nov 2013 15:05:51 -0000    1.41
--- changes.html    28 Nov 2013 16:53:37 -0000
***************
*** 15,20 ****
--- 15,25 ----
  <h2>Caveats</h2>
  <ul>
+   <li>Because <code>-fno-fat-lto-objects</code> is now by default,
+       <code>gcc-ar</code> and <code>gcc-nm</code> wrappers needs
+       to be used to handle objects compiled with <code>-flto</code>.
+       Additionally the resulting binary needs to be linked with
+       <code>-flto</code> (and appropriate optimization flags).</li>
    <li><p>Support for a number of older systems and recently
      unmaintained or untested target ports of GCC has been declared
      obsolete in GCC 4.9. Unless there is activity to revive them, the
***************
*** 45,50 ****
--- 50,61 ----
      </li>
      <li>Link-time optimization (LTO) improvements:
      <ul>
+       <li>Slim LTO objects are now used by default. This means that with
+       <code>-flto</code> GCC will no longer produce non-LTO optimized binary
+       in addition to storing object representation in the intermediate
+       language. Consequently <code>-flto</code> no longer causes everything
+       to be optimized twice (once at compile time and again during link time).
+       This feature can be controlled by <code>-ffat-lto-objects</code>.</li>
        <li>Type merging was rewritten. The new implementation is significantly faster
        and uses less memory.
        <li>Better partitioning algorithm resulting in less streaming during

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.40
diff -r1.40 changes.html
40,41d39
<   </ul>
<   <ul>
47a46,85
>     <li>Link-time optimization (LTO) improvements:
>     <ul>
>       <li>Type merging was rewritten. The new implementation is significantly faster
>       and uses less memory.
>       <li>Better partitioning algorithm resulting in less streaming during
>       link time.</li>
>       <li>Early removal of virtual methods reduces the size of object files and
>       improves link-time memory usage and compile time.</li>
>       <li>Function bodies are now loaded on-demand and released early improving
>       overall memory usage at link time.</li>
>       <li>C++ hidden keyed methods can now be optimized out.</li>
>     </ul>
>     Memory usage building Firefox with debug enabled was reduced from 15GB to
>     3.5GB; link time from 1700 seconds to 350 seconds.
>     </li>
>     <li>Inter-procedural optimization improvements:
>     <ul>
>       <li>New type inheritance analysis module improving devirtualization.
>       Devirtualization now takes into account anonymous name-spaces and the
>       C++11 <code>final</code> keyword.</li>
>       <li>New speculative devirtualization pass (controlled by
>       <code>-fdevirtualize-speculatively</code>.</li>
>       <li>Calls that were speculatively made direct are turned back to indirect
>       where direct call is not cheaper.</li>
>       <li>Local aliases are introduced for symbols that are known to be
>       semantically equivalent across shared libraries improving dynamic
>       linking times.</li>
>     </ul>
>     <li>Feedback directed optimization improvements:
>     <ul>
>       <li>Profiling of programs using C++ inline functions is now more reliable.</li>
>       <li>New time profiling determines typical order in which functions are
>       executed.</li>
>       <li>A new function reordering pass (controlled by
>       <code>-freorder-functions</code>) significantly reduces
>       startup time of large applications. Until binutils support is
>       completed, it is effective only with link-time optimization.</li>
>       <li>Feedback driven indirect call removal and devirtualization now handle
>       cross-module calls when link-time optimization is enabled.</li>
>     </ul></li>
337c375
<     <li> GCC now supports the new Intel microarchitecture named Silvermont
---
>     <li>GCC now supports the new Intel microarchitecture named Silvermont
339a378,388
>     <li><code>-march=generic</code> has been retuned for better support of
>       Intel core and AMD Bulldozer architectures. Performance of AMD K7, K8,
>       Intel Pentium-M, and Pentium4 based CPUs is no longer considered important
>       for generic.
>     </li>
>     <li>Better inlining of <code>memcpy</code> and <code>memset</code>
>       that is aware of value ranges and produces shorter alignment prologues.
>     </li>
>     <li><code>-mno-accumulate-outgoing-args</code> is now honored when unwind
>       information is output. Argument accumulation is also now turned off
>       for portions of programs optimized for size.</li>
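
For readers unfamiliar with the devirtualization items in the patch above, the following is a minimal C++11 sketch of the two situations they mention: a class marked final, and a class in an anonymous namespace. All type and function names (Shape, Square, Triangle, count_square and so on) are hypothetical and chosen only for illustration. Whether a particular GCC release actually turns these calls into direct calls depends on the optimization level and options in use, so this shows the kind of code the analysis can reason about, not a guaranteed outcome.

```cpp
#include <cassert>

// All type and function names here are hypothetical, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// `final` promises that no class derives from Square, so a call made
// through a Square reference can only reach Square::sides().
struct Square final : Shape {
    int sides() const override { return 4; }
};

namespace {
// A type in an anonymous namespace cannot be named from another translation
// unit, so every class derived from it would have to be defined in this file.
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Static type is Triangle, which has no derived classes in this file.
int count_triangle(const Triangle& t) { return t.sides(); }
}  // namespace

// Static type is the final class Square: a compiler may call
// Square::sides() directly instead of going through the vtable.
int count_square(const Square& s) { return s.sides(); }

// Static type is the open base class: without further information this
// remains an ordinary virtual (indirect) call.
int count_any(const Shape& s) { return s.sides(); }

int total_sides() {
    Square sq;
    Triangle tr;
    // Devirtualization must not change behavior: direct and virtual
    // dispatch have to agree.
    assert(count_any(sq) == count_square(sq));
    assert(count_any(tr) == count_triangle(tr));
    return count_square(sq) + count_triangle(tr);
}
```

Calling total_sides() exercises all three call shapes; only count_any has to stay indirect when nothing else is known about its argument.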
Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.36
diff -u -r1.36 changes.html
--- changes.html    15 Nov 2013 15:40:00 -0000    1.36
+++ changes.html    18 Nov 2013 16:15:32 -0000
@@ -37,14 +37,52 @@
   <ul>
     <li>AddressSanitizer, a fast memory error detector, is now available on ARM.
     </li>
-  </ul>
-  <ul>
     <li>UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector,
     has been added and can be enabled via <code>-fsanitize=undefined</code>.
     Various computations will be instrumented to detect undefined behavior
     at runtime. UndefinedBehaviorSanitizer is currently available for the C
     and C++ languages.
     </li>
+    <li>Link-time optimization (LTO) improvements:
+    <ul>
+      <li>Type merging was rewritten. New implementation is significantly faster
+      and use less memory.
+      <li>Better partitioning algorithm resulting in less streaming during
+      link-time.</li>
+      <li>Early removal of virtual methods reduce size of object files and
+      improve link-time memory usage and compile time.</li>
+      <li>Functions are no longer pointlessly renamed.</li>
+      <li>Function bodies are now loaded on-demand and released early improving
+      overall memory usage at link-time.</li>
+      <li>C++ hidden keyed methods can now be optimized out.</li>
+    </ul>
+    Memory usage of Firefox build with debug enabled was reduced from 15GB to
+    3.5GB. Link time from 1700 seconds to 350 seconds.
+    </li>
+    <li>Inter-procedural optimization improvements:
+    <ul>
+      <li>New type inheritance analysis module improving devirtualization.
+      Devirtualization now take into account anonymous name-spaces and the
+      C++11 <code>final</code> keyword.</li>
+      <li>New speculative devirtualization pass (controlled by
+      <code>-fdevirtualize-speculatively</code>.</li>
+      <li>Calls that was speculatively made direct are turned back to indirect
+      when doing so does not bring any noticeable benefits.</li>
+      <li>Local aliases are introduced for symbols that are known to be
+      semantically equivalent across shared libraries improving dynamic
+      linking times.</li>
+    </ul>
+    <li>Feedback directed optimization improvements:
+    <ul>
+      <li>Profiling of programs using C++ inline functions is now more reliable.</li>
+      <li>New time profiling determine typical order in which functions are executed.</li>
+      <li>New function reordering pass (controlled by
+      <code>-freorder-functions</code>) significantly reduces
+      startup time of large applications. Until binutils support is
+      completed, it is effective only with link time optimization.</li>
+      <li>Feedback driven indirect call removal and devirtualization now handle
+      cross-module calls when link-time optimization is enabled.</li>
+    </ul></li>
   </ul>
 
 <h2 id="languages">New Languages and Language specific improvements</h2>
@@ -325,9 +363,20 @@
       href="http://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html"
       >Function Multiversioning</a>.
     </li>
-    <li> GCC now supports the new Intel microarchitecture named Silvermont
+    <li>GCC now supports the new Intel microarchitecture named Silvermont
       through <code>-march=slm</code>.
     </li>
+    <li><code>-march=generic</code> has been retuned for better support of
+      Intel core and AMD Bulldozer architectures. Performance of AMD K7, K8,
+      Intel Pentium-M, and Pentium4 based CPUs is no longer considered important
+      for generic.
+    </li>
+    <li>Better inlining of <code>memcpy</code> and <code>memset</code>
+      that is avare of value ranges and produce shorter alignment prologues.
+    </li>
+    <li><code>-mno-accumulate-outgoing-args</code> is now honored when unwind
+      information is output. Argument accumulation is also now turned off
+      for portions of program optimized for size.</li>
   </ul>
 <h3 id="nds32">NDS32</h3>
   <ul>
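
As an illustration of the memcpy/memset item in the x86 part of the patch above, here is a small C++ sketch of a copy whose length has a known value range. The helper name copy_prefix and the 16-byte limit are assumptions made for this example only. Whether GCC expands such a call inline, and what code it produces, depends on the target, the compiler version and the options used; the sketch only shows a call site where range information about the length is available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most 16 bytes from src to dst and returns
// the number of bytes copied. After the clamp, the length passed to memcpy
// is known to lie in the range [0, 16], which is the kind of value-range
// information an inline expansion of memcpy could take advantage of.
std::size_t copy_prefix(unsigned char* dst, const unsigned char* src,
                        std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    if (n > 16) {
        n = 16;  // clamp: the copy length never exceeds 16 bytes
    }
    std::memcpy(dst, src, n);
    return n;
}
```

A caller that asks for 20 bytes gets 16 copied and bytes beyond the sixteenth left untouched; a caller that asks for 3 gets exactly 3.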