diff mbox

patch to implement thread coloring in IRA

Message ID 52842E4D.5070308@redhat.com
State New
Headers show

Commit Message

Vladimir Makarov Nov. 14, 2013, 1:58 a.m. UTC
On 11/13/2013, 5:48 PM, Steven Bosscher wrote:
> On Wed, Nov 13, 2013 at 6:56 PM, Vladimir Makarov wrote:
>>    The following patch improves coloring.  The order of pushing allocnos on
>> the coloring stack from a bunch of colorable allocnos was always important
>> for generated code performance.  LRA has a mechanism of allocating pseudos
>> by threads.  Thread in LRA is a set of non-conflicting pseudos connected by
>> moves (or by future reload insns).  Allocating pseudos by threads in LRA
>> permits to improve code by increasing chance of removing the move insns.
>>
>>    So the same mechanism can be used for IRA.  The patch implements it.  The
>> difference is only that LRA forms thread statically before allocation
>> sub-pass.  That is because the basic allocation are already done in IRA.
>> The statically thread forming works well for IRA too.  But even better
>> results can be got by dynamically forming threads.  It means that we are
>> forming threads during allocation and includes only colorable allocnos.
>>
>>    The results of using threads in IRA is pretty good.  The average code size
>> (text segment) of SPEC2000 is improved (by >0.1% for x86 SPECFP2000 and >
>> 0.3% for x86-64 SPECFP2000). The biggest code performance improvement (> 1%)
>> is obtained on x86-64 SPECFP2000.  Performance tools report that additional
>> code takes only about 0.05% additionally executed insns.
>
> Nice!
>
> Can you please also update the leading comment in ira.c? It seems
> worth mentioning this approach under the "Coloring" bullet
> (ira.c:176).
Thanks.  I've added a comment.  It is not a big change but pretty 
important one (e.g. to get about the same improvement 3 years ago I 
needed 1K lines to implement coloring for irregular reg file 
architecture).  So it is worth to document it.
> (Not sure if that whole comment block is otherwise up to date??)
>
>
It is pretty upto date.  I don't see missed important details. Although 
it would be nice to describe new pre-RA optimizations shrink-wrapping, 
find-moveable, and decrease-live-ranges.
diff mbox

Patch

Index: ira.c
===================================================================
--- ira.c       (revision 204753)
+++ ira.c       (working copy)
@@ -192,7 +192,14 @@  along with GCC; see the file COPYING3.
           this point.  There is some freedom in the order of putting
           allocnos on the stack which can affect the final result of
           the allocation.  IRA uses some heuristics to improve the
-         order.
+         order.  The major one is to form *threads* from colorable
+         allocnos and push them on the stack by threads. Thread is a
+         set of non-conflicting colorable allocnos connected by
+         copies.  The thread contains allocnos from the colorable
+         bucket or colorable allocnos already pushed onto the coloring
+         stack.  Pushing thread allocnos one after another onto the
+         stack increases chances of removing copies when the allocnos
+         get the same hard reg.

          We also use a modification of Chaitin-Briggs algorithm which
           works for intersected register classes of allocnos.  To