WHOPR partitioning algorithm

Hi,
this patch adds a simple partitioning algorithm that is not source file driven.
It performs measurably better than 1-to-1 partitioning, saving about 1 minute
of Mozilla compilation (out of 5), reducing /tmp usage from 7GB to 1.6GB and
reducing the resulting executable size by about 3-5%.

The algorithm is simple: it takes cgraph nodes in a predefined order and inserts
them into the current partition; when the partition grows too large, it starts
a new partition.  To reduce the boundary between partitions, it can undo the
last few insertion decisions, backing up to the point where the ratio of
inter-partition references to in-partition references is minimal.
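
As a rough illustration of the loop described above, here is a minimal,
self-contained sketch.  It is not the code from the patch: struct edge,
count_refs, balanced_partition, the half-of-target threshold for cut candidates
and the tie-breaking rule are all assumptions made for the example, and
references are recounted from scratch at every step for clarity rather than
speed.

```c
#include <assert.h>

/* Hypothetical reference between two nodes, each identified by its
   position in the predefined order.  */
struct edge { int from, to; };

/* Count references with both ends inside [start, end) and references
   that cross the boundary of that range.  */
static void
count_refs (const struct edge *edges, int n_edges, int start, int end,
            long *internal, long *boundary)
{
  *internal = *boundary = 0;
  for (int i = 0; i < n_edges; i++)
    {
      int a = edges[i].from >= start && edges[i].from < end;
      int b = edges[i].to >= start && edges[i].to < end;
      if (a && b)
        ++*internal;
      else if (a || b)
        ++*boundary;
    }
}

/* Assign nodes 0..n_nodes-1 (already in the desired order) to partitions.
   part[i] receives the partition index of node i; the return value is the
   number of partitions.  */
int
balanced_partition (const int *size, int n_nodes,
                    const struct edge *edges, int n_edges,
                    int target_size, int *part)
{
  int n_parts = 0;
  int start = 0;

  assert (target_size > 0);
  while (start < n_nodes)
    {
      int end = start, total = 0;
      int best_end = start + 1;
      long best_boundary = 0, best_internal = 0;
      int have_best = 0;

      /* Grow the current partition until it reaches the target size.  */
      while (end < n_nodes && total < target_size)
        {
          long internal, boundary;

          total += size[end++];
          if (2 * total < target_size)
            continue;  /* Assumed threshold: too small to cut here.  */
          count_refs (edges, n_edges, start, end, &internal, &boundary);
          /* Compare boundary/internal ratios by cross-multiplication;
             ties go to the later, larger candidate.  */
          if (!have_best
              || boundary * best_internal <= best_boundary * internal)
            {
              have_best = 1;
              best_end = end;
              best_boundary = boundary;
              best_internal = internal;
            }
        }

      /* If the nodes ran out before the target size was reached, keep them
         all; otherwise undo the insertions made after the best cut.  */
      if (total < target_size)
        best_end = end;
      for (int i = start; i < best_end; i++)
        part[i] = n_parts;
      n_parts++;
      start = best_end;
    }
  return n_parts;
}
```

With six unit-size nodes forming two triangles (0,1,2 and 3,4,5) joined by a
single reference 2-3 and a target size of 4, a pure size cut would split after
node 3, while this sketch backs up one insertion and cuts between the two
triangles.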

Variables are dragged into the current partition when they are referenced and
are not yet in any previous partition.
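
The rule for variables can be sketched as follows.  This is only an
illustration under assumptions, not the patch's code: struct var_ref,
drag_variables and UNASSIGNED are hypothetical names, and the sketch computes
the placement after the functions have been assigned instead of during the
walk.  Because partitions are filled in increasing order, the result should be
the same: each variable lands in the earliest partition that references it.

```c
#define UNASSIGNED (-1)

/* Hypothetical record: function FUNC references variable VAR.  */
struct var_ref { int func, var; };

/* Given the partition of every function in func_part, place each
   referenced variable in the earliest partition that references it.
   Variables that nothing references stay UNASSIGNED.  */
void
drag_variables (const int *func_part,
                const struct var_ref *refs, int n_refs,
                int *var_part, int n_vars)
{
  for (int v = 0; v < n_vars; v++)
    var_part[v] = UNASSIGNED;

  for (int i = 0; i < n_refs; i++)
    {
      int p = func_part[refs[i].func];
      int v = refs[i].var;

      /* A variable already claimed by an earlier partition stays there.  */
      if (var_part[v] == UNASSIGNED || p < var_part[v])
        var_part[v] = p;
    }
}
```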

I tried smarter graph clustering algorithms, but ended up with this one
because the others won't play well with the function reordering pass I plan to
implement next (the idea is to feed in the order from this new pass instead of
the currently used reverse postorder).  For Mozilla and other large projects it
seems very important to order functions sequentially in the binary.

All algorithms resulted in pretty much the same binary size; the trick with
unwinding brings just a small benefit, about 1%.

Bootstrapped/regtested x86_64-linux, seems to make sense?

If accepted, I will update the existing WHOPR testcases to use
-flto-partition=1to1, since otherwise they would get optimized into a single
partition and stop testing what they are intended to test.

Honza

	* doc/invoke.texi (-flto-partition, lto-partitions, lto-minpartition):
	Document.
	* opts.c (decode_options): Handle lto partitions.
	* common.opt (flto-partition): New.
	* params.def (PARAM_LTO_PARTITIONS, MIN_PARTITION_SIZE): New.

	* lto.c:  Include params.h.
	(add_cgraph_node_to_partition, add_varpool_node_to_partition): Do
	refcounting in aux field.
	(undo_partition, partition_cgraph_node_p, partition_varpool_node_p):
	New functions.
	(lto_1_to_1_map): Simplify.
	(lto_balanced_map): New function.
	(do_whole_program_analysis): Choose proper partitioning algorithm.
	* Makefile.in (lto.o): Add dependency on params.h.
