
Avoid ggc_collect () after WPA forking

Message ID alpine.LSU.2.11.1403191207150.6041@zhemvz.fhfr.qr
State New

Commit Message

Richard Biener March 19, 2014, 11:10 a.m. UTC
This patch avoids calling ggc_collect after we have possibly
forked during the WPA phase, as collecting then necessarily
causes a lot of page unsharing.  I have verified that during an
LTO bootstrap we do not allocate GC memory during (or after)
lto_wpa_write_files, thus the effect on memory use should be
positive (the patch below contains checking code making sure
that we don't allocate).
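
To illustrate the ordering argument, here is a minimal sketch that is
not GCC code.  It assumes a POSIX system where fork () shares pages
copy-on-write; toy_heap, toy_collect, toy_checksum and toy_stream_out
are hypothetical names invented for the illustration.  The write-heavy
pass runs once in the parent before any fork, and the children only
read, so they should never need private copies of the heap pages.  The
sketch does not measure page sharing; it only shows the ordering.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the GC heap: one buffer owned by the parent.  */
struct toy_heap
{
  unsigned char *data;
  size_t size;
};

/* Hypothetical stand-in for ggc_collect (): it writes to every byte, so
   running it in a forked child would force private copies of every page.  */
static void
toy_collect (struct toy_heap *heap)
{
  memset (heap->data, 1, heap->size);
}

/* Read-only work for a child: sum the bytes without modifying them.  */
static unsigned long
toy_checksum (const struct toy_heap *heap)
{
  unsigned long sum = 0;
  for (size_t i = 0; i < heap->size; i++)
    sum += heap->data[i];
  return sum;
}

/* Collect once in the parent, then fork up to NWORKERS children that only
   read.  Returns the number of children that saw the expected checksum;
   this is less than NWORKERS if a fork or a child failed.  */
static int
toy_stream_out (struct toy_heap *heap, int nworkers)
{
  int started = 0, ok = 0;

  assert (heap->data != NULL);
  toy_collect (heap);		/* Last write to the heap, before any fork.  */

  for (int i = 0; i < nworkers; i++)
    {
      pid_t pid = fork ();
      if (pid < 0)
	break;			/* Stop forking; still reap what we started.  */
      if (pid == 0)
	_exit (toy_checksum (heap) == heap->size ? 0 : 1);
      started++;
    }

  /* Reap every child that was started and count the successful ones.  */
  for (int i = 0; i < started; i++)
    {
      int status;
      if (wait (&status) > 0 && WIFEXITED (status)
	  && WEXITSTATUS (status) == 0)
	ok++;
    }
  return ok;
}
```

Moving the toy_collect call into the child branch would still produce
the same checksums, but every child would then write to the whole
buffer, which is the kind of unsharing the patch is meant to avoid.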

LTO bootstrapped on x86_64-unknown-linux-gnu, will apply shortly
(without the checking code of course).

That should fix the WPA memory explosion Martin sees when building
Chromium.

Richard.

2014-03-19  Richard Biener  <rguenther@suse.de>

	* lto.c (lto_wpa_write_files): Move call to
	lto_promote_cross_file_statics ...
	(do_whole_program_analysis): ... here, into the partitioning
	block.  Do not ggc_collect after lto_wpa_write_files but
	for a last time before it.

Comments

Steven Bosscher March 19, 2014, 12:06 p.m. UTC | #1
On Wed, Mar 19, 2014 at 12:10 PM, Richard Biener wrote:
> Index: gcc/ggc-page.c
> ===================================================================
> --- gcc/ggc-page.c      (revision 208642)
> +++ gcc/ggc-page.c      (working copy)
> @@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
>    return size;
>  }
>
> +int may_alloc = 1;

"bool may_alloc"?

Ciao!
Steven
Richard Biener March 19, 2014, 12:30 p.m. UTC | #2
On Wed, 19 Mar 2014, Steven Bosscher wrote:

> On Wed, Mar 19, 2014 at 12:10 PM, Richard Biener wrote:
> > Index: gcc/ggc-page.c
> > ===================================================================
> > --- gcc/ggc-page.c      (revision 208642)
> > +++ gcc/ggc-page.c      (working copy)
> > @@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
> >    return size;
> >  }
> >
> > +int may_alloc = 1;
> 
> "bool may_alloc"?

It's only checking code I didn't commit.  We may of course alloc
but I wanted to prove we don't.
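
As an illustration of that kind of checking code, here is a minimal
standalone sketch, not the GCC allocator.  It uses the bool flag
suggested above; checked_alloc and blocked_allocs are hypothetical
names, and plain malloc stands in for the GC allocator.  Where the
checking code in the patch calls fatal_error, this sketch only counts
the violation and returns NULL so that the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Guard flag; cleared once allocation is no longer expected.  */
static bool may_alloc = true;

/* Hypothetical counter of allocations attempted while forbidden.  */
static unsigned long blocked_allocs;

/* Hypothetical wrapper around malloc.  While MAY_ALLOC is false it
   refuses to allocate, records the attempt and returns NULL.  */
static void *
checked_alloc (size_t size)
{
  if (!may_alloc)
    {
      blocked_allocs++;
      return NULL;
    }
  return malloc (size);
}
```

A caller would clear may_alloc right before the phase that must not
allocate and check blocked_allocs (or abort in checked_alloc) to prove
that nothing allocated afterwards.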

Richard.
Martin Liška March 19, 2014, 2:41 p.m. UTC | #3
Here are stats for Firefox with LTO and -O2. According to the graphs,
memory consumption for the parallel WPA phase looks similar.
When I disable parallel WPA, the WPA footprint is ~4GB, but the ltrans
memory footprint is similar to that with parallel WPA, which reduces
libxul.so linking by ~10%.

Martin



Patch

Index: gcc/ggc-page.c
===================================================================
--- gcc/ggc-page.c	(revision 208642)
+++ gcc/ggc-page.c	(working copy)
@@ -1199,6 +1199,8 @@  ggc_round_alloc_size (size_t requested_s
   return size;
 }
 
+int may_alloc = 1;
+
 /* Allocate a chunk of memory of SIZE bytes.  Its contents are undefined.  */
 
 void *
@@ -1208,6 +1210,9 @@  ggc_internal_alloc_stat (size_t size MEM
   struct page_entry *entry;
   void *result;
 
+  if (!may_alloc)
+    fatal_error ("allocating GC memory");
+
   ggc_round_alloc_size_1 (size, &order, &object_size);
 
   /* If there are non-full pages for this size allocation, they are at
Index: gcc/lto/lto.c
===================================================================
--- gcc/lto/lto.c	(revision 208642)
+++ gcc/lto/lto.c	(working copy)
@@ -2565,11 +2566,6 @@  lto_wpa_write_files (void)
   FOR_EACH_VEC_ELT (ltrans_partitions, i, part)
     lto_stats.num_output_symtab_nodes += lto_symtab_encoder_size (part->encoder);
 
-  /* Find out statics that need to be promoted
-     to globals with hidden visibility because they are accessed from multiple
-     partitions.  */
-  lto_promote_cross_file_statics ();
-
   timevar_pop (TV_WHOPR_WPA);
 
   timevar_push (TV_WHOPR_WPA_IO);
@@ -3281,11 +3277,25 @@  do_whole_program_analysis (void)
     node->aux = NULL;
 
   lto_stats.num_cgraph_partitions += ltrans_partitions.length ();
+
+  /* Find out statics that need to be promoted
+     to globals with hidden visibility because they are accessed from multiple
+     partitions.  */
+  lto_promote_cross_file_statics ();
   timevar_pop (TV_WHOPR_PARTITIONING);
 
   timevar_stop (TV_PHASE_OPT_GEN);
-  timevar_start (TV_PHASE_STREAM_OUT);
 
+  /* Collect a last time - in lto_wpa_write_files we may end up forking
+     with the idea that this doesn't increase memory usage.  So we
+     absolutely do not want to collect after that.  */
+  ggc_collect ();
+    {
+      extern int may_alloc;
+      may_alloc = 0;
+    }
+
+  timevar_start (TV_PHASE_STREAM_OUT);
   if (!quiet_flag)
     {
       fprintf (stderr, "\nStreaming out");
@@ -3294,10 +3304,8 @@  do_whole_program_analysis (void)
   lto_wpa_write_files ();
   if (!quiet_flag)
     fprintf (stderr, "\n");
-
   timevar_stop (TV_PHASE_STREAM_OUT);
 
-  ggc_collect ();
   if (post_ipa_mem_report)
     {
       fprintf (stderr, "Memory consumption after IPA\n");