LTO streamer reorg - try to reduce WPA memory use

Message ID CAFiYyc0bN2a5NV5bwgjLvrLYRk4nvBaqrBMFLh+fjOB57RknfA@mail.gmail.com
State New

Commit Message

Richard Biener July 30, 2014, 11:37 a.m. UTC
On Wed, Jul 30, 2014 at 1:14 PM, Martin Liška <mliska@suse.cz> wrote:
>
> On 07/30/2014 11:41 AM, Richard Biener wrote:
>>
>> On Wed, 30 Jul 2014, Richard Biener wrote:
>>
>>> On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
>>> <markus@trippelsdorf.de> wrote:
>>>>
>>>> On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
>>>>>
>>>>> On Tue, 29 Jul 2014, Richard Biener wrote:
>>>>>
>>>>>> This re-organizes the LTO streamer to do compression transparently
>>>>>> in the data-streamer routines (and disables section compression
>>>>>> by defaulting to -flto-compression-level=0).  This avoids
>>>>>> keeping the whole uncompressed sections in memory, only retaining
>>>>>> the compressed ones.
>>>>>>
>>>>>> The downside is that we lose compression of at least the string
>>>>>> parts (they are abusing the streaming interface quite awkwardly
>>>>>> and doing random-accesses with offsets into the uncompressed
>>>>>> section).  With a little bit of surgery we can get that back I
>>>>>> think (but we'd have to keep the uncompressed piece in memory
>>>>>> somewhere which means losing the memory use advantage).
>>>>>>
>>>>>> Very lightly tested so far (running lto.exp).  I'll try an LTO
>>>>>> bootstrap now.
>>>>>>
>>>>>> I wonder what the change is on WPA memory use for larger
>>>>>> projects and what the effect on object file size is.
>>>>>
>>>>> Updated patch passing LTO bootstrap (one warning fix) and
>>>>> with a memory leak fixed.
>>>>
>>>> Testing with Firefox is impossible at the moment because of PR61885.
>>>> One thing I've noticed (before the ICE) is that virtual memory usage is
>>>> very high:
>>>>
>>>> Address            Kbytes      RSS    Dirty  Mode  Mapping
>>>> 0000000000400000    16344     9084        0  r-x-- lto1
>>>> 00000000013f6000       36       36       28  rw--- lto1
>>>> 00000000013ff000     1072      276      276  rw---   [ anon ]
>>>> 00000000034aa000 10154940  1540384  1540384  rw---   [ anon ]
>>>> 00002acf04af2000      136      136        0  r-x-- ld-2.19.90.so
>>>> 00002acf04b14000       88       88       88  rw---   [ anon ]
>>>> ...
>>>> ----------------  -------  -------  -------
>>>> total kB         12022060  3388396  3377708
>>>
>>> Maybe there is still a memleak (just checked that LTOing int main() {}
>>> doesn't leak).
>>
>> Found it:
>>
>> Index: gcc/lto-section-in.c
>> ===================================================================
>> --- gcc/lto-section-in.c.orig   2014-07-30 12:40:27.950225826 +0200
>> +++ gcc/lto-section-in.c        2014-07-30 12:37:44.179237102 +0200
>> @@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
>>                                  struct lto_input_block *ib,
>>                                  const char *data, size_t len)
>>   {
>> -  free (ib);
>> +  delete ib;
>>     lto_free_section_data (file_data, section_type, NULL, data, len);
>>   }
>>   Richard.
>
> Hello,
>    here are the memory/CPU usage numbers for the patch. For both runs, I
> used sync and drop_caches.
>
> Url:
> https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing

Ok, it turns out setting -flto-compression-level to 0 doesn't really
short-circuit zlib for sections.  So the following does that the hard
but effective way.


does that change anything?

Thanks,
Richard.

> Martin
>
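
The leak fix quoted above replaces `free (ib)` with `delete ib`, which suggests the reorganized streamer now allocates the input block with `new` (an assumption; the allocation site is not shown in this thread). A minimal sketch of why the release call has to match the allocation follows. `input_block`, `create_block` and `destroy_block` are hypothetical stand-ins, not GCC's actual `lto_input_block` API.

```cpp
#include <cstddef>

// Hypothetical stand-in for struct lto_input_block; the real layout
// is not shown in this thread.
struct input_block
{
  static int live;   // number of currently constructed blocks
  const char *data;
  std::size_t len;

  input_block (const char *d, std::size_t l) : data (d), len (l) { ++live; }
  ~input_block () { --live; }
};

int input_block::live = 0;

// Allocate with new, as the reorganized streamer is assumed to do.
input_block *
create_block (const char *data, std::size_t len)
{
  return new input_block (data, len);
}

// Release with delete: this runs the destructor and returns the memory
// through the same allocator that new used.  Calling free () here
// would skip the destructor and is undefined behavior for memory
// obtained from new.
void
destroy_block (input_block *ib)
{
  delete ib;
}
```

With this pairing, anything the block owns is released in its destructor; whether the real input block owns further buffers is not stated in the thread.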

Comments

Martin Liška July 30, 2014, 11:59 a.m. UTC | #1
On 07/30/2014 12:37 PM, Richard Biener wrote:
> On Wed, Jul 30, 2014 at 1:14 PM, Martin Liška <mliska@suse.cz> wrote:
>> On 07/30/2014 11:41 AM, Richard Biener wrote:
>>> On Wed, 30 Jul 2014, Richard Biener wrote:
>>>
>>>> On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
>>>> <markus@trippelsdorf.de> wrote:
>>>>> On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
>>>>>> On Tue, 29 Jul 2014, Richard Biener wrote:
>>>>>>
>>>>>>> This re-organizes the LTO streamer to do compression transparently
>>>>>>> in the data-streamer routines (and disables section compression
>>>>>>> by defaulting to -flto-compression-level=0).  This avoids
>>>>>>> keeping the whole uncompressed sections in memory, only retaining
>>>>>>> the compressed ones.
>>>>>>>
>>>>>>> The downside is that we lose compression of at least the string
>>>>>>> parts (they are abusing the streaming interface quite awkwardly
>>>>>>> and doing random-accesses with offsets into the uncompressed
>>>>>>> section).  With a little bit of surgery we can get that back I
>>>>>>> think (but we'd have to keep the uncompressed piece in memory
>>>>>>> somewhere which means losing the memory use advantage).
>>>>>>>
>>>>>>> Very lightly tested so far (running lto.exp).  I'll try an LTO
>>>>>>> bootstrap now.
>>>>>>>
>>>>>>> I wonder what the change is on WPA memory use for larger
>>>>>>> projects and what the effect on object file size is.
>>>>>> Updated patch passing LTO bootstrap (one warning fix) and
>>>>>> with a memory leak fixed.
>>>>> Testing with Firefox is impossible at the moment because of PR61885.
>>>>> One thing I've noticed (before the ICE) is that virtual memory usage is
>>>>> very high:
>>>>>
>>>>> Address            Kbytes      RSS    Dirty  Mode  Mapping
>>>>> 0000000000400000    16344     9084        0  r-x-- lto1
>>>>> 00000000013f6000       36       36       28  rw--- lto1
>>>>> 00000000013ff000     1072      276      276  rw---   [ anon ]
>>>>> 00000000034aa000 10154940  1540384  1540384  rw---   [ anon ]
>>>>> 00002acf04af2000      136      136        0  r-x-- ld-2.19.90.so
>>>>> 00002acf04b14000       88       88       88  rw---   [ anon ]
>>>>> ...
>>>>> ----------------  -------  -------  -------
>>>>> total kB         12022060  3388396  3377708
>>>> Maybe there is still a memleak (just checked that LTOing int main() {}
>>>> doesn't leak).
>>> Found it:
>>>
>>> Index: gcc/lto-section-in.c
>>> ===================================================================
>>> --- gcc/lto-section-in.c.orig   2014-07-30 12:40:27.950225826 +0200
>>> +++ gcc/lto-section-in.c        2014-07-30 12:37:44.179237102 +0200
>>> @@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
>>>                                   struct lto_input_block *ib,
>>>                                   const char *data, size_t len)
>>>    {
>>> -  free (ib);
>>> +  delete ib;
>>>      lto_free_section_data (file_data, section_type, NULL, data, len);
>>>    }
>>>    Richard.
>> Hello,
>>     here are the memory/CPU usage numbers for the patch. For both runs, I
>> used sync and drop_caches.
>>
>> Url:
>> https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing
> Ok, it turns out setting -flto-compression-level to 0 doesn't really
> short-circuit zlib for sections.  So the following does that the hard
> but effective way.
>
> Index: gcc/lto-section-out.c
> ===================================================================
> --- gcc/lto-section-out.c.orig  2014-07-30 13:33:06.634008355 +0200
> +++ gcc/lto-section-out.c       2014-07-30 13:29:19.468023995 +0200
> @@ -80,7 +80,7 @@ lto_begin_section (const char *name, boo
>        data is anything other than assembler output.  The effect here is that
>        we get compression of IL only in non-ltrans object files.  */
>     gcc_assert (compression_stream == NULL);
> -  if (compress)
> +  if (compress && 0)
>       compression_stream = lto_start_compression (lto_append_data, NULL);
>   }
>
> Index: gcc/lto-section-in.c
> ===================================================================
> --- gcc/lto-section-in.c.orig   2014-07-30 13:33:06.637008355 +0200
> +++ gcc/lto-section-in.c        2014-07-30 13:31:57.329013126 +0200
> @@ -153,7 +153,7 @@ lto_get_section_data (struct lto_file_de
>
>     /* FIXME lto: WPA mode does not write compressed sections, so for now
>        suppress uncompression if flag_ltrans.  */
> -  if (!flag_ltrans)
> +  if (!flag_ltrans && 0)
>       {
>         /* Create a mapping header containing the underlying data and length,
>           and prepend this to the uncompression buffer.  The uncompressed data
> @@ -200,7 +200,7 @@ lto_free_section_data (struct lto_file_d
>
>     /* FIXME lto: WPA mode does not write compressed sections, so for now
>        suppress uncompression mapping if flag_ltrans.  */
> -  if (flag_ltrans)
> +  if (flag_ltrans || 1)
>       {
>         (free_section_f) (file_data, section_type, name, data, len);
>         return;
>
> does that change anything?
>
> Thanks,
> Richard.
There are new numbers: https://drive.google.com/file/d/0B0pisUJ80pO1aG83N2JXLWNVUW8/edit?usp=sharing, where I reduced the scale to 10GB to better identify any differences.

Martin

>> Martin
>>

Patch

Index: gcc/lto-section-out.c
===================================================================
--- gcc/lto-section-out.c.orig  2014-07-30 13:33:06.634008355 +0200
+++ gcc/lto-section-out.c       2014-07-30 13:29:19.468023995 +0200
@@ -80,7 +80,7 @@  lto_begin_section (const char *name, boo
      data is anything other than assembler output.  The effect here is that
      we get compression of IL only in non-ltrans object files.  */
   gcc_assert (compression_stream == NULL);
-  if (compress)
+  if (compress && 0)
     compression_stream = lto_start_compression (lto_append_data, NULL);
 }

Index: gcc/lto-section-in.c
===================================================================
--- gcc/lto-section-in.c.orig   2014-07-30 13:33:06.637008355 +0200
+++ gcc/lto-section-in.c        2014-07-30 13:31:57.329013126 +0200
@@ -153,7 +153,7 @@  lto_get_section_data (struct lto_file_de

   /* FIXME lto: WPA mode does not write compressed sections, so for now
      suppress uncompression if flag_ltrans.  */
-  if (!flag_ltrans)
+  if (!flag_ltrans && 0)
     {
       /* Create a mapping header containing the underlying data and length,
         and prepend this to the uncompression buffer.  The uncompressed data
@@ -200,7 +200,7 @@  lto_free_section_data (struct lto_file_d

   /* FIXME lto: WPA mode does not write compressed sections, so for now
      suppress uncompression mapping if flag_ltrans.  */
-  if (flag_ltrans)
+  if (flag_ltrans || 1)
     {
       (free_section_f) (file_data, section_type, name, data, len);
       return;