Patchwork [2/3] Free large chunks in ggc

login
register
mail settings
Submitter Andi Kleen
Date Oct. 21, 2011, 5:52 a.m.
Message ID <1319176370-26071-3-git-send-email-andi@firstfloor.org>
Download mbox | patch
Permalink /patch/120930/
State New
Headers show

Comments

Andi Kleen - Oct. 21, 2011, 5:52 a.m.
From: Andi Kleen <ak@linux.intel.com>

This implements the freeing back of large chunks in the ggc madvise path
Richard Guenther asked for.  This way on systems with limited
address space malloc() and other allocators still have
a chance to get back at some of the memory ggc freed. The
fragmented pages are still just given back, but the address space
stays allocated.

I tried freeing only aligned 2MB areas to optimize for 2MB huge
pages, but the hit rate was quite low, so I switched to 1MB+
unaligned areas. The target size is a param now.

Passed bootstrap and testing on x86_64-linux

gcc/:
2011-10-18  Andi Kleen  <ak@linux.intel.com>

	* ggc-page (release_pages): First free large continuous
	chunks in the madvise path.
	* params.def (GGC_FREE_UNIT): Add.
	* doc/invoke.texi (ggc-free-unit): Add.
---
 gcc/doc/invoke.texi |    5 +++++
 gcc/ggc-page.c      |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/params.def      |    5 +++++
 3 files changed, 58 insertions(+), 0 deletions(-)
Richard Guenther - Oct. 21, 2011, 9:55 a.m.
On Fri, Oct 21, 2011 at 7:52 AM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> This implements the freeing back of large chunks in the ggc madvise path
> Richard Guenther asked for.  This way on systems with limited
> address space malloc() and other allocators still have
> a chance to get back at some of the memory ggc freed. The
> fragmented pages are still just given back, but the address space
> stays allocated.
>
> I tried freeing only aligned 2MB areas to optimize for 2MB huge
> pages, but the hit rate was quite low, so I switched to 1MB+
> unaligned areas. The target size is a param now.
>
> Passed bootstrap and testing on x86_64-linux
>
> gcc/:
> 2011-10-18  Andi Kleen  <ak@linux.intel.com>
>
>        * ggc-page (release_pages): First free large continuous
>        chunks in the madvise path.
>        * params.def (GGC_FREE_UNIT): Add.
>        * doc/invoke.texi (ggc-free-unit): Add.
> ---
>  gcc/doc/invoke.texi |    5 +++++
>  gcc/ggc-page.c      |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  gcc/params.def      |    5 +++++
>  3 files changed, 58 insertions(+), 0 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 4f55dbc..e622552 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -8858,6 +8858,11 @@ very large effectively disables garbage collection.  Setting this
>  parameter and @option{ggc-min-expand} to zero causes a full collection
>  to occur at every opportunity.
>
> +@item  ggc-free-unit
> +
> +Continuous areas in OS pages to free back to OS immediately. Default is 256
> +pages, which is 1MB with 4K pages.
> +
>  @item max-reload-search-insns
>  The maximum number of instruction reload should look backward for equivalent
>  register.  Increasing values mean more aggressive optimization, making the
> diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
> index ba88e3f..eb0eeef 100644
> --- a/gcc/ggc-page.c
> +++ b/gcc/ggc-page.c
> @@ -972,6 +972,54 @@ release_pages (void)
>   page_entry *p, *start_p;
>   char *start;
>   size_t len;
> +  size_t mapped_len;
> +  page_entry *next, *prev, *newprev;
> +  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
> +
> +  /* First free larger continuous areas to the OS.
> +     This allows other allocators to grab these areas if needed.
> +     This is only done on larger chunks to avoid fragmentation.
> +     This does not always work because the free_pages list is only
> +     sorted over a single GC cycle. */

But release_pages is only called from ggc_collect, or what do you
mean with the above?  Would the hitrate using the quire size increase
if we change how we allocate from the freelist or is it real fragmentation
that causes it?

I'm a bit hesitant to approve the new param, I'd be ok if we just hard-code
quire-size / 2.

Richard.

> +
> +  p = G.free_pages;
> +  prev = NULL;
> +  while (p)
> +    {
> +      start = p->page;
> +      start_p = p;
> +      len = 0;
> +      mapped_len = 0;
> +      newprev = prev;
> +      while (p && p->page == start + len)
> +        {
> +          len += p->bytes;
> +         if (!p->discarded)
> +             mapped_len += p->bytes;
> +         newprev = p;
> +          p = p->next;
> +        }
> +      if (len >= free_unit)
> +        {
> +          while (start_p != p)
> +            {
> +              next = start_p->next;
> +              free (start_p);
> +              start_p = next;
> +            }
> +          munmap (start, len);
> +         if (prev)
> +           prev->next = p;
> +          else
> +            G.free_pages = p;
> +          G.bytes_mapped -= mapped_len;
> +         continue;
> +        }
> +      prev = newprev;
> +   }
> +
> +  /* Now give back the fragmented pages to the OS, but keep the address
> +     space to reuse it next time. */
>
>   for (p = G.free_pages; p; )
>     {
> diff --git a/gcc/params.def b/gcc/params.def
> index 5e49c48..edbf0de 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -561,6 +561,11 @@ DEFPARAM(GGC_MIN_HEAPSIZE,
>  #undef GGC_MIN_EXPAND_DEFAULT
>  #undef GGC_MIN_HEAPSIZE_DEFAULT
>
> +DEFPARAM(GGC_FREE_UNIT,
> +        "ggc-free-unit",
> +        "Continuous areas in OS pages to free back immediately",
> +        256, 0, 0)
> +
>  DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
>         "max-reload-search-insns",
>         "The maximum number of instructions to search backward when looking for equivalent reload",
> --
> 1.7.5.4
>
>
Andi Kleen - Oct. 21, 2011, 6:30 p.m.
> > diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
> > index ba88e3f..eb0eeef 100644
> > --- a/gcc/ggc-page.c
> > +++ b/gcc/ggc-page.c
> > @@ -972,6 +972,54 @@ release_pages (void)
> >   page_entry *p, *start_p;
> >   char *start;
> >   size_t len;
> > +  size_t mapped_len;
> > +  page_entry *next, *prev, *newprev;
> > +  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
> > +
> > +  /* First free larger continuous areas to the OS.
> > +     This allows other allocators to grab these areas if needed.
> > +     This is only done on larger chunks to avoid fragmentation.
> > +     This does not always work because the free_pages list is only
> > +     sorted over a single GC cycle. */
> 
> But release_pages is only called from ggc_collect, or what do you

If there was a spike in GC usage and we end up with lots of free
space in the free list afterward we free it back on the next GC cycle.
Then if there's a malloc or other allocator later it can grab
the address space we freed.

That was done to address your earlier concern.

This will only happen on ggc_collect of course.

So one difference from before the madvise patch is that different
generations of free pages can accumulate in the freelist. Before madvise
the freelist would never contain more than one generation.
Normally it's sorted by address due to the way GC works, but there's no 
attempt to keep the sort order over multiple generations.

The "free in batch" heuristic requires sorting, so it will only
work if all the pages are freed in a single gc cycle.

I considered sorting, but it seemed to be too slow.

I can expand the comment on that.


> mean with the above?  Would the hitrate using the quire size increase
> if we change how we allocate from the freelist or is it real fragmentation
> that causes it?

Not sure really about the hitrate. I haven't measured it. If hitrate
was a concern the free list should be probably split into an array.
I'm sure there are lots of other tunings that could be done on the GC,
but probably not by me for now :)

> 
> I'm a bit hesitant to approve the new param, I'd be ok if we just hard-code
> quire-size / 2.

Ok replacing it with a hardcoded value.

-Andi
Richard Guenther - Oct. 23, 2011, 10:23 a.m.
On Fri, Oct 21, 2011 at 8:30 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> > diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
>> > index ba88e3f..eb0eeef 100644
>> > --- a/gcc/ggc-page.c
>> > +++ b/gcc/ggc-page.c
>> > @@ -972,6 +972,54 @@ release_pages (void)
>> >   page_entry *p, *start_p;
>> >   char *start;
>> >   size_t len;
>> > +  size_t mapped_len;
>> > +  page_entry *next, *prev, *newprev;
>> > +  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
>> > +
>> > +  /* First free larger continuous areas to the OS.
>> > +     This allows other allocators to grab these areas if needed.
>> > +     This is only done on larger chunks to avoid fragmentation.
>> > +     This does not always work because the free_pages list is only
>> > +     sorted over a single GC cycle. */
>>
>> But release_pages is only called from ggc_collect, or what do you
>
> If there was a spike in GC usage and we end up with lots of free
> space in the free list afterward we free it back on the next GC cycle.
> Then if there's a malloc or other allocator later it can grab
> the address space we freed.
>
> That was done to address your earlier concern.
>
> This will only happen on ggc_collect of course.
>
> So one difference from before the madvise patch is that different
> generations of free pages can accumulate in the freelist. Before madvise
> the freelist would never contain more than one generation.
> Normally it's sorted by address due to the way GC works, but there's no
> attempt to keep the sort order over multiple generations.
>
> The "free in batch" heuristic requires sorting, so it will only
> work if all the pages are freed in a single gc cycle.
>
> I considered sorting, but it seemed to be too slow.
>
> I can expand the comment on that.

Ah, now I see ... but that's of course bad - I expect large regions to be
free only after multiple collections.  Can you measure what sorting would
make for a difference?

>
>> mean with the above?  Would the hitrate using the quire size increase
>> if we change how we allocate from the freelist or is it real fragmentation
>> that causes it?
>
> Not sure really about the hitrate. I haven't measured it. If hitrate
> was a concern the free list should be probably split into an array.
> I'm sure there are lots of other tunings that could be done on the GC,
> but probably not by me for now :)

Heh.  Yeah, I suppose the freelist could be changed into a list of
allocation groups with free pages and a bitmap.

Richard.

>>
>> I'm a bit hesitant to approve the new param, I'd be ok if we just hard-code
>> quire-size / 2.
>
> Ok replacing it with a hardcoded value.
>
> -Andi
>
Richard Guenther - Oct. 23, 2011, 10:24 a.m.
On Sun, Oct 23, 2011 at 12:23 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Fri, Oct 21, 2011 at 8:30 PM, Andi Kleen <andi@firstfloor.org> wrote:
>>> > diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
>>> > index ba88e3f..eb0eeef 100644
>>> > --- a/gcc/ggc-page.c
>>> > +++ b/gcc/ggc-page.c
>>> > @@ -972,6 +972,54 @@ release_pages (void)
>>> >   page_entry *p, *start_p;
>>> >   char *start;
>>> >   size_t len;
>>> > +  size_t mapped_len;
>>> > +  page_entry *next, *prev, *newprev;
>>> > +  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
>>> > +
>>> > +  /* First free larger continuous areas to the OS.
>>> > +     This allows other allocators to grab these areas if needed.
>>> > +     This is only done on larger chunks to avoid fragmentation.
>>> > +     This does not always work because the free_pages list is only
>>> > +     sorted over a single GC cycle. */
>>>
>>> But release_pages is only called from ggc_collect, or what do you
>>
>> If there was a spike in GC usage and we end up with lots of free
>> space in the free list afterward we free it back on the next GC cycle.
>> Then if there's a malloc or other allocator later it can grab
>> the address space we freed.
>>
>> That was done to address your earlier concern.
>>
>> This will only happen on ggc_collect of course.
>>
>> So one difference from before the madvise patch is that different
>> generations of free pages can accumulate in the freelist. Before madvise
>> the freelist would never contain more than one generation.
>> Normally it's sorted by address due to the way GC works, but there's no
>> attempt to keep the sort order over multiple generations.
>>
>> The "free in batch" heuristic requires sorting, so it will only
>> work if all the pages are freed in a single gc cycle.
>>
>> I considered sorting, but it seemed to be too slow.
>>
>> I can expand the comment on that.
>
> Ah, now I see ... but that's of course bad - I expect large regions to be
> free only after multiple collections.  Can you measure what sorting would
> make for a difference?

I wonder if the free list that falls out of a single collection is sorted
(considering also ggc_free) - if it is, building a new one at each collection
and then merging the two sorted lists should be reasonably fast.

>>
>>> mean with the above?  Would the hitrate using the quire size increase
>>> if we change how we allocate from the freelist or is it real fragmentation
>>> that causes it?
>>
>> Not sure really about the hitrate. I haven't measured it. If hitrate
>> was a concern the free list should be probably split into an array.
>> I'm sure there are lots of other tunings that could be done on the GC,
>> but probably not by me for now :)
>
> Heh.  Yeah, I suppose the freelist could be changed into a list of
> allocation groups with free pages and a bitmap.
>
> Richard.
>
>>>
>>> I'm a bit hesitant to approve the new param, I'd be ok if we just hard-code
>>> quire-size / 2.
>>
>> Ok replacing it with a hardcoded value.
>>
>> -Andi
>>
>
Andi Kleen - Oct. 23, 2011, 5:31 p.m.
On Sun, Oct 23, 2011 at 12:24:46PM +0200, Richard Guenther wrote:
> >> space in the free list afterward we free it back on the next GC cycle.
> >> Then if there's a malloc or other allocator later it can grab
> >> the address space we freed.
> >>
> >> That was done to address your earlier concern.
> >>
> >> This will only happen on ggc_collect of course.
> >>
> >> So one difference from before the madvise patch is that different
> >> generations of free pages can accumulate in the freelist. Before madvise
> >> the freelist would never contain more than one generation.
> >> Normally it's sorted by address due to the way GC works, but there's no
> >> attempt to keep the sort order over multiple generations.
> >>
> >> The "free in batch" heuristic requires sorting, so it will only
> >> work if all the pages are freed in a single gc cycle.
> >>
> >> I considered sorting, but it seemed to be too slow.
> >>
> >> I can expand the comment on that.
> >
> > Ah, now I see ... but that's of course bad - I expect large regions to be
> > free only after multiple collections.  Can you measure what sorting would
> > make for a difference?
> 
> I wonder if the free list that falls out of a single collection is sorted

The original author seemed to have assumed it is usually. The 
allocation part tries hard to insert sorted. So I thought it 
was ok to assume.

I stuck in an assert now nd it triggers in a bootstrap on the large
files, so it's not always true (so my earlier assumption was not fully correct)

I suppose it's just another heuristic which is often enough true.

So madvise may not may have it made that much worse.

> (considering also ggc_free) - if it is, building a new one at each collection

ggc_free does not put into the freelist I believe.

> and then merging the two sorted lists should be reasonably fast.

It's definitely not O(1). Ok one could assume it's usually sorted
and do a merge sort with max one pass only. But I'm sceptical 
it's worth the effort, at least without anyone having a test case.
At least for 64bit it's not needed anyways.

-Andi

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4f55dbc..e622552 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8858,6 +8858,11 @@  very large effectively disables garbage collection.  Setting this
 parameter and @option{ggc-min-expand} to zero causes a full collection
 to occur at every opportunity.
 
+@item  ggc-free-unit
+
+Continuous areas in OS pages to free back to OS immediately. Default is 256
+pages, which is 1MB with 4K pages. 
+
 @item max-reload-search-insns
 The maximum number of instruction reload should look backward for equivalent
 register.  Increasing values mean more aggressive optimization, making the
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index ba88e3f..eb0eeef 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -972,6 +972,54 @@  release_pages (void)
   page_entry *p, *start_p;
   char *start;
   size_t len;
+  size_t mapped_len;
+  page_entry *next, *prev, *newprev;
+  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
+
+  /* First free larger continuous areas to the OS.
+     This allows other allocators to grab these areas if needed.
+     This is only done on larger chunks to avoid fragmentation. 
+     This does not always work because the free_pages list is only
+     sorted over a single GC cycle. */
+
+  p = G.free_pages;
+  prev = NULL;
+  while (p)
+    {
+      start = p->page;
+      start_p = p;
+      len = 0;
+      mapped_len = 0;
+      newprev = prev;
+      while (p && p->page == start + len)
+        {
+          len += p->bytes;
+	  if (!p->discarded)
+	      mapped_len += p->bytes;
+	  newprev = p;
+          p = p->next;
+        }
+      if (len >= free_unit)
+        {
+          while (start_p != p)
+            {
+              next = start_p->next;
+              free (start_p);
+              start_p = next;
+            }
+          munmap (start, len);
+	  if (prev)
+	    prev->next = p;
+          else
+            G.free_pages = p;
+          G.bytes_mapped -= mapped_len;
+	  continue;
+        }
+      prev = newprev;
+   }
+
+  /* Now give back the fragmented pages to the OS, but keep the address 
+     space to reuse it next time. */
 
   for (p = G.free_pages; p; )
     {
diff --git a/gcc/params.def b/gcc/params.def
index 5e49c48..edbf0de 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -561,6 +561,11 @@  DEFPARAM(GGC_MIN_HEAPSIZE,
 #undef GGC_MIN_EXPAND_DEFAULT
 #undef GGC_MIN_HEAPSIZE_DEFAULT
 
+DEFPARAM(GGC_FREE_UNIT,
+	 "ggc-free-unit",
+	 "Continuous areas in OS pages to free back immediately",
+	 256, 0, 0)
+
 DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
 	 "max-reload-search-insns",
 	 "The maximum number of instructions to search backward when looking for equivalent reload",