Message ID: 1319006408-6012-2-git-send-email-andi@firstfloor.org
State: New
On Wed, Oct 19, 2011 at 08:40:08AM +0200, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> This implements the freeing back of large chunks in the ggc madvise path
> Richard Guenther asked for. This way on systems with limited
> address space malloc() and other allocators still have
> a chance to get back at some of the memory ggc freed. The
> fragmented pages are still just given back, but the address space
> stays allocated.
>
> I tried freeing only aligned 2MB areas to optimize for 2MB huge
> pages, but the hit rate was quite low, so I switched to 1MB+
> unaligned areas. The target size is a param now.

If the size to free is smaller than the quirk size, then it has the very
undesirable effect that with using GC only you might run unnecessarily out
of virtual address space, because it allocates pages in 2MB chunks, but
if they are released in 1MB chunks, those released chunks will never be
usable again for GC. Consider on 32-bit address space allocating 3GB
of GC memory, then freeing stuff in every odd 1MB chunk of pages, then
wanting to allocate through GC the 1.5GB back.

IMHO we should munmap immediately in release_pages the > G.pagesize pages,
those are not very likely to be reused anyway (and it had one in between
ggc_collect cycle to be reused anyway), and for the == G.pagesize
(the usual case, the only ones that are allocated in GGC_QUIRK_SIZE sets)
we should note which page was the first one in the GGC_QUIRK_SIZE chunk
and munmap exactly those 2MB starting at the first page only.

	Jakub
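Jakub's 32-bit scenario can be put in numbers: if released chunks never form a contiguous run at least as large as the allocation unit, none of the freed address space is reusable for 2MB allocations. A minimal standalone sketch of that run-coalescing arithmetic (not GCC code; the function name and shape are illustrative only):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): given free chunk start
   offsets sorted by address, each CHUNK_SIZE bytes long, sum the bytes
   that sit in contiguous runs of at least FREE_UNIT bytes -- i.e. the
   portion a run-based release policy could actually hand back.  */
static size_t
freeable_bytes (const size_t *chunks, size_t n, size_t chunk_size,
                size_t free_unit)
{
  size_t total = 0, i = 0;
  while (i < n)
    {
      /* Extend the run while the next chunk is adjacent.  */
      size_t run = chunk_size, j = i + 1;
      while (j < n && chunks[j] == chunks[j - 1] + chunk_size)
        {
          run += chunk_size;
          j++;
        }
      if (run >= free_unit)
        total += run;
      i = j;
    }
  return total;
}
```

With every odd 1MB chunk free (offsets 1MB, 3MB, 5MB, ...) no run ever reaches 2MB, so a policy that only releases or reuses 2MB runs recovers nothing — exactly the hole pattern described above.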
> If the size to free is smaller than the quirk size, then it has the very
> undesirable effect that with using GC only you might run unnecessarily out
> of virtual address space, because it allocates pages in 2MB chunks, but
> if they are released in 1MB chunks, those released chunks will never be

1MB is just the minimum, it frees it in whatever it can find
(but only for a single GC cycle usually). So when enough continuous
memory is free it will be reused by GC.

I guess it would be possible to add a fallback to allocate a smaller
chunk if the large chunk fails, but unless someone actually comes
up with a test case I have doubts it is really needed.

> usable again for GC. Consider on 32-bit address space allocating 3GB
> of GC memory, then freeing stuff in every odd 1MB chunk of pages, then
> wanting to allocate through GC the 1.5GB back.
>
> IMHO we should munmap immediately in release_pages the > G.pagesize pages,

Then you get the fragmentation problem back in full force.

> those are not very likely to be reused anyway (and it had one in between
> ggc_collect cycle to be reused anyway), and for the == G.pagesize
> (the usual case, the only ones that are allocated in GGC_QUIRK_SIZE sets)
> we should note which page was the first one in the GGC_QUIRK_SIZE chunk
> and munmap exactly those 2MB starting at the first page only.

I tried this first with aligned 2MB chunks, but it doesn't trigger
ever in a normal (non LTO) bootstrap.

-Andi
On Wed, Oct 19, 2011 at 04:37:45PM +0200, Andi Kleen wrote:
> > If the size to free is smaller than the quirk size, then it has the very
> > undesirable effect that with using GC only you might run unnecessarily out
> > of virtual address space, because it allocates pages in 2MB chunks, but
> > if they are released in 1MB chunks, those released chunks will never be
>
> 1MB is just the minimum, it frees it in whatever it can find
> (but only for a single GC cycle usually). So when enough continuous
> memory is free it will be reused by GC.
>
> I guess it would be possible to add a fallback to allocate a smaller
> chunk if the large chunk fails, but unless someone actually comes
> up with a test case I have doubts it is really needed.
>
> > usable again for GC. Consider on 32-bit address space allocating 3GB
> > of GC memory, then freeing stuff in every odd 1MB chunk of pages, then
> > wanting to allocate through GC the 1.5GB back.
> >
> > IMHO we should munmap immediately in release_pages the > G.pagesize pages,
>
> Then you get the fragmentation problem back in full force.

Why? For one, such allocations are very rare (you only get them when
a single GC allocation requests > page of memory, like perhaps a string
literal over 4KB or similar or function call with over 1000 arguments etc.).
And if they are unlikely to be reused, not munmapping them means wasting
more virtual address space than needed.

	Jakub
> Why? For one, such allocations are very rare (you only get them when
> a single GC allocation requests > page of memory, like perhaps a string
> literal over 4KB or similar or function call with over 1000 arguments etc.).
> And if they are unlikely to be reused, not munmapping them means wasting
> more virtual address space than needed.

We produce quite a bit of large hashtables in GGC and those will happily
be bigger than 4KB.

Honza
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4f55dbc..e622552 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8858,6 +8858,11 @@ very large effectively disables garbage collection.  Setting this parameter
 and @option{ggc-min-expand} to zero causes a full collection to occur at
 every opportunity.
 
+@item ggc-free-unit
+
+Continuous areas in OS pages to free back to OS immediately. Default is 256
+pages, which is 1MB with 4K pages.
+
 @item max-reload-search-insns
 The maximum number of instruction reload should look backward for equivalent
 register.  Increasing values mean more aggressive optimization, making the
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index ba88e3f..eb0eeef 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -972,6 +972,54 @@ release_pages (void)
   page_entry *p, *start_p;
   char *start;
   size_t len;
+  size_t mapped_len;
+  page_entry *next, *prev, *newprev;
+  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
+
+  /* First free larger continuous areas to the OS.
+     This allows other allocators to grab these areas if needed.
+     This is only done on larger chunks to avoid fragmentation.
+     This does not always work because the free_pages list is only
+     sorted over a single GC cycle.  */
+
+  p = G.free_pages;
+  prev = NULL;
+  while (p)
+    {
+      start = p->page;
+      start_p = p;
+      len = 0;
+      mapped_len = 0;
+      newprev = prev;
+      while (p && p->page == start + len)
+	{
+	  len += p->bytes;
+	  if (!p->discarded)
+	    mapped_len += p->bytes;
+	  newprev = p;
+	  p = p->next;
+	}
+      if (len >= free_unit)
+	{
+	  while (start_p != p)
+	    {
+	      next = start_p->next;
+	      free (start_p);
+	      start_p = next;
+	    }
+	  munmap (start, len);
+	  if (prev)
+	    prev->next = p;
+	  else
+	    G.free_pages = p;
+	  G.bytes_mapped -= mapped_len;
+	  continue;
+	}
+      prev = newprev;
+    }
+
+  /* Now give back the fragmented pages to the OS, but keep the address
+     space to reuse it next time.  */
   for (p = G.free_pages; p; )
     {
diff --git a/gcc/params.def b/gcc/params.def
index 5e49c48..edbf0de 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -561,6 +561,11 @@ DEFPARAM(GGC_MIN_HEAPSIZE,
 #undef GGC_MIN_EXPAND_DEFAULT
 #undef GGC_MIN_HEAPSIZE_DEFAULT
 
+DEFPARAM(GGC_FREE_UNIT,
+	 "ggc-free-unit",
+	 "Continuous areas in OS pages to free back immediately",
+	 256, 0, 0)
+
 DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
	 "max-reload-search-insns",
	 "The maximum number of instructions to search backward when looking for equivalent reload",
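The trickiest part of the new release_pages loop is unlinking a whole contiguous run from the singly linked free_pages list while keeping the predecessor pointer valid for the next iteration. The same pattern on a simplified stand-in structure (hypothetical names, no mmap involved — a sketch of the list surgery only, not the actual GCC code):

```c
#include <stdlib.h>

/* Simplified stand-in for ggc-page's page_entry: one free chunk,
   START bytes into a pretend address space, BYTES long.  */
struct entry
{
  size_t start, bytes;
  struct entry *next;
};

/* Walk a free list sorted by address; when a run of adjacent entries
   reaches FREE_UNIT bytes, unlink and free the whole run, keeping
   PREV valid across the removal.  Returns total bytes released.  */
static size_t
release_runs (struct entry **head, size_t free_unit)
{
  size_t released = 0;
  struct entry *p = *head, *prev = NULL;
  while (p)
    {
      struct entry *start_p = p, *newprev = prev;
      size_t start = p->start, len = 0;
      while (p && p->start == start + len)
        {
          len += p->bytes;
          newprev = p;   /* last entry of the run so far */
          p = p->next;
        }
      if (len >= free_unit)
        {
          /* Free every entry of the run, then splice it out: PREV
             still points before the run, P just past it.  */
          while (start_p != p)
            {
              struct entry *next = start_p->next;
              free (start_p);
              start_p = next;
            }
          if (prev)
            prev->next = p;
          else
            *head = p;
          released += len;
          continue;   /* PREV is unchanged: the run between is gone */
        }
      prev = newprev;
    }
  return released;
}
```

Note that when a run is released, `prev` is deliberately not advanced, because the entries between `prev` and `p` no longer exist; when a run is kept, `prev` jumps to the run's last entry.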
From: Andi Kleen <ak@linux.intel.com>

This implements the freeing back of large chunks in the ggc madvise path
Richard Guenther asked for. This way on systems with limited
address space malloc() and other allocators still have
a chance to get back at some of the memory ggc freed. The
fragmented pages are still just given back, but the address space
stays allocated.

I tried freeing only aligned 2MB areas to optimize for 2MB huge
pages, but the hit rate was quite low, so I switched to 1MB+
unaligned areas. The target size is a param now.

Passed bootstrap and testing on x86_64-linux

gcc/:

2011-10-18  Andi Kleen  <ak@linux.intel.com>

	* ggc-page.c (release_pages): First free large continuous
	chunks in the madvise path.
	* params.def (GGC_FREE_UNIT): Add.
	* doc/invoke.texi (ggc-free-unit): Add.
---
 gcc/doc/invoke.texi |    5 +++++
 gcc/ggc-page.c      |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/params.def      |    5 +++++
 3 files changed, 58 insertions(+), 0 deletions(-)