Patchwork [4/5] Add a freeing threshold for the garbage collector.

login
register
mail settings
Submitter Andi Kleen
Date Oct. 9, 2011, 7:55 p.m.
Message ID <1318190141-1220-5-git-send-email-andi@firstfloor.org>
Download mbox | patch
Permalink /patch/118619/
State New
Headers show

Comments

Andi Kleen - Oct. 9, 2011, 7:55 p.m.
From: Andi Kleen <ak@linux.intel.com>

Add a threshold to avoid freeing pages back too early to the OS.
This avoid virtual memory map fragmentation.

Based on a idea from Honza

ggc/doc/:

2011-10-08   Andi Kleen <ak@linux.intel.com>

	PR other/50636
	* invoke.texi (ggc-free-threshold, ggc-free-min): Add.

ggc/:

2011-10-08   Andi Kleen <ak@linux.intel.com>

	PR other/50636
	* ggc-page.c (ggc_collect): Add free threshold.
	* params.def (GGC_FREE_THRESHOLD, GGC_FREE_MIN): Add.
---
 gcc/doc/invoke.texi |   11 +++++++++++
 gcc/ggc-page.c      |   13 +++++++++----
 gcc/params.def      |   10 ++++++++++
 3 files changed, 30 insertions(+), 4 deletions(-)
Richard Guenther - Oct. 10, 2011, 10:20 a.m.
On Sun, Oct 9, 2011 at 9:55 PM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> Add a threshold to avoid freeing pages back too early to the OS.
> This avoid virtual memory map fragmentation.
>
> Based on a idea from Honza

Less than 20% looks high.  Shouldn't ggc-free-min be enough to
avoid fragmentation?

This will hide gc bugs with always-collect (ggc-checking), so
the parameter(s) need to be adjusted for that case to always
give pages back.  The current values should probably be printed
where the two existing ones are printed as well (with -v).

Thanks,
Richard.

> ggc/doc/:
>
> 2011-10-08   Andi Kleen <ak@linux.intel.com>
>
>        PR other/50636
>        * invoke.texi (ggc-free-threshold, ggc-free-min): Add.
>
> ggc/:
>
> 2011-10-08   Andi Kleen <ak@linux.intel.com>
>
>        PR other/50636
>        * ggc-page.c (ggc_collect): Add free threshold.
>        * params.def (GGC_FREE_THRESHOLD, GGC_FREE_MIN): Add.
> ---
>  gcc/doc/invoke.texi |   11 +++++++++++
>  gcc/ggc-page.c      |   13 +++++++++----
>  gcc/params.def      |   10 ++++++++++
>  3 files changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index ef7ac68..6557f66 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -8837,6 +8837,17 @@ very large effectively disables garbage collection.  Setting this
>  parameter and @option{ggc-min-expand} to zero causes a full collection
>  to occur at every opportunity.
>
> +@item ggc-free-threshold
> +
> +Only free memory back to the system when it would free more than this
> +many percent of the total allocated memory. Default is 20 percent.
> +This avoids memory fragmentation.
> +
> +@item ggc-free-min
> +
> +Only free memory back to the system when it would free more than this.
> +Unit is kilobytes.
> +
>  @item max-reload-search-insns
>  The maximum number of instruction reload should look backward for equivalent
>  register.  Increasing values mean more aggressive optimization, making the
> diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
> index 6e08cda..cd1c41a 100644
> --- a/gcc/ggc-page.c
> +++ b/gcc/ggc-page.c
> @@ -1968,14 +1968,19 @@ ggc_collect (void)
>   if (GGC_DEBUG_LEVEL >= 2)
>     fprintf (G.debug_file, "BEGIN COLLECTING\n");
>
> +  /* Release the pages we freed the last time we collected, but didn't
> +     reuse in the interim.  But only do this if this would free a
> +     reasonable number of pages. Otherwise hold on to them
> +     to avoid virtual memory fragmentation. */
> +  if (G.bytes_mapped - G.allocated >=
> +       (PARAM_VALUE (GGC_FREE_THRESHOLD) / 100.0) * G.bytes_mapped &&
> +      G.bytes_mapped - G.allocated >= (size_t)PARAM_VALUE (GGC_FREE_MIN) * 1024)
> +    release_pages ();
> +
>   /* Zero the total allocated bytes.  This will be recalculated in the
>      sweep phase.  */
>   G.allocated = 0;
>
> -  /* Release the pages we freed the last time we collected, but didn't
> -     reuse in the interim.  */
> -  release_pages ();
> -
>   /* Indicate that we've seen collections at this context depth.  */
>   G.context_depth_collections = ((unsigned long)1 << (G.context_depth + 1)) - 1;
>
> diff --git a/gcc/params.def b/gcc/params.def
> index 5e49c48..ca28715 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -561,6 +561,16 @@ DEFPARAM(GGC_MIN_HEAPSIZE,
>  #undef GGC_MIN_EXPAND_DEFAULT
>  #undef GGC_MIN_HEAPSIZE_DEFAULT
>
> +DEFPARAM(GGC_FREE_THRESHOLD,
> +       "ggc-free-threshold",
> +       "Dont free memory back to system less this percent of the total memory",
> +       20, 0, 100)
> +
> +DEFPARAM(GGC_FREE_MIN,
> +        "ggc-free-min",
> +        "Dont free less memory than this back to the system, in kilobytes",
> +        8 * 1024, 0, 0)
> +
>  DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
>         "max-reload-search-insns",
>         "The maximum number of instructions to search backward when looking for equivalent reload",
> --
> 1.7.5.4
>
>
Andi Kleen - Oct. 10, 2011, 1:58 p.m.
On Mon, Oct 10, 2011 at 12:20:53PM +0200, Richard Guenther wrote:
> On Sun, Oct 9, 2011 at 9:55 PM, Andi Kleen <andi@firstfloor.org> wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> >
> > Add a threshold to avoid freeing pages back too early to the OS.
> > This avoid virtual memory map fragmentation.
> >
> > Based on a idea from Honza
> 
> Less than 20% looks high.  Shouldn't ggc-free-min be enough to
> avoid fragmentation?

It depends on the working set. If there's more to garbage collect
than max(ggc-free-min, threshold*total) a host without MADV_DONTNEED
will get holes. And ggc-free-min isn't very much on a large 
build.

So it seems safer to me to have a threshold which adjusts for large
working sets. What value do you prefer instead of 20%? (or just 0)

> 
> This will hide gc bugs with always-collect (ggc-checking), so
> the parameter(s) need to be adjusted for that case to always
> give pages back.  The current values should probably be printed
> where the two existing ones are printed as well (with -v).

Will fix.
-Andi
Richard Guenther - Oct. 10, 2011, 2:53 p.m.
On Mon, Oct 10, 2011 at 3:58 PM, Andi Kleen <andi@firstfloor.org> wrote:
> On Mon, Oct 10, 2011 at 12:20:53PM +0200, Richard Guenther wrote:
>> On Sun, Oct 9, 2011 at 9:55 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> > From: Andi Kleen <ak@linux.intel.com>
>> >
>> > Add a threshold to avoid freeing pages back too early to the OS.
>> > This avoid virtual memory map fragmentation.
>> >
>> > Based on a idea from Honza
>>
>> Less than 20% looks high.  Shouldn't ggc-free-min be enough to
>> avoid fragmentation?
>
> It depends on the working set. If there's more to garbage collect
> than max(ggc-free-min, threshold*total) a host without MADV_DONTNEED
> will get holes. And ggc-free-min isn't very much on a large
> build.
>
> So it seems safer to me to have a threshold which adjusts for large
> working sets. What value do you prefer instead of 20%? (or just 0)

I'm not sure honestly - 10% maybe?  I realize it's quite arbitrary ...

>>
>> This will hide gc bugs with always-collect (ggc-checking), so
>> the parameter(s) need to be adjusted for that case to always
>> give pages back.  The current values should probably be printed
>> where the two existing ones are printed as well (with -v).
>
> Will fix.
> -Andi
>

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ef7ac68..6557f66 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8837,6 +8837,17 @@  very large effectively disables garbage collection.  Setting this
 parameter and @option{ggc-min-expand} to zero causes a full collection
 to occur at every opportunity.
 
+@item ggc-free-threshold
+
+Only free memory back to the system when it would free more than this
+many percent of the total allocated memory. Default is 20 percent.
+This avoids memory fragmentation.
+
+@item ggc-free-min
+
+Only free memory back to the system when it would free more than this.
+Unit is kilobytes. 
+
 @item max-reload-search-insns
 The maximum number of instruction reload should look backward for equivalent
 register.  Increasing values mean more aggressive optimization, making the
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 6e08cda..cd1c41a 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -1968,14 +1968,19 @@  ggc_collect (void)
   if (GGC_DEBUG_LEVEL >= 2)
     fprintf (G.debug_file, "BEGIN COLLECTING\n");
 
+  /* Release the pages we freed the last time we collected, but didn't
+     reuse in the interim.  But only do this if this would free a 
+     reasonable number of pages. Otherwise hold on to them
+     to avoid virtual memory fragmentation. */
+  if (G.bytes_mapped - G.allocated >= 
+	(PARAM_VALUE (GGC_FREE_THRESHOLD) / 100.0) * G.bytes_mapped &&
+      G.bytes_mapped - G.allocated >= (size_t)PARAM_VALUE (GGC_FREE_MIN) * 1024)
+    release_pages ();
+
   /* Zero the total allocated bytes.  This will be recalculated in the
      sweep phase.  */
   G.allocated = 0;
 
-  /* Release the pages we freed the last time we collected, but didn't
-     reuse in the interim.  */
-  release_pages ();
-
   /* Indicate that we've seen collections at this context depth.  */
   G.context_depth_collections = ((unsigned long)1 << (G.context_depth + 1)) - 1;
 
diff --git a/gcc/params.def b/gcc/params.def
index 5e49c48..ca28715 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -561,6 +561,16 @@  DEFPARAM(GGC_MIN_HEAPSIZE,
 #undef GGC_MIN_EXPAND_DEFAULT
 #undef GGC_MIN_HEAPSIZE_DEFAULT
 
+DEFPARAM(GGC_FREE_THRESHOLD,
+	"ggc-free-threshold",
+	"Dont free memory back to system less this percent of the total memory",
+	20, 0, 100)
+
+DEFPARAM(GGC_FREE_MIN,
+	 "ggc-free-min",
+	 "Dont free less memory than this back to the system, in kilobytes",
+	 8 * 1024, 0, 0)
+
 DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
 	 "max-reload-search-insns",
 	 "The maximum number of instructions to search backward when looking for equivalent reload",