Patchwork RFA: patch to solve PR48455 and improve code size for -Os

Submitter Vladimir Makarov
Date Aug. 16, 2011, 8:07 p.m.
Message ID <4E4ACE1F.8090004@redhat.com>
Permalink /patch/110218/
State New

Comments

Vladimir Makarov - Aug. 16, 2011, 8:07 p.m.
After a lot of thinking and some experiments, I did not find a better
way to improve code size noticeably (by about 0.2% - 0.3% on ARM
SPEC2000) than using non-regional RA for -Os.

So the patch makes one-region allocation the default for -Os.  I don't
use the function optimize_function_for_size_p because I'd like to keep
the possibility of setting up regions independently of the -O option used.
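
To illustrate the default selection described above, here is a minimal
sketch in C.  It mirrors the conditional the patch adds to
process_options, but the helper name resolve_ira_region and its
parameters are hypothetical: in GCC the logic operates directly on the
globals flag_ira_region, optimize and optimize_size.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum from the patch.  IRA_REGION_AUTODETECT stands for
   "no -fira-region option was given on the command line".  */
enum ira_region
{
  IRA_REGION_ONE,
  IRA_REGION_ALL,
  IRA_REGION_MIXED,
  IRA_REGION_AUTODETECT
};

/* Hypothetical helper (not part of GCC): pick the effective region.
   REQUESTED is the value left by option parsing, OPTIMIZE is the -O
   level, and OPTIMIZE_SIZE is true for -Os.  */
enum ira_region
resolve_ira_region (enum ira_region requested, int optimize,
                    bool optimize_size)
{
  assert (optimize >= 0);

  /* An explicit -fira-region= value always wins.  */
  if (requested != IRA_REGION_AUTODETECT)
    return requested;

  /* Otherwise use one region for -Os and -O0, mixed regions elsewhere.  */
  return (optimize_size || !optimize) ? IRA_REGION_ONE : IRA_REGION_MIXED;
}
```

With this shape, a call such as resolve_ira_region (IRA_REGION_MIXED, 2,
true) still yields IRA_REGION_MIXED, which is the "set up regions
independently of the -O option" property mentioned above.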

This final patch removes the ARM code size regression reported at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48455.

The patch was successfully bootstrapped on x86-64 and ppc64 (actually 
the generated code is not changed for -O2).

Is it ok to commit?

2011-08-16  Vladimir Makarov <vmakarov@redhat.com>

         PR rtl-optimization/48455
         * doc/invoke.texi (-fira-region): Document default values.

         * flag-types.h (enum ira_region): Add new value
         IRA_REGION_AUTODETECT.

         * common.opt (fira-region): Set up initial value to
         IRA_REGION_AUTODETECT.

         * toplev.c (process_options): Set up flag_ira_region depending on
         -O options.

         * ira.c (ira): Remove optimize guard for ira_build.
Sergey Ostanevich - Aug. 18, 2011, 3:04 p.m.
On 16.08.2011, at 23:07, Vladimir Makarov <vmakarov@redhat.com> wrote:

> After a lot of thinking and some experiments, I did not find a better solution to considerably (like on 0.2% - 0.3% on ARM SPEC2000) improve code size than use of non-regional RA for -Os.
> 

Did you consider the change in compile time caused by single-region RA?

Vladimir Makarov - Aug. 18, 2011, 3:17 p.m.
On 08/18/2011 11:04 AM, Sergey Ostanevich wrote:
>
> On 16.08.2011, at 23:07, Vladimir Makarov <vmakarov@redhat.com> wrote:
>
>> After a lot of thinking and some experiments, I did not find a better solution to considerably (like on 0.2% - 0.3% on ARM SPEC2000) improve code size than use of non-regional RA for -Os.
>>
> did you consider change of compile time because of single region RA?
>
Yes, one-region RA is the fastest one.  I would guess about a 1% compile
time improvement, although it depends on many factors (target, benchmark, etc.).
Vladimir Makarov - Aug. 29, 2011, 4:55 p.m.
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01349.html
Jeff Law - Aug. 29, 2011, 5:04 p.m.
On 08/29/11 10:55, Vladimir Makarov wrote:
> http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01349.html
OK.
jeff

Patch

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 177573)
+++ doc/invoke.texi	(working copy)
@@ -6661,13 +6661,16 @@  rule generates a better code.
 Use specified regions for the integrated register allocator.  The
 @var{region} argument should be one of @code{all}, @code{mixed}, or
 @code{one}.  The first value means using all loops as register
-allocation regions, the second value which is the default means using
-all loops except for loops with small register pressure as the
-regions, and third one means using all function as a single region.
-The first value can give best result for machines with small size and
-irregular register set, the third one results in faster and generates
-decent code and the smallest size code, and the default value usually
-give the best results in most cases and for most architectures.
+allocation regions, the second value which is enabled by default when
+compiling with optimization for speed (@option{-O}, @option{-O2},
+@dots{}) means using all loops except for loops with small register
+pressure as the regions, and third one which is enabled by default for
+@option{-Os} or @option{-O0} means using all function as a single
+region.  The first value can give best result for machines with small
+size and irregular register set, the third one results in faster and
+generates decent code and the smallest size code, and the second value
+usually give the best results in most cases and for most
+architectures.
 
 @item -fira-loop-pressure
 @opindex fira-loop-pressure
Index: toplev.c
===================================================================
--- toplev.c	(revision 177573)
+++ toplev.c	(working copy)
@@ -1289,6 +1289,11 @@  process_options (void)
 	   "and -ftree-loop-linear)");
 #endif
 
+  /* One region RA really helps to decrease the code size.  */
+  if (flag_ira_region == IRA_REGION_AUTODETECT)
+    flag_ira_region
+      = optimize_size || !optimize ? IRA_REGION_ONE : IRA_REGION_MIXED;
+
   /* Unrolling all loops implies that standard loop unrolling must also
      be done.  */
   if (flag_unroll_all_loops)
Index: flag-types.h
===================================================================
--- flag-types.h	(revision 177573)
+++ flag-types.h	(working copy)
@@ -118,7 +118,11 @@  enum ira_region
 {
   IRA_REGION_ONE,
   IRA_REGION_ALL,
-  IRA_REGION_MIXED
+  IRA_REGION_MIXED,
+  /* This value means that there were no options -fira-region on the
+     command line and that we should choose a value depending on the
+     used -O option.  */
+  IRA_REGION_AUTODETECT
 };
 
 /* The options for excess precision.  */
Index: common.opt
===================================================================
--- common.opt	(revision 177573)
+++ common.opt	(working copy)
@@ -1313,7 +1313,7 @@  EnumValue
 Enum(ira_algorithm) String(priority) Value(IRA_ALGORITHM_PRIORITY)
 
 fira-region=
-Common Joined RejectNegative Enum(ira_region) Var(flag_ira_region) Init(IRA_REGION_MIXED)
+Common Joined RejectNegative Enum(ira_region) Var(flag_ira_region) Init(IRA_REGION_AUTODETECT)
 -fira-region=[one|all|mixed] Set regions for IRA
 
 Enum
Index: ira.c
===================================================================
--- ira.c	(revision 177573)
+++ ira.c	(working copy)
@@ -3617,9 +3617,8 @@  ira (FILE *f)
 
   if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
     fprintf (ira_dump_file, "Building IRA IR\n");
-  loops_p = ira_build (optimize
-		       && (flag_ira_region == IRA_REGION_ALL
-			   || flag_ira_region == IRA_REGION_MIXED));
+  loops_p = ira_build (flag_ira_region == IRA_REGION_ALL
+		       || flag_ira_region == IRA_REGION_MIXED);
 
   ira_assert (ira_conflicts_p || !loops_p);
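
As a reading aid for the ira.c hunk, the following C sketch compares the
argument passed to ira_build before and after the patch.  The function
names loops_wanted_old and loops_wanted_new are hypothetical and exist
only for this illustration.  The sketch assumes the default case where
no -fira-region option was given, so the old code sees the previous
initial value IRA_REGION_MIXED and the new code sees the value chosen
in process_options.

```c
#include <assert.h>
#include <stdbool.h>

/* Region values relevant to the ira_build call.  */
enum ira_region
{
  IRA_REGION_ONE,
  IRA_REGION_ALL,
  IRA_REGION_MIXED
};

/* Hypothetical name: the condition used before the patch, with an
   explicit guard on the optimization level.  */
bool
loops_wanted_old (int optimize, enum ira_region region)
{
  return optimize
         && (region == IRA_REGION_ALL || region == IRA_REGION_MIXED);
}

/* Hypothetical name: the condition used after the patch.  The guard is
   gone because the default for -O0 is already IRA_REGION_ONE.  */
bool
loops_wanted_new (enum ira_region region)
{
  return region == IRA_REGION_ALL || region == IRA_REGION_MIXED;
}
```

Under these assumptions the two conditions agree for -O0 (both false)
and for -O2 (both true).  They differ only for -Os, where the old
default of IRA_REGION_MIXED requested loop regions and the new default
of IRA_REGION_ONE does not.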