RFC: ix86 / x86_64 register pressure aware scheduling

Message ID CAKdSQZmWOf2UXsyVgx-GrvUoaVK0aYdwH1nHPKyjLUY6c2gajA@mail.gmail.com
State New

Commit Message

Igor Zamyatin April 17, 2013, 12:54 p.m.
These changes are what we tried here at Intel after a bunch of
changes that made the pre-alloc scheduler more stable. We benchmarked
both register pressure algorithms and the overall result was not that

We saw a number of regressions, e.g. for the option set "-mavx -O3
-funroll-loops -ffast-math -march=corei7" (for spec2000 not only lucas
but also applu regressed). The overall gain is negative even for
x86_64. For 32 bits the picture was worse, if I remember correctly.

In general, we doubt that this feature is good for OOO (out-of-order)
machines.


-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org
[mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Steven Bosscher
Sent: Monday, April 15, 2013 11:34 PM
To: GCC Patches
Cc: H.J. Lu; Uros Bizjak; Jan Hubicka
Subject: [patch] RFC: ix86 / x86_64 register pressure aware scheduling


The attached patch enables register pressure aware scheduling for the
ix86 and x86_64 targets. It uses the optimistic algorithm to avoid
being overly conservative.

This is the same as what other CISCy targets, like s390, also do.

The motivation for this patch is the excessive spilling I've observed
in a few test cases with relatively large basic blocks, e.g.
encryption algorithms and codecs. The patch passes bootstrap+testing
on x86_64-unknown-linux-gnu and i686-unknown-linux-gnu, with a few new
failures due to PR56950.
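
As a rough illustration of the kind of code meant here (this is not one
of the test cases from this thread), the sketch below is a ChaCha-style
double round written so that the arithmetic forms one long straight-line
block with 16 values live at once, more than the eight general-purpose
registers of 32-bit x86. The names sketch_mix_block, rotl32 and QR are
hypothetical, and the command lines in the comment are only a suggested
way to compare the generated assembly; no measurements are implied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n must be in 1..31.  */
static inline uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* ChaCha-style quarter round, used here only as a convenient source of
   dependent add/xor/rotate chains.  */
#define QR(a, b, c, d)                  \
  do {                                  \
    a += b; d ^= a; d = rotl32 (d, 16); \
    c += d; b ^= c; b = rotl32 (b, 12); \
    a += b; d ^= a; d = rotl32 (d, 8);  \
    c += d; b ^= c; b = rotl32 (b, 7);  \
  } while (0)

/* Hypothetical kernel: 16 loads, 8 quarter rounds, 16 stores.  The
   arithmetic between the loads and the stores contains no branches.

   One possible way to compare the effect of the options discussed here:
     gcc -O2 -S -fno-schedule-insns mix.c
     gcc -O2 -S -fschedule-insns -fsched-pressure \
         --param sched-pressure-algorithm=2 mix.c
   Whether the spill count differs depends on compiler version and
   target.  */
void
sketch_mix_block (uint32_t s[16])
{
  assert (s != NULL);

  uint32_t x0 = s[0], x1 = s[1], x2 = s[2], x3 = s[3];
  uint32_t x4 = s[4], x5 = s[5], x6 = s[6], x7 = s[7];
  uint32_t x8 = s[8], x9 = s[9], x10 = s[10], x11 = s[11];
  uint32_t x12 = s[12], x13 = s[13], x14 = s[14], x15 = s[15];

  /* Column round.  */
  QR (x0, x4, x8, x12);
  QR (x1, x5, x9, x13);
  QR (x2, x6, x10, x14);
  QR (x3, x7, x11, x15);

  /* Diagonal round.  */
  QR (x0, x5, x10, x15);
  QR (x1, x6, x11, x12);
  QR (x2, x7, x8, x13);
  QR (x3, x4, x9, x14);

  s[0] = x0; s[1] = x1; s[2] = x2; s[3] = x3;
  s[4] = x4; s[5] = x5; s[6] = x6; s[7] = x7;
  s[8] = x8; s[9] = x9; s[10] = x10; s[11] = x11;
  s[12] = x12; s[13] = x13; s[14] = x14; s[15] = x15;
}
```

Counting the stack loads and stores in the two assembly outputs is one
simple proxy for how much spilling each configuration produces.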

Off-list, Uros, Honza and others have already looked at the patch and
benchmarked it. For x86_64 there is an overall improvement for SPEC2k
except that lucas regresses, but such a preliminary result is IMHO
very promising.

Comments/suggestions welcome :-)

	* common/config/i386/i386-common.c (ix86_option_optimization_table):
	Do not disable insns scheduling.  Enable register pressure aware
	scheduling.
	* config/i386/i386.c (ix86_option_override): Use the alternative,
	optimistic scheduling-pressure algorithm by default.


Index: common/config/i386/i386-common.c
--- common/config/i386/i386-common.c	(revision 197941)
+++ common/config/i386/i386-common.c	(working copy)
@@ -707,9 +707,15 @@  static const struct default_options ix86
     /* Enable redundant extension instructions removal at -O2 and higher.  */
     { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
-    /* Turn off -fschedule-insns by default.  It tends to make the
-       problem with not enough registers even worse.  */
-    { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 },
+    /* Enable -fsched-pressure by default for all optimization levels.
+       Before SCHED_PRESSURE_MODEL register-pressure aware schedule was
+       available, -fschedule-insns was turned off completely by default for
+       this port, because scheduling before register allocation tends to
+       make the problem with not enough registers even worse.  However,
+       for very long basic blocks the scheduler can help bring register
+       pressure down significantly, and SCHED_PRESSURE_MODEL is still
+       conservative enough to avoid creating excessive register pressure.  */
+    { OPT_LEVELS_ALL, OPT_fsched_pressure, NULL, 1 },
Index: config/i386/i386.c
--- config/i386/i386.c	(revision 197941)
+++ config/i386/i386.c	(working copy)
@@ -3936,6 +3936,10 @@  ix86_option_override (void)
   ix86_option_override_internal (true);
+  /* Use the alternative scheduling-pressure algorithm by default.  */
+  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, 2,
+			 global_options.x_param_values,
+			 global_options_set.x_param_values);
   /* This needs to be done at start up.  It's convenient to do it here.  */
   register_pass (&insert_vzeroupper_info);
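
A note on the second hunk: as far as I understand GCC's params code,
maybe_set_param_value only installs a default and leaves the parameter
alone when the user has already given an explicit --param on the command
line. The standalone sketch below models that behavior. Everything
prefixed with sketch_ is a hypothetical stand-in (the real tables are
global_options.x_param_values and global_options_set.x_param_values),
and the generic default of 1 for the algorithm is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's compiler_param enumeration.  */
enum sketch_param
{
  SKETCH_SCHED_PRESSURE_ALGORITHM,
  SKETCH_PARAM_COUNT
};

/* Model of an explicit "--param name=value" from the user: store the
   value and remember that it was set explicitly.  */
void
sketch_user_set_param (enum sketch_param num, int value,
                       int *params, int *params_set)
{
  assert (num < SKETCH_PARAM_COUNT);
  params[num] = value;
  params_set[num] = 1;
}

/* Model of a target default: applied only if the user has not set the
   parameter explicitly.  */
void
sketch_maybe_set_param_value (enum sketch_param num, int value,
                              int *params, int *params_set)
{
  assert (num < SKETCH_PARAM_COUNT);
  if (!params_set[num])
    params[num] = value;
}
```

Under this model, the patch would make algorithm 2 the i386 default
while "--param sched-pressure-algorithm=1" on the command line would
still select the original algorithm.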