diff mbox

[PR,target/65103,1/3] Fix cost of PIC register in ix86_address_cost

Message ID 20150310145446.GB27860@msticlxl57.ims.intel.com
State New
Headers show

Commit Message

Ilya Enkovich March 10, 2015, 2:54 p.m. UTC
Hi,

This patches set addresses the problem of inefficient usage of x86 addressing modes in PIC.  Looking into generated 32bit PIC assembler we may see lots of 'leal  <symbol>@GOTOFF, %reg' instructions.  Almost in all cases this constant value may be propagated into following address operand, but it doesn't happen for various reasons.

I scanned assembler generated for SPEC2000 on O2 and Ofast + LTO to measure patches effect.  Here are results for the whole patches set:

Test         O2 ref     patched    Ofast + LTO ref   patched
164.gzip        12      0 (-100%)            39      0 (-100%)
175.vpr         0       0 (-0%)              4       0 (-100%)
176.gcc         141     6 (-96%)             294     10 (-97%)
181.mcf         4       0 (-100%)            4       2 (-50%)
186.crafty      28      0 (-100%)            277     16 (-95%)
197.parser      18      0 (-100%)            20      1 (-95%)
252.eon         593     0 (-100%)            410     0 (-100%)
253.perlbmk     2       0 (-100%)            29      0 (-100%)
254.gap         46      0 (-100%)            294     77 (-74%)
255.vortex      7       0 (-100%)            21      3 (-86%)
256.bzip2       7       0 (-100%)            15      0 (-100%)
300.twolf       0       0 (-0%)              1       13 (+1200%)
168.wupwise     0       0 (-0%)              0       0 (-0%)
171.swim        8       1 (-88%)             3       1 (-67%)
172.mgrid       0       0 (-0%)              3       0 (-100%)
173.applu       25      1 (-96%)             197     85 (-57%)
177.mesa        1       0 (-100%)            3       0 (-100%)
178.galgel      0       0 (-0%)              5       0 (-100%)
179.art         4       0 (-100%)            4       0 (-100%)
183.equake      5       0 (-100%)            4       0 (-100%)
187.facerec     38      0 (-100%)            32      10 (-69%)
188.ammp        2       0 (-100%)            2       0 (-100%)
189.lucas       15      0 (-100%)            11      0 (-100%)
191.fma3d       211     0 (-100%)            1480    172 (-89%)
200.sixtrack    677     93 (-87%)            1013    67 (-94%)
301.apsi        22      1 (-96%)             12      1 (-92%)

Performance results (Ofast + LTO) measured on Avoton server:

SPEC2000INT +0.81%
SPEC2000FP  +0.19%


The first patch fixes ix86_address_cost.  Previous change in this function was made after pseudo PIC register introduction and it tried to keep zero cost for PIC register.  Unfortunately it doesn't work for cases when PIC register is the only register used in address.  Thus addresses [pic_reg + const] and [pic_reg + reg] get the same cost 1 which doesn't match the goal and makes fwprop to reject propagation.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-03-10  Ilya Enkovich  <ilya.enkovich@intel.com>

	PR target/65103
	* gcc/config/i386/i386.c (ix86_address_cost): Fix cost of a PIC
	register.

gcc/testsuite/

2015-03-10  Ilya Enkovich  <ilya.enkovich@intel.com>

	PR target/65103
	* gcc.target/i386/pr65103-1.c: New.

Comments

Jakub Jelinek March 10, 2015, 3:17 p.m. UTC | #1
On Tue, Mar 10, 2015 at 05:54:46PM +0300, Ilya Enkovich wrote:
> This patches set addresses the problem of inefficient usage of x86 addressing modes in PIC.  Looking into generated 32bit PIC assembler we may see lots of 'leal  <symbol>@GOTOFF, %reg' instructions.  Almost in all cases this constant value may be propagated into following address operand, but it doesn't happen for various reasons.
> 
> I scanned assembler generated for SPEC2000 on O2 and Ofast + LTO to measure patches effect.  Here are results for the whole patches set:
> 
> Test         O2 ref     patched    Ofast + LTO ref   patched
> 164.gzip        12      0 (-100%)            39      0 (-100%)
> 175.vpr         0       0 (-0%)              4       0 (-100%)
> 176.gcc         141     6 (-96%)             294     10 (-97%)
> 181.mcf         4       0 (-100%)            4       2 (-50%)
...
The 0s and -100% look very suspicious to me.

> 2015-03-10  Ilya Enkovich  <ilya.enkovich@intel.com>
> 
> 	PR target/65103
> 	* gcc/config/i386/i386.c (ix86_address_cost): Fix cost of a PIC
> 	register.

Avoid the gcc/ prefix in the ChangeLog entry.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr65103-1.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-m32 -O2 -fPIE" } */

No -m32 in dg-options please.  As the test is guarded with
ia32, there is no need for that either.
But -fPIE needs to be guarded too by effective target pie.

> +/* { dg-final { scan-assembler-not "GOTOFF," } } */
> +
> +typedef struct S
> +{
> +  int a;
> +  int sum;
> +  int delta;
> +} S;
> +
> +S gs;
> +int global_opt (int max)
> +{
> +  while (gs.sum < max)
> +    gs.sum += gs.delta;
> +  return gs.a;
> +}

	Jakub
diff mbox

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ab8f03a..06eacd0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12931,30 +12931,27 @@  ix86_address_cost (rtx x, machine_mode, addr_space_t, bool)
   if (parts.index && GET_CODE (parts.index) == SUBREG)
     parts.index = SUBREG_REG (parts.index);
 
-  /* Attempt to minimize number of registers in the address.  */
-  if ((parts.base
-       && (!REG_P (parts.base) || REGNO (parts.base) >= FIRST_PSEUDO_REGISTER))
-      || (parts.index
-	  && (!REG_P (parts.index)
-	      || REGNO (parts.index) >= FIRST_PSEUDO_REGISTER)))
-    cost++;
+  /* Attempt to minimize number of registers in the address.
 
-  /* When address base or index is "pic_offset_table_rtx" we don't increase
+     When address base or index is "pic_offset_table_rtx" we don't increase
      address cost.  When a memopt with "pic_offset_table_rtx" is not invariant
      itself it most likely means that base or index is not invariant.
      Therefore only "pic_offset_table_rtx" could be hoisted out, which is not
      profitable for x86.  */
   if (parts.base
-      && (current_pass->type == GIMPLE_PASS
-	  || (!pic_offset_table_rtx
-	      || REGNO (pic_offset_table_rtx) != REGNO(parts.base)))
       && (!REG_P (parts.base) || REGNO (parts.base) >= FIRST_PSEUDO_REGISTER)
-      && parts.index
       && (current_pass->type == GIMPLE_PASS
-	  || (!pic_offset_table_rtx
-	      || REGNO (pic_offset_table_rtx) != REGNO(parts.index)))
+	  || !pic_offset_table_rtx
+	  || !REG_P (parts.base)
+	  || REGNO (pic_offset_table_rtx) != REGNO (parts.base)))
+    cost++;
+
+  if (parts.index
       && (!REG_P (parts.index) || REGNO (parts.index) >= FIRST_PSEUDO_REGISTER)
-      && parts.base != parts.index)
+      && (current_pass->type == GIMPLE_PASS
+	  || !pic_offset_table_rtx
+	  || !REG_P (parts.index)
+	  || REGNO (pic_offset_table_rtx) != REGNO (parts.index)))
     cost++;
 
   /* AMD-K6 don't like addresses with ModR/M set to 00_xxx_100b,
diff --git a/gcc/testsuite/gcc.target/i386/pr65103-1.c b/gcc/testsuite/gcc.target/i386/pr65103-1.c
new file mode 100644
index 0000000..a3284b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65103-1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-m32 -O2 -fPIE" } */
+/* { dg-final { scan-assembler-not "GOTOFF," } } */
+
+typedef struct S
+{
+  int a;
+  int sum;
+  int delta;
+} S;
+
+S gs;
+int global_opt (int max)
+{
+  while (gs.sum < max)
+    gs.sum += gs.delta;
+  return gs.a;
+}