Patchwork PATCH: Pad short functions for Atom

login
register
mail settings
Submitter Uros Bizjak
Date Sept. 23, 2010, 8:44 a.m.
Message ID <AANLkTi=yNt_bLffcOxDmLR55Pt3F3BNJw5pKvUByANGP@mail.gmail.com>
Download mbox | patch
Permalink /patch/65510/
State New
Headers show

Comments

Uros Bizjak - Sept. 23, 2010, 8:44 a.m.
On Fri, Sep 17, 2010 at 10:53 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

> Here is the updated patch.  OK for trunk?

> gcc/
>
> 2010-09-17  H.J. Lu  <hongjiu.lu@intel.com>
>            Richard Henderson  <rth@redhat.com>
>
>        * config/i386/i386.c (initial_ix86_tune_features): Add
>        X86_TUNE_PAD_SHORT_FUNCTION.
>        (ix86_code_end): Pad with 8 NOPs for TARGET_PAD_SHORT_FUNCTION.
>        (ix86_count_insn): New.
>        (ix86_pad_short_function): Likewise.
>        (ix86_reorg): Support TARGET_PAD_SHORT_FUNCTION.
>
>        * config/i386/i386.h (ix86_tune_indices): Add
>        X86_TUNE_PAD_SHORT_FUNCTION.
>        (TARGET_PAD_SHORT_FUNCTION): New.
>
>        * config/i386/i386.md (UNSPEC_NOPS): New.
>        (nops): Likewise.
>
> gcc/testsuite/
>
> 2010-09-17  H.J. Lu  <hongjiu.lu@intel.com>
>
>        * gcc.target/i386/pad-1.c: New.
>        * gcc.target/i386/pad-2.c: Likewise.
>        * gcc.target/i386/pad-3.c: Likewise.
>        * gcc.target/i386/pad-4.c: Likewise.
>        * gcc.target/i386/pad-5a.c: Likewise.
>        * gcc.target/i386/pad-5b.c: Likewise.
>        * gcc.target/i386/pad-6a.c: Likewise.
>        * gcc.target/i386/pad-6b.c: Likewise.
>        * gcc.target/i386/pad-7.c: Likewise.
>        * gcc.target/i386/pad-8.c: Likewise.
>        * gcc.target/i386/pad-9.c: Likewise.
>        * gcc.target/i386/pad-10.c: Likewise.
>

Sorry for late review, but this patch has plenty of annoying issues:

- "nops" pattern should be defined as unspec_volatile.
- generation of insn templates by case is ...  well ... unusual at best.
- output_asm_insn is waay too complex to be used to output constant strings.

Issues with test cases:

- testing of the whole sequence of "nop"s in one line is not
effective, since scan-assembler-not is needed.
- testcases do not need to use "-S" in dg-options.
- the test should call "dg-require-effective-target pic" directive
before -fPIC is used

Attached patch fixes all these annoyances (and a couple of similar
issues throughout i386 testsuite).

Uros.
Uros Bizjak - Sept. 23, 2010, 9:11 a.m.
On Thu, Sep 23, 2010 at 10:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Sep 17, 2010 at 10:53 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>> Here is the updated patch.  OK for trunk?
>
>> gcc/
>>
>> 2010-09-17  H.J. Lu  <hongjiu.lu@intel.com>
>>            Richard Henderson  <rth@redhat.com>
>>
>>        * config/i386/i386.c (initial_ix86_tune_features): Add
>>        X86_TUNE_PAD_SHORT_FUNCTION.
>>        (ix86_code_end): Pad with 8 NOPs for TARGET_PAD_SHORT_FUNCTION.
>>        (ix86_count_insn): New.
>>        (ix86_pad_short_function): Likewise.
>>        (ix86_reorg): Support TARGET_PAD_SHORT_FUNCTION.
>>
>>        * config/i386/i386.h (ix86_tune_indices): Add
>>        X86_TUNE_PAD_SHORT_FUNCTION.
>>        (TARGET_PAD_SHORT_FUNCTION): New.
>>
>>        * config/i386/i386.md (UNSPEC_NOPS): New.
>>        (nops): Likewise.
>>
>> gcc/testsuite/
>>
>> 2010-09-17  H.J. Lu  <hongjiu.lu@intel.com>
>>
>>        * gcc.target/i386/pad-1.c: New.
>>        * gcc.target/i386/pad-2.c: Likewise.
>>        * gcc.target/i386/pad-3.c: Likewise.
>>        * gcc.target/i386/pad-4.c: Likewise.
>>        * gcc.target/i386/pad-5a.c: Likewise.
>>        * gcc.target/i386/pad-5b.c: Likewise.
>>        * gcc.target/i386/pad-6a.c: Likewise.
>>        * gcc.target/i386/pad-6b.c: Likewise.
>>        * gcc.target/i386/pad-7.c: Likewise.
>>        * gcc.target/i386/pad-8.c: Likewise.
>>        * gcc.target/i386/pad-9.c: Likewise.
>>        * gcc.target/i386/pad-10.c: Likewise.
>>
>
> Sorry for late review, but this patch has plenty of annoying issues:
>
> - "nops" pattern should be defined as unspec_volatile.
> - generation of insn templates by case is ...  well ... unusual at best.
> - output_asm_insn is waay too complex to be used to output constant strings.
>
> Issues with test cases:
>
> - testing of the whole sequence of "nop"s in one line is not
> effective, since scan-assembler-not is needed.
> - testcases do not need to use "-S" in dg-options.
> - the test should call "dg-require-effective-target pic" directive
> before -fPIC is used
>
> Attached patch fixes all these annoyances (and a couple of similar
> issues throughout i386 testsuite).

On a related note, can four "xchgw %ax,%ax" be used instead of 8 "nop"
instructions in the code below:

+      /* Pad stack IP move with 4 instructions (two NOPs count
+	 as one instruction.)  */
       if (TARGET_PAD_SHORT_FUNCTION)
-	output_asm_insn ("nop; nop; nop; nop; nop; nop; nop; nop",
-			 xops);
+	{
+	  int i = 8;
+
+	  while (i--)
+	    fputs ("\tnop\n", asm_out_file);
+	}

Uros.
H.J. Lu - Sept. 23, 2010, 12:03 p.m.
On Thu, Sep 23, 2010 at 2:11 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Thu, Sep 23, 2010 at 10:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Fri, Sep 17, 2010 at 10:53 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>> Here is the updated patch.  OK for trunk?
>>
>>> gcc/
>>>
>>> 2010-09-17  H.J. Lu  <hongjiu.lu@intel.com>
>>>            Richard Henderson  <rth@redhat.com>
>>>
>>>        * config/i386/i386.c (initial_ix86_tune_features): Add
>>>        X86_TUNE_PAD_SHORT_FUNCTION.
>>>        (ix86_code_end): Pad with 8 NOPs for TARGET_PAD_SHORT_FUNCTION.
>>>        (ix86_count_insn): New.
>>>        (ix86_pad_short_function): Likewise.
>>>        (ix86_reorg): Support TARGET_PAD_SHORT_FUNCTION.
>>>
>>>        * config/i386/i386.h (ix86_tune_indices): Add
>>>        X86_TUNE_PAD_SHORT_FUNCTION.
>>>        (TARGET_PAD_SHORT_FUNCTION): New.
>>>
>>>        * config/i386/i386.md (UNSPEC_NOPS): New.
>>>        (nops): Likewise.
>>>
>>> gcc/testsuite/
>>>
>>> 2010-09-17  H.J. Lu  <hongjiu.lu@intel.com>
>>>
>>>        * gcc.target/i386/pad-1.c: New.
>>>        * gcc.target/i386/pad-2.c: Likewise.
>>>        * gcc.target/i386/pad-3.c: Likewise.
>>>        * gcc.target/i386/pad-4.c: Likewise.
>>>        * gcc.target/i386/pad-5a.c: Likewise.
>>>        * gcc.target/i386/pad-5b.c: Likewise.
>>>        * gcc.target/i386/pad-6a.c: Likewise.
>>>        * gcc.target/i386/pad-6b.c: Likewise.
>>>        * gcc.target/i386/pad-7.c: Likewise.
>>>        * gcc.target/i386/pad-8.c: Likewise.
>>>        * gcc.target/i386/pad-9.c: Likewise.
>>>        * gcc.target/i386/pad-10.c: Likewise.
>>>
>>
>> Sorry for late review, but this patch has plenty of annoying issues:
>>
>> - "nops" pattern should be defined as unspec_volatile.
>> - generation of insn templates by case is ...  well ... unusual at best.
>> - output_asm_insn is waay too complex to be used to output constant strings.
>>
>> Issues with test cases:
>>
>> - testing of the whole sequence of "nop"s in one line is not
>> effective, since scan-assembler-not is needed.
>> - testcases do not need to use "-S" in dg-options.
>> - the test should call "dg-require-effective-target pic" directive
>> before -fPIC is used
>>
>> Attached patch fixes all these annoyances (and a couple of similar
>> issues throughout i386 testsuite).
>
> On a related note, can four "xchgw %ax,%ax" be used instead of 8 "nop"
> instructions in the code below:
>

No, it needs 8 nops for 4 cycles.

Patch

Index: testsuite/gcc.target/i386/pad-3.c
===================================================================
--- testsuite/gcc.target/i386/pad-3.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-3.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -fno-pic -S" } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom -fno-pic" } */
 /* { dg-final { scan-assembler-not "nop" } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
Index: testsuite/gcc.target/i386/pad-5a.c
===================================================================
--- testsuite/gcc.target/i386/pad-5a.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-5a.c	(working copy)
@@ -1,8 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target ilp32 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop" 1 } } */
-/* { dg-final { scan-assembler-not "nop; nop; nop" } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 2 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 int
Index: testsuite/gcc.target/i386/pad-6b.c
===================================================================
--- testsuite/gcc.target/i386/pad-6b.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-6b.c	(working copy)
@@ -1,8 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop; nop; nop" 1 } } */
-/* { dg-final { scan-assembler-not "nop; nop; nop; nop; nop; nop; nop" } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 6 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 int
Index: testsuite/gcc.target/i386/pr36502.c
===================================================================
--- testsuite/gcc.target/i386/pr36502.c	(revision 164548)
+++ testsuite/gcc.target/i386/pr36502.c	(working copy)
@@ -1,6 +1,6 @@ 
 /* PR target/36502 */
 /* { dg-do compile { target { *-*-darwin* && ilp32 } } } */
-/* { dg-options "-O -fomit-frame-pointer -fno-pic -S" } */
+/* { dg-options "-O -fomit-frame-pointer -fno-pic" } */
 int a;
 void f() {a++;}
 /* { dg-final { scan-assembler-not "esp" } } */
Index: testsuite/gcc.target/i386/pad-7.c
===================================================================
--- testsuite/gcc.target/i386/pad-7.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-7.c	(working copy)
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target ilp32 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
 /* { dg-final { scan-assembler-not "nop" } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
Index: testsuite/gcc.target/i386/pad-9.c
===================================================================
--- testsuite/gcc.target/i386/pad-9.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-9.c	(working copy)
@@ -1,8 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop" 1 } } */
-/* { dg-final { scan-assembler-not "nop; nop; nop; nop; nop" } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 4 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 extern void bar (void);
Index: testsuite/gcc.target/i386/zee.c
===================================================================
--- testsuite/gcc.target/i386/zee.c	(revision 164548)
+++ testsuite/gcc.target/i386/zee.c	(working copy)
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -fzee -S" } */
+/* { dg-options "-O2 -fzee" } */
 /* { dg-final { scan-assembler-not "mov\[\\t \]+\(%\[\^,\]+\),\[\\t \]*\\1" } } */
 int mask[100];
 int foo(unsigned x)
Index: testsuite/gcc.target/i386/pad-2.c
===================================================================
--- testsuite/gcc.target/i386/pad-2.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-2.c	(working copy)
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop; nop; nop; nop; nop" 1 } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 8 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 void
Index: testsuite/gcc.target/i386/pad-4.c
===================================================================
--- testsuite/gcc.target/i386/pad-4.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-4.c	(working copy)
@@ -1,7 +1,8 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target ilp32 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S -fPIC" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop; nop; nop; nop; nop" 1 } } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom -fPIC" } */
+/* { dg-final { scan-assembler-times "nop" 8 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 extern int bar;
@@ -9,5 +10,6 @@  extern int bar;
 int
 foo ()
 {
+  asm volatile ("");
   return bar;
 }
Index: testsuite/gcc.target/i386/pad-5b.c
===================================================================
--- testsuite/gcc.target/i386/pad-5b.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-5b.c	(working copy)
@@ -1,8 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop" 1 } } */
-/* { dg-final { scan-assembler-not "nop; nop; nop; nop; nop" } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 4 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 int
Index: testsuite/gcc.target/i386/pad-6a.c
===================================================================
--- testsuite/gcc.target/i386/pad-6a.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-6a.c	(working copy)
@@ -1,8 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target ilp32 } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop" 1 } } */
-/* { dg-final { scan-assembler-not "nop; nop; nop; nop; nop" } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 4 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 int
Index: testsuite/gcc.target/i386/pad-8.c
===================================================================
--- testsuite/gcc.target/i386/pad-8.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-8.c	(working copy)
@@ -1,7 +1,6 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
-/* { dg-final { scan-assembler-times "nop; nop; nop; nop; nop; nop" 1 } } */
-/* { dg-final { scan-assembler-not "nop; nop; nop; nop; nop; nop; nop" } } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
+/* { dg-final { scan-assembler-times "nop" 6 } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
 int
Index: testsuite/gcc.target/i386/20060821-1.c
===================================================================
--- testsuite/gcc.target/i386/20060821-1.c	(revision 164548)
+++ testsuite/gcc.target/i386/20060821-1.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -msse3 -S" } */
+/* { dg-options "-O2 -msse3" } */
 /* { dg-final { scan-assembler-not "%mm" } } */
 /* PR 28825 */
 #include <pmmintrin.h>
Index: testsuite/gcc.target/i386/pad-10.c
===================================================================
--- testsuite/gcc.target/i386/pad-10.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-10.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fomit-frame-pointer -march=atom -S" } */
+/* { dg-options "-O2 -fomit-frame-pointer -march=atom" } */
 /* { dg-final { scan-assembler-not "nop" } } */
 /* { dg-final { scan-assembler-not "rep" } } */
 
Index: testsuite/gcc.target/i386/pad-1.c
===================================================================
--- testsuite/gcc.target/i386/pad-1.c	(revision 164548)
+++ testsuite/gcc.target/i386/pad-1.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fomit-frame-pointer -mtune=generic -S" } */
+/* { dg-options "-O2 -fomit-frame-pointer -mtune=generic" } */
 /* { dg-final { scan-assembler "rep" } } */
 /* { dg-final { scan-assembler-not "nop" } } */
 
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 164548)
+++ config/i386/i386.md	(working copy)
@@ -104,7 +104,6 @@ 
   UNSPEC_LD_MPIC	; load_macho_picbase
   UNSPEC_TRUNC_NOOP
   UNSPEC_DIV_ALREADY_SPLIT
-  UNSPEC_NOPS
 
   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -248,6 +247,7 @@ 
   UNSPECV_LOCK
   UNSPECV_PROLOGUE_USE
   UNSPECV_CLD
+  UNSPECV_NOPS
   UNSPECV_VZEROALL
   UNSPECV_VZEROUPPER
   UNSPECV_RDTSC
@@ -11468,32 +11468,18 @@ 
 
 ;; Generate nops.  Operand 0 is the number of nops, up to 8.
 (define_insn "nops"
-  [(unspec [(match_operand 0 "const_int_operand" "")]
-	   UNSPEC_NOPS)]
+  [(unspec_volatile [(match_operand 0 "const_int_operand" "")]
+		    UNSPECV_NOPS)]
   "reload_completed"
 {
-  switch (INTVAL (operands[0]))
-    {
-    case 1:
-      return "nop";
-    case 2:
-      return "nop; nop";
-    case 3:
-      return "nop; nop; nop";
-    case 4:
-      return "nop; nop; nop; nop";
-    case 5:
-      return "nop; nop; nop; nop; nop";
-    case 6:
-      return "nop; nop; nop; nop; nop; nop";
-    case 7:
-      return "nop; nop; nop; nop; nop; nop; nop";
-    case 8:
-      return "nop; nop; nop; nop; nop; nop; nop; nop";
-    default:
-      gcc_unreachable ();
-      break;
-  }
+  int num = INTVAL (operands[0]);
+
+  gcc_assert (num >= 1 && num <= 8);
+
+  while (num--)
+    fputs ("\tnop\n", asm_out_file);
+
+  return "";
 }
   [(set (attr "length") (symbol_ref "INTVAL (operands[0])"))
    (set_attr "length_immediate" "0")
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 164548)
+++ config/i386/i386.c	(working copy)
@@ -8024,13 +8024,18 @@  ix86_code_end (void)
 
       xops[0] = gen_rtx_REG (Pmode, regno);
       xops[1] = gen_rtx_MEM (Pmode, stack_pointer_rtx);
-      /* Pad stack IP move with 4 instructions.  2 NOPs count as 1
-         instruction.  */
+      /* Pad stack IP move with 4 instructions (two NOPs count
+	 as one instruction.)  */
       if (TARGET_PAD_SHORT_FUNCTION)
-	output_asm_insn ("nop; nop; nop; nop; nop; nop; nop; nop",
-			 xops);
+	{
+	  int i = 8;
+
+	  while (i--)
+	    fputs ("\tnop\n", asm_out_file);
+	}
+
       output_asm_insn ("mov%z0\t{%1, %0|%0, %1}", xops);
-      output_asm_insn ("ret", xops);
+      fputs ("\tret\n", asm_out_file);
       final_end_function ();
       init_insn_lengths ();
       free_after_compilation (cfun);