Patchwork PATCH: PR target/56560: [4.6/4.7 regression] vzeroupper clobbers argument with AVX

Submitter H.J. Lu
Date March 18, 2013, 5:51 p.m.
Message ID <20130318175104.GA7106@intel.com>
Permalink /patch/228904/
State New

Comments

H.J. Lu - March 18, 2013, 5:51 p.m.
Hi,

ix86_function_arg sets cfun->machine->callee_pass_avx256_p from the
current argument.  It clears callee_pass_avx256_p when ix86_function_arg
is called to generate a library call to pass an argument.  This patch
adds callee_pass_avx256_p and callee_return_avx256_p to ix86_args to store
the AVX info in CUM and copy it to cfun->machine->callee_pass_avx256_p
when ix86_function_arg is called immediately before the call instruction
is emitted.  OK for 4.7 branch?

Thanks.


H.J.
--
gcc/

2013-03-18  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/56560
	* config/i386/i386.c (init_cumulative_args): Also set
	cum->callee_return_avx256_p.
	(ix86_function_arg): Set cum->callee_pass_avx256_p.  Set
	cfun->machine->callee_pass_avx256_p only when MODE == VOIDmode.

	* config/i386/i386.h (ix86_args): Add callee_pass_avx256_p and
	callee_return_avx256_p.

gcc/testsuite/

2013-03-18  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/56560
	* gcc.target/i386/pr56560.c: New file.

Uros Bizjak - March 20, 2013, 5:44 p.m.
On Mon, Mar 18, 2013 at 6:51 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:

> ix86_function_arg sets cfun->machine->callee_pass_avx256_p from the
> current argument.  It clears callee_pass_avx256_p when ix86_function_arg
> is called to generate a library call to pass an argument.  This patch
> adds callee_pass_avx256_p and callee_return_avx256_p to ix86_args to store
> the AVX info in CUM and copy it to cfun->machine->callee_pass_avx256_p
> when ix86_function_arg is called immediately before the call instruction
> is emitted.  OK for 4.7 branch?
>
> 2013-03-18  H.J. Lu  <hongjiu.lu@intel.com>
>
>         PR target/56560
>         * config/i386/i386.c (init_cumulative_args): Also set
>         cum->callee_return_avx256_p.
>         (ix86_function_arg): Set cum->callee_pass_avx256_p.  Set
>         cfun->machine->callee_pass_avx256_p only when MODE == VOIDmode.
>
>         * config/i386/i386.h (ix86_args): Add callee_pass_avx256_p and
>         callee_return_avx256_p.
>
> gcc/testsuite/
>
> 2013-03-18  H.J. Lu  <hongjiu.lu@intel.com>
>
>         PR target/56560
>         * gcc.target/i386/pr56560.c: New file.

OK for branches, but I didn't check all state transitions (and I don't
like this approach anyway)...

Please also add the testcase to the trunk, and to 4.8 branch when it reopens.

Thanks,
Uros.
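
The state handling described in the submission can be modeled outside of GCC with a small toy program. The sketch below is not GCC code: every name prefixed with toy_, the old_/new_ helpers, and expand_outer_call are hypothetical, and the reset of the shared flag in old_init is an assumption made purely for illustration. It only shows why a flag kept in per-call state (CUM) and copied to the per-function state at the final MODE == VOIDmode call survives a nested library call, while a flag written directly to the shared state does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, all hypothetical: toy_machine plays the role of
   cfun->machine, toy_cum the role of CUMULATIVE_ARGS.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_V4DFmode };

struct toy_machine { bool callee_pass_avx256_p; };
struct toy_cum { bool callee_pass_avx256_p; };

static struct toy_machine machine;

/* Old scheme: the shared flag is written directly.  Resetting it when
   argument setup for a call begins is an assumption of this model.  */
static void old_init (struct toy_cum *cum)
{
  cum->callee_pass_avx256_p = false;
  machine.callee_pass_avx256_p = false;
}

static void old_arg (struct toy_cum *cum, enum toy_mode mode)
{
  (void) cum;
  if (mode == TOY_V4DFmode)
    machine.callee_pass_avx256_p = true;
}

/* New scheme: accumulate in CUM, publish only at the end marker.  */
static void new_init (struct toy_cum *cum)
{
  cum->callee_pass_avx256_p = false;
}

static void new_arg (struct toy_cum *cum, enum toy_mode mode)
{
  if (mode == TOY_V4DFmode)
    cum->callee_pass_avx256_p = true;
  if (mode == TOY_VOIDmode)
    machine.callee_pass_avx256_p = cum->callee_pass_avx256_p;
}

typedef void (*init_fn) (struct toy_cum *);
typedef void (*arg_fn) (struct toy_cum *, enum toy_mode);

/* Model a call taking a 256-bit vector, an int, and a struct whose
   setup triggers a nested library call before the outer call is
   emitted.  Returns the shared flag as seen at the outer call.  */
static bool expand_outer_call (init_fn init, arg_fn arg)
{
  struct toy_cum outer, inner;

  machine.callee_pass_avx256_p = false;
  init (&outer);
  arg (&outer, TOY_V4DFmode);   /* the vector argument */
  arg (&outer, TOY_SImode);     /* an integer argument */

  init (&inner);                /* nested library call starts */
  arg (&inner, TOY_SImode);
  arg (&inner, TOY_VOIDmode);   /* nested call is emitted */

  arg (&outer, TOY_VOIDmode);   /* outer call is emitted */
  return machine.callee_pass_avx256_p;
}

/* The old scheme loses the flag; the new scheme keeps it.  */
static void check_model (void)
{
  assert (!expand_outer_call (old_init, old_arg));
  assert (expand_outer_call (new_init, new_arg));
}
```

Calling check_model () from a test driver exercises both schemes. In this model the nested call's own end marker also writes to the shared state, which is harmless only because the outer call's end marker runs afterwards and overwrites it; that ordering dependence is the kind of state transition the review comment above is wary of.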

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c1f6c88..7a441c7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5592,7 +5592,10 @@  init_cumulative_args (CUMULATIVE_ARGS *cum,  /* Argument info to initialize */
 	{
 	  /* The return value of this function uses 256bit AVX modes.  */
 	  if (caller)
-	    cfun->machine->callee_return_avx256_p = true;
+	    {
+	      cfun->machine->callee_return_avx256_p = true;
+	      cum->callee_return_avx256_p = true;
+	    }
 	  else
 	    cfun->machine->caller_return_avx256_p = true;
 	}
@@ -6863,11 +6866,20 @@  ix86_function_arg (cumulative_args_t cum_v, enum machine_mode omode,
     {
       /* This argument uses 256bit AVX modes.  */
       if (cum->caller)
-	cfun->machine->callee_pass_avx256_p = true;
+	cum->callee_pass_avx256_p = true;
       else
 	cfun->machine->caller_pass_avx256_p = true;
     }
 
+  if (cum->caller && mode == VOIDmode)
+    {
+      /* This function is called with MODE == VOIDmode immediately
+	 before the call instruction is emitted.  We copy callee 256bit
+	 AVX info from the current CUM here.  */
+      cfun->machine->callee_return_avx256_p = cum->callee_return_avx256_p;
+      cfun->machine->callee_pass_avx256_p = cum->callee_pass_avx256_p;
+    }
+
   return arg;
 }
 
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 80d19f1..899678d 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1502,6 +1502,10 @@  typedef struct ix86_args {
 				   in SSE registers.  Otherwise 0.  */
   enum calling_abi call_abi;	/* Set to SYSV_ABI for sysv abi. Otherwise
  				   MS_ABI for ms abi.  */
+  /* Nonzero if it passes 256bit AVX modes.  */
+  BOOL_BITFIELD callee_pass_avx256_p : 1;
+  /* Nonzero if it returns 256bit AVX modes.  */
+  BOOL_BITFIELD callee_return_avx256_p : 1;
 } CUMULATIVE_ARGS;
 
 /* Initialize a variable CUM of type CUMULATIVE_ARGS
diff --git a/gcc/testsuite/gcc.target/i386/pr56560.c b/gcc/testsuite/gcc.target/i386/pr56560.c
new file mode 100644
index 0000000..5417cbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr56560.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx -mvzeroupper -dp" } */
+
+extern void abort (void);
+
+typedef double vec_t __attribute__((vector_size(32)));
+
+struct S { int i1; int i2; int i3; };
+
+extern int bar (vec_t, int, int, int, int, int, struct S);
+
+void foo (vec_t v, struct S s)
+{
+  int i = bar (v, 1, 2, 3, 4, 5, s);
+  if (i == 0)
+    abort ();
+}
+
+/* { dg-final { scan-assembler-not "avx_vzeroupper" } } */