VZEROUPPER for simple_return?

Message ID 20111105131659.GV1052@tyan-ft48-01.lab.bos.redhat.com
State New

Commit Message

Jakub Jelinek Nov. 5, 2011, 1:16 p.m. UTC
Hi!

On Sat, Nov 05, 2011 at 10:50:44AM +0100, Jakub Jelinek wrote:
> On the following testcase with -m64 -O3 -mavx2 (but it is just an example,
> you can replace the loop there with any code that doesn't touch the
> stack or frame pointer at all), only f3 is shrink wrapped, and in that case,
> on the other hand, it doesn't add the vzeroupper that it IMNSHO should
> before leaving the AVX-using code.  But I wonder why we can't shrink-wrap also

Here is a quick hack that deals with the missing vzeroupper issue.
It would probably be nicer to create a helper in i386.c for that, though,
because call_no_avx256 is an enumerator private to i386.c, so the expanders
in i386.md can't refer to it by name and pass const2_rtx instead.
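
For illustration only, the decision such a helper would encapsulate can be
modelled in standalone C.  This is a sketch, not GCC code: every name ending
in _model is hypothetical, the three booleans merely stand in for
TARGET_VZEROUPPER, TREE_THIS_VOLATILE (cfun->decl) and
cfun->machine->caller_return_avx256_p, and the value 2 for call_no_avx256 is
an assumption taken from the const2_rtx operand used in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the enum that is private to i386.c.  Only
   the name call_no_avx256 comes from the message; the value 2 is an
   assumption based on the const2_rtx operand in the patch.  */
enum call_avx256_state_model
{
  call_no_avx256_model = 2
};

/* Hypothetical snapshot of the three conditions the patch tests.  */
struct return_state_model
{
  bool target_vzeroupper;      /* models TARGET_VZEROUPPER */
  bool decl_is_volatile;       /* models TREE_THIS_VOLATILE (cfun->decl) */
  bool caller_return_avx256_p; /* models cfun->machine->caller_return_avx256_p */
};

/* Return true when the expander would emit vzeroupper before the
   return: the target wants vzeroupper, the function decl is not marked
   volatile, and the caller does not expect a 256-bit AVX return value.  */
bool
need_vzeroupper_before_return_model (const struct return_state_model *s)
{
  return s->target_vzeroupper
         && !s->decl_is_volatile
         && !s->caller_return_avx256_p;
}

/* Operand a helper living next to the enum could hand to the
   vzeroupper pattern by name, instead of a bare constant 2.  */
int
vzeroupper_return_operand_model (void)
{
  return (int) call_no_avx256_model;
}
```

With a helper along these lines next to the enum, both the "return" and the
"simple_return" expanders could share one check instead of duplicating the
condition and the magic constant.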



	Jakub

Patch

--- gcc/config/i386/i386.md.jj	2011-11-04 07:49:41.000000000 +0100
+++ gcc/config/i386/i386.md	2011-11-05 14:00:32.000000000 +0100
@@ -11725,6 +11725,12 @@  (define_expand "return"
   [(simple_return)]
   "ix86_can_use_return_insn_p ()"
 {
+  /* Emit vzeroupper if needed.  */
+  if (TARGET_VZEROUPPER
+      && !TREE_THIS_VOLATILE (cfun->decl)
+      && !cfun->machine->caller_return_avx256_p)
+    emit_insn (gen_avx_vzeroupper (const2_rtx));
+
   if (crtl->args.pops_args)
     {
       rtx popc = GEN_INT (crtl->args.pops_args);
@@ -11741,6 +11747,12 @@  (define_expand "simple_return"
   [(simple_return)]
   "!TARGET_SEH"
 {
+  /* Emit vzeroupper if needed.  */
+  if (TARGET_VZEROUPPER
+      && !TREE_THIS_VOLATILE (cfun->decl)
+      && !cfun->machine->caller_return_avx256_p)
+    emit_insn (gen_avx_vzeroupper (const2_rtx));
+
   if (crtl->args.pops_args)
     {
       rtx popc = GEN_INT (crtl->args.pops_args);