Patchwork [dataflow] : Fix PR55845, 454.calculix miscompares on x86 AVX due to movement of vzeroupper

login
register
mail settings
Submitter Jakub Jelinek
Date Jan. 7, 2013, 11:26 p.m.
Message ID <20130107232634.GI7269@tucnak.redhat.com>
Download mbox | patch
Permalink /patch/210267/
State New
Headers show

Comments

Jakub Jelinek - Jan. 7, 2013, 11:26 p.m.
On Mon, Jan 07, 2013 at 05:52:23PM +0100, Uros Bizjak wrote:
> TBH, I'm not that familiar with the RTL infrastructure enough to
> answer these questions. While I can spend some time on this problem,
> and probably waste quite some reviewer's time, the problem is not that
> trivial as I hoped to be, so I would kindly ask someone with better
> understanding of this part of the compiler for the proper solution.

After discussion with rth on IRC, this modified patch just uses
volatile_insn_p, making all UNSPEC_VOLATILE (wherever in insn) and asm
volatile into a complete scheduling barrier for optimizations that use this
function.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-01-08  Jakub Jelinek  <jakub@redhat.com>
	    Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/55845
	* df-problems.c (can_move_insns_across): Stop scanning at
	volatile_insn_p source instruction or give up if
	across_from .. across_to range contains any volatile_insn_p
	instructions.

2012-01-08  Uros Bizjak  <ubizjak@gmail.com>
	    Vladimir Yakovlev  <vladimir.b.yakovlev@intel.com>

	PR rtl-optimization/55845
	* gcc.target/i386/pr55845.c: New test.



	Jakub
Uros Bizjak - Jan. 8, 2013, 7:10 a.m.
On Tue, Jan 8, 2013 at 12:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Jan 07, 2013 at 05:52:23PM +0100, Uros Bizjak wrote:
>> TBH, I'm not that familiar with the RTL infrastructure enough to
>> answer these questions. While I can spend some time on this problem,
>> and probably waste quite some reviewer's time, the problem is not that
>> trivial as I hoped to be, so I would kindly ask someone with better
>> understanding of this part of the compiler for the proper solution.
>
> After discussion with rth on IRC, this modified patch just uses
> volatile_insn_p, making all UNSPEC_VOLATILE (wherever in insn) and asm
> volatile into a complete scheduling barrier for optimizations that use this
> function.

Thanks!

Just two little nits in the testcase:

> +foo (int size, double y[], double x[])

foo (int size, double *y, double *x)

> +  return (sum);

return sum;

Uros.
Richard Henderson - Jan. 8, 2013, 5:54 p.m.
On 01/07/2013 03:26 PM, Jakub Jelinek wrote:
> 2012-01-08  Jakub Jelinek  <jakub@redhat.com>
> 	    Uros Bizjak  <ubizjak@gmail.com>
> 
> 	PR rtl-optimization/55845
> 	* df-problems.c (can_move_insns_across): Stop scanning at
> 	volatile_insn_p source instruction or give up if
> 	across_from .. across_to range contains any volatile_insn_p
> 	instructions.

Ok.


r~

Patch

--- gcc/df-problems.c.jj	2012-11-19 14:41:26.181898964 +0100
+++ gcc/df-problems.c	2013-01-07 18:38:33.064974313 +0100
@@ -3858,6 +3858,8 @@  can_move_insns_across (rtx from, rtx to,
 	}
       if (NONDEBUG_INSN_P (insn))
 	{
+	  if (volatile_insn_p (PATTERN (insn)))
+	    return false;
 	  memrefs_in_across |= for_each_rtx (&PATTERN (insn), find_memory,
 					     NULL);
 	  note_stores (PATTERN (insn), find_memory_stores,
@@ -3917,7 +3919,9 @@  can_move_insns_across (rtx from, rtx to,
       if (NONDEBUG_INSN_P (insn))
 	{
 	  if (may_trap_or_fault_p (PATTERN (insn))
-	      && (trapping_insns_in_across || other_branch_live != NULL))
+	      && (trapping_insns_in_across
+		  || other_branch_live != NULL
+		  || volatile_insn_p (PATTERN (insn))))
 	    break;
 
 	  /* We cannot move memory stores past each other, or move memory
--- gcc/testsuite/gcc.target/i386/pr55845.c.jj	2013-01-07 18:30:19.168801389 +0100
+++ gcc/testsuite/gcc.target/i386/pr55845.c	2013-01-07 18:30:19.168801389 +0100
@@ -0,0 +1,39 @@ 
+/* { dg-do run } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-O3 -ffast-math -fschedule-insns -mavx -mvzeroupper" } */
+
+#include "avx-check.h"
+
+#define N 100
+
+double
+__attribute__((noinline))
+foo (int size, double y[], double x[])
+{
+  double sum = 0.0;
+  int i;
+  for (i = 0, sum = 0.; i < size; i++)
+    sum += y[i] * x[i];
+  return (sum);
+}
+
+static void
+__attribute__ ((noinline))
+avx_test ()
+{
+  double x[N];
+  double y[N];
+  double s;
+  int i;
+
+  for (i = 0; i < N; i++)
+    {
+      x[i] = i;
+      y[i] = i;
+    }
+
+  s = foo (N, y, x);
+
+  if (s != 328350.0)
+    abort ();
+}