[RFC] Masked load/store vectorization (take 5)

Message ID 20131113122103.GM27813@tucnak.zalov.cz
State New

Commit Message

Jakub Jelinek Nov. 13, 2013, 12:21 p.m. UTC
On Tue, Nov 12, 2013 at 03:29:30PM +0100, Jakub Jelinek wrote:
> As disabling tree if-conversion for not vectorized loops seems to be clearly
> controversial and not always a win, I'd like to at least do the versioning
> for masked loads/stores, where I think it is a much nicer alternative to the
> original proposal of if-unconversion which didn't work well.
> We can perhaps reopen the case of if-conversion for GCC 5.0 when stage1 reopens next
> year.
> 
> Here is an updated patch (all 3 patches together, ChangeLog would be the 3
> ChangeLogs combined) against latest trunk, with the versioning done only for
> masked loads/stores which would otherwise always result in not trying to
> if-convert at all before, and with added cost calls for the masked
> loads/stores (so far just doing what normal vector load or store does,
> perhaps we want a separate cost for it instead).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux.  I'd like to use
> this also for masked elemental function calls once the gomp-4_0-branch
> elementals hit the trunk.

Sergey has kindly tested the patch on SPEC2k6, but on 4 tests it revealed
an ICE.  Here is an incremental fix for that; for now it just punts on
invariant masked loads.  In theory the invariant conditional loads could
be handled e.g. using gather, or perhaps VMOVMSKP{S,D} or VPMOVMSKB on the
mask followed by a conditional scalar load of the value if any bits in the
mask are set, then broadcasting it.
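
To make the second idea concrete, here is a rough scalar model of it in
plain C.  It is only a sketch and not what the patch implements: VF,
masked_invariant_step and foo_model are hypothetical names, the inner
loops stand in for the vector compare, the movmsk-style mask extraction
and the broadcast/blend, and the loop body is borrowed from the new
testcase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vectorization factor: 8 ints per 256-bit vector.  */
#define VF 8

/* Scalar model of one vector-wide iteration of
     a[i] = a[i] < 30 ? *p : a[i] + 12;
   where the load from P is invariant and conditional.  */
static void
masked_invariant_step (int *a, const int *p)
{
  unsigned mask = 0;
  int splat = 0;
  int lane;

  /* Stand-in for the vector compare followed by a movmsk-style
     extraction of one bit per lane.  */
  for (lane = 0; lane < VF; lane++)
    if (a[lane] < 30)
      mask |= 1u << lane;

  /* Conditional scalar load: *P is only touched if some lane needs it,
     so P may be invalid when the condition is never true.  */
  if (mask != 0)
    splat = *p;

  /* Stand-in for broadcasting SPLAT and blending under the mask.  */
  for (lane = 0; lane < VF; lane++)
    a[lane] = ((mask >> lane) & 1) ? splat : a[lane] + 12;
}

/* Whole loop; this sketch assumes N is a multiple of VF and has no
   scalar epilogue.  */
static void
foo_model (int *a, size_t n, const int *p)
{
  size_t i;

  assert (n % VF == 0);
  for (i = 0; i < n; i += VF)
    masked_invariant_step (a + i, p);
}
```

The point of the mask test is that calling foo_model with p == NULL
stays valid as long as no element is below 30, which matches the scalar
semantics of the loop.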

2013-11-13  Jakub Jelinek  <jakub@redhat.com>

	* tree-vect-stmts.c (vectorizable_mask_load_store): Punt on
	invariant masked loads.

	* gcc.target/i386/vect-cond-1.c: New test.



	Jakub
Patch

--- gcc/tree-vect-stmts.c.jj	2013-11-12 23:10:09.000000000 +0100
+++ gcc/tree-vect-stmts.c	2013-11-13 12:49:18.597906361 +0100
@@ -1802,7 +1802,7 @@  vectorizable_mask_load_store (gimple stm
     }
   else if (tree_int_cst_compare (nested_in_vect_loop
 				 ? STMT_VINFO_DR_STEP (stmt_info)
-				 : DR_STEP (dr), size_zero_node) < 0)
+				 : DR_STEP (dr), size_zero_node) <= 0)
     return false;
   else if (optab_handler (is_store ? maskstore_optab : maskload_optab,
 			  TYPE_MODE (vectype)) == CODE_FOR_nothing)
--- gcc/testsuite/gcc.target/i386/vect-cond-1.c.jj	2013-11-13 13:05:55.208671302 +0100
+++ gcc/testsuite/gcc.target/i386/vect-cond-1.c	2013-11-13 13:06:10.464586004 +0100
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mavx2" { target avx2 } } */
+
+int a[1024];
+
+int
+foo (int *p)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    {
+      int t;
+      if (a[i] < 30)
+	t = *p;
+      else
+	t = a[i] + 12;
+      a[i] = t;
+    }
+}
+
+/* { dg-final { cleanup-tree-dump "vect" } } */