Patchwork Handle vectorization of invariant loads (PR46787)

login
register
mail settings
Submitter Richard Guenther
Date June 30, 2011, 3:24 p.m.
Message ID <alpine.LNX.2.00.1106301723420.810@zhemvz.fhfr.qr>
Download mbox | patch
Permalink /patch/102774/
State New
Headers show

Comments

Richard Guenther - June 30, 2011, 3:24 p.m.
On Wed, 29 Jun 2011, Richard Guenther wrote:

> 
> The following patch makes us handle invariant loads during vectorization.
> Dependence analysis currently isn't clever enough to disambiguate them
> thus we insert versioning-for-alias checks.  For the testcase hoisting
> the load is still always possible though, and for a read-after-write
> dependence it would be possible for the vectorized loop copy as the
> may-aliasing write is varying by the scalar variable size.
> 
> The existing code for vectorizing invariant accesses looks very
> suspicious - it generates a vector load at the scalar address
> to then just extract the first vector element.  Huh.  IMHO this
> can be simplified as done, by just re-using the scalar load result.
> But maybe this code was supposed to deal with something entirely
> different?
> 
> This patch gives a 33% speedup to the phoronix himeno testcase
> if you bump the maximum alias versioning checks we want to insert.
> 
> I'm currently re-bootstrapping & testing this but an earlier version
> was ok on x86_64-unknown-linux-gnu.

FYI, I'm testing the following which cures a fallout seen when
building SPEC2k6 with the committed patch.  It's suboptimal for
j != 0 though - is there a way to get to the vectorized stmt
of the j == 0 iteration?

Thanks,
Richard.

2011-06-30  Richard Guenther  <rguenther@suse.de>

	* tree-vect-stmts.c (vectorizable_load): Remove unnecessary
	assert.
H.J. Lu - July 1, 2011, 4:43 a.m.
On Thu, Jun 30, 2011 at 8:24 AM, Richard Guenther <rguenther@suse.de> wrote:
> On Wed, 29 Jun 2011, Richard Guenther wrote:
>
>>
>> The following patch makes us handle invariant loads during vectorization.
>> Dependence analysis currently isn't clever enough to disambiguate them
>> thus we insert versioning-for-alias checks.  For the testcase hoisting
>> the load is still always possible though, and for a read-after-write
>> dependence it would be possible for the vectorized loop copy as the
>> may-aliasing write is varying by the scalar variable size.
>>
>> The existing code for vectorizing invariant accesses looks very
>> suspicious - it generates a vector load at the scalar address
>> to then just extract the first vector element.  Huh.  IMHO this
>> can be simplified as done, by just re-using the scalar load result.
>> But maybe this code was supposed to deal with something entirely
>> different?
>>
>> This patch gives a 33% speedup to the phoronix himeno testcase
>> if you bump the maximum alias versioning checks we want to insert.
>>
>> I'm currently re-bootstrapping & testing this but an earlier version
>> was ok on x86_64-unknown-linux-gnu.
>
> FYI, I'm testing the following which cures a fallout seen when
> building SPEC2k6 with the committed patch.  It's suboptimal for
> j != 0 though - is there a way to get to the vectorized stmt
> of the j == 0 iteration?
>
> Thanks,
> Richard.
>
> 2011-06-30  Richard Guenther  <rguenther@suse.de>
>
>        * tree-vect-stmts.c (vectorizable_load): Remove unnecessary
>        assert.
>

Will this fix

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49603

Patch

Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	(revision 175709)
+++ gcc/tree-vect-stmts.c	(working copy)
@@ -4574,19 +4574,14 @@  vectorizable_load (gimple stmt, gimple_s
 	      /* 4. Handle invariant-load.  */
 	      if (inv_p && !bb_vinfo)
 		{
+		  tree vec_inv;
+		  gimple_stmt_iterator gsi2 = *gsi;
 		  gcc_assert (!strided_load);
-		  if (j == 0)
-		    {
-		      tree vec_inv;
-		      gimple_stmt_iterator gsi2 = *gsi;
-		      gsi_next (&gsi2);
-		      vec_inv = build_vector_from_val (vectype, scalar_dest);
-		      new_temp = vect_init_vector (stmt, vec_inv,
-						   vectype, &gsi2);
-		      new_stmt = SSA_NAME_DEF_STMT (new_temp);
-		    }
-		  else
-		    gcc_unreachable (); /* FORNOW. */
+		  gsi_next (&gsi2);
+		  vec_inv = build_vector_from_val (vectype, scalar_dest);
+		  new_temp = vect_init_vector (stmt, vec_inv,
+					       vectype, &gsi2);
+		  new_stmt = SSA_NAME_DEF_STMT (new_temp);
 		}
 
 	      if (negative)