diff mbox

[gomp4] Basic vectorization enablement for #pragma omp simd

Message ID 20130514061713.GI1377@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek May 14, 2013, 6:17 a.m. UTC
Hi!

This patch adds safelen field to struct loop, teaches expand_omp_simd
to set it on the simd loops and then uses it in a few places:
1) because the loops are explicitly marked for vectorization by the user,
   we'll try to ifconvert them and vectorize even without -O3, -Ofast or
   -ftree-vectorize (but explicit -fno-tree-vectorize will still disable
   that behavior)
2) the data dependency analysis uses it to decide about unknown and bad
   data dependencies
3) unrolling is disabled for those loops, I think we don't want to unroll
   those loops until vectorization, and after vectorization we just clear
   the safelen, so that it can be unrolled afterwards

In the end we'll want to do much more on the vectorizer side, handle calls
to elemental functions, handle conditionalized calls to elemental functions,
or even vectorize loops where some part of the loop isn't really
vectorizable and needs to be sequential, but other parts of the loop are
vectorizable.  for (...) { vectorizable_bb; non-vectorizable_bb; vectorizable_bb; }
can be turned into for (...) { vectorized_bb; for (temp = 0; temp < vf;
temp++) non-vectorizable_bb; vectorized_bb; } etc.

Does this look ok?

2013-05-14  Jakub Jelinek  <jakub@redhat.com>

	* cfgloop.h (struct loop): Add safelen field.
	* omp-low.c (expand_omp_simd): If !broken_loop, fix_loop_structure
	to create loop for the simd region and set safelen field.
	* tree-vectorizer.c (vectorize_loops): If loop has safelen set,
	vectorize it even if flag_vectorize isn't set.  Clear loop->safelen
	after vectorization.
	* tree-ssa-loop.c (gate_tree_vectorize): Return true even for
	flag_openmp if -fno-tree-vectorize hasn't been specified.
	* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Don't
	unroll loops with non-NULL loop->safelen.
	* tree-vect-data-refs.c (vect_analyze_data_ref_dependence): For unknown
	or bad data dependency, if loop->safelen is non-NULL, just decrease
	*max_vf to loop->safelen if needed and return false.
	* tree-if-conv.c (main_tree_if_conversion): If-convert also loops with
	non-NULL loop->safelen.
	(gate_tree_if_conversion): Return true even for
	flag_openmp if -fno-tree-vectorize hasn't been specified.


	Jakub

Comments

Richard Biener May 14, 2013, 9:28 a.m. UTC | #1
On Tue, 14 May 2013, Jakub Jelinek wrote:

> Hi!
> 
> This patch adds safelen field to struct loop, teaches expand_omp_simd
> to set it on the simd loops and then uses it in a few places:
> 1) because the loops are explicitly marked for vectorization by the user,
>    we'll try to ifconvert them and vectorize even without -O3, -Ofast or
>    -ftree-vectorize (but explicit -fno-tree-vectorize will still disable
>    that behavior)
> 2) the data dependency analysis uses it to decide about unknown and bad
>    data dependencies
> 3) unrolling is disabled for those loops, I think we don't want to unroll
>    those loops until vectorization, and after vectorization we just clear
>    the safelen, so that it can be unrolled afterwards
> 
> In the end we'll want to do much more on the vectorizer side, handle calls
> to elemental functions, handle conditionalized calls to elemental functions,
> or even vectorize loops where some part of the loop isn't really
> vectorizable and needs to be sequential, but other parts of the loop are
> vectorizable.  for (...) { vectorizable_bb; non-vectorizable_bb; vectorizable_bb; }
> can be turned into for (...) { vectorized_bb; for (temp = 0; temp < vf;
> temp++) non-vectorizable_bb; vectorized_bb; } etc.
> 
> Does this look ok?
> 
> 2013-05-14  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* cfgloop.h (struct loop): Add safelen field.
> 	* omp-low.c (expand_omp_simd): If !broken_loop, fix_loop_structure
> 	to create loop for the simd region and set safelen field.
> 	* tree-vectorizer.c (vectorize_loops): If loop has safelen set,
> 	vectorize it even if flag_vectorize isn't set.  Clear loop->safelen
> 	after vectorization.
> 	* tree-ssa-loop.c (gate_tree_vectorize): Return true even for
> 	flag_openmp if -fno-tree-vectorize hasn't been specified.
> 	* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Don't
> 	unroll loops with non-NULL loop->safelen.
> 	* tree-vect-data-refs.c (vect_analyze_data_ref_dependence): For unknown
> 	or bad data dependency, if loop->safelen is non-NULL, just decrease
> 	*max_vf to loop->safelen if needed and return false.
> 	* tree-if-conv.c (main_tree_if_conversion): If-convert also loops with
> 	non-NULL loop->safelen.
> 	(gate_tree_if_conversion): Return true even for
> 	flag_openmp if -fno-tree-vectorize hasn't been specified.
> 
> --- gcc/cfgloop.h.jj	2013-05-13 16:49:44.000000000 +0200
> +++ gcc/cfgloop.h	2013-05-13 17:30:18.630883633 +0200
> @@ -176,6 +176,12 @@ struct GTY ((chain_next ("%h.next"))) lo
>  
>    /* Number of iteration analysis data for RTL.  */
>    struct niter_desc *simple_loop_desc;
> +
> +  /* If non-NULL, an INTEGER_CST, where the user asserted that for any
> +     I in [ 0, nb_iterations ) and for any J in
> +     [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations
> +     of the loop can be safely evaluated concurrently.  */
> +  tree safelen;

Can you make this a double_int (or a HOST_WIDE_INT or an int) instead 
please?  It should map to data-dependence analysis distance vectors
which currently is a vector of 'int'.
Is there a magic value to tell safelen is "infinity"?  As I read
above safelen == 0 would mean all iterations are dependent.  Are
negative safelen values well-defined?  The comment doesn't seem
to disallow them.

Also make sure to copy the field in copy_loop_info and stream
it in output/input_cfg in lto-streamer-in/out.c.

>  };
>  
>  /* Flags for state of loop structure.  */
> --- gcc/omp-low.c.jj	2013-05-13 16:37:05.000000000 +0200
> +++ gcc/omp-low.c	2013-05-13 18:46:18.310405585 +0200
> @@ -4960,6 +4960,8 @@ expand_omp_simd (struct omp_region *regi
>    edge e, ne;
>    tree *counts = NULL;
>    int i;
> +  tree safelen = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt),
> +				  OMP_CLAUSE_SAFELEN);
>    type = TREE_TYPE (fd->loop.v);
>    entry_bb = region->entry;
> @@ -5157,6 +5159,22 @@ expand_omp_simd (struct omp_region *regi
>    set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
>    set_immediate_dominator (CDI_DOMINATORS, l2_bb, l1_bb);
>    set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
> +
> +  if (!broken_loop)
> +    {
> +      struct loop *loop;
> +      calculate_dominance_info (CDI_DOMINATORS);
> +      fix_loop_structure (NULL);

Ick.  Didn't I properly add loops everywhere?

> +      loop = l1_bb->loop_father;
> +      if (safelen == NULL_TREE)
> +	{
> +	  safelen = build_nonstandard_integer_type (TYPE_PRECISION (type), 1);
> +	  safelen = TYPE_MAX_VALUE (safelen);
> +	}
> +      else
> +	safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
> +      loop->safelen = safelen;
> +    }
>  }
>  
>  
> --- gcc/tree-vectorizer.c.jj	2013-05-13 16:49:03.000000000 +0200
> +++ gcc/tree-vectorizer.c	2013-05-13 20:44:58.721863725 +0200
> @@ -101,7 +101,8 @@ vectorize_loops (void)
>       than all previously defined loops.  This fact allows us to run
>       only over initial loops skipping newly generated ones.  */
>    FOR_EACH_LOOP (li, loop, 0)
> -    if (optimize_loop_nest_for_speed_p (loop))
> +    if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop))
> +	|| loop->safelen)

So you vectorize all loops with a safelen?  I'd say this warrants an
extra flag in struct loop, force_vect.

>        {
>  	loop_vec_info loop_vinfo;
>  	vect_location = find_loop_location (loop);
> @@ -122,6 +123,9 @@ vectorize_loops (void)
>                         LOC_FILE (vect_location), LOC_LINE (vect_location));
>  	vect_transform_loop (loop_vinfo);
>  	num_vectorized_loops++;
> +	/* Now that the loop has been vectorized, allow it to be unrolled
> +	   etc.  */
> +	loop->safelen = NULL_TREE;
>        }
>  
>    vect_location = UNKNOWN_LOC;
> --- gcc/tree-ssa-loop.c.jj	2013-05-13 16:46:36.000000000 +0200
> +++ gcc/tree-ssa-loop.c	2013-05-13 19:12:57.301538324 +0200
> @@ -225,7 +225,8 @@ tree_vectorize (void)
>  static bool
>  gate_tree_vectorize (void)
>  {
> -  return flag_tree_vectorize;
> +  return flag_tree_vectorize
> +	 || (flag_openmp && !global_options_set.x_flag_tree_vectorize);

And a flag in cfun here, whether any loop has force_vect set (or
a flag in current_loops)

>  }
>  
>  struct gimple_opt_pass pass_vectorize =
> --- gcc/tree-ssa-loop-ivcanon.c.jj	2013-05-13 16:46:36.000000000 +0200
> +++ gcc/tree-ssa-loop-ivcanon.c	2013-05-13 20:06:44.176519188 +0200
> @@ -1123,6 +1123,11 @@ tree_unroll_loops_completely_1 (bool may
>    if (changed)
>      return true;
>  
> +  /* Don't unroll #pragma omp simd loops until the vectorizer
> +     attempts to vectorize those.  */
> +  if (loop->safelen)
> +    return false;
> +
>    /* Try to unroll this loop.  */
>    loop_father = loop_outer (loop);
>    if (!loop_father)
> --- gcc/tree-vect-data-refs.c.jj	2013-05-13 16:49:08.000000000 +0200
> +++ gcc/tree-vect-data-refs.c	2013-05-13 20:41:51.579889330 +0200
> @@ -255,6 +255,16 @@ vect_analyze_data_ref_dependence (struct
>    /* Unknown data dependence.  */
>    if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
>      {
> +      /* If user asserted there safelen consecutive iterations can be
> +	 executed concurrently, and safelen >= *max_vf, assume
> +	 independence.  */
> +      if (loop->safelen)
> +	{
> +	  if (compare_tree_int (loop->safelen, *max_vf) < 0)
> +	    *max_vf = tree_low_cst (loop->safelen, 0);
> +	  return false;
> +	}
> +
>        if (STMT_VINFO_GATHER_P (stmtinfo_a)
>  	  || STMT_VINFO_GATHER_P (stmtinfo_b))
>  	{
> @@ -291,6 +301,16 @@ vect_analyze_data_ref_dependence (struct
>    /* Known data dependence.  */
>    if (DDR_NUM_DIST_VECTS (ddr) == 0)
>      {
> +      /* If user asserted there safelen consecutive iterations can be
> +	 executed concurrently, and safelen >= *max_vf, assume
> +	 independence.  */
> +      if (loop->safelen)
> +	{
> +	  if (compare_tree_int (loop->safelen, *max_vf) < 0)
> +	    *max_vf = tree_low_cst (loop->safelen, 0);
> +	  return false;
> +	}
> +

Rather than looking at safelen from data-dependence analysis consumers
data-dependence analysis itself should use the information.  Which
is why I'd like the 'safelen' thing to map to the distance vector
representation of dependence analysis.

Thanks,
Richard.
Jakub Jelinek May 14, 2013, 9:38 a.m. UTC | #2
On Tue, May 14, 2013 at 11:28:43AM +0200, Richard Biener wrote:
> > +  /* If non-NULL, an INTEGER_CST, where the user asserted that for any
> > +     I in [ 0, nb_iterations ) and for any J in
> > +     [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations
> > +     of the loop can be safely evaluated concurrently.  */
> > +  tree safelen;
> 
> Can you make this a double_int (or a HOST_WIDE_INT or an int) instead 
> please?  It should map to data-dependence analysis distance vectors
> which currently is a vector of 'int'.

If all we care about is int, it can be int.  Infinity is when
#pragma omp simd
doesn't contain any simdlen clause, or when Cilk+
#pragma simd
doesn't contain any vectorlength or vectorlengthfor clauses.
So, shall we assign INT_MAX for infinity?  At least the vectorizer
certainly doesn't care about anything beyond MAX_VECTORIZATION_FACTOR.
And I can just map any explicit safelen in the source larger than INT_MAX
as INT_MAX.

> Is there a magic value to tell safelen is "infinity"?  As I read
> above safelen == 0 would mean all iterations are dependent.  Are
> negative safelen values well-defined?  The comment doesn't seem
> to disallow them.

The FEs disallow safelen <= 0 or non-constant.

> Also make sure to copy the field in copy_loop_info and stream
> it in output/input_cfg in lto-streamer-in/out.c.

Ok.

> > +  if (!broken_loop)
> > +    {
> > +      struct loop *loop;
> > +      calculate_dominance_info (CDI_DOMINATORS);
> > +      fix_loop_structure (NULL);
> 
> Ick.  Didn't I properly add loops everywhere?

The loop was previously containing EDGE_ABNORMAL edges (that is something
to prevent any optimizations on those until ompexp had a chance to deal with
those), so there is no loop at all, just the loop->num == 0 for the whole
function if #pragma omp simd appears outside of loops and doesn't contain
any loops inside of its body.

> > --- gcc/tree-vectorizer.c.jj	2013-05-13 16:49:03.000000000 +0200
> > +++ gcc/tree-vectorizer.c	2013-05-13 20:44:58.721863725 +0200
> > @@ -101,7 +101,8 @@ vectorize_loops (void)
> >       than all previously defined loops.  This fact allows us to run
> >       only over initial loops skipping newly generated ones.  */
> >    FOR_EACH_LOOP (li, loop, 0)
> > -    if (optimize_loop_nest_for_speed_p (loop))
> > +    if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop))
> > +	|| loop->safelen)
> 
> So you vectorize all loops with a safelen?  I'd say this warrants an
> extra flag in struct loop, force_vect.

Ok.

> > @@ -225,7 +225,8 @@ tree_vectorize (void)
> >  static bool
> >  gate_tree_vectorize (void)
> >  {
> > -  return flag_tree_vectorize;
> > +  return flag_tree_vectorize
> > +	 || (flag_openmp && !global_options_set.x_flag_tree_vectorize);
> 
> And a flag in cfun here, whether any loop has force_vect set (or
> a flag in current_loops)

Ok.

> Rather than looking at safelen from data-dependence analysis consumers
> data-dependence analysis itself should use the information.  Which
> is why I'd like the 'safelen' thing to map to the distance vector
> representation of dependence analysis.

Will try.

	Jakub
Richard Biener May 14, 2013, 10:16 a.m. UTC | #3
On Tue, 14 May 2013, Jakub Jelinek wrote:

> On Tue, May 14, 2013 at 11:28:43AM +0200, Richard Biener wrote:
> > > +  /* If non-NULL, an INTEGER_CST, where the user asserted that for any
> > > +     I in [ 0, nb_iterations ) and for any J in
> > > +     [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations
> > > +     of the loop can be safely evaluated concurrently.  */
> > > +  tree safelen;
> > 
> > Can you make this a double_int (or a HOST_WIDE_INT or an int) instead 
> > please?  It should map to data-dependence analysis distance vectors
> > which currently is a vector of 'int'.
> 
> If all we care about is int, it can be int.  Infinity is when
> #pragma omp simd
> doesn't contain any simdlen clause, or when Cilk+
> #pragma simd
> doesn't contain any vectorlength or vectorlengthfor clauses.
> So, shall we assign INT_MAX for infinity?  At least the vectorizer
> certainly doesn't care about anything beyond MAX_VECTORIZATION_FACTOR.
> And I can just map any explicit safelen in the source larger than INT_MAX
> as INT_MAX.

Works for me.

> > Is there a magic value to tell safelen is "infinity"?  As I read
> > above safelen == 0 would mean all iterations are dependent.  Are
> > negative safelen values well-defined?  The comment doesn't seem
> > to disallow them.
> 
> The FEs disallow safelen <= 0 or non-constant.
> 
> > Also make sure to copy the field in copy_loop_info and stream
> > it in output/input_cfg in lto-streamer-in/out.c.
> 
> Ok.
> 
> > > +  if (!broken_loop)
> > > +    {
> > > +      struct loop *loop;
> > > +      calculate_dominance_info (CDI_DOMINATORS);
> > > +      fix_loop_structure (NULL);
> > 
> > Ick.  Didn't I properly add loops everywhere?
> 
> The loop was previously containing EDGE_ABNORMAL edges (that is something
> to prevent any optimizations on those until ompexp had a chance to deal with
> those), so there is no loop at all, just the loop->num == 0 for the whole
> function if #pragma omp simd appears outside of loops and doesn't contain
> any loops inside of its body.

But don't we now know the loop (it's header and possibly its latch)
and so we can simply add_loop here?

> > > --- gcc/tree-vectorizer.c.jj	2013-05-13 16:49:03.000000000 +0200
> > > +++ gcc/tree-vectorizer.c	2013-05-13 20:44:58.721863725 +0200
> > > @@ -101,7 +101,8 @@ vectorize_loops (void)
> > >       than all previously defined loops.  This fact allows us to run
> > >       only over initial loops skipping newly generated ones.  */
> > >    FOR_EACH_LOOP (li, loop, 0)
> > > -    if (optimize_loop_nest_for_speed_p (loop))
> > > +    if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop))
> > > +	|| loop->safelen)
> > 
> > So you vectorize all loops with a safelen?  I'd say this warrants an
> > extra flag in struct loop, force_vect.
> 
> Ok.
> 
> > > @@ -225,7 +225,8 @@ tree_vectorize (void)
> > >  static bool
> > >  gate_tree_vectorize (void)
> > >  {
> > > -  return flag_tree_vectorize;
> > > +  return flag_tree_vectorize
> > > +	 || (flag_openmp && !global_options_set.x_flag_tree_vectorize);
> > 
> > And a flag in cfun here, whether any loop has force_vect set (or
> > a flag in current_loops)
> 
> Ok.
> 
> > Rather than looking at safelen from data-dependence analysis consumers
> > data-dependence analysis itself should use the information.  Which
> > is why I'd like the 'safelen' thing to map to the distance vector
> > representation of dependence analysis.
> 
> Will try.

Might not be trivial - the dependency whould have to be of "known"
type and the distance vector maybe safelen+1 (which would be
not exactly correct I think, but there isn't sth like "at least"
safelen+1).  So eventually it needs to be an "unknown" dependency
still with a new interface of a "at least" distance result.

Richard.
Jakub Jelinek May 14, 2013, 10:23 a.m. UTC | #4
On Tue, May 14, 2013 at 12:16:07PM +0200, Richard Biener wrote:
> > The loop was previously containing EDGE_ABNORMAL edges (that is something
> > to prevent any optimizations on those until ompexp had a chance to deal with
> > those), so there is no loop at all, just the loop->num == 0 for the whole
> > function if #pragma omp simd appears outside of loops and doesn't contain
> > any loops inside of its body.
> 
> But don't we now know the loop (it's header and possibly its latch)
> and so we can simply add_loop here?

Ah, add_loop, I was looking for something like that but didn't find it.
I see you've added add_loop for other places in omp-low.c, this spot was
just on gomp-4_0-branch and not on the trunk, will see if I can use it.

> > Will try.
> 
> Might not be trivial - the dependency whould have to be of "known"
> type and the distance vector maybe safelen+1 (which would be
> not exactly correct I think, but there isn't sth like "at least"
> safelen+1).  So eventually it needs to be an "unknown" dependency
> still with a new interface of a "at least" distance result.

I guess I'll start with all other points then...

	Jakub
diff mbox

Patch

--- gcc/cfgloop.h.jj	2013-05-13 16:49:44.000000000 +0200
+++ gcc/cfgloop.h	2013-05-13 17:30:18.630883633 +0200
@@ -176,6 +176,12 @@  struct GTY ((chain_next ("%h.next"))) lo
 
   /* Number of iteration analysis data for RTL.  */
   struct niter_desc *simple_loop_desc;
+
+  /* If non-NULL, an INTEGER_CST, where the user asserted that for any
+     I in [ 0, nb_iterations ) and for any J in
+     [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations
+     of the loop can be safely evaluated concurrently.  */
+  tree safelen;
 };
 
 /* Flags for state of loop structure.  */
--- gcc/omp-low.c.jj	2013-05-13 16:37:05.000000000 +0200
+++ gcc/omp-low.c	2013-05-13 18:46:18.310405585 +0200
@@ -4960,6 +4960,8 @@  expand_omp_simd (struct omp_region *regi
   edge e, ne;
   tree *counts = NULL;
   int i;
+  tree safelen = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt),
+				  OMP_CLAUSE_SAFELEN);
 
   type = TREE_TYPE (fd->loop.v);
   entry_bb = region->entry;
@@ -5157,6 +5159,22 @@  expand_omp_simd (struct omp_region *regi
   set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
   set_immediate_dominator (CDI_DOMINATORS, l2_bb, l1_bb);
   set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
+
+  if (!broken_loop)
+    {
+      struct loop *loop;
+      calculate_dominance_info (CDI_DOMINATORS);
+      fix_loop_structure (NULL);
+      loop = l1_bb->loop_father;
+      if (safelen == NULL_TREE)
+	{
+	  safelen = build_nonstandard_integer_type (TYPE_PRECISION (type), 1);
+	  safelen = TYPE_MAX_VALUE (safelen);
+	}
+      else
+	safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
+      loop->safelen = safelen;
+    }
 }
 
 
--- gcc/tree-vectorizer.c.jj	2013-05-13 16:49:03.000000000 +0200
+++ gcc/tree-vectorizer.c	2013-05-13 20:44:58.721863725 +0200
@@ -101,7 +101,8 @@  vectorize_loops (void)
      than all previously defined loops.  This fact allows us to run
      only over initial loops skipping newly generated ones.  */
   FOR_EACH_LOOP (li, loop, 0)
-    if (optimize_loop_nest_for_speed_p (loop))
+    if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop))
+	|| loop->safelen)
       {
 	loop_vec_info loop_vinfo;
 	vect_location = find_loop_location (loop);
@@ -122,6 +123,9 @@  vectorize_loops (void)
                        LOC_FILE (vect_location), LOC_LINE (vect_location));
 	vect_transform_loop (loop_vinfo);
 	num_vectorized_loops++;
+	/* Now that the loop has been vectorized, allow it to be unrolled
+	   etc.  */
+	loop->safelen = NULL_TREE;
       }
 
   vect_location = UNKNOWN_LOC;
--- gcc/tree-ssa-loop.c.jj	2013-05-13 16:46:36.000000000 +0200
+++ gcc/tree-ssa-loop.c	2013-05-13 19:12:57.301538324 +0200
@@ -225,7 +225,8 @@  tree_vectorize (void)
 static bool
 gate_tree_vectorize (void)
 {
-  return flag_tree_vectorize;
+  return flag_tree_vectorize
+	 || (flag_openmp && !global_options_set.x_flag_tree_vectorize);
 }
 
 struct gimple_opt_pass pass_vectorize =
--- gcc/tree-ssa-loop-ivcanon.c.jj	2013-05-13 16:46:36.000000000 +0200
+++ gcc/tree-ssa-loop-ivcanon.c	2013-05-13 20:06:44.176519188 +0200
@@ -1123,6 +1123,11 @@  tree_unroll_loops_completely_1 (bool may
   if (changed)
     return true;
 
+  /* Don't unroll #pragma omp simd loops until the vectorizer
+     attempts to vectorize those.  */
+  if (loop->safelen)
+    return false;
+
   /* Try to unroll this loop.  */
   loop_father = loop_outer (loop);
   if (!loop_father)
--- gcc/tree-vect-data-refs.c.jj	2013-05-13 16:49:08.000000000 +0200
+++ gcc/tree-vect-data-refs.c	2013-05-13 20:41:51.579889330 +0200
@@ -255,6 +255,16 @@  vect_analyze_data_ref_dependence (struct
   /* Unknown data dependence.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
     {
+      /* If user asserted there safelen consecutive iterations can be
+	 executed concurrently, and safelen >= *max_vf, assume
+	 independence.  */
+      if (loop->safelen)
+	{
+	  if (compare_tree_int (loop->safelen, *max_vf) < 0)
+	    *max_vf = tree_low_cst (loop->safelen, 0);
+	  return false;
+	}
+
       if (STMT_VINFO_GATHER_P (stmtinfo_a)
 	  || STMT_VINFO_GATHER_P (stmtinfo_b))
 	{
@@ -291,6 +301,16 @@  vect_analyze_data_ref_dependence (struct
   /* Known data dependence.  */
   if (DDR_NUM_DIST_VECTS (ddr) == 0)
     {
+      /* If user asserted there safelen consecutive iterations can be
+	 executed concurrently, and safelen >= *max_vf, assume
+	 independence.  */
+      if (loop->safelen)
+	{
+	  if (compare_tree_int (loop->safelen, *max_vf) < 0)
+	    *max_vf = tree_low_cst (loop->safelen, 0);
+	  return false;
+	}
+
       if (STMT_VINFO_GATHER_P (stmtinfo_a)
 	  || STMT_VINFO_GATHER_P (stmtinfo_b))
 	{
--- gcc/tree-if-conv.c.jj	2013-05-13 16:49:06.000000000 +0200
+++ gcc/tree-if-conv.c	2013-05-13 19:08:27.227188600 +0200
@@ -1822,6 +1822,10 @@  main_tree_if_conversion (void)
     return 0;
 
   FOR_EACH_LOOP (li, loop, 0)
+    if (flag_tree_loop_if_convert == 1
+	|| flag_tree_loop_if_convert_stores == 1
+	|| flag_tree_vectorize
+	|| loop->safelen)
     changed |= tree_if_conversion (loop);
 
   if (changed)
@@ -1848,7 +1852,9 @@  main_tree_if_conversion (void)
 static bool
 gate_tree_if_conversion (void)
 {
-  return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
+  return (((flag_tree_vectorize
+	    || (flag_openmp && !global_options_set.x_flag_tree_vectorize))
+	   && flag_tree_loop_if_convert != 0)
 	  || flag_tree_loop_if_convert == 1
 	  || flag_tree_loop_if_convert_stores == 1);
 }