Patchwork [gomp4] Some progress on #pragma omp simd

login
register
mail settings
Submitter Jakub Jelinek
Date April 19, 2013, 1:29 p.m.
Message ID <20130419132957.GE12880@tucnak.redhat.com>
Download mbox | patch
Permalink /patch/237963/
State New
Headers show

Comments

Jakub Jelinek - April 19, 2013, 1:29 p.m.
Hi!

I've committed the following patch to gomp4 branch.
#pragma omp simd loops now are handled with all its clauses from parsing up
to and including omp expansion, so should actually run correctly, though
haven't added any runtime testcases yet.

#pragma omp for simd is handled only partially, the omp expansion isn't
written for it yet (not sure what is better, if to split it say at omplower
time into #pragma omp for with #pragma omp simd inside of it, or handle it
all during expansion with using expand_omp_simd etc. as helpers.

Pointer iterators are untested yet, so likely broken, will need to handle it
later, similarly haven't tested broken loops (if #pragma omp simd body
contains some noreturn call or similar).

Aligned clauses are parsed, but nothing is emitted out of those.
I think best would be to emit __builtin_assume_aligned before the loop; the
exact semantics of the clauses aren't 100% clear yet, my current
understanding of the intent is that for POINTER_TYPE_P vars the aligned
directive talks about alignment of what the pointer points to at right
before the loop, so for !is_global_var pointers perhaps all we need is
really ptr = __builtin_assume_aligned (ptr, aligment);
where default would for non-zero
targetm.vectorize.autovectorize_vector_sizes ()
the highest size in it (in bytes), and otherwise maximum mode alignment of
targetm.vectorize.preferred_simd_mode for say SI/SF/DF modes or so.
For is_global_var __builtin_assume_aligned wouldn't likely work and probably
we want to avoid self-assignment, not sure if it is important enough to
handle.  For array vars, especially globals (and ones where gcc can't affect
their alignment easily, i.e. common vars and externs), it would be nice to
have something too; so perhaps just compute __builtin_assume_aligned on
their address before the loop and replace all occurrences of the var
addresses in the loop with the result of __builtin_assume_aligned.  Thoughts
on this?

More important for vectorization is propagation of the safelen clause (or
the implicit safelen(infinity) if not present) to the vectorizer, waiting
for richi's loop preservation patch here.

The patch handles even expansion of collapsed #pragma omp simd, see the
comment above expand_omp_simd for what it emits (or simd2.c testcase),
I've tried to make the code vectorization friendly (no jumps etc.), though
the question is how many collapsed simd loops will be actually vectorizable
in reality (and have some questions about this in
http://openmp.org/forum/viewtopic.php?f=12&t=1544#p6184 ).  Perhaps if
collapsed simd loops would be important and a few inner steps or all steps would be
known powers of two and similarly the number of iterations of the inner or
all loops, we could do better, instead of the heavy COND_EXPR using code
we could just shift and mask around the iteration var.

Any comments?

2013-04-19  Jakub Jelinek  <jakub@redhat.com>

	* tree.h (OMP_CLAUSE_LINEAR_NO_COPYIN,
	OMP_CLAUSE_LINEAR_NO_COPYOUT): Define.
	* omp-low.c (extract_omp_for_data): Handle #pragma omp simd.
	(build_outer_var_ref): For #pragma omp simd allow linear etc.
	clauses to bind even to private vars.
	(scan_sharing_clauses): Handle OMP_CLAUSE_LINEAR, OMP_CLAUSE_ALIGNED
	and OMP_CLAUSE_SAFELEN.
	(lower_rec_input_clauses): Handle OMP_CLAUSE_LINEAR.  Don't emit
	a GOMP_barrier call for firstprivate/lastprivate in #pragma omp simd.
	(lower_lastprivate_clauses): Handle also OMP_CLAUSE_LINEAR.
	(expand_omp_simd): New function.
	(expand_omp_for): Handle #pragma omp simd.
	* gimplify.c (enum gimplify_omp_var_data): Add GOVD_LINEAR and
	GOVD_ALIGNED, add GOVD_LINEAR into GOVD_DATA_SHARE_CLASS.
	(enum omp_region_type): Add ORT_SIMD.
	(gimple_add_tmp_var, gimplify_var_or_parm_decl, omp_check_private,
	omp_firstprivatize_variable, omp_notice_variable): Handle ORT_SIMD
	like ORT_WORKSHARE.
	(omp_is_private): Likewise.  Add SIMD argument, tweak diagnostics
	and add extra errors in simd constructs.
	(gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle
	OMP_CLAUSE_LINEAR, OMP_CLAUSE_ALIGNED and OMP_CLAUSE_SAFELEN.
	(gimplify_adjust_omp_clauses_1): Handle GOVD_LASTPRIVATE and
	GOVD_ALIGNED.
	(gimplify_omp_for): Handle #pragma omp simd.
cp/
	* cp-tree.h (CP_OMP_CLAUSE_INFO): Also allow it on OMP_CLAUSE_LINEAR.
	* parser.c (cp_parser_omp_var_list_no_open): If colon is non-NULL,
	temporarily disable colon_corrects_to_scope_p during the parsing
	of the variable list.
	(cp_parser_omp_clause_safelen, cp_parser_omp_clause_simdlen): New
	functions.
	(cp_parser_omp_all_clauses): Handle OMP_CLAUSE_SAFELEN and
	OMP_CLAUSE_SIMDLEN.
	* semantics.c (finish_omp_clauses): Allow NULL_TREE in
	OMP_CLAUSE_ALIGNED_ALIGNMENT.
testsuite/
	* c-c++-common/gomp/simd1.c: New test.
	* c-c++-common/gomp/simd2.c: New test.


	Jakub

Patch

--- gcc/tree.h.jj	2013-03-27 13:01:09.000000000 +0100
+++ gcc/tree.h	2013-03-27 13:01:09.000000000 +0100
@@ -613,6 +613,9 @@  struct GTY(()) tree_base {
        OMP_CLAUSE_PRIVATE_DEBUG in
            OMP_CLAUSE_PRIVATE
 
+       OMP_CLAUSE_LINEAR_NO_COPYIN in
+	   OMP_CLAUSE_LINEAR
+
        TRANSACTION_EXPR_RELAXED in
 	   TRANSACTION_EXPR
 
@@ -633,6 +636,9 @@  struct GTY(()) tree_base {
        OMP_CLAUSE_PRIVATE_OUTER_REF in
 	   OMP_CLAUSE_PRIVATE
 
+       OMP_CLAUSE_LINEAR_NO_COPYOUT in
+	   OMP_CLAUSE_LINEAR
+
        TYPE_REF_IS_RVALUE in
 	   REFERENCE_TYPE
 
@@ -1917,6 +1923,16 @@  extern void protected_set_expr_location
 #define OMP_CLAUSE_REDUCTION_PLACEHOLDER(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_REDUCTION), 3)
 
+/* True if a LINEAR clause doesn't need copy in.  True for iterator vars which
+   are always initialized inside of the loop construct, false otherwise.  */
+#define OMP_CLAUSE_LINEAR_NO_COPYIN(NODE) \
+  (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR)->base.public_flag)
+
+/* True if a LINEAR clause doesn't need copy out.  True for iterator vars which
+   are declared inside of the simd construct.  */
+#define OMP_CLAUSE_LINEAR_NO_COPYOUT(NODE) \
+  TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR))
+
 #define OMP_CLAUSE_LINEAR_STEP(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR), 1)
 
--- gcc/omp-low.c.jj	2013-04-10 19:11:23.000000000 +0200
+++ gcc/omp-low.c	2013-04-19 12:31:57.207254045 +0200
@@ -222,6 +222,7 @@  extract_omp_for_data (gimple for_stmt, s
   int i;
   struct omp_for_data_loop dummy_loop;
   location_t loc = gimple_location (for_stmt);
+  bool non_ws = gimple_omp_for_kind (for_stmt) == GF_OMP_FOR_KIND_SIMD;
 
   fd->for_stmt = for_stmt;
   fd->pre = NULL;
@@ -292,7 +292,6 @@  extract_omp_for_data (gimple for_stmt, s
       else
 	loop = &dummy_loop;
 
-
       loop->v = gimple_omp_for_index (for_stmt, i);
       gcc_assert (SSA_VAR_P (loop->v));
       gcc_assert (TREE_CODE (TREE_TYPE (loop->v)) == INTEGER_TYPE
@@ -349,7 +348,18 @@  extract_omp_for_data (gimple for_stmt, s
 	  gcc_unreachable ();
 	}
 
-      if (iter_type != long_long_unsigned_type_node)
+      if (non_ws)
+	{
+	  if (fd->collapse == 1)
+	    iter_type = TREE_TYPE (loop->v);
+	  else if (i == 0
+		   || TYPE_PRECISION (iter_type)
+		      < TYPE_PRECISION (TREE_TYPE (loop->v)))
+	    iter_type
+	      = build_nonstandard_integer_type
+		  (TYPE_PRECISION (TREE_TYPE (loop->v)), 1);
+	}
+      else if (iter_type != long_long_unsigned_type_node)
 	{
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->v)))
 	    iter_type = long_long_unsigned_type_node;
@@ -440,7 +450,7 @@  extract_omp_for_data (gimple for_stmt, s
 	}
     }
 
-  if (count)
+  if (count && !non_ws)
     {
       if (!tree_int_cst_lt (count, TYPE_MAX_VALUE (long_integer_type_node)))
 	iter_type = long_long_unsigned_type_node;
@@ -919,6 +929,11 @@  build_outer_var_ref (tree var, omp_conte
     /* This can happen with orphaned constructs.  If var is reference, it is
        possible it is shared and as such valid.  */
     x = var;
+  else if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
+	   && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+    /* #pragma omp simd isn't a worksharing construct, and can reference even
+       private vars in its linear etc. clauses.  */
+    x = var;
   else
     gcc_unreachable ();
 
@@ -1420,6 +1435,7 @@  scan_sharing_clauses (tree clauses, omp_
 
 	case OMP_CLAUSE_FIRSTPRIVATE:
 	case OMP_CLAUSE_REDUCTION:
+	case OMP_CLAUSE_LINEAR:
 	  decl = OMP_CLAUSE_DECL (c);
 	do_private:
 	  if (is_variable_sized (decl))
@@ -1472,6 +1488,8 @@  scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_UNTIED:
 	case OMP_CLAUSE_MERGEABLE:
 	case OMP_CLAUSE_PROC_BIND:
+	case OMP_CLAUSE_SAFELEN:
+	case OMP_CLAUSE_ALIGNED:
 	  break;
 
 	default:
@@ -1495,6 +1513,7 @@  scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_PRIVATE:
 	case OMP_CLAUSE_FIRSTPRIVATE:
 	case OMP_CLAUSE_REDUCTION:
+	case OMP_CLAUSE_LINEAR:
 	  decl = OMP_CLAUSE_DECL (c);
 	  if (is_variable_sized (decl))
 	    install_var_local (decl, ctx);
@@ -1525,6 +1544,8 @@  scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_FINAL:
 	case OMP_CLAUSE_MERGEABLE:
 	case OMP_CLAUSE_PROC_BIND:
+	case OMP_CLAUSE_SAFELEN:
+	case OMP_CLAUSE_ALIGNED:
 	  break;
 
 	default:
@@ -2297,6 +2318,7 @@  lower_rec_input_clauses (tree clauses, g
 	    case OMP_CLAUSE_FIRSTPRIVATE:
 	    case OMP_CLAUSE_COPYIN:
 	    case OMP_CLAUSE_REDUCTION:
+	    case OMP_CLAUSE_LINEAR:
 	      break;
 	    case OMP_CLAUSE_LASTPRIVATE:
 	      if (OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE (c))
@@ -2442,6 +2464,7 @@  lower_rec_input_clauses (tree clauses, g
 		}
 	      else
 		x = NULL;
+	    do_private:
 	      x = lang_hooks.decls.omp_clause_default_ctor (c, new_var, x);
 	      if (x)
 		gimplify_and_add (x, ilist);
@@ -2459,6 +2482,15 @@  lower_rec_input_clauses (tree clauses, g
 		}
 	      break;
 
+	    case OMP_CLAUSE_LINEAR:
+	      if (!OMP_CLAUSE_LINEAR_NO_COPYIN (c))
+		goto do_firstprivate;
+	      if (OMP_CLAUSE_LINEAR_NO_COPYOUT (c))
+		x = NULL;
+	      else
+		x = build_outer_var_ref (var, ctx);
+	      goto do_private;
+
 	    case OMP_CLAUSE_FIRSTPRIVATE:
 	      if (is_task_ctx (ctx))
 		{
@@ -2474,6 +2506,7 @@  lower_rec_input_clauses (tree clauses, g
 		      goto do_dtor;
 		    }
 		}
+	    do_firstprivate:
 	      x = build_outer_var_ref (var, ctx);
 	      x = lang_hooks.decls.omp_clause_copy_ctor (c, new_var, x);
 	      gimplify_and_add (x, ilist);
@@ -2537,7 +2570,12 @@  lower_rec_input_clauses (tree clauses, g
      lastprivate clauses we need to ensure the lastprivate copying
      happens after firstprivate copying in all threads.  */
   if (copyin_by_ref || lastprivate_firstprivate)
-    gimplify_and_add (build_omp_barrier (), ilist);
+    {
+      /* Don't add any barrier for #pragma omp simd.  */
+      if (gimple_code (ctx->stmt) != GIMPLE_OMP_FOR
+	  || gimple_omp_for_kind (ctx->stmt) != GF_OMP_FOR_KIND_SIMD)
+	gimplify_and_add (build_omp_barrier (), ilist);
+    }
 }
 
 
@@ -2547,13 +2585,17 @@  lower_rec_input_clauses (tree clauses, g
 
 static void
 lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
-			    omp_context *ctx)
+			   omp_context *ctx)
 {
   tree x, c, label = NULL;
   bool par_clauses = false;
 
-  /* Early exit if there are no lastprivate clauses.  */
-  clauses = find_omp_clause (clauses, OMP_CLAUSE_LASTPRIVATE);
+  /* Early exit if there are no lastprivate or linear clauses.  */
+  for (; clauses ; clauses = OMP_CLAUSE_CHAIN (clauses))
+    if (OMP_CLAUSE_CODE (clauses) == OMP_CLAUSE_LASTPRIVATE
+	|| (OMP_CLAUSE_CODE (clauses) == OMP_CLAUSE_LINEAR
+	    && !OMP_CLAUSE_LINEAR_NO_COPYOUT (clauses)))
+      break;
   if (clauses == NULL)
     {
       /* If this was a workshare clause, see if it had been combined
@@ -2595,18 +2637,21 @@  lower_lastprivate_clauses (tree clauses,
       tree var, new_var;
       location_t clause_loc = OMP_CLAUSE_LOCATION (c);
 
-      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE)
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+	  || (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR
+	      && !OMP_CLAUSE_LINEAR_NO_COPYOUT (c)))
 	{
 	  var = OMP_CLAUSE_DECL (c);
 	  new_var = lookup_decl (var, ctx);
 
-	  if (OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c))
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+	      && OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c))
 	    {
 	      lower_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 	      gimple_seq_add_seq (stmt_list,
 				  OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c));
+	      OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c) = NULL;
 	    }
-	  OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c) = NULL;
 
 	  x = build_outer_var_ref (var, ctx);
 	  if (is_reference (var))
@@ -4625,6 +4670,273 @@  expand_omp_for_static_chunk (struct omp_
 }
 
 
+/* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
+   loop.  Given parameters:
+
+	for (V = N1; V cond N2; V += STEP) BODY;
+
+   where COND is "<" or ">", we generate pseudocode
+
+	V = N1;
+	goto L1;
+    L0:
+	BODY;
+	V += STEP;
+    L1:
+	if (V cond N2) goto L0; else goto L2;
+    L2:
+
+    For collapsed loops, given parameters:
+      collapse(3)
+      for (V1 = N11; V1 cond1 N12; V1 += STEP1)
+	for (V2 = N21; V2 cond2 N22; V2 += STEP2)
+	  for (V3 = N31; V3 cond3 N32; V3 += STEP3)
+	    BODY;
+
+    we generate pseudocode
+
+	if (cond3 is <)
+	  adj = STEP3 - 1;
+	else
+	  adj = STEP3 + 1;
+	count3 = (adj + N32 - N31) / STEP3;
+	if (cond2 is <)
+	  adj = STEP2 - 1;
+	else
+	  adj = STEP2 + 1;
+	count2 = (adj + N22 - N21) / STEP2;
+	if (cond1 is <)
+	  adj = STEP1 - 1;
+	else
+	  adj = STEP1 + 1;
+	count1 = (adj + N12 - N11) / STEP1;
+	count = count1 * count2 * count3;
+	V = 0;
+	V1 = N11;
+	V2 = N21;
+	V3 = N31;
+	goto L1;
+    L0:
+	BODY;
+	V += 1;
+	V3 += STEP3;
+	V2 += (V3 cond3 N32) ? 0 : STEP2;
+	V3 = (V3 cond3 N32) ? V3 : N31;
+	V1 += (V2 cond2 N22) ? 0 : STEP1;
+	V2 = (V2 cond2 N22) ? V2 : N21;
+    L1:
+	if (V < count) goto L0; else goto L2;
+    L2:
+
+      */
+
+static void
+expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type, t;
+  basic_block entry_bb, cont_bb, exit_bb, l0_bb, l1_bb, l2_bb;
+  gimple_stmt_iterator gsi;
+  gimple stmt;
+  bool broken_loop = region->cont == NULL;
+  edge e, ne;
+  tree *counts = NULL;
+  int i;
+
+  type = TREE_TYPE (fd->loop.v);
+  entry_bb = region->entry;
+  cont_bb = region->cont;
+  gcc_assert (EDGE_COUNT (entry_bb->succs) == 2);
+  gcc_assert (broken_loop
+	      || BRANCH_EDGE (entry_bb)->dest == FALLTHRU_EDGE (cont_bb)->dest);
+  l0_bb = FALLTHRU_EDGE (entry_bb)->dest;
+  if (!broken_loop)
+    {
+      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l0_bb);
+      gcc_assert (EDGE_COUNT (cont_bb->succs) == 2);
+      l1_bb = split_block (cont_bb, last_stmt (cont_bb))->dest;
+      l2_bb = BRANCH_EDGE (entry_bb)->dest;
+    }
+  else
+    {
+      BRANCH_EDGE (entry_bb)->flags &= ~EDGE_ABNORMAL;
+      l1_bb = split_edge (BRANCH_EDGE (entry_bb));
+      l2_bb = single_succ (l1_bb);
+    }
+  exit_bb = region->exit;
+
+  gsi = gsi_last_bb (entry_bb);
+
+  gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR);
+  /* Not needed in SSA form right now.  */
+  gcc_assert (!gimple_in_ssa_p (cfun));
+  if (fd->collapse > 1)
+    {
+      counts = XALLOCAVEC (tree, fd->collapse);
+      for (i = 0; i < fd->collapse; i++)
+	{
+	  tree itype = TREE_TYPE (fd->loops[i].v);
+
+	  if (POINTER_TYPE_P (itype))
+	    itype = signed_type_for (itype);
+	  t = build_int_cst (itype, (fd->loops[i].cond_code == LT_EXPR
+				     ? -1 : 1));
+	  t = fold_build2 (PLUS_EXPR, itype,
+			   fold_convert (itype, fd->loops[i].step), t);
+	  t = fold_build2 (PLUS_EXPR, itype, t,
+			   fold_convert (itype, fd->loops[i].n2));
+	  t = fold_build2 (MINUS_EXPR, itype, t,
+			   fold_convert (itype, fd->loops[i].n1));
+	  if (TYPE_UNSIGNED (itype) && fd->loops[i].cond_code == GT_EXPR)
+	    t = fold_build2 (TRUNC_DIV_EXPR, itype,
+			     fold_build1 (NEGATE_EXPR, itype, t),
+			     fold_build1 (NEGATE_EXPR, itype,
+					  fold_convert (itype,
+							fd->loops[i].step)));
+	  else
+	    t = fold_build2 (TRUNC_DIV_EXPR, itype, t,
+			     fold_convert (itype, fd->loops[i].step));
+	  t = fold_convert (type, t);
+	  if (TREE_CODE (t) == INTEGER_CST)
+	    counts[i] = t;
+	  else
+	    {
+	      counts[i] = create_tmp_reg (type, ".count");
+	      t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+					    true, GSI_SAME_STMT);
+	      stmt = gimple_build_assign (counts[i], t);
+	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+	    }
+	  if (SSA_VAR_P (fd->loop.n2))
+	    {
+	      if (i == 0)
+		t = counts[0];
+	      else
+		{
+		  t = fold_build2 (MULT_EXPR, type, fd->loop.n2, counts[i]);
+		  t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+						true, GSI_SAME_STMT);
+		}
+	      stmt = gimple_build_assign (fd->loop.n2, t);
+	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+	    }
+	}
+    }
+  t = fold_convert (type, fd->loop.n1);
+  t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+			       	true, GSI_SAME_STMT);
+  gsi_insert_before (&gsi, gimple_build_assign (fd->loop.v, t), GSI_SAME_STMT);
+  if (fd->collapse > 1)
+    for (i = 0; i < fd->collapse; i++)
+      {
+	t = fold_convert (TREE_TYPE (fd->loops[i].v), fd->loops[i].n1);
+	t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+				      true, GSI_SAME_STMT);
+	gsi_insert_before (&gsi, gimple_build_assign (fd->loops[i].v, t),
+			   GSI_SAME_STMT);
+      }
+
+  /* Remove the GIMPLE_OMP_FOR statement.  */
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+
+      if (POINTER_TYPE_P (type))
+	t = fold_build_pointer_plus (fd->loop.v, fd->loop.step);
+      else
+	t = fold_build2 (PLUS_EXPR, type, fd->loop.v, fd->loop.step);
+      t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+				    true, GSI_SAME_STMT);
+      stmt = gimple_build_assign (fd->loop.v, t);
+      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+
+      if (fd->collapse > 1)
+	{
+	  i = fd->collapse - 1;
+	  t = fold_convert (TREE_TYPE (fd->loops[i].v), fd->loops[i].step);
+	  t = build2 (PLUS_EXPR, TREE_TYPE (fd->loops[i].v),
+		      fd->loops[i].v, t);
+	  t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+					true, GSI_SAME_STMT);
+	  stmt = gimple_build_assign (fd->loops[i].v, t);
+	  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+
+	  for (i = fd->collapse - 1; i > 0; i--)
+	    {
+	      tree itype = TREE_TYPE (fd->loops[i].v);
+	      tree itype2 = TREE_TYPE (fd->loops[i - 1].v);
+	      t = build3 (COND_EXPR, itype2,
+			  build2 (fd->loops[i].cond_code, boolean_type_node,
+				  fd->loops[i].v,
+				  fold_convert (itype, fd->loops[i].n2)),
+			  build_int_cst (itype2, 0),
+			  fold_convert (itype2, fd->loops[i - 1].step));
+	      t = build2 (PLUS_EXPR, itype2, fd->loops[i - 1].v, t);
+	      t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+					    true, GSI_SAME_STMT);
+	      stmt = gimple_build_assign (fd->loops[i - 1].v, t);
+	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+
+	      t = build3 (COND_EXPR, itype,
+			  build2 (fd->loops[i].cond_code, boolean_type_node,
+				  fd->loops[i].v,
+				  fold_convert (itype, fd->loops[i].n2)),
+			  fd->loops[i].v,
+			  fold_convert (itype, fd->loops[i].n1));
+	      t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
+					    true, GSI_SAME_STMT);
+	      stmt = gimple_build_assign (fd->loops[i].v, t);
+	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+	    }
+	}
+
+      /* Remove GIMPLE_OMP_CONTINUE.  */
+      gsi_remove (&gsi, true);
+    }
+
+  /* Emit the condition in L1_BB.  */
+  gsi = gsi_start_bb (l1_bb);
+
+  t = fold_convert (type, fd->loop.n2);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+				false, GSI_CONTINUE_LINKING);
+  t = build2 (fd->loop.cond_code, boolean_type_node, fd->loop.v, t);
+  gsi_insert_after (&gsi, gimple_build_cond_empty (t), GSI_CONTINUE_LINKING);
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+
+  /* Connect the new blocks.  */
+  remove_edge (FALLTHRU_EDGE (entry_bb));
+
+  if (!broken_loop)
+    {
+      remove_edge (BRANCH_EDGE (entry_bb));
+      make_edge (entry_bb, l1_bb, EDGE_FALLTHRU);
+
+      e = BRANCH_EDGE (l1_bb);
+      ne = FALLTHRU_EDGE (l1_bb);
+      e->flags = EDGE_TRUE_VALUE;
+      ne->flags = EDGE_FALSE_VALUE;
+      e->probability = REG_BR_PROB_BASE * 7 / 8;
+      ne->probability = REG_BR_PROB_BASE / 8;
+
+      set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
+      set_immediate_dominator (CDI_DOMINATORS, l2_bb, l1_bb);
+      set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
+    }
+  else
+    {
+      single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+    }
+}
+
+
 /* Expand the OpenMP loop defined by REGION.  */
 
 static void
@@ -4650,10 +4962,12 @@  expand_omp_for (struct omp_region *regio
       FALLTHRU_EDGE (region->cont)->flags &= ~EDGE_ABNORMAL;
     }
 
-  if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
-      && !fd.have_ordered
-      && fd.collapse == 1
-      && region->cont != NULL)
+  if (gimple_omp_for_kind (fd.for_stmt) == GF_OMP_FOR_KIND_SIMD)
+    expand_omp_simd (region, &fd);
+  else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
+	   && !fd.have_ordered
+	   && fd.collapse == 1
+	   && region->cont != NULL)
     {
       if (fd.chunk_size == NULL)
 	expand_omp_for_static_nochunk (region, &fd);
@@ -6318,7 +6632,8 @@  lower_omp_for (gimple_stmt_iterator *gsi
   /* Once lowered, extract the bounds and clauses.  */
   extract_omp_for_data (stmt, &fd, NULL);
 
-  lower_omp_for_lastprivate (&fd, &body, &dlist, ctx);
+  if (gimple_omp_for_kind (fd.for_stmt) != GF_OMP_FOR_KIND_SIMD)
+    lower_omp_for_lastprivate (&fd, &body, &dlist, ctx);
 
   gimple_seq_add_stmt (&body, stmt);
   gimple_seq_add_seq (&body, gimple_omp_body (stmt));
@@ -6334,6 +6649,13 @@  lower_omp_for (gimple_stmt_iterator *gsi
 
   /* Region exit marker goes at the end of the loop body.  */
   gimple_seq_add_stmt (&body, gimple_build_omp_return (fd.have_nowait));
+  if (gimple_omp_for_kind (fd.for_stmt) == GF_OMP_FOR_KIND_SIMD)
+    {
+      dlist = NULL;
+      lower_lastprivate_clauses (gimple_omp_for_clauses (fd.for_stmt),
+				 NULL_TREE, &dlist, ctx);
+      gimple_seq_add_seq (&body, dlist);
+    }
 
   pop_gimplify_context (new_stmt);
 
--- gcc/gimplify.c.jj	2013-04-10 19:11:23.000000000 +0200
+++ gcc/gimplify.c	2013-04-16 15:40:16.120410400 +0200
@@ -59,14 +59,18 @@  enum gimplify_omp_var_data
   GOVD_LOCAL = 128,
   GOVD_DEBUG_PRIVATE = 256,
   GOVD_PRIVATE_OUTER_REF = 512,
+  GOVD_LINEAR = 1024,
+  GOVD_ALIGNED = 2048,
   GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
-			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LOCAL)
+			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
+			   | GOVD_LOCAL)
 };
 
 
 enum omp_region_type
 {
   ORT_WORKSHARE = 0,
+  ORT_SIMD = 1, /* #pragma omp for simd is ORT_WORKSHARE.  */
   ORT_PARALLEL = 2,
   ORT_COMBINED_PARALLEL = 3,
   ORT_TASK = 4,
@@ -755,7 +759,9 @@  gimple_add_tmp_var (tree tmp)
       if (gimplify_omp_ctxp)
 	{
 	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
-	  while (ctx && ctx->region_type == ORT_WORKSHARE)
+	  while (ctx
+		 && (ctx->region_type == ORT_WORKSHARE
+		     || ctx->region_type == ORT_SIMD))
 	    ctx = ctx->outer_context;
 	  if (ctx)
 	    omp_add_variable (ctx, tmp, GOVD_LOCAL | GOVD_SEEN);
@@ -2102,7 +2108,9 @@  gimplify_var_or_parm_decl (tree *expr_p)
 	  && decl_function_context (decl) != current_function_decl)
 	{
 	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
-	  while (ctx && ctx->region_type == ORT_WORKSHARE)
+	  while (ctx
+		 && (ctx->region_type == ORT_WORKSHARE
+		     || ctx->region_type == ORT_SIMD))
 	    ctx = ctx->outer_context;
 	  if (!ctx && !pointer_set_insert (nonlocal_vlas, decl))
 	    {
@@ -5751,7 +5759,8 @@  omp_firstprivatize_variable (struct gimp
 	  else
 	    return;
 	}
-      else if (ctx->region_type != ORT_WORKSHARE)
+      else if (ctx->region_type != ORT_WORKSHARE
+	       && ctx->region_type != ORT_SIMD)
 	omp_add_variable (ctx, decl, GOVD_FIRSTPRIVATE);
 
       ctx = ctx->outer_context;
@@ -5973,7 +5982,8 @@  omp_notice_variable (struct gimplify_omp
       enum omp_clause_default_kind default_kind, kind;
       struct gimplify_omp_ctx *octx;
 
-      if (ctx->region_type == ORT_WORKSHARE)
+      if (ctx->region_type == ORT_WORKSHARE
+	  || ctx->region_type == ORT_SIMD)
 	goto do_outer;
 
       /* ??? Some compiler-generated variables (like SAVE_EXPRs) could be
@@ -6086,7 +6096,7 @@  omp_notice_variable (struct gimplify_omp
    to the contrary in the innermost scope, generate an error.  */
 
 static bool
-omp_is_private (struct gimplify_omp_ctx *ctx, tree decl)
+omp_is_private (struct gimplify_omp_ctx *ctx, tree decl, bool simd)
 {
   splay_tree_node n;
 
@@ -6097,8 +6107,12 @@  omp_is_private (struct gimplify_omp_ctx
 	{
 	  if (ctx == gimplify_omp_ctxp)
 	    {
-	      error ("iteration variable %qE should be private",
-		     DECL_NAME (decl));
+	      if (simd)
+		error ("iteration variable %qE is predetermined linear",
+		       DECL_NAME (decl));
+	      else
+		error ("iteration variable %qE should be private",
+		       DECL_NAME (decl));
 	      n->value = GOVD_PRIVATE;
 	      return true;
 	    }
@@ -6116,16 +6130,26 @@  omp_is_private (struct gimplify_omp_ctx
 	  else if ((n->value & GOVD_REDUCTION) != 0)
 	    error ("iteration variable %qE should not be reduction",
 		   DECL_NAME (decl));
+	  else if (simd && (n->value & GOVD_LASTPRIVATE) != 0)
+	    error ("iteration variable %qE should not be lastprivate",
+		   DECL_NAME (decl));
+	  else if (simd && (n->value & GOVD_PRIVATE) != 0)
+	    error ("iteration variable %qE should not be private",
+		   DECL_NAME (decl));
+	  else if (simd && (n->value & GOVD_LINEAR) != 0)
+	    error ("iteration variable %qE is predetermined linear",
+		   DECL_NAME (decl));
 	}
       return (ctx == gimplify_omp_ctxp
 	      || (ctx->region_type == ORT_COMBINED_PARALLEL
 		  && gimplify_omp_ctxp->outer_context == ctx));
     }
 
-  if (ctx->region_type != ORT_WORKSHARE)
+  if (ctx->region_type != ORT_WORKSHARE
+      && ctx->region_type != ORT_SIMD)
     return false;
   else if (ctx->outer_context)
-    return omp_is_private (ctx->outer_context, decl);
+    return omp_is_private (ctx->outer_context, decl, simd);
   return false;
 }
 
@@ -6150,7 +6174,8 @@  omp_check_private (struct gimplify_omp_c
       if (n != NULL)
 	return (n->value & GOVD_SHARED) == 0;
     }
-  while (ctx->region_type == ORT_WORKSHARE);
+  while (ctx->region_type == ORT_WORKSHARE
+	 || ctx->region_type == ORT_SIMD);
   return false;
 }
 
@@ -6203,6 +6228,15 @@  gimplify_scan_omp_clauses (tree *list_p,
 	  flags = GOVD_REDUCTION | GOVD_SEEN | GOVD_EXPLICIT;
 	  check_non_private = "reduction";
 	  goto do_add;
+	case OMP_CLAUSE_LINEAR:
+	  if (gimplify_expr (&OMP_CLAUSE_LINEAR_STEP (c), pre_p, NULL,
+			     is_gimple_val, fb_rvalue) == GS_ERROR)
+	    {
+	      remove = true;
+	      break;
+	    }
+	  flags = GOVD_LINEAR | GOVD_EXPLICIT;
+	  goto do_add;
 
 	do_add:
 	  decl = OMP_CLAUSE_DECL (c);
@@ -6293,7 +6327,7 @@  gimplify_scan_omp_clauses (tree *list_p,
 	case OMP_CLAUSE_NUM_THREADS:
 	  if (gimplify_expr (&OMP_CLAUSE_OPERAND (c, 0), pre_p, NULL,
 			     is_gimple_val, fb_rvalue) == GS_ERROR)
-	      remove = true;
+	    remove = true;
 	  break;
 
 	case OMP_CLAUSE_NOWAIT:
@@ -6302,6 +6336,19 @@  gimplify_scan_omp_clauses (tree *list_p,
 	case OMP_CLAUSE_COLLAPSE:
 	case OMP_CLAUSE_MERGEABLE:
 	case OMP_CLAUSE_PROC_BIND:
+	case OMP_CLAUSE_SAFELEN:
+	  break;
+
+	case OMP_CLAUSE_ALIGNED:
+	  decl = OMP_CLAUSE_DECL (c);
+	  if (error_operand_p (decl))
+	    {
+	      remove = true;
+	      break;
+	    }
+	  if (!is_global_var (decl)
+	      && TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE)
+	    omp_add_variable (ctx, decl, GOVD_ALIGNED);
 	  break;
 
 	case OMP_CLAUSE_DEFAULT:
@@ -6372,6 +6419,10 @@  gimplify_adjust_omp_clauses_1 (splay_tre
     code = OMP_CLAUSE_PRIVATE;
   else if (flags & GOVD_FIRSTPRIVATE)
     code = OMP_CLAUSE_FIRSTPRIVATE;
+  else if (flags & GOVD_LASTPRIVATE)
+    code = OMP_CLAUSE_LASTPRIVATE;
+  else if (flags & GOVD_ALIGNED)
+    return 0;
   else
     gcc_unreachable ();
 
@@ -6404,6 +6455,7 @@  gimplify_adjust_omp_clauses (tree *list_
 	case OMP_CLAUSE_PRIVATE:
 	case OMP_CLAUSE_SHARED:
 	case OMP_CLAUSE_FIRSTPRIVATE:
+	case OMP_CLAUSE_LINEAR:
 	  decl = OMP_CLAUSE_DECL (c);
 	  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
 	  remove = !(n->value & GOVD_SEEN);
@@ -6419,6 +6471,27 @@  gimplify_adjust_omp_clauses (tree *list_
 		  OMP_CLAUSE_SET_CODE (c, OMP_CLAUSE_PRIVATE);
 		  OMP_CLAUSE_PRIVATE_DEBUG (c) = 1;
 		}
+	      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR
+		  && ctx->outer_context
+		  && ctx->outer_context->region_type == ORT_COMBINED_PARALLEL
+		  && !(OMP_CLAUSE_LINEAR_NO_COPYIN (c)
+		       && OMP_CLAUSE_LINEAR_NO_COPYOUT (c))
+		  && !is_global_var (decl))
+		{
+		  n = splay_tree_lookup (ctx->outer_context->variables,
+					 (splay_tree_key) decl);
+		  if (n == NULL
+		      || (n->value & GOVD_DATA_SHARE_CLASS) == 0)
+		    {
+		      int flags = OMP_CLAUSE_LINEAR_NO_COPYIN (c)
+				  ? GOVD_LASTPRIVATE : GOVD_SHARED;
+		      if (n == NULL)
+			omp_add_variable (ctx->outer_context, decl,
+					  flags | GOVD_SEEN);
+		      else
+			n->value |= flags | GOVD_SEEN;
+		    }
+		}
 	    }
 	  break;
 
@@ -6431,6 +6504,15 @@  gimplify_adjust_omp_clauses (tree *list_
 	    = (n->value & GOVD_FIRSTPRIVATE) != 0;
 	  break;
 
+	case OMP_CLAUSE_ALIGNED:
+	  decl = OMP_CLAUSE_DECL (c);
+	  if (!is_global_var (decl))
+	    {
+	      n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
+	      remove = n == NULL || !(n->value & GOVD_SEEN);
+	    }
+	  break;
+
 	case OMP_CLAUSE_REDUCTION:
 	case OMP_CLAUSE_COPYIN:
 	case OMP_CLAUSE_COPYPRIVATE:
@@ -6445,6 +6527,7 @@  gimplify_adjust_omp_clauses (tree *list_
 	case OMP_CLAUSE_FINAL:
 	case OMP_CLAUSE_MERGEABLE:
 	case OMP_CLAUSE_PROC_BIND:
+	case OMP_CLAUSE_SAFELEN:
 	  break;
 
 	default:
@@ -6548,14 +6631,42 @@  gimplify_omp_for (tree *expr_p, gimple_s
   gimple gfor;
   gimple_seq for_body, for_pre_body;
   int i;
+  bool simd;
+  bitmap has_decl_expr = NULL;
 
   for_stmt = *expr_p;
 
+  simd = TREE_CODE (for_stmt) == OMP_SIMD
+	 || TREE_CODE (for_stmt) == OMP_FOR_SIMD;
   gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p,
-			     ORT_WORKSHARE);
+			     TREE_CODE (for_stmt) == OMP_SIMD
+			     ? ORT_SIMD : ORT_WORKSHARE);
 
   /* Handle OMP_FOR_INIT.  */
   for_pre_body = NULL;
+  if (simd && OMP_FOR_PRE_BODY (for_stmt))
+    {
+      has_decl_expr = BITMAP_ALLOC (NULL);
+      if (TREE_CODE (OMP_FOR_PRE_BODY (for_stmt)) == DECL_EXPR
+	  && TREE_CODE (DECL_EXPR_DECL (OMP_FOR_PRE_BODY (for_stmt)))
+	     == VAR_DECL)
+	{
+	  t = OMP_FOR_PRE_BODY (for_stmt);
+	  bitmap_set_bit (has_decl_expr, DECL_UID (DECL_EXPR_DECL (t)));
+	}
+      else if (TREE_CODE (OMP_FOR_PRE_BODY (for_stmt)) == STATEMENT_LIST)
+	{
+	  tree_stmt_iterator si;
+	  for (si = tsi_start (OMP_FOR_PRE_BODY (for_stmt)); !tsi_end_p (si);
+	       tsi_next (&si))
+	    {
+	      t = tsi_stmt (si);
+	      if (TREE_CODE (t) == DECL_EXPR
+		  && TREE_CODE (DECL_EXPR_DECL (t)) == VAR_DECL)
+		bitmap_set_bit (has_decl_expr, DECL_UID (DECL_EXPR_DECL (t)));
+	    }
+	}
+    }
   gimplify_and_add (OMP_FOR_PRE_BODY (for_stmt), &for_pre_body);
   OMP_FOR_PRE_BODY (for_stmt) = NULL_TREE;
 
@@ -6574,7 +6685,29 @@  gimplify_omp_for (tree *expr_p, gimple_s
 		  || POINTER_TYPE_P (TREE_TYPE (decl)));
 
       /* Make sure the iteration variable is private.  */
-      if (omp_is_private (gimplify_omp_ctxp, decl))
+      bool is_private = omp_is_private (gimplify_omp_ctxp, decl, simd);
+      tree c = NULL_TREE;
+      if (simd)
+	{
+	  splay_tree_node n = splay_tree_lookup (gimplify_omp_ctxp->variables,
+						 (splay_tree_key)decl);
+	  if (n != NULL && (n->value & GOVD_DATA_SHARE_CLASS) != 0)
+	    omp_notice_variable (gimplify_omp_ctxp, decl, true);
+	  else
+	    {
+	      c = build_omp_clause (input_location, OMP_CLAUSE_LINEAR);
+	      OMP_CLAUSE_LINEAR_NO_COPYIN (c) = 1;
+	      if (has_decl_expr
+		  && bitmap_bit_p (has_decl_expr, DECL_UID (decl)))
+		OMP_CLAUSE_LINEAR_NO_COPYOUT (c) = 1;
+	      OMP_CLAUSE_DECL (c) = decl;
+	      OMP_CLAUSE_CHAIN (c) = OMP_FOR_CLAUSES (for_stmt);
+	      OMP_FOR_CLAUSES (for_stmt) = c;
+	      omp_add_variable (gimplify_omp_ctxp, decl,
+				GOVD_LINEAR | GOVD_EXPLICIT | GOVD_SEEN);
+	    }
+	}
+      else if (is_private)
 	omp_notice_variable (gimplify_omp_ctxp, decl, true);
       else
 	omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN);
@@ -6616,6 +6749,8 @@  gimplify_omp_for (tree *expr_p, gimple_s
 	case PREINCREMENT_EXPR:
 	case POSTINCREMENT_EXPR:
 	  t = build_int_cst (TREE_TYPE (decl), 1);
+	  if (c)
+	    OMP_CLAUSE_LINEAR_STEP (c) = t;
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
@@ -6624,6 +6759,8 @@  gimplify_omp_for (tree *expr_p, gimple_s
 	case PREDECREMENT_EXPR:
 	case POSTDECREMENT_EXPR:
 	  t = build_int_cst (TREE_TYPE (decl), -1);
+	  if (c)
+	    OMP_CLAUSE_LINEAR_STEP (c) = t;
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
@@ -6657,6 +6794,20 @@  gimplify_omp_for (tree *expr_p, gimple_s
 	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 				is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
+	  if (c)
+	    {
+	      OMP_CLAUSE_LINEAR_STEP (c) = TREE_OPERAND (t, 1);
+	      if (TREE_CODE (t) == MINUS_EXPR)
+		{
+		  t = TREE_OPERAND (t, 1);
+		  OMP_CLAUSE_LINEAR_STEP (c)
+		    = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), t);
+		  tret = gimplify_expr (&OMP_CLAUSE_LINEAR_STEP (c),
+					&for_pre_body, NULL,
+					is_gimple_val, fb_rvalue);
+		  ret = MIN (ret, tret);
+		}
+	    }
 	  break;
 
 	default:
@@ -6665,7 +6816,6 @@  gimplify_omp_for (tree *expr_p, gimple_s
 
       if (var != decl || TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)) > 1)
 	{
-	  tree c;
 	  for (c = OMP_FOR_CLAUSES (for_stmt); c ; c = OMP_CLAUSE_CHAIN (c))
 	    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
 		&& OMP_CLAUSE_DECL (c) == decl
@@ -6687,6 +6837,8 @@  gimplify_omp_for (tree *expr_p, gimple_s
 	}
     }
 
+  BITMAP_FREE (has_decl_expr);
+
   gimplify_and_add (OMP_FOR_BODY (for_stmt), &for_body);
 
   gimplify_adjust_omp_clauses (&OMP_FOR_CLAUSES (for_stmt));
--- gcc/cp/cp-tree.h.jj	2013-04-10 19:11:23.000000000 +0200
+++ gcc/cp/cp-tree.h	2013-04-15 17:58:02.794601211 +0200
@@ -3984,7 +3984,7 @@  more_aggr_init_expr_args_p (const aggr_i
    See semantics.c for details.  */
 #define CP_OMP_CLAUSE_INFO(NODE) \
   TREE_TYPE (OMP_CLAUSE_RANGE_CHECK (NODE, OMP_CLAUSE_PRIVATE, \
-				     OMP_CLAUSE_COPYPRIVATE))
+				     OMP_CLAUSE_LINEAR))
 
 /* Nonzero if this transaction expression's body contains statements.  */
 #define TRANSACTION_EXPR_IS_STMT(NODE) \
--- gcc/cp/parser.c.jj	2013-04-10 19:11:23.000000000 +0200
+++ gcc/cp/parser.c	2013-04-15 15:36:59.530462981 +0200
@@ -25875,8 +25875,12 @@  cp_parser_omp_var_list_no_open (cp_parse
 				tree list, bool *colon)
 {
   cp_token *token;
+  bool saved_colon_corrects_to_scope_p = parser->colon_corrects_to_scope_p;
   if (colon)
-    *colon = false;
+    {
+      parser->colon_corrects_to_scope_p = false;
+      *colon = false;
+    }
   while (1)
     {
       tree name, decl;
@@ -25888,7 +25892,12 @@  cp_parser_omp_var_list_no_open (cp_parse
 				      /*declarator_p=*/false,
 				      /*optional_p=*/false);
       if (name == error_mark_node)
-	goto skip_comma;
+	{
+	  if (colon)
+	    parser->colon_corrects_to_scope_p
+	      = saved_colon_corrects_to_scope_p;
+	  goto skip_comma;
+	}
 
       decl = cp_parser_lookup_name_simple (parser, name, token->location);
       if (decl == error_mark_node)
@@ -25910,6 +25919,9 @@  cp_parser_omp_var_list_no_open (cp_parse
       cp_lexer_consume_token (parser->lexer);
     }
 
+  if (colon)
+    parser->colon_corrects_to_scope_p = saved_colon_corrects_to_scope_p;
+
   if (colon != NULL && cp_lexer_next_token_is (parser->lexer, CPP_COLON))
     {
       *colon = true;
@@ -26520,6 +26532,64 @@  cp_parser_omp_clause_linear (cp_parser *
 }
 
 /* OpenMP 4.0:
+   safelen ( constant-expression )  */
+
+static tree
+cp_parser_omp_clause_safelen (cp_parser *parser, tree list,
+			      location_t location)
+{
+  tree t, c;
+
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    return list;
+
+  t = cp_parser_constant_expression (parser, false, NULL);
+
+  if (t == error_mark_node
+      || !cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+    cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
+					   /*or_comma=*/false,
+					   /*consume_paren=*/true);
+
+  check_no_duplicate_clause (list, OMP_CLAUSE_SAFELEN, "safelen", location);
+
+  c = build_omp_clause (location, OMP_CLAUSE_SAFELEN);
+  OMP_CLAUSE_SAFELEN_EXPR (c) = t;
+  OMP_CLAUSE_CHAIN (c) = list;
+
+  return c;
+}
+
+/* OpenMP 4.0:
+   simdlen ( constant-expression )  */
+
+static tree
+cp_parser_omp_clause_simdlen (cp_parser *parser, tree list,
+			      location_t location)
+{
+  tree t, c;
+
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    return list;
+
+  t = cp_parser_constant_expression (parser, false, NULL);
+
+  if (t == error_mark_node
+      || !cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+    cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
+					   /*or_comma=*/false,
+					   /*consume_paren=*/true);
+
+  check_no_duplicate_clause (list, OMP_CLAUSE_SIMDLEN, "simdlen", location);
+
+  c = build_omp_clause (location, OMP_CLAUSE_SIMDLEN);
+  OMP_CLAUSE_SIMDLEN_EXPR (c) = t;
+  OMP_CLAUSE_CHAIN (c) = list;
+
+  return c;
+}
+
+/* OpenMP 4.0:
    depend ( depend-kind : variable-list )
 
    depend-kind:
@@ -26944,6 +27014,16 @@  cp_parser_omp_all_clauses (cp_parser *pa
 						    token->location);
 	  c_name = "proc_bind";
 	  break;
+	case PRAGMA_OMP_CLAUSE_SAFELEN:
+	  clauses = cp_parser_omp_clause_safelen (parser, clauses,
+						  token->location);
+	  c_name = "safelen";
+	  break;
+	case PRAGMA_OMP_CLAUSE_SIMDLEN:
+	  clauses = cp_parser_omp_clause_simdlen (parser, clauses,
+						  token->location);
+	  c_name = "simdlen";
+	  break;
 	default:
 	  cp_parser_error (parser, "expected %<#pragma omp%> clause");
 	  goto saw_error;
--- gcc/cp/semantics.c.jj	2013-04-10 19:11:23.000000000 +0200
+++ gcc/cp/semantics.c	2013-04-19 13:29:05.498943854 +0200
@@ -4317,6 +4318,8 @@  finish_omp_clauses (tree clauses)
 	  t = OMP_CLAUSE_ALIGNED_ALIGNMENT (c);
 	  if (t == error_mark_node)
 	    remove = true;
+	  else if (t == NULL_TREE)
+	    break;
 	  else if (!type_dependent_expression_p (t)
 		   && !INTEGRAL_TYPE_P (TREE_TYPE (t)))
 	    {
--- gcc/testsuite/c-c++-common/gomp/simd1.c.jj	2013-04-19 13:47:57.913891055 +0200
+++ gcc/testsuite/c-c++-common/gomp/simd1.c	2013-04-19 13:50:05.060213120 +0200
@@ -0,0 +1,31 @@ 
+/* { dg-do compile { target { ! c } } } */
+/* { dg-options "-fopenmp" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+extern int a[1024], b[1024], k, l, m;
+
+void
+foo ()
+{
+  int i;
+  #pragma omp simd safelen(16) aligned(a, b : 32)
+  for (i = 0; i < 1024; i++)
+    a[i] *= b[i];
+}
+
+void
+bar (int *p)
+{
+  int i;
+  #pragma omp simd safelen(16) aligned(a, p : 32) linear(k, l : m + 1)
+  for (i = 0; i < 1024; i++)
+    a[i] *= p[i], k += m + 1;
+}
+
+void
+baz (int *p)
+{
+  #pragma omp simd safelen(16) aligned(a, p : 32) linear(k, l : m + 1)
+  for (int i = 0; i < 1024; i++)
+    a[i] *= p[i], k += m + 1;
+}
--- gcc/testsuite/c-c++-common/gomp/simd2.c.jj	2013-04-19 13:48:22.340761227 +0200
+++ gcc/testsuite/c-c++-common/gomp/simd2.c	2013-04-19 13:50:14.416160124 +0200
@@ -0,0 +1,29 @@ 
+/* { dg-do compile { target { ! c } } } */
+/* { dg-options "-fopenmp" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+extern int a[13][13][13][13], k, l, m;
+
+void
+foo (int *q, float *p)
+{
+  int i, j, n, o;
+#pragma omp simd collapse (4) linear(k : m + 1) aligned(p, q)
+  for (i = 0; i < 13; i++)
+    for (j = 0; j < 13; j++)
+      for (n = 0; n < 13; n++)
+	for (o = 0; o < 13; o += 2)
+	  q[k] *= p[k] + 7 * i + 14 * j + 21 * n + 28 * o, k += m + 1;
+}
+
+void
+bar (float *p)
+{
+  int i, j, n, o;
+#pragma omp simd collapse (4) linear(k : m + 1)
+  for (i = 0; i < 13; i++)
+    for (j = 0; j < 13; j++)
+      for (n = 0; n < 13; n++)
+	for (o = 0; o < 13; o += 2)
+	  a[i][j][n][o] *= p[k], k += m + 1;
+}