diff mbox

implement simd loops in trunk (OMP_SIMD)

Message ID 51DC3405.4070509@redhat.com
State New
Headers show

Commit Message

Aldy Hernandez July 9, 2013, 4:02 p.m. UTC
Hi folks.

I have distilled the relevant machinery to implement and handle simd 
loops from the gomp-4_0-branch.  This will be used by both gomp4 
(#pragma omp simd) and Cilk Plus (#pragma simd).

All of it is Jakub's code.  There is nothing new here, as I believe 
Jakub and the Richards have talked about all this at length.  Included 
in the patch are the vectorizer changes that have been previously 
discussed, all from the gomp-4_0-branch.

It would be ideal to get an ok from either Mr. Biener or Mr. Henderson 
on the vectorizer changes.

The patch I am formally submitting is the attached patch named OMP_SIMD. 
  I am including the second (CILK_SIMD), which I will submit 
independently, as reference to how it will be used (if for some reason 
it isn't clear from the gomp4 branch usage of OMP_SIMD).

Tested on x86-64 Linux with and without the Cilk Plus bits, not to 
mention that this has been extensively tested on the gomp4 branch.

OK for trunk?
* Makefile.in (omp-low.o): Depend on $(TARGET_H).
	* cfgloop.h (struct loop): Add safelen, force_vect, simduid.
	* function.h (struct function): Add has_force_vect_loops and
	has_simduid_loops.
	* gimple-pretty-print.c (dump_gimple_omp_for): Handle
	GF_OMP_FOR_KIND*.
	* gimple.c (gimple_build_omp_critical): Add KIND argument and
	handle it.
	* gimple.def: Update CLAUSES comments.
	* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_{FOR,SIMD}.
	(gimple_build_omp_for): Add argument to prototype.
	(gimple_omp_for_kind): New.
	(gimple_omp_for_set_kind): New.
	* gimplify.c (enum gimplify_omp_var_data): Add GOVD_LINEAR to
	GOVD_DATA_SHARE_CLASS.
	(enum omp_region_type): Add ORT_SIMD.
	(gimple_add_tmp_var): Handle ORT_SIMD.
	(gimplify_var_or_parm_decl): Same.
	(is_gimple_stmt): Same.
	(omp_firstprivatize_variable): Same.
	(omp_add_variable): Only use splay_tree_insert if lookup failed.
	(omp_notice_variable): Handle ORT_SIMD.
	(omp_is_private): Add SIMD argument and handle it as well as
	ORT_SIMD.
	(omp_check_private): Handle ORT_SIMD.
	(gimplify_scan_omp_clauses): Handle OMP_CLAUSE_LINEAR and
	OMP_CLAUSE_SAFELEN.
	(gimplify_adjust_omp_clauses_1): Handle GOVD_LINEAR.
	Handle OMP_CLAUSE_LASTPRIVATE.
	(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_LINEAR and
	OMP_CLAUSE_SAFELEN.
	(gimplify_omp_for): Handle OMP_SIMD and OMP_CLAUSE_LINEAR.
	(gimplify_expr): Handle OMP_SIMD.
	* internal-fn.c (expand_GOMP_SIMD_LANE): New.
	(expand_GOMP_SIMD_VF): New.
	(expand_GOMP_SIMD_LAST_LANE): New.
	* internal-fn.def (GOMP_SIMD_LANE): New.
	(GOMP_SIMD_VF): New.
	(GOMP_SIMD_LAST_LANE): New.
	* omp-low.c: Include target.h.
	(extract_omp_for_data): Handle OMP_SIMD, OMP_CLAUSE_LINEAR,
	OMP_CLAUSE_SAFELEN.
	(check_omp_nesting_restrictions): Same.
	(omp_max_vf): New.
	(lower_rec_simd_input_clauses): New.
	(lower_rec_input_clauses): Handle OMP_SIMD, GF_OMP_FOR_KIND_SIMD,
	OMP_CLAUSE_LINEAR.
	(lower_lastprivate_clauses): Handle OMP_CLAUSE_LINEAR,
	GF_OMP_FOR_KIND_SIMD, OMP_SIMD.
	(expand_omp_build_assign): New.
	(expand_omp_for_init_counts): New.
	(expand_omp_for_init_vars): New.
	(extract_omp_for_update_vars): New.
	(expand_omp_for_generic): Use expand_omp_for_{init,update}_vars
	and rewrite accordingly.
	(expand_omp_simd): New.
	(expand_omp_for): Use expand_omp_simd.
	(lower_omp_for_lastprivate): Unshare vinit when appropriate.
	(lower_omp_for): Do not lower the body.
	* tree-data-ref (get_references_in_stmt): Allow IFN_GOMP_SIMD_LANE
	in their own loops.
	* tree-flow.h (find_omp_clause): Remove prototype.
	* tree-if-conv.c (main_tree_if_conversion): Run if doing if
	conversion, forcing vectorization of the loop, or if
	flag_tree_vectorize.
	(gate_tree_if_conversion): Similarly.
	* tree-inline.c (remap_gimple_stmt): Pass for kind argument to
	gimple_build_omp_for.
	(copy_cfg_body): set has_force_vect_loops and has_simduid_loops.
	* tree-parloops (create_parallel_loop): Pass kind argument to
	gimple_build_omp_for.
	* tree-pretty-print.c (dump_omp_clause): Add cases for
	OMP_CLAUSE_UNIFORM, OMP_CLAUSE_LINEAR, OMP_CLAUSE_SAFELEN,
	OMP_CLAUSE__SIMDUID_.
	(dump_generic_node): Handle OMP_SIMD.
	* tree-ssa-ccp.c (likely_value): Handle IFN_GOMP_SIMD*.
	* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Do not
	unroll OMP_SIMD loops here.
	* tree-ssa-loop.c (gate_tree_vectorize): Run if
	has_force_vect_loops.
	* tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Handle
	loop->safelen
	(vect_analyze_data_refs): Handle simd loops.
	* tree-vect-loop.c (vectorizable_live_operation): Handle
	IFN_GOMP_SIMD*.
	* tree-vect-stmts.c (vectorizable_call): Handle
	IFN_GOMP_SIMD_LANE.
	(vectorizable_store): Handle STMT_VINFO_SIMD_LANE_ACCESS_P.
	(vectorizable_load): Same.
	* tree-vectorizer.c: Include hash-table.h and
	tree-ssa-propagate.h.
	(struct simduid_to_vf): New.
	(simduid_to_vf::hash): New.
	(simduid_to-vf::equal): New.
	(struct decl_to_simduid): New.
	(decl_to_simduid::hash): New.
	(decl_to_simduid::equal): New.
	(adjust_simduid_builtins): New.
	(struct note_simd_array_uses_struct): New.
	(note_simd_array_uses_cb): New.
	(note_simd_array_uses): New.
	(vectorize_loops): Handle simd hints and adjust simd builtins
	accordingly.
	* tree-vectorizer.h (struct _stmt_vec_info): Add
	simd_lane_access_p field.
	(STMT_VINFO_SIMD_LANE_ACCESS_P): New macro.
	* tree.c (omp_clause_num_ops): Add entries for OMP_CLAUSE_LINEAR,
	OMP_CLAUSE_SAFELEN, OMP_CLAUSE__SIMDUID_, OMP_CLAUSE_UNIFORM.
	(omp_clause_code_name): Same.
	(walk_tree_1): Handle OMP_CLAUSE_UNIFORM, OMP_CLAUSE_SAFELEN,
	OMP_CLAUSE__SIMDUID_, OMP_CLAUSE_LINEAR.
	* tree.def (OMP_SIMD): New entry.
	* tree.h (enum omp_clause_code): Add entries for
	OMP_CLAUSE_LINEAR, OMP_CLAUSE_UNIFORM, OMP_CLAUSE_SAFELEN,
	OMP_CLAUSE__SIMDUID_.
	(OMP_CLAUSE_DECL): Adjust range for new clauses.
	(OMP_CLAUSE_LINEAR_NO_COPYIN): New.
	(OMP_CLAUSE_LINEAR_NO_COPYOUT): New.
	(OMP_CLAUSE_LINEAR_STEP): New.
	(OMP_CLAUSE_SAFELEN_EXPR): New.
	(OMP_CLAUSE__SIMDUID__DECL): New.
	(find_omp_clause): New prototype.
cp/
	* cp-tree.h (CP_OMP_CLAUSE_INFO): Adjust range for new clauses.
* Makefile.in (C_COMMON_OBJS): Depend on c-family/c-cilkplus.o.
	(c-cilkplus.o): New dependency.
	* omp-low.c (extract_omp_for_data): Add case for NE_EXPR.
	(build_outer_var_ref): Check for GF_OMP_FOR_KIND_SIMD bitwise.
	(check_omp_nesting_restrictions): Same.
	(lower_rec_input_clauses): Same.
	(expand_omp_for): Same.
	(lower_omp_for): Same.
	(diagnose_sb_0): Adjust for Cilk Plus for loops.
	(gate_expand_omp): Check for Cilk Plus.
	(execute_lower_omp): Same.
	(gate_diagnose_omp_blocks): Same.
	* tree.h (OMP_LOOP_CHECK): New.
	Adapt all OMP_FOR_* macros to use OMP_LOOP_CHECK.
	* tree.def (CILK_SIMD): New entry.
	* tree-pretty-print.c (dump_generic_node): Add case for CILK_SIMD.
	* gimple-pretty-print.c (dump_gimple_omp_for): Add case for
	GF_OMP_FOR_KIND_CILKSIMD.
	* gimplify.c (gimplify_omp_for): Add case for CILK_SIMD.
	(gimplify_expr): Same.
	(is_gimple_stmt): Same.

c-family/
	* c-cilkplus.c: New.
	* c-pragma.c (init_pragma): Register "simd" pragma.
	* c-pragma.h (enum pragma_kind): Add PRAGMA_CILK_SIMD enum.
	(enum pragma_cilk_clause): New.
	* c.opt (fcilkplus): New flag.
	* c-common.h (c_finish_cilk_simd_loop): Protoize.
	(c_finish_cilk_clauses): Same.

c/
	* c-parser.c (c_parser_pragma): Add case for PRAGMA_CILK_SIMD.
	(c_parser_cilk_verify_simd): New.
	(c_parser_cilk_clause_vectorlength): New.
	(c_parser_cilk_clause_linear): New.
	(c_parser_cilk_clause_name): New.
	(c_parser_cilk_all_clauses): New.
	(c_parser_cilk_for_statement): New.
	(c_parser_cilk_simd_construct): New.
	* c-typeck.c (c_finish_bc_stmt): Add case for _Cilk_for loops.

testsuite/
	* gcc.dg/cilk-plus: New directory and associated infrastructure.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b264f1b..e88672c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1150,7 +1150,7 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \
   c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o \
   c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \
   c-family/c-semantics.o c-family/c-ada-spec.o tree-mudflap.o \
-  c-family/array-notation-common.o
+  c-family/array-notation-common.o c-family/c-cilkplus.o
 
 # Language-independent object files.
 # We put the insn-*.o files first so that a parallel make will build
@@ -1979,6 +1979,9 @@ c-family/c-lex.o : c-family/c-lex.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 c-family/c-omp.o : c-family/c-omp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 	$(TREE_H) $(C_COMMON_H) $(GIMPLE_H) langhooks.h
 
+c-family/c-cilkplus.o : c-family/c-cilkplus.c $(CONFIG_H) $(SYSTEM_H) \
+	coretypes.h $(TREE_H) $(C_COMMON_H) langhooks.h
+
 CFLAGS-c-family/c-opts.o += @TARGET_SYSTEM_ROOT_DEFINE@
 c-family/c-opts.o : c-family/c-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
         $(TREE_H) $(C_PRAGMA_H) $(FLAGS_H) toplev.h langhooks.h \
diff --git a/gcc/c-family/c-cilkplus.c b/gcc/c-family/c-cilkplus.c
new file mode 100644
index 0000000..111321b
--- /dev/null
+++ b/gcc/c-family/c-cilkplus.c
@@ -0,0 +1,406 @@
+/* This file contains routines to construct and validate Cilk Plus
+   constructs within the C and C++ front ends.
+
+   Copyright (C) 2011-2013  Free Software Foundation, Inc.
+   Contributed by Balaji V. Iyer <balaji.v.iyer@intel.com>,
+		  Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-common.h"
+
+/* Helper function for c_check_cilk_loop.
+
+   Validate the increment in a _Cilk_for construct or a <#pragma simd>
+   for loop.
+
+   LOC is the location of the `for' keyword.  DECL is the induction
+   variable.  INCR is the original increment expression.
+
+   Returns the canonicalized increment expression for an OMP_FOR_INCR.
+   If there is a validation error, returns error_mark_node.  */
+
+static tree
+c_check_cilk_loop_incr (location_t loc, tree decl, tree incr)
+{
+  if (EXPR_HAS_LOCATION (incr))
+    loc = EXPR_LOCATION (incr);
+
+  if (!incr)
+    {
+      error_at (loc, "missing increment");
+      return error_mark_node;
+    }
+
+  switch (TREE_CODE (incr))
+    {
+    case POSTINCREMENT_EXPR:
+    case PREINCREMENT_EXPR:
+    case POSTDECREMENT_EXPR:
+    case PREDECREMENT_EXPR:
+      if (TREE_OPERAND (incr, 0) != decl)
+	break;
+
+      // Bah... canonicalize into whatever OMP_FOR_INCR needs.
+      if (POINTER_TYPE_P (TREE_TYPE (decl))
+	  && TREE_OPERAND (incr, 1))
+	{
+	  tree t = fold_convert_loc (loc,
+				     sizetype, TREE_OPERAND (incr, 1));
+
+	  if (TREE_CODE (incr) == POSTDECREMENT_EXPR
+	      || TREE_CODE (incr) == PREDECREMENT_EXPR)
+	    t = fold_build1_loc (loc, NEGATE_EXPR, sizetype, t);
+	  t = fold_build_pointer_plus (decl, t);
+	  incr = build2 (MODIFY_EXPR, void_type_node, decl, t);
+	}
+      return incr;
+
+    case MODIFY_EXPR:
+      {
+	tree rhs;
+
+	if (TREE_OPERAND (incr, 0) != decl)
+	  break;
+
+	rhs = TREE_OPERAND (incr, 1);
+	if (TREE_CODE (rhs) == PLUS_EXPR
+	    && (TREE_OPERAND (rhs, 0) == decl
+		|| TREE_OPERAND (rhs, 1) == decl)
+	    && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  return incr;
+	else if (TREE_CODE (rhs) == MINUS_EXPR
+		 && TREE_OPERAND (rhs, 0) == decl
+		 && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  return incr;
+	// Otherwise fail because only PLUS_EXPR and MINUS_EXPR are
+	// allowed.
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  error_at (loc, "invalid increment expression");
+  return error_mark_node;
+}
+
+/* Callback for walk_tree to validate the body of a pragma simd loop
+   or _cilk_for loop.
+
+   This function is passed in as a function pointer to walk_tree.  *TP is
+   the current tree pointer, *WALK_SUBTREES is set to 0 by this function if
+   recursing into TP's subtrees is unnecessary. *DATA is a bool variable that
+   is set to false if an error has occured.  */
+
+tree
+c_validate_cilk_plus_loop (tree *tp, int *walk_subtrees, void *data)
+{
+  if (!tp || !*tp)
+    return NULL_TREE;
+
+  bool *valid = (bool *) data;
+
+  switch (TREE_CODE (*tp))
+    {
+    case CALL_EXPR:
+      {
+	tree fndecl = CALL_EXPR_FN (*tp);
+
+	if (TREE_CODE (fndecl) == ADDR_EXPR)
+	  fndecl = TREE_OPERAND (fndecl, 0);
+	if (TREE_CODE (fndecl) == FUNCTION_DECL)
+	  {
+	    if (setjmp_call_p (fndecl))
+	      {
+		error_at (EXPR_LOCATION (*tp),
+			  "calls to setjmp are not allowed within loops "
+			  "annotated with #pragma simd");
+		*valid = false;
+		*walk_subtrees = 0;
+	      }
+	  }
+	break;
+      }
+
+    case OMP_PARALLEL:
+    case OMP_TASK:
+    case OMP_FOR:
+    case OMP_SIMD:
+    case OMP_SECTIONS:
+    case OMP_SINGLE:
+    case OMP_SECTION:
+    case OMP_MASTER:
+    case OMP_ORDERED:
+    case OMP_CRITICAL:
+    case OMP_ATOMIC:
+    case OMP_ATOMIC_READ:
+    case OMP_ATOMIC_CAPTURE_OLD:
+    case OMP_ATOMIC_CAPTURE_NEW:
+      error_at (EXPR_LOCATION (*tp), "OpenMP statements are not allowed "
+		"within loops annotated with #pragma simd");
+      *valid = false;
+      *walk_subtrees = 0;
+      break;
+
+    default:
+      break;
+    }
+  return NULL_TREE;
+}  
+
+/* Validate the body of a _Cilk_for construct or a <#pragma simd> for
+   loop.
+
+   Returns true if there were no errors, false otherwise.  */
+
+static bool
+c_check_cilk_loop_body (tree body)
+{
+  bool valid = true;
+  walk_tree (&body, c_validate_cilk_plus_loop, (void *) &valid, NULL);
+  return valid;
+}
+
+/* Validate a _Cilk_for construct (or a #pragma simd for loop, which
+   has the same syntactic restrictions).  Returns TRUE if there were
+   no errors, FALSE otherwise.  LOC is the location of the for.  DECL
+   is the controlling variable.  COND is the condition.  INCRP is a
+   pointer the increment expression (in case, the increment needs to
+   be canonicalized).  BODY is the body of the LOOP.  */
+
+static bool
+c_check_cilk_loop (location_t loc, tree decl, tree cond, tree *incrp,
+		   tree body)
+{
+  tree incr = *incrp;
+
+  if (decl == error_mark_node
+      || cond == error_mark_node 
+      || incr == error_mark_node
+      || body == error_mark_node)
+    return false;
+
+  /* Validate the initialization.  */
+  gcc_assert (decl != NULL);
+  if (TREE_THIS_VOLATILE (decl))
+    {
+      error_at (loc, "induction variable cannot be volatile");
+      return false;
+    }
+  if (DECL_EXTERNAL (decl))
+    {
+      error_at (loc, "induction variable cannot be extern");
+      return false;
+    }
+  if (TREE_STATIC (decl))
+    {
+      error_at (loc, "induction variable cannot be static");
+      return false;
+    }
+  if (DECL_REGISTER (decl))
+    {
+      error_at (loc, "induction variable cannot be declared register");
+      return false;
+    }
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (decl))
+      && !POINTER_TYPE_P (TREE_TYPE (decl)))
+    {
+      error_at (loc, "initialization variable must be of integral "
+		"or pointer type");
+      return false;
+    }
+
+  /* Validate the condition.  */
+  if (!cond)
+    {
+      error_at (loc, "missing condition");
+      return false;
+    }
+  bool cond_ok = false;
+  if (TREE_CODE (cond) == NE_EXPR
+      || TREE_CODE (cond) == LT_EXPR
+      || TREE_CODE (cond) == LE_EXPR
+      || TREE_CODE (cond) == GT_EXPR
+      || TREE_CODE (cond) == GE_EXPR)
+    {
+      /* Comparison must either be:
+	   DECL <comparison_operator> EXPR
+	   EXPR <comparison_operator> DECL
+      */
+      if (decl == TREE_OPERAND (cond, 0))
+	cond_ok = true;
+      else if (decl == TREE_OPERAND (cond, 1))
+	{
+	  /* Canonicalize the comparison so the DECL is on the LHS.  */
+	  TREE_SET_CODE (cond,
+			 swap_tree_comparison (TREE_CODE (cond)));
+	  TREE_OPERAND (cond, 1) = TREE_OPERAND (cond, 0);
+	  TREE_OPERAND (cond, 0) = decl;
+	  cond_ok = true;
+	}
+    }
+  if (!cond_ok)
+    {
+      error_at (loc, "invalid controlling predicate");
+      return false;
+    }
+
+  /* Validate and canonicalize the increment.  */
+  incr = c_check_cilk_loop_incr (loc, decl, incr);
+  if (incr == error_mark_node)
+    return false;
+  *incrp = incr;
+
+  if (!c_check_cilk_loop_body (body))
+    return false;
+
+  return true;
+}
+
+/* Adjust any clauses to match the requirements for OpenMP.  */
+
+static tree
+adjust_clauses_for_omp (tree clauses)
+{
+  return clauses;
+}
+
+/* Validate and emit code for the FOR loop following a #<pragma simd>
+   construct.
+
+   LOC is the location of the location of the FOR.
+   DECL is the iteration variable.
+   INIT is the initialization expression.
+   COND is the controlling predicate.
+   INCR is the increment expression.
+   BODY is the body of the loop.
+   CLAUSES are the clauses associated with the pragma simd loop.
+
+   Returns the generated statement.  */
+
+tree
+c_finish_cilk_simd_loop (location_t loc,
+			 tree decl,
+			 tree init, tree cond, tree incr,
+			 tree body,
+			 tree clauses)
+{
+  location_t rhs_loc;
+
+  if (!c_check_cilk_loop (loc, decl, cond, &incr, body))
+    return NULL;
+
+  /* In the case of "for (int i = 0...)", init will be a decl.  It should
+     have a DECL_INITIAL that we can turn into an assignment.  */
+  if (init == decl)
+    {
+      rhs_loc = DECL_SOURCE_LOCATION (decl);
+
+      init = DECL_INITIAL (decl);
+      if (init == NULL)
+	{
+	  error_at (rhs_loc, "%qE is not initialized", decl);
+	  init = integer_zero_node;
+	  return NULL;
+	}
+
+      init = build_modify_expr (loc, decl, NULL_TREE, NOP_EXPR, rhs_loc,
+				init, NULL_TREE);
+    }
+
+  // The C++ parser just gives us the rhs.
+  if (TREE_CODE (init) != MODIFY_EXPR)
+    init = build2 (MODIFY_EXPR, void_type_node, decl, init);
+
+  gcc_assert (TREE_OPERAND (init, 0) == decl);
+
+  tree initv = make_tree_vec (1);
+  tree condv = make_tree_vec (1);
+  tree incrv = make_tree_vec (1);
+  TREE_VEC_ELT (initv, 0) = init;
+  TREE_VEC_ELT (condv, 0) = cond;
+  TREE_VEC_ELT (incrv, 0) = incr;
+
+  tree t = make_node (CILK_SIMD);
+  TREE_TYPE (t) = void_type_node;
+  OMP_FOR_INIT (t) = initv;
+  OMP_FOR_COND (t) = condv;
+  OMP_FOR_INCR (t) = incrv;
+  OMP_FOR_BODY (t) = body;
+  OMP_FOR_PRE_BODY (t) = NULL;
+  OMP_FOR_CLAUSES (t) = adjust_clauses_for_omp (clauses);
+
+  SET_EXPR_LOCATION (t, loc);
+  return add_stmt (t);
+}
+
+/* Validate and emit code for <#pragma simd> clauses.  */
+
+tree
+c_finish_cilk_clauses (tree clauses)
+{
+  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+    {
+      tree prev = clauses;
+
+      /* If a variable appears in a linear clause it cannot appear in
+	 any other OMP clause.  */
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR)
+	for (tree c2 = clauses; c2; c2 = OMP_CLAUSE_CHAIN (c2))
+	  {
+	    if (c == c2)
+	      continue;
+	    enum omp_clause_code code = OMP_CLAUSE_CODE (c2);
+
+	    switch (code)
+	      {
+	      case OMP_CLAUSE_LINEAR:
+	      case OMP_CLAUSE_PRIVATE:
+	      case OMP_CLAUSE_FIRSTPRIVATE:
+	      case OMP_CLAUSE_LASTPRIVATE:
+	      case OMP_CLAUSE_REDUCTION:
+		break;
+
+	      case OMP_CLAUSE_SAFELEN:
+		goto next;
+
+	      default:
+		gcc_unreachable ();
+	      }
+
+	    if (OMP_CLAUSE_DECL (c) == OMP_CLAUSE_DECL (c2))
+	      {
+		error_at (OMP_CLAUSE_LOCATION (c2),
+			  "variable appears in more than one clause");
+		inform (OMP_CLAUSE_LOCATION (c),
+			"multiple clause defined here");
+		// Remove problematic clauses.
+		OMP_CLAUSE_CHAIN (prev) = OMP_CLAUSE_CHAIN (c2);
+	      }
+	  next:
+	    prev = c2;
+	  }
+    }
+
+  return clauses;
+}
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 6dfcffd..2cf1e14 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -521,6 +521,11 @@ struct GTY(()) c_language_function {
 
 #define building_stmt_list_p() (stmt_list_stack && !stmt_list_stack->is_empty())
 
+/* In c-cilkplus.c */
+extern tree c_finish_cilk_simd_loop (location_t, tree, tree, tree, tree,
+				     tree, tree);
+extern tree c_finish_cilk_clauses (tree);
+
 /* Language-specific hooks.  */
 
 /* If non-NULL, this function is called after a precompile header file
@@ -1136,6 +1141,12 @@ enum stv_conv {
 extern enum stv_conv scalar_to_vector (location_t loc, enum tree_code code,
 				       tree op0, tree op1, bool);
 
+/* In c-cilkplus.c  */
+extern tree c_finish_cilk_simd_loop (location_t, tree, tree, tree, tree,
+				     tree, tree);
+extern tree c_finish_cilk_clauses (tree);
+extern tree c_validate_cilk_plus_loop (tree *, int *, void *);
+
 /* These #defines allow users to access different operands of the
    array notation tree.  */
 
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 309859f..428ecfa 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1349,6 +1349,12 @@ init_pragma (void)
 				      omp_pragmas[i].id, true, true);
     }
 
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    {
+      cpp_register_deferred_pragma (parse_in, NULL, "simd", 
+				    PRAGMA_CILK_SIMD, true, false);
+    }
+
   if (!flag_preprocess_only)
     cpp_register_deferred_pragma (parse_in, "GCC", "pch_preprocess",
 				  PRAGMA_GCC_PCH_PREPROCESS, false, false);
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 41215db..4c88dc3 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -45,6 +45,9 @@ typedef enum pragma_kind {
   PRAGMA_OMP_TASKYIELD,
   PRAGMA_OMP_THREADPRIVATE,
 
+  /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
+  PRAGMA_CILK_SIMD,
+
   PRAGMA_GCC_PCH_PREPROCESS,
 
   PRAGMA_FIRST_EXTERNAL
@@ -75,6 +78,17 @@ typedef enum pragma_omp_clause {
   PRAGMA_OMP_CLAUSE_MERGEABLE
 } pragma_omp_clause;
 
+/* All Cilk Plus #pragma omp clauses.  */
+typedef enum pragma_cilk_clause {
+  PRAGMA_CILK_CLAUSE_NONE = 0,
+  PRAGMA_CILK_CLAUSE_VECTORLENGTH,
+  PRAGMA_CILK_CLAUSE_LINEAR,
+  PRAGMA_CILK_CLAUSE_PRIVATE,
+  PRAGMA_CILK_CLAUSE_FIRSTPRIVATE,
+  PRAGMA_CILK_CLAUSE_LASTPRIVATE,
+  PRAGMA_CILK_CLAUSE_REDUCTION
+} pragma_cilk_clause;
+
 extern struct cpp_reader* parse_in;
 
 /* It's safe to always leave visibility pragma enabled as if
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index c7846ce..b5c09f9 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1216,6 +1216,10 @@ static void c_parser_objc_at_dynamic_declaration (c_parser *);
 static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
+/* Cilk Plus supporting routines.  */
+static void c_parser_cilk_for_statement (c_parser *, enum rid, tree);
+static void c_parser_cilk_simd_construct (c_parser *);
+static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
@@ -8729,6 +8733,13 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
       return false;
 
+    case PRAGMA_CILK_SIMD:
+      if (!c_parser_cilk_verify_simd (parser, context))
+	return false;
+      c_parser_consume_pragma (parser);
+      c_parser_cilk_simd_construct (parser);
+      return false;
+
     default:
       if (id < PRAGMA_FIRST_EXTERNAL)
 	{
@@ -10761,7 +10772,391 @@ c_parser_omp_threadprivate (c_parser *parser)
 
   c_parser_skip_to_pragma_eol (parser);
 }
+
+/* Cilk Plus <#pragma simd> parsing routines.  */
+
+/* Helper function for c_parser_pragma.  Perform some sanity checking
+   for <#pragma simd> constructs.  Returns FALSE if there was a
+   problem.  */
+
+static bool
+c_parser_cilk_verify_simd (c_parser *parser,
+				  enum pragma_context context)
+{
+  if (!flag_enable_cilkplus)
+    {
+      warning (0, "pragma simd ignored because -fcilkplus is not enabled");
+      c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+      return false;
+    }
+  if (!flag_tree_vectorize)
+    {
+      warning (0, "pragma simd is useless without -ftree-vectorize");
+      c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+      return false;
+    }
+  if (context == pragma_external)
+    {
+      c_parser_error (parser,"pragma simd must be inside a function");
+      c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+      return false;
+    }
+  return true;
+}
+
+/* Cilk Plus:
+   vectorlength ( constant-expression ) */
+
+static tree
+c_parser_cilk_clause_vectorlength (c_parser *parser, tree clauses)
+{
+  /* The vectorlength clause behaves exactly like OpenMP's safelen
+     clause.  Represent it in OpenMP terms.  */
+  check_no_duplicate_clause (clauses, OMP_CLAUSE_SAFELEN, "vectorlength");
+
+  if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
+    return clauses;
+
+  location_t loc = c_parser_peek_token (parser)->location;
+  tree expr = c_parser_expr_no_commas (parser, NULL).value;
+  expr = c_fully_fold (expr, false, NULL);
+
+  if (!TREE_TYPE (expr)
+      || !TREE_CONSTANT (expr)
+      || !INTEGRAL_TYPE_P (TREE_TYPE (expr)))
+    error_at (loc, "vectorlength must be an integer constant");
+  else if (exact_log2 (TREE_INT_CST_LOW (expr)) == -1)
+    error_at (loc, "vectorlength must be a power of 2");
+  else
+    {
+      tree u = build_omp_clause (loc, OMP_CLAUSE_SAFELEN);
+      OMP_CLAUSE_SAFELEN_EXPR (u) = expr;
+      OMP_CLAUSE_CHAIN (u) = clauses;
+      clauses = u;
+    }
+
+  c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+
+  return clauses;
+}
+
+/* Cilk Plus:
+   linear ( simd-linear-variable-list )
+
+   simd-linear-variable-list:
+     simd-linear-variable
+     simd-linear-variable-list , simd-linear-variable
+
+   simd-linear-variable:
+     id-expression
+     id-expression : simd-linear-step
+
+   simd-linear-step:
+   conditional-expression */
+
+static tree
+c_parser_cilk_clause_linear (c_parser *parser, tree clauses)
+{
+  if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
+    return clauses;
+
+  location_t loc = c_parser_peek_token (parser)->location;
+
+  if (c_parser_next_token_is_not (parser, CPP_NAME)
+      || c_parser_peek_token (parser)->id_kind != C_ID_ID)
+    c_parser_error (parser, "expected identifier");
+
+  while (c_parser_next_token_is (parser, CPP_NAME)
+	 && c_parser_peek_token (parser)->id_kind == C_ID_ID)
+    {
+      tree var = lookup_name (c_parser_peek_token (parser)->value);
+
+      if (var == NULL)
+	{
+	  undeclared_variable (c_parser_peek_token (parser)->location,
+			       c_parser_peek_token (parser)->value);
+	c_parser_consume_token (parser);
+	}
+      else if (var == error_mark_node)
+	c_parser_consume_token (parser);
+      else
+	{
+	  tree step = integer_one_node;
+
+	  /* Parse the linear step if present.  */
+	  if (c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+	    {
+	      c_parser_consume_token (parser);
+	      c_parser_consume_token (parser);
+
+	      tree expr = c_parser_expr_no_commas (parser, NULL).value;
+	      expr = c_fully_fold (expr, false, NULL);
+
+	      if (!TREE_TYPE (expr)
+		  || !TREE_CONSTANT (expr)
+		  || !INTEGRAL_TYPE_P (TREE_TYPE (expr)))
+		c_parser_error (parser,
+				"step size must be an integer constant");
+	      else
+		step = expr;
+	    }
+	  else
+	    c_parser_consume_token (parser);
+
+	  /* Use OMP_CLAUSE_LINEAR, which has the same semantics.  */
+	  tree u = build_omp_clause (loc, OMP_CLAUSE_LINEAR);
+	  OMP_CLAUSE_DECL (u) = var;
+	  OMP_CLAUSE_LINEAR_STEP (u) = step;
+	  OMP_CLAUSE_CHAIN (u) = clauses;
+	  clauses = u;
+	}
+
+      if (c_parser_next_token_is_not (parser, CPP_COMMA))
+	break;
+
+      c_parser_consume_token (parser);
+    }
+
+  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+
+  return clauses;
+}
+
+/* Returns the name of the next clause.  If the clause is not
+   recognized SIMD_OMP_CLAUSE_NONE is returned and the next token is
+   not consumed.  Otherwise, the appropriate pragma_simd_clause is
+   returned and the token is consumed.  */
+
+static pragma_cilk_clause
+c_parser_cilk_clause_name (c_parser *parser)
+{
+  pragma_cilk_clause result;
+  c_token *token = c_parser_peek_token (parser);
+
+  if (!token->value || token->type != CPP_NAME)
+    return PRAGMA_CILK_CLAUSE_NONE;
+
+  const char *p = IDENTIFIER_POINTER (token->value);
+
+  if (!strcmp (p, "vectorlength"))
+    result = PRAGMA_CILK_CLAUSE_VECTORLENGTH;
+  else if (!strcmp (p, "linear"))
+    result = PRAGMA_CILK_CLAUSE_LINEAR;
+  else if (!strcmp (p, "private"))
+    result = PRAGMA_CILK_CLAUSE_PRIVATE;
+  else if (!strcmp (p, "firstprivate"))
+    result = PRAGMA_CILK_CLAUSE_FIRSTPRIVATE;
+  else if (!strcmp (p, "lastprivate"))
+    result = PRAGMA_CILK_CLAUSE_LASTPRIVATE;
+  else if (!strcmp (p, "reduction"))
+    result = PRAGMA_CILK_CLAUSE_REDUCTION;
+  else
+    return PRAGMA_CILK_CLAUSE_NONE;
+
+  c_parser_consume_token (parser);
+  return result;
+}
 
+/* Parse all #<pragma simd> clauses.  Return the list of clauses
+   found.  */
+
+static tree
+c_parser_cilk_all_clauses (c_parser *parser)
+{
+  tree clauses = NULL;
+
+  while (c_parser_next_token_is_not (parser, CPP_PRAGMA_EOL))
+    {
+      pragma_cilk_clause c_kind;
+
+      c_kind = c_parser_cilk_clause_name (parser);
+
+      switch (c_kind)
+	{
+	case PRAGMA_CILK_CLAUSE_VECTORLENGTH:
+	  clauses = c_parser_cilk_clause_vectorlength (parser, clauses);
+	  break;
+	case PRAGMA_CILK_CLAUSE_LINEAR:
+	  clauses = c_parser_cilk_clause_linear (parser, clauses);
+	  break;
+	case PRAGMA_CILK_CLAUSE_PRIVATE:
+	  /* Use the OpenMP counterpart.  */
+	  clauses = c_parser_omp_clause_private (parser, clauses);
+	  break;
+	case PRAGMA_CILK_CLAUSE_FIRSTPRIVATE:
+	  /* Use the OpenMP counterpart.  */
+	  clauses = c_parser_omp_clause_firstprivate (parser, clauses);
+	  break;
+	case PRAGMA_CILK_CLAUSE_LASTPRIVATE:
+	  /* Use the OpenMP counterpart.  */
+	  clauses = c_parser_omp_clause_lastprivate (parser, clauses);
+	  break;
+	case PRAGMA_CILK_CLAUSE_REDUCTION:
+	  /* Use the OpenMP counterpart.  */
+	  clauses = c_parser_omp_clause_reduction (parser, clauses);
+	  break;
+	default:
+	  c_parser_error (parser, "expected %<#pragma simd%> clause");
+	  goto saw_error;
+	}
+    }
+
+ saw_error:
+  c_parser_skip_to_pragma_eol (parser);
+  return c_finish_cilk_clauses (clauses);
+}
+
+/* Parse the restriction form of the for statement allowed by
+   Cilk Plus.  This function parses both the _CILK_FOR construct as
+   well as the for loop following a <#pragma simd> construct, both of
+   which have the same syntactic restrictions.
+
+   FOR_KEYWORD can be either RID_CILK_FOR or RID_FOR, for parsing
+   _Cilk_for or the <#pragma simd> for loop construct respectively.
+
+   (NOTE: For now, only RID_FOR is handled).
+
+   For a <#pragma simd>, CLAUSES are the clauses that should have been
+   previously parsed.  If there are none, or if we are parsing a
+   _Cilk_for instead, this will be NULL.  */
+   
+static void
+c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
+			     tree clauses)
+{
+  tree init, decl,  cond, stmt;
+  tree block, incr, save_break, save_cont, body;
+  location_t loc;
+  bool fail = false;
+
+  gcc_assert (/*for_keyword == RID_CILK_FOR || */for_keyword == RID_FOR);
+
+  if (!c_parser_next_token_is_keyword (parser, for_keyword))
+    {
+      if (for_keyword == RID_FOR)
+	c_parser_error (parser, "for statement expected");
+      else
+	c_parser_error (parser, "_Cilk_for statement expected");
+      return;
+    }
+
+  loc = c_parser_peek_token (parser)->location;
+  c_parser_consume_token (parser);
+
+  block = c_begin_compound_stmt (true);
+
+  if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
+    {
+      add_stmt (c_end_compound_stmt (loc, block, true));
+      return;
+    }
+
+  /* Parse the initialization declaration.  */
+  if (c_parser_next_tokens_start_declaration (parser))
+    {
+      c_parser_declaration_or_fndef (parser, true, false, false,
+				     false, false, NULL);
+      decl = check_for_loop_decls (loc, flag_isoc99);
+      if (decl == NULL)
+	goto error_init;
+      if (DECL_INITIAL (decl) == error_mark_node)
+	decl = error_mark_node;
+      init = decl;
+    }
+  else if (c_parser_next_token_is (parser, CPP_NAME)
+	   && c_parser_peek_2nd_token (parser)->type == CPP_EQ)
+    {
+      struct c_expr decl_exp;
+      struct c_expr init_exp;
+      location_t init_loc;
+
+      decl_exp = c_parser_postfix_expression (parser);
+      decl = decl_exp.value;
+
+      c_parser_require (parser, CPP_EQ, "expected %<=%>");
+
+      init_loc = c_parser_peek_token (parser)->location;
+      init_exp = c_parser_expr_no_commas (parser, NULL);
+      init_exp = default_function_array_read_conversion (init_loc,
+							 init_exp);
+      init = build_modify_expr (init_loc, decl, decl_exp.original_type,
+				NOP_EXPR, init_loc, init_exp.value,
+				init_exp.original_type);
+      init = c_process_expr_stmt (init_loc, init);
+
+      c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
+    }
+  else
+    {
+    error_init:
+      c_parser_error (parser,
+		      "expected iteration declaration or initialization");
+      c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
+				 "expected %<)%>");
+      return;
+    }
+
+  /* Parse the loop condition.  */
+  cond = NULL_TREE;
+  if (c_parser_next_token_is_not (parser, CPP_SEMICOLON))
+    {
+      location_t cond_loc = c_parser_peek_token (parser)->location;
+      struct c_expr cond_expr = c_parser_binary_expression (parser, NULL,
+							    PREC_NONE);
+
+      cond = cond_expr.value;
+      cond = c_objc_common_truthvalue_conversion (cond_loc, cond);
+      cond = c_fully_fold (cond, false, NULL);
+    }
+  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
+
+  /* Parse the increment expression.  */
+  incr = NULL_TREE;
+  if (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN))
+    {
+      location_t incr_loc = c_parser_peek_token (parser)->location;
+      incr = c_process_expr_stmt (incr_loc,
+				  c_parser_expression (parser).value);
+    }
+  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+
+  if (decl == NULL || decl == error_mark_node || init == error_mark_node)
+    fail = true;
+
+  save_break = c_break_label;
+  /* Magic number to inform c_finish_bc_stmt() that we are within a
+     Cilk for construct.  */
+  c_break_label = build_int_cst (size_type_node, 2);
+
+  save_cont = c_cont_label;
+  c_cont_label = NULL_TREE;
+  body = c_parser_c99_block_statement (parser);
+  c_break_label = save_break;
+  c_cont_label = save_cont;
+
+  if (!fail)
+    {
+      if (for_keyword == RID_FOR)
+	c_finish_cilk_simd_loop (loc, decl, init, cond, incr, body, clauses);
+    }
+
+  stmt = c_end_compound_stmt (loc, block, true);
+  add_stmt (stmt);
+  c_break_label = save_break;
+  c_cont_label = save_cont;
+}
+
+/* Main entry point for parsing Cilk Plus <#pragma simd> for
+   loops.  */
+
+static void
+c_parser_cilk_simd_construct (c_parser *parser)
+{
+  tree clauses = c_parser_cilk_all_clauses (parser);
+
+  c_parser_cilk_for_statement (parser, RID_FOR, clauses);
+}
+
 /* Parse a transaction attribute (GCC Extension).
 
    transaction-attribute:
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 3a92311..d5e4175 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -9153,6 +9153,13 @@ c_finish_bc_stmt (location_t loc, tree *label_p, bool is_break)
       error_at (loc, "break statement used with OpenMP for loop");
       return NULL_TREE;
 
+    case 2:
+      if (is_break) 
+	error ("break statement within <#pragma simd> loop body");
+      else 
+	error ("continue statement within <#pragma simd> loop loop");
+      return NULL_TREE;
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/cp/ChangeLog.cilkplus b/gcc/cp/ChangeLog.cilkplus
new file mode 100644
index 0000000..f0ee3ee
--- /dev/null
+++ b/gcc/cp/ChangeLog.cilkplus
@@ -0,0 +1,28 @@
+2013-06-26  Aldy Hernandez  <aldyh@redhat.com>
+
+	* cp-gimplify.c (cp_gimplify_expr): Add case for CILK_SIMD.
+	(cp_genericize_r): Same.
+	* pt.c (tsubst_expr): Same.
+	* semantics.c (finish_omp_for): Same.
+
+2013-05-21  Balaji V. Iyer  <balaji.v.iyer@intel.com>
+	    Aldy Hernandez  <aldyh@redhat.com>
+
+	* cp-tree.h (p_simd_valid_stmts_in_body_p): New prototype.
+	(finish_cilk_for_cond): Likewise.
+	* parser.h (IN_CILK_P_SIMD_FOR): New #define.
+	* Make-lang.in (CXX_AND_OBJCXX_OBJS): Added new obj-file cp-cilkplus.o
+	* cp-cilkplus.c: New file.
+	* semantics.c (finish_cilk_for_cond): New.
+	* parser.c (cp_parser_pragma): Added a PRAGMA_CILK_SIMD case.
+	(cp_parser_cilk_simd_vectorlength): New function.
+	(cp_parser_cilk_simd_linear): Likewise.
+	(cp_parser_cilk_simd_clause_name): Likewise.
+	(cp_parser_cilk_simd_all_clauses): Likewise.
+	(cp_parser_cilk_simd_construct): Likewise.
+	(cp_parser_simd_for_init_statement): Likewise.
+	(cp_parser_cilk_for_expression_iterator): Likewise.
+	(cp_parser_cilk_for_condition): Likewise.
+	(cp_parser_cilk_for): Likewise.
+	(cp_parser_jump_statement): Added a IN_CILK_P_SIMD_FOR case.
+
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 6e80bcf..fa36369 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -80,6 +80,7 @@ CXX_AND_OBJCXX_OBJS = cp/call.o cp/decl.o cp/expr.o cp/pt.o cp/typeck2.o \
  cp/typeck.o cp/cvt.o cp/except.o cp/friend.o cp/init.o cp/method.o \
  cp/search.o cp/semantics.o cp/tree.o cp/repo.o cp/dump.o cp/optimize.o \
  cp/mangle.o cp/cp-objcp-common.o cp/name-lookup.o cp/cxx-pretty-print.o \
+ cp/cp-cilkplus.o \
  cp/cp-gimplify.o cp/cp-array-notation.o $(CXX_C_OBJS)
 
 # Language-specific object files for C++.
@@ -348,3 +349,5 @@ cp/name-lookup.o: cp/name-lookup.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 
 cp/cxx-pretty-print.o: cp/cxx-pretty-print.c $(CXX_PRETTY_PRINT_H) \
   $(CONFIG_H) $(SYSTEM_H) $(TM_H) coretypes.h $(CXX_TREE_H) tree-pretty-print.h
+cp/cp-cilkplus.o: cp/cp-cilkplus.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+    $(CXX_TREE_H) $(DIAGNOSTIC_CORE_H)
diff --git a/gcc/cp/cp-cilkplus.c b/gcc/cp/cp-cilkplus.c
new file mode 100644
index 0000000..aa80343
--- /dev/null
+++ b/gcc/cp/cp-cilkplus.c
@@ -0,0 +1,78 @@
+/* This file is part of the Intel(R) Cilk(TM) Plus support
+   This file contains routines to handle Cilk Plus specific
+   routines for the C++ Compiler.
+   Copyright (C) 2013  Free Software Foundation, Inc.
+   Contributed by Balaji V. Iyer <balaji.v.iyer@intel.com>,
+		  Aldy Hernandez <aldyh@redhat.com>.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "cp-tree.h"
+#include "diagnostic-core.h"
+
+
+/* Callback for cp_walk_tree to validate the body of a pragma simd loop
+   or _cilk_for loop.
+
+   This function is passed in as a function pointer to walk_tree.  *TP is
+   the current tree pointer, *WALK_SUBTREES is set to 0 by this function if
+   recursing into TP's subtrees is unnecessary. *DATA is a bool variable that
+   is set to false if an error has occured.  */
+
+static tree
+cpp_validate_cilk_plus_loop_aux (tree *tp, int *walk_subtrees, void *data)
+{
+  bool *valid = (bool *) data;
+  location_t loc = EXPR_HAS_LOCATION (*tp) ? EXPR_LOCATION (*tp) :
+    UNKNOWN_LOCATION;
+
+  if (!tp || !*tp)
+    return NULL_TREE;
+
+  if (TREE_CODE (*tp) == THROW_EXPR)
+    {
+      error_at (loc, "throw expressions are not allowed inside loops "
+		"marked with pragma simd");
+      *walk_subtrees = 0;
+      *valid = false;
+    }
+  else if (TREE_CODE (*tp) == TRY_BLOCK)
+    {
+      error_at (loc, "try statements are not allowed inside loops marked "
+		"with #pragma simd");
+      *valid = false;
+      *walk_subtrees = 0;
+    }
+  return NULL_TREE;
+}  
+
+
+/* Walks through all the subtrees of BODY using walk_tree to make sure
+   invalid statements/expressions are not found inside BODY.  Returns
+   false if any invalid statements are found.  */
+
+bool
+cpp_validate_cilk_plus_loop (tree body)
+{
+  bool valid = true;
+  cp_walk_tree (&body, cpp_validate_cilk_plus_loop_aux,
+		(void *) &valid, NULL);
+  return valid;
+}
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 01d906e..a5f84fd 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6142,6 +6142,10 @@ extern bool cxx_omp_privatize_by_reference	(const_tree);
 extern void suggest_alternatives_for            (location_t, tree);
 extern tree strip_using_decl                    (tree);
 
+/* In cp-cilkplus.c.  */
+extern bool cpp_validate_cilk_plus_loop		(tree);
+extern tree finish_cilk_for_cond		(tree);
+
 /* In cp/cp-array-notations.c */
 extern tree expand_array_notation_exprs         (tree);
 bool cilkplus_an_triplet_types_ok_p             (location_t, tree, tree, tree,
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 6e8293b..b6c1289 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -231,6 +231,11 @@ static void cp_parser_initial_pragma
 static tree cp_literal_operator_id
   (const char *);
 
+static void cp_parser_cilk_simd_construct
+  (cp_parser *, cp_token *);
+static tree cp_parser_cilk_for
+  (cp_parser *, enum rid, tree);
+
 /* Manifest constants.  */
 #define CP_LEXER_BUFFER_SIZE ((256 * 1024) / sizeof (cp_token))
 #define CP_SAVED_TOKEN_STACK 5
@@ -10376,6 +10381,10 @@ cp_parser_jump_statement (cp_parser* parser)
 	case IN_OMP_FOR:
 	  error_at (token->location, "break statement used with OpenMP for loop");
 	  break;
+	case IN_CILK_P_SIMD_FOR:
+	  error_at (token->location,
+		    "break statement within <#pragma simd> loop body");
+	  break;
 	}
       cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
       break;
@@ -10393,6 +10402,10 @@ cp_parser_jump_statement (cp_parser* parser)
 	case IN_OMP_BLOCK:
 	  error_at (token->location, "invalid exit from OpenMP structured block");
 	  break;
+	case IN_CILK_P_SIMD_FOR:
+	  error_at (token->location,
+		    "continue statement within <#pragma simd> loop loop");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -28732,6 +28745,16 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		"%<#pragma omp sections%> construct");
       break;
 
+    case PRAGMA_CILK_SIMD:
+      if (context == pragma_external)
+	{
+	  error_at (pragma_tok->location,
+		    "%<#pragma simd%> must be inside a function");
+	  break;
+	}
+      cp_parser_cilk_simd_construct (parser, pragma_tok);
+      return true;
+
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -28797,4 +28820,522 @@ c_parse_file (void)
   the_parser = NULL;
 }
 
+
+/* Parses the Cilk Plus #pragma simd vectorlength clause:
+   Syntax:
+   vectorlength ( constant-expression )  */
+
+static tree
+cp_parser_cilk_simd_vectorlength (cp_parser *parser, tree clauses)
+{
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+  tree expr;
+  /* The vectorlength clause behaves exactly like OpenMP's safelen
+     clause.  Thus, vectorlength is represented as OMP 4.0
+     safelen.  */
+  check_no_duplicate_clause (clauses, OMP_CLAUSE_SAFELEN, "vectorlength", loc);
+  
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    return error_mark_node;
+  
+  expr = cp_parser_constant_expression (parser, false, NULL);
+  expr = maybe_constant_value (expr);
+
+  if (TREE_CONSTANT (expr)
+	   && exact_log2 (TREE_INT_CST_LOW (expr)) == -1)
+    error_at (loc, "vectorlength must be a power of 2");
+  else if (expr != error_mark_node)
+    {
+      tree c = build_omp_clause (loc, OMP_CLAUSE_SAFELEN);
+      OMP_CLAUSE_SAFELEN_EXPR (c) = expr;
+      OMP_CLAUSE_CHAIN (c) = clauses;
+      clauses = c;
+    }
+
+  if (!cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+    return error_mark_node;
+  return clauses;
+}
+
+/* Handles the Cilk Plus #pragma simd linear clause.
+   Syntax:
+   linear ( simd-linear-variable-list )
+
+   simd-linear-variable-list:
+     simd-linear-variable
+     simd-linear-variable-list , simd-linear-variable
+
+   simd-linear-variable:
+     id-expression
+     id-expression : simd-linear-step
+
+   simd-linear-step:
+   conditional-expression */
+
+static tree
+cp_parser_cilk_simd_linear (cp_parser *parser, tree clauses)
+{
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+  
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    return clauses;
+  if (cp_lexer_next_token_is_not (parser->lexer, CPP_NAME))
+    {
+      cp_parser_error (parser, "expected identifier");
+      cp_parser_skip_to_closing_parenthesis (parser, false, false, true);
+      return error_mark_node;
+    }
+
+  while (1)
+    {
+      cp_token *token = cp_lexer_peek_token (parser->lexer);
+      if (cp_lexer_next_token_is_not (parser->lexer, CPP_NAME))
+	{
+	  cp_parser_error (parser, "expected variable-name");
+	  clauses = error_mark_node;
+	  break;
+	}
+
+      tree var_name = cp_parser_id_expression (parser, false, true, NULL,
+					       false, false);
+      tree decl = cp_parser_lookup_name_simple (parser, var_name,
+						token->location);
+      if (decl == error_mark_node)
+	{
+	  cp_parser_name_lookup_error (parser, var_name, decl, NLE_NULL,
+				       token->location);
+	  clauses = error_mark_node;
+	}
+      else
+	{
+	  tree e = NULL_TREE;
+	  tree step_size = integer_one_node;
+
+	  /* If present, parse the linear step.  Otherwise, assume the default
+	     value of 1.  */
+	  if (cp_lexer_peek_token (parser->lexer)->type == CPP_COLON)
+	    {
+	      cp_lexer_consume_token (parser->lexer);
+
+	      e = cp_parser_constant_expression (parser, false, NULL);
+	      e = maybe_constant_value (e);
+
+	      if (e == error_mark_node)
+		{
+		  /* If an error has occurred,  then the whole pragma is
+		     considered ill-formed.  Thus, no reason to keep
+		     parsing.  */
+		  clauses = error_mark_node;
+		  break;
+		}
+	      else if (!TREE_TYPE (e) || !TREE_CONSTANT (e)
+		       || !INTEGRAL_TYPE_P (TREE_TYPE (e)))
+		cp_parser_error (parser,
+				 "step size must be an integer constant");
+	      else
+		step_size = e;
+	    }
+
+	  /* Use the OMP_CLAUSE_LINEAR,  which has the same semantics.  */
+	  tree l = build_omp_clause (loc, OMP_CLAUSE_LINEAR);
+	  OMP_CLAUSE_DECL (l) = decl;
+	  OMP_CLAUSE_LINEAR_STEP (l) = step_size;
+	  OMP_CLAUSE_CHAIN (l) = clauses;
+	  clauses = l;
+	}
+      if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA))
+	cp_lexer_consume_token (parser->lexer);
+      else if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_PAREN))
+	break;
+      else
+	{
+	  error_at (cp_lexer_peek_token (parser->lexer)->location,
+		    "expected %<,%> or %<)%> after %qE", decl);
+	  clauses = error_mark_node;
+	  break;
+	}
+    }
+  cp_parser_skip_to_closing_parenthesis (parser, false, false, true);
+  return clauses;
+}
+
+/* Returns the name of the next clause.  If the clause is not
+   recognized, then PRAGMA_CILK_CLAUSE_NONE is returned and the next
+   token is not consumed.  Otherwise, the appropriate enum from the
+   pragma_simd_clause is returned and the token is consumed.  */
+
+static pragma_cilk_clause
+cp_parser_cilk_simd_clause_name (cp_parser *parser)
+{
+  pragma_cilk_clause clause_type;
+  cp_token *token = cp_lexer_peek_token (parser->lexer);
+
+  if (token->keyword == RID_PRIVATE)
+    clause_type = PRAGMA_CILK_CLAUSE_PRIVATE;
+  else if (!token->u.value || token->type != CPP_NAME)
+    return PRAGMA_CILK_CLAUSE_NONE;
+  else if (!strcmp (IDENTIFIER_POINTER (token->u.value), "vectorlength"))
+    clause_type = PRAGMA_CILK_CLAUSE_VECTORLENGTH;
+  else if (!strcmp (IDENTIFIER_POINTER (token->u.value), "linear"))
+    clause_type = PRAGMA_CILK_CLAUSE_LINEAR;
+  else if (!strcmp (IDENTIFIER_POINTER (token->u.value), "firstprivate"))
+    clause_type = PRAGMA_CILK_CLAUSE_FIRSTPRIVATE;
+  else if (!strcmp (IDENTIFIER_POINTER (token->u.value), "lastprivate"))
+    clause_type = PRAGMA_CILK_CLAUSE_LASTPRIVATE;
+  else if (!strcmp (IDENTIFIER_POINTER (token->u.value), "reduction"))
+    clause_type = PRAGMA_CILK_CLAUSE_REDUCTION;
+  else
+    return PRAGMA_CILK_CLAUSE_NONE;
+
+  cp_lexer_consume_token (parser->lexer);
+  return clause_type;
+}
+
+/* Parses all the #pragma simd clauses.  Returns a list of clauses found.  */
+
+static tree
+cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
+{
+  tree clauses = NULL_TREE;
+
+  while (cp_lexer_next_token_is_not (parser->lexer, CPP_PRAGMA_EOL)
+	 && clauses != error_mark_node)
+    {
+      pragma_cilk_clause c_kind;
+      c_kind = cp_parser_cilk_simd_clause_name (parser);
+      if (c_kind == PRAGMA_CILK_CLAUSE_VECTORLENGTH)
+	clauses = cp_parser_cilk_simd_vectorlength (parser, clauses);
+      else if (c_kind == PRAGMA_CILK_CLAUSE_LINEAR)
+	clauses = cp_parser_cilk_simd_linear (parser, clauses);
+      else if (c_kind == PRAGMA_CILK_CLAUSE_PRIVATE)
+	/* Use the OpenMP 4.0 equivalent function.  */
+	clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_PRIVATE, clauses);
+      else if (c_kind == PRAGMA_CILK_CLAUSE_FIRSTPRIVATE)
+	/* Use the OpenMP 4.0 equivalent function.  */
+	clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_FIRSTPRIVATE,
+					  clauses);
+      else if (c_kind == PRAGMA_CILK_CLAUSE_LASTPRIVATE)
+	/* Use the OMP 4.0 equivalent function.  */
+	clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_LASTPRIVATE,
+					  clauses);
+      else if (c_kind == PRAGMA_CILK_CLAUSE_REDUCTION)
+	/* Use the OMP 4.0 equivalent function.  */
+	clauses = cp_parser_omp_clause_reduction (parser, clauses);
+      else
+	{
+	  clauses = error_mark_node;
+	  cp_parser_error (parser, "expected %<#pragma simd%> clause");
+	  break;
+	}
+    }
+
+  cp_parser_skip_to_pragma_eol (parser, pragma_token);
+
+  if (clauses == error_mark_node)
+    return error_mark_node;
+  else
+    return c_finish_cilk_clauses (clauses);
+}
+
+/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
+
+static void
+cp_parser_cilk_simd_construct (cp_parser *parser, cp_token *pragma_token)
+{
+  tree clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
+
+  if (clauses == error_mark_node)
+    return;
+  
+  if (cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
+    {
+      error_at (cp_lexer_peek_token (parser->lexer)->location,
+		"for statement expected");
+      return;
+    }
+
+  tree sb = begin_omp_structured_block ();
+  int save = cp_parser_begin_omp_structured_block (parser);
+  cp_parser_cilk_for (parser, RID_FOR, clauses);
+  cp_parser_end_omp_structured_block (parser, save);
+  add_stmt (finish_omp_structured_block (sb));
+  return;
+}
+
+/* Parses the initializer of a for/_Cilk_for statement.  The initial
+   value is stored in *INIT, and the inital value's declaration is
+   stored as DECL_EXPR in *PRE_BODY.  */
+
+static tree
+cp_parser_simd_for_init_statement (cp_parser *parser, tree *init,
+				   tree *pre_body)
+{
+  cp_token *token = cp_lexer_peek_token (parser->lexer);
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+  tree decl = NULL_TREE;
+  cp_decl_specifier_seq type_specifiers;
+  tree this_pre_body = push_stmt_list ();
+  if (token->type == CPP_SEMICOLON)
+    {
+      error_at (loc, "expected iteration declaration");
+      return error_mark_node;
+    }
+
+  if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_REGISTER)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_EXTERN)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_MUTABLE)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_THREAD))
+    {
+      error_at (loc, "storage class is not allowed");
+      cp_lexer_consume_token (parser->lexer);
+    }
+
+  cp_parser_parse_tentatively (parser);
+  cp_parser_type_specifier_seq (parser, true, false, &type_specifiers);
+  if (cp_parser_parse_definitely (parser))
+    {
+      cp_declarator *cp_decl;
+      tree asm_spec, attr;
+      cp_decl = cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
+				      NULL, NULL, false);
+      attr = cp_parser_attributes_opt (parser);
+      asm_spec = cp_parser_asm_specification_opt (parser);
+      if (cp_decl == cp_error_declarator)
+	cp_parser_skip_to_end_of_statement (parser);
+      else
+	{
+	  tree pushed_scope, auto_node;
+	  decl = start_decl (cp_decl, &type_specifiers, SD_INITIALIZED, attr,
+			     NULL_TREE, &pushed_scope);
+	  auto_node = type_uses_auto (TREE_TYPE (decl));
+	  if (cp_lexer_next_token_is_not (parser->lexer, CPP_EQ))
+	    {
+	      if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN))
+		error_at (loc, "parenthesized initialization is "
+			  "not allowed in for-loop");
+	      else
+		{	  
+		  if (!cp_parser_require (parser, CPP_EQ, RT_EQ))
+		    decl = error_mark_node;
+		}
+
+	      *init = error_mark_node;
+	      cp_parser_skip_to_end_of_statement (parser);
+	    }
+	  else if (CLASS_TYPE_P (TREE_TYPE (decl)) || auto_node
+		   || type_dependent_expression_p (decl))
+	    {
+	      bool is_direct_init, is_non_constant_init;
+	      *init = cp_parser_initializer (parser, &is_direct_init,
+					    &is_non_constant_init);
+	      if (auto_node)
+		{
+		  TREE_TYPE (decl)
+		    = do_auto_deduction (TREE_TYPE (decl), *init, auto_node);
+		  if (!CLASS_TYPE_P (TREE_TYPE (decl))
+		      && !type_dependent_expression_p (decl))
+		    goto non_class;
+		}
+	      cp_finish_decl (decl, *init, !is_non_constant_init, asm_spec,
+			      LOOKUP_ONLYCONVERTING);
+	      if (CLASS_TYPE_P (TREE_TYPE (decl)))
+		*init = NULL_TREE;
+	      else
+		*init = pop_stmt_list (this_pre_body);
+	      this_pre_body = NULL_TREE;
+	    }
+	  else
+	    {
+	      /* Consume the '='.  */
+	      cp_lexer_consume_token (parser->lexer);
+	      *init = cp_parser_assignment_expression (parser, false, NULL);
+	    non_class:
+	      if (TREE_CODE (TREE_TYPE (decl)) == REFERENCE_TYPE)
+		*init = error_mark_node;
+	      else
+		cp_finish_decl (decl, NULL_TREE, false, asm_spec,
+				LOOKUP_ONLYCONVERTING);
+	      if (decl != error_mark_node)
+		DECL_INITIAL (decl) = (*init || *init != error_mark_node) ?
+		  *init : NULL_TREE;
+	    }
+	  if (pushed_scope)
+	    pop_scope (pushed_scope);
+	}
+    }
+  else
+    {
+      cp_id_kind idk;
+      cp_parser_parse_tentatively (parser);
+      decl = cp_parser_primary_expression (parser, false, false,
+					   false, &idk);
+      if (!cp_parser_error_occurred (parser) && decl && DECL_P (decl)
+	  && CLASS_TYPE_P (TREE_TYPE (decl)))
+	{
+	  tree rhs, new_expr;
+	  // ?? FIXME: I don't see any definition for *init in this
+	  // code path. ??
+	  gcc_unreachable ();
+	  cp_parser_parse_definitely (parser);
+	  cp_parser_require (parser, CPP_EQ, RT_EQ);
+	  rhs = cp_parser_assignment_expression (parser, false, NULL);
+	  new_expr = build_x_modify_expr (EXPR_LOCATION (rhs), decl,
+					  NOP_EXPR, rhs,
+					  tf_warning_or_error);
+	  finish_expr_stmt (new_expr);
+	}
+      else
+	{
+	  if (decl != error_mark_node)
+	    decl = NULL;
+	  cp_parser_abort_tentative_parse (parser);
+	  *init = cp_parser_expression (parser, false, NULL);
+	}
+    }
+  
+  if (this_pre_body)
+    this_pre_body = pop_stmt_list (this_pre_body);
+
+  *pre_body = this_pre_body;
+  return decl;
+}
+  
+/* Top-level function to parse _Cilk_for and the for statement
+   following <#pragma simd>.  */
+
+static tree
+cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
+{
+  bool valid = true;
+  tree cond = NULL_TREE;
+  tree incr_expr = NULL_TREE;
+  tree init = NULL_TREE, pre_body = NULL_TREE, decl;
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+  
+  gcc_assert (for_keyword == RID_FOR);
+
+  if (!cp_lexer_next_token_is_keyword (parser->lexer, for_keyword))
+    {
+      if (for_keyword == RID_FOR)
+	cp_parser_error (parser, "for statement expected");
+      else
+	cp_parser_error (parser, "_Cilk_for statement expected");
+      return error_mark_node;
+    }
+  cp_lexer_consume_token (parser->lexer);
+
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      return error_mark_node;
+    }
+
+  /* Parse initialization.  */
+  if (for_keyword == RID_FOR)
+    decl = cp_parser_simd_for_init_statement (parser, &init, &pre_body);
+
+  if (decl == error_mark_node)
+    valid = false;
+  else if (!decl || (TREE_CODE (decl) != VAR_DECL
+		     && TREE_CODE (decl) != DECL_EXPR))
+    {
+      error_at (loc, "%s-loop initializer does not declare a variable",
+		for_keyword == RID_FOR ? "for" : "_Cilk_for");
+      valid = false;
+      decl = error_mark_node;
+    }
+  else if (!processing_template_decl
+	   && !DECL_NONTRIVIALLY_INITIALIZED_P (decl)
+	   && !DECL_INITIAL (decl)
+	   && !TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (decl)))
+    {
+      error_at (loc, "control variable for the %s-loop needs to "
+		"be initialized",
+		for_keyword == RID_FOR ? "for" : "_Cilk_for");
+      valid = false;
+    }
+  if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA))
+    {
+      error_at (loc, "%s-loop initializer cannot have multiple variable "
+		"declarations", for_keyword == RID_FOR ? "for" : "_Cilk_for");
+      cp_parser_skip_to_end_of_statement (parser);
+      valid = false;
+    }
+
+  if (!valid)
+    {
+      /* Skip to the semicolon ending the init.  */
+      cp_parser_skip_to_end_of_statement (parser);
+    }
+
+  /* Parse condition.  */
+  if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
+    return error_mark_node;
+  if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
+    {
+      error_at (loc, "missing condition");
+      cond = error_mark_node;
+    }
+  else
+    {
+      cond = cp_parser_condition (parser);
+      cond = finish_cilk_for_cond (cond);
+    }
+
+  if (cond == error_mark_node)
+    valid = false;
+  cp_parser_consume_semicolon_at_end_of_statement (parser);
+
+  /* Parse increment.  */
+  if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_PAREN))
+    {
+      error_at (loc, "missing increment");
+      incr_expr = error_mark_node;
+    }
+  else
+    incr_expr = cp_parser_expression (parser, false, NULL);
+  
+  if (incr_expr == error_mark_node)
+    {
+      cp_parser_skip_to_closing_parenthesis (parser, true, false, false);
+      valid = false;
+    }
+
+  if (!cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      valid = false;
+    }
+  
+  if (!valid)
+    {
+      gcc_assert (sorrycount || errorcount);
+      return error_mark_node;
+    }
+
+  if (for_keyword == RID_FOR)
+    {
+      parser->in_statement = IN_CILK_P_SIMD_FOR;
+      tree body = push_stmt_list ();
+      cp_parser_statement (parser, NULL_TREE, false, NULL);
+      body = pop_stmt_list (body);
+
+      /* Check if the body satisfies all the requirement of a #pragma
+	 simd for body.  If it is invalid, then do not make the OpenMP
+	 nodes, just return an error mark node.  */
+      if (!cpp_validate_cilk_plus_loop (body))
+	return error_mark_node;
+
+      return c_finish_cilk_simd_loop (loc, decl, init, cond, incr_expr,
+				      body, clauses);
+    }
+  else
+    {
+      /* Handle _Cilk_for here when implemented.  */
+      gcc_unreachable ();
+      return NULL_TREE;
+    }
+}
+
 #include "gt-cp-parser.h"
diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index 3d8bb74..4fbc655 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -292,6 +292,7 @@ typedef struct GTY(()) cp_parser {
 #define IN_OMP_BLOCK		4
 #define IN_OMP_FOR		8
 #define IN_IF_STMT             16
+#define IN_CILK_P_SIMD_FOR     32
   unsigned char in_statement;
 
   /* TRUE if we are presently parsing the body of a switch statement.
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index f821754..4ee7477 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5142,6 +5142,13 @@ finish_omp_taskyield (void)
   finish_expr_stmt (stmt);
 }
 
+/* Perform any canonicalization of the conditional in a Cilk for loop.  */
+tree
+finish_cilk_for_cond (tree cond)
+{
+  return cp_truthvalue_conversion (cond);
+}
+
 /* Begin a __transaction_atomic or __transaction_relaxed statement.
    If PCOMPOUND is non-null, this is for a function-transaction-block, and we
    should create an extra compound stmt.  */
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index a698251..59724f8 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1110,6 +1110,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_SIMD:
 	  kind = " simd";
 	  break;
+	case GF_OMP_FOR_KIND_CILKSIMD:
+	  kind = " cilksimd";
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1138,6 +1141,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_SIMD:
 	  pp_string (buffer, "#pragma omp simd");
 	  break;
+	case GF_OMP_FOR_KIND_CILKSIMD:
+	  pp_string (buffer, "#pragma simd");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 37c37a6..b07f850 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -112,7 +112,8 @@ enum gf_mask {
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
     GF_OMP_FOR_KIND_MASK	= 3 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
-    GF_OMP_FOR_KIND_SIMD	= 1 << 0,
+    GF_OMP_FOR_KIND_SIMD	= 2 << 0,
+    GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
 
     /* True on an GIMPLE_OMP_RETURN statement if the return does not require
        a thread synchronization via some sort of barrier.  The exact barrier
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 7b16f87..3b09ea8 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4711,6 +4711,7 @@ is_gimple_stmt (tree t)
     case OMP_PARALLEL:
     case OMP_FOR:
     case OMP_SIMD:
+    case CILK_SIMD:
     case OMP_SECTIONS:
     case OMP_SECTION:
     case OMP_SINGLE:
@@ -6584,7 +6585,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 
   for_stmt = *expr_p;
 
-  simd = TREE_CODE (for_stmt) == OMP_SIMD; 
+  simd = TREE_CODE (for_stmt) == OMP_SIMD
+    || TREE_CODE (for_stmt) == CILK_SIMD;
   gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p,
 			     simd ? ORT_SIMD : ORT_WORKSHARE);
 
@@ -6810,6 +6812,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     {
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
+    case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
     default:
       gcc_unreachable ();
     }
@@ -7752,6 +7755,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 
 	case OMP_FOR:
 	case OMP_SIMD:
+	case CILK_SIMD:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
 
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 4e2356d..0736b45 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -223,7 +223,7 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   int i;
   struct omp_for_data_loop dummy_loop;
   location_t loc = gimple_location (for_stmt);
-  bool simd = gimple_omp_for_kind (for_stmt) == GF_OMP_FOR_KIND_SIMD;
+  bool simd = gimple_omp_for_kind (for_stmt) & GF_OMP_FOR_KIND_SIMD;
 
   fd->for_stmt = for_stmt;
   fd->pre = NULL;
@@ -309,6 +309,10 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	case LT_EXPR:
 	case GT_EXPR:
 	  break;
+	case NE_EXPR:
+	  gcc_assert (gimple_omp_for_kind (for_stmt)
+		      == GF_OMP_FOR_KIND_CILKSIMD);
+	  break;
 	case LE_EXPR:
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->n2)))
 	    loop->n2 = fold_build_pointer_plus_hwi_loc (loc, loop->n2, 1);
@@ -938,7 +942,7 @@ build_outer_var_ref (tree var, omp_context *ctx)
       x = build_receiver_ref (var, by_ref, ctx);
     }
   else if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
-	   && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+	   && gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_SIMD)
     {
       /* #pragma omp simd isn't a worksharing construct, and can reference even
 	 private vars in its linear etc. clauses.  */
@@ -1869,7 +1873,7 @@ check_omp_nesting_restrictions (gimple stmt, omp_context *ctx)
   if (ctx != NULL)
     {
       if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
-	  && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+	  && gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_SIMD)
 	{
 	  error_at (gimple_location (stmt),
 		    "OpenMP constructs may not be nested inside simd region");
@@ -1879,7 +1883,7 @@ check_omp_nesting_restrictions (gimple stmt, omp_context *ctx)
   switch (gimple_code (stmt))
     {
     case GIMPLE_OMP_FOR:
-      if (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_SIMD)
+      if (gimple_omp_for_kind (stmt) & GF_OMP_FOR_KIND_SIMD)
 	return true;
       /* FALLTHRU */
     case GIMPLE_OMP_SECTIONS:
@@ -2383,7 +2387,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
   bool lastprivate_firstprivate = false;
   int pass;
   bool is_simd = (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
-		  && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD);
+		  && gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_SIMD);
   int max_vf = 0;
   tree lane = NULL_TREE, idx = NULL_TREE;
   tree ivar = NULL_TREE, lvar = NULL_TREE;
@@ -2839,7 +2843,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
       /* Don't add any barrier for #pragma omp simd or
 	 #pragma omp distribute.  */
       if (gimple_code (ctx->stmt) != GIMPLE_OMP_FOR
-	  || gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_FOR)
+	  || gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_FOR)
 	gimplify_and_add (build_omp_barrier (), ilist);
     }
 
@@ -2918,7 +2922,7 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
     }
 
   if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
-      && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+      && gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_SIMD)
     {
       simduid = find_omp_clause (orig_clauses, OMP_CLAUSE__SIMDUID_);
       if (simduid)
@@ -3013,7 +3017,7 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx)
 
   /* SIMD reductions are handled in lower_rec_input_clauses.  */
   if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
-      && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+      && gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_SIMD)
     return;
 
   /* First see if there is exactly one reduction clause.  Use OMP_ATOMIC
@@ -5702,7 +5706,7 @@ expand_omp_for (struct omp_region *region)
        original loops from being detected.  Fix that up.  */
     loops_state_set (LOOPS_NEED_FIXUP);
 
-  if (gimple_omp_for_kind (fd.for_stmt) == GF_OMP_FOR_KIND_SIMD)
+  if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
       && !fd.have_ordered
@@ -6807,7 +6811,7 @@ execute_expand_omp (void)
 static bool
 gate_expand_omp (void)
 {
-  return (flag_openmp != 0 && !seen_error ());
+  return ((flag_openmp || flag_enable_cilkplus) && !seen_error ());
 }
 
 struct gimple_opt_pass pass_expand_omp =
@@ -7958,7 +7962,7 @@ execute_lower_omp (void)
 
   /* This pass always runs, to provide PROP_gimple_lomp.
      But there is nothing to do unless -fopenmp is given.  */
-  if (flag_openmp == 0)
+  if (!flag_openmp && !flag_enable_cilkplus)
     return 0;
 
   all_contexts = splay_tree_new (splay_tree_compare_pointers, 0,
@@ -8061,12 +8065,33 @@ diagnose_sb_0 (gimple_stmt_iterator *gsi_p,
     error ("invalid entry to OpenMP structured block");
 #endif
 
+  bool cilkplus_block = false;
+  if (flag_enable_cilkplus)
+    {
+      if ((branch_ctx
+	   && gimple_code (branch_ctx) == GIMPLE_OMP_FOR
+	   && gimple_omp_for_kind (branch_ctx) == GF_OMP_FOR_KIND_CILKSIMD)
+	  || (gimple_code (label_ctx) == GIMPLE_OMP_FOR
+	      && gimple_omp_for_kind (label_ctx) == GF_OMP_FOR_KIND_CILKSIMD))
+	cilkplus_block = true;
+    }
+
   /* If it's obvious we have an invalid entry, be specific about the error.  */
   if (branch_ctx == NULL)
-    error ("invalid entry to OpenMP structured block");
+    {
+      if (cilkplus_block)
+	error ("invalid entry to Cilk Plus structured block");
+      else
+	error ("invalid entry to OpenMP structured block");
+    }
   else
-    /* Otherwise, be vague and lazy, but efficient.  */
-    error ("invalid branch to/from an OpenMP structured block");
+    {
+      /* Otherwise, be vague and lazy, but efficient.  */
+      if (cilkplus_block)
+	error ("invalid branch to/from a Cilk Plus structured block");
+      else
+	error ("invalid branch to/from an OpenMP structured block");
+    }
 
   gsi_replace (gsi_p, gimple_build_nop (), false);
   return true;
@@ -8249,7 +8274,7 @@ diagnose_omp_structured_block_errors (void)
 static bool
 gate_diagnose_omp_blocks (void)
 {
-  return flag_openmp != 0;
+  return flag_openmp || flag_enable_cilkplus;
 }
 
 struct gimple_opt_pass pass_diagnose_omp_blocks =
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c
new file mode 100644
index 0000000..e8e2066
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -fopenmp" } */
+
+int *a, *b, c;
+void *jmpbuf[10];
+
+void foo()
+{
+  int j;
+
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+    {
+      if (c == 6)
+	__builtin_setjmp (jmpbuf); /* { dg-error "calls to setjmp are not allowed" } */
+      a[i] = b[i];
+    }
+
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+    {
+      if (c==5)
+	break; /* { dg-error "break statement within" } */
+    }
+
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+    {
+#pragma omp for /* { dg-error "OpenMP statements are not allowed" } */
+      for (j=0; j < 1000; ++j)
+	a[i] = b[i];
+    }
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses1.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses1.c
new file mode 100644
index 0000000..6d84791
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses1.c
@@ -0,0 +1,76 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -Werror -Wunknown-pragmas -fcilkplus" } */
+
+volatile int *a, *b;
+
+void foo()
+{
+  int i, j, k;
+
+#pragma simd assert /* { dg-error "expected '#pragma simd' clause" } */
+  for (i=0; i < 100; ++i)
+    a[i] = b[i];
+
+#pragma simd vectorlength /* { dg-error "expected '\\('" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd vectorlength /* { dg-error "expected '\\('" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd vectorlength(sizeof (a) == sizeof (float) ? 4 : 8)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd vectorlength(4,8) /* { dg-error "expected '\\)'" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd vectorlength(i) /* { dg-error "\(vectorlength must be an integer\|in a constant\)" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(35) /* { dg-error "expected identifier" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(blah) /* { dg-error "'blah' \(undeclared\|has not been\)" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(j, 36, k) /* { dg-error "expected" } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(i, j)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(i)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(i : 4)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(i : 2, j : 4, k)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(j : sizeof (a) == sizeof (float) ? 4 : 8)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+  // And now everyone in unison!
+#pragma simd linear(j : 4) vectorlength(4)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+
+#pragma simd linear(blah2, 36)
+  /* { dg-error "'blah2' \(undeclared\|has not been\)" "undeclared" { target *-*-* } 71 } */
+  /* { dg-error "expected" "expected" { target *-*-* } 71 } */
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses2.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses2.c
new file mode 100644
index 0000000..71589c2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-original -fcilkplus" } */
+
+volatile int *a, *b;
+
+void foo()
+{
+  int j, k;
+
+#pragma simd linear(j : 4, k) vectorlength(4)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[j];
+}
+
+/* { dg-final { scan-tree-dump-times "linear\\(j:4\\)" 1 "original" } } */
+/* { dg-final { scan-tree-dump-times "linear\\(k:1\\)" 1 "original" } } */
+/* { dg-final { scan-tree-dump-times "safelen\\(4\\)" 1 "original" } } */
+/* { dg-final { cleanup-tree-dump "original" } } */
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses3.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses3.c
new file mode 100644
index 0000000..579b718
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses3.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+#define N 1000
+
+int A[N], B[N], C[N];
+int main (void)
+{
+#pragma simd private (B) linear(B:1) /* { dg-error "more than one clause" } */
+  for (int ii = 0; ii < N; ii++)
+    {
+      A[ii] = B[ii] + C[ii];
+    }
+
+#pragma simd private (B, C) linear(B:1) /* { dg-error "more than one clause" } */
+  for (int ii = 0; ii < N; ii++)
+    {
+      A[ii] = B[ii] + C[ii];
+    }
+
+#pragma simd private (B) linear(C:2, B:1) /* { dg-error "more than one clause" } */
+  for (int ii = 0; ii < N; ii++)
+    {
+      A[ii] = B[ii] + C[ii];
+    }
+
+#pragma simd reduction (+:B) linear(B:1) /* { dg-error "more than one clause" } */
+  for (int ii = 0; ii < N; ii++)
+    {
+      A[ii] = B[ii] + C[ii];
+    }
+
+#pragma simd reduction (+:B) linear(B) /* { dg-error "more than one clause" } */
+  for (int ii = 0; ii < N; ii++)
+    {
+      A[ii] = B[ii] + C[ii];
+    }
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/for1.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/for1.c
new file mode 100644
index 0000000..04773d1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/for1.c
@@ -0,0 +1,139 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+int *a, *b, *c;
+int something;
+
+void foo()
+{
+  int i, j;
+
+  // Declaration and initialization is allowed.
+#pragma simd
+  for (int i=0; i < 1000; i++)
+    a[i] = b[j];
+
+  // Empty initialization is not allowed.
+#pragma simd
+  for (; i < 5; ++i)		// { dg-error "expected iteration decl" }
+    a[i] = i;
+
+  // Empty condition is not allowed.
+#pragma simd
+  for (int i=0; ; ++i)		/* { dg-error "missing condition" } */
+    a[i] = i;
+
+  // Empty increment is not allowed.
+#pragma simd
+  for (int i=0; i < 1234; )		/* { dg-error "missing increment" } */
+    a[i] = i*2;
+
+#pragma simd
+  i = 5; /* { dg-error "for statement expected" } */
+
+  // Initialization variables must be either integral or pointer types.
+  struct S {
+    int i;
+  };
+#pragma simd
+  for (struct S ss = { 0 }; ss.i <= 1000; ++ss.i) /* { dg-error "initialization variable must be of integral or pointer type" } */
+    a[ss.i] = b[ss.i];
+
+  #pragma simd
+  for (float f=0.0; f < 15.0; ++f) /* { dg-error "must be of integral" } */
+    a[(int)f] = (int) f;
+
+  // Pointers are OK.
+  #pragma simd
+  for (int *i=c; i < &c[100]; ++i)
+    *a = '5';
+
+  // Condition of '==' is not allowed.
+#pragma simd
+  for (int i=j; i == 5; ++i) /* { dg-error "invalid controlling predicate" } */
+    a[i] = b[i];
+
+  // The LHS or RHS of the condition must be the initialization variable.
+#pragma simd
+  for (int i=0; i+j < 1234; ++i) /* { dg-error "invalid controlling predicate" } */
+    a[i] = b[i];  
+
+  // Likewise.
+#pragma simd
+  for (int i=0; 1234 < i + j; ++i) /* { dg-error "invalid controlling predicate" } */
+    a[i] = b[i];  
+
+  // Likewise, this is ok.
+#pragma simd
+  for (int i=0; 1234 + j < i; ++i)
+    a[i] = b[i];
+
+  // According to the CilkPlus forum, casts are not allowed, even if
+  // they are no-ops.
+#pragma simd
+  for (int i=0; (char)i < 1234; ++i) /* { dg-error "invalid controlling predicate" } */
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=255; i != something; --i)
+    a[i] = b[i];
+
+  // This condition gets folded into "i != 0" by
+  // c_parser_cilk_for_statement().  This is allowed as per the "!="
+  // allowance above.
+#pragma simd
+  for (int i=100; i; --i)
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=100; i != 5; i += something)
+    a[i] = b[i];
+
+  // Increment must be on the induction variable.
+#pragma simd
+  for (int i=0; i < 100; j++) /* { dg-error "invalid increment expression" } */
+    a[i] = b[i];
+
+  // Likewise.
+#pragma simd
+  for (int i=0; i < 100; j = i + 1) /* { dg-error "invalid increment expression" } */
+    a[i] = b[i];
+
+  // Likewise.
+#pragma simd
+  for (int i=0; i < 100; i = j + 1) /* { dg-error "invalid increment expression" } */
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=0; i < 100; i = i + 5)
+    a[i] = b[i];
+
+  // Only PLUS and MINUS increments are allowed.
+#pragma simd
+  for (int i=0; i < 100; i *= 5) /* { dg-error "invalid increment expression" } */
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=0; i < 100; i -= j)
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=0; i < 100; i = i + j)
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=0; i < 100; i = j + i)
+    a[i] = b[i];
+
+#pragma simd
+  for (int i=0; i < 100; ++i, ++j) /* { dg-error "invalid increment expression" } */
+    a[i] = b[i];
+
+#pragma simd
+  for (int *point=0; point < b; ++point)
+    *point = 555;
+
+#pragma simd
+  for (int *point=0; point > b; --point)
+    *point = 555;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/for3.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/for3.c
new file mode 100644
index 0000000..8660627
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/for3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+#pragma simd		/* { dg-error "must be inside a function" } */
+
+void foo()
+{
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/for4.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/for4.c
new file mode 100644
index 0000000..2da8235
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/for4.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+int *a, *c;
+
+void foo()
+{
+  int i, j;
+
+  // Pointers are OK.
+  #pragma simd
+  for (int *i=c; i < c; ++i)
+    *a = '5';
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/for5.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/for5.c
new file mode 100644
index 0000000..7075a44
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/for5.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+int *a, *b;
+
+void foo()
+{
+#pragma simd
+  for (int i=100; i; --i)
+    a[i] = b[i];
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/p_simd_test1.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/p_simd_test1.c
new file mode 100644
index 0000000..43a359a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/p_simd_test1.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+#include <stdio.h>
+
+#define ARRAY_SIZE  (256)
+int a[ARRAY_SIZE];
+
+__attribute__((noinline))
+int addit (int *arr, int N)
+{
+  int s=0;
+#pragma simd reduction (+:s)
+  for (int i = 0; i < N; i++)
+    s += arr[i];
+  return s;
+}
+
+int main () {
+  int i, s = 0, r = 0;
+  for (i = 0; i < ARRAY_SIZE; i++)
+    {
+      a[i] = i;
+    }
+
+  s = addit (a, ARRAY_SIZE);
+
+  for (i = 0; i < ARRAY_SIZE; i++) 
+    r += i;
+
+  if (s == r)
+    return 0;
+  else
+    return 1;
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/p_simd_test2.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/p_simd_test2.c
new file mode 100644
index 0000000..fe51a29
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/p_simd_test2.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+#define N 256
+#if HAVE_IO
+#include <stdio.h>
+#endif
+#include <malloc.h>
+
+int
+reduction_simd (int *a)
+{
+  int s = 0;
+
+#pragma simd reduction (+:s)
+  for (int i = 0; i < N; i++)
+    {
+      s += a[i];
+    }
+
+  return s;
+}
+
+int
+main ()
+{
+  int *a = (int *) malloc (N * sizeof (int));
+  int i, s = (N - 1) * N / 2;
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = i;
+    }
+#if HAVE_IO
+  printf ("%d, %d\n", s, reduction_simd (a));
+#endif
+  if (s == reduction_simd (a))
+    return 0;
+  else
+    return 1;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/reduction_ex.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/reduction_ex.c
new file mode 100644
index 0000000..920b6db
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/reduction_ex.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+int argc = 1;
+
+/* This is a simple vectorization, it checks if check_off_reduction_var works 
+   and it also checks if it can vectorize this loop in func correctly. */
+#define N 1000
+
+int func (int *p, int *q) {
+    int x = 0;
+#pragma simd reduction (+:x)
+    for (int ii = 0; ii < N; ii++) { 
+	x += (q[ii] + p[ii]);
+    }
+    return x; 
+
+}
+
+int main ()
+{
+  int ii = 0, x;
+  int Array[N], Array2[N];
+
+  for (ii = 0; ii < N; ii++)
+    {
+      Array[ii] = 5 + argc;
+      Array2[ii] = argc;
+    }
+  x = func (Array, Array2);
+
+  if (x != N * 7)
+    return 1;
+  return 0;
+}
+
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/run-1.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/run-1.c
new file mode 100644
index 0000000..c8fe1c7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/run-1.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-fcilkplus -O3" } */
+
+#include <stdlib.h>
+
+#define N 4
+
+float f1[] =  { 2.0, 3.0,  4.0,  5.0 };
+float f2[] =  { 1.0, 6.0, -1.0, -2.0 };
+float res[] = { 3.0, 9.0,  3.0,  3.0 };
+
+__attribute__((noinline))
+void verify (float *sum)
+{
+  for (int i=0; i < N; ++i)
+    if (sum[i] != res[i])
+      abort ();
+}
+
+int main()
+{
+  float sum[N];
+#pragma simd
+  for (int i=0; i < N; ++i)
+    sum[i] = f1[i] + f2[i];
+  verify (sum);
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/safelen.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/safelen.c
new file mode 100644
index 0000000..2c59de9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/safelen.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-gimple -fcilkplus" } */
+
+int *a, *b;
+
+void foo()
+{
+#pragma simd vectorlength(8)
+  for (int i=0; i < 1000; ++i)
+    a[i] = b[i];
+}
+
+/* { dg-final { scan-tree-dump-times "safelen\\(8\\)" 1 "gimple" } } */
+/* { dg-final { cleanup-tree-dump "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/vectorlength.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/vectorlength.c
new file mode 100644
index 0000000..9aa4a68
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/vectorlength.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+volatile int *a, *b, N;
+typedef int tint;
+struct someclass {
+  int a;
+  char b;
+  int *p;
+};
+
+void foo()
+{
+#pragma simd vectorlength(4) vectorlength(8) /* { dg-error "too many 'vectorlength' clauses" } */
+  for (int i=0; i < N; ++i)
+    a[i] = b[i];
+
+#pragma simd vectorlength(3) /* { dg-error "must be a power of 2" } */
+  for (int i=0; i < N; ++i)
+    a[i] = b[i];
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
index a153529..3accc99 100644
--- a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
@@ -16,10 +16,16 @@
 
 # Written by Balaji V. Iyer <balaji.v.iyer@intel.com>
 
-
 load_lib g++-dg.exp
 
 dg-init
+# Run the tests that are shared with C.
+g++-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/PS/*.c]] ""
+# Run the C++ only tests.
+g++-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] ""
+dg-finish
+
+dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O1 -fcilkplus" " "
diff --git a/gcc/testsuite/g++.dg/cilk-plus/for.C b/gcc/testsuite/g++.dg/cilk-plus/for.C
new file mode 100644
index 0000000..2295d21
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/for.C
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-ftree-vectorize -fcilkplus" } */
+
+int *a, *b;
+
+void foo()
+{
+  int i;
+#pragma simd
+  for (i=0; i < 10000; ++i) /* { dg-error "initializer does not declare a var" } */
+    a[i] = b[i];
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/for2.C b/gcc/testsuite/g++.dg/cilk-plus/for2.C
new file mode 100644
index 0000000..d30e057
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/for2.C
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+// Test storage classes in the initialization of a <#pragma simd> for
+// loop.
+
+int *a, *b;
+
+void foo()
+{
+#pragma simd
+  for (static int tt=5; tt < 10; ++tt) /* { dg-error "storage class is not allowed" } */
+    a[tt] = b[tt];
+
+#pragma simd
+  for (extern int var=0; var < 1000; ++var) /* { dg-error "storage class is not allowed" } */
+    a[var] = var;
+
+#pragma simd
+  for (register int regj = 0; regj < 1000; ++regj) /* { dg-error "storage class is not allowed" } */
+    b[regj] = a[regj] * 2;
+
+#pragma simd
+  for (volatile int vj=0; vj<1000; ++vj) /* { dg-error "induction variable cannot be volatile" } */
+    a[vj] = b[vj];
+}
diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
index 710218e..e9d0428 100644
--- a/gcc/testsuite/g++.dg/dg.exp
+++ b/gcc/testsuite/g++.dg/dg.exp
@@ -49,6 +49,7 @@ set tests [prune $tests $srcdir/$subdir/tree-prof/*]
 set tests [prune $tests $srcdir/$subdir/torture/*]
 set tests [prune $tests $srcdir/$subdir/graphite/*]
 set tests [prune $tests $srcdir/$subdir/tm/*]
+set tests [prune $tests $srcdir/$subdir/cilk-plus/*]
 set tests [prune $tests $srcdir/$subdir/guality/*]
 set tests [prune $tests $srcdir/$subdir/simulate-thread/*]
 set tests [prune $tests $srcdir/$subdir/asan/*]
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/auto.c b/gcc/testsuite/gcc.dg/cilk-plus/auto.c
new file mode 100644
index 0000000..253acee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/auto.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+
+int *a, *b;
+
+void foo()
+{
+  // This seems like it should be ok.
+  // Must check with standards people.
+#pragma simd
+  for (auto int autoi = 0; autoi < 1000; ++autoi)
+    b[autoi] = a[autoi] * 2;
+  // Similarly here.
+  auto int autoj;
+#pragma simd
+  for (auto int autoj = 0; autoj < 1000; ++autoj)
+    b[autoj] = a[autoj] * 2;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
index 59b2305..e109c71 100644
--- a/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
@@ -20,6 +20,13 @@
 load_lib gcc-dg.exp
 
 dg-init
+
+# Run the tests that are shared with C++.
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/PS/*.c]] " -ftree-vectorize -fcilkplus -std=c99" " "
+# Run the C-only tests.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \
+	"-ftree-vectorize -fcilkplus -std=c99" " "
+
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O1 -fcilkplus" " "
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/for1.c b/gcc/testsuite/gcc.dg/cilk-plus/for1.c
new file mode 100644
index 0000000..4fb5342
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/for1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+int *a, *b, *c;
+
+void foo()
+{
+  int i, j;
+  // The initialization shall declare or initialize a *SINGLE* variable.
+#pragma simd
+  for (i=0, j=5; i < 1000; i++) // { dg-error "expected ';' before ','" }
+    a[i] = b[j];
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/for2.c b/gcc/testsuite/gcc.dg/cilk-plus/for2.c
new file mode 100644
index 0000000..dc0a41e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/for2.c
@@ -0,0 +1,66 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcilkplus" } */
+
+// Test storage classes in the initialization of a <#pragma simd> for
+// loop.
+
+int *a, *b;
+
+void foo()
+{
+#pragma simd
+  for (static int foo=5; foo < 10; ++foo)
+    a[foo] = b[foo];
+  /* { dg-error "declaration of static variable" "storage class1" { target *-*-* } 12 } */
+  /* { dg-error "induction variable cannot be static" "storage class2" { target *-*-* } 12 } */
+
+  static int bar;
+#pragma simd
+  for (bar=0; bar < 1000; ++bar) /* { dg-error "induction variable cannot be static" } */
+    a[bar] = bar;
+
+#pragma simd
+  for (extern int var=0; var < 1000; ++var)
+    a[var] = var;
+  /* { dg-error "has both 'extern' and initializer" "extern" { target *-*-* } 23 } */
+  /* { dg-error "declaration of static variable" "" { target *-*-* } 23 } */
+  /* { dg-error "induction variable cannot be static" "" { target *-*-* } 23 } */
+
+  extern int extvar;
+#pragma simd
+  for (extvar = 0; extvar < 1000; ++extvar) /* { dg-error "induction variable cannot be extern" } */
+    b[extvar] = a[extvar];
+
+  // This seems like it should be ok.
+  // Must check with standards people.
+#pragma simd
+  for (auto int autoi = 0; autoi < 1000; ++autoi)
+    b[autoi] = a[autoi] * 2;
+  // Similarly here.
+  auto int autoj;
+#pragma simd
+  for (auto int autoj = 0; autoj < 1000; ++autoj)
+    b[autoj] = a[autoj] * 2;
+
+  register int regi;
+#pragma simd
+  for (regi = 0; regi < 1000; ++regi) /* { dg-error "induction variable cannot be declared register" } */
+    b[regi] = a[regi] * 2;
+
+#pragma simd
+  for (register int regj = 0; regj < 1000; ++regj) /* { dg-error "induction variable cannot be declared register" } */
+    b[regj] = a[regj] * 2;
+
+  volatile int vi;
+#pragma simd
+  for (vi=0; vi<1000; ++vi) /* { dg-error "induction variable cannot be volatile" } */
+    a[vi] = b[vi];
+
+#pragma simd
+  for (volatile int vj=0; vj<1000; ++vj) /* { dg-error "induction variable cannot be volatile" } */
+    a[vj] = b[vj];
+
+#pragma simd
+  for (const int ci=0; ci<1000; ++ci) /* { dg-error "increment of read-only var" } */
+    a[ci] = b[ci];
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/jump.c b/gcc/testsuite/gcc.dg/cilk-plus/jump.c
new file mode 100644
index 0000000..9ec3293
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/jump.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+
+int *a, *b, c;
+
+void foo()
+{
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+    {
+      a[i] = b[i];
+      if (c == 5)
+	return;	 /* { dg-error "invalid branch to.from a Cilk" } */
+    }
+}
+
+void bar()
+{
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+    {
+    lab:
+      a[i] = b[i];
+    }
+  if (c == 6)
+    goto lab; /* { dg-error "invalid entry to Cilk Plus" } */
+}
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index b2d32fa8..bfab737 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2211,6 +2211,10 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
       pp_string (buffer, "#pragma omp simd");
       goto dump_omp_loop;
 
+    case CILK_SIMD:
+      pp_string (buffer, "#pragma simd");
+      goto dump_omp_loop;
+
     dump_omp_loop:
       dump_omp_clauses (buffer, OMP_FOR_CLAUSES (node), spc, flags);
 
diff --git a/gcc/tree.def b/gcc/tree.def
index f825aad..552c704 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1034,6 +1034,10 @@ DEFTREECODE (OMP_FOR, "omp_for", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
 
+/* Cilk Plus - #pragma simd [clause1 ... clauseN]
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
+
 /* OpenMP - #pragma omp sections [clause1 ... clauseN]
    Operand 0: OMP_SECTIONS_BODY: Sections body.
    Operand 1: OMP_SECTIONS_CLAUSES: List of clauses.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index b902e39..74ec261 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1792,12 +1792,13 @@ extern void protected_set_expr_location (tree, location_t);
 #define OMP_TASKREG_BODY(NODE)    TREE_OPERAND (OMP_TASKREG_CHECK (NODE), 0)
 #define OMP_TASKREG_CLAUSES(NODE) TREE_OPERAND (OMP_TASKREG_CHECK (NODE), 1)
 
-#define OMP_FOR_BODY(NODE)	   TREE_OPERAND (OMP_FOR_CHECK (NODE), 0)
-#define OMP_FOR_CLAUSES(NODE)	   TREE_OPERAND (OMP_FOR_CHECK (NODE), 1)
-#define OMP_FOR_INIT(NODE)	   TREE_OPERAND (OMP_FOR_CHECK (NODE), 2)
-#define OMP_FOR_COND(NODE)	   TREE_OPERAND (OMP_FOR_CHECK (NODE), 3)
-#define OMP_FOR_INCR(NODE)	   TREE_OPERAND (OMP_FOR_CHECK (NODE), 4)
-#define OMP_FOR_PRE_BODY(NODE)	   TREE_OPERAND (OMP_FOR_CHECK (NODE), 5)
+#define OMP_LOOP_CHECK(NODE) TREE_RANGE_CHECK (NODE, OMP_FOR, CILK_SIMD)
+#define OMP_FOR_BODY(NODE)	   TREE_OPERAND (OMP_LOOP_CHECK (NODE), 0)
+#define OMP_FOR_CLAUSES(NODE)	   TREE_OPERAND (OMP_LOOP_CHECK (NODE), 1)
+#define OMP_FOR_INIT(NODE)	   TREE_OPERAND (OMP_LOOP_CHECK (NODE), 2)
+#define OMP_FOR_COND(NODE)	   TREE_OPERAND (OMP_LOOP_CHECK (NODE), 3)
+#define OMP_FOR_INCR(NODE)	   TREE_OPERAND (OMP_LOOP_CHECK (NODE), 4)
+#define OMP_FOR_PRE_BODY(NODE)	   TREE_OPERAND (OMP_LOOP_CHECK (NODE), 5)
 
 #define OMP_SECTIONS_BODY(NODE)    TREE_OPERAND (OMP_SECTIONS_CHECK (NODE), 0)
 #define OMP_SECTIONS_CLAUSES(NODE) TREE_OPERAND (OMP_SECTIONS_CHECK (NODE), 1)
diff mbox

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ffd85e7..b264f1b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2557,7 +2557,7 @@  omp-low.o : omp-low.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
    $(RTL_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h $(DIAGNOSTIC_CORE_H) \
    $(TREE_FLOW_H) $(FLAGS_H) $(EXPR_H) $(DIAGNOSTIC_CORE_H) \
    $(TREE_PASS_H) $(GGC_H) $(EXCEPT_H) $(SPLAY_TREE_H) $(OPTABS_H) \
-   $(CFGLOOP_H) tree-iterator.h gt-omp-low.h
+   $(CFGLOOP_H) tree-iterator.h $(TARGET_H) gt-omp-low.h
 tree-browser.o : tree-browser.c tree-browser.def $(CONFIG_H) $(SYSTEM_H) \
    coretypes.h $(HASH_TABLE_H) $(TREE_H) $(TREE_PRETTY_PRINT_H)
 omega.o : omega.c $(OMEGA_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(DUMPFILE_H) \
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 0f24799..cd2f527 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -168,6 +168,20 @@  struct GTY ((chain_next ("%h.next"))) loop {
      describes what is the state of the estimation.  */
   enum loop_estimation estimate_state;
 
+  /* If > 0, an integer, where the user asserted that for any
+     I in [ 0, nb_iterations ) and for any J in
+     [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations
+     of the loop can be safely evaluated concurrently.  */
+  int safelen;
+
+  /* True if we should try harder to vectorize this loop.  */
+  bool force_vect;
+
+  /* For SIMD loops, this is a unique identifier of the loop, referenced
+     by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
+     builtins.  */
+  tree simduid;
+
   /* Upper bound on number of iterations of a loop.  */
   struct nb_iter_bound *bounds;
 
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 3e8043a..01d906e 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4024,7 +4024,7 @@  more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
    See semantics.c for details.  */
 #define CP_OMP_CLAUSE_INFO(NODE) \
   TREE_TYPE (OMP_CLAUSE_RANGE_CHECK (NODE, OMP_CLAUSE_PRIVATE, \
-				     OMP_CLAUSE_COPYPRIVATE))
+				     OMP_CLAUSE_LINEAR))
 
 /* Nonzero if this transaction expression's body contains statements.  */
 #define TRANSACTION_EXPR_IS_STMT(NODE) \
diff --git a/gcc/function.h b/gcc/function.h
index c651f50..d1f4ffc 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -650,6 +650,14 @@  struct GTY(()) function {
      adjusts one of its arguments and forwards to another
      function.  */
   unsigned int is_thunk : 1;
+
+  /* Nonzero if the current function contains any loops with
+     loop->force_vect set.  */
+  unsigned int has_force_vect_loops : 1;
+
+  /* Nonzero if the current function contains any loops with
+     nonzero value in loop->simduid.  */
+  unsigned int has_simduid_loops : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index ddb086c..a698251 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1101,8 +1101,20 @@  dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 
   if (flags & TDF_RAW)
     {
-      dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S>%nCLAUSES <", gs,
-                       gimple_omp_body (gs));
+      const char *kind;
+      switch (gimple_omp_for_kind (gs))
+	{
+	case GF_OMP_FOR_KIND_FOR:
+	  kind = "";
+	  break;
+	case GF_OMP_FOR_KIND_SIMD:
+	  kind = " simd";
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      dump_gimple_fmt (buffer, spc, flags, "%G%s <%+BODY <%S>%nCLAUSES <", gs,
+		       kind, gimple_omp_body (gs));
       dump_omp_clauses (buffer, gimple_omp_for_clauses (gs), spc, flags);
       dump_gimple_fmt (buffer, spc, flags, " >,");
       for (i = 0; i < gimple_omp_for_collapse (gs); i++)
@@ -1118,7 +1130,17 @@  dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
     }
   else
     {
-      pp_string (buffer, "#pragma omp for");
+      switch (gimple_omp_for_kind (gs))
+	{
+	case GF_OMP_FOR_KIND_FOR:
+	  pp_string (buffer, "#pragma omp for");
+	  break;
+	case GF_OMP_FOR_KIND_SIMD:
+	  pp_string (buffer, "#pragma omp simd");
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
       dump_omp_clauses (buffer, gimple_omp_for_clauses (gs), spc, flags);
       for (i = 0; i < gimple_omp_for_collapse (gs); i++)
 	{
diff --git a/gcc/gimple.c b/gcc/gimple.c
index f507419..8f3b938 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -902,19 +902,21 @@  gimple_build_omp_critical (gimple_seq body, tree name)
 /* Build a GIMPLE_OMP_FOR statement.
 
    BODY is sequence of statements inside the for loop.
+   KIND is the `for' variant.
    CLAUSES, are any of the OMP loop construct's clauses: private, firstprivate,
    lastprivate, reductions, ordered, schedule, and nowait.
    COLLAPSE is the collapse count.
    PRE_BODY is the sequence of statements that are loop invariant.  */
 
 gimple
-gimple_build_omp_for (gimple_seq body, tree clauses, size_t collapse,
+gimple_build_omp_for (gimple_seq body, int kind, tree clauses, size_t collapse,
 		      gimple_seq pre_body)
 {
   gimple p = gimple_alloc (GIMPLE_OMP_FOR, 0);
   if (body)
     gimple_omp_set_body (p, body);
   gimple_omp_for_set_clauses (p, clauses);
+  gimple_omp_for_set_kind (p, kind);
   p->gimple_omp_for.collapse = collapse;
   p->gimple_omp_for.iter
       = ggc_alloc_cleared_vec_gimple_omp_for_iter (collapse);
diff --git a/gcc/gimple.def b/gcc/gimple.def
index acad572..f3652f4 100644
--- a/gcc/gimple.def
+++ b/gcc/gimple.def
@@ -287,7 +287,7 @@  DEFGSCODE(GIMPLE_OMP_ORDERED, "gimple_omp_ordered", GSS_OMP)
 
    BODY is a the sequence of statements to be executed by all threads.
 
-   CLAUSES is a TREE_LIST node with all the clauses.
+   CLAUSES is an OMP_CLAUSE chain with all the clauses.
 
    CHILD_FN is set when outlining the body of the parallel region.
    All the statements in BODY are moved into this newly created
@@ -306,7 +306,7 @@  DEFGSCODE(GIMPLE_OMP_PARALLEL, "gimple_omp_parallel", GSS_OMP_PARALLEL)
 
    BODY is a the sequence of statements to be executed by all threads.
 
-   CLAUSES is a TREE_LIST node with all the clauses.
+   CLAUSES is an OMP_CLAUSE chain with all the clauses.
 
    CHILD_FN is set when outlining the body of the explicit task region.
    All the statements in BODY are moved into this newly created
@@ -334,7 +334,7 @@  DEFGSCODE(GIMPLE_OMP_SECTION, "gimple_omp_section", GSS_OMP)
 /* OMP_SECTIONS <BODY, CLAUSES, CONTROL> represents #pragma omp sections.
 
    BODY is the sequence of statements in the sections body.
-   CLAUSES is a TREE_LIST node holding the list of associated clauses.
+   CLAUSES is an OMP_CLAUSE chain holding the list of associated clauses.
    CONTROL is a VAR_DECL used for deciding which of the sections
    to execute.  */
 DEFGSCODE(GIMPLE_OMP_SECTIONS, "gimple_omp_sections", GSS_OMP_SECTIONS)
@@ -346,7 +346,7 @@  DEFGSCODE(GIMPLE_OMP_SECTIONS_SWITCH, "gimple_omp_sections_switch", GSS_BASE)
 
 /* GIMPLE_OMP_SINGLE <BODY, CLAUSES> represents #pragma omp single
    BODY is the sequence of statements inside the single section.
-   CLAUSES is a TREE_LIST node holding the associated clauses.  */
+   CLAUSES is an OMP_CLAUSE chain holding the associated clauses.  */
 DEFGSCODE(GIMPLE_OMP_SINGLE, "gimple_omp_single", GSS_OMP_SINGLE)
 
 /* GIMPLE_PREDICT <PREDICT, OUTCOME> specifies a hint for branch prediction.
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 8ae07c9..37c37a6 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -110,6 +110,9 @@  enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
+    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_FOR		= 0 << 0,
+    GF_OMP_FOR_KIND_SIMD	= 1 << 0,
 
     /* True on an GIMPLE_OMP_RETURN statement if the return does not require
        a thread synchronization via some sort of barrier.  The exact barrier
@@ -799,7 +802,7 @@  gimple gimple_build_switch_nlabels (unsigned, tree, tree);
 gimple gimple_build_switch (tree, tree, vec<tree> );
 gimple gimple_build_omp_parallel (gimple_seq, tree, tree, tree);
 gimple gimple_build_omp_task (gimple_seq, tree, tree, tree, tree, tree, tree);
-gimple gimple_build_omp_for (gimple_seq, tree, size_t, gimple_seq);
+gimple gimple_build_omp_for (gimple_seq, int, tree, size_t, gimple_seq);
 gimple gimple_build_omp_critical (gimple_seq, tree);
 gimple gimple_build_omp_section (gimple_seq);
 gimple gimple_build_omp_continue (tree, tree);
@@ -3948,6 +3951,27 @@  gimple_omp_critical_set_name (gimple gs, tree name)
 }
 
 
+/* Return the kind of OMP for statemement.  */
+
+static inline int
+gimple_omp_for_kind (const_gimple g)
+{
+  GIMPLE_CHECK (g, GIMPLE_OMP_FOR);
+  return (gimple_omp_subcode (g) & GF_OMP_FOR_KIND_MASK);
+}
+
+
+/* Set the OMP for kind.  */
+
+static inline void
+gimple_omp_for_set_kind (gimple g, int kind)
+{
+  GIMPLE_CHECK (g, GIMPLE_OMP_FOR);
+  g->gsbase.subcode = (g->gsbase.subcode & ~GF_OMP_FOR_KIND_MASK)
+		      | (kind & GF_OMP_FOR_KIND_MASK);
+}
+
+
 /* Return the clauses associated with OMP_FOR GS.  */
 
 static inline tree
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index e2ae893..7b16f87 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -59,14 +59,17 @@  enum gimplify_omp_var_data
   GOVD_LOCAL = 128,
   GOVD_DEBUG_PRIVATE = 256,
   GOVD_PRIVATE_OUTER_REF = 512,
+  GOVD_LINEAR = 2048,
   GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
-			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LOCAL)
+			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
+			   | GOVD_LOCAL)
 };
 
 
 enum omp_region_type
 {
   ORT_WORKSHARE = 0,
+  ORT_SIMD = 1,
   ORT_PARALLEL = 2,
   ORT_COMBINED_PARALLEL = 3,
   ORT_TASK = 4,
@@ -711,7 +714,9 @@  gimple_add_tmp_var (tree tmp)
       if (gimplify_omp_ctxp)
 	{
 	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
-	  while (ctx && ctx->region_type == ORT_WORKSHARE)
+	  while (ctx
+		 && (ctx->region_type == ORT_WORKSHARE
+		     || ctx->region_type == ORT_SIMD))
 	    ctx = ctx->outer_context;
 	  if (ctx)
 	    omp_add_variable (ctx, tmp, GOVD_LOCAL | GOVD_SEEN);
@@ -2062,7 +2067,9 @@  gimplify_var_or_parm_decl (tree *expr_p)
 	  && decl_function_context (decl) != current_function_decl)
 	{
 	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
-	  while (ctx && ctx->region_type == ORT_WORKSHARE)
+	  while (ctx
+		 && (ctx->region_type == ORT_WORKSHARE
+		     || ctx->region_type == ORT_SIMD))
 	    ctx = ctx->outer_context;
 	  if (!ctx && !pointer_set_insert (nonlocal_vlas, decl))
 	    {
@@ -4703,6 +4710,7 @@  is_gimple_stmt (tree t)
     case STATEMENT_LIST:
     case OMP_PARALLEL:
     case OMP_FOR:
+    case OMP_SIMD:
     case OMP_SECTIONS:
     case OMP_SECTION:
     case OMP_SINGLE:
@@ -5715,7 +5723,8 @@  omp_firstprivatize_variable (struct gimplify_omp_ctx *ctx, tree decl)
 	  else
 	    return;
 	}
-      else if (ctx->region_type != ORT_WORKSHARE)
+      else if (ctx->region_type != ORT_WORKSHARE
+	       && ctx->region_type != ORT_SIMD)
 	omp_add_variable (ctx, decl, GOVD_FIRSTPRIVATE);
 
       ctx = ctx->outer_context;
@@ -5807,7 +5816,8 @@  omp_add_variable (struct gimplify_omp_ctx *ctx, tree decl, unsigned int flags)
 	 FIRSTPRIVATE and LASTPRIVATE.  */
       nflags = n->value | flags;
       gcc_assert ((nflags & GOVD_DATA_SHARE_CLASS)
-		  == (GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE));
+		  == (GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE)
+		  || (flags & GOVD_DATA_SHARE_CLASS) == 0);
       n->value = nflags;
       return;
     }
@@ -5871,7 +5881,10 @@  omp_add_variable (struct gimplify_omp_ctx *ctx, tree decl, unsigned int flags)
 	}
     }
 
-  splay_tree_insert (ctx->variables, (splay_tree_key)decl, flags);
+  if (n != NULL)
+    n->value |= flags;
+  else
+    splay_tree_insert (ctx->variables, (splay_tree_key)decl, flags);
 }
 
 /* Notice a threadprivate variable DECL used in OpenMP context CTX.
@@ -5937,7 +5950,8 @@  omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
       enum omp_clause_default_kind default_kind, kind;
       struct gimplify_omp_ctx *octx;
 
-      if (ctx->region_type == ORT_WORKSHARE)
+      if (ctx->region_type == ORT_WORKSHARE
+	  || ctx->region_type == ORT_SIMD)
 	goto do_outer;
 
       /* ??? Some compiler-generated variables (like SAVE_EXPRs) could be
@@ -6050,7 +6064,7 @@  omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
    to the contrary in the innermost scope, generate an error.  */
 
 static bool
-omp_is_private (struct gimplify_omp_ctx *ctx, tree decl)
+omp_is_private (struct gimplify_omp_ctx *ctx, tree decl, bool simd)
 {
   splay_tree_node n;
 
@@ -6061,8 +6075,12 @@  omp_is_private (struct gimplify_omp_ctx *ctx, tree decl)
 	{
 	  if (ctx == gimplify_omp_ctxp)
 	    {
-	      error ("iteration variable %qE should be private",
-		     DECL_NAME (decl));
+	      if (simd)
+		error ("iteration variable %qE is predetermined linear",
+		       DECL_NAME (decl));
+	      else
+		error ("iteration variable %qE should be private",
+		       DECL_NAME (decl));
 	      n->value = GOVD_PRIVATE;
 	      return true;
 	    }
@@ -6080,16 +6098,26 @@  omp_is_private (struct gimplify_omp_ctx *ctx, tree decl)
 	  else if ((n->value & GOVD_REDUCTION) != 0)
 	    error ("iteration variable %qE should not be reduction",
 		   DECL_NAME (decl));
+	  else if (simd && (n->value & GOVD_LASTPRIVATE) != 0)
+	    error ("iteration variable %qE should not be lastprivate",
+		   DECL_NAME (decl));
+	  else if (simd && (n->value & GOVD_PRIVATE) != 0)
+	    error ("iteration variable %qE should not be private",
+		   DECL_NAME (decl));
+	  else if (simd && (n->value & GOVD_LINEAR) != 0)
+	    error ("iteration variable %qE is predetermined linear",
+		   DECL_NAME (decl));
 	}
       return (ctx == gimplify_omp_ctxp
 	      || (ctx->region_type == ORT_COMBINED_PARALLEL
 		  && gimplify_omp_ctxp->outer_context == ctx));
     }
 
-  if (ctx->region_type != ORT_WORKSHARE)
+  if (ctx->region_type != ORT_WORKSHARE
+      && ctx->region_type != ORT_SIMD)
     return false;
   else if (ctx->outer_context)
-    return omp_is_private (ctx->outer_context, decl);
+    return omp_is_private (ctx->outer_context, decl, simd);
   return false;
 }
 
@@ -6114,7 +6142,8 @@  omp_check_private (struct gimplify_omp_ctx *ctx, tree decl)
       if (n != NULL)
 	return (n->value & GOVD_SHARED) == 0;
     }
-  while (ctx->region_type == ORT_WORKSHARE);
+  while (ctx->region_type == ORT_WORKSHARE
+	 || ctx->region_type == ORT_SIMD);
   return false;
 }
 
@@ -6167,6 +6196,15 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	  flags = GOVD_REDUCTION | GOVD_SEEN | GOVD_EXPLICIT;
 	  check_non_private = "reduction";
 	  goto do_add;
+       case OMP_CLAUSE_LINEAR:
+	 if (gimplify_expr (&OMP_CLAUSE_LINEAR_STEP (c), pre_p, NULL,
+			    is_gimple_val, fb_rvalue) == GS_ERROR)
+	   {
+	     remove = true;
+	     break;
+	   }
+	 flags = GOVD_LINEAR | GOVD_EXPLICIT;
+	 goto do_add;
 
 	do_add:
 	  decl = OMP_CLAUSE_DECL (c);
@@ -6265,6 +6303,7 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	case OMP_CLAUSE_UNTIED:
 	case OMP_CLAUSE_COLLAPSE:
 	case OMP_CLAUSE_MERGEABLE:
+	case OMP_CLAUSE_SAFELEN:
 	  break;
 
 	case OMP_CLAUSE_DEFAULT:
@@ -6322,7 +6361,8 @@  gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
 	      splay_tree_node on
 		= splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
 	      if (on && (on->value & (GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE
-				      | GOVD_PRIVATE | GOVD_REDUCTION)) != 0)
+				      | GOVD_PRIVATE | GOVD_REDUCTION
+				      | GOVD_LINEAR)) != 0)
 		break;
 	      ctx = ctx->outer_context;
 	    }
@@ -6335,6 +6375,8 @@  gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
     code = OMP_CLAUSE_PRIVATE;
   else if (flags & GOVD_FIRSTPRIVATE)
     code = OMP_CLAUSE_FIRSTPRIVATE;
+  else if (flags & GOVD_LASTPRIVATE)
+    code = OMP_CLAUSE_LASTPRIVATE;
   else
     gcc_unreachable ();
 
@@ -6367,6 +6409,7 @@  gimplify_adjust_omp_clauses (tree *list_p)
 	case OMP_CLAUSE_PRIVATE:
 	case OMP_CLAUSE_SHARED:
 	case OMP_CLAUSE_FIRSTPRIVATE:
+	case OMP_CLAUSE_LINEAR:
 	  decl = OMP_CLAUSE_DECL (c);
 	  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
 	  remove = !(n->value & GOVD_SEEN);
@@ -6382,6 +6425,31 @@  gimplify_adjust_omp_clauses (tree *list_p)
 		  OMP_CLAUSE_SET_CODE (c, OMP_CLAUSE_PRIVATE);
 		  OMP_CLAUSE_PRIVATE_DEBUG (c) = 1;
 		}
+	      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR
+		  && ctx->outer_context
+		  && !(OMP_CLAUSE_LINEAR_NO_COPYIN (c)
+		       && OMP_CLAUSE_LINEAR_NO_COPYOUT (c))
+		  && !is_global_var (decl))
+		{
+		  if (ctx->outer_context->region_type == ORT_COMBINED_PARALLEL)
+		    {
+		      n = splay_tree_lookup (ctx->outer_context->variables,
+					     (splay_tree_key) decl);
+		      if (n == NULL
+			  || (n->value & GOVD_DATA_SHARE_CLASS) == 0)
+			{
+			  int flags = OMP_CLAUSE_LINEAR_NO_COPYIN (c)
+				      ? GOVD_LASTPRIVATE : GOVD_SHARED;
+			  if (n == NULL)
+			    omp_add_variable (ctx->outer_context, decl,
+					      flags | GOVD_SEEN);
+			  else
+			    n->value |= flags | GOVD_SEEN;
+			}
+		    }
+		  else
+		    omp_notice_variable (ctx->outer_context, decl, true);
+		}
 	    }
 	  break;
 
@@ -6407,6 +6475,7 @@  gimplify_adjust_omp_clauses (tree *list_p)
 	case OMP_CLAUSE_COLLAPSE:
 	case OMP_CLAUSE_FINAL:
 	case OMP_CLAUSE_MERGEABLE:
+	case OMP_CLAUSE_SAFELEN:
 	  break;
 
 	default:
@@ -6510,14 +6579,40 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   gimple gfor;
   gimple_seq for_body, for_pre_body;
   int i;
+  bool simd;
+  bitmap has_decl_expr = NULL;
 
   for_stmt = *expr_p;
 
+  simd = TREE_CODE (for_stmt) == OMP_SIMD; 
   gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p,
-			     ORT_WORKSHARE);
+			     simd ? ORT_SIMD : ORT_WORKSHARE);
 
   /* Handle OMP_FOR_INIT.  */
   for_pre_body = NULL;
+  if (simd && OMP_FOR_PRE_BODY (for_stmt))
+    {
+      has_decl_expr = BITMAP_ALLOC (NULL);
+      if (TREE_CODE (OMP_FOR_PRE_BODY (for_stmt)) == DECL_EXPR
+	  && TREE_CODE (DECL_EXPR_DECL (OMP_FOR_PRE_BODY (for_stmt)))
+	  == VAR_DECL)
+	{
+	  t = OMP_FOR_PRE_BODY (for_stmt);
+	  bitmap_set_bit (has_decl_expr, DECL_UID (DECL_EXPR_DECL (t)));
+	}
+      else if (TREE_CODE (OMP_FOR_PRE_BODY (for_stmt)) == STATEMENT_LIST)
+	{
+	  tree_stmt_iterator si;
+	  for (si = tsi_start (OMP_FOR_PRE_BODY (for_stmt)); !tsi_end_p (si);
+	       tsi_next (&si))
+	    {
+	      t = tsi_stmt (si);
+	      if (TREE_CODE (t) == DECL_EXPR
+		  && TREE_CODE (DECL_EXPR_DECL (t)) == VAR_DECL)
+		bitmap_set_bit (has_decl_expr, DECL_UID (DECL_EXPR_DECL (t)));
+	    }
+	}
+    }
   gimplify_and_add (OMP_FOR_PRE_BODY (for_stmt), &for_pre_body);
   OMP_FOR_PRE_BODY (for_stmt) = NULL_TREE;
 
@@ -6536,7 +6631,44 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 		  || POINTER_TYPE_P (TREE_TYPE (decl)));
 
       /* Make sure the iteration variable is private.  */
-      if (omp_is_private (gimplify_omp_ctxp, decl))
+      tree c = NULL_TREE;
+      if (simd)
+	{
+	  splay_tree_node n = splay_tree_lookup (gimplify_omp_ctxp->variables,
+						 (splay_tree_key)decl);
+	  omp_is_private (gimplify_omp_ctxp, decl, simd);
+	  if (n != NULL && (n->value & GOVD_DATA_SHARE_CLASS) != 0)
+	    omp_notice_variable (gimplify_omp_ctxp, decl, true);
+	  else if (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)) == 1)
+	    {
+	      c = build_omp_clause (input_location, OMP_CLAUSE_LINEAR);
+	      OMP_CLAUSE_LINEAR_NO_COPYIN (c) = 1;
+	      if (has_decl_expr
+		  && bitmap_bit_p (has_decl_expr, DECL_UID (decl)))
+		OMP_CLAUSE_LINEAR_NO_COPYOUT (c) = 1;
+	      OMP_CLAUSE_DECL (c) = decl;
+	      OMP_CLAUSE_CHAIN (c) = OMP_FOR_CLAUSES (for_stmt);
+	      OMP_FOR_CLAUSES (for_stmt) = c;
+	      omp_add_variable (gimplify_omp_ctxp, decl,
+				GOVD_LINEAR | GOVD_EXPLICIT | GOVD_SEEN);
+	    }
+	  else
+	    {
+	      bool lastprivate
+		= (!has_decl_expr
+		   || !bitmap_bit_p (has_decl_expr, DECL_UID (decl)));
+	      c = build_omp_clause (input_location,
+				    lastprivate ? OMP_CLAUSE_LASTPRIVATE
+						: OMP_CLAUSE_PRIVATE);
+	      OMP_CLAUSE_DECL (c) = decl;
+	      OMP_CLAUSE_CHAIN (c) = OMP_FOR_CLAUSES (for_stmt);
+	      omp_add_variable (gimplify_omp_ctxp, decl,
+				(lastprivate ? GOVD_LASTPRIVATE : GOVD_PRIVATE)
+				| GOVD_SEEN);
+	      c = NULL_TREE;
+	    }
+	}
+      else if (omp_is_private (gimplify_omp_ctxp, decl, simd))
 	omp_notice_variable (gimplify_omp_ctxp, decl, true);
       else
 	omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN);
@@ -6578,6 +6710,8 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	case PREINCREMENT_EXPR:
 	case POSTINCREMENT_EXPR:
 	  t = build_int_cst (TREE_TYPE (decl), 1);
+	  if (c)
+	    OMP_CLAUSE_LINEAR_STEP (c) = t;
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
@@ -6586,6 +6720,8 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	case PREDECREMENT_EXPR:
 	case POSTDECREMENT_EXPR:
 	  t = build_int_cst (TREE_TYPE (decl), -1);
+	  if (c)
+	    OMP_CLAUSE_LINEAR_STEP (c) = t;
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
@@ -6619,6 +6755,20 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 				is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
+	  if (c)
+	    {
+	      OMP_CLAUSE_LINEAR_STEP (c) = TREE_OPERAND (t, 1);
+	      if (TREE_CODE (t) == MINUS_EXPR)
+		{
+		  t = TREE_OPERAND (t, 1);
+		  OMP_CLAUSE_LINEAR_STEP (c)
+		    = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), t);
+		  tret = gimplify_expr (&OMP_CLAUSE_LINEAR_STEP (c),
+					&for_pre_body, NULL,
+					is_gimple_val, fb_rvalue);
+		  ret = MIN (ret, tret);
+		}
+	    }
 	  break;
 
 	default:
@@ -6649,11 +6799,21 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	}
     }
 
+  BITMAP_FREE (has_decl_expr);
+
   gimplify_and_add (OMP_FOR_BODY (for_stmt), &for_body);
 
   gimplify_adjust_omp_clauses (&OMP_FOR_CLAUSES (for_stmt));
 
-  gfor = gimple_build_omp_for (for_body, OMP_FOR_CLAUSES (for_stmt),
+  int kind;
+  switch (TREE_CODE (for_stmt))
+    {
+    case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
+    case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
+    default:
+      gcc_unreachable ();
+    }
+  gfor = gimple_build_omp_for (for_body, kind, OMP_FOR_CLAUSES (for_stmt),
 			       TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)),
 			       for_pre_body);
 
@@ -6670,7 +6830,10 @@  gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     }
 
   gimplify_seq_add_stmt (pre_p, gfor);
-  return ret == GS_ALL_DONE ? GS_ALL_DONE : GS_ERROR;
+  if (ret != GS_ALL_DONE)
+    return GS_ERROR;
+  *expr_p = NULL_TREE;
+  return GS_ALL_DONE;
 }
 
 /* Gimplify the gross structure of other OpenMP worksharing constructs.
@@ -7588,6 +7751,7 @@  gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  break;
 
 	case OMP_FOR:
+	case OMP_SIMD:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
 
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index b841abd..983efeb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -109,6 +109,30 @@  expand_STORE_LANES (gimple stmt)
   expand_insn (get_multi_vector_move (type, vec_store_lanes_optab), 2, ops);
 }
 
+/* This should get expanded in adjust_simduid_builtins.  */
+
+static void
+expand_GOMP_SIMD_LANE (gimple stmt ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
+
+/* This should get expanded in adjust_simduid_builtins.  */
+
+static void
+expand_GOMP_SIMD_VF (gimple stmt ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
+
+/* This should get expanded in adjust_simduid_builtins.  */
+
+static void
+expand_GOMP_SIMD_LAST_LANE (gimple stmt ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
+
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 8900d90..5427664 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -40,3 +40,6 @@  along with GCC; see the file COPYING3.  If not see
 
 DEF_INTERNAL_FN (LOAD_LANES, ECF_CONST | ECF_LEAF)
 DEF_INTERNAL_FN (STORE_LANES, ECF_CONST | ECF_LEAF)
+DEF_INTERNAL_FN (GOMP_SIMD_LANE, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW)
+DEF_INTERNAL_FN (GOMP_SIMD_VF, ECF_CONST | ECF_LEAF | ECF_NOTHROW)
+DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF | ECF_NOTHROW)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index afddf37..faa01ca 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -42,6 +42,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "splay-tree.h"
 #include "optabs.h"
 #include "cfgloop.h"
+#include "target.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -222,6 +223,7 @@  extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   int i;
   struct omp_for_data_loop dummy_loop;
   location_t loc = gimple_location (for_stmt);
+  bool simd = gimple_omp_for_kind (for_stmt) == GF_OMP_FOR_KIND_SIMD;
 
   fd->for_stmt = for_stmt;
   fd->pre = NULL;
@@ -349,7 +351,21 @@  extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	  gcc_unreachable ();
 	}
 
-      if (iter_type != long_long_unsigned_type_node)
+      if (simd
+	  /*
+	  || (fd->sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
+	  && !fd->have_ordered)*/)
+	{
+	  if (fd->collapse == 1)
+	    iter_type = TREE_TYPE (loop->v);
+	  else if (i == 0
+		   || TYPE_PRECISION (iter_type)
+		      < TYPE_PRECISION (TREE_TYPE (loop->v)))
+	    iter_type
+	      = build_nonstandard_integer_type
+	      (TYPE_PRECISION (TREE_TYPE (loop->v)), 1);
+	}
+      else if (iter_type != long_long_unsigned_type_node)
 	{
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->v)))
 	    iter_type = long_long_unsigned_type_node;
@@ -445,7 +461,8 @@  extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	}
     }
 
-  if (count)
+  if (count
+      && !simd)
     {
       if (!tree_int_cst_lt (count, TYPE_MAX_VALUE (long_integer_type_node)))
 	iter_type = long_long_unsigned_type_node;
@@ -918,6 +935,19 @@  build_outer_var_ref (tree var, omp_context *ctx)
       bool by_ref = use_pointer_for_field (var, NULL);
       x = build_receiver_ref (var, by_ref, ctx);
     }
+  else if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
+	   && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+    {
+      /* #pragma omp simd isn't a worksharing construct, and can reference even
+	 private vars in its linear etc. clauses.  */
+      x = NULL_TREE;
+      if (ctx->outer && is_taskreg_ctx (ctx))
+	x = lookup_decl (var, ctx->outer);
+      else if (ctx->outer)
+	x = maybe_lookup_decl (var, ctx->outer);
+      if (x == NULL_TREE)
+	x = var;
+    }
   else if (ctx->outer)
     x = lookup_decl (var, ctx->outer);
   else if (is_reference (var))
@@ -1423,6 +1453,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 
 	case OMP_CLAUSE_FIRSTPRIVATE:
 	case OMP_CLAUSE_REDUCTION:
+	case OMP_CLAUSE_LINEAR:
 	  decl = OMP_CLAUSE_DECL (c);
 	do_private:
 	  if (is_variable_sized (decl))
@@ -1474,6 +1505,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_COLLAPSE:
 	case OMP_CLAUSE_UNTIED:
 	case OMP_CLAUSE_MERGEABLE:
+	case OMP_CLAUSE_SAFELEN:
 	  break;
 
 	default:
@@ -1497,6 +1529,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_PRIVATE:
 	case OMP_CLAUSE_FIRSTPRIVATE:
 	case OMP_CLAUSE_REDUCTION:
+	case OMP_CLAUSE_LINEAR:
 	  decl = OMP_CLAUSE_DECL (c);
 	  if (is_variable_sized (decl))
 	    install_var_local (decl, ctx);
@@ -1526,6 +1559,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_UNTIED:
 	case OMP_CLAUSE_FINAL:
 	case OMP_CLAUSE_MERGEABLE:
+	case OMP_CLAUSE_SAFELEN:
 	  break;
 
 	default:
@@ -1631,7 +1665,6 @@  create_omp_child_function (omp_context *ctx, bool task_copy)
   pop_cfun ();
 }
 
-
 /* Scan an OpenMP parallel directive.  */
 
 static void
@@ -1831,9 +1864,22 @@  scan_omp_single (gimple stmt, omp_context *outer_ctx)
 static bool
 check_omp_nesting_restrictions (gimple stmt, omp_context *ctx)
 {
+  if (ctx != NULL)
+    {
+      if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
+	  && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+	{
+	  error_at (gimple_location (stmt),
+		    "OpenMP constructs may not be nested inside simd region");
+	  return false;
+	}
+    }
   switch (gimple_code (stmt))
     {
     case GIMPLE_OMP_FOR:
+      if (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_SIMD)
+	return true;
+      /* FALLTHRU */
     case GIMPLE_OMP_SECTIONS:
     case GIMPLE_OMP_SINGLE:
     case GIMPLE_CALL:
@@ -2254,6 +2300,73 @@  omp_reduction_init (tree clause, tree type)
     }
 }
 
+/* Return maximum possible vectorization factor for the target.  */
+
+static int
+omp_max_vf (void)
+{
+  if (!optimize
+      || optimize_debug
+      || (!flag_tree_vectorize
+	  && global_options_set.x_flag_tree_vectorize))
+    return 1;
+
+  int vs = targetm.vectorize.autovectorize_vector_sizes ();
+  if (vs)
+    {
+      vs = 1 << floor_log2 (vs);
+      return vs;
+    }
+  enum machine_mode vqimode = targetm.vectorize.preferred_simd_mode (QImode);
+  if (GET_MODE_CLASS (vqimode) == MODE_VECTOR_INT)
+    return GET_MODE_NUNITS (vqimode);
+  return 1;
+}
+
+/* Helper function of lower_rec_input_clauses, used for #pragma omp simd
+   privatization.  */
+
+static bool
+lower_rec_simd_input_clauses (tree new_var, omp_context *ctx, int &max_vf,
+			      tree &idx, tree &lane, tree &ivar, tree &lvar)
+{
+  if (max_vf == 0)
+    {
+      max_vf = omp_max_vf ();
+      if (max_vf > 1)
+	{
+	  tree c = find_omp_clause (gimple_omp_for_clauses (ctx->stmt),
+				    OMP_CLAUSE_SAFELEN);
+	  if (c
+	      && compare_tree_int (OMP_CLAUSE_SAFELEN_EXPR (c), max_vf) == -1)
+	    max_vf = tree_low_cst (OMP_CLAUSE_SAFELEN_EXPR (c), 0);
+	}
+      if (max_vf > 1)
+	{
+	  idx = create_tmp_var (unsigned_type_node, NULL);
+	  lane = create_tmp_var (unsigned_type_node, NULL);
+	}
+    }
+  if (max_vf == 1)
+    return false;
+
+  tree atype = build_array_type_nelts (TREE_TYPE (new_var), max_vf);
+  tree avar = create_tmp_var_raw (atype, NULL);
+  if (TREE_ADDRESSABLE (new_var))
+    TREE_ADDRESSABLE (avar) = 1;
+  DECL_ATTRIBUTES (avar)
+    = tree_cons (get_identifier ("omp simd array"), NULL,
+		 DECL_ATTRIBUTES (avar));
+  gimple_add_tmp_var (avar);
+  ivar = build4 (ARRAY_REF, TREE_TYPE (new_var), avar, idx,
+		 NULL_TREE, NULL_TREE);
+  lvar = build4 (ARRAY_REF, TREE_TYPE (new_var), avar, lane,
+		 NULL_TREE, NULL_TREE);
+  SET_DECL_VALUE_EXPR (new_var, lvar);
+  DECL_HAS_VALUE_EXPR_P (new_var) = 1;
+  return true;
+}
+
 /* Generate code to implement the input clauses, FIRSTPRIVATE and COPYIN,
    from the receiver (aka child) side and initializers for REFERENCE_TYPE
    private variables.  Initialization statements go in ILIST, while calls
@@ -2267,9 +2380,37 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
   bool copyin_by_ref = false;
   bool lastprivate_firstprivate = false;
   int pass;
+  bool is_simd = (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
+		  && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD);
+  int max_vf = 0;
+  tree lane = NULL_TREE, idx = NULL_TREE;
+  tree ivar = NULL_TREE, lvar = NULL_TREE;
+  gimple_seq llist[2] = { NULL, NULL };
 
   copyin_seq = NULL;
 
+  /* Enforce simdlen 1 in simd loops with data sharing clauses referencing
+     variable sized vars.  That is unnecessarily hard to support and very
+     unlikely to result in vectorized code anyway.  */
+  if (is_simd)
+    for (c = clauses; c ; c = OMP_CLAUSE_CHAIN (c))
+      switch (OMP_CLAUSE_CODE (c))
+	{
+	case OMP_CLAUSE_REDUCTION:
+	  if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
+	    max_vf = 1;
+	  /* FALLTHRU */
+	case OMP_CLAUSE_PRIVATE:
+	case OMP_CLAUSE_FIRSTPRIVATE:
+	case OMP_CLAUSE_LASTPRIVATE:
+	case OMP_CLAUSE_LINEAR:
+	  if (is_variable_sized (OMP_CLAUSE_DECL (c)))
+	    max_vf = 1;
+	  break;
+	default:
+	  continue;
+	}
+
   /* Do all the fixed sized types in the first pass, and the variable sized
      types in the second pass.  This makes sure that the scalar arguments to
      the variable sized types are processed before we use them in the
@@ -2299,6 +2440,8 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	    case OMP_CLAUSE_COPYIN:
 	    case OMP_CLAUSE_REDUCTION:
 	      break;
+	    case OMP_CLAUSE_LINEAR:
+	      break;
 	    case OMP_CLAUSE_LASTPRIVATE:
 	      if (OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE (c))
 		{
@@ -2443,7 +2586,36 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 		}
 	      else
 		x = NULL;
+	    do_private:
 	      x = lang_hooks.decls.omp_clause_default_ctor (c, new_var, x);
+	      if (is_simd)
+		{
+		  tree y = lang_hooks.decls.omp_clause_dtor (c, new_var);
+		  if ((TREE_ADDRESSABLE (new_var) || x || y
+		       || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE)
+		      && lower_rec_simd_input_clauses (new_var, ctx, max_vf,
+						       idx, lane, ivar, lvar))
+		    {
+		      if (x)
+			x = lang_hooks.decls.omp_clause_default_ctor
+						(c, unshare_expr (ivar), x);
+		      if (x)
+			gimplify_and_add (x, &llist[0]);
+		      if (y)
+			{
+			  y = lang_hooks.decls.omp_clause_dtor (c, ivar);
+			  if (y)
+			    {
+			      gimple_seq tseq = NULL;
+
+			      dtor = y;
+			      gimplify_stmt (&dtor, &tseq);
+			      gimple_seq_add_seq (&llist[1], tseq);
+			    }
+			}
+		      break;
+		    }
+		}
 	      if (x)
 		gimplify_and_add (x, ilist);
 	      /* FALLTHRU */
@@ -2460,6 +2632,15 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 		}
 	      break;
 
+	    case OMP_CLAUSE_LINEAR:
+	      if (!OMP_CLAUSE_LINEAR_NO_COPYIN (c))
+		goto do_firstprivate;
+	      if (OMP_CLAUSE_LINEAR_NO_COPYOUT (c))
+		x = NULL;
+	      else
+		x = build_outer_var_ref (var, ctx);
+	      goto do_private;
+
 	    case OMP_CLAUSE_FIRSTPRIVATE:
 	      if (is_task_ctx (ctx))
 		{
@@ -2475,11 +2656,56 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 		      goto do_dtor;
 		    }
 		}
+	    do_firstprivate:
 	      x = build_outer_var_ref (var, ctx);
+	      if (is_simd)
+		{
+		  if ((OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR
+		       || TREE_ADDRESSABLE (new_var))
+		      && lower_rec_simd_input_clauses (new_var, ctx, max_vf,
+						       idx, lane, ivar, lvar))
+		    {
+		      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR)
+			{
+			  tree iv = create_tmp_var (TREE_TYPE (new_var), NULL);
+			  x = lang_hooks.decls.omp_clause_copy_ctor (c, iv, x);
+			  gimplify_and_add (x, ilist);
+			  gimple_stmt_iterator gsi
+			    = gsi_start_1 (gimple_omp_body_ptr (ctx->stmt));
+			  gimple g
+			    = gimple_build_assign (unshare_expr (lvar), iv);
+			  gsi_insert_before_without_update (&gsi, g,
+							    GSI_SAME_STMT);
+			  tree stept = POINTER_TYPE_P (TREE_TYPE (x))
+				       ? sizetype : TREE_TYPE (x);
+			  tree t = fold_convert (stept,
+						 OMP_CLAUSE_LINEAR_STEP (c));
+			  enum tree_code code = PLUS_EXPR;
+			  if (POINTER_TYPE_P (TREE_TYPE (new_var)))
+			    code = POINTER_PLUS_EXPR;
+			  g = gimple_build_assign_with_ops (code, iv, iv, t);
+			  gsi_insert_before_without_update (&gsi, g,
+							    GSI_SAME_STMT);
+			  break;
+			}
+		      x = lang_hooks.decls.omp_clause_copy_ctor
+						(c, unshare_expr (ivar), x);
+		      gimplify_and_add (x, &llist[0]);
+		      x = lang_hooks.decls.omp_clause_dtor (c, ivar);
+		      if (x)
+			{
+			  gimple_seq tseq = NULL;
+
+			  dtor = x;
+			  gimplify_stmt (&dtor, &tseq);
+			  gimple_seq_add_seq (&llist[1], tseq);
+			}
+		      break;
+		    }
+		}
 	      x = lang_hooks.decls.omp_clause_copy_ctor (c, new_var, x);
 	      gimplify_and_add (x, ilist);
 	      goto do_dtor;
-	      break;
 
 	    case OMP_CLAUSE_COPYIN:
 	      by_ref = use_pointer_for_field (var, NULL);
@@ -2495,6 +2721,8 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 		  tree placeholder = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c);
 		  x = build_outer_var_ref (var, ctx);
 
+		  /* FIXME: Not handled yet.  */
+		  gcc_assert (!is_simd);
 		  if (is_reference (var))
 		    x = build_fold_addr_expr_loc (clause_loc, x);
 		  SET_DECL_VALUE_EXPR (placeholder, x);
@@ -2509,7 +2737,31 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 		{
 		  x = omp_reduction_init (c, TREE_TYPE (new_var));
 		  gcc_assert (TREE_CODE (TREE_TYPE (new_var)) != ARRAY_TYPE);
-		  gimplify_assign (new_var, x, ilist);
+		  if (is_simd
+		      && lower_rec_simd_input_clauses (new_var, ctx, max_vf,
+						       idx, lane, ivar, lvar))
+		    {
+		      enum tree_code code = OMP_CLAUSE_REDUCTION_CODE (c);
+		      tree ref = build_outer_var_ref (var, ctx);
+
+		      gimplify_assign (unshare_expr (ivar), x, &llist[0]);
+
+		      /* reduction(-:var) sums up the partial results, so it
+			 acts identically to reduction(+:var).  */
+		      if (code == MINUS_EXPR)
+			code = PLUS_EXPR;
+
+		      x = build2 (code, TREE_TYPE (ref), ref, ivar);
+		      ref = build_outer_var_ref (var, ctx);
+		      gimplify_assign (ref, x, &llist[1]);
+		    }
+		  else
+		    {
+		      gimplify_assign (new_var, x, ilist);
+		      if (is_simd)
+			gimplify_assign (build_outer_var_ref (var, ctx),
+					 new_var, dlist);
+		    }
 		}
 	      break;
 
@@ -2519,6 +2771,49 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	}
     }
 
+  if (lane)
+    {
+      tree uid = create_tmp_var (ptr_type_node, "simduid");
+      gimple g
+	= gimple_build_call_internal (IFN_GOMP_SIMD_LANE, 1, uid);
+      gimple_call_set_lhs (g, lane);
+      gimple_stmt_iterator gsi = gsi_start_1 (gimple_omp_body_ptr (ctx->stmt));
+      gsi_insert_before_without_update (&gsi, g, GSI_SAME_STMT);
+      c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE__SIMDUID_);
+      OMP_CLAUSE__SIMDUID__DECL (c) = uid;
+      OMP_CLAUSE_CHAIN (c) = gimple_omp_for_clauses (ctx->stmt);
+      gimple_omp_for_set_clauses (ctx->stmt, c);
+      g = gimple_build_assign_with_ops (INTEGER_CST, lane,
+					build_int_cst (unsigned_type_node, 0),
+					NULL_TREE);
+      gimple_seq_add_stmt (ilist, g);
+      for (int i = 0; i < 2; i++)
+	if (llist[i])
+	  {
+	    tree vf = create_tmp_var (unsigned_type_node, NULL);
+	    g = gimple_build_call_internal (IFN_GOMP_SIMD_VF, 1, uid);
+	    gimple_call_set_lhs (g, vf);
+	    gimple_seq *seq = i == 0 ? ilist : dlist;
+	    gimple_seq_add_stmt (seq, g);
+	    tree t = build_int_cst (unsigned_type_node, 0);
+	    g = gimple_build_assign_with_ops (INTEGER_CST, idx, t, NULL_TREE);
+	    gimple_seq_add_stmt (seq, g);
+	    tree body = create_artificial_label (UNKNOWN_LOCATION);
+	    tree header = create_artificial_label (UNKNOWN_LOCATION);
+	    tree end = create_artificial_label (UNKNOWN_LOCATION);
+	    gimple_seq_add_stmt (seq, gimple_build_goto (header));
+	    gimple_seq_add_stmt (seq, gimple_build_label (body));
+	    gimple_seq_add_seq (seq, llist[i]);
+	    t = build_int_cst (unsigned_type_node, 1);
+	    g = gimple_build_assign_with_ops (PLUS_EXPR, idx, idx, t);
+	    gimple_seq_add_stmt (seq, g);
+	    gimple_seq_add_stmt (seq, gimple_build_label (header));
+	    g = gimple_build_cond (LT_EXPR, idx, vf, body, end);
+	    gimple_seq_add_stmt (seq, g);
+	    gimple_seq_add_stmt (seq, gimple_build_label (end));
+	  }
+    }
+
   /* The copyin sequence is not to be executed by the main thread, since
      that would result in self-copies.  Perhaps not visible to scalars,
      but it certainly is to C++ operator=.  */
@@ -2538,7 +2833,31 @@  lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
      lastprivate clauses we need to ensure the lastprivate copying
      happens after firstprivate copying in all threads.  */
   if (copyin_by_ref || lastprivate_firstprivate)
-    gimplify_and_add (build_omp_barrier (), ilist);
+    {
+      /* Don't add any barrier for #pragma omp simd or
+	 #pragma omp distribute.  */
+      if (gimple_code (ctx->stmt) != GIMPLE_OMP_FOR
+	  || gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_FOR)
+	gimplify_and_add (build_omp_barrier (), ilist);
+    }
+
+  /* If max_vf is non-NULL, then we can use only vectorization factor
+     up to the max_vf we chose.  So stick it into safelen clause.  */
+  if (max_vf)
+    {
+      tree c = find_omp_clause (gimple_omp_for_clauses (ctx->stmt),
+				OMP_CLAUSE_SAFELEN);
+      if (c == NULL_TREE
+	  || compare_tree_int (OMP_CLAUSE_SAFELEN_EXPR (c),
+			       max_vf) == 1)
+	{
+	  c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_SAFELEN);
+	  OMP_CLAUSE_SAFELEN_EXPR (c) = build_int_cst (integer_type_node,
+						       max_vf);
+	  OMP_CLAUSE_CHAIN (c) = gimple_omp_for_clauses (ctx->stmt);
+	  gimple_omp_for_set_clauses (ctx->stmt, c);
+	}
+    }
 }
 
 
@@ -2550,11 +2869,16 @@  static void
 lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
 			    omp_context *ctx)
 {
-  tree x, c, label = NULL;
+  tree x, c, label = NULL, orig_clauses = clauses;
   bool par_clauses = false;
+  tree simduid = NULL, lastlane = NULL;
 
-  /* Early exit if there are no lastprivate clauses.  */
-  clauses = find_omp_clause (clauses, OMP_CLAUSE_LASTPRIVATE);
+  /* Early exit if there are no lastprivate or linear clauses.  */
+  for (; clauses ; clauses = OMP_CLAUSE_CHAIN (clauses))
+    if (OMP_CLAUSE_CODE (clauses) == OMP_CLAUSE_LASTPRIVATE
+	|| (OMP_CLAUSE_CODE (clauses) == OMP_CLAUSE_LINEAR
+	    && !OMP_CLAUSE_LINEAR_NO_COPYOUT (clauses)))
+      break;
   if (clauses == NULL)
     {
       /* If this was a workshare clause, see if it had been combined
@@ -2591,23 +2915,59 @@  lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
       gimple_seq_add_stmt (stmt_list, gimple_build_label (label_true));
     }
 
+  if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
+      && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+    {
+      simduid = find_omp_clause (orig_clauses, OMP_CLAUSE__SIMDUID_);
+      if (simduid)
+	simduid = OMP_CLAUSE__SIMDUID__DECL (simduid);
+    }
+
   for (c = clauses; c ;)
     {
       tree var, new_var;
       location_t clause_loc = OMP_CLAUSE_LOCATION (c);
 
-      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE)
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+	  || (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR
+	      && !OMP_CLAUSE_LINEAR_NO_COPYOUT (c)))
 	{
 	  var = OMP_CLAUSE_DECL (c);
 	  new_var = lookup_decl (var, ctx);
 
-	  if (OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c))
+	  if (simduid && DECL_HAS_VALUE_EXPR_P (new_var))
+	    {
+	      tree val = DECL_VALUE_EXPR (new_var);
+	      if (TREE_CODE (val) == ARRAY_REF
+		  && VAR_P (TREE_OPERAND (val, 0))
+		  && lookup_attribute ("omp simd array",
+				       DECL_ATTRIBUTES (TREE_OPERAND (val,
+								      0))))
+		{
+		  if (lastlane == NULL)
+		    {
+		      lastlane = create_tmp_var (unsigned_type_node, NULL);
+		      gimple g
+			= gimple_build_call_internal (IFN_GOMP_SIMD_LAST_LANE,
+						      2, simduid,
+						      TREE_OPERAND (val, 1));
+		      gimple_call_set_lhs (g, lastlane);
+		      gimple_seq_add_stmt (stmt_list, g);
+		    }
+		  new_var = build4 (ARRAY_REF, TREE_TYPE (val),
+				    TREE_OPERAND (val, 0), lastlane,
+				    NULL_TREE, NULL_TREE);
+		}
+	    }
+
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+	      && OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c))
 	    {
 	      lower_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 	      gimple_seq_add_seq (stmt_list,
 				  OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c));
+	      OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c) = NULL;
 	    }
-	  OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c) = NULL;
 
 	  x = build_outer_var_ref (var, ctx);
 	  if (is_reference (var))
@@ -2649,6 +3009,11 @@  lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx)
   tree x, c;
   int count = 0;
 
+  /* SIMD reductions are handled in lower_rec_input_clauses.  */
+  if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
+      && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD)
+    return;
+
   /* First see if there is exactly one reduction clause.  Use OMP_ATOMIC
      update in that case, otherwise use a lock.  */
   for (c = clauses; c && count < 2; c = OMP_CLAUSE_CHAIN (c))
@@ -3411,6 +3776,24 @@  expand_omp_regimplify_p (tree *tp, int *walk_subtrees, void *)
   return NULL_TREE;
 }
 
+/* Prepend TO = FROM assignment before *GSI_P.  */
+
+static void
+expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
+{
+  bool simple_p = DECL_P (to) && TREE_ADDRESSABLE (to);
+  from = force_gimple_operand_gsi (gsi_p, from, simple_p, NULL_TREE,
+				   true, GSI_SAME_STMT);
+  gimple stmt = gimple_build_assign (to, from);
+  gsi_insert_before (gsi_p, stmt, GSI_SAME_STMT);
+  if (walk_tree (&from, expand_omp_regimplify_p, NULL, NULL)
+      || walk_tree (&to, expand_omp_regimplify_p, NULL, NULL))
+    {
+      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -3654,6 +4037,280 @@  expand_omp_taskreg (struct omp_region *region)
 }
 
 
+/* Helper function for expand_omp_{for_*,simd}.  If this is the outermost
+   of the combined collapse > 1 loop constructs, generate code like:
+	if (__builtin_expect (N32 cond3 N31, 0)) goto ZERO_ITER_BB;
+	if (cond3 is <)
+	  adj = STEP3 - 1;
+	else
+	  adj = STEP3 + 1;
+	count3 = (adj + N32 - N31) / STEP3;
+	if (__builtin_expect (N22 cond2 N21, 0)) goto ZERO_ITER_BB;
+	if (cond2 is <)
+	  adj = STEP2 - 1;
+	else
+	  adj = STEP2 + 1;
+	count2 = (adj + N22 - N21) / STEP2;
+	if (__builtin_expect (N12 cond1 N11, 0)) goto ZERO_ITER_BB;
+	if (cond1 is <)
+	  adj = STEP1 - 1;
+	else
+	  adj = STEP1 + 1;
+	count1 = (adj + N12 - N11) / STEP1;
+	count = count1 * count2 * count3;
+   Furthermore, if ZERO_ITER_BB is NULL, create a BB which does:
+	count = 0;
+   and set ZERO_ITER_BB to that bb.  */
+
+static void
+expand_omp_for_init_counts (struct omp_for_data *fd, gimple_stmt_iterator *gsi,
+			    basic_block &entry_bb, tree *counts,
+			    basic_block &zero_iter_bb, int &first_zero_iter,
+			    basic_block &l2_dom_bb)
+{
+  tree t, type = TREE_TYPE (fd->loop.v);
+  gimple stmt;
+  edge e, ne;
+  int i;
+
+  /* Collapsed loops need work for expansion into SSA form.  */
+  gcc_assert (!gimple_in_ssa_p (cfun));
+
+  for (i = 0; i < fd->collapse; i++)
+    {
+      tree itype = TREE_TYPE (fd->loops[i].v);
+
+      if (SSA_VAR_P (fd->loop.n2)
+	  && ((t = fold_binary (fd->loops[i].cond_code, boolean_type_node,
+				fold_convert (itype, fd->loops[i].n1),
+				fold_convert (itype, fd->loops[i].n2)))
+	      == NULL_TREE || !integer_onep (t)))
+	{
+	  tree n1, n2;
+	  n1 = fold_convert (itype, unshare_expr (fd->loops[i].n1));
+	  n1 = force_gimple_operand_gsi (gsi, n1, true, NULL_TREE,
+					 true, GSI_SAME_STMT);
+	  n2 = fold_convert (itype, unshare_expr (fd->loops[i].n2));
+	  n2 = force_gimple_operand_gsi (gsi, n2, true, NULL_TREE,
+					 true, GSI_SAME_STMT);
+	  stmt = gimple_build_cond (fd->loops[i].cond_code, n1, n2,
+				    NULL_TREE, NULL_TREE);
+	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+	  if (walk_tree (gimple_cond_lhs_ptr (stmt),
+			 expand_omp_regimplify_p, NULL, NULL)
+	      || walk_tree (gimple_cond_rhs_ptr (stmt),
+			    expand_omp_regimplify_p, NULL, NULL))
+	    {
+	      *gsi = gsi_for_stmt (stmt);
+	      gimple_regimplify_operands (stmt, gsi);
+	    }
+	  e = split_block (entry_bb, stmt);
+	  if (zero_iter_bb == NULL)
+	    {
+	      first_zero_iter = i;
+	      zero_iter_bb = create_empty_bb (entry_bb);
+	      if (current_loops)
+		add_bb_to_loop (zero_iter_bb, entry_bb->loop_father);
+	      *gsi = gsi_after_labels (zero_iter_bb);
+	      stmt = gimple_build_assign (fd->loop.n2,
+					  build_zero_cst (type));
+	      gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+	      set_immediate_dominator (CDI_DOMINATORS, zero_iter_bb,
+				       entry_bb);
+	    }
+	  ne = make_edge (entry_bb, zero_iter_bb, EDGE_FALSE_VALUE);
+	  ne->probability = REG_BR_PROB_BASE / 2000 - 1;
+	  e->flags = EDGE_TRUE_VALUE;
+	  e->probability = REG_BR_PROB_BASE - ne->probability;
+	  if (l2_dom_bb == NULL)
+	    l2_dom_bb = entry_bb;
+	  entry_bb = e->dest;
+	  *gsi = gsi_last_bb (entry_bb);
+	}
+
+      if (POINTER_TYPE_P (itype))
+	itype = signed_type_for (itype);
+      t = build_int_cst (itype, (fd->loops[i].cond_code == LT_EXPR
+				 ? -1 : 1));
+      t = fold_build2 (PLUS_EXPR, itype,
+		       fold_convert (itype, fd->loops[i].step), t);
+      t = fold_build2 (PLUS_EXPR, itype, t,
+		       fold_convert (itype, fd->loops[i].n2));
+      t = fold_build2 (MINUS_EXPR, itype, t,
+		       fold_convert (itype, fd->loops[i].n1));
+      if (TYPE_UNSIGNED (itype) && fd->loops[i].cond_code == GT_EXPR)
+	t = fold_build2 (TRUNC_DIV_EXPR, itype,
+			 fold_build1 (NEGATE_EXPR, itype, t),
+			 fold_build1 (NEGATE_EXPR, itype,
+				      fold_convert (itype,
+						    fd->loops[i].step)));
+      else
+	t = fold_build2 (TRUNC_DIV_EXPR, itype, t,
+			 fold_convert (itype, fd->loops[i].step));
+      t = fold_convert (type, t);
+      if (TREE_CODE (t) == INTEGER_CST)
+	counts[i] = t;
+      else
+	{
+	  counts[i] = create_tmp_reg (type, ".count");
+	  expand_omp_build_assign (gsi, counts[i], t);
+	}
+      if (SSA_VAR_P (fd->loop.n2))
+	{
+	  if (i == 0)
+	    t = counts[0];
+	  else
+	    t = fold_build2 (MULT_EXPR, type, fd->loop.n2, counts[i]);
+	  expand_omp_build_assign (gsi, fd->loop.n2, t);
+	}
+    }
+}
+
+
+/* Helper function for expand_omp_{for_*,simd}.  Generate code like:
+	T = V;
+	V3 = N31 + (T % count3) * STEP3;
+	T = T / count3;
+	V2 = N21 + (T % count2) * STEP2;
+	T = T / count2;
+	V1 = N11 + T * STEP1;
+   if this loop doesn't have an inner loop construct combined with it.  */
+
+static void
+expand_omp_for_init_vars (struct omp_for_data *fd, gimple_stmt_iterator *gsi,
+			  tree *counts, tree startvar)
+{
+  int i;
+  tree type = TREE_TYPE (fd->loop.v);
+  tree tem = create_tmp_reg (type, ".tem");
+  gimple stmt = gimple_build_assign (tem, startvar);
+  gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
+
+  for (i = fd->collapse - 1; i >= 0; i--)
+    {
+      tree vtype = TREE_TYPE (fd->loops[i].v), itype, t;
+      itype = vtype;
+      if (POINTER_TYPE_P (vtype))
+	itype = signed_type_for (vtype);
+      if (i != 0)
+	t = fold_build2 (TRUNC_MOD_EXPR, type, tem, counts[i]);
+      else
+	t = tem;
+      t = fold_convert (itype, t);
+      t = fold_build2 (MULT_EXPR, itype, t,
+		       fold_convert (itype, fd->loops[i].step));
+      if (POINTER_TYPE_P (vtype))
+	t = fold_build_pointer_plus (fd->loops[i].n1, t);
+      else
+	t = fold_build2 (PLUS_EXPR, itype, fd->loops[i].n1, t);
+      t = force_gimple_operand_gsi (gsi, t,
+				    DECL_P (fd->loops[i].v)
+				    && TREE_ADDRESSABLE (fd->loops[i].v),
+				    NULL_TREE, false,
+				    GSI_CONTINUE_LINKING);
+      stmt = gimple_build_assign (fd->loops[i].v, t);
+      gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
+      if (i != 0)
+	{
+	  t = fold_build2 (TRUNC_DIV_EXPR, type, tem, counts[i]);
+	  t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
+					false, GSI_CONTINUE_LINKING);
+	  stmt = gimple_build_assign (tem, t);
+	  gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
+	}
+    }
+}
+
+
+/* Helper function for expand_omp_for_*.  Generate code like:
+    L10:
+	V3 += STEP3;
+	if (V3 cond3 N32) goto BODY_BB; else goto L11;
+    L11:
+	V3 = N31;
+	V2 += STEP2;
+	if (V2 cond2 N22) goto BODY_BB; else goto L12;
+    L12:
+	V2 = N21;
+	V1 += STEP1;
+	goto BODY_BB;  */
+
+static basic_block
+extract_omp_for_update_vars (struct omp_for_data *fd, basic_block cont_bb,
+			     basic_block body_bb)
+{
+  basic_block last_bb, bb, collapse_bb = NULL;
+  int i;
+  gimple_stmt_iterator gsi;
+  edge e;
+  tree t;
+  gimple stmt;
+
+  last_bb = cont_bb;
+  for (i = fd->collapse - 1; i >= 0; i--)
+    {
+      tree vtype = TREE_TYPE (fd->loops[i].v);
+
+      bb = create_empty_bb (last_bb);
+      if (current_loops)
+	add_bb_to_loop (bb, last_bb->loop_father);
+      gsi = gsi_start_bb (bb);
+
+      if (i < fd->collapse - 1)
+	{
+	  e = make_edge (last_bb, bb, EDGE_FALSE_VALUE);
+	  e->probability = REG_BR_PROB_BASE / 8;
+
+	  t = fd->loops[i + 1].n1;
+	  t = force_gimple_operand_gsi (&gsi, t,
+					DECL_P (fd->loops[i + 1].v)
+					&& TREE_ADDRESSABLE (fd->loops[i
+								       + 1].v),
+					NULL_TREE, false,
+					GSI_CONTINUE_LINKING);
+	  stmt = gimple_build_assign (fd->loops[i + 1].v, t);
+	  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+	}
+      else
+	collapse_bb = bb;
+
+      set_immediate_dominator (CDI_DOMINATORS, bb, last_bb);
+
+      if (POINTER_TYPE_P (vtype))
+	t = fold_build_pointer_plus (fd->loops[i].v, fd->loops[i].step);
+      else
+	t = fold_build2 (PLUS_EXPR, vtype, fd->loops[i].v, fd->loops[i].step);
+      t = force_gimple_operand_gsi (&gsi, t,
+				    DECL_P (fd->loops[i].v)
+				    && TREE_ADDRESSABLE (fd->loops[i].v),
+				    NULL_TREE, false, GSI_CONTINUE_LINKING);
+      stmt = gimple_build_assign (fd->loops[i].v, t);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+      if (i > 0)
+	{
+	  t = fd->loops[i].n2;
+	  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+					false, GSI_CONTINUE_LINKING);
+	  tree v = fd->loops[i].v;
+	  if (DECL_P (v) && TREE_ADDRESSABLE (v))
+	    v = force_gimple_operand_gsi (&gsi, v, true, NULL_TREE,
+					  false, GSI_CONTINUE_LINKING);
+	  t = fold_build2 (fd->loops[i].cond_code, boolean_type_node, v, t);
+	  stmt = gimple_build_cond_empty (t);
+	  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+	  e = make_edge (bb, body_bb, EDGE_TRUE_VALUE);
+	  e->probability = REG_BR_PROB_BASE * 7 / 8;
+	}
+      else
+	make_edge (bb, body_bb, EDGE_FALLTHRU);
+      last_bb = bb;
+    }
+
+  return collapse_bb;
+}
+
+
 /* A subroutine of expand_omp_for.  Generate code for a parallel
    loop with any schedule.  Given parameters:
 
@@ -3816,105 +4473,14 @@  expand_omp_for_generic (struct omp_region *region,
   gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR);
   if (fd->collapse > 1)
     {
-      basic_block zero_iter_bb = NULL;
       int first_zero_iter = -1;
+      basic_block zero_iter_bb = NULL, l2_dom_bb = NULL;
 
-      /* collapsed loops need work for expansion in SSA form.  */
-      gcc_assert (!gimple_in_ssa_p (cfun));
-      counts = (tree *) alloca (fd->collapse * sizeof (tree));
-      for (i = 0; i < fd->collapse; i++)
-	{
-	  tree itype = TREE_TYPE (fd->loops[i].v);
+      counts = XALLOCAVEC (tree, fd->collapse);
+      expand_omp_for_init_counts (fd, &gsi, entry_bb, counts,
+				  zero_iter_bb, first_zero_iter,
+				  l2_dom_bb);
 
-	  if (SSA_VAR_P (fd->loop.n2)
-	      && ((t = fold_binary (fd->loops[i].cond_code, boolean_type_node,
-				    fold_convert (itype, fd->loops[i].n1),
-				    fold_convert (itype, fd->loops[i].n2)))
-		  == NULL_TREE || !integer_onep (t)))
-	    {
-	      tree n1, n2;
-	      n1 = fold_convert (itype, unshare_expr (fd->loops[i].n1));
-	      n1 = force_gimple_operand_gsi (&gsi, n1, true, NULL_TREE,
-					     true, GSI_SAME_STMT);
-	      n2 = fold_convert (itype, unshare_expr (fd->loops[i].n2));
-	      n2 = force_gimple_operand_gsi (&gsi, n2, true, NULL_TREE,
-					     true, GSI_SAME_STMT);
-	      stmt = gimple_build_cond (fd->loops[i].cond_code, n1, n2,
-					NULL_TREE, NULL_TREE);
-	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
-	      if (walk_tree (gimple_cond_lhs_ptr (stmt),
-			     expand_omp_regimplify_p, NULL, NULL)
-		  || walk_tree (gimple_cond_rhs_ptr (stmt),
-				expand_omp_regimplify_p, NULL, NULL))
-		{
-		  gsi = gsi_for_stmt (stmt);
-		  gimple_regimplify_operands (stmt, &gsi);
-		}
-	      e = split_block (entry_bb, stmt);
-	      if (zero_iter_bb == NULL)
-		{
-		  first_zero_iter = i;
-		  zero_iter_bb = create_empty_bb (entry_bb);
-		  if (current_loops)
-		    add_bb_to_loop (zero_iter_bb, entry_bb->loop_father);
-		  gsi = gsi_after_labels (zero_iter_bb);
-		  stmt = gimple_build_assign (fd->loop.n2,
-					      build_zero_cst (type));
-		  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
-		  set_immediate_dominator (CDI_DOMINATORS, zero_iter_bb,
-					   entry_bb);
-		}
-	      ne = make_edge (entry_bb, zero_iter_bb, EDGE_FALSE_VALUE);
-	      ne->probability = REG_BR_PROB_BASE / 2000 - 1;
-	      e->flags = EDGE_TRUE_VALUE;
-	      e->probability = REG_BR_PROB_BASE - ne->probability;
-	      entry_bb = e->dest;
-	      gsi = gsi_last_bb (entry_bb);
-	    }
-	  if (POINTER_TYPE_P (itype))
-	    itype = signed_type_for (itype);
-	  t = build_int_cst (itype, (fd->loops[i].cond_code == LT_EXPR
-				     ? -1 : 1));
-	  t = fold_build2 (PLUS_EXPR, itype,
-			   fold_convert (itype, fd->loops[i].step), t);
-	  t = fold_build2 (PLUS_EXPR, itype, t,
-			   fold_convert (itype, fd->loops[i].n2));
-	  t = fold_build2 (MINUS_EXPR, itype, t,
-			   fold_convert (itype, fd->loops[i].n1));
-	  if (TYPE_UNSIGNED (itype) && fd->loops[i].cond_code == GT_EXPR)
-	    t = fold_build2 (TRUNC_DIV_EXPR, itype,
-			     fold_build1 (NEGATE_EXPR, itype, t),
-			     fold_build1 (NEGATE_EXPR, itype,
-					  fold_convert (itype,
-							fd->loops[i].step)));
-	  else
-	    t = fold_build2 (TRUNC_DIV_EXPR, itype, t,
-			     fold_convert (itype, fd->loops[i].step));
-	  t = fold_convert (type, t);
-	  if (TREE_CODE (t) == INTEGER_CST)
-	    counts[i] = t;
-	  else
-	    {
-	      counts[i] = create_tmp_reg (type, ".count");
-	      t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
-					    true, GSI_SAME_STMT);
-	      stmt = gimple_build_assign (counts[i], t);
-	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
-	    }
-	  if (SSA_VAR_P (fd->loop.n2))
-	    {
-	      if (i == 0)
-		t = counts[0];
-	      else
-		{
-		  t = fold_build2 (MULT_EXPR, type, fd->loop.n2, counts[i]);
-		  t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
-						true, GSI_SAME_STMT);
-		}
-	      stmt = gimple_build_assign (fd->loop.n2, t);
-	      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
-	    }
-	}
       if (zero_iter_bb)
 	{
 	  /* Some counts[i] vars might be uninitialized if
@@ -3949,18 +4515,21 @@  expand_omp_for_generic (struct omp_region *region,
       t4 = build_fold_addr_expr (iend0);
       t3 = build_fold_addr_expr (istart0);
       t2 = fold_convert (fd->iter_type, fd->loop.step);
-      if (POINTER_TYPE_P (type)
-	  && TYPE_PRECISION (type) != TYPE_PRECISION (fd->iter_type))
+      t1 = fd->loop.n2;
+      t0 = fd->loop.n1;
+      if (POINTER_TYPE_P (TREE_TYPE (t0))
+	  && TYPE_PRECISION (TREE_TYPE (t0))
+	     != TYPE_PRECISION (fd->iter_type))
 	{
 	  /* Avoid casting pointers to integer of a different size.  */
 	  tree itype = signed_type_for (type);
-	  t1 = fold_convert (fd->iter_type, fold_convert (itype, fd->loop.n2));
-	  t0 = fold_convert (fd->iter_type, fold_convert (itype, fd->loop.n1));
+	  t1 = fold_convert (fd->iter_type, fold_convert (itype, t1));
+	  t0 = fold_convert (fd->iter_type, fold_convert (itype, t0));
 	}
       else
 	{
-	  t1 = fold_convert (fd->iter_type, fd->loop.n2);
-	  t0 = fold_convert (fd->iter_type, fd->loop.n1);
+	  t1 = fold_convert (fd->iter_type, t1);
+	  t0 = fold_convert (fd->iter_type, t0);
 	}
       if (bias)
 	{
@@ -4015,64 +4584,38 @@  expand_omp_for_generic (struct omp_region *region,
   gsi_remove (&gsi, true);
 
   /* Iteration setup for sequential loop goes in L0_BB.  */
+  tree startvar = fd->loop.v;
+  tree endvar = NULL_TREE;
+
   gsi = gsi_start_bb (l0_bb);
   t = istart0;
   if (bias)
     t = fold_build2 (MINUS_EXPR, fd->iter_type, t, bias);
-  if (POINTER_TYPE_P (type))
-    t = fold_convert (signed_type_for (type), t);
-  t = fold_convert (type, t);
+  if (POINTER_TYPE_P (TREE_TYPE (startvar)))
+    t = fold_convert (signed_type_for (TREE_TYPE (startvar)), t);
+  t = fold_convert (TREE_TYPE (startvar), t);
   t = force_gimple_operand_gsi (&gsi, t,
-				DECL_P (fd->loop.v)
-				&& TREE_ADDRESSABLE (fd->loop.v),
+				DECL_P (startvar)
+				&& TREE_ADDRESSABLE (startvar),
 				NULL_TREE, false, GSI_CONTINUE_LINKING);
-  stmt = gimple_build_assign (fd->loop.v, t);
+  stmt = gimple_build_assign (startvar, t);
   gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
 
   t = iend0;
   if (bias)
     t = fold_build2 (MINUS_EXPR, fd->iter_type, t, bias);
-  if (POINTER_TYPE_P (type))
-    t = fold_convert (signed_type_for (type), t);
-  t = fold_convert (type, t);
+  if (POINTER_TYPE_P (TREE_TYPE (startvar)))
+    t = fold_convert (signed_type_for (TREE_TYPE (startvar)), t);
+  t = fold_convert (TREE_TYPE (startvar), t);
   iend = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
 				   false, GSI_CONTINUE_LINKING);
-  if (fd->collapse > 1)
+  if (endvar)
     {
-      tree tem = create_tmp_reg (type, ".tem");
-      stmt = gimple_build_assign (tem, fd->loop.v);
+      stmt = gimple_build_assign (endvar, iend);
       gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
-      for (i = fd->collapse - 1; i >= 0; i--)
-	{
-	  tree vtype = TREE_TYPE (fd->loops[i].v), itype;
-	  itype = vtype;
-	  if (POINTER_TYPE_P (vtype))
-	    itype = signed_type_for (vtype);
-	  t = fold_build2 (TRUNC_MOD_EXPR, type, tem, counts[i]);
-	  t = fold_convert (itype, t);
-	  t = fold_build2 (MULT_EXPR, itype, t,
-			   fold_convert (itype, fd->loops[i].step));
-	  if (POINTER_TYPE_P (vtype))
-	    t = fold_build_pointer_plus (fd->loops[i].n1, t);
-	  else
-	    t = fold_build2 (PLUS_EXPR, itype, fd->loops[i].n1, t);
-	  t = force_gimple_operand_gsi (&gsi, t,
-					DECL_P (fd->loops[i].v)
-					&& TREE_ADDRESSABLE (fd->loops[i].v),
-					NULL_TREE, false,
-					GSI_CONTINUE_LINKING);
-	  stmt = gimple_build_assign (fd->loops[i].v, t);
-	  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
-	  if (i != 0)
-	    {
-	      t = fold_build2 (TRUNC_DIV_EXPR, type, tem, counts[i]);
-	      t = force_gimple_operand_gsi (&gsi, t, false, NULL_TREE,
-					    false, GSI_CONTINUE_LINKING);
-	      stmt = gimple_build_assign (tem, t);
-	      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
-	    }
-	}
     }
+  if (fd->collapse > 1)
+    expand_omp_for_init_vars (fd, &gsi, counts, startvar);
 
   if (!broken_loop)
     {
@@ -4084,93 +4627,32 @@  expand_omp_for_generic (struct omp_region *region,
       vmain = gimple_omp_continue_control_use (stmt);
       vback = gimple_omp_continue_control_def (stmt);
 
-      if (POINTER_TYPE_P (type))
-	t = fold_build_pointer_plus (vmain, fd->loop.step);
-      else
-	t = fold_build2 (PLUS_EXPR, type, vmain, fd->loop.step);
-      t = force_gimple_operand_gsi (&gsi, t,
-				    DECL_P (vback) && TREE_ADDRESSABLE (vback),
-				    NULL_TREE, true, GSI_SAME_STMT);
-      stmt = gimple_build_assign (vback, t);
-      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
-
-      t = build2 (fd->loop.cond_code, boolean_type_node,
-		  DECL_P (vback) && TREE_ADDRESSABLE (vback) ? t : vback,
-		  iend);
-      stmt = gimple_build_cond_empty (t);
-      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+      /* OMP4 placeholder: if (!gimple_omp_for_combined_p (fd->for_stmt)).  */
+      if (1)
+	{
+	  if (POINTER_TYPE_P (type))
+	    t = fold_build_pointer_plus (vmain, fd->loop.step);
+	  else
+	    t = fold_build2 (PLUS_EXPR, type, vmain, fd->loop.step);
+	  t = force_gimple_operand_gsi (&gsi, t,
+					DECL_P (vback)
+					&& TREE_ADDRESSABLE (vback),
+					NULL_TREE, true, GSI_SAME_STMT);
+	  stmt = gimple_build_assign (vback, t);
+	  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+
+	  t = build2 (fd->loop.cond_code, boolean_type_node,
+		      DECL_P (vback) && TREE_ADDRESSABLE (vback) ? t : vback,
+		      iend);
+	  stmt = gimple_build_cond_empty (t);
+	  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+	}
 
       /* Remove GIMPLE_OMP_CONTINUE.  */
       gsi_remove (&gsi, true);
 
       if (fd->collapse > 1)
-	{
-	  basic_block last_bb, bb;
-
-	  last_bb = cont_bb;
-	  for (i = fd->collapse - 1; i >= 0; i--)
-	    {
-	      tree vtype = TREE_TYPE (fd->loops[i].v);
-
-	      bb = create_empty_bb (last_bb);
-	      if (current_loops)
-		add_bb_to_loop (bb, last_bb->loop_father);
-	      gsi = gsi_start_bb (bb);
-
-	      if (i < fd->collapse - 1)
-		{
-		  e = make_edge (last_bb, bb, EDGE_FALSE_VALUE);
-		  e->probability = REG_BR_PROB_BASE / 8;
-
-		  t = fd->loops[i + 1].n1;
-		  t = force_gimple_operand_gsi (&gsi, t,
-						DECL_P (fd->loops[i + 1].v)
-						&& TREE_ADDRESSABLE
-							(fd->loops[i + 1].v),
-						NULL_TREE, false,
-						GSI_CONTINUE_LINKING);
-		  stmt = gimple_build_assign (fd->loops[i + 1].v, t);
-		  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
-		}
-	      else
-		collapse_bb = bb;
-
-	      set_immediate_dominator (CDI_DOMINATORS, bb, last_bb);
-
-	      if (POINTER_TYPE_P (vtype))
-		t = fold_build_pointer_plus (fd->loops[i].v, fd->loops[i].step);
-	      else
-		t = fold_build2 (PLUS_EXPR, vtype, fd->loops[i].v,
-				 fd->loops[i].step);
-	      t = force_gimple_operand_gsi (&gsi, t,
-					    DECL_P (fd->loops[i].v)
-					    && TREE_ADDRESSABLE (fd->loops[i].v),
-					    NULL_TREE, false,
-					    GSI_CONTINUE_LINKING);
-	      stmt = gimple_build_assign (fd->loops[i].v, t);
-	      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
-
-	      if (i > 0)
-		{
-		  t = fd->loops[i].n2;
-		  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
-						false, GSI_CONTINUE_LINKING);
-		  tree v = fd->loops[i].v;
-		  if (DECL_P (v) && TREE_ADDRESSABLE (v))
-		    v = force_gimple_operand_gsi (&gsi, v, true, NULL_TREE,
-						  false, GSI_CONTINUE_LINKING);
-		  t = fold_build2 (fd->loops[i].cond_code, boolean_type_node,
-				   v, t);
-		  stmt = gimple_build_cond_empty (t);
-		  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
-		  e = make_edge (bb, l1_bb, EDGE_TRUE_VALUE);
-		  e->probability = REG_BR_PROB_BASE * 7 / 8;
-		}
-	      else
-		make_edge (bb, l1_bb, EDGE_FALLTHRU);
-	      last_bb = bb;
-	    }
-	}
+	collapse_bb = extract_omp_for_update_vars (fd, cont_bb, l1_bb);
 
       /* Emit code to get the next parallel iteration in L2_BB.  */
       gsi = gsi_start_bb (l2_bb);
@@ -4220,19 +4702,27 @@  expand_omp_for_generic (struct omp_region *region,
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
       if (current_loops)
 	add_bb_to_loop (l2_bb, cont_bb->loop_father);
-      if (fd->collapse > 1)
+      e = find_edge (cont_bb, l1_bb);
+      /* OMP4 placeholder for gimple_omp_for_combined_p (fd->for_stmt).  */
+      if (0)
+	;
+      else if (fd->collapse > 1)
 	{
-	  e = find_edge (cont_bb, l1_bb);
 	  remove_edge (e);
 	  e = make_edge (cont_bb, collapse_bb, EDGE_TRUE_VALUE);
 	}
       else
+	e->flags = EDGE_TRUE_VALUE;
+      if (e)
 	{
-	  e = find_edge (cont_bb, l1_bb);
-	  e->flags = EDGE_TRUE_VALUE;
+	  e->probability = REG_BR_PROB_BASE * 7 / 8;
+	  find_edge (cont_bb, l2_bb)->probability = REG_BR_PROB_BASE / 8;
+	}
+      else
+	{
+	  e = find_edge (cont_bb, l2_bb);
+	  e->flags = EDGE_FALLTHRU;
 	}
-      e->probability = REG_BR_PROB_BASE * 7 / 8;
-      find_edge (cont_bb, l2_bb)->probability = REG_BR_PROB_BASE / 8;
       make_edge (l2_bb, l0_bb, EDGE_TRUE_VALUE);
 
       set_immediate_dominator (CDI_DOMINATORS, l2_bb,
@@ -4249,10 +4739,14 @@  expand_omp_for_generic (struct omp_region *region,
       outer_loop->latch = l2_bb;
       add_loop (outer_loop, l0_bb->loop_father);
 
-      struct loop *loop = alloc_loop ();
-      loop->header = l1_bb;
-      /* The loop may have multiple latches.  */
-      add_loop (loop, outer_loop);
+      /* OMP4 placeholder: if (!gimple_omp_for_combined_p (fd->for_stmt)).  */
+      if (1)
+	{
+	  struct loop *loop = alloc_loop ();
+	  loop->header = l1_bb;
+	  /* The loop may have multiple latches.  */
+	  add_loop (loop, outer_loop);
+	}
     }
 }
 
@@ -4883,6 +5377,295 @@  expand_omp_for_static_chunk (struct omp_region *region, struct omp_for_data *fd)
   add_loop (loop, trip_loop);
 }
 
+/* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
+   loop.  Given parameters:
+
+	for (V = N1; V cond N2; V += STEP) BODY;
+
+   where COND is "<" or ">", we generate pseudocode
+
+	V = N1;
+	goto L1;
+    L0:
+	BODY;
+	V += STEP;
+    L1:
+	if (V cond N2) goto L0; else goto L2;
+    L2:
+
+    For collapsed loops, given parameters:
+      collapse(3)
+      for (V1 = N11; V1 cond1 N12; V1 += STEP1)
+	for (V2 = N21; V2 cond2 N22; V2 += STEP2)
+	  for (V3 = N31; V3 cond3 N32; V3 += STEP3)
+	    BODY;
+
+    we generate pseudocode
+
+	if (cond3 is <)
+	  adj = STEP3 - 1;
+	else
+	  adj = STEP3 + 1;
+	count3 = (adj + N32 - N31) / STEP3;
+	if (cond2 is <)
+	  adj = STEP2 - 1;
+	else
+	  adj = STEP2 + 1;
+	count2 = (adj + N22 - N21) / STEP2;
+	if (cond1 is <)
+	  adj = STEP1 - 1;
+	else
+	  adj = STEP1 + 1;
+	count1 = (adj + N12 - N11) / STEP1;
+	count = count1 * count2 * count3;
+	V = 0;
+	V1 = N11;
+	V2 = N21;
+	V3 = N31;
+	goto L1;
+    L0:
+	BODY;
+	V += 1;
+	V3 += STEP3;
+	V2 += (V3 cond3 N32) ? 0 : STEP2;
+	V3 = (V3 cond3 N32) ? V3 : N31;
+	V1 += (V2 cond2 N22) ? 0 : STEP1;
+	V2 = (V2 cond2 N22) ? V2 : N21;
+    L1:
+	if (V < count) goto L0; else goto L2;
+    L2:
+
+      */
+
+static void
+expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type, t;
+  basic_block entry_bb, cont_bb, exit_bb, l0_bb, l1_bb, l2_bb, l2_dom_bb;
+  gimple_stmt_iterator gsi;
+  gimple stmt;
+  bool broken_loop = region->cont == NULL;
+  edge e, ne;
+  tree *counts = NULL;
+  int i;
+  tree safelen = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt),
+				  OMP_CLAUSE_SAFELEN);
+  tree simduid = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt),
+				  OMP_CLAUSE__SIMDUID_);
+  tree n2;
+
+  type = TREE_TYPE (fd->loop.v);
+  entry_bb = region->entry;
+  cont_bb = region->cont;
+  gcc_assert (EDGE_COUNT (entry_bb->succs) == 2);
+  gcc_assert (broken_loop
+	      || BRANCH_EDGE (entry_bb)->dest == FALLTHRU_EDGE (cont_bb)->dest);
+  l0_bb = FALLTHRU_EDGE (entry_bb)->dest;
+  if (!broken_loop)
+    {
+      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l0_bb);
+      gcc_assert (EDGE_COUNT (cont_bb->succs) == 2);
+      l1_bb = split_block (cont_bb, last_stmt (cont_bb))->dest;
+      l2_bb = BRANCH_EDGE (entry_bb)->dest;
+    }
+  else
+    {
+      BRANCH_EDGE (entry_bb)->flags &= ~EDGE_ABNORMAL;
+      l1_bb = split_edge (BRANCH_EDGE (entry_bb));
+      l2_bb = single_succ (l1_bb);
+    }
+  exit_bb = region->exit;
+  l2_dom_bb = NULL;
+
+  gsi = gsi_last_bb (entry_bb);
+
+  gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR);
+  /* Not needed in SSA form right now.  */
+  gcc_assert (!gimple_in_ssa_p (cfun));
+  if (fd->collapse > 1)
+    {
+      int first_zero_iter = -1;
+      basic_block zero_iter_bb = l2_bb;
+
+      counts = XALLOCAVEC (tree, fd->collapse);
+      expand_omp_for_init_counts (fd, &gsi, entry_bb, counts,
+				  zero_iter_bb, first_zero_iter,
+				  l2_dom_bb);
+    }
+  if (l2_dom_bb == NULL)
+    l2_dom_bb = l1_bb;
+
+  n2 = fd->loop.n2;
+  if (0)
+    /* Place holder for gimple_omp_for_combined_into_p() in
+       the upcoming gomp-4_0-branch merge.  */;
+  else
+    {
+      expand_omp_build_assign (&gsi, fd->loop.v,
+			       fold_convert (type, fd->loop.n1));
+      if (fd->collapse > 1)
+	for (i = 0; i < fd->collapse; i++)
+	  {
+	    tree itype = TREE_TYPE (fd->loops[i].v);
+	    if (POINTER_TYPE_P (itype))
+	      itype = signed_type_for (itype);
+	    t = fold_convert (TREE_TYPE (fd->loops[i].v), fd->loops[i].n1);
+	    expand_omp_build_assign (&gsi, fd->loops[i].v, t);
+	  }
+      }
+
+  /* Remove the GIMPLE_OMP_FOR statement.  */
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+
+      if (POINTER_TYPE_P (type))
+	t = fold_build_pointer_plus (fd->loop.v, fd->loop.step);
+      else
+	t = fold_build2 (PLUS_EXPR, type, fd->loop.v, fd->loop.step);
+      expand_omp_build_assign (&gsi, fd->loop.v, t);
+
+      if (fd->collapse > 1)
+	{
+	  i = fd->collapse - 1;
+	  if (POINTER_TYPE_P (TREE_TYPE (fd->loops[i].v)))
+	    {
+	      t = fold_convert (sizetype, fd->loops[i].step);
+	      t = fold_build_pointer_plus (fd->loops[i].v, t);
+	    }
+	  else
+	    {
+	      t = fold_convert (TREE_TYPE (fd->loops[i].v),
+				fd->loops[i].step);
+	      t = fold_build2 (PLUS_EXPR, TREE_TYPE (fd->loops[i].v),
+			       fd->loops[i].v, t);
+	    }
+	  expand_omp_build_assign (&gsi, fd->loops[i].v, t);
+
+	  for (i = fd->collapse - 1; i > 0; i--)
+	    {
+	      tree itype = TREE_TYPE (fd->loops[i].v);
+	      tree itype2 = TREE_TYPE (fd->loops[i - 1].v);
+	      if (POINTER_TYPE_P (itype2))
+		itype2 = signed_type_for (itype2);
+	      t = build3 (COND_EXPR, itype2,
+			  build2 (fd->loops[i].cond_code, boolean_type_node,
+				  fd->loops[i].v,
+				  fold_convert (itype, fd->loops[i].n2)),
+			  build_int_cst (itype2, 0),
+			  fold_convert (itype2, fd->loops[i - 1].step));
+	      if (POINTER_TYPE_P (TREE_TYPE (fd->loops[i - 1].v)))
+		t = fold_build_pointer_plus (fd->loops[i - 1].v, t);
+	      else
+		t = fold_build2 (PLUS_EXPR, itype2, fd->loops[i - 1].v, t);
+	      expand_omp_build_assign (&gsi, fd->loops[i - 1].v, t);
+
+	      t = build3 (COND_EXPR, itype,
+			  build2 (fd->loops[i].cond_code, boolean_type_node,
+				  fd->loops[i].v,
+				  fold_convert (itype, fd->loops[i].n2)),
+			  fd->loops[i].v,
+			  fold_convert (itype, fd->loops[i].n1));
+	      expand_omp_build_assign (&gsi, fd->loops[i].v, t);
+	    }
+	}
+
+      /* Remove GIMPLE_OMP_CONTINUE.  */
+      gsi_remove (&gsi, true);
+    }
+
+  /* Emit the condition in L1_BB.  */
+  gsi = gsi_start_bb (l1_bb);
+
+  t = fold_convert (type, n2);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+				false, GSI_CONTINUE_LINKING);
+  t = build2 (fd->loop.cond_code, boolean_type_node, fd->loop.v, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+
+  /* Connect the new blocks.  */
+  remove_edge (FALLTHRU_EDGE (entry_bb));
+
+  if (!broken_loop)
+    {
+      remove_edge (BRANCH_EDGE (entry_bb));
+      make_edge (entry_bb, l1_bb, EDGE_FALLTHRU);
+
+      e = BRANCH_EDGE (l1_bb);
+      ne = FALLTHRU_EDGE (l1_bb);
+      e->flags = EDGE_TRUE_VALUE;
+    }
+  else
+    {
+      single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+      ne = single_succ_edge (l1_bb);
+      e = make_edge (l1_bb, l0_bb, EDGE_TRUE_VALUE);
+
+    }
+  ne->flags = EDGE_FALSE_VALUE;
+  e->probability = REG_BR_PROB_BASE * 7 / 8;
+  ne->probability = REG_BR_PROB_BASE / 8;
+
+  set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
+  set_immediate_dominator (CDI_DOMINATORS, l2_bb, l2_dom_bb);
+  set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
+
+  if (!broken_loop)
+    {
+      struct loop *loop = alloc_loop ();
+      loop->header = l1_bb;
+      loop->latch = e->dest;
+      add_loop (loop, l1_bb->loop_father);
+      if (safelen == NULL_TREE)
+	loop->safelen = INT_MAX;
+      else
+	{
+	  safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
+	  if (!host_integerp (safelen, 1)
+	      || (unsigned HOST_WIDE_INT) tree_low_cst (safelen, 1)
+		 > INT_MAX)
+	    loop->safelen = INT_MAX;
+	  else
+	    loop->safelen = tree_low_cst (safelen, 1);
+	  if (loop->safelen == 1)
+	    loop->safelen = 0;
+	}
+      if (simduid)
+	{
+	  loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid);
+	  cfun->has_simduid_loops = true;
+	}
+      /* If not -fno-tree-vectorize, hint that we want to vectorize
+	 the loop.  */
+      if ((flag_tree_vectorize
+	   || !global_options_set.x_flag_tree_vectorize)
+	  && loop->safelen > 1)
+	{
+	  loop->force_vect = true;
+	  cfun->has_force_vect_loops = true;
+	}
+    }
+}
+
 
 /* Expand the OpenMP loop defined by REGION.  */
 
@@ -4914,7 +5697,9 @@  expand_omp_for (struct omp_region *region)
        original loops from being detected.  Fix that up.  */
     loops_state_set (LOOPS_NEED_FIXUP);
 
-  if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
+  if (gimple_omp_for_kind (fd.for_stmt) == GF_OMP_FOR_KIND_SIMD)
+    expand_omp_simd (region, &fd);
+  else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
       && !fd.have_ordered
       && fd.collapse == 1
       && region->cont != NULL)
@@ -4928,6 +5713,8 @@  expand_omp_for (struct omp_region *region)
     {
       int fn_index, start_ix, next_ix;
 
+      gcc_assert (gimple_omp_for_kind (fd.for_stmt)
+		  == GF_OMP_FOR_KIND_FOR);
       if (fd.chunk_size == NULL
 	  && fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC)
 	fd.chunk_size = integer_zero_node;
@@ -6516,6 +7303,8 @@  lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
 	  && host_integerp (fd->loop.n2, 0)
 	  && ! integer_zerop (fd->loop.n2))
 	vinit = build_int_cst (TREE_TYPE (fd->loop.v), 0);
+      else
+	vinit = unshare_expr (vinit);
 
       /* Initialize the iterator variable, so that threads that don't execute
 	 any iterations don't execute the lastprivate clauses by accident.  */
@@ -6539,7 +7328,6 @@  lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   push_gimplify_context (&gctx);
 
   lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
-  lower_omp (gimple_omp_body_ptr (stmt), ctx);
 
   block = make_node (BLOCK);
   new_stmt = gimple_build_bind (NULL, NULL, block);
@@ -6564,6 +7352,8 @@  lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   lower_rec_input_clauses (gimple_omp_for_clauses (stmt), &body, &dlist, ctx);
   gimple_seq_add_seq (&body, gimple_omp_for_pre_body (stmt));
 
+  lower_omp (gimple_omp_body_ptr (stmt), ctx);
+
   /* Lower the header expressions.  At this point, we can assume that
      the header is of the form:
 
diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 10431c0..5bd7719 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -4331,10 +4331,25 @@  get_references_in_stmt (gimple stmt, vec<data_ref_loc, va_stack> *references)
   /* ASM_EXPR and CALL_EXPR may embed arbitrary side effects.
      As we cannot model data-references to not spelled out
      accesses give up if they may occur.  */
-  if ((stmt_code == GIMPLE_CALL
-       && !(gimple_call_flags (stmt) & ECF_CONST))
-      || (stmt_code == GIMPLE_ASM
-	  && (gimple_asm_volatile_p (stmt) || gimple_vuse (stmt))))
+  if (stmt_code == GIMPLE_CALL
+      && !(gimple_call_flags (stmt) & ECF_CONST))
+    {
+      /* Allow IFN_GOMP_SIMD_LANE in their own loops.  */
+      if (gimple_call_internal_p (stmt)
+	  && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE)
+	{
+	  struct loop *loop = gimple_bb (stmt)->loop_father;
+	  tree uid = gimple_call_arg (stmt, 0);
+	  gcc_assert (TREE_CODE (uid) == SSA_NAME);
+	  if (loop == NULL
+	      || loop->simduid != SSA_NAME_VAR (uid))
+	    clobbers_memory = true;
+	}
+      else
+	clobbers_memory = true;
+    }
+  else if (stmt_code == GIMPLE_ASM
+	   && (gimple_asm_volatile_p (stmt) || gimple_vuse (stmt)))
     clobbers_memory = true;
 
   if (!gimple_vuse (stmt))
diff --git a/gcc/tree-flow.h b/gcc/tree-flow.h
index caa8d74..fe9ecee 100644
--- a/gcc/tree-flow.h
+++ b/gcc/tree-flow.h
@@ -344,7 +344,6 @@  extern struct omp_region *new_omp_region (basic_block, enum gimple_code,
 					  struct omp_region *);
 extern void free_omp_regions (void);
 void omp_expand_local (basic_block);
-extern tree find_omp_clause (tree, enum omp_clause_code);
 tree copy_var_decl (tree, tree, tree);
 
 /*---------------------------------------------------------------------------
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 0ebb8c3..eb3a3fa 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1822,6 +1822,10 @@  main_tree_if_conversion (void)
     return 0;
 
   FOR_EACH_LOOP (li, loop, 0)
+    if (flag_tree_loop_if_convert == 1
+	|| flag_tree_loop_if_convert_stores == 1
+	|| flag_tree_vectorize
+	|| loop->force_vect)
     changed |= tree_if_conversion (loop);
 
   if (changed)
@@ -1848,7 +1852,8 @@  main_tree_if_conversion (void)
 static bool
 gate_tree_if_conversion (void)
 {
-  return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
+  return (((flag_tree_vectorize || cfun->has_force_vect_loops)
+	   && flag_tree_loop_if_convert != 0)
 	  || flag_tree_loop_if_convert == 1
 	  || flag_tree_loop_if_convert_stores == 1);
 }
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index f524771..4ec1d66 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -1298,7 +1298,8 @@  remap_gimple_stmt (gimple stmt, copy_body_data *id)
 	case GIMPLE_OMP_FOR:
 	  s1 = remap_gimple_seq (gimple_omp_body (stmt), id);
 	  s2 = remap_gimple_seq (gimple_omp_for_pre_body (stmt), id);
-	  copy = gimple_build_omp_for (s1, gimple_omp_for_clauses (stmt),
+	  copy = gimple_build_omp_for (s1, gimple_omp_for_kind (stmt),
+				       gimple_omp_for_clauses (stmt),
 				       gimple_omp_for_collapse (stmt), s2);
 	  {
 	    size_t i;
@@ -2331,6 +2332,8 @@  copy_cfg_body (copy_body_data * id, gcov_type count, int frequency_scale,
 		  get_loop (src_cfun, 0));
       /* Defer to cfgcleanup to update loop-father fields of basic-blocks.  */
       loops_state_set (LOOPS_NEED_FIXUP);
+      cfun->has_force_vect_loops |= src_cfun->has_force_vect_loops;
+      cfun->has_simduid_loops |= src_cfun->has_simduid_loops;
     }
 
   /* If the loop tree in the source function needed fixup, mark the
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index cea6f03..1e6bb07 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1686,7 +1686,7 @@  create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
   t = build_omp_clause (loc, OMP_CLAUSE_SCHEDULE);
   OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
 
-  for_stmt = gimple_build_omp_for (NULL, t, 1, NULL);
+  for_stmt = gimple_build_omp_for (NULL, GF_OMP_FOR_KIND_FOR, t, 1, NULL);
   gimple_set_location (for_stmt, loc);
   gimple_omp_for_set_index (for_stmt, 0, initvar);
   gimple_omp_for_set_initial (for_stmt, 0, cvar_init);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7745f73..b2d32fa8 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -314,11 +314,14 @@  dump_omp_clause (pretty_printer *buffer, tree clause, int spc, int flags)
     case OMP_CLAUSE_COPYPRIVATE:
       name = "copyprivate";
       goto print_remap;
+    case OMP_CLAUSE_UNIFORM:
+      name = "uniform";
+      goto print_remap;
   print_remap:
       pp_string (buffer, name);
       pp_character (buffer, '(');
       dump_generic_node (buffer, OMP_CLAUSE_DECL (clause),
-	  spc, flags, false);
+			 spc, flags, false);
       pp_character (buffer, ')');
       break;
 
@@ -431,6 +434,30 @@  dump_omp_clause (pretty_printer *buffer, tree clause, int spc, int flags)
       pp_string (buffer, "mergeable");
       break;
 
+    case OMP_CLAUSE_LINEAR:
+      pp_string (buffer, "linear(");
+      dump_generic_node (buffer, OMP_CLAUSE_DECL (clause),
+			 spc, flags, false);
+      pp_character (buffer, ':');
+      dump_generic_node (buffer, OMP_CLAUSE_LINEAR_STEP (clause),
+			 spc, flags, false);
+      pp_character (buffer, ')');
+      break;
+
+    case OMP_CLAUSE_SAFELEN:
+      pp_string (buffer, "safelen(");
+      dump_generic_node (buffer, OMP_CLAUSE_SAFELEN_EXPR (clause),
+			 spc, flags, false);
+      pp_character (buffer, ')');
+      break;
+
+    case OMP_CLAUSE__SIMDUID_:
+      pp_string (buffer, "_simduid_(");
+      dump_generic_node (buffer, OMP_CLAUSE__SIMDUID__DECL (clause),
+			 spc, flags, false);
+      pp_character (buffer, ')');
+      break;
+
     default:
       /* Should never happen.  */
       dump_generic_node (buffer, clause, spc, flags, false);
@@ -2178,6 +2205,13 @@  dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 
     case OMP_FOR:
       pp_string (buffer, "#pragma omp for");
+      goto dump_omp_loop;
+
+    case OMP_SIMD:
+      pp_string (buffer, "#pragma omp simd");
+      goto dump_omp_loop;
+
+    dump_omp_loop:
       dump_omp_clauses (buffer, OMP_FOR_CLAUSES (node), spc, flags);
 
       if (!(flags & TDF_SLIM))
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 1bc4c2f..7f66bda 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -626,6 +626,22 @@  likely_value (gimple stmt)
   if (has_constant_operand)
     all_undefined_operands = false;
 
+  if (has_undefined_operand
+      && code == GIMPLE_CALL
+      && gimple_call_internal_p (stmt))
+    switch (gimple_call_internal_fn (stmt))
+      {
+	/* These 3 builtins use the first argument just as a magic
+	   way how to find out a decl uid.  */
+      case IFN_GOMP_SIMD_LANE:
+      case IFN_GOMP_SIMD_VF:
+      case IFN_GOMP_SIMD_LAST_LANE:
+	has_undefined_operand = false;
+	break;
+      default:
+	break;
+      }
+
   /* If the operation combines operands like COMPLEX_EXPR make sure to
      not mark the result UNDEFINED if only one part of the result is
      undefined.  */
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index 91cf8c1..abe4557 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1125,6 +1125,11 @@  tree_unroll_loops_completely_1 (bool may_increase_size, bool unroll_outer,
   if (changed)
     return true;
 
+  /* Don't unroll #pragma omp simd loops until the vectorizer
+     attempts to vectorize those.  */
+  if (loop->force_vect)
+    return false;
+
   /* Try to unroll this loop.  */
   loop_father = loop_outer (loop);
   if (!loop_father)
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 99e27a1..2160318 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -225,7 +225,7 @@  tree_vectorize (void)
 static bool
 gate_tree_vectorize (void)
 {
-  return flag_tree_vectorize;
+  return flag_tree_vectorize || cfun->has_force_vect_loops;
 }
 
 struct gimple_opt_pass pass_vectorize =
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 47ecad3..f6e2131 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -255,6 +255,15 @@  vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
   /* Unknown data dependence.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
     {
+      /* If user asserted safelen consecutive iterations can be
+	 executed concurrently, assume independence.  */
+      if (loop->safelen >= 2)
+	{
+	  if (loop->safelen < *max_vf)
+	    *max_vf = loop->safelen;
+	  return false;
+	}
+
       if (STMT_VINFO_GATHER_P (stmtinfo_a)
 	  || STMT_VINFO_GATHER_P (stmtinfo_b))
 	{
@@ -291,6 +300,15 @@  vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
   /* Known data dependence.  */
   if (DDR_NUM_DIST_VECTS (ddr) == 0)
     {
+      /* If user asserted safelen consecutive iterations can be
+	 executed concurrently, assume independence.  */
+      if (loop->safelen >= 2)
+	{
+	  if (loop->safelen < *max_vf)
+	    *max_vf = loop->safelen;
+	  return false;
+	}
+
       if (STMT_VINFO_GATHER_P (stmtinfo_a)
 	  || STMT_VINFO_GATHER_P (stmtinfo_b))
 	{
@@ -2859,6 +2877,7 @@  vect_analyze_data_refs (loop_vec_info loop_vinfo,
       stmt_vec_info stmt_info;
       tree base, offset, init;
       bool gather = false;
+      bool simd_lane_access = false;
       int vf;
 
 again:
@@ -2890,12 +2909,17 @@  again:
       if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_INIT (dr)
 	  || !DR_STEP (dr))
         {
-	  /* If target supports vector gather loads, see if they can't
-	     be used.  */
-	  if (loop_vinfo
-	      && DR_IS_READ (dr)
+	  bool maybe_gather
+	    = DR_IS_READ (dr)
 	      && !TREE_THIS_VOLATILE (DR_REF (dr))
-	      && targetm.vectorize.builtin_gather != NULL
+	      && targetm.vectorize.builtin_gather != NULL;
+	  bool maybe_simd_lane_access
+	    = loop_vinfo && loop->simduid;
+
+	  /* If target supports vector gather loads, or if this might be
+	     a SIMD lane access, see if they can't be used.  */
+	  if (loop_vinfo
+	      && (maybe_gather || maybe_simd_lane_access)
 	      && !nested_in_vect_loop_p (loop, stmt))
 	    {
 	      struct data_reference *newdr
@@ -2908,14 +2932,59 @@  again:
 		  && DR_STEP (newdr)
 		  && integer_zerop (DR_STEP (newdr)))
 		{
-		  dr = newdr;
-		  gather = true;
+		  if (maybe_simd_lane_access)
+		    {
+		      tree off = DR_OFFSET (newdr);
+		      STRIP_NOPS (off);
+		      if (TREE_CODE (DR_INIT (newdr)) == INTEGER_CST
+			  && TREE_CODE (off) == MULT_EXPR
+			  && host_integerp (TREE_OPERAND (off, 1), 1))
+			{
+			  tree step = TREE_OPERAND (off, 1);
+			  off = TREE_OPERAND (off, 0);
+			  STRIP_NOPS (off);
+			  if (CONVERT_EXPR_P (off)
+			      && TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (off,
+									  0)))
+				 < TYPE_PRECISION (TREE_TYPE (off)))
+			    off = TREE_OPERAND (off, 0);
+			  if (TREE_CODE (off) == SSA_NAME)
+			    {
+			      gimple def = SSA_NAME_DEF_STMT (off);
+			      tree reft = TREE_TYPE (DR_REF (newdr));
+			      if (gimple_call_internal_p (def)
+				  && gimple_call_internal_fn (def)
+				  == IFN_GOMP_SIMD_LANE)
+				{
+				  tree arg = gimple_call_arg (def, 0);
+				  gcc_assert (TREE_CODE (arg) == SSA_NAME);
+				  arg = SSA_NAME_VAR (arg);
+				  if (arg == loop->simduid
+				      /* For now.  */
+				      && tree_int_cst_equal
+					   (TYPE_SIZE_UNIT (reft),
+					    step))
+				    {
+				      DR_OFFSET (newdr) = ssize_int (0);
+				      DR_STEP (newdr) = step;
+				      dr = newdr;
+				      simd_lane_access = true;
+				    }
+				}
+			    }
+			}
+		    }
+		  if (!simd_lane_access && maybe_gather)
+		    {
+		      dr = newdr;
+		      gather = true;
+		    }
 		}
-	      else
+	      if (!gather && !simd_lane_access)
 		free_data_ref (newdr);
 	    }
 
-	  if (!gather)
+	  if (!gather && !simd_lane_access)
 	    {
 	      if (dump_enabled_p ())
 		{
@@ -2942,7 +3011,7 @@  again:
           if (bb_vinfo)
 	    break;
 
-	  if (gather)
+	  if (gather || simd_lane_access)
 	    free_data_ref (dr);
 	  return false;
         }
@@ -2975,7 +3044,7 @@  again:
           if (bb_vinfo)
 	    break;
 
-	  if (gather)
+	  if (gather || simd_lane_access)
 	    free_data_ref (dr);
           return false;
         }
@@ -2994,7 +3063,7 @@  again:
           if (bb_vinfo)
 	    break;
 
-	  if (gather)
+	  if (gather || simd_lane_access)
 	    free_data_ref (dr);
           return false;
 	}
@@ -3015,7 +3084,7 @@  again:
 	  if (bb_vinfo)
 	    break;
 
-	  if (gather)
+	  if (gather || simd_lane_access)
 	    free_data_ref (dr);
 	  return false;
 	}
@@ -3150,12 +3219,17 @@  again:
           if (bb_vinfo)
 	    break;
 
-	  if (gather)
+	  if (gather || simd_lane_access)
 	    free_data_ref (dr);
           return false;
         }
 
       STMT_VINFO_DATA_REF (stmt_info) = dr;
+      if (simd_lane_access)
+	{
+	  STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) = true;
+	  datarefs[i] = dr;
+	}
 
       /* Set vectype for STMT.  */
       scalar_type = TREE_TYPE (DR_REF (dr));
@@ -3176,7 +3250,7 @@  again:
           if (bb_vinfo)
 	    break;
 
-	  if (gather)
+	  if (gather || simd_lane_access)
 	    {
 	      STMT_VINFO_DATA_REF (stmt_info) = NULL;
 	      free_data_ref (dr);
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index c9b1021..3346dbf 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -5361,7 +5361,7 @@  vectorizable_induction (gimple phi, gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
 bool
 vectorizable_live_operation (gimple stmt,
 			     gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
-			     gimple *vec_stmt ATTRIBUTE_UNUSED)
+			     gimple *vec_stmt)
 {
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
@@ -5381,7 +5381,41 @@  vectorizable_live_operation (gimple stmt,
     return false;
 
   if (!is_gimple_assign (stmt))
-    return false;
+    {
+      if (gimple_call_internal_p (stmt)
+	  && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE
+	  && gimple_call_lhs (stmt)
+	  && loop->simduid
+	  && TREE_CODE (gimple_call_arg (stmt, 0)) == SSA_NAME
+	  && loop->simduid
+	     == SSA_NAME_VAR (gimple_call_arg (stmt, 0)))
+	{
+	  edge e = single_exit (loop);
+	  basic_block merge_bb = e->dest;
+	  imm_use_iterator imm_iter;
+	  use_operand_p use_p;
+	  tree lhs = gimple_call_lhs (stmt);
+
+	  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
+	    {
+	      gimple use_stmt = USE_STMT (use_p);
+	      if (gimple_code (use_stmt) == GIMPLE_PHI
+		  || gimple_bb (use_stmt) == merge_bb)
+		{
+		  if (vec_stmt)
+		    {
+		      tree vfm1
+			= build_int_cst (unsigned_type_node,
+					 loop_vinfo->vectorization_factor - 1);
+		      SET_PHI_ARG_DEF (use_stmt, e->dest_idx, vfm1);
+		    }
+		  return true;
+		}
+	    }
+	}
+
+      return false;
+    }
 
   if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
     return false;
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 0580f7d..3768dcd 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1755,6 +1755,14 @@  vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   if (nargs == 0 || nargs > 3)
     return false;
 
+  /* Ignore the argument of IFN_GOMP_SIMD_LANE, it is magic.  */
+  if (gimple_call_internal_p (stmt)
+      && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE)
+    {
+      nargs = 0;
+      rhs_type = unsigned_type_node;
+    }
+
   for (i = 0; i < nargs; i++)
     {
       tree opvectype;
@@ -1830,11 +1838,26 @@  vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   fndecl = vectorizable_function (stmt, vectype_out, vectype_in);
   if (fndecl == NULL_TREE)
     {
-      if (dump_enabled_p ())
-	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "function is not vectorizable.");
-
-      return false;
+      if (gimple_call_internal_p (stmt)
+	  && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE
+	  && !slp_node
+	  && loop_vinfo
+	  && LOOP_VINFO_LOOP (loop_vinfo)->simduid
+	  && TREE_CODE (gimple_call_arg (stmt, 0)) == SSA_NAME
+	  && LOOP_VINFO_LOOP (loop_vinfo)->simduid
+	     == SSA_NAME_VAR (gimple_call_arg (stmt, 0)))
+	{
+	  /* We can handle IFN_GOMP_SIMD_LANE by returning a
+	     { 0, 1, 2, ... vf - 1 } vector.  */
+	  gcc_assert (nargs == 0);
+	}
+      else
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "function is not vectorizable.");
+	  return false;
+	}
     }
 
   gcc_assert (!gimple_vuse (stmt));
@@ -1932,9 +1955,30 @@  vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	      vargs.quick_push (vec_oprnd0);
 	    }
 
-	  new_stmt = gimple_build_call_vec (fndecl, vargs);
-	  new_temp = make_ssa_name (vec_dest, new_stmt);
-	  gimple_call_set_lhs (new_stmt, new_temp);
+	  if (gimple_call_internal_p (stmt)
+	      && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE)
+	    {
+	      tree *v = XALLOCAVEC (tree, nunits_out);
+	      int k;
+	      for (k = 0; k < nunits_out; ++k)
+		v[k] = build_int_cst (unsigned_type_node, j * nunits_out + k);
+	      tree cst = build_vector (vectype_out, v);
+	      tree new_var
+		= vect_get_new_vect_var (vectype_out, vect_simple_var, "cst_");
+	      gimple init_stmt = gimple_build_assign (new_var, cst);
+	      new_temp = make_ssa_name (new_var, init_stmt);
+	      gimple_assign_set_lhs (init_stmt, new_temp);
+	      vect_init_vector_1 (stmt, init_stmt, NULL);
+	      new_temp = make_ssa_name (vec_dest, NULL);
+	      new_stmt = gimple_build_assign (new_temp,
+					      gimple_assign_lhs (init_stmt));
+	    }
+	  else
+	    {
+	      new_stmt = gimple_build_call_vec (fndecl, vargs);
+	      new_temp = make_ssa_name (vec_dest, new_stmt);
+	      gimple_call_set_lhs (new_stmt, new_temp);
+	    }
 	  vect_finish_stmt_generation (stmt, new_stmt, gsi);
 
 	  if (j == 0)
@@ -3796,6 +3840,7 @@  vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   enum vect_def_type dt;
   stmt_vec_info prev_stmt_info = NULL;
   tree dataref_ptr = NULL_TREE;
+  tree dataref_offset = NULL_TREE;
   gimple ptr_incr = NULL;
   int nunits = TYPE_VECTOR_SUBPARTS (vectype);
   int ncopies;
@@ -4085,9 +4130,26 @@  vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	  /* We should have catched mismatched types earlier.  */
 	  gcc_assert (useless_type_conversion_p (vectype,
 						 TREE_TYPE (vec_oprnd)));
-	  dataref_ptr = vect_create_data_ref_ptr (first_stmt, aggr_type, NULL,
-						  NULL_TREE, &dummy, gsi,
-						  &ptr_incr, false, &inv_p);
+	  bool simd_lane_access_p
+	    = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info);
+	  if (simd_lane_access_p
+	      && TREE_CODE (DR_BASE_ADDRESS (first_dr)) == ADDR_EXPR
+	      && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr), 0))
+	      && integer_zerop (DR_OFFSET (first_dr))
+	      && integer_zerop (DR_INIT (first_dr))
+	      && alias_sets_conflict_p (get_alias_set (aggr_type),
+					get_alias_set (DR_REF (first_dr))))
+	    {
+	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr));
+	      dataref_offset = build_int_cst (reference_alias_ptr_type
+					      (DR_REF (first_dr)), 0);
+	    }
+	  else
+	    dataref_ptr
+	      = vect_create_data_ref_ptr (first_stmt, aggr_type,
+					  simd_lane_access_p ? loop : NULL,
+					  NULL_TREE, &dummy, gsi, &ptr_incr,
+					  simd_lane_access_p, &inv_p);
 	  gcc_assert (bb_vinfo || !inv_p);
 	}
       else
@@ -4108,8 +4170,13 @@  vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	      dr_chain[i] = vec_oprnd;
 	      oprnds[i] = vec_oprnd;
 	    }
-	  dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi, stmt,
-					 TYPE_SIZE_UNIT (aggr_type));
+	  if (dataref_offset)
+	    dataref_offset
+	      = int_const_binop (PLUS_EXPR, dataref_offset,
+				 TYPE_SIZE_UNIT (aggr_type));
+	  else
+	    dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi, stmt,
+					   TYPE_SIZE_UNIT (aggr_type));
 	}
 
       if (store_lanes_p)
@@ -4161,8 +4228,10 @@  vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 		vec_oprnd = result_chain[i];
 
 	      data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd), dataref_ptr,
-				 build_int_cst (reference_alias_ptr_type
-						(DR_REF (first_dr)), 0));
+				 dataref_offset
+				 ? dataref_offset
+				 : build_int_cst (reference_alias_ptr_type
+						  (DR_REF (first_dr)), 0));
 	      align = TYPE_ALIGN_UNIT (vectype);
 	      if (aligned_access_p (first_dr))
 		misalign = 0;
@@ -4181,8 +4250,9 @@  vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 					  TYPE_ALIGN (elem_type));
 		  misalign = DR_MISALIGNMENT (first_dr);
 		}
-	      set_ptr_info_alignment (get_ptr_info (dataref_ptr), align,
-				      misalign);
+	      if (dataref_offset == NULL_TREE)
+		set_ptr_info_alignment (get_ptr_info (dataref_ptr), align,
+					misalign);
 
 	      /* Arguments are ready.  Create the new vector stmt.  */
 	      new_stmt = gimple_build_assign (data_ref, vec_oprnd);
@@ -4314,6 +4384,7 @@  vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   tree dummy;
   enum dr_alignment_support alignment_support_scheme;
   tree dataref_ptr = NULL_TREE;
+  tree dataref_offset = NULL_TREE;
   gimple ptr_incr = NULL;
   int nunits = TYPE_VECTOR_SUBPARTS (vectype);
   int ncopies;
@@ -4947,9 +5018,32 @@  vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
     {
       /* 1. Create the vector or array pointer update chain.  */
       if (j == 0)
-        dataref_ptr = vect_create_data_ref_ptr (first_stmt, aggr_type, at_loop,
-						offset, &dummy, gsi,
-						&ptr_incr, false, &inv_p);
+	{
+	  bool simd_lane_access_p
+	    = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info);
+	  if (simd_lane_access_p
+	      && TREE_CODE (DR_BASE_ADDRESS (first_dr)) == ADDR_EXPR
+	      && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr), 0))
+	      && integer_zerop (DR_OFFSET (first_dr))
+	      && integer_zerop (DR_INIT (first_dr))
+	      && alias_sets_conflict_p (get_alias_set (aggr_type),
+					get_alias_set (DR_REF (first_dr)))
+	      && (alignment_support_scheme == dr_aligned
+		  || alignment_support_scheme == dr_unaligned_supported))
+	    {
+	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr));
+	      dataref_offset = build_int_cst (reference_alias_ptr_type
+					      (DR_REF (first_dr)), 0);
+	    }
+	  else
+	    dataref_ptr
+	      = vect_create_data_ref_ptr (first_stmt, aggr_type, at_loop,
+					  offset, &dummy, gsi, &ptr_incr,
+					  simd_lane_access_p, &inv_p);
+	}
+      else if (dataref_offset)
+	dataref_offset = int_const_binop (PLUS_EXPR, dataref_offset,
+					  TYPE_SIZE_UNIT (aggr_type));
       else
         dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi, stmt,
 				       TYPE_SIZE_UNIT (aggr_type));
@@ -4999,8 +5093,10 @@  vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 
 		    data_ref
 		      = build2 (MEM_REF, vectype, dataref_ptr,
-				build_int_cst (reference_alias_ptr_type
-					       (DR_REF (first_dr)), 0));
+				dataref_offset
+				? dataref_offset
+				: build_int_cst (reference_alias_ptr_type
+						 (DR_REF (first_dr)), 0));
 		    align = TYPE_ALIGN_UNIT (vectype);
 		    if (alignment_support_scheme == dr_aligned)
 		      {
@@ -5022,8 +5118,9 @@  vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 						TYPE_ALIGN (elem_type));
 			misalign = DR_MISALIGNMENT (first_dr);
 		      }
-		    set_ptr_info_alignment (get_ptr_info (dataref_ptr),
-					    align, misalign);
+		    if (dataref_offset == NULL_TREE)
+		      set_ptr_info_alignment (get_ptr_info (dataref_ptr),
+					      align, misalign);
 		    break;
 		  }
 		case dr_explicit_realign:
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 1ef31ee..6de914f 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -66,13 +66,209 @@  along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "tree-vectorizer.h"
 #include "tree-pass.h"
+#include "hash-table.h"
+#include "tree-ssa-propagate.h"
 
 /* Loop or bb location.  */
 LOC vect_location;
 
 /* Vector mapping GIMPLE stmt to stmt_vec_info. */
 vec<vec_void_p> stmt_vec_info_vec;
+
+/* For mapping simduid to vectorization factor.  */
+
+struct simduid_to_vf : typed_free_remove<simduid_to_vf>
+{
+  unsigned int simduid;
+  int vf;
+
+  /* hash_table support.  */
+  typedef simduid_to_vf value_type;
+  typedef simduid_to_vf compare_type;
+  static inline hashval_t hash (const value_type *);
+  static inline int equal (const value_type *, const compare_type *);
+};
+
+inline hashval_t
+simduid_to_vf::hash (const value_type *p)
+{
+  return p->simduid;
+}
+
+inline int
+simduid_to_vf::equal (const value_type *p1, const value_type *p2)
+{
+  return p1->simduid == p2->simduid;
+}
+
+/* For mapping decl to simduid.  */
+
+struct decl_to_simduid : typed_free_remove<decl_to_simduid>
+{
+  tree decl;
+  unsigned int simduid;
+
+  /* hash_table support.  */
+  typedef decl_to_simduid value_type;
+  typedef decl_to_simduid compare_type;
+  static inline hashval_t hash (const value_type *);
+  static inline int equal (const value_type *, const compare_type *);
+};
+
+inline hashval_t
+decl_to_simduid::hash (const value_type *p)
+{
+  return DECL_UID (p->decl);
+}
+
+inline int
+decl_to_simduid::equal (const value_type *p1, const value_type *p2)
+{
+  return p1->decl == p2->decl;
+}
+
+/* Fold IFN_GOMP_SIMD_LANE, IFN_GOMP_SIMD_VF and IFN_GOMP_SIMD_LAST_LANE
+   into their corresponding constants.  */
+
+static void
+adjust_simduid_builtins (hash_table <simduid_to_vf> &htab)
+{
+  basic_block bb;
+
+  FOR_EACH_BB (bb)
+    {
+      gimple_stmt_iterator i;
+
+      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+	{
+	  unsigned int vf = 1;
+	  enum internal_fn ifn;
+	  gimple stmt = gsi_stmt (i);
+	  tree t;
+	  if (!is_gimple_call (stmt)
+	      || !gimple_call_internal_p (stmt))
+	    continue;
+	  ifn = gimple_call_internal_fn (stmt);
+	  switch (ifn)
+	    {
+	    case IFN_GOMP_SIMD_LANE:
+	    case IFN_GOMP_SIMD_VF:
+	    case IFN_GOMP_SIMD_LAST_LANE:
+	      break;
+	    default:
+	      continue;
+	    }
+	  tree arg = gimple_call_arg (stmt, 0);
+	  gcc_assert (arg != NULL_TREE);
+	  gcc_assert (TREE_CODE (arg) == SSA_NAME);
+	  simduid_to_vf *p = NULL, data;
+	  data.simduid = DECL_UID (SSA_NAME_VAR (arg));
+	  if (htab.is_created ())
+	    p = htab.find (&data);
+	  if (p)
+	    vf = p->vf;
+	  switch (ifn)
+	    {
+	    case IFN_GOMP_SIMD_VF:
+	      t = build_int_cst (unsigned_type_node, vf);
+	      break;
+	    case IFN_GOMP_SIMD_LANE:
+	      t = build_int_cst (unsigned_type_node, 0);
+	      break;
+	    case IFN_GOMP_SIMD_LAST_LANE:
+	      t = gimple_call_arg (stmt, 1);
+	      break;
+	    default:
+	      gcc_unreachable ();
+	    }
+	  update_call_from_tree (&i, t);
+	}
+    }
+}
 
+/* Helper structure for note_simd_array_uses.  */
+
+struct note_simd_array_uses_struct
+{
+  hash_table <decl_to_simduid> *htab;
+  unsigned int simduid;
+};
+
+/* Callback for note_simd_array_uses, called through walk_gimple_op.  */
+
+static tree
+note_simd_array_uses_cb (tree *tp, int *walk_subtrees, void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
+  struct note_simd_array_uses_struct *ns
+    = (struct note_simd_array_uses_struct *) wi->info;
+
+  if (TYPE_P (*tp))
+    *walk_subtrees = 0;
+  else if (VAR_P (*tp)
+	   && lookup_attribute ("omp simd array", DECL_ATTRIBUTES (*tp))
+	   && DECL_CONTEXT (*tp) == current_function_decl)
+    {
+      decl_to_simduid data;
+      if (!ns->htab->is_created ())
+	ns->htab->create (15);
+      data.decl = *tp;
+      data.simduid = ns->simduid;
+      decl_to_simduid **slot = ns->htab->find_slot (&data, INSERT);
+      if (*slot == NULL)
+	{
+	  decl_to_simduid *p = XNEW (decl_to_simduid);
+	  *p = data;
+	  *slot = p;
+	}
+      else if ((*slot)->simduid != ns->simduid)
+	(*slot)->simduid = -1U;
+      *walk_subtrees = 0;
+    }
+  return NULL_TREE;
+}
+
+/* Find "omp simd array" temporaries and map them to corresponding
+   simduid.  */
+
+static void
+note_simd_array_uses (hash_table <decl_to_simduid> *htab)
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  struct walk_stmt_info wi;
+  struct note_simd_array_uses_struct ns;
+
+  memset (&wi, 0, sizeof (wi));
+  wi.info = &ns;
+  ns.htab = htab;
+
+  FOR_EACH_BB (bb)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (!is_gimple_call (stmt) || !gimple_call_internal_p (stmt))
+	  continue;
+	switch (gimple_call_internal_fn (stmt))
+	  {
+	  case IFN_GOMP_SIMD_LANE:
+	  case IFN_GOMP_SIMD_VF:
+	  case IFN_GOMP_SIMD_LAST_LANE:
+	    break;
+	  default:
+	    continue;
+	  }
+	tree lhs = gimple_call_lhs (stmt);
+	if (lhs == NULL_TREE)
+	  continue;
+	imm_use_iterator use_iter;
+	gimple use_stmt;
+	ns.simduid = DECL_UID (SSA_NAME_VAR (gimple_call_arg (stmt, 0)));
+	FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, lhs)
+	  if (!is_gimple_debug (use_stmt))
+	    walk_gimple_op (use_stmt, note_simd_array_uses_cb, &wi);
+      }
+}
 
 /* Function vectorize_loops.
 
@@ -86,12 +282,21 @@  vectorize_loops (void)
   unsigned int vect_loops_num;
   loop_iterator li;
   struct loop *loop;
+  hash_table <simduid_to_vf> simduid_to_vf_htab;
+  hash_table <decl_to_simduid> decl_to_simduid_htab;
 
   vect_loops_num = number_of_loops (cfun);
 
   /* Bail out if there are no loops.  */
   if (vect_loops_num <= 1)
-    return 0;
+    {
+      if (cfun->has_simduid_loops)
+	adjust_simduid_builtins (simduid_to_vf_htab);
+      return 0;
+    }
+
+  if (cfun->has_simduid_loops)
+    note_simd_array_uses (&decl_to_simduid_htab);
 
   init_stmt_vec_info_vec ();
 
@@ -101,7 +306,8 @@  vectorize_loops (void)
      than all previously defined loops.  This fact allows us to run
      only over initial loops skipping newly generated ones.  */
   FOR_EACH_LOOP (li, loop, 0)
-    if (optimize_loop_nest_for_speed_p (loop))
+    if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop))
+	|| loop->force_vect)
       {
 	loop_vec_info loop_vinfo;
 	vect_location = find_loop_location (loop);
@@ -122,6 +328,20 @@  vectorize_loops (void)
                            "Vectorized loop\n");
 	vect_transform_loop (loop_vinfo);
 	num_vectorized_loops++;
+	/* Now that the loop has been vectorized, allow it to be unrolled
+	   etc.  */
+	loop->force_vect = false;
+
+	if (loop->simduid)
+	  {
+	    simduid_to_vf *simduid_to_vf_data = XNEW (simduid_to_vf);
+	    if (!simduid_to_vf_htab.is_created ())
+	      simduid_to_vf_htab.create (15);
+	    simduid_to_vf_data->simduid = DECL_UID (loop->simduid);
+	    simduid_to_vf_data->vf = loop_vinfo->vectorization_factor;
+	    *simduid_to_vf_htab.find_slot (simduid_to_vf_data, INSERT)
+	      = simduid_to_vf_data;
+	  }
       }
 
   vect_location = UNKNOWN_LOC;
@@ -149,6 +369,40 @@  vectorize_loops (void)
 
   free_stmt_vec_info_vec ();
 
+  /* Fold IFN_GOMP_SIMD_{VF,LANE,LAST_LANE} builtins.  */
+  if (cfun->has_simduid_loops)
+    adjust_simduid_builtins (simduid_to_vf_htab);
+
+  /* Shrink any "omp array simd" temporary arrays to the
+     actual vectorization factors.  */
+  if (decl_to_simduid_htab.is_created ())
+    {
+      for (hash_table <decl_to_simduid>::iterator iter
+	   = decl_to_simduid_htab.begin ();
+	   iter != decl_to_simduid_htab.end (); ++iter)
+	if ((*iter).simduid != -1U)
+	  {
+	    tree decl = (*iter).decl;
+	    int vf = 1;
+	    if (simduid_to_vf_htab.is_created ())
+	      {
+		simduid_to_vf *p = NULL, data;
+		data.simduid = (*iter).simduid;
+		p = simduid_to_vf_htab.find (&data);
+		if (p)
+		  vf = p->vf;
+	      }
+	    tree atype
+	      = build_array_type_nelts (TREE_TYPE (TREE_TYPE (decl)), vf);
+	    TREE_TYPE (decl) = atype;
+	    relayout_decl (decl);
+	  }
+
+      decl_to_simduid_htab.dispose ();
+    }
+  if (simduid_to_vf_htab.is_created ())
+    simduid_to_vf_htab.dispose ();
+
   if (num_vectorized_loops > 0)
     {
       /* If we vectorized any loop only virtual SSA form needs to be updated.
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 7c5dfe8..3570dc9 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -576,6 +576,9 @@  typedef struct _stmt_vec_info {
   /* For loads only, true if this is a gather load.  */
   bool gather_p;
   bool stride_load_p;
+
+  /* For both loads and stores.  */
+  bool simd_lane_access_p;
 } *stmt_vec_info;
 
 /* Access Functions.  */
@@ -591,6 +594,7 @@  typedef struct _stmt_vec_info {
 #define STMT_VINFO_DATA_REF(S)             (S)->data_ref_info
 #define STMT_VINFO_GATHER_P(S)		   (S)->gather_p
 #define STMT_VINFO_STRIDE_LOAD_P(S)	   (S)->stride_load_p
+#define STMT_VINFO_SIMD_LANE_ACCESS_P(S)   (S)->simd_lane_access_p
 
 #define STMT_VINFO_DR_BASE_ADDRESS(S)      (S)->dr_base_address
 #define STMT_VINFO_DR_INIT(S)              (S)->dr_init
diff --git a/gcc/tree.c b/gcc/tree.c
index ab11735..c1a0e93 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -236,6 +236,8 @@  unsigned const char omp_clause_num_ops[] =
   4, /* OMP_CLAUSE_REDUCTION  */
   1, /* OMP_CLAUSE_COPYIN  */
   1, /* OMP_CLAUSE_COPYPRIVATE  */
+  2, /* OMP_CLAUSE_LINEAR  */
+  1, /* OMP_CLAUSE_UNIFORM  */
   1, /* OMP_CLAUSE_IF  */
   1, /* OMP_CLAUSE_NUM_THREADS  */
   1, /* OMP_CLAUSE_SCHEDULE  */
@@ -245,7 +247,9 @@  unsigned const char omp_clause_num_ops[] =
   3, /* OMP_CLAUSE_COLLAPSE  */
   0, /* OMP_CLAUSE_UNTIED   */
   1, /* OMP_CLAUSE_FINAL  */
-  0  /* OMP_CLAUSE_MERGEABLE  */
+  0, /* OMP_CLAUSE_MERGEABLE  */
+  1, /* OMP_CLAUSE_SAFELEN  */
+  1, /* OMP_CLAUSE__SIMDUID_  */
 };
 
 const char * const omp_clause_code_name[] =
@@ -258,6 +262,8 @@  const char * const omp_clause_code_name[] =
   "reduction",
   "copyin",
   "copyprivate",
+  "linear",
+  "uniform",
   "if",
   "num_threads",
   "schedule",
@@ -267,7 +273,9 @@  const char * const omp_clause_code_name[] =
   "collapse",
   "untied",
   "final",
-  "mergeable"
+  "mergeable",
+  "safelen",
+  "_simduid_"
 };
 
 
@@ -11033,6 +11041,9 @@  walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	case OMP_CLAUSE_IF:
 	case OMP_CLAUSE_NUM_THREADS:
 	case OMP_CLAUSE_SCHEDULE:
+	case OMP_CLAUSE_UNIFORM:
+	case OMP_CLAUSE_SAFELEN:
+	case OMP_CLAUSE__SIMDUID_:
 	  WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, 0));
 	  /* FALLTHRU */
 
@@ -11056,6 +11067,11 @@  walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	    WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
 	  }
 
+	case OMP_CLAUSE_LINEAR:
+	  WALK_SUBTREE (OMP_CLAUSE_DECL (*tp));
+	  WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, 1));
+	  WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
+
 	case OMP_CLAUSE_REDUCTION:
 	  {
 	    int i;
diff --git a/gcc/tree.def b/gcc/tree.def
index da30074..f825aad 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1030,6 +1030,10 @@  DEFTREECODE (OMP_TASK, "omp_task", tcc_statement, 2)
    unspecified by the standard.  */
 DEFTREECODE (OMP_FOR, "omp_for", tcc_statement, 6)
 
+/* OpenMP - #pragma omp simd [clause1 ... clauseN]
+   Operands like for OMP_FOR.  */
+DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
+
 /* OpenMP - #pragma omp sections [clause1 ... clauseN]
    Operand 0: OMP_SECTIONS_BODY: Sections body.
    Operand 1: OMP_SECTIONS_CLAUSES: List of clauses.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index b444517..b902e39 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -365,6 +365,12 @@  enum omp_clause_code
   /* OpenMP clause: copyprivate (variable_list).  */
   OMP_CLAUSE_COPYPRIVATE,
 
+  /* OpenMP clause: linear (variable-list[:linear-step]).  */
+  OMP_CLAUSE_LINEAR,
+
+  /* OpenMP clause: uniform (argument-list).  */
+  OMP_CLAUSE_UNIFORM,
+
   /* OpenMP clause: if (scalar-expression).  */
   OMP_CLAUSE_IF,
 
@@ -393,7 +399,13 @@  enum omp_clause_code
   OMP_CLAUSE_FINAL,
 
   /* OpenMP clause: mergeable.  */
-  OMP_CLAUSE_MERGEABLE
+  OMP_CLAUSE_MERGEABLE,
+
+  /* OpenMP clause: safelen (constant-integer-expression).  */
+  OMP_CLAUSE_SAFELEN,
+
+  /* Internally used only clause, holding SIMD uid.  */
+  OMP_CLAUSE__SIMDUID_
 };
 
 /* The definition of tree nodes fills the next several pages.  */
@@ -560,6 +572,9 @@  struct GTY(()) tree_base {
        OMP_CLAUSE_PRIVATE_DEBUG in
            OMP_CLAUSE_PRIVATE
 
+       OMP_CLAUSE_LINEAR_NO_COPYIN in
+	   OMP_CLAUSE_LINEAR
+
        TRANSACTION_EXPR_RELAXED in
 	   TRANSACTION_EXPR
 
@@ -580,6 +595,9 @@  struct GTY(()) tree_base {
        OMP_CLAUSE_PRIVATE_OUTER_REF in
 	   OMP_CLAUSE_PRIVATE
 
+       OMP_CLAUSE_LINEAR_NO_COPYOUT in
+	   OMP_CLAUSE_LINEAR
+
        TYPE_REF_IS_RVALUE in
 	   REFERENCE_TYPE
 
@@ -1800,7 +1818,7 @@  extern void protected_set_expr_location (tree, location_t);
 #define OMP_CLAUSE_DECL(NODE)      					\
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_RANGE_CHECK (OMP_CLAUSE_CHECK (NODE),	\
 					      OMP_CLAUSE_PRIVATE,	\
-	                                      OMP_CLAUSE_COPYPRIVATE), 0)
+	                                      OMP_CLAUSE_UNIFORM), 0)
 #define OMP_CLAUSE_HAS_LOCATION(NODE) \
   (LOCATION_LOCUS ((OMP_CLAUSE_CHECK (NODE))->omp_clause.locus)		\
   != UNKNOWN_LOCATION)
@@ -1867,6 +1885,25 @@  extern void protected_set_expr_location (tree, location_t);
 #define OMP_CLAUSE_REDUCTION_PLACEHOLDER(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_REDUCTION), 3)
 
+/* True if a LINEAR clause doesn't need copy in.  True for iterator vars which
+   are always initialized inside of the loop construct, false otherwise.  */
+#define OMP_CLAUSE_LINEAR_NO_COPYIN(NODE) \
+  (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR)->base.public_flag)
+
+/* True if a LINEAR clause doesn't need copy out.  True for iterator vars which
+   are declared inside of the simd construct.  */
+#define OMP_CLAUSE_LINEAR_NO_COPYOUT(NODE) \
+  TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR))
+
+#define OMP_CLAUSE_LINEAR_STEP(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR), 1)
+
+#define OMP_CLAUSE_SAFELEN_EXPR(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_SAFELEN), 0)
+
+#define OMP_CLAUSE__SIMDUID__DECL(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__SIMDUID_), 0)
+
 enum omp_clause_schedule_kind
 {
   OMP_CLAUSE_SCHEDULE_STATIC,
@@ -4783,6 +4820,7 @@  extern tree build_translation_unit_decl (tree);
 extern tree build_block (tree, tree, tree, tree);
 extern tree build_empty_stmt (location_t);
 extern tree build_omp_clause (location_t, enum omp_clause_code);
+extern tree find_omp_clause (tree, enum omp_clause_code);
 
 extern tree build_vl_exp_stat (enum tree_code, int MEM_STAT_DECL);
 #define build_vl_exp(c,n) build_vl_exp_stat (c,n MEM_STAT_INFO)