Pass -foffload targets from driver to libgomp at link time

Message ID alpine.DEB.2.10.1508272043510.20506@digraph.polyomino.org.uk
State New

Commit Message

Joseph Myers Aug. 27, 2015, 8:45 p.m. UTC
This patch, a version of
<https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01264.html> cleaned up
for trunk, arranges for the -foffload= targets specified at link time
to be passed to libgomp via a constructor function generated by the
driver.
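
As a rough illustration of the mechanism described above, the generated setup file might look something like the sketch below. This is not the file the driver actually emits: the constructor's name and the stub body of GOMP_set_offload_targets are assumptions made so the fragment compiles on its own, and "nvptx-none" stands in for whatever -foffload= value was given at link time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the libgomp entry point added by this patch.  The real
   function lives in libgomp/target.c; this stub only records its
   argument so the sketch is self-contained.  */
static const char *recorded_targets = NULL;

void
GOMP_set_offload_targets (const char *offload_targets)
{
  /* Like the real function, insist on being called only once.  */
  assert (recorded_targets == NULL);
  recorded_targets = offload_targets;
}

/* Hypothetical shape of the driver-generated constructor for a link
   with -foffload=nvptx-none.  The function name is an assumption.  */
__attribute__ ((constructor))
static void
omp_setup_offload_targets (void)
{
  GOMP_set_offload_targets ("nvptx-none");
}
```

Because the function carries the constructor attribute, it runs before main, which is what lets libgomp see the link-time target list before any plugin is loaded.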

In this patch, I've tried to remove all the miscellaneous cleanups in
the gomp-4_0-branch version that didn't appear to be necessarily
required as part of passing -foffload from the driver to libgomp.
Thus, care should be taken when next merging from trunk to
gomp-4_0-branch not to lose those cleanups where patch conflicts
arise, where the cleanups are still desired for merging to trunk
separately.  It's possible I missed some such changes; thus, this
patch should be reviewed carefully to make sure there isn't anything
unrelated mixed in.

This patch uses GOMP_4.0.2 as the symbol version for the new function
GOMP_set_offload_targets (where the gomp-4_0-branch patch had
GOACC_2.0.GOMP_4_BRANCH).  I hope this is the correct version for a
GOMP_* function that is new in GCC 6.

Tested with no regressions for x86_64-none-linux-gnu, offloading to
nvptx-none; 24 libgomp test FAILs start to pass with the patch.  OK to
commit?

gcc:
2015-08-27  Thomas Schwinge  <thomas@codesourcery.com>
	    Joseph Myers  <joseph@codesourcery.com>

	* gcc.c (offload_targets): Update comment.
	(add_omp_infile_spec_func, spec_lang_mask_accept): New.
	(driver_self_specs) [ENABLE_OFFLOADING]: Add spec to use
	%:add-omp-infile().
	(static_spec_functions): Add add-omp-infile.
	(struct switchstr): Add lang_mask field.
	(struct infile): Add lang_mask field.
	(add_infile, save_switch, do_spec): Add lang_mask argument.
	(driver_unknown_option_callback, driver_wrong_lang_callback)
	(driver_handle_option, process_command, do_self_spec)
	(driver::do_spec_on_infiles): All callers changed.
	(process_command): Call handle_foffload_option (OFFLOAD_TARGETS)
	if no offload target specified.
	(give_switch): Check languages of switch against
	spec_lang_mask_accept.
	(driver::maybe_putenv_OFFLOAD_TARGETS): Do not use intermediate
	targets variable.
	* gcc.h (do_spec): Update prototype.

gcc/fortran:
2015-08-27  Joseph Myers  <joseph@codesourcery.com>

	* gfortranspec.c (lang_specific_pre_link): Update call to do_spec.

gcc/java:
2015-08-27  Joseph Myers  <joseph@codesourcery.com>

	* jvspec.c (lang_specific_pre_link): Update call to do_spec.

libgomp:
2015-08-27  Thomas Schwinge  <thomas@codesourcery.com>
	    Joseph Myers  <joseph@codesourcery.com>

	* plugin/configfrag.ac (tgt_name): Do not set.
	(offload_targets): Separate with colons not commas.
	* config.h.in, configure: Regenerate.
	* libgomp.map (GOMP_4.0.2): Add GOMP_set_offload_targets.
	* libgomp_g.h (GOMP_set_offload_targets): New prototype.
	* target.c (offload_target_to_plugin_name, gomp_offload_targets)
	(gomp_offload_targets_init, GOMP_set_offload_targets): New.
	(gomp_target_init): Use gomp_offload_targets instead of
	OFFLOAD_TARGETS.  Handle and rewrite colon-separated string.
	* testsuite/lib/libgomp.exp: Expect offload targets to be
	colon-separated.  Adjust matching of offload targets.
	(libgomp_init)
	(check_effective_target_openacc_nvidia_accel_supported)
	(check_effective_target_openacc_host_selected): Adjust checks of
	offload target names.
	* testsuite/libgomp.oacc-c++/c++.exp: Adjust set of offload
	targets.  Use -foffload=.
	* testsuite/libgomp.oacc-c/c.exp: Adjust set of offload targets.
	Use -foffload=.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Adjust set of
	offload targets.  Use -foffload=.
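
The target.c entries above (colon-separated list, offload target name translated to a plugin name) can be sketched roughly as follows. The mapping mirrors offload_target_to_plugin_name from the patch, except that it returns NULL where libgomp calls gomp_fatal; split_offload_targets and its fixed-size buffer are hypothetical helpers for illustration only, not code from libgomp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same mapping as in the patch, but returning NULL for an unknown
   target instead of aborting.  */
static const char *
target_to_plugin (const char *target)
{
  if (strstr (target, "-intelmic") != NULL)
    return "intelmic";
  if (strncmp (target, "nvptx", 5) == 0)
    return "nvptx";
  return NULL;
}

/* Hypothetical helper: walk the colon-separated list TARGETS and store
   up to MAX plugin names in OUT.  Returns the number stored, or -1 for
   an unknown or over-long target name.  */
static int
split_offload_targets (const char *targets, const char **out, int max)
{
  int n = 0;
  const char *cur = targets;

  while (*cur != '\0' && n < max)
    {
      const char *next = strchr (cur, ':');
      size_t len = next ? (size_t) (next - cur) : strlen (cur);
      char buf[64];

      if (len >= sizeof buf)
	return -1;
      /* Copy one list element and NUL-terminate it before lookup.  */
      memcpy (buf, cur, len);
      buf[len] = '\0';

      const char *plugin = target_to_plugin (buf);
      if (plugin == NULL)
	return -1;
      out[n++] = plugin;

      if (next == NULL)
	break;
      cur = next + 1;
    }
  return n;
}
```

An empty list yields zero plugins, which corresponds to the -foffload=disable case described in the gcc.c comment.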

Comments

Joseph Myers Sept. 3, 2015, 2:52 p.m. UTC | #1
Ping.  This patch 
<https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01748.html> is pending 
review.
Joseph Myers Sept. 10, 2015, 1:41 p.m. UTC | #2
Ping^2.  This patch 
<https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01748.html> is still 
pending review.
Bernd Schmidt Sept. 10, 2015, 2:02 p.m. UTC | #3
On 09/10/2015 03:41 PM, Joseph Myers wrote:
> Ping^2.  This patch 
> <https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01748.html> is still 
> pending review.

No fundamental objections, but I have some questions. Could you describe
what the handling of flags/lang_mask accomplishes in this patch? Would
option handling be simpler if the creation/compilation of the extra file
happened in lto_wrapper (where we already do similar things through
mkoffload)?

I initially thought the information you're giving to
GOMP_set_offload_targets is already available implicitly, from the calls
to GOMP_offload_register. But digging through the archives it sounds
like the problem is that if there's no offloadable code, no offload
image will be generated. Is that correct?



Bernd
Joseph Myers Sept. 11, 2015, 2:23 p.m. UTC | #4
On Thu, 10 Sep 2015, Bernd Schmidt wrote:

> On 09/10/2015 03:41 PM, Joseph Myers wrote:
> > Ping^2.  This patch 
> > <https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01748.html> is still 
> > pending review.
> 
> No fundamental objections, but I have some questions. Could you describe
> what the handling of flags/lang_mask accomplishes in this patch? Would
> option handling be simpler if the creation/compilation of the extra file
> happened in lto_wrapper (where we already do similar things through
> mkoffload)?

The point of the lang_mask handling is that if, say, we're compiling C++ 
or Fortran code, with options that aren't valid for C, we mustn't pass 
those options to cc1 when building the constructor as C code, but we do 
still need to pass options valid for C (which might e.g. affect the ABI).
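
A minimal sketch of that filtering rule, under stated assumptions: the LANG_* bits and struct sw are invented for illustration (GCC's real CL_* language flags are generated at build time), and a mask of zero is taken here to mean "not tied to any language". The actual check in give_switch may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative language bits only; not GCC's real CL_* values.  */
#define LANG_C       (1u << 0)
#define LANG_CXX     (1u << 1)
#define LANG_FORTRAN (1u << 2)

/* Hypothetical, cut-down analogue of struct switchstr.  */
struct sw
{
  const char *name;
  unsigned int lang_mask;
};

/* Return nonzero if switch S should be passed to a compiler accepting
   the languages in ACCEPT.  ACCEPT == 0 disables filtering, mirroring
   the "if nonzero" wording of the spec_lang_mask_accept comment.  */
static int
switch_passes (const struct sw *s, unsigned int accept)
{
  if (accept == 0)
    return 1;
  /* Assumption: no language bits means the option is language-independent.  */
  if (s->lang_mask == 0)
    return 1;
  return (s->lang_mask & accept) != 0;
}
```

With this rule, a C++-only option is dropped when building the constructor as C, while a language-independent option that might affect the ABI is still passed through.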

There's an argument that this sort of option filtering should be done more 
generally.  That is, if we have a mixed-language compilation in a single 
call to the driver, it should filter the options so that cc1 gets those 
options applicable for C, cc1plus those applicable to C++, etc., with 
options for inappropriate languages only being diagnosed if none of the 
source files are for that language.  I don't know if that's the right 
thing to do or not, but it's at least plausible.

I don't see lto-wrapper as being any easier as a place to do this; no 
doubt lto-wrapper or collect2 could create the file and call back into the 
driver to compile it, but I don't see the advantage in doing that over 
having the driver (which already has all the relevant information, since 
it's coming from the command line rather than inspection of object files 
being linked) do it.

> I initially thought the information you're giving to
> GOMP_set_offload_targets is already available implicitly, from the calls
> to GOMP_offload_register. But digging through the archives it sounds
> like the problem is that if there's no offloadable code, no offload
> image will be generated. Is that correct?

Yes.  In the message Thomas referred to, "On the other hand, for example, 
for -foffload=nvptx-none, even if user program code doesn't contain any 
offloaded data (and thus the offload machinery has not been run), the user 
program might still contain any executable directives or OpenACC runtime 
library calls, so we'd still like to use the libgomp nvptx plugin.  
However, we currently cannot detect this situation.".
Bernd Schmidt Sept. 11, 2015, 2:46 p.m. UTC | #5
On 09/11/2015 04:23 PM, Joseph Myers wrote:
> On Thu, 10 Sep 2015, Bernd Schmidt wrote:
>
>> On 09/10/2015 03:41 PM, Joseph Myers wrote:
>>> Ping^2.  This patch
>>> <https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01748.html> is still
>>> pending review.
>>
>> No fundamental objections, but I have some questions. Could you describe
>> what the handling of flags/lang_mask accomplishes in this patch? Would
>> option handling be simpler if the creation/compilation of the extra file
>> happened in lto_wrapper (where we already do similar things through
>> mkoffload)?
>
> The point of the lang_mask handling is that if, say, we're compiling C++
> or Fortran code, with options that aren't valid for C, we mustn't pass
> those options to cc1 when building the constructor as C code, but we do
> still need to pass options valid for C (which might e.g. affect the ABI).
[...]
> I don't see lto-wrapper as being any easier as a place to do this; no
> doubt lto-wrapper or collect2 could create the file and call back into the
> driver to compile it, but I don't see the advantage in doing that over
> having the driver (which already has all the relevant information, since
> it's coming from the command line rather than inspection of object files
> being linked) do it.

The point would be that lto_wrapper already produces such an appropriate 
set of options. But I guess if you're thinking ahead to using this 
filtering in gcc.c for other purposes then that's also a good argument. 
So, patch is ok, but please update the comment for give_switch (document 
the new behaviour and that it depends on a global variable).

I expect you know best what to do in the OpenACC testsuite driver, but 
you might want to run the libgomp.exp parts by Jakub. If the testsuite 
parts are independent of the rest of the patch, please repost them 
separately.


Bernd
Joseph Myers Sept. 11, 2015, 3:26 p.m. UTC | #6
On Fri, 11 Sep 2015, Bernd Schmidt wrote:

> I expect you know best what to do in the OpenACC testsuite driver, but you
> might want to run the libgomp.exp parts by Jakub. If the testsuite parts are
> independent of the rest of the patch, please repost them separately.

Jakub?  The testsuite changes and the rest of the patch depend on each 
other.
Jakub Jelinek Sept. 11, 2015, 3:43 p.m. UTC | #7
On Fri, Sep 11, 2015 at 03:26:04PM +0000, Joseph Myers wrote:
> On Fri, 11 Sep 2015, Bernd Schmidt wrote:
> 
> > I expect you know best what to do in the OpenACC testsuite driver, but you
> > might want to run the libgomp.exp parts by Jakub. If the testsuite parts are
> > independent of the rest of the patch, please repost them separately.
> 
> Jakub?  The testsuite changes and the rest of the patch depend on each 
> other.

So, do I understand well that you'll call GOMP_set_offload_targets from
constructors of all shared libraries (and the binary) that contain offloaded
code?  If yes, that is surely going to fail the assertions in there.
You can dlopen such libraries etc.  What if you link one library with
-foffload=nvptx-none and another one with -foffload=x86_64-intelmicemul-linux?
Can't the -foffload= string be passed to GOMP_offload_register_ver
(or just derive the list of plugins that should be loaded or at least those
that should be tried first from the list of offloaded data that has been
registered so far)?
I mean, it is also very well possible some program calls omp_get_num_devices
() etc. say from main binary and only then dlopens shared libraries that
contain offloaded regions and then attempt to offload in those shared
libraries.  So, better it should always load all possible plugins, but
perhaps in order determined by what has been registered?

	Jakub
Joseph Myers Sept. 11, 2015, 3:59 p.m. UTC | #8
On Fri, 11 Sep 2015, Jakub Jelinek wrote:

> On Fri, Sep 11, 2015 at 03:26:04PM +0000, Joseph Myers wrote:
> > On Fri, 11 Sep 2015, Bernd Schmidt wrote:
> > 
> > > I expect you know best what to do in the OpenACC testsuite driver, but you
> > > might want to run the libgomp.exp parts by Jakub. If the testsuite parts are
> > > independent of the rest of the patch, please repost them separately.
> > 
> > Jakub?  The testsuite changes and the rest of the patch depend on each 
> > other.
> 
> So, do I understand well that you'll call GOMP_set_offload_targets from
> constructors of all shared libraries (and the binary) that contain offloaded
> code?  If yes, that is surely going to fail the assertions in there.
> You can dlopen such libraries etc.  What if you link one library with
> -foffload=nvptx-none and another one with -foffload=x86_64-intelmicemul-linux?

Thomas (I think you're back next week), any comments on how shared 
libraries with different offloading selected fit into your design 
(including the case where some but not all of the executable / shared 
libraries specify -foffload=disable)?

Patch

Index: libgomp/config.h.in
===================================================================
--- libgomp/config.h.in	(revision 227194)
+++ libgomp/config.h.in	(working copy)
@@ -95,7 +95,7 @@ 
    */
 #undef LT_OBJDIR
 
-/* Define to hold the list of target names suitable for offloading. */
+/* Define to hold the list of offload targets, separated by colons. */
 #undef OFFLOAD_TARGETS
 
 /* Name of package */
Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 227194)
+++ libgomp/target.c	(working copy)
@@ -1209,6 +1209,41 @@  gomp_load_plugin_for_device (struct gomp_device_de
   return 0;
 }
 
+/* Return the corresponding plugin name for the offload target name
+   OFFLOAD_TARGET.  */
+
+static const char *
+offload_target_to_plugin_name (const char *offload_target)
+{
+  if (strstr (offload_target, "-intelmic") != NULL)
+    return "intelmic";
+  if (strncmp (offload_target, "nvptx", 5) == 0)
+    return "nvptx";
+  gomp_fatal ("Unknown offload target: %s", offload_target);
+}
+
+/* List of offload targets, separated by colon.  Defaults to the list
+   determined when configuring libgomp.  */
+static const char *gomp_offload_targets = OFFLOAD_TARGETS;
+static bool gomp_offload_targets_init = false;
+
+/* Override the list of offload targets with OFFLOAD_TARGETS, the set
+   passed to the compiler at link time.  This must be called early,
+   and only once.  */
+
+void
+GOMP_set_offload_targets (const char *offload_targets)
+{
+  gomp_debug (0, "%s (\"%s\")\n", __FUNCTION__, offload_targets);
+
+  /* Make sure this gets called early.  */
+  assert (gomp_is_initialized == PTHREAD_ONCE_INIT);
+  /* Make sure this only gets called once.  */
+  assert (!gomp_offload_targets_init);
+  gomp_offload_targets_init = true;
+  gomp_offload_targets = offload_targets;
+}
+
 /* This function initializes the runtime needed for offloading.
    It parses the list of offload targets and tries to load the plugins for
    these targets.  On return, the variables NUM_DEVICES and NUM_DEVICES_OPENMP
@@ -1228,26 +1263,45 @@  gomp_target_init (void)
   num_devices = 0;
   devices = NULL;
 
-  cur = OFFLOAD_TARGETS;
+  cur = gomp_offload_targets;
   if (*cur)
     do
       {
 	struct gomp_device_descr current_device;
 
-	next = strchr (cur, ',');
-
-	plugin_name = (char *) malloc (1 + (next ? next - cur : strlen (cur))
-				       + strlen (prefix) + strlen (suffix));
+	next = strchr (cur, ':');
+	size_t prefix_len = strlen (prefix);
+	size_t cur_len = next ? next - cur : strlen (cur);
+	size_t suffix_len = strlen (suffix);
+	plugin_name = (char *) malloc (prefix_len
+				       + cur_len
+				       + suffix_len
+				       + 1);
 	if (!plugin_name)
 	  {
 	    num_devices = 0;
 	    break;
 	  }
+	memcpy (plugin_name, prefix, prefix_len);
+	memcpy (plugin_name + prefix_len, cur, cur_len);
+	/* NUL-terminate the string here...  */
+	plugin_name[prefix_len + cur_len] = '\0';
+	/* ..., so that we can then use it to translate the offload target to
+	   the plugin name...  */
+	const char *cur_plugin_name
+	  = offload_target_to_plugin_name (plugin_name
+					   + prefix_len);
+	size_t cur_plugin_name_len = strlen (cur_plugin_name);
+	assert (cur_plugin_name_len <= cur_len);
+	/* ..., and then rewrite it.  */
+	memcpy (plugin_name + prefix_len,
+		cur_plugin_name, cur_plugin_name_len);
+	memcpy (plugin_name + prefix_len + cur_plugin_name_len,
+		suffix, suffix_len);
+	plugin_name[prefix_len
+		    + cur_plugin_name_len
+		    + suffix_len] = '\0';
 
-	strcpy (plugin_name, prefix);
-	strncat (plugin_name, cur, next ? next - cur : strlen (cur));
-	strcat (plugin_name, suffix);
-
 	if (gomp_load_plugin_for_device (&current_device, plugin_name))
 	  {
 	    new_num_devices = current_device.get_num_devices_func ();
Index: libgomp/configure
===================================================================
--- libgomp/configure	(revision 227194)
+++ libgomp/configure	(working copy)
@@ -15236,10 +15236,8 @@  if test x"$enable_offload_targets" != x; then
     tgt=`echo $tgt | sed 's/=.*//'`
     case $tgt in
       *-intelmic-* | *-intelmicemul-*)
-	tgt_name=intelmic
 	;;
       nvptx*)
-        tgt_name=nvptx
 	PLUGIN_NVPTX=$tgt
 	PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS
 	PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS
@@ -15282,9 +15280,9 @@  rm -f core conftest.err conftest.$ac_objext \
 	;;
     esac
     if test x"$offload_targets" = x; then
-      offload_targets=$tgt_name
+      offload_targets=$tgt
     else
-      offload_targets=$offload_targets,$tgt_name
+      offload_targets=$offload_targets:$tgt
     fi
     if test x"$tgt_dir" != x; then
       offload_additional_options="$offload_additional_options -B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) -B$tgt_dir/bin"
Index: libgomp/libgomp_g.h
===================================================================
--- libgomp/libgomp_g.h	(revision 227194)
+++ libgomp/libgomp_g.h	(working copy)
@@ -206,6 +206,7 @@  extern void GOMP_single_copy_end (void *);
 
 /* target.c */
 
+extern void GOMP_set_offload_targets (const char *);
 extern void GOMP_target (int, void (*) (void *), const void *,
 			 size_t, void **, size_t *, unsigned char *);
 extern void GOMP_target_data (int, const void *,
Index: libgomp/plugin/configfrag.ac
===================================================================
--- libgomp/plugin/configfrag.ac	(revision 227194)
+++ libgomp/plugin/configfrag.ac	(working copy)
@@ -92,10 +92,8 @@  if test x"$enable_offload_targets" != x; then
     tgt=`echo $tgt | sed 's/=.*//'`
     case $tgt in
       *-intelmic-* | *-intelmicemul-*)
-	tgt_name=intelmic
 	;;
       nvptx*)
-        tgt_name=nvptx
 	PLUGIN_NVPTX=$tgt
 	PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS
 	PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS
@@ -127,9 +125,9 @@  if test x"$enable_offload_targets" != x; then
 	;;
     esac
     if test x"$offload_targets" = x; then
-      offload_targets=$tgt_name
+      offload_targets=$tgt
     else
-      offload_targets=$offload_targets,$tgt_name
+      offload_targets=$offload_targets:$tgt
     fi
     if test x"$tgt_dir" != x; then
       offload_additional_options="$offload_additional_options -B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) -B$tgt_dir/bin"
@@ -141,7 +139,7 @@  if test x"$enable_offload_targets" != x; then
   done
 fi
 AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
-  [Define to hold the list of target names suitable for offloading.])
+  [Define to hold the list of offload targets, separated by colons.])
 AM_CONDITIONAL([PLUGIN_NVPTX], [test $PLUGIN_NVPTX = 1])
 AC_DEFINE_UNQUOTED([PLUGIN_NVPTX], [$PLUGIN_NVPTX],
   [Define to 1 if the NVIDIA plugin is built, 0 if not.])
Index: libgomp/libgomp.map
===================================================================
--- libgomp/libgomp.map	(revision 227194)
+++ libgomp/libgomp.map	(working copy)
@@ -238,6 +238,7 @@  GOMP_4.0.2 {
   global:
 	GOMP_offload_register_ver;
 	GOMP_offload_unregister_ver;
+	GOMP_set_offload_targets;
 } GOMP_4.0.1;
 
 OACC_2.0 {
Index: libgomp/testsuite/libgomp.oacc-c++/c++.exp
===================================================================
--- libgomp/testsuite/libgomp.oacc-c++/c++.exp	(revision 227194)
+++ libgomp/testsuite/libgomp.oacc-c++/c++.exp	(working copy)
@@ -75,13 +75,12 @@  if { $lang_test_file_found } {
 
     # Test OpenACC with available accelerators.
     foreach offload_target_openacc $offload_targets_s_openacc {
-	set tagopt "-DACC_DEVICE_TYPE_$offload_target_openacc=1"
-
-	switch $offload_target_openacc {
-	    host {
+	switch -glob $offload_target_openacc {
+	    disable {
 		set acc_mem_shared 1
+		set tagopt "-DACC_DEVICE_TYPE_host=1"
 	    }
-	    nvidia {
+	    nvptx* {
 		if { ![check_effective_target_openacc_nvidia_accel_present] } {
 		    # Don't bother; execution testing is going to FAIL.
 		    untested "$subdir $offload_target_openacc offloading"
@@ -95,15 +94,14 @@  if { $lang_test_file_found } {
 		lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/libgomp.oacc-c-c++-common"
 
 		set acc_mem_shared 0
+		set tagopt "-DACC_DEVICE_TYPE_nvidia=1"
 	    }
 	    default {
 		set acc_mem_shared 0
 	    }
 	}
-	set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared"
+	set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared -foffload=$offload_target_openacc"
 
-	setenv ACC_DEVICE_TYPE $offload_target_openacc
-
 	dg-runtest $tests "$tagopt" "$libstdcxx_includes $DEFAULT_CFLAGS"
     }
 }
Index: libgomp/testsuite/lib/libgomp.exp
===================================================================
--- libgomp/testsuite/lib/libgomp.exp	(revision 227194)
+++ libgomp/testsuite/lib/libgomp.exp	(working copy)
@@ -36,24 +36,21 @@  load_gcc_lib fortran-modules.exp
 load_file libgomp-test-support.exp
 
 # Populate offload_targets_s (offloading targets separated by a space), and
-# offload_targets_s_openacc (the same, but with OpenACC names; OpenACC spells
-# some of them a little differently).
-set offload_targets_s [split $offload_targets ","]
+# offload_targets_s_openacc (those suitable for OpenACC).
+set offload_targets_s [split $offload_targets ":"]
 set offload_targets_s_openacc {}
 foreach offload_target_openacc $offload_targets_s {
-    switch $offload_target_openacc {
-	intelmic {
+    switch -glob $offload_target_openacc {
+	*-intelmic* {
 	    # Skip; will all FAIL because of missing
 	    # GOMP_OFFLOAD_CAP_OPENACC_200.
 	    continue
 	}
-	nvptx {
-	    set offload_target_openacc "nvidia"
-	}
     }
     lappend offload_targets_s_openacc "$offload_target_openacc"
 }
-lappend offload_targets_s_openacc "host"
+# Host fallback.
+lappend offload_targets_s_openacc "disable"
 
 set dg-do-what-default run
 
@@ -134,7 +131,7 @@  proc libgomp_init { args } {
     # Add liboffloadmic build directory in LD_LIBRARY_PATH to support
     # non-fallback testing for Intel MIC targets
     global offload_targets
-    if { [string match "*,intelmic,*" ",$offload_targets,"] } {
+    if { [string match "*:*-intelmic*:*" ":$offload_targets:"] } {
 	append always_ld_library_path ":${blddir}/../liboffloadmic/.libs"
 	append always_ld_library_path ":${blddir}/../liboffloadmic/plugin/.libs"
 	# libstdc++ is required by liboffloadmic
@@ -332,15 +329,14 @@  proc check_effective_target_openacc_nvidia_accel_p
 }
 
 # Return 1 if at least one nvidia board is present, and the nvidia device type
-# is selected by default by means of setting the environment variable
-# ACC_DEVICE_TYPE.
+# is selected by default.
 
 proc check_effective_target_openacc_nvidia_accel_selected { } {
     if { ![check_effective_target_openacc_nvidia_accel_present] } {
 	return 0;
     }
     global offload_target_openacc
-    if { $offload_target_openacc == "nvidia" } {
+    if { [string match "nvptx*" $offload_target_openacc] } {
         return 1;
     }
     return 0;
@@ -350,7 +346,7 @@  proc check_effective_target_openacc_nvidia_accel_s
 
 proc check_effective_target_openacc_host_selected { } {
     global offload_target_openacc
-    if { $offload_target_openacc == "host" } {
+    if { $offload_target_openacc == "disable" } {
         return 1;
     }
     return 0;
Index: libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
===================================================================
--- libgomp/testsuite/libgomp.oacc-fortran/fortran.exp	(revision 227194)
+++ libgomp/testsuite/libgomp.oacc-fortran/fortran.exp	(working copy)
@@ -67,13 +67,12 @@  if { $lang_test_file_found } {
 
     # Test OpenACC with available accelerators.
     foreach offload_target_openacc $offload_targets_s_openacc {
-	set tagopt "-DACC_DEVICE_TYPE_$offload_target_openacc=1"
-
-	switch $offload_target_openacc {
-	    host {
+	switch -glob $offload_target_openacc {
+	    disable {
 		set acc_mem_shared 1
+		set tagopt "-DACC_DEVICE_TYPE_host=1"
 	    }
-	    nvidia {
+	    nvptx* {
 		if { ![check_effective_target_openacc_nvidia_accel_present] } {
 		    # Don't bother; execution testing is going to FAIL.
 		    untested "$subdir $offload_target_openacc offloading"
@@ -81,15 +80,14 @@  if { $lang_test_file_found } {
 		}
 
 		set acc_mem_shared 0
+		set tagopt "-DACC_DEVICE_TYPE_nvidia=1"
 	    }
 	    default {
 		set acc_mem_shared 0
 	    }
 	}
-	set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared"
+	set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared -foffload=$offload_target_openacc"
 
-	setenv ACC_DEVICE_TYPE $offload_target_openacc
-
 	# For Fortran we're doing torture testing, as Fortran has far more tests
 	# with arrays etc. that testing just -O0 or -O2 is insufficient, that is
 	# typically not the case for C/C++.
Index: libgomp/testsuite/libgomp.oacc-c/c.exp
===================================================================
--- libgomp/testsuite/libgomp.oacc-c/c.exp	(revision 227194)
+++ libgomp/testsuite/libgomp.oacc-c/c.exp	(working copy)
@@ -38,13 +38,13 @@  set_ld_library_path_env_vars
 set SAVE_ALWAYS_CFLAGS "$ALWAYS_CFLAGS"
 foreach offload_target_openacc $offload_targets_s_openacc {
     set ALWAYS_CFLAGS "$SAVE_ALWAYS_CFLAGS"
-    set tagopt "-DACC_DEVICE_TYPE_$offload_target_openacc=1"
 
-    switch $offload_target_openacc {
-	host {
+    switch -glob $offload_target_openacc {
+	disable {
 	    set acc_mem_shared 1
+	    set tagopt "-DACC_DEVICE_TYPE_host=1"
 	}
-	nvidia {
+	nvptx* {
 	    if { ![check_effective_target_openacc_nvidia_accel_present] } {
 		# Don't bother; execution testing is going to FAIL.
 		untested "$subdir $offload_target_openacc offloading"
@@ -58,15 +58,14 @@  foreach offload_target_openacc $offload_targets_s_
 	    lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/libgomp.oacc-c-c++-common"
 
 	    set acc_mem_shared 0
+	    set tagopt "-DACC_DEVICE_TYPE_nvidia=1"
 	}
 	default {
 	    set acc_mem_shared 0
 	}
     }
-    set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared"
+    set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared -foffload=$offload_target_openacc"
 
-    setenv ACC_DEVICE_TYPE $offload_target_openacc
-
     dg-runtest $tests "$tagopt" $DEFAULT_CFLAGS
 }
 
Index: gcc/java/jvspec.c
===================================================================
--- gcc/java/jvspec.c	(revision 227194)
+++ gcc/java/jvspec.c	(working copy)
@@ -629,7 +629,7 @@  lang_specific_pre_link (void)
      class name.  Append dummy `.c' that can be stripped by set_input so %b
      is correct.  */ 
   set_input (concat (main_class_name, "main.c", NULL));
-  err = do_spec (jvgenmain_spec);
+  err = do_spec (jvgenmain_spec, 0);
   if (err == 0)
     {
       /* Shift the outfiles array so the generated main comes first.
Index: gcc/gcc.c
===================================================================
--- gcc/gcc.c	(revision 227194)
+++ gcc/gcc.c	(working copy)
@@ -284,7 +284,7 @@  static const char *const spec_version = DEFAULT_TA
 static const char *spec_machine = DEFAULT_TARGET_MACHINE;
 static const char *spec_host_machine = DEFAULT_REAL_TARGET_MACHINE;
 
-/* List of offload targets.  */
+/* List of offload targets.  Empty string for -foffload=disable.  */
 
 static char *offload_targets = NULL;
 
@@ -400,6 +400,8 @@  static const char *compare_debug_auxbase_opt_spec_
 static const char *pass_through_libs_spec_func (int, const char **);
 static const char *replace_extension_spec_func (int, const char **);
 static const char *greater_than_spec_func (int, const char **);
+static const char *add_omp_infile_spec_func (int, const char **);
+
 static char *convert_white_space (char *);
 
 /* The Specs Language
@@ -1186,6 +1188,11 @@  static const char *const multilib_defaults_raw[] =
 
 static const char *const driver_self_specs[] = {
   "%{fdump-final-insns:-fdump-final-insns=.} %<fdump-final-insns",
+#ifdef ENABLE_OFFLOADING
+  /* If linking against libgomp, add a setup file.  */
+  "%{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*} 1):" \
+  "%:add-omp-infile()}",
+#endif /* ENABLE_OFFLOADING */
   DRIVER_SELF_SPECS, CONFIGURE_SPECS, GOMP_SELF_SPECS, GTM_SELF_SPECS,
   CILK_SELF_SPECS
 };
@@ -1613,6 +1620,7 @@  static const struct spec_function static_spec_func
   { "pass-through-libs",	pass_through_libs_spec_func },
   { "replace-extension",	replace_extension_spec_func },
   { "gt",			greater_than_spec_func },
+  { "add-omp-infile",		add_omp_infile_spec_func },
 #ifdef EXTRA_SPEC_FUNCTIONS
   EXTRA_SPEC_FUNCTIONS
 #endif
@@ -3209,7 +3217,8 @@  execute (void)
    The `validated' field describes whether any spec has looked at this switch;
    if it remains false at the end of the run, the switch must be meaningless.
    The `ordering' field is used to temporarily mark switches that have to be
-   kept in a specific order.  */
+   kept in a specific order.
+   The `lang_mask' field stores the flags associated with this option.  */
 
 #define SWITCH_LIVE    			(1 << 0)
 #define SWITCH_FALSE   			(1 << 1)
@@ -3225,6 +3234,7 @@  struct switchstr
   bool known;
   bool validated;
   bool ordering;
+  unsigned int lang_mask;
 };
 
 static struct switchstr *switches;
@@ -3233,6 +3243,10 @@  static int n_switches;
 
 static int n_switches_alloc;
 
+/* If nonzero, do not pass through switches for languages not matching
+   this mask.  */
+static unsigned int spec_lang_mask_accept;
+
 /* Set to zero if -fcompare-debug is disabled, positive if it's
    enabled and we're running the first compilation, negative if it's
    enabled and we're running the second compilation.  For most of the
@@ -3270,6 +3284,7 @@  struct infile
   const char *name;
   const char *language;
   struct compiler *incompiler;
+  unsigned int lang_mask;
   bool compiled;
   bool preprocessed;
 };
@@ -3463,15 +3478,16 @@  alloc_infile (void)
     }
 }
 
-/* Store an input file with the given NAME and LANGUAGE in
+/* Store an input file with the given NAME and LANGUAGE and LANG_MASK in
    infiles.  */
 
 static void
-add_infile (const char *name, const char *language)
+add_infile (const char *name, const char *language, unsigned int lang_mask)
 {
   alloc_infile ();
   infiles[n_infiles].name = name;
-  infiles[n_infiles++].language = language;
+  infiles[n_infiles].language = language;
+  infiles[n_infiles++].lang_mask = lang_mask;
 }
 
 /* Allocate space for a switch in switches.  */
@@ -3492,11 +3508,12 @@  alloc_switch (void)
 }
 
 /* Save an option OPT with N_ARGS arguments in array ARGS, marking it
-   as validated if VALIDATED and KNOWN if it is an internal switch.  */
+   as validated if VALIDATED and KNOWN if it is an internal switch.
+   LANG_MASK is the flags associated with this option.  */
 
 static void
 save_switch (const char *opt, size_t n_args, const char *const *args,
-	     bool validated, bool known)
+	     bool validated, bool known, unsigned int lang_mask)
 {
   alloc_switch ();
   switches[n_switches].part1 = opt + 1;
@@ -3513,6 +3530,7 @@  save_switch (const char *opt, size_t n_args, const
   switches[n_switches].validated = validated;
   switches[n_switches].known = known;
   switches[n_switches].ordering = 0;
+  switches[n_switches].lang_mask = lang_mask;
   n_switches++;
 }
 
@@ -3530,7 +3548,8 @@  driver_unknown_option_callback (const struct cl_de
 	 diagnosed only if there are warnings.  */
       save_switch (decoded->canonical_option[0],
 		   decoded->canonical_option_num_elements - 1,
-		   &decoded->canonical_option[1], false, true);
+		   &decoded->canonical_option[1], false, true,
+		   cl_options[decoded->opt_index].flags);
       return false;
     }
   if (decoded->opt_index == OPT_SPECIAL_unknown)
@@ -3538,7 +3557,8 @@  driver_unknown_option_callback (const struct cl_de
       /* Give it a chance to define it a spec file.  */
       save_switch (decoded->canonical_option[0],
 		   decoded->canonical_option_num_elements - 1,
-		   &decoded->canonical_option[1], false, false);
+		   &decoded->canonical_option[1], false, false,
+		   cl_options[decoded->opt_index].flags);
       return false;
     }
   else
@@ -3565,7 +3585,8 @@  driver_wrong_lang_callback (const struct cl_decode
   else
     save_switch (decoded->canonical_option[0],
 		 decoded->canonical_option_num_elements - 1,
-		 &decoded->canonical_option[1], false, true);
+		 &decoded->canonical_option[1], false, true,
+		 option->flags);
 }
 
 static const char *spec_lang = 0;
@@ -3815,7 +3836,8 @@  driver_handle_option (struct gcc_options *opts,
 	compare_debug_opt = NULL;
       else
 	compare_debug_opt = arg;
-      save_switch (compare_debug_replacement_opt, 0, NULL, validated, true);
+      save_switch (compare_debug_replacement_opt, 0, NULL, validated, true,
+		   cl_options[opt_index].flags);
       return true;
 
     case OPT_fdiagnostics_color_:
@@ -3870,17 +3892,17 @@  driver_handle_option (struct gcc_options *opts,
 	for (j = 0; arg[j]; j++)
 	  if (arg[j] == ',')
 	    {
-	      add_infile (save_string (arg + prev, j - prev), "*");
+	      add_infile (save_string (arg + prev, j - prev), "*", 0);
 	      prev = j + 1;
 	    }
 	/* Record the part after the last comma.  */
-	add_infile (arg + prev, "*");
+	add_infile (arg + prev, "*", 0);
       }
       do_save = false;
       break;
 
     case OPT_Xlinker:
-      add_infile (arg, "*");
+      add_infile (arg, "*", 0);
       do_save = false;
       break;
 
@@ -3897,19 +3919,21 @@  driver_handle_option (struct gcc_options *opts,
     case OPT_l:
       /* POSIX allows separation of -l and the lib arg; canonicalize
 	 by concatenating -l with its arg */
-      add_infile (concat ("-l", arg, NULL), "*");
+      add_infile (concat ("-l", arg, NULL), "*", 0);
       do_save = false;
       break;
 
     case OPT_L:
       /* Similarly, canonicalize -L for linkers that may not accept
 	 separate arguments.  */
-      save_switch (concat ("-L", arg, NULL), 0, NULL, validated, true);
+      save_switch (concat ("-L", arg, NULL), 0, NULL, validated, true,
+		   cl_options[opt_index].flags);
       return true;
 
     case OPT_F:
       /* Likewise -F.  */
-      save_switch (concat ("-F", arg, NULL), 0, NULL, validated, true);
+      save_switch (concat ("-F", arg, NULL), 0, NULL, validated, true,
+		   cl_options[opt_index].flags);
       return true;
 
     case OPT_save_temps:
@@ -4032,7 +4056,8 @@  driver_handle_option (struct gcc_options *opts,
       save_temps_prefix = xstrdup (arg);
       /* On some systems, ld cannot handle "-o" without a space.  So
 	 split the option from its argument.  */
-      save_switch ("-o", 1, &arg, validated, true);
+      save_switch ("-o", 1, &arg, validated, true,
+		   cl_options[opt_index].flags);
       return true;
 
 #ifdef ENABLE_DEFAULT_PIE
@@ -4068,7 +4093,8 @@  driver_handle_option (struct gcc_options *opts,
   if (do_save)
     save_switch (decoded->canonical_option[0],
 		 decoded->canonical_option_num_elements - 1,
-		 &decoded->canonical_option[1], validated, true);
+		 &decoded->canonical_option[1], validated, true,
+		 cl_options[opt_index].flags);
   return true;
 }
 
@@ -4365,7 +4391,7 @@  process_command (unsigned int decoded_options_coun
           if (strcmp (fname, "-") != 0 && access (fname, F_OK) < 0)
 	    perror_with_name (fname);
           else
-	    add_infile (arg, spec_lang);
+	    add_infile (arg, spec_lang, 0);
 
           free (fname);
 	  continue;
@@ -4376,6 +4402,11 @@  process_command (unsigned int decoded_options_coun
 			   CL_DRIVER, &handlers, global_dc);
     }
 
+  /* If the user didn't specify any, default to all configured offload
+     targets.  */
+  if (offload_targets == NULL)
+    handle_foffload_option (OFFLOAD_TARGETS);
+
   if (output_file
       && strcmp (output_file, "-") != 0
       && strcmp (output_file, HOST_BIT_BUCKET) != 0)
@@ -4507,7 +4538,8 @@  process_command (unsigned int decoded_options_coun
   if (compare_debug == 2 || compare_debug == 3)
     {
       const char *opt = concat ("-fcompare-debug=", compare_debug_opt, NULL);
-      save_switch (opt, 0, NULL, false, true);
+      save_switch (opt, 0, NULL, false, true,
+		   cl_options[OPT_fcompare_debug_].flags);
       compare_debug = 1;
     }
 
@@ -4518,7 +4550,7 @@  process_command (unsigned int decoded_options_coun
 
       /* Create a dummy input file, so that we can pass
 	 the help option on to the various sub-processes.  */
-      add_infile ("help-dummy", "c");
+      add_infile ("help-dummy", "c", 0);
     }
 
   alloc_switch ();
@@ -4719,13 +4751,15 @@  insert_wrapper (const char *wrapper)
 }
 
 /* Process the spec SPEC and run the commands specified therein.
+   If LANG_MASK is nonzero, switches for other languages are discarded.
    Returns 0 if the spec is successfully processed; -1 if failed.  */
 
 int
-do_spec (const char *spec)
+do_spec (const char *spec, unsigned int lang_mask)
 {
   int value;
 
+  spec_lang_mask_accept = lang_mask;
   value = do_spec_2 (spec);
 
   /* Force out any unfinished command.
@@ -4883,7 +4917,8 @@  do_self_spec (const char *spec)
 	      save_switch (decoded_options[j].canonical_option[0],
 			   (decoded_options[j].canonical_option_num_elements
 			    - 1),
-			   &decoded_options[j].canonical_option[1], false, true);
+			   &decoded_options[j].canonical_option[1], false, true,
+			   cl_options[decoded_options[j].opt_index].flags);
 	      break;
 
 	    default:
@@ -6479,6 +6514,14 @@  check_live_switch (int switchnum, int prefix_lengt
 static void
 give_switch (int switchnum, int omit_first_word)
 {
+  unsigned int lang_mask_accept = (1U << cl_lang_count) - 1;
+  unsigned int lang_mask = switches[switchnum].lang_mask & lang_mask_accept;
+  if (spec_lang_mask_accept != 0)
+    lang_mask_accept = spec_lang_mask_accept;
+  /* Drop switches specific to a language not in the given mask.  */
+  if (lang_mask != 0 && !(lang_mask & lang_mask_accept))
+    return;
+
   if ((switches[switchnum].live_cond & SWITCH_IGNORE) != 0)
     return;
 
@@ -7572,22 +7615,14 @@  driver::maybe_putenv_COLLECT_LTO_WRAPPER () const
 void
 driver::maybe_putenv_OFFLOAD_TARGETS () const
 {
-  const char *targets = offload_targets;
-
-  /* If no targets specified by -foffload, use all available targets.  */
-  if (!targets)
-    targets = OFFLOAD_TARGETS;
-
-  if (strlen (targets) > 0)
+  if (offload_targets && offload_targets[0] != '\0')
     {
       obstack_grow (&collect_obstack, "OFFLOAD_TARGET_NAMES=",
 		    sizeof ("OFFLOAD_TARGET_NAMES=") - 1);
-      obstack_grow (&collect_obstack, targets,
-		    strlen (targets) + 1);
+      obstack_grow (&collect_obstack, offload_targets,
+		    strlen (offload_targets) + 1);
       xputenv (XOBFINISH (&collect_obstack, char *));
     }
-
-  free (offload_targets);
 }
 
 /* Reject switches that no pass was interested in.  */
@@ -7891,7 +7926,8 @@  driver::do_spec_on_infiles () const
 		  debug_check_temp_file[1] = NULL;
 		}
 
-	      value = do_spec (input_file_compiler->spec);
+	      value = do_spec (input_file_compiler->spec,
+			       infiles[i].lang_mask);
 	      infiles[i].compiled = true;
 	      if (value < 0)
 		this_file_error = 1;
@@ -7905,7 +7941,8 @@  driver::do_spec_on_infiles () const
 		  n_switches_alloc = n_switches_alloc_debug_check[1];
 		  switches = switches_debug_check[1];
 
-		  value = do_spec (input_file_compiler->spec);
+		  value = do_spec (input_file_compiler->spec,
+				   infiles[i].lang_mask);
 
 		  compare_debug = -compare_debug;
 		  n_switches = n_switches_debug_check[0];
@@ -8060,7 +8097,7 @@  driver::maybe_run_linker (const char *argv0) const
 		    " to the linker.\n\n"));
 	  fflush (stdout);
 	}
-      int value = do_spec (link_command_spec);
+      int value = do_spec (link_command_spec, 0);
       if (value < 0)
 	errorcount = 1;
       linker_was_run = (tmp != execution_count);
@@ -9651,6 +9688,50 @@  greater_than_spec_func (int argc, const char **arg
   return NULL;
 }
 
+/* If applicable, generate a C source file containing a constructor that calls
+   GOMP_set_offload_targets, to inform libgomp which offload targets have
+   actually been requested (-foffload=[...]), and add it as an infile.  */
+
+static const char *
+add_omp_infile_spec_func (int argc, const char **)
+{
+  gcc_assert (argc == 0);
+  gcc_assert (offload_targets != NULL);
+
+  /* Nothing to do if we're not actually linking.  */
+  if (have_c)
+    return NULL;
+
+  int err;
+  const char *tmp_filename;
+  tmp_filename = make_temp_file (".c");
+  record_temp_file (tmp_filename, !save_temps_flag, 0);
+  FILE *f = fopen (tmp_filename, "w");
+  if (f == NULL)
+    fatal_error (input_location,
+		 "could not open temporary file %s", tmp_filename);
+  /* As libgomp uses constructors internally, and this code is only added when
+     linking against libgomp, it is fine to use a constructor here.  */
+  err = fprintf (f,
+		 "extern void GOMP_set_offload_targets (const char *);\n"
+		 "static __attribute__ ((constructor)) void\n"
+		 "init (void)\n"
+		 "{\n"
+		 "  GOMP_set_offload_targets (\"%s\");\n"
+		 "}\n",
+		 offload_targets);
+  if (err < 0)
+    fatal_error (input_location,
+		 "could not write to temporary file %s", tmp_filename);
+  err = fclose (f);
+  if (err == EOF)
+    fatal_error (input_location,
+		 "could not close temporary file %s", tmp_filename);
+
+  add_infile (tmp_filename, "cpp-output", CL_C);
+  return NULL;
+}
+
 /* Insert backslash before spaces in ORIG (usually a file path), to 
    avoid being broken by spec parser.
 
Index: gcc/gcc.h
===================================================================
--- gcc/gcc.h	(revision 227194)
+++ gcc/gcc.h	(working copy)
@@ -68,7 +68,7 @@  struct spec_function
 };
 
 /* These are exported by gcc.c.  */
-extern int do_spec (const char *);
+extern int do_spec (const char *, unsigned int);
 extern void record_temp_file (const char *, int, int);
 extern void pfatal_with_name (const char *) ATTRIBUTE_NORETURN;
 extern void set_input (const char *);
Index: gcc/fortran/gfortranspec.c
===================================================================
--- gcc/fortran/gfortranspec.c	(revision 227194)
+++ gcc/fortran/gfortranspec.c	(working copy)
@@ -439,7 +439,7 @@  int
 lang_specific_pre_link (void)
 {
   if (library)
-    do_spec ("%:include(libgfortran.spec)");
+    do_spec ("%:include(libgfortran.spec)", 0);
 
   return 0;
 }