diff mbox

Ping^2 Re: Pass -foffload targets from driver to libgomp at link time

Message ID 87a8s6vq69.fsf@kepler.schwinge.homeip.net
State New
Headers show

Commit Message

Thomas Schwinge Sept. 28, 2015, 9:39 a.m. UTC
Hi!

On Fri, 11 Sep 2015 17:43:49 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> So, do I understand well that you'll call GOMP_set_offload_targets from
> construct[ors] of all shared libraries (and the binary) that contain offloaded
> code?  If yes, that is surely going to fail the assertions in there.

Indeed.  My original plan has been to generate/invoke this constructor
only for/from the final executable and not for any shared libraries, but
it seems I didn't implemented this correctly.

> You can dlopen such libraries etc.  What if you link one library with
> -fopenmp=nvptx-none and another one with -fopenmp=x86_64-intelmicemul-linux?

So, the first question to answer is: what do we expect to happen in this
case, or similarly, if the executable and any shared libraries are
compiled with different/incompatible -foffload options?

Given that OpenMP's default-device-var ICV is per process (that is, not
separate for the executable and any shared library), and thus, once
libgomp has settled on this ICV (by first execution of an offloading
construct, for example), any offloading attempt of code compiled with
incompatible -foffload options will have to fail, because the
corresponding offloading device's code just isn't available.  We can't
avoid this situation, as it is not possible for libgomp to simply switch
to a different offloading device (or host fallback, for that matter):
libgomp doesn't have any knowledge of the current state of data regions
setup between the host and device(s), for instance.

For this, I propose that the only mode of operation that we currently can
support is that all of the executable and any shared libraries agree on
the offload targets specified by -foffload, and I thus propose the
following patch on top of what Joseph has posted before (passes the
testsuite, but not yet tested otherwise):

 libgomp/libgomp-plugin.h |    3 +-
 libgomp/target.c         |  157 +++++++++++++++++++++++++++++++++++++---------
 2 files changed, 130 insertions(+), 30 deletions(-)



Grüße,
 Thomas

Comments

Jakub Jelinek Sept. 29, 2015, 8:18 a.m. UTC | #1
On Mon, Sep 28, 2015 at 11:39:10AM +0200, Thomas Schwinge wrote:
> Hi!
> 
> On Fri, 11 Sep 2015 17:43:49 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> > So, do I understand well that you'll call GOMP_set_offload_targets from
> > construct[ors] of all shared libraries (and the binary) that contain offloaded
> > code?  If yes, that is surely going to fail the assertions in there.
> 
> Indeed.  My original plan has been to generate/invoke this constructor
> only for/from the final executable and not for any shared libraries, but
> it seems I didn't implemented this correctly.

How would you mean to implement it?  -fopenmp or -fopenacc code with
offloading bits might not be in the final executable at all, nor in shared
libraries it is linked against; such libraries could be only dlopened,
consider say python plugin.  And this is not just made up, perhaps not with
offloading yet, but people regularly use OpenMP code in plugins and then we
get complains that fork child of the main program is not allowed to do
anything but async-signal-safe functions.
> 
> > You can dlopen such libraries etc.  What if you link one library with
> > -fopenmp=nvptx-none and another one with -fopenmp=x86_64-intelmicemul-linux?
> 
> So, the first question to answer is: what do we expect to happen in this
> case, or similarly, if the executable and any shared libraries are
> compiled with different/incompatible -foffload options?

As the device numbers are per-process, the only possibility I see is that
all the physically available devices are always available, and just if you
try to offload from some code to a device that doesn't support it, you get
host fallback.  Because, one shared library could carefully use device(xyz)
to offload to say XeonPhi it is compiled for and supports, and another
library device(abc) to offload to PTX it is compiled for and supports.

> For this, I propose that the only mode of operation that we currently can
> support is that all of the executable and any shared libraries agree on
> the offload targets specified by -foffload, and I thus propose the
> following patch on top of what Joseph has posted before (passes the
> testsuite, but not yet tested otherwise):

See above, no.

	Jakub
diff mbox

Patch

diff --git libgomp/libgomp-plugin.h libgomp/libgomp-plugin.h
index 24fbb94..5da4fa7 100644
--- libgomp/libgomp-plugin.h
+++ libgomp/libgomp-plugin.h
@@ -48,7 +48,8 @@  enum offload_target_type
   OFFLOAD_TARGET_TYPE_HOST = 2,
   /* OFFLOAD_TARGET_TYPE_HOST_NONSHM = 3 removed.  */
   OFFLOAD_TARGET_TYPE_NVIDIA_PTX = 5,
-  OFFLOAD_TARGET_TYPE_INTEL_MIC = 6
+  OFFLOAD_TARGET_TYPE_INTEL_MIC = 6,
+  OFFLOAD_TARGET_TYPE_HWM
 };
 
 /* Auxiliary struct, used for transferring pairs of addresses from plugin
diff --git libgomp/target.c libgomp/target.c
index 4dd5913..d1e794a 100644
--- libgomp/target.c
+++ libgomp/target.c
@@ -68,6 +68,9 @@  static struct offload_image_descr *offload_images;
 /* Total number of offload images.  */
 static int num_offload_images;
 
+/* List of offload targets, separated by colon.  */
+static const char *gomp_offload_targets;
+
 /* Array of descriptors for all available devices.  */
 static struct gomp_device_descr *devices;
 
@@ -1121,6 +1124,8 @@  static bool
 gomp_load_plugin_for_device (struct gomp_device_descr *device,
 			     const char *plugin_name)
 {
+  gomp_debug (0, "%s (\"%s\")\n", __FUNCTION__, plugin_name);
+
   const char *err = NULL, *last_missing = NULL;
 
   void *plugin_handle = dlopen (plugin_name, RTLD_LAZY);
@@ -1216,39 +1221,120 @@  gomp_load_plugin_for_device (struct gomp_device_descr *device,
   return 0;
 }
 
-/* Return the corresponding plugin name for the offload target name
-   OFFLOAD_TARGET.  */
+/* Return the corresponding offload target type for the offload target name
+   OFFLOAD_TARGET, or 0 if unknown.  */
 
-static const char *
-offload_target_to_plugin_name (const char *offload_target)
+static enum offload_target_type
+offload_target_to_type (const char *offload_target)
 {
   if (strstr (offload_target, "-intelmic") != NULL)
-    return "intelmic";
-  if (strncmp (offload_target, "nvptx", 5) == 0)
-    return "nvptx";
-  gomp_fatal ("Unknown offload target: %s", offload_target);
+    return OFFLOAD_TARGET_TYPE_INTEL_MIC;
+  else if (strncmp (offload_target, "nvptx", 5) == 0)
+    return OFFLOAD_TARGET_TYPE_NVIDIA_PTX;
+  else
+    return 0;
 }
 
-/* List of offload targets, separated by colon.  Defaults to the list
-   determined when configuring libgomp.  */
-static const char *gomp_offload_targets = OFFLOAD_TARGETS;
-static bool gomp_offload_targets_init = false;
+/* Return the corresponding plugin name for the offload target type TYPE, or
+   NULL if unknown.  */
+
+static const char *
+offload_target_type_to_plugin_name (enum offload_target_type type)
+{
+  switch (type)
+    {
+    case OFFLOAD_TARGET_TYPE_INTEL_MIC:
+      return "intelmic";
+    case OFFLOAD_TARGET_TYPE_NVIDIA_PTX:
+      return "nvptx";
+    default:
+      return NULL;
+    }
+}
 
 /* Override the list of offload targets with OFFLOAD_TARGETS, the set
-   passed to the compiler at link time.  This must be called early,
-   and only once.  */
+   passed to the compiler at link time.  */
 
 void
 GOMP_set_offload_targets (const char *offload_targets)
 {
   gomp_debug (0, "%s (\"%s\")\n", __FUNCTION__, offload_targets);
 
-  /* Make sure this gets called early.  */
-  assert (gomp_is_initialized == PTHREAD_ONCE_INIT);
-  /* Make sure this only gets called once.  */
-  assert (!gomp_offload_targets_init);
-  gomp_offload_targets_init = true;
-  gomp_offload_targets = offload_targets;
+  char *offload_targets_dup = strdup (offload_targets);
+  if (offload_targets_dup == NULL)
+    gomp_fatal ("Out of memory");
+
+  const char *err = NULL;
+
+  gomp_mutex_lock (&register_lock);
+
+  if (gomp_offload_targets == NULL)
+    {
+      /* Make sure this gets called early.  */
+      assert (gomp_is_initialized == PTHREAD_ONCE_INIT);
+
+      gomp_offload_targets = offload_targets_dup;
+    }
+  else
+    {
+      /* If this gets called multiple times, or gets called after
+	 initialization, make sure that the same set of offload targets has
+	 been specified.  */
+
+      bool offload_targets_registered[OFFLOAD_TARGET_TYPE_HWM];
+      bool offload_targets_requested[OFFLOAD_TARGET_TYPE_HWM];
+      for (int i = 0; i < OFFLOAD_TARGET_TYPE_HWM; ++i)
+	offload_targets_registered[i] = offload_targets_requested[i] = false;
+
+      char *cur = offload_targets_dup;
+      while (cur)
+	{
+	  char *next = strchr (cur, ':');
+	  if (next != NULL)
+	    {
+	      *next = '\0';
+	      ++next;
+	    }
+	  enum offload_target_type type = offload_target_to_type (cur);
+	  if (type == 0)
+	    {
+	      /* An unknown offload target has been requested; ignore it.  This
+		 makes us (future-)proof if offload targets are requested that
+		 are not supported in this build of libgomp.  */
+	    }
+	  else
+	    offload_targets_requested[type] = true;
+
+	  cur = next;
+	}
+
+      for (int i = 0; i < num_devices; ++i)
+	offload_targets_registered[devices[i].type] = true;
+
+      for (enum offload_target_type type = 0;
+	   type < OFFLOAD_TARGET_TYPE_HWM;
+	   ++type)
+	if (offload_targets_registered[type]
+	    != offload_targets_requested[type])
+	  {
+	    err = "terminating";
+
+	    const char *name = offload_target_type_to_plugin_name (type);
+	    if (offload_targets_requested[type])
+	      gomp_error ("offload target %s has been requested, "
+			  "but not registered before", name);
+	    else
+	      gomp_error ("offload target %s has been registered, "
+			  "but not requested now", name);
+	  }
+
+      free (offload_targets_dup);
+    }
+
+  gomp_mutex_unlock (&register_lock);
+
+  if (err != NULL)
+    gomp_fatal ("%s", err);
 }
 
 /* This function initializes the runtime needed for offloading.
@@ -1264,11 +1350,15 @@  gomp_target_init (void)
   const char *prefix ="libgomp-plugin-";
   const char *suffix = SONAME_SUFFIX (1);
   const char *cur, *next;
-  char *plugin_name;
   int i, new_num_devices;
 
   gomp_mutex_lock (&register_lock);
 
+  /* If no offload targets have been requested explicitly, default to those
+     determined when configuring libgomp.  */
+  if (gomp_offload_targets == NULL)
+    gomp_offload_targets = OFFLOAD_TARGETS;
+
   num_devices = 0;
   devices = NULL;
 
@@ -1276,16 +1366,14 @@  gomp_target_init (void)
   if (*cur)
     do
       {
-	struct gomp_device_descr current_device;
-
 	next = strchr (cur, ':');
 	size_t prefix_len = strlen (prefix);
 	size_t cur_len = next ? next - cur : strlen (cur);
 	size_t suffix_len = strlen (suffix);
-	plugin_name = (char *) malloc (prefix_len
-				       + cur_len
-				       + suffix_len
-				       + 1);
+	char *plugin_name = (char *) malloc (prefix_len
+					     + cur_len
+					     + suffix_len
+					     + 1);
 	if (!plugin_name)
 	  {
 	    num_devices = 0;
@@ -1297,9 +1385,18 @@  gomp_target_init (void)
 	plugin_name[prefix_len + cur_len] = '\0';
 	/* ..., so that we can then use it to translate the offload target to
 	   the plugin name...  */
+	enum offload_target_type type
+	  = offload_target_to_type (plugin_name + prefix_len);
 	const char *cur_plugin_name
-	  = offload_target_to_plugin_name (plugin_name
-					   + prefix_len);
+	  = offload_target_type_to_plugin_name (type);
+	if (cur_plugin_name == NULL)
+	  {
+	    /* An unknown offload target has been requested; ignore it.  This
+	       makes us (future-)proof if offload targets are requested that
+	       are not supported in this build of libgomp.  */
+	    gomp_debug (0, "skipping unknown offload target %d", (int) type);
+	    goto skip;
+	  }
 	size_t cur_plugin_name_len = strlen (cur_plugin_name);
 	assert (cur_plugin_name_len <= cur_len);
 	/* ..., and then rewrite it.  */
@@ -1311,6 +1408,7 @@  gomp_target_init (void)
 		    + cur_plugin_name_len
 		    + suffix_len] = '\0';
 
+	struct gomp_device_descr current_device;
 	if (gomp_load_plugin_for_device (&current_device, plugin_name))
 	  {
 	    new_num_devices = current_device.get_num_devices_func ();
@@ -1343,6 +1441,7 @@  gomp_target_init (void)
 	      }
 	  }
 
+      skip:
 	free (plugin_name);
 	cur = next + 1;
       }