[gomp4] Use GOMP_OFFLOAD_ prefix for (OpenACC) plugin hooks

Message ID 20141105175656.3f2a9630@octopus
State New

Commit Message

Julian Brown Nov. 5, 2014, 5:56 p.m. UTC
Hi,

Mirroring changes in Ilya Verbin's libgomp offloading pieces posted to
trunk, this patch adds a prefix of GOMP_OFFLOAD_ to the OpenACC plugin
hooks. Some of these bits will not be needed for a trunk version of the
patch once Ilya's patch is approved (I'm hoping no incompatibilities
have crept in other than the renaming!).

I will apply to the gomp4 branch shortly.

Thanks,

Julian

ChangeLog

    libgomp/
    * oacc-host.c: Add GOMP_OFFLOAD_ prefix for plugin hooks. Rename
    device_init to init_device, device_fini to fini_device,
    offload_register to register_image and remove extraneous "device_"
    from device_alloc, device_free, device_dev2host, device_host2dev and
    device_run.
    (host_dispatch): Use new names for hooks.
    * oacc-init.c: Use new names for hooks, throughout.
    * oacc-mem.c: Likewise.
    * plugin-nvptx.c: Likewise.
    * target.c: Likewise.
    (gomp_load_plugin_for_device): Likewise. Look for new hook names.
    * target.h (gomp_device_descr): Use new hook names.

Comments

Ilya Verbin Nov. 5, 2014, 7:02 p.m. UTC | #1
Hi,

On 05 Nov 17:56, Julian Brown wrote:
> +GOMP_OFFLOAD_register_image (void *host_table, void *target_data)
> +GOMP_OFFLOAD_get_table (struct mapping_table **table)

FYI, these interfaces may change in the near future.
Currently GOMP_OFFLOAD_get_table returns a joint table for all images offloaded
to a device.  But this doesn't work properly with offloading from dlopened libs.
Do you plan to support such cases for PTX?
Perhaps it's worth replacing them with a function like GOMP_OFFLOAD_load_image,
which would offload one image and return a target table for that image.
In that case there is no need to pass host_table to the plugin or to return a
joint table, since libgomp will join the host and target tables itself.
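
To make the proposal concrete, here is a rough sketch of what "libgomp joins the host and target tables itself" could look like. Everything in it is an assumption for illustration: the types addr_range and joined_entry, the function join_tables, and the idea that both tables list their entries in the same order. None of these are taken from libgomp or from the proposed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range for one function or variable.  */
struct addr_range
{
  uintptr_t start;
  uintptr_t end;
};

/* Hypothetical joined entry: a host range paired with its target range.  */
struct joined_entry
{
  struct addr_range host;
  struct addr_range target;
};

/* Pair up the host table with the per-image target table returned by the
   plugin.  Assumes both tables describe the same entries in the same
   order.  Returns the number of joined entries, or 0 if the tables do
   not have the same length.  */
static size_t
join_tables (const struct addr_range *host, size_t num_host,
             const struct addr_range *target, size_t num_target,
             struct joined_entry *out)
{
  size_t i;

  assert (host != NULL && target != NULL && out != NULL);

  if (num_host != num_target)
    return 0;

  for (i = 0; i < num_host; i++)
    {
      out[i].host = host[i];
      out[i].target = target[i];
    }
  return num_host;
}
```

With this split the plugin never sees host_table; it only reports where each entity of one image landed on the device.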

Another question is what to do with multiple devices of the same type.
Can they have different images?  There are 2 options:
1. GOMP_OFFLOAD_load_image will offload one image to one device and receive a
table from it.
or
2. GOMP_OFFLOAD_register_image will register one image in the plugin for all
devices of the same type, and
GOMP_OFFLOAD_get_table will return a table for one image and for one device.

Multiple MICs can't have different images, but for generality we can use
option #1.
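
The difference between the two options can be modelled in a few lines. This is a toy sketch under stated assumptions, not plugin code: struct plugin_state, load_image and register_image_all are hypothetical names, and the device_id parameter is assumed. It shows why option #1 is the more general choice: option #2 can be expressed as a loop over option #1, but not the other way round.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 4
#define MAX_IMAGES 8

/* Hypothetical plugin-side bookkeeping: the images loaded on each device
   of one type.  */
struct plugin_state
{
  int num_devices;
  int num_images[MAX_DEVICES];
  const void *images[MAX_DEVICES][MAX_IMAGES];
};

/* Option 1: load IMAGE onto a single device.  Returns 0 on success,
   -1 for a bad device id or a full per-device table.  */
static int
load_image (struct plugin_state *p, int device_id, const void *image)
{
  assert (p != NULL && p->num_devices <= MAX_DEVICES);

  if (device_id < 0 || device_id >= p->num_devices
      || p->num_images[device_id] >= MAX_IMAGES)
    return -1;

  p->images[device_id][p->num_images[device_id]++] = image;
  return 0;
}

/* Option 2: register IMAGE for every device of this type.  Capacity is
   checked up front so that a failure leaves no partial registration.  */
static int
register_image_all (struct plugin_state *p, const void *image)
{
  int d;

  assert (p != NULL && p->num_devices <= MAX_DEVICES);

  for (d = 0; d < p->num_devices; d++)
    if (p->num_images[d] >= MAX_IMAGES)
      return -1;

  for (d = 0; d < p->num_devices; d++)
    load_image (p, d, image);
  return 0;
}
```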

  -- Ilya
Julian Brown Nov. 5, 2014, 9:28 p.m. UTC | #2
On Wed, 5 Nov 2014 22:02:33 +0300
Ilya Verbin <iverbin@gmail.com> wrote:

> Hi,
> 
> On 05 Nov 17:56, Julian Brown wrote:
> > +GOMP_OFFLOAD_register_image (void *host_table, void *target_data)
> > +GOMP_OFFLOAD_get_table (struct mapping_table **table)
> 
> FYI, these interfaces may change in the near future.
> Currently GOMP_OFFLOAD_get_table returns a joint table for all
> images offloaded to a device.  But this doesn't work properly with
> offloading from dlopened libs. Do you plan to support such cases for
> PTX? Perhaps it's worth replacing them with a function like
> GOMP_OFFLOAD_load_image, which would offload one image and return a
> target table for that image. In that case there is no need to pass
> host_table to the plugin or to return a joint table, since libgomp
> will join the host and target tables itself.

I made some changes to table initialisation on the gomp4 branch also --
probably not enough to genuinely support multiple devices, but
hopefully some of the way there. Have you seen those? I haven't
considered dlopened libs though.

> Another question is what to do with multiple devices of the same type.
> Can they have different images?  There are 2 options:
> 1. GOMP_OFFLOAD_load_image will offload one image to one device and
> receive a table from it.
> or
> 2. GOMP_OFFLOAD_register_image will register one image in the plugin
> for all devices of the same type, and
> GOMP_OFFLOAD_get_table will return a table for one image and for one
> device.

Similarly, I added (partial, in the case of OpenMP) support for
multiple devices of the same type on the gomp4 branch.

Thanks,

Julian
Ilya Verbin Nov. 5, 2014, 10:47 p.m. UTC | #3
2014-11-06 0:28 GMT+03:00 Julian Brown <julian@codesourcery.com>:
> I made some changes to table initialisation on the gomp4 branch also --
> probably not enough to genuinely support multiple devices, but
> hopefully some of the way there. Have you seen those? I haven't
> considered dlopened libs though.
>
> Similarly, I added (partial, in the case of OpenMP) support for
> multiple devices of the same type on the gomp4 branch.

Multiple devices of the same type are already supported in this patch:
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00475.html
Each plugin function receives device_id as the first argument.

The question is which interface is preferable for image
registration, i.e. can 2 devices of the same type in general contain
different sets of offload images or not?
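
As an illustration of the "device_id as the first argument" convention, the sketch below shows a hook table whose allocation hook takes the device number first, so that two devices of the same type keep separate state. The names example_alloc, example_device_descr and bytes_in_use are hypothetical, and the exact signature is an assumption rather than a copy of the linked patch; malloc stands in for a real device allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_DEVICES 2

/* Bytes handed out per device; a stand-in for real per-device state.  */
static size_t bytes_in_use[NUM_DEVICES];

/* Hypothetical allocation hook: the device number comes first, then the
   arguments the hook already had.  */
static void *
example_alloc (int device_id, size_t size)
{
  void *p;

  assert (device_id >= 0 && device_id < NUM_DEVICES);

  p = malloc (size);
  if (p != NULL)
    bytes_in_use[device_id] += size;
  return p;
}

/* Hypothetical descriptor: one per device, all sharing the same hook.  */
struct example_device_descr
{
  int device_id;
  void *(*alloc_func) (int device_id, size_t size);
};
```

A caller would then write devicep->alloc_func (devicep->device_id, size), and the plugin selects the right device from the first argument.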

  -- Ilya

Patch

commit 4e1b71a5e0d15de4c6e89ab5139964e32b563d68
Author: Julian Brown <julian@codesourcery.com>
Date:   Wed Nov 5 02:34:22 2014 -0800

    Use GOMP_OFFLOAD_ prefix for plugin hooks.

diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index fc3e77c..02794bb 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -60,7 +60,7 @@  static struct gomp_device_descr host_dispatch;
 #endif
 
 STATIC const char *
-get_name (void)
+GOMP_OFFLOAD_get_name (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -74,7 +74,7 @@  get_name (void)
 }
 
 STATIC int
-get_type (void)
+GOMP_OFFLOAD_get_type (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -88,7 +88,7 @@  get_type (void)
 }
 
 STATIC unsigned int
-get_caps (void)
+GOMP_OFFLOAD_get_caps (void)
 {
   unsigned int caps = TARGET_CAP_OPENACC_200 | TARGET_CAP_OPENMP_400
 		      | TARGET_CAP_NATIVE_EXEC;
@@ -105,7 +105,7 @@  get_caps (void)
 }
 
 STATIC int
-get_num_devices (void)
+GOMP_OFFLOAD_get_num_devices (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -115,7 +115,7 @@  get_num_devices (void)
 }
 
 STATIC void
-offload_register (void *host_table, void *target_data)
+GOMP_OFFLOAD_register_image (void *host_table, void *target_data)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p, %p)\n", __FILE__, __FUNCTION__, host_table,
@@ -124,17 +124,17 @@  offload_register (void *host_table, void *target_data)
 }
 
 STATIC int
-device_init (void)
+GOMP_OFFLOAD_init_device (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
 #endif
 
-  return get_num_devices ();
+  return GOMP_OFFLOAD_get_num_devices ();
 }
 
 STATIC int
-device_fini (void)
+GOMP_OFFLOAD_fini_device (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -144,7 +144,7 @@  device_fini (void)
 }
 
 STATIC int
-device_get_table (struct mapping_table **table)
+GOMP_OFFLOAD_get_table (struct mapping_table **table)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p)\n", __FILE__, __FUNCTION__, table);
@@ -154,7 +154,7 @@  device_get_table (struct mapping_table **table)
 }
 
 STATIC bool
-openacc_avail (void)
+GOMP_OFFLOAD_openacc_avail (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -164,7 +164,7 @@  openacc_avail (void)
 }
 
 STATIC void *
-openacc_open_device (int n)
+GOMP_OFFLOAD_openacc_open_device (int n)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%u)\n", __FILE__, __FUNCTION__, n);
@@ -174,7 +174,7 @@  openacc_open_device (int n)
 }
 
 STATIC int
-openacc_close_device (void *hnd)
+GOMP_OFFLOAD_openacc_close_device (void *hnd)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p)\n", __FILE__, __FUNCTION__, hnd);
@@ -184,7 +184,7 @@  openacc_close_device (void *hnd)
 }
 
 STATIC int
-openacc_get_device_num (void)
+GOMP_OFFLOAD_openacc_get_device_num (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -194,7 +194,7 @@  openacc_get_device_num (void)
 }
 
 STATIC void
-openacc_set_device_num (int n)
+GOMP_OFFLOAD_openacc_set_device_num (int n)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%u)\n", __FILE__, __FUNCTION__, n);
@@ -205,7 +205,7 @@  openacc_set_device_num (int n)
 }
 
 STATIC void *
-device_alloc (size_t s)
+GOMP_OFFLOAD_alloc (size_t s)
 {
   void *ptr = GOMP(malloc) (s);
 
@@ -217,7 +217,7 @@  device_alloc (size_t s)
 }
 
 STATIC void
-device_free (void *p)
+GOMP_OFFLOAD_free (void *p)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p)\n", __FILE__, __FUNCTION__, p);
@@ -227,7 +227,7 @@  device_free (void *p)
 }
 
 STATIC void *
-device_host2dev (void *d, const void *h, size_t s)
+GOMP_OFFLOAD_host2dev (void *d, const void *h, size_t s)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p, %p, %zd)\n", __FILE__, __FUNCTION__, d, h,
@@ -242,7 +242,7 @@  device_host2dev (void *d, const void *h, size_t s)
 }
 
 STATIC void *
-device_dev2host (void *h, const void *d, size_t s)
+GOMP_OFFLOAD_dev2host (void *h, const void *d, size_t s)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p, %p, %zd)\n", __FILE__, __FUNCTION__, h, d,
@@ -257,7 +257,7 @@  device_dev2host (void *h, const void *d, size_t s)
 }
 
 STATIC void
-device_run (void *fn_ptr, void *vars)
+GOMP_OFFLOAD_run (void *fn_ptr, void *vars)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p, %p)\n", __FILE__, __FUNCTION__, fn_ptr,
@@ -270,16 +270,17 @@  device_run (void *fn_ptr, void *vars)
 }
 
 STATIC void
-openacc_parallel (void (*fn) (void *), size_t mapnum __attribute__((unused)),
-		  void **hostaddrs __attribute__((unused)),
-		  void **devaddrs __attribute__((unused)),
-		  size_t *sizes __attribute__((unused)),
-		  unsigned short *kinds __attribute__((unused)),
-		  int num_gangs __attribute__((unused)),
-		  int num_workers __attribute__((unused)),
-		  int vector_length __attribute__((unused)),
-		  int async __attribute__((unused)),
-		  void *targ_mem_desc __attribute__((unused)))
+GOMP_OFFLOAD_openacc_parallel (void (*fn) (void *),
+			       size_t mapnum __attribute__((unused)),
+			       void **hostaddrs __attribute__((unused)),
+			       void **devaddrs __attribute__((unused)),
+			       size_t *sizes __attribute__((unused)),
+			       unsigned short *kinds __attribute__((unused)),
+			       int num_gangs __attribute__((unused)),
+			       int num_workers __attribute__((unused)),
+			       int vector_length __attribute__((unused)),
+			       int async __attribute__((unused)),
+			       void *targ_mem_desc __attribute__((unused)))
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%p, %zu, %p, %p, %p, %d, %d, %d, %d, %p)\n",
@@ -295,7 +296,7 @@  openacc_parallel (void (*fn) (void *), size_t mapnum __attribute__((unused)),
 }
 
 STATIC void
-openacc_register_async_cleanup (void *targ_mem_desc)
+GOMP_OFFLOAD_openacc_register_async_cleanup (void *targ_mem_desc)
 {
 #ifdef HOST_NONSHM_PLUGIN
   /* "Asynchronous" launches are executed synchronously on the (non-SHM) host,
@@ -305,7 +306,7 @@  openacc_register_async_cleanup (void *targ_mem_desc)
 }
 
 STATIC void
-openacc_async_set_async (int async __attribute__((unused)))
+GOMP_OFFLOAD_openacc_async_set_async (int async __attribute__((unused)))
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%d)\n", __FILE__, __FUNCTION__, async);
@@ -313,7 +314,7 @@  openacc_async_set_async (int async __attribute__((unused)))
 }
 
 STATIC int
-openacc_async_test (int async __attribute__((unused)))
+GOMP_OFFLOAD_openacc_async_test (int async __attribute__((unused)))
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%d)\n", __FILE__, __FUNCTION__, async);
@@ -323,7 +324,7 @@  openacc_async_test (int async __attribute__((unused)))
 }
 
 STATIC int
-openacc_async_test_all (void)
+GOMP_OFFLOAD_openacc_async_test_all (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -333,7 +334,7 @@  openacc_async_test_all (void)
 }
 
 STATIC void
-openacc_async_wait (int async __attribute__((unused)))
+GOMP_OFFLOAD_openacc_async_wait (int async __attribute__((unused)))
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%d)\n", __FILE__, __FUNCTION__, async);
@@ -341,7 +342,7 @@  openacc_async_wait (int async __attribute__((unused)))
 }
 
 STATIC void
-openacc_async_wait_all (void)
+GOMP_OFFLOAD_openacc_async_wait_all (void)
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s\n", __FILE__, __FUNCTION__);
@@ -349,8 +350,8 @@  openacc_async_wait_all (void)
 }
 
 STATIC void
-openacc_async_wait_async (int async1 __attribute__((unused)),
-                	  int async2 __attribute__((unused)))
+GOMP_OFFLOAD_openacc_async_wait_async (int async1 __attribute__((unused)),
+				       int async2 __attribute__((unused)))
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%d, %d)\n", __FILE__, __FUNCTION__, async1,
@@ -359,7 +360,7 @@  openacc_async_wait_async (int async1 __attribute__((unused)),
 }
 
 STATIC void
-openacc_async_wait_all_async (int async __attribute__((unused)))
+GOMP_OFFLOAD_openacc_async_wait_all_async (int async __attribute__((unused)))
 {
 #ifdef DEBUG
   fprintf (stderr, SELF "%s:%s (%d)\n", __FILE__, __FUNCTION__, async);
@@ -367,13 +368,13 @@  openacc_async_wait_all_async (int async __attribute__((unused)))
 }
 
 STATIC void *
-openacc_create_thread_data (void *targ_data __attribute__((unused)))
+GOMP_OFFLOAD_openacc_create_thread_data (void *targ_data __attribute__((unused)))
 {
   return NULL;
 }
 
 STATIC void
-openacc_destroy_thread_data (void *tls_data __attribute__((unused)))
+GOMP_OFFLOAD_openacc_destroy_thread_data (void *tls_data __attribute__((unused)))
 {
 }
 
@@ -390,47 +391,48 @@  static struct gomp_device_descr host_dispatch =
     .is_initialized = false,
     .offload_regions_registered = false,
 
-    .get_name_func = get_name,
-    .get_type_func = get_type,
-    .get_caps_func = get_caps,
+    .get_name_func = GOMP_OFFLOAD_get_name,
+    .get_type_func = GOMP_OFFLOAD_get_type,
+    .get_caps_func = GOMP_OFFLOAD_get_caps,
 
-    .device_init_func = device_init,
-    .device_fini_func = device_fini,
-    .get_num_devices_func = get_num_devices,
-    .offload_register_func = offload_register,
-    .device_get_table_func = device_get_table,
+    .init_device_func = GOMP_OFFLOAD_init_device,
+    .fini_device_func = GOMP_OFFLOAD_fini_device,
+    .get_num_devices_func = GOMP_OFFLOAD_get_num_devices,
+    .register_image_func = GOMP_OFFLOAD_register_image,
+    .get_table_func = GOMP_OFFLOAD_get_table,
 
-    .device_alloc_func = device_alloc,
-    .device_free_func = device_free,
-    .device_host2dev_func = device_host2dev,
-    .device_dev2host_func = device_dev2host,
+    .alloc_func = GOMP_OFFLOAD_alloc,
+    .free_func = GOMP_OFFLOAD_free,
+    .host2dev_func = GOMP_OFFLOAD_host2dev,
+    .dev2host_func = GOMP_OFFLOAD_dev2host,
     
-    .device_run_func = device_run,
+    .run_func = GOMP_OFFLOAD_run,
 
     .openacc = {
-      .open_device_func = openacc_open_device,
-      .close_device_func = openacc_close_device,
+      .open_device_func = GOMP_OFFLOAD_openacc_open_device,
+      .close_device_func = GOMP_OFFLOAD_openacc_close_device,
 
-      .get_device_num_func = openacc_get_device_num,
-      .set_device_num_func = openacc_set_device_num,
+      .get_device_num_func = GOMP_OFFLOAD_openacc_get_device_num,
+      .set_device_num_func = GOMP_OFFLOAD_openacc_set_device_num,
 
       /* Device available.  */
-      .avail_func = openacc_avail,
+      .avail_func = GOMP_OFFLOAD_openacc_avail,
 
-      .exec_func = openacc_parallel,
+      .exec_func = GOMP_OFFLOAD_openacc_parallel,
 
-      .register_async_cleanup_func = openacc_register_async_cleanup,
+      .register_async_cleanup_func
+	= GOMP_OFFLOAD_openacc_register_async_cleanup,
 
-      .async_set_async_func = openacc_async_set_async,
-      .async_test_func = openacc_async_test,
-      .async_test_all_func = openacc_async_test_all,
-      .async_wait_func = openacc_async_wait,
-      .async_wait_async_func = openacc_async_wait_async,
-      .async_wait_all_func = openacc_async_wait_all,
-      .async_wait_all_async_func = openacc_async_wait_all_async,
+      .async_set_async_func = GOMP_OFFLOAD_openacc_async_set_async,
+      .async_test_func = GOMP_OFFLOAD_openacc_async_test,
+      .async_test_all_func = GOMP_OFFLOAD_openacc_async_test_all,
+      .async_wait_func = GOMP_OFFLOAD_openacc_async_wait,
+      .async_wait_async_func = GOMP_OFFLOAD_openacc_async_wait_async,
+      .async_wait_all_func = GOMP_OFFLOAD_openacc_async_wait_all,
+      .async_wait_all_async_func = GOMP_OFFLOAD_openacc_async_wait_all_async,
 
-      .create_thread_data_func = openacc_create_thread_data,
-      .destroy_thread_data_func = openacc_destroy_thread_data,
+      .create_thread_data_func = GOMP_OFFLOAD_openacc_create_thread_data,
+      .destroy_thread_data_func = GOMP_OFFLOAD_openacc_destroy_thread_data,
 
       .cuda = {
 	.get_current_device_func = NULL,
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index 6f72578..6db3421 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -417,7 +417,7 @@  acc_get_num_devices (acc_device_t d)
   if (!acc_dev)
     return 0;
 
-  n = acc_dev->device_init_func ();
+  n = acc_dev->init_device_func ();
   if (n < 0)
     n = 0;
 
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 0c45d19..981418c 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -112,7 +112,7 @@  acc_malloc (size_t s)
 
   ACC_lazy_initialize ();
 
-  return base_dev->device_alloc_func (s);
+  return base_dev->alloc_func (s);
 }
 
 /* OpenACC 2.0a (3.2.16) doesn't specify what to do in the event
@@ -139,7 +139,7 @@  acc_free (void *d)
      acc_unmap_data ((void *)(k->host_start + offset));
    }
 
-  base_dev->device_free_func (d);
+  base_dev->free_func (d);
 }
 
 void
@@ -147,7 +147,7 @@  acc_memcpy_to_device (void *d, void *h, size_t s)
 {
   /* No need to call lazy open here, as the device pointer must have
      been obtained from a routine that did that.  */
-  base_dev->device_host2dev_func (d, h, s);
+  base_dev->host2dev_func (d, h, s);
 }
 
 void
@@ -155,7 +155,7 @@  acc_memcpy_from_device (void *h, void *d, size_t s)
 {
   /* No need to call lazy open here, as the device pointer must have
      been obtained from a routine that did that.  */
-  base_dev->device_dev2host_func (h, d, s);
+  base_dev->dev2host_func (h, d, s);
 }
 
 /* Return the device pointer that corresponds to host data H.  Or NULL
@@ -449,11 +449,11 @@  delete_copyout (unsigned f, void *h, size_t s)
         	(void *) n->host_start, (int) host_size, (void *) h, (int) s);
 
   if (f & DC_Copyout)
-    acc_dev->device_dev2host_func (h, d, s);
+    acc_dev->dev2host_func (h, d, s);
   
   acc_unmap_data (h);
 
-  acc_dev->device_free_func (d);
+  acc_dev->free_func (d);
 }
 
 void
@@ -486,9 +486,9 @@  update_dev_host (int is_dev, void *h, size_t s)
   d = (void *) (n->tgt->tgt_start + n->tgt_offset);
 
   if (is_dev)
-    acc_dev->device_host2dev_func (d, h, s);
+    acc_dev->host2dev_func (d, h, s);
   else
-    acc_dev->device_dev2host_func (h, d, s);
+    acc_dev->dev2host_func (h, d, s);
 }
 
 void
diff --git a/libgomp/plugin-nvptx.c b/libgomp/plugin-nvptx.c
index 4db2f32..4271c69 100644
--- a/libgomp/plugin-nvptx.c
+++ b/libgomp/plugin-nvptx.c
@@ -1493,7 +1493,7 @@  PTX_set_cuda_stream (int async, void *stream)
 
 
 int
-get_type (void)
+GOMP_OFFLOAD_get_type (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1503,19 +1503,19 @@  get_type (void)
 }
 
 unsigned int
-get_caps (void)
+GOMP_OFFLOAD_get_caps (void)
 {
   return TARGET_CAP_OPENACC_200;
 }
 
 const char *
-get_name (void)
+GOMP_OFFLOAD_get_name (void)
 {
   return "nvidia";
 }
 
 int
-get_num_devices (void)
+GOMP_OFFLOAD_get_num_devices (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1528,7 +1528,7 @@  static void **kernel_target_data;
 static void **kernel_host_table;
 
 void
-offload_register (void *host_table, void *target_data)
+GOMP_OFFLOAD_register_image (void *host_table, void *target_data)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%p, %p)\n", __FILE__, __FUNCTION__,
@@ -1540,7 +1540,7 @@  offload_register (void *host_table, void *target_data)
 }
 
 int
-device_init (void)
+GOMP_OFFLOAD_init_device (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1550,7 +1550,7 @@  device_init (void)
 }
 
 int
-device_fini (void)
+GOMP_OFFLOAD_fini_device (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1560,7 +1560,7 @@  device_fini (void)
 }
 
 int
-device_get_table (struct mapping_table **tablep)
+GOMP_OFFLOAD_get_table (struct mapping_table **tablep)
 {
   CUmodule module;
   void **fn_table;
@@ -1623,7 +1623,7 @@  device_get_table (struct mapping_table **tablep)
 }
 
 void *
-device_alloc (size_t size)
+GOMP_OFFLOAD_alloc (size_t size)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%zu)\n", __FILE__, __FUNCTION__,
@@ -1634,7 +1634,7 @@  device_alloc (size_t size)
 }
 
 void
-device_free (void *ptr)
+GOMP_OFFLOAD_free (void *ptr)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%p)\n", __FILE__, __FUNCTION__, ptr);
@@ -1644,7 +1644,7 @@  device_free (void *ptr)
 }
 
 void *
-device_dev2host (void *dst, const void *src, size_t n)
+GOMP_OFFLOAD_dev2host (void *dst, const void *src, size_t n)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%p, %p, %zu)\n", __FILE__,
@@ -1656,7 +1656,7 @@  device_dev2host (void *dst, const void *src, size_t n)
 }
 
 void *
-device_host2dev (void *dst, const void *src, size_t n)
+GOMP_OFFLOAD_host2dev (void *dst, const void *src, size_t n)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%p, %p, %zu)\n", __FILE__,
@@ -1669,10 +1669,11 @@  device_host2dev (void *dst, const void *src, size_t n)
 void (*device_run) (void *fn_ptr, void *vars) = NULL;
 
 void
-openacc_parallel (void (*fn) (void *), size_t mapnum, void **hostaddrs,
-		  void **devaddrs, size_t *sizes, unsigned short *kinds,
-		  int num_gangs, int num_workers, int vector_length,
-		  int async, void *targ_mem_desc)
+GOMP_OFFLOAD_openacc_parallel (void (*fn) (void *), size_t mapnum,
+			      void **hostaddrs, void **devaddrs, size_t *sizes,
+			      unsigned short *kinds, int num_gangs,
+			      int num_workers, int vector_length, int async,
+			      void *targ_mem_desc)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%p, %zu, %p, %p, %p, %d, %d, %d, "
@@ -1685,7 +1686,7 @@  openacc_parallel (void (*fn) (void *), size_t mapnum, void **hostaddrs,
 }
 
 void *
-openacc_open_device (int n)
+GOMP_OFFLOAD_openacc_open_device (int n)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d)\n", __FILE__, __FUNCTION__, n);
@@ -1694,7 +1695,7 @@  openacc_open_device (int n)
 }
 
 int
-openacc_close_device (void *h)
+GOMP_OFFLOAD_openacc_close_device (void *h)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%p)\n", __FILE__, __FUNCTION__, h);
@@ -1703,7 +1704,7 @@  openacc_close_device (void *h)
 }
 
 void
-openacc_set_device_num (int n)
+GOMP_OFFLOAD_openacc_set_device_num (int n)
 {
   struct nvptx_thread *nvthd = nvptx_thread ();
 
@@ -1719,7 +1720,7 @@  openacc_set_device_num (int n)
    (oacc-init.c:acc_get_device_num) handle it.  */
 
 int
-openacc_get_device_num (void)
+GOMP_OFFLOAD_openacc_get_device_num (void)
 {
   struct nvptx_thread *nvthd = nvptx_thread ();
 
@@ -1730,7 +1731,7 @@  openacc_get_device_num (void)
 }
 
 bool
-openacc_avail (void)
+GOMP_OFFLOAD_openacc_avail (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1739,7 +1740,7 @@  openacc_avail (void)
 }
 
 void
-openacc_register_async_cleanup (void *targ_mem_desc)
+GOMP_OFFLOAD_openacc_register_async_cleanup (void *targ_mem_desc)
 {
   CUevent *e;
   CUresult r;
@@ -1764,7 +1765,7 @@  openacc_register_async_cleanup (void *targ_mem_desc)
 }
 
 int
-openacc_async_test (int async)
+GOMP_OFFLOAD_openacc_async_test (int async)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d)\n", __FILE__, __FUNCTION__,
@@ -1774,7 +1775,7 @@  openacc_async_test (int async)
 }
 
 int
-openacc_async_test_all (void)
+GOMP_OFFLOAD_openacc_async_test_all (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1783,7 +1784,7 @@  openacc_async_test_all (void)
 }
 
 void
-openacc_async_wait (int async)
+GOMP_OFFLOAD_openacc_async_wait (int async)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d)\n", __FILE__, __FUNCTION__,
@@ -1793,7 +1794,7 @@  openacc_async_wait (int async)
 }
 
 void
-openacc_async_wait_async (int async1, int async2)
+GOMP_OFFLOAD_openacc_async_wait_async (int async1, int async2)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d, %d)\n", __FILE__, __FUNCTION__,
@@ -1803,7 +1804,7 @@  openacc_async_wait_async (int async1, int async2)
 }
 
 void
-openacc_async_wait_all (void)
+GOMP_OFFLOAD_openacc_async_wait_all (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1812,7 +1813,7 @@  openacc_async_wait_all (void)
 }
 
 void
-openacc_async_wait_all_async (int async)
+GOMP_OFFLOAD_openacc_async_wait_all_async (int async)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d)\n", __FILE__, __FUNCTION__,
@@ -1822,7 +1823,7 @@  openacc_async_wait_all_async (int async)
 }
 
 void
-openacc_async_set_async (int async)
+GOMP_OFFLOAD_openacc_async_set_async (int async)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d)\n", __FILE__, __FUNCTION__,
@@ -1832,7 +1833,7 @@  openacc_async_set_async (int async)
 }
 
 void *
-openacc_create_thread_data (void *targ_data)
+GOMP_OFFLOAD_openacc_create_thread_data (void *targ_data)
 {
   struct PTX_device *ptx_dev = (struct PTX_device *) targ_data;
   struct nvptx_thread *nvthd
@@ -1860,13 +1861,13 @@  openacc_create_thread_data (void *targ_data)
 }
 
 void
-openacc_destroy_thread_data (void *data)
+GOMP_OFFLOAD_openacc_destroy_thread_data (void *data)
 {
   free (data);
 }
 
 void *
-openacc_get_current_cuda_device (void)
+GOMP_OFFLOAD_openacc_get_current_cuda_device (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1875,7 +1876,7 @@  openacc_get_current_cuda_device (void)
 }
 
 void *
-openacc_get_current_cuda_context (void)
+GOMP_OFFLOAD_openacc_get_current_cuda_context (void)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s\n", __FILE__, __FUNCTION__);
@@ -1886,7 +1887,7 @@  openacc_get_current_cuda_context (void)
 /* NOTE: This returns a CUstream, not a PTX_stream pointer.  */
 
 void *
-openacc_get_cuda_stream (int async)
+GOMP_OFFLOAD_openacc_get_cuda_stream (int async)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d)\n", __FILE__, __FUNCTION__,
@@ -1898,7 +1899,7 @@  openacc_get_cuda_stream (int async)
 /* NOTE: This takes a CUstream, not a PTX_stream pointer.  */
 
 int
-openacc_set_cuda_stream (int async, void *stream)
+GOMP_OFFLOAD_openacc_set_cuda_stream (int async, void *stream)
 {
 #ifdef DEBUG
   fprintf (stderr, "libgomp plugin: %s:%s (%d, %p)\n", __FILE__, __FUNCTION__,
diff --git a/libgomp/target.c b/libgomp/target.c
index 507488e..8ce31a1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -284,7 +284,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
       /* Allocate tgt_align aligned tgt_size block of memory.  */
       /* FIXME: Perhaps change interface to allocate properly aligned
 	 memory.  */
-      tgt->to_free = devicep->device_alloc_func (tgt_size + tgt_align - 1);
+      tgt->to_free = devicep->alloc_func (tgt_size + tgt_align - 1);
       tgt->tgt_start = (uintptr_t) tgt->to_free;
       tgt->tgt_start = (tgt->tgt_start + tgt_align - 1) & ~(tgt_align - 1);
       tgt->tgt_end = tgt->tgt_start + tgt_size;
@@ -361,7 +361,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 		    /* FIXME: Perhaps add some smarts, like if copying
 		       several adjacent fields from host to target, use some
 		       host buffer to avoid sending each var individually.  */
-		    devicep->device_host2dev_func
+		    devicep->host2dev_func
 		      ((void *) (tgt->tgt_start + k->tgt_offset),
 		       (void *) k->host_start,
 		       k->host_end - k->host_start);
@@ -374,7 +374,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 			cur_node.tgt_offset = (uintptr_t) NULL;
 			/* Copy from host to device memory.  */
 			/* FIXME: see above FIXME comment.  */
-			devicep->device_host2dev_func
+			devicep->host2dev_func
 			  ((void *) (tgt->tgt_start + k->tgt_offset),
 			   (void *) &cur_node.tgt_offset,
 			   sizeof (void *));
@@ -409,7 +409,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 		    cur_node.tgt_offset -= sizes[i];
 		    /* Copy from host to device memory.  */
 		    /* FIXME: see above FIXME comment.  */
-		    devicep->device_host2dev_func
+		    devicep->host2dev_func
 		      ((void *) (tgt->tgt_start + k->tgt_offset),
 		       (void *) &cur_node.tgt_offset,
 		       sizeof (void *));
@@ -418,11 +418,11 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 		    {
 		      /* Copy from host to device memory.  */
 		      /* FIXME: see above FIXME comment.  */
-		      devicep->device_host2dev_func
+		      devicep->host2dev_func
 				((void *) (tgt->tgt_start + k->tgt_offset),
 				(void *) k->host_start,
 				(k->host_end - k->host_start));
-		      devicep->device_host2dev_func
+		      devicep->host2dev_func
 				((void *) (tgt->tgt_start + k->tgt_offset),
 				(void *) &tgt->tgt_start,
 				sizeof (void *));
@@ -446,7 +446,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 			        cur_node.tgt_offset = (uintptr_t) NULL;
 			        /* Copy from host to device memory.  */
 			        /* FIXME: see above FIXME comment.  */
-			        devicep->device_host2dev_func
+			        devicep->host2dev_func
 				  ((void *) (tgt->tgt_start + k->tgt_offset
 					   + ((uintptr_t) hostaddrs[j]
 					      - k->host_start)),
@@ -488,7 +488,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 			    /* Copy from host to device memory.  */
 			    /* FIXME: see above FIXME comment.  */
 
-			    devicep->device_host2dev_func
+			    devicep->host2dev_func
 				((void *) (tgt->tgt_start + k->tgt_offset
 				       + ((uintptr_t) hostaddrs[j]
 					  - k->host_start)),
@@ -511,7 +511,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 		    case GOMP_MAP_FORCE_DEVICEPTR:
 		      assert (k->host_end - k->host_start == sizeof (void *));
 		      
-		      devicep->device_host2dev_func
+		      devicep->host2dev_func
 		        ((void *) (tgt->tgt_start + k->tgt_offset),
 			 (void *) k->host_start,
 			 sizeof (void *));
@@ -545,7 +545,7 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 				  + tgt->list[i]->tgt_offset;
 	  /* Copy from host to device memory.  */
 	  /* FIXME: see above FIXME comment.  */
-	  devicep->device_host2dev_func
+	  devicep->host2dev_func
 	    ((void *) (tgt->tgt_start + i * sizeof (void *)),
 	     (void *) &cur_node.tgt_offset,
 	     sizeof (void *));
@@ -561,7 +561,7 @@  gomp_unmap_tgt (struct target_mem_desc *tgt)
 {
   /* Deallocate on target the tgt->tgt_start .. tgt->tgt_end region.  */
   if (tgt->tgt_end)
-    tgt->device_descr->device_free_func(tgt->to_free);
+    tgt->device_descr->free_func (tgt->to_free);
 
   free (tgt->array);
   free (tgt);
@@ -595,7 +595,7 @@  gomp_copy_from_async (struct target_mem_desc *tgt)
 	splay_tree_key k = tgt->list[i];
 	if (k->copy_from)
 	  /* Copy from device to host memory.  */
-	  devicep->device_dev2host_func
+	  devicep->dev2host_func
 	    ((void *) k->host_start,
 	     (void *) (k->tgt->tgt_start + k->tgt_offset),
 	     k->host_end - k->host_start);
@@ -634,7 +634,7 @@  gomp_unmap_vars (struct target_mem_desc *tgt, bool do_copyfrom)
 	splay_tree_key k = tgt->list[i];
 	if (k->copy_from && do_copyfrom)
 	  /* Copy from device to host memory.  */
-	  devicep->device_dev2host_func
+	  devicep->dev2host_func
 	    ((void *) k->host_start,
 	     (void *) (k->tgt->tgt_start + k->tgt_offset),
 	     k->host_end - k->host_start);
@@ -688,7 +688,7 @@  gomp_update (struct gomp_device_descr *devicep, struct gomp_memory_mapping *mm,
 			  (void *) n->host_end);
 	    if (GOMP_MAP_COPYTO_P (kind & typemask))
 	      /* Copy from host to device memory.  */
-	      devicep->device_host2dev_func
+	      devicep->host2dev_func
 		((void *) (n->tgt->tgt_start
 			   + n->tgt_offset
 			   + cur_node.host_start
@@ -697,7 +697,7 @@  gomp_update (struct gomp_device_descr *devicep, struct gomp_memory_mapping *mm,
 		 cur_node.host_end - cur_node.host_start);
 	    else if (GOMP_MAP_COPYFROM_P (kind & typemask))
 	      /* Copy from device to host memory.  */
-	      devicep->device_dev2host_func
+	      devicep->dev2host_func
 		((void *) cur_node.host_start,
 		 (void *) (n->tgt->tgt_start
 			   + n->tgt_offset
@@ -740,7 +740,7 @@  attribute_hidden void
 gomp_init_device (struct gomp_device_descr *devicep)
 {
   /* Initialize the target device.  */
-  devicep->device_init_func ();
+  devicep->init_device_func ();
 
   devicep->is_initialized = true;
 }
@@ -751,7 +751,7 @@  gomp_init_tables (const struct gomp_device_descr *devicep,
 {
   /* Get address mapping table for device.  */
   struct mapping_table *table = NULL;
-  int i, num_entries = devicep->device_get_table_func (&table);
+  int i, num_entries = devicep->get_table_func (&table);
 
   /* Insert host-target address mapping into dev_splay_tree.  */
   for (i = 0; i < num_entries; i++)
@@ -808,7 +808,7 @@  attribute_hidden void
 gomp_fini_device (struct gomp_device_descr *devicep)
 {
   if (devicep->is_initialized)
-    devicep->device_fini_func ();
+    devicep->fini_device_func ();
 
   devicep->is_initialized = false;
 }
@@ -872,9 +872,9 @@  GOMP_target (int device, void (*fn) (void *), const void *openmp_target,
       thr->ts.place_partition_len = gomp_places_list_len;
     }
   if (devicep->capabilities & TARGET_CAP_NATIVE_EXEC)
-    devicep->device_run_func (fn, (void *) tgt_vars->tgt_start);
+    devicep->run_func (fn, (void *) tgt_vars->tgt_start);
   else
-    devicep->device_run_func ((void *) tgt_fn->tgt->tgt_start,
+    devicep->run_func ((void *) tgt_fn->tgt->tgt_start,
-			      (void *) tgt_vars->tgt_start);
+		       (void *) tgt_vars->tgt_start);
   gomp_free_thread (thr);
   *thr = old_thr;
@@ -1001,7 +1001,8 @@  gomp_load_plugin_for_device (struct gomp_device_descr *device,
 #define DLSYM(f) \
   do									\
     {									\
-      device->f##_func = dlsym (device->plugin_handle, #f);		\
+      device->f##_func = dlsym (device->plugin_handle,			\
+				"GOMP_OFFLOAD_" #f);			\
       err = dlerror ();							\
       if (err != NULL)							\
 	goto out;							\
@@ -1012,7 +1013,8 @@  gomp_load_plugin_for_device (struct gomp_device_descr *device,
   do									\
     {									\
       char *tmp_err;							\
-      device->f##_func = dlsym (device->plugin_handle, #n);		\
+      device->f##_func = dlsym (device->plugin_handle,			\
+				"GOMP_OFFLOAD_" #n);			\
       tmp_err = dlerror ();						\
       if (tmp_err == NULL)						\
         optional_present++;						\
@@ -1026,17 +1028,17 @@  gomp_load_plugin_for_device (struct gomp_device_descr *device,
   DLSYM (get_caps);
   DLSYM (get_type);
   DLSYM (get_num_devices);
-  DLSYM (offload_register);
-  DLSYM (device_init);
-  DLSYM (device_fini);
-  DLSYM (device_get_table);
-  DLSYM (device_alloc);
-  DLSYM (device_free);
-  DLSYM (device_dev2host);
-  DLSYM (device_host2dev);
+  DLSYM (register_image);
+  DLSYM (init_device);
+  DLSYM (fini_device);
+  DLSYM (get_table);
+  DLSYM (alloc);
+  DLSYM (free);
+  DLSYM (dev2host);
+  DLSYM (host2dev);
   device->capabilities = device->get_caps_func ();
   if (device->capabilities & TARGET_CAP_OPENMP_400)
-    DLSYM (device_run);
+    DLSYM (run);
   if (device->capabilities & TARGET_CAP_OPENACC_200)
     {
       optional_present = optional_total = 0;
@@ -1102,7 +1104,7 @@  gomp_register_image_for_device (struct gomp_device_descr *device,
   if (!device->offload_regions_registered
       && (device->type == image->type || device->type == TARGET_TYPE_HOST))
     {
-      device->offload_register_func (image->host_table, image->target_data);
+      device->register_image_func (image->host_table, image->target_data);
       device->offload_regions_registered = true;
     }
 }
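
Note on the DLSYM change in target.c above: the struct field keeps its short
name (f##_func), while the symbol looked up in the plugin gains the prefix,
because the adjacent string literals "GOMP_OFFLOAD_" and #f are concatenated
by the compiler.  The following is only a minimal sketch of that mechanism,
not the libgomp code: fake_dlsym, fake_device, FAKE_DLSYM and
fake_load_plugin are hypothetical names, and a fixed table stands in for
dlsym so that the example does not need a real plugin.

```c
#include <stddef.h>
#include <string.h>

typedef int (*hook_fn) (void);

/* Hypothetical stand-ins for two plugin entry points.  A real plugin
   would export these from a shared object.  */
static int GOMP_OFFLOAD_init_device (void) { return 0; }
static int GOMP_OFFLOAD_fini_device (void) { return 0; }

/* Hypothetical replacement for dlsym: look NAME up in a fixed table.  */
struct fake_symbol
{
  const char *name;
  hook_fn fn;
};

static const struct fake_symbol fake_symbols[] = {
  { "GOMP_OFFLOAD_init_device", GOMP_OFFLOAD_init_device },
  { "GOMP_OFFLOAD_fini_device", GOMP_OFFLOAD_fini_device },
};

static hook_fn
fake_dlsym (const char *name)
{
  size_t i;
  for (i = 0; i < sizeof fake_symbols / sizeof fake_symbols[0]; i++)
    if (strcmp (fake_symbols[i].name, name) == 0)
      return fake_symbols[i].fn;
  return NULL;
}

/* Cut-down, hypothetical analogue of struct gomp_device_descr.  */
struct fake_device
{
  hook_fn init_device_func;
  hook_fn fini_device_func;
};

/* Same token pasting and stringizing as DLSYM in the patch:
   FAKE_DLSYM (dev, init_device) assigns dev->init_device_func from the
   symbol named "GOMP_OFFLOAD_init_device".  */
#define FAKE_DLSYM(dev, f) \
  ((dev)->f##_func = fake_dlsym ("GOMP_OFFLOAD_" #f))

/* Return 0 if every hook was found, -1 otherwise.  */
static int
fake_load_plugin (struct fake_device *dev)
{
  if (FAKE_DLSYM (dev, init_device) == NULL)
    return -1;
  if (FAKE_DLSYM (dev, fini_device) == NULL)
    return -1;
  return 0;
}
```

In this sketch a lookup of the unprefixed name "init_device" fails, which is
the behaviour one would expect for a plugin that still exports only the old
hook names.
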
diff --git a/libgomp/target.h b/libgomp/target.h
index d4c1120..abe9678 100644
--- a/libgomp/target.h
+++ b/libgomp/target.h
@@ -178,15 +178,15 @@  struct gomp_device_descr
   unsigned int (*get_caps_func) (void);
   int (*get_type_func) (void);
   int (*get_num_devices_func) (void);
-  void (*offload_register_func) (void *, void *);
-  int (*device_init_func) (void);
-  int (*device_fini_func) (void);
-  int (*device_get_table_func) (struct mapping_table **);
-  void *(*device_alloc_func) (size_t);
-  void (*device_free_func) (void *);
-  void *(*device_dev2host_func) (void *, const void *, size_t);
-  void *(*device_host2dev_func) (void *, const void *, size_t);
-  void (*device_run_func) (void *, void *);
+  void (*register_image_func) (void *, void *);
+  int (*init_device_func) (void);
+  int (*fini_device_func) (void);
+  int (*get_table_func) (struct mapping_table **);
+  void *(*alloc_func) (size_t);
+  void (*free_func) (void *);
+  void *(*dev2host_func) (void *, const void *, size_t);
+  void *(*host2dev_func) (void *, const void *, size_t);
+  void (*run_func) (void *, void *);
 
   /* OpenACC-specific functions.  */
   ACC_dispatch_t openacc;