diff mbox

[1/4,libgomp] Resolve deadlock on plugin exit

Message ID 56EFCB0E.7010308@codesourcery.com
State New
Headers show

Commit Message

Chung-Lin Tang March 21, 2016, 10:21 a.m. UTC
Hi, this is the set of patches from https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01411.html
revised again, this time also with audits for the HSA plugin.

The changes are pretty minor, mainly that the unload_image hook now
receives similar error handling treatment.

Tested again without regressions for nvptx and intelmic, however
while I was able to build the toolchain with HSA offloading support, I was
unsure how I could test it, as I currently don't have any AMD hardware (not
aware if there's an emulator like intelmic).  I would be grateful if
the HSA folks can run them for me.

Thanks,
Chung-Lin

ChangeLog for the libgomp proper parts, patch as attached.

2016-03-20  Chung-Lin Tang  <cltang@codesourcery.com>

        * target.c (gomp_device_copy): New function.
        (gomp_copy_host2dev): Likewise.
        (gomp_copy_dev2host): Likewise.
        (gomp_free_device_memory): Likewise.
        (gomp_map_vars_existing): Adjust to call gomp_copy_host2dev().
        (gomp_map_pointer): Likewise.
        (gomp_map_vars): Adjust to call gomp_copy_host2dev(), handle
        NULL value from alloc_func plugin hook.
        (gomp_unmap_tgt): Adjust to call gomp_free_device_memory().
        (gomp_copy_from_async): Adjust to call gomp_copy_dev2host().
        (gomp_unmap_vars): Likewise.
        (gomp_update): Adjust to call gomp_copy_dev2host() and
        gomp_copy_host2dev() functions.
        (gomp_unload_image_from_device): Handle false value from
        unload_image_func plugin hook.
        (gomp_init_device): Handle false value from init_device_func
        plugin hook.
        (gomp_exit_data): Adjust to call gomp_copy_dev2host().
        (omp_target_free): Adjust to call gomp_free_device_memory().
        (omp_target_memcpy): Handle return values from host2dev_func,
        dev2host_func, and dev2dev_func plugin hooks.
        (omp_target_memcpy_rect_worker): Likewise.
        (gomp_target_fini): Handle false value from fini_device_func
        plugin hook.
        * libgomp.h (struct gomp_device_descr): Adjust return type of
        init_device_func, fini_device_func, unload_image_func, free_func,
        dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'.
        * oacc-host.c (host_init_device): Change return type to bool.
        (host_fini_device): Likewise.
        (host_unload_image): Likewise.
        (host_free): Likewise.
        (host_dev2host): Likewise.
        (host_host2dev): Likewise.
        * oacc-mem.c (acc_free): Handle plugin hook fatal error case.
        (acc_memcpy_to_device): Likewise.
        (acc_memcpy_from_device): Likewise.
        (delete_copyout): Add libfnname parameter, handle free_func
        hook fatal error case.
        (acc_delete): Adjust delete_copyout call.
        (acc_copyout): Likewise.
        (update_dev_host): Move gomp_mutex_unlock to after
        host2dev/dev2host hook calls.

Comments

Alexander Monakov March 21, 2016, 11:01 a.m. UTC | #1
Hi,

I'd like to note that I have a small patch on gomp-nvptx branch that deals
with the worst user-visible regression in a non-intrusive manner:

https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01109.html

Alexander
Chung-Lin Tang March 21, 2016, 11:50 a.m. UTC | #2
On 2016/3/21 07:01 PM, Alexander Monakov wrote:
> Hi,
> 
> I'd like to note that I have a small patch on gomp-nvptx branch that deals
> with the worst user-visible regression in a non-intrusive manner:
> 
> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01109.html
> 
> Alexander
> 

That error condition you report is interesting, but seems to not be
the device lock deadlock I was fixing, see original description of problem here:
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01701.html

Anyways, my patch contains the equivalent of your fix, so that shouldn't be a problem.

Thanks,
Chung-Lin
Martin Jambor March 24, 2016, 6:25 p.m. UTC | #3
Hi,

On Mon, Mar 21, 2016 at 06:21:02PM +0800, Chung-Lin Tang wrote:
> Hi, this is the set of patches from https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01411.html
> revised again, this time also with audits for the HSA plugin.
> 
> The changes are pretty minor, mainly that the unload_image hook now
> receives similar error handling treatment.
> 
> Tested again without regressions for nvptx and intelmic, however
> while I was able to build the toolchain with HSA offloading support, I was
> unsure how I could test it, as I currently don't have any AMD hardware (not
> aware if there's an emulator like intelmic).  I would be grateful if
> the HSA folks can run them for me.

I have just tested the whole patch-set on my HSA box (i.e. gomp.exp
tests and all libgomp tests on trunk + some extra testing on the hsa
branch) and found no issues.

I have had a very superficial look over the patch and have no
objections but since I am not familiar with the issue this addresses
and because I do not have detailed understanding of the of internals
of copying data to/from devices, my opinion should not really count
much.

Nevertheless, thanks for thinking about HSA and making me aware of the
change,

Martin


> 
> Thanks,
> Chung-Lin
> 
> ChangeLog for the libgomp proper parts, patch as attached.
> 
> 2016-03-20  Chung-Lin Tang  <cltang@codesourcery.com>
> 
>         * target.c (gomp_device_copy): New function.
>         (gomp_copy_host2dev): Likewise.
>         (gomp_copy_dev2host): Likewise.
>         (gomp_free_device_memory): Likewise.
>         (gomp_map_vars_existing): Adjust to call gomp_copy_host2dev().
>         (gomp_map_pointer): Likewise.
>         (gomp_map_vars): Adjust to call gomp_copy_host2dev(), handle
>         NULL value from alloc_func plugin hook.
>         (gomp_unmap_tgt): Adjust to call gomp_free_device_memory().
>         (gomp_copy_from_async): Adjust to call gomp_copy_dev2host().
>         (gomp_unmap_vars): Likewise.
>         (gomp_update): Adjust to call gomp_copy_dev2host() and
>         gomp_copy_host2dev() functions.
>         (gomp_unload_image_from_device): Handle false value from
>         unload_image_func plugin hook.
>         (gomp_init_device): Handle false value from init_device_func
>         plugin hook.
>         (gomp_exit_data): Adjust to call gomp_copy_dev2host().
>         (omp_target_free): Adjust to call gomp_free_device_memory().
>         (omp_target_memcpy): Handle return values from host2dev_func,
>         dev2host_func, and dev2dev_func plugin hooks.
>         (omp_target_memcpy_rect_worker): Likewise.
>         (gomp_target_fini): Handle false value from fini_device_func
>         plugin hook.
>         * libgomp.h (struct gomp_device_descr): Adjust return type of
>         init_device_func, fini_device_func, unload_image_func, free_func,
>         dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'.
>         * oacc-host.c (host_init_device): Change return type to bool.
>         (host_fini_device): Likewise.
>         (host_unload_image): Likewise.
>         (host_free): Likewise.
>         (host_dev2host): Likewise.
>         (host_host2dev): Likewise.
>         * oacc-mem.c (acc_free): Handle plugin hook fatal error case.
>         (acc_memcpy_to_device): Likewise.
>         (acc_memcpy_from_device): Likewise.
>         (delete_copyout): Add libfnname parameter, handle free_func
>         hook fatal error case.
>         (acc_delete): Adjust delete_copyout call.
>         (acc_copyout): Likewise.
>         (update_dev_host): Move gomp_mutex_unlock to after
>         host2dev/dev2host hook calls.
>
Chung-Lin Tang April 16, 2016, 7:39 a.m. UTC | #4
Ping.

On 2016/3/21 06:21 PM, Chung-Lin Tang wrote:
> Hi, this is the set of patches from https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01411.html
> revised again, this time also with audits for the HSA plugin.
> 
> The changes are pretty minor, mainly that the unload_image hook now
> receives similar error handling treatment.
> 
> Tested again without regressions for nvptx and intelmic, however
> while I was able to build the toolchain with HSA offloading support, I was
> unsure how I could test it, as I currently don't have any AMD hardware (not
> aware if there's an emulator like intelmic).  I would be grateful if
> the HSA folks can run them for me.
> 
> Thanks,
> Chung-Lin
> 
> ChangeLog for the libgomp proper parts, patch as attached.
> 
> 2016-03-20  Chung-Lin Tang  <cltang@codesourcery.com>
> 
>         * target.c (gomp_device_copy): New function.
>         (gomp_copy_host2dev): Likewise.
>         (gomp_copy_dev2host): Likewise.
>         (gomp_free_device_memory): Likewise.
>         (gomp_map_vars_existing): Adjust to call gomp_copy_host2dev().
>         (gomp_map_pointer): Likewise.
>         (gomp_map_vars): Adjust to call gomp_copy_host2dev(), handle
>         NULL value from alloc_func plugin hook.
>         (gomp_unmap_tgt): Adjust to call gomp_free_device_memory().
>         (gomp_copy_from_async): Adjust to call gomp_copy_dev2host().
>         (gomp_unmap_vars): Likewise.
>         (gomp_update): Adjust to call gomp_copy_dev2host() and
>         gomp_copy_host2dev() functions.
>         (gomp_unload_image_from_device): Handle false value from
>         unload_image_func plugin hook.
>         (gomp_init_device): Handle false value from init_device_func
>         plugin hook.
>         (gomp_exit_data): Adjust to call gomp_copy_dev2host().
>         (omp_target_free): Adjust to call gomp_free_device_memory().
>         (omp_target_memcpy): Handle return values from host2dev_func,
>         dev2host_func, and dev2dev_func plugin hooks.
>         (omp_target_memcpy_rect_worker): Likewise.
>         (gomp_target_fini): Handle false value from fini_device_func
>         plugin hook.
>         * libgomp.h (struct gomp_device_descr): Adjust return type of
>         init_device_func, fini_device_func, unload_image_func, free_func,
>         dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'.
>         * oacc-host.c (host_init_device): Change return type to bool.
>         (host_fini_device): Likewise.
>         (host_unload_image): Likewise.
>         (host_free): Likewise.
>         (host_dev2host): Likewise.
>         (host_host2dev): Likewise.
>         * oacc-mem.c (acc_free): Handle plugin hook fatal error case.
>         (acc_memcpy_to_device): Likewise.
>         (acc_memcpy_from_device): Likewise.
>         (delete_copyout): Add libfnname parameter, handle free_func
>         hook fatal error case.
>         (acc_delete): Adjust delete_copyout call.
>         (acc_copyout): Likewise.
>         (update_dev_host): Move gomp_mutex_unlock to after
>         host2dev/dev2host hook calls.
>
Chung-Lin Tang May 11, 2016, 6:49 a.m. UTC | #5
Ping x2

On 2016/4/16 3:39 PM, Chung-Lin Tang wrote:
> Ping.
> 
> On 2016/3/21 06:21 PM, Chung-Lin Tang wrote:
>> Hi, this is the set of patches from https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01411.html
>> revised again, this time also with audits for the HSA plugin.
>>
>> The changes are pretty minor, mainly that the unload_image hook now
>> receives similar error handling treatment.
>>
>> Tested again without regressions for nvptx and intelmic, however
>> while I was able to build the toolchain with HSA offloading support, I was
>> unsure how I could test it, as I currently don't have any AMD hardware (not
>> aware if there's an emulator like intelmic).  I would be grateful if
>> the HSA folks can run them for me.
>>
>> Thanks,
>> Chung-Lin
>>
>> ChangeLog for the libgomp proper parts, patch as attached.
>>
>> 2016-03-20  Chung-Lin Tang  <cltang@codesourcery.com>
>>
>>         * target.c (gomp_device_copy): New function.
>>         (gomp_copy_host2dev): Likewise.
>>         (gomp_copy_dev2host): Likewise.
>>         (gomp_free_device_memory): Likewise.
>>         (gomp_map_vars_existing): Adjust to call gomp_copy_host2dev().
>>         (gomp_map_pointer): Likewise.
>>         (gomp_map_vars): Adjust to call gomp_copy_host2dev(), handle
>>         NULL value from alloc_func plugin hook.
>>         (gomp_unmap_tgt): Adjust to call gomp_free_device_memory().
>>         (gomp_copy_from_async): Adjust to call gomp_copy_dev2host().
>>         (gomp_unmap_vars): Likewise.
>>         (gomp_update): Adjust to call gomp_copy_dev2host() and
>>         gomp_copy_host2dev() functions.
>>         (gomp_unload_image_from_device): Handle false value from
>>         unload_image_func plugin hook.
>>         (gomp_init_device): Handle false value from init_device_func
>>         plugin hook.
>>         (gomp_exit_data): Adjust to call gomp_copy_dev2host().
>>         (omp_target_free): Adjust to call gomp_free_device_memory().
>>         (omp_target_memcpy): Handle return values from host2dev_func,
>>         dev2host_func, and dev2dev_func plugin hooks.
>>         (omp_target_memcpy_rect_worker): Likewise.
>>         (gomp_target_fini): Handle false value from fini_device_func
>>         plugin hook.
>>         * libgomp.h (struct gomp_device_descr): Adjust return type of
>>         init_device_func, fini_device_func, unload_image_func, free_func,
>>         dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'.
>>         * oacc-host.c (host_init_device): Change return type to bool.
>>         (host_fini_device): Likewise.
>>         (host_unload_image): Likewise.
>>         (host_free): Likewise.
>>         (host_dev2host): Likewise.
>>         (host_host2dev): Likewise.
>>         * oacc-mem.c (acc_free): Handle plugin hook fatal error case.
>>         (acc_memcpy_to_device): Likewise.
>>         (acc_memcpy_from_device): Likewise.
>>         (delete_copyout): Add libfnname parameter, handle free_func
>>         hook fatal error case.
>>         (acc_delete): Adjust delete_copyout call.
>>         (acc_copyout): Likewise.
>>         (update_dev_host): Move gomp_mutex_unlock to after
>>         host2dev/dev2host hook calls.
>>
>
Jakub Jelinek May 12, 2016, 10:25 a.m. UTC | #6
On Wed, May 11, 2016 at 02:49:03PM +0800, Chung-Lin Tang wrote:
> Ping x2

These are again plugin ABI incompatible changes, so you need to arrange
for the 6.1 plugins to be rejected by trunk libgomp with the changes
and vice versa.

	Jakub
diff mbox

Patch

Index: libgomp/libgomp.h
===================================================================
--- libgomp/libgomp.h	(revision 234358)
+++ libgomp/libgomp.h	(working copy)
@@ -932,16 +932,16 @@  struct gomp_device_descr
   unsigned int (*get_caps_func) (void);
   int (*get_type_func) (void);
   int (*get_num_devices_func) (void);
-  void (*init_device_func) (int);
-  void (*fini_device_func) (int);
+  bool (*init_device_func) (int);
+  bool (*fini_device_func) (int);
   unsigned (*version_func) (void);
   int (*load_image_func) (int, unsigned, const void *, struct addr_pair **);
-  void (*unload_image_func) (int, unsigned, const void *);
+  bool (*unload_image_func) (int, unsigned, const void *);
   void *(*alloc_func) (int, size_t);
-  void (*free_func) (int, void *);
-  void *(*dev2host_func) (int, void *, const void *, size_t);
-  void *(*host2dev_func) (int, void *, const void *, size_t);
-  void *(*dev2dev_func) (int, void *, const void *, size_t);
+  bool (*free_func) (int, void *);
+  bool (*dev2host_func) (int, void *, const void *, size_t);
+  bool (*host2dev_func) (int, void *, const void *, size_t);
+  bool (*dev2dev_func) (int, void *, const void *, size_t);
   bool (*can_run_func) (void *);
   void (*run_func) (int, void *, void *, void **);
   void (*async_run_func) (int, void *, void *, void **, void *);
Index: libgomp/oacc-host.c
===================================================================
--- libgomp/oacc-host.c	(revision 234358)
+++ libgomp/oacc-host.c	(working copy)
@@ -60,14 +60,16 @@  host_get_num_devices (void)
   return 1;
 }
 
-static void
+static bool
 host_init_device (int n __attribute__ ((unused)))
 {
+  return true;
 }
 
-static void
+static bool
 host_fini_device (int n __attribute__ ((unused)))
 {
+  return true;
 }
 
 static unsigned
@@ -85,11 +87,12 @@  host_load_image (int n __attribute__ ((unused)),
   return 0;
 }
 
-static void
+static bool
 host_unload_image (int n __attribute__ ((unused)),
 		   unsigned v __attribute__ ((unused)),
 		   const void *t __attribute__ ((unused)))
 {
+  return true;
 }
 
 static void *
@@ -98,28 +101,29 @@  host_alloc (int n __attribute__ ((unused)), size_t
   return gomp_malloc (s);
 }
 
-static void
+static bool
 host_free (int n __attribute__ ((unused)), void *p)
 {
   free (p);
+  return true;
 }
 
-static void *
+static bool
 host_dev2host (int n __attribute__ ((unused)),
 	       void *h __attribute__ ((unused)),
 	       const void *d __attribute__ ((unused)),
 	       size_t s __attribute__ ((unused)))
 {
-  return NULL;
+  return true;
 }
 
-static void *
+static bool
 host_host2dev (int n __attribute__ ((unused)),
 	       void *d __attribute__ ((unused)),
 	       const void *h __attribute__ ((unused)),
 	       size_t s __attribute__ ((unused)))
 {
-  return NULL;
+  return true;
 }
 
 static void
Index: libgomp/oacc-init.c
===================================================================
--- libgomp/oacc-init.c	(revision 234358)
+++ libgomp/oacc-init.c	(working copy)
@@ -300,8 +300,8 @@  acc_shutdown_1 (acc_device_t d)
 
   gomp_mutex_unlock (&goacc_thread_lock);
 
-
   /* Close all the devices of this type that have been opened.  */
+  bool ret = true;
   for (i = 0; i < ndevs; i++)
     {
       struct gomp_device_descr *acc_dev = &base_dev[i];
@@ -309,12 +309,15 @@  acc_shutdown_1 (acc_device_t d)
       if (acc_dev->state == GOMP_DEVICE_INITIALIZED)
         {
 	  devices_active = true;
-	  acc_dev->fini_device_func (acc_dev->target_id);
+	  ret &= acc_dev->fini_device_func (acc_dev->target_id);
 	  acc_dev->state = GOMP_DEVICE_UNINITIALIZED;
 	}
       gomp_mutex_unlock (&acc_dev->lock);
     }
 
+  if (!ret)
+    gomp_fatal ("device finalization failed");
+
   if (!devices_active)
     gomp_fatal ("no device initialized");
 }
Index: libgomp/oacc-mem.c
===================================================================
--- libgomp/oacc-mem.c	(revision 234358)
+++ libgomp/oacc-mem.c	(working copy)
@@ -142,7 +142,8 @@  acc_free (void *d)
   else
     gomp_mutex_unlock (&acc_dev->lock);
 
-  acc_dev->free_func (acc_dev->target_id, d);
+  if (!acc_dev->free_func (acc_dev->target_id, d))
+    gomp_fatal ("error in freeing device memory in %s", __FUNCTION__);
 }
 
 void
@@ -154,7 +155,8 @@  acc_memcpy_to_device (void *d, void *h, size_t s)
 
   assert (thr && thr->dev);
 
-  thr->dev->host2dev_func (thr->dev->target_id, d, h, s);
+  if (!thr->dev->host2dev_func (thr->dev->target_id, d, h, s))
+    gomp_fatal ("error in %s", __FUNCTION__);
 }
 
 void
@@ -166,7 +168,8 @@  acc_memcpy_from_device (void *h, void *d, size_t s
 
   assert (thr && thr->dev);
 
-  thr->dev->dev2host_func (thr->dev->target_id, h, d, s);
+  if (!thr->dev->dev2host_func (thr->dev->target_id, h, d, s))
+    gomp_fatal ("error in %s", __FUNCTION__);
 }
 
 /* Return the device pointer that corresponds to host data H.  Or NULL
@@ -488,7 +491,7 @@  acc_present_or_copyin (void *h, size_t s)
 #define FLAG_COPYOUT (1 << 0)
 
 static void
-delete_copyout (unsigned f, void *h, size_t s)
+delete_copyout (unsigned f, void *h, size_t s, const char *libfnname)
 {
   size_t host_size;
   splay_tree_key n;
@@ -527,18 +530,20 @@  static void
 
   acc_unmap_data (h);
 
-  acc_dev->free_func (acc_dev->target_id, d);
+  if (!acc_dev->free_func (acc_dev->target_id, d))
+    gomp_fatal ("error in freeing device memory in %s", libfnname);
 }
 
 void
 acc_delete (void *h , size_t s)
 {
-  delete_copyout (0, h, s);
+  delete_copyout (0, h, s, __FUNCTION__);
 }
 
-void acc_copyout (void *h, size_t s)
+void
+acc_copyout (void *h, size_t s)
 {
-  delete_copyout (FLAG_COPYOUT, h, s);
+  delete_copyout (FLAG_COPYOUT, h, s, __FUNCTION__);
 }
 
 static void
@@ -564,12 +569,12 @@  update_dev_host (int is_dev, void *h, size_t s)
 
   d = (void *) (n->tgt->tgt_start + n->tgt_offset);
 
-  gomp_mutex_unlock (&acc_dev->lock);
-
   if (is_dev)
     acc_dev->host2dev_func (acc_dev->target_id, d, h, s);
   else
     acc_dev->dev2host_func (acc_dev->target_id, h, d, s);
+
+  gomp_mutex_unlock (&acc_dev->lock);
 }
 
 void
Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 234358)
+++ libgomp/target.c	(working copy)
@@ -162,6 +162,45 @@  gomp_map_0len_lookup (splay_tree mem_map, splay_tr
   return n;
 }
 
+static inline void
+gomp_device_copy (struct gomp_device_descr *devicep,
+		  bool (*copy_func) (int, void *, const void *, size_t),
+		  const char *dst, void *dstaddr,
+		  const char *src, const void *srcaddr,
+		  size_t size)
+{
+  if (!copy_func (devicep->target_id, dstaddr, srcaddr, size))
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("Copying of %s object [%p..%p) to %s object [%p..%p) failed",
+		  src, srcaddr, srcaddr + size, dst, dstaddr, dstaddr + size);
+    }
+}
+
+static void
+gomp_copy_host2dev (struct gomp_device_descr *devicep,
+		    void *d, const void *h, size_t sz)
+{
+  gomp_device_copy (devicep, devicep->host2dev_func, "dev", d, "host", h, sz);
+}
+
+static void
+gomp_copy_dev2host (struct gomp_device_descr *devicep,
+		    void *h, const void *d, size_t sz)
+{
+  gomp_device_copy (devicep, devicep->dev2host_func, "host", h, "dev", d, sz);
+}
+
+static void
+gomp_free_device_memory (struct gomp_device_descr *devicep, void *devptr)
+{
+  if (!devicep->free_func (devicep->target_id, devptr))
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("error in freeing device memory block at %p", devptr);
+    }
+}
+
 /* Handle the case where gomp_map_lookup, splay_tree_lookup or
    gomp_map_0len_lookup found oldn for newn.
    Helper function of gomp_map_vars.  */
@@ -189,11 +228,12 @@  gomp_map_vars_existing (struct gomp_device_descr *
     }
 
   if (GOMP_MAP_ALWAYS_TO_P (kind))
-    devicep->host2dev_func (devicep->target_id,
-			    (void *) (oldn->tgt->tgt_start + oldn->tgt_offset
-				      + newn->host_start - oldn->host_start),
-			    (void *) newn->host_start,
-			    newn->host_end - newn->host_start);
+    gomp_copy_host2dev (devicep,
+			(void *) (oldn->tgt->tgt_start + oldn->tgt_offset
+				  + newn->host_start - oldn->host_start),
+			(void *) newn->host_start,
+			newn->host_end - newn->host_start);
+
   if (oldn->refcount != REFCOUNT_INFINITY)
     oldn->refcount++;
 }
@@ -218,10 +258,10 @@  gomp_map_pointer (struct target_mem_desc *tgt, uin
     {
       cur_node.tgt_offset = (uintptr_t) NULL;
       /* FIXME: see comment about coalescing host/dev transfers below.  */
-      devicep->host2dev_func (devicep->target_id,
-			      (void *) (tgt->tgt_start + target_offset),
-			      (void *) &cur_node.tgt_offset,
-			      sizeof (void *));
+      gomp_copy_host2dev (devicep,
+			  (void *) (tgt->tgt_start + target_offset),
+			  (void *) &cur_node.tgt_offset,
+			  sizeof (void *));
       return;
     }
   /* Add bias to the pointer value.  */
@@ -241,10 +281,8 @@  gomp_map_pointer (struct target_mem_desc *tgt, uin
      to initialize the pointer with.  */
   cur_node.tgt_offset -= bias;
   /* FIXME: see comment about coalescing host/dev transfers below.  */
-  devicep->host2dev_func (devicep->target_id,
-			  (void *) (tgt->tgt_start + target_offset),
-			  (void *) &cur_node.tgt_offset,
-			  sizeof (void *));
+  gomp_copy_host2dev (devicep, (void *) (tgt->tgt_start + target_offset),
+		      (void *) &cur_node.tgt_offset, sizeof (void *));
 }
 
 static void
@@ -515,6 +553,12 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 	 memory.  */
       tgt->to_free = devicep->alloc_func (devicep->target_id,
 					  tgt_size + tgt_align - 1);
+      if (!tgt->to_free)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("device memory allocation fail");
+	}
+
       tgt->tgt_start = (uintptr_t) tgt->to_free;
       tgt->tgt_start = (tgt->tgt_start + tgt_align - 1) & ~(tgt_align - 1);
       tgt->tgt_end = tgt->tgt_start + tgt_size;
@@ -554,9 +598,9 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 		tgt_size = (tgt_size + align - 1) & ~(align - 1);
 		tgt->list[i].offset = tgt_size;
 		len = sizes[i];
-		devicep->host2dev_func (devicep->target_id,
-					(void *) (tgt->tgt_start + tgt_size),
-					(void *) hostaddrs[i], len);
+		gomp_copy_host2dev (devicep,
+				    (void *) (tgt->tgt_start + tgt_size),
+				    (void *) hostaddrs[i], len);
 		tgt_size += len;
 		continue;
 	      case GOMP_MAP_FIRSTPRIVATE_INT:
@@ -608,13 +652,13 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 		  cur_node.tgt_offset = gomp_map_val (tgt, hostaddrs, i - 1);
 		if (cur_node.tgt_offset)
 		  cur_node.tgt_offset -= sizes[i];
-		devicep->host2dev_func (devicep->target_id,
-					(void *) (n->tgt->tgt_start
-						  + n->tgt_offset
-						  + cur_node.host_start
-						  - n->host_start),
-					(void *) &cur_node.tgt_offset,
-					sizeof (void *));
+		gomp_copy_host2dev (devicep,
+				    (void *) (n->tgt->tgt_start
+					      + n->tgt_offset
+					      + cur_node.host_start
+					      - n->host_start),
+				    (void *) &cur_node.tgt_offset,
+				    sizeof (void *));
 		cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
 				      + cur_node.host_start - n->host_start;
 		continue;
@@ -685,11 +729,11 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 		    /* FIXME: Perhaps add some smarts, like if copying
 		       several adjacent fields from host to target, use some
 		       host buffer to avoid sending each var individually.  */
-		    devicep->host2dev_func (devicep->target_id,
-					    (void *) (tgt->tgt_start
-						      + k->tgt_offset),
-					    (void *) k->host_start,
-					    k->host_end - k->host_start);
+		    gomp_copy_host2dev (devicep,
+					(void *) (tgt->tgt_start
+						  + k->tgt_offset),
+					(void *) k->host_start,
+					k->host_end - k->host_start);
 		    break;
 		  case GOMP_MAP_POINTER:
 		    gomp_map_pointer (tgt, (uintptr_t) *(void **) k->host_start,
@@ -697,11 +741,11 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 		    break;
 		  case GOMP_MAP_TO_PSET:
 		    /* FIXME: see above FIXME comment.  */
-		    devicep->host2dev_func (devicep->target_id,
-					    (void *) (tgt->tgt_start
-						      + k->tgt_offset),
-					    (void *) k->host_start,
-					    k->host_end - k->host_start);
+		    gomp_copy_host2dev (devicep,
+					(void *) (tgt->tgt_start
+						  + k->tgt_offset),
+					(void *) k->host_start,
+					k->host_end - k->host_start);
 
 		    for (j = i + 1; j < mapnum; j++)
 		      if (!GOMP_MAP_POINTER_P (get_kind (short_mapkind, kinds,
@@ -748,12 +792,11 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 		    break;
 		  case GOMP_MAP_FORCE_DEVICEPTR:
 		    assert (k->host_end - k->host_start == sizeof (void *));
-
-		    devicep->host2dev_func (devicep->target_id,
-					    (void *) (tgt->tgt_start
-						      + k->tgt_offset),
-					    (void *) k->host_start,
-					    sizeof (void *));
+		    gomp_copy_host2dev (devicep,
+					(void *) (tgt->tgt_start
+						  + k->tgt_offset),
+					(void *) k->host_start,
+					sizeof (void *));
 		    break;
 		  default:
 		    gomp_mutex_unlock (&devicep->lock);
@@ -781,11 +824,9 @@  gomp_map_vars (struct gomp_device_descr *devicep,
 	{
 	  cur_node.tgt_offset = gomp_map_val (tgt, hostaddrs, i);
 	  /* FIXME: see above FIXME comment.  */
-	  devicep->host2dev_func (devicep->target_id,
-				  (void *) (tgt->tgt_start
-					    + i * sizeof (void *)),
-				  (void *) &cur_node.tgt_offset,
-				  sizeof (void *));
+	  gomp_copy_host2dev (devicep,
+			      (void *) (tgt->tgt_start + i * sizeof (void *)),
+			      (void *) &cur_node.tgt_offset, sizeof (void *));
 	}
     }
 
@@ -807,7 +848,7 @@  gomp_unmap_tgt (struct target_mem_desc *tgt)
 {
   /* Deallocate on target the tgt->tgt_start .. tgt->tgt_end region.  */
   if (tgt->tgt_end)
-    tgt->device_descr->free_func (tgt->device_descr->target_id, tgt->to_free);
+    gomp_free_device_memory (tgt->device_descr, tgt->to_free);
 
   free (tgt->array);
   free (tgt);
@@ -839,9 +880,9 @@  gomp_copy_from_async (struct target_mem_desc *tgt)
       {
 	splay_tree_key k = tgt->list[i].key;
 	if (tgt->list[i].copy_from)
-	  devicep->dev2host_func (devicep->target_id, (void *) k->host_start,
-				  (void *) (k->tgt->tgt_start + k->tgt_offset),
-				  k->host_end - k->host_start);
+	  gomp_copy_dev2host (devicep, (void *) k->host_start,
+			      (void *) (k->tgt->tgt_start + k->tgt_offset),
+			      k->host_end - k->host_start);
       }
 
   gomp_mutex_unlock (&devicep->lock);
@@ -894,11 +935,11 @@  gomp_unmap_vars (struct target_mem_desc *tgt, bool
 
       if ((do_unmap && do_copyfrom && tgt->list[i].copy_from)
 	  || tgt->list[i].always_copy_from)
-	devicep->dev2host_func (devicep->target_id,
-				(void *) (k->host_start + tgt->list[i].offset),
-				(void *) (k->tgt->tgt_start + k->tgt_offset
-					  + tgt->list[i].offset),
-				tgt->list[i].length);
+	gomp_copy_dev2host (devicep,
+			    (void *) (k->host_start + tgt->list[i].offset),
+			    (void *) (k->tgt->tgt_start + k->tgt_offset
+				      + tgt->list[i].offset),
+			    tgt->list[i].length);
       if (do_unmap)
 	{
 	  splay_tree_remove (&devicep->mem_map, k);
@@ -961,22 +1002,17 @@  gomp_update (struct gomp_device_descr *devicep, si
 			    (void *) n->host_start,
 			    (void *) n->host_end);
 	      }
+
+
+	    void *hostaddr = (void *) cur_node.host_start;
+	    void *devaddr = (void *) (n->tgt->tgt_start + n->tgt_offset
+				      + cur_node.host_start - n->host_start);
+	    size_t size = cur_node.host_end - cur_node.host_start;
+
 	    if (GOMP_MAP_COPY_TO_P (kind & typemask))
-	      devicep->host2dev_func (devicep->target_id,
-				      (void *) (n->tgt->tgt_start
-						+ n->tgt_offset
-						+ cur_node.host_start
-						- n->host_start),
-				      (void *) cur_node.host_start,
-				      cur_node.host_end - cur_node.host_start);
+	      gomp_copy_host2dev (devicep, devaddr, hostaddr, size);
 	    if (GOMP_MAP_COPY_FROM_P (kind & typemask))
-	      devicep->dev2host_func (devicep->target_id,
-				      (void *) cur_node.host_start,
-				      (void *) (n->tgt->tgt_start
-						+ n->tgt_offset
-						+ cur_node.host_start
-						- n->host_start),
-				      cur_node.host_end - cur_node.host_start);
+	      gomp_copy_dev2host (devicep, hostaddr, devaddr, size);
 	  }
       }
   gomp_mutex_unlock (&devicep->lock);
@@ -1114,7 +1150,11 @@  gomp_unload_image_from_device (struct gomp_device_
       node = splay_tree_lookup (&devicep->mem_map, &k);
     }
 
-  devicep->unload_image_func (devicep->target_id, version, target_data);
+  if (!devicep->unload_image_func (devicep->target_id, version, target_data))
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("image unload fail");
+    }
 
   /* Remove mappings from splay tree.  */
   int i;
@@ -1261,7 +1301,11 @@  attribute_hidden void
 gomp_init_device (struct gomp_device_descr *devicep)
 {
   int i;
-  devicep->init_device_func (devicep->target_id);
+  if (!devicep->init_device_func (devicep->target_id))
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("device initialization failed");
+    }
 
   /* Load to device all images registered by the moment.  */
   for (i = 0; i < num_offload_images; i++)
@@ -1774,12 +1818,11 @@  gomp_exit_data (struct gomp_device_descr *devicep,
 
 	  if ((kind == GOMP_MAP_FROM && k->refcount == 0)
 	      || kind == GOMP_MAP_ALWAYS_FROM)
-	    devicep->dev2host_func (devicep->target_id,
-				    (void *) cur_node.host_start,
-				    (void *) (k->tgt->tgt_start + k->tgt_offset
-					      + cur_node.host_start
-					      - k->host_start),
-				    cur_node.host_end - cur_node.host_start);
+	    gomp_copy_dev2host (devicep, (void *) cur_node.host_start,
+				(void *) (k->tgt->tgt_start + k->tgt_offset
+					  + cur_node.host_start
+					  - k->host_start),
+				cur_node.host_end - cur_node.host_start);
 	  if (k->refcount == 0)
 	    {
 	      splay_tree_remove (&devicep->mem_map, k);
@@ -2015,7 +2058,7 @@  omp_target_free (void *device_ptr, int device_num)
     }
 
   gomp_mutex_lock (&devicep->lock);
-  devicep->free_func (devicep->target_id, device_ptr);
+  gomp_free_device_memory (devicep, device_ptr);
   gomp_mutex_unlock (&devicep->lock);
 }
 
@@ -2056,6 +2099,7 @@  omp_target_memcpy (void *dst, void *src, size_t le
 		   size_t src_offset, int dst_device_num, int src_device_num)
 {
   struct gomp_device_descr *dst_devicep = NULL, *src_devicep = NULL;
+  bool ret;
 
   if (dst_device_num != GOMP_DEVICE_HOST_FALLBACK)
     {
@@ -2091,29 +2135,29 @@  omp_target_memcpy (void *dst, void *src, size_t le
   if (src_devicep == NULL)
     {
       gomp_mutex_lock (&dst_devicep->lock);
-      dst_devicep->host2dev_func (dst_devicep->target_id,
-				  (char *) dst + dst_offset,
-				  (char *) src + src_offset, length);
+      ret = dst_devicep->host2dev_func (dst_devicep->target_id,
+					(char *) dst + dst_offset,
+					(char *) src + src_offset, length);
       gomp_mutex_unlock (&dst_devicep->lock);
-      return 0;
+      return (ret ? 0 : EINVAL);
     }
   if (dst_devicep == NULL)
     {
       gomp_mutex_lock (&src_devicep->lock);
-      src_devicep->dev2host_func (src_devicep->target_id,
-				  (char *) dst + dst_offset,
-				  (char *) src + src_offset, length);
+      ret = src_devicep->dev2host_func (src_devicep->target_id,
+					(char *) dst + dst_offset,
+					(char *) src + src_offset, length);
       gomp_mutex_unlock (&src_devicep->lock);
-      return 0;
+      return (ret ? 0 : EINVAL);
     }
   if (src_devicep == dst_devicep)
     {
       gomp_mutex_lock (&src_devicep->lock);
-      src_devicep->dev2dev_func (src_devicep->target_id,
-				 (char *) dst + dst_offset,
-				 (char *) src + src_offset, length);
+      ret = src_devicep->dev2dev_func (src_devicep->target_id,
+				       (char *) dst + dst_offset,
+				       (char *) src + src_offset, length);
       gomp_mutex_unlock (&src_devicep->lock);
-      return 0;
+      return (ret ? 0 : EINVAL);
     }
   return EINVAL;
 }
@@ -2140,22 +2184,25 @@  omp_target_memcpy_rect_worker (void *dst, void *sr
 	  || __builtin_mul_overflow (element_size, src_offsets[0], &src_off))
 	return EINVAL;
       if (dst_devicep == NULL && src_devicep == NULL)
-	memcpy ((char *) dst + dst_off, (char *) src + src_off, length);
+	{
+	  memcpy ((char *) dst + dst_off, (char *) src + src_off, length);
+	  ret = 1;
+	}
       else if (src_devicep == NULL)
-	dst_devicep->host2dev_func (dst_devicep->target_id,
-				    (char *) dst + dst_off,
-				    (char *) src + src_off, length);
+	ret = dst_devicep->host2dev_func (dst_devicep->target_id,
+					  (char *) dst + dst_off,
+					  (char *) src + src_off, length);
       else if (dst_devicep == NULL)
-	src_devicep->dev2host_func (src_devicep->target_id,
-				    (char *) dst + dst_off,
-				    (char *) src + src_off, length);
+	ret = src_devicep->dev2host_func (src_devicep->target_id,
+					  (char *) dst + dst_off,
+					  (char *) src + src_off, length);
       else if (src_devicep == dst_devicep)
-	src_devicep->dev2dev_func (src_devicep->target_id,
-				   (char *) dst + dst_off,
-				   (char *) src + src_off, length);
+	ret = src_devicep->dev2dev_func (src_devicep->target_id,
+					 (char *) dst + dst_off,
+					 (char *) src + src_off, length);
       else
-	return EINVAL;
-      return 0;
+	ret = 0;
+      return ret ? 0 : EINVAL;
     }
 
   /* FIXME: it would be nice to have some plugin function to handle
@@ -2470,14 +2517,17 @@  gomp_target_fini (void)
   int i;
   for (i = 0; i < num_devices; i++)
     {
+      bool ret = true;
       struct gomp_device_descr *devicep = &devices[i];
       gomp_mutex_lock (&devicep->lock);
       if (devicep->state == GOMP_DEVICE_INITIALIZED)
 	{
-	  devicep->fini_device_func (devicep->target_id);
+	  ret = devicep->fini_device_func (devicep->target_id);
 	  devicep->state = GOMP_DEVICE_FINALIZED;
 	}
       gomp_mutex_unlock (&devicep->lock);
+      if (!ret)
+	gomp_fatal ("device finalization failed");
     }
 }