diff mbox

[1/3,libgomp] Adjust offload plugin interface for avoiding deadlock on exit

Message ID 55DF1452.9050501@codesourcery.com
State New
Headers show

Commit Message

Chung-Lin Tang Aug. 27, 2015, 1:44 p.m. UTC
We've discovered that, for several of the libgomp plugin interface routines,
if the target specific routine calls exit() (usually upon a fatal condition),
deadlock ensues. We found this using nvptx, but it's possible on intelmic as well.

This is due to many of the plugin routines are called with the device lock held,
and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor
tries to iterate through and acquire all device locks to cleanup. Since we already hold
one of the device locks, this just gets stuck.  Also because gomp_mutex_t is a
simple futex based lock implementation (instead of pthreads), we don't have a
trylock mechanism to use either.

So this patch tries to alleviate this problem by changing the plugin interface;
the plugin routines that are called while holding the device lock are adjusted
to assume to never fatal exit, but return a value back to libgomp proper to
indicate execution results. The core libgomp code then may unlock and call gomp_fatal().

We believe this is the right route to solve the problem, since there's only
two accel target plugins so far. Besides the nvptx plugin, I have made some effort
to update the intelmic plugin as well, though it's not as thoroughly audited.
Intel folks might want to further make sure your plugin code is free of this problem as well.

This patch contains the libgomp proper changes. The nvptx and intelmic patches follow.
I have tested the libgomp testsuite without regressions for both accel targets, is this
okay for trunk?

Thanks,
Chung-Lin

2015-08-27 Chung-Lin Tang <cltang@codesourcery.com>

        * oacc-host.c (host_init_device): Change return type to bool.
        (host_fini_device): Likewise.
        (host_dev2host): Likewise.
        (host_host2dev): Likewise.
        (host_free): Likewise.
        (host_alloc): Change return type to bool, change to use out
        parameter to return allocated pointer.
        * oacc-mem.c (acc_malloc): Adjust plugin hook declaration change,
        handle fatal error.
        (acc_free): Likewise.
        (acc_memcpy_to_device): Likewise.
        (acc_memcpy_from_device): Likewise.
        * oacc-init.c (acc_init_1): Handle gomp_init_device return code,
        handle fatal error.
        (acc_set_device_type): Likewise.
        (acc_set_device_num): Likewise.
        * target.c (gomp_map_vars): Adjust alloc_func plugin hook call,
        add device unlock, handle fatal error.
        (gomp_unmap_tgt): Change return type to bool, adjust free_func
        plugin call.
        (gomp_copy_from_async): Handle dev2host_func return code, handle
        fatal error.
        (gomp_unmap_vars): Likewise.
        (gomp_init_device): Change return type to bool, adjust call to
        init_device_func plugin hook.
        (GOMP_target): Adjust call to gomp_init_device, handle fatal error.
        (GOMP_target_data): Likewise.
        (GOMP_target_update): Likewise.
        * libgomp.h (gomp_device_descr.init_device_func): Change return
        type to bool.
        (gomp_device_descr.fini_device_func): Likewise.
        (gomp_device_descr.free_func): Likewise.
        (gomp_device_descr.dev2host_func): Likewise.
        (gomp_device_descr.host2dev_func) Likewise.
        (gomp_device_descr.alloc_func): Change return
        type to bool, use out parameter to return pointer.
        (gomp_init_device): Change return
        type to bool.

Comments

Chung-Lin Tang Sept. 9, 2015, 8:08 a.m. UTC | #1
Ping.

On 2015/8/27 09:44 PM, Chung-Lin Tang wrote:
> We've discovered that, for several of the libgomp plugin interface routines,
> if the target specific routine calls exit() (usually upon a fatal condition),
> deadlock ensues. We found this using nvptx, but it's possible on intelmic as well.
> 
> This is due to many of the plugin routines are called with the device lock held,
> and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor
> tries to iterate through and acquire all device locks to cleanup. Since we already hold
> one of the device locks, this just gets stuck.  Also because gomp_mutex_t is a
> simple futex based lock implementation (instead of pthreads), we don't have a
> trylock mechanism to use either.
> 
> So this patch tries to alleviate this problem by changing the plugin interface;
> the plugin routines that are called while holding the device lock are adjusted
> to assume to never fatal exit, but return a value back to libgomp proper to
> indicate execution results. The core libgomp code then may unlock and call gomp_fatal().
> 
> We believe this is the right route to solve the problem, since there's only
> two accel target plugins so far. Besides the nvptx plugin, I have made some effort
> to update the intelmic plugin as well, though it's not as thoroughly audited.
> Intel folks might want to further make sure your plugin code is free of this problem as well.
> 
> This patch contains the libgomp proper changes. The nvptx and intelmic patches follow.
> I have tested the libgomp testsuite without regressions for both accel targets, is this
> okay for trunk?
> 
> Thanks,
> Chung-Lin
> 
> 2015-08-27 Chung-Lin Tang <cltang@codesourcery.com>
> 
>         * oacc-host.c (host_init_device): Change return type to bool.
>         (host_fini_device): Likewise.
>         (host_dev2host): Likewise.
>         (host_host2dev): Likewise.
>         (host_free): Likewise.
>         (host_alloc): Change return type to bool, change to use out
>         parameter to return allocated pointer.
>         * oacc-mem.c (acc_malloc): Adjust plugin hook declaration change,
>         handle fatal error.
>         (acc_free): Likewise.
>         (acc_memcpy_to_device): Likewise.
>         (acc_memcpy_from_device): Likewise.
>         * oacc-init.c (acc_init_1): Handle gomp_init_device return code,
>         handle fatal error.
>         (acc_set_device_type): Likewise.
>         (acc_set_device_num): Likewise.
>         * target.c (gomp_map_vars): Adjust alloc_func plugin hook call,
>         add device unlock, handle fatal error.
>         (gomp_unmap_tgt): Change return type to bool, adjust free_func
>         plugin call.
>         (gomp_copy_from_async): Handle dev2host_func return code, handle
>         fatal error.
>         (gomp_unmap_vars): Likewise.
>         (gomp_init_device): Change return type to bool, adjust call to
>         init_device_func plugin hook.
>         (GOMP_target): Adjust call to gomp_init_device, handle fatal error.
>         (GOMP_target_data): Likewise.
>         (GOMP_target_update): Likewise.
>         * libgomp.h (gomp_device_descr.init_device_func): Change return
>         type to bool.
>         (gomp_device_descr.fini_device_func): Likewise.
>         (gomp_device_descr.free_func): Likewise.
>         (gomp_device_descr.dev2host_func): Likewise.
>         (gomp_device_descr.host2dev_func) Likewise.
>         (gomp_device_descr.alloc_func): Change return
>         type to bool, use out parameter to return pointer.
>         (gomp_init_device): Change return
>         type to bool.
>
Chung-Lin Tang Sept. 22, 2015, 12:33 p.m. UTC | #2
Ping x2.

On 2015/9/9 04:08 PM, Chung-Lin Tang wrote:
> Ping.
> 
> On 2015/8/27 09:44 PM, Chung-Lin Tang wrote:
>> We've discovered that, for several of the libgomp plugin interface routines,
>> if the target specific routine calls exit() (usually upon a fatal condition),
>> deadlock ensues. We found this using nvptx, but it's possible on intelmic as well.
>>
>> This is due to many of the plugin routines are called with the device lock held,
>> and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor
>> tries to iterate through and acquire all device locks to cleanup. Since we already hold
>> one of the device locks, this just gets stuck.  Also because gomp_mutex_t is a
>> simple futex based lock implementation (instead of pthreads), we don't have a
>> trylock mechanism to use either.
>>
>> So this patch tries to alleviate this problem by changing the plugin interface;
>> the plugin routines that are called while holding the device lock are adjusted
>> to assume to never fatal exit, but return a value back to libgomp proper to
>> indicate execution results. The core libgomp code then may unlock and call gomp_fatal().
>>
>> We believe this is the right route to solve the problem, since there's only
>> two accel target plugins so far. Besides the nvptx plugin, I have made some effort
>> to update the intelmic plugin as well, though it's not as thoroughly audited.
>> Intel folks might want to further make sure your plugin code is free of this problem as well.
>>
>> This patch contains the libgomp proper changes. The nvptx and intelmic patches follow.
>> I have tested the libgomp testsuite without regressions for both accel targets, is this
>> okay for trunk?
>>
>> Thanks,
>> Chung-Lin
>>
>> 2015-08-27 Chung-Lin Tang <cltang@codesourcery.com>
>>
>>         * oacc-host.c (host_init_device): Change return type to bool.
>>         (host_fini_device): Likewise.
>>         (host_dev2host): Likewise.
>>         (host_host2dev): Likewise.
>>         (host_free): Likewise.
>>         (host_alloc): Change return type to bool, change to use out
>>         parameter to return allocated pointer.
>>         * oacc-mem.c (acc_malloc): Adjust plugin hook declaration change,
>>         handle fatal error.
>>         (acc_free): Likewise.
>>         (acc_memcpy_to_device): Likewise.
>>         (acc_memcpy_from_device): Likewise.
>>         * oacc-init.c (acc_init_1): Handle gomp_init_device return code,
>>         handle fatal error.
>>         (acc_set_device_type): Likewise.
>>         (acc_set_device_num): Likewise.
>>         * target.c (gomp_map_vars): Adjust alloc_func plugin hook call,
>>         add device unlock, handle fatal error.
>>         (gomp_unmap_tgt): Change return type to bool, adjust free_func
>>         plugin call.
>>         (gomp_copy_from_async): Handle dev2host_func return code, handle
>>         fatal error.
>>         (gomp_unmap_vars): Likewise.
>>         (gomp_init_device): Change return type to bool, adjust call to
>>         init_device_func plugin hook.
>>         (GOMP_target): Adjust call to gomp_init_device, handle fatal error.
>>         (GOMP_target_data): Likewise.
>>         (GOMP_target_update): Likewise.
>>         * libgomp.h (gomp_device_descr.init_device_func): Change return
>>         type to bool.
>>         (gomp_device_descr.fini_device_func): Likewise.
>>         (gomp_device_descr.free_func): Likewise.
>>         (gomp_device_descr.dev2host_func): Likewise.
>>         (gomp_device_descr.host2dev_func) Likewise.
>>         (gomp_device_descr.alloc_func): Change return
>>         type to bool, use out parameter to return pointer.
>>         (gomp_init_device): Change return
>>         type to bool.
>>
>
Ilya Verbin Sept. 24, 2015, 8:27 p.m. UTC | #3
On Thu, Aug 27, 2015 at 21:44:50 +0800, Chung-Lin Tang wrote:
> We've discovered that, for several of the libgomp plugin interface routines,
> if the target specific routine calls exit() (usually upon a fatal condition),
> deadlock ensues. We found this using nvptx, but it's possible on intelmic as well.
> 
> This is due to many of the plugin routines are called with the device lock held,
> and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor
> tries to iterate through and acquire all device locks to cleanup. Since we already hold
> one of the device locks, this just gets stuck.  Also because gomp_mutex_t is a
> simple futex based lock implementation (instead of pthreads), we don't have a
> trylock mechanism to use either.
> 
> So this patch tries to alleviate this problem by changing the plugin interface;
> the plugin routines that are called while holding the device lock are adjusted
> to assume to never fatal exit, but return a value back to libgomp proper to
> indicate execution results. The core libgomp code then may unlock and call gomp_fatal().
> 
> We believe this is the right route to solve the problem, since there's only
> two accel target plugins so far. Besides the nvptx plugin, I have made some effort
> to update the intelmic plugin as well, though it's not as thoroughly audited.
> Intel folks might want to further make sure your plugin code is free of this problem as well.
> 
> This patch contains the libgomp proper changes. The nvptx and intelmic patches follow.
> I have tested the libgomp testsuite without regressions for both accel targets, is this
> okay for trunk?

(I have no objections)

However, in case of intelmic, these exit()s are just the tip of the iceberg,
because underlying liboffloadmic contains other exit()s at fatal errors.
And I don't know what to do with such deadlocks.

  -- Ilya
Chung-Lin Tang Sept. 29, 2015, 12:42 p.m. UTC | #4
On 2015/9/25 上午 04:27, Ilya Verbin wrote:
> On Thu, Aug 27, 2015 at 21:44:50 +0800, Chung-Lin Tang wrote:
>> We've discovered that, for several of the libgomp plugin interface routines,
>> if the target specific routine calls exit() (usually upon a fatal condition),
>> deadlock ensues. We found this using nvptx, but it's possible on intelmic as well.
>>
>> This is due to many of the plugin routines are called with the device lock held,
>> and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor
>> tries to iterate through and acquire all device locks to cleanup. Since we already hold
>> one of the device locks, this just gets stuck.  Also because gomp_mutex_t is a
>> simple futex based lock implementation (instead of pthreads), we don't have a
>> trylock mechanism to use either.
>>
>> So this patch tries to alleviate this problem by changing the plugin interface;
>> the plugin routines that are called while holding the device lock are adjusted
>> to assume to never fatal exit, but return a value back to libgomp proper to
>> indicate execution results. The core libgomp code then may unlock and call gomp_fatal().
>>
>> We believe this is the right route to solve the problem, since there's only
>> two accel target plugins so far. Besides the nvptx plugin, I have made some effort
>> to update the intelmic plugin as well, though it's not as thoroughly audited.
>> Intel folks might want to further make sure your plugin code is free of this problem as well.
>>
>> This patch contains the libgomp proper changes. The nvptx and intelmic patches follow.
>> I have tested the libgomp testsuite without regressions for both accel targets, is this
>> okay for trunk?
> 
> (I have no objections)
> 
> However, in case of intelmic, these exit()s are just the tip of the iceberg,
> because underlying liboffloadmic contains other exit()s at fatal errors.
> And I don't know what to do with such deadlocks.
> 
>   -- Ilya

Yes, I think I saw more things to adjust wrt this issue within liboffloadmic, though I
hope this plugin interface change can set things ready.

And ping again, for the libgomp proper changes.

Thanks,
Chung-Lin
diff mbox

Patch

Index: libgomp/libgomp.h
===================================================================
--- libgomp/libgomp.h	(revision 227257)
+++ libgomp/libgomp.h	(working copy)
@@ -746,15 +746,15 @@  struct gomp_device_descr
   unsigned int (*get_caps_func) (void);
   int (*get_type_func) (void);
   int (*get_num_devices_func) (void);
-  void (*init_device_func) (int);
-  void (*fini_device_func) (int);
+  bool (*init_device_func) (int);
+  bool (*fini_device_func) (int);
   unsigned (*version_func) (void);
   int (*load_image_func) (int, unsigned, const void *, struct addr_pair **);
   void (*unload_image_func) (int, unsigned, const void *);
-  void *(*alloc_func) (int, size_t);
-  void (*free_func) (int, void *);
-  void *(*dev2host_func) (int, void *, const void *, size_t);
-  void *(*host2dev_func) (int, void *, const void *, size_t);
+  bool (*alloc_func) (int, size_t, void**);
+  bool (*free_func) (int, void *);
+  bool (*dev2host_func) (int, void *, const void *, size_t);
+  bool (*host2dev_func) (int, void *, const void *, size_t);
   void (*run_func) (int, void *, void *);
 
   /* Splay tree containing information about mapped memory regions.  */
@@ -780,7 +780,7 @@  extern struct target_mem_desc *gomp_map_vars (stru
 					      size_t *, void *, bool, bool);
 extern void gomp_copy_from_async (struct target_mem_desc *);
 extern void gomp_unmap_vars (struct target_mem_desc *, bool);
-extern void gomp_init_device (struct gomp_device_descr *);
+extern bool gomp_init_device (struct gomp_device_descr *);
 extern void gomp_free_memmap (struct splay_tree_s *);
 extern void gomp_fini_device (struct gomp_device_descr *);
 extern void gomp_unload_device (struct gomp_device_descr *);
Index: libgomp/oacc-host.c
===================================================================
--- libgomp/oacc-host.c	(revision 227257)
+++ libgomp/oacc-host.c	(working copy)
@@ -60,14 +60,16 @@  host_get_num_devices (void)
   return 1;
 }
 
-static void
+static bool
 host_init_device (int n __attribute__ ((unused)))
 {
+  return true;
 }
 
-static void
+static bool
 host_fini_device (int n __attribute__ ((unused)))
 {
+  return true;
 }
 
 static unsigned
@@ -92,34 +94,36 @@  host_unload_image (int n __attribute__ ((unused)),
 {
 }
 
-static void *
-host_alloc (int n __attribute__ ((unused)), size_t s)
+static bool
+host_alloc (int n __attribute__ ((unused)), size_t s, void **ptr)
 {
-  return gomp_malloc (s);
+  *ptr = gomp_malloc (s);
+  return true;
 }
 
-static void
+static bool
 host_free (int n __attribute__ ((unused)), void *p)
 {
   free (p);
+  return true;
 }
 
-static void *
+static bool
 host_dev2host (int n __attribute__ ((unused)),
 	       void *h __attribute__ ((unused)),
 	       const void *d __attribute__ ((unused)),
 	       size_t s __attribute__ ((unused)))
 {
-  return NULL;
+  return true;
 }
 
-static void *
+static bool
 host_host2dev (int n __attribute__ ((unused)),
 	       void *d __attribute__ ((unused)),
 	       const void *h __attribute__ ((unused)),
 	       size_t s __attribute__ ((unused)))
 {
-  return NULL;
+  return true;
 }
 
 static void
Index: libgomp/oacc-init.c
===================================================================
--- libgomp/oacc-init.c	(revision 227257)
+++ libgomp/oacc-init.c	(working copy)
@@ -231,8 +231,10 @@  acc_init_1 (acc_device_t d)
       gomp_fatal ("device already active");
     }
 
-  gomp_init_device (acc_dev);
+  bool ret = gomp_init_device (acc_dev);
   gomp_mutex_unlock (&acc_dev->lock);
+  if (!ret)
+    gomp_fatal ("device initialization failed");
 
   return base_dev;
 }
@@ -503,13 +505,17 @@  acc_set_device_type (acc_device_t d)
   cached_base_dev = base_dev = resolve_device (d, true);
   acc_dev = &base_dev[goacc_device_num];
 
+  bool ret = true;
   gomp_mutex_lock (&acc_dev->lock);
   if (!acc_dev->is_initialized)
-    gomp_init_device (acc_dev);
+    ret = gomp_init_device (acc_dev);
   gomp_mutex_unlock (&acc_dev->lock);
 
   gomp_mutex_unlock (&acc_device_lock);
 
+  if (!ret)
+    gomp_fatal ("device initialization failed");
+
   /* We're changing device type: invalidate the current thread's dev and
      base_dev pointers.  */
   if (thr && thr->base_dev != base_dev)
@@ -605,13 +611,17 @@  acc_set_device_num (int ord, acc_device_t d)
 
       acc_dev = &base_dev[ord];
 
+      bool ret = true;
       gomp_mutex_lock (&acc_dev->lock);
       if (!acc_dev->is_initialized)
-        gomp_init_device (acc_dev);
+        ret = gomp_init_device (acc_dev);
       gomp_mutex_unlock (&acc_dev->lock);
 
       gomp_mutex_unlock (&acc_device_lock);
 
+      if (!ret)
+	gomp_fatal ("device initialization failed");
+
       goacc_attach_host_thread_to_device (ord);
     }
   
Index: libgomp/oacc-mem.c
===================================================================
--- libgomp/oacc-mem.c	(revision 227257)
+++ libgomp/oacc-mem.c	(working copy)
@@ -105,7 +105,10 @@  acc_malloc (size_t s)
 
   assert (thr->dev);
 
-  return thr->dev->alloc_func (thr->dev->target_id, s);
+  void *ptr;
+  if (!thr->dev->alloc_func (thr->dev->target_id, s, &ptr))
+    gomp_fatal ("error in %s", __FUNCTION__);
+  return ptr;
 }
 
 /* OpenACC 2.0a (3.2.16) doesn't specify what to do in the event
@@ -143,7 +146,8 @@  acc_free (void *d)
   else
     gomp_mutex_unlock (&acc_dev->lock);
 
-  acc_dev->free_func (acc_dev->target_id, d);
+  if (!acc_dev->free_func (acc_dev->target_id, d))
+    gomp_fatal ("error in %s", __FUNCTION__);
 }
 
 void
@@ -155,7 +159,8 @@  acc_memcpy_to_device (void *d, void *h, size_t s)
 
   assert (thr && thr->dev);
 
-  thr->dev->host2dev_func (thr->dev->target_id, d, h, s);
+  if (!thr->dev->host2dev_func (thr->dev->target_id, d, h, s))
+    gomp_fatal ("error in %s", __FUNCTION__);
 }
 
 void
@@ -167,7 +172,8 @@  acc_memcpy_from_device (void *h, void *d, size_t s
 
   assert (thr && thr->dev);
 
-  thr->dev->dev2host_func (thr->dev->target_id, h, d, s);
+  if (!thr->dev->dev2host_func (thr->dev->target_id, h, d, s))
+    gomp_fatal ("error in %s", __FUNCTION__);
 }
 
 /* Return the device pointer that corresponds to host data H.  Or NULL
Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 227257)
+++ libgomp/target.c	(working copy)
@@ -313,8 +313,13 @@  gomp_map_vars (struct gomp_device_descr *devicep,
       /* Allocate tgt_align aligned tgt_size block of memory.  */
       /* FIXME: Perhaps change interface to allocate properly aligned
 	 memory.  */
-      tgt->to_free = devicep->alloc_func (devicep->target_id,
-					  tgt_size + tgt_align - 1);
+      if (!devicep->alloc_func (devicep->target_id, tgt_size + tgt_align - 1,
+				&tgt->to_free))
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("device memory allocation fail");
+	}
+
       tgt->tgt_start = (uintptr_t) tgt->to_free;
       tgt->tgt_start = (tgt->tgt_start + tgt_align - 1) & ~(tgt_align - 1);
       tgt->tgt_end = tgt->tgt_start + tgt_size;
@@ -482,15 +487,17 @@  gomp_map_vars (struct gomp_device_descr *devicep,
   return tgt;
 }
 
-static void
+static bool
 gomp_unmap_tgt (struct target_mem_desc *tgt)
 {
   /* Deallocate on target the tgt->tgt_start .. tgt->tgt_end region.  */
+  bool ret = true;
   if (tgt->tgt_end)
-    tgt->device_descr->free_func (tgt->device_descr->target_id, tgt->to_free);
-
+    ret = tgt->device_descr->free_func (tgt->device_descr->target_id,
+					tgt->to_free);
   free (tgt->array);
   free (tgt);
+  return ret;
 }
 
 /* Decrease the refcount for a set of mapped variables, and queue asychronous
@@ -504,6 +511,7 @@  gomp_copy_from_async (struct target_mem_desc *tgt)
 {
   struct gomp_device_descr *devicep = tgt->device_descr;
   size_t i;
+  bool ret = true;
 
   gomp_mutex_lock (&devicep->lock);
 
@@ -519,12 +527,17 @@  gomp_copy_from_async (struct target_mem_desc *tgt)
       {
 	splay_tree_key k = tgt->list[i];
 	if (k->copy_from)
-	  devicep->dev2host_func (devicep->target_id, (void *) k->host_start,
-				  (void *) (k->tgt->tgt_start + k->tgt_offset),
-				  k->host_end - k->host_start);
+	  ret &= devicep->dev2host_func (devicep->target_id,
+					 (void *) k->host_start,
+					 (void *) (k->tgt->tgt_start
+						   + k->tgt_offset),
+					 k->host_end - k->host_start);
       }
 
   gomp_mutex_unlock (&devicep->lock);
+
+  if (!ret)
+    gomp_fatal ("device to host copy fail");
 }
 
 /* Unmap variables described by TGT.  If DO_COPYFROM is true, copy relevant
@@ -544,6 +557,7 @@  gomp_unmap_vars (struct target_mem_desc *tgt, bool
 
   gomp_mutex_lock (&devicep->lock);
 
+  bool ret = true;
   size_t i;
   for (i = 0; i < tgt->list_count; i++)
     if (tgt->list[i] == NULL)
@@ -556,22 +570,27 @@  gomp_unmap_vars (struct target_mem_desc *tgt, bool
       {
 	splay_tree_key k = tgt->list[i];
 	if (k->copy_from && do_copyfrom)
-	  devicep->dev2host_func (devicep->target_id, (void *) k->host_start,
-				  (void *) (k->tgt->tgt_start + k->tgt_offset),
-				  k->host_end - k->host_start);
+	  ret &= devicep->dev2host_func (devicep->target_id,
+					 (void *) k->host_start,
+					 (void *) (k->tgt->tgt_start
+						   + k->tgt_offset),
+					 k->host_end - k->host_start);
 	splay_tree_remove (&devicep->mem_map, k);
 	if (k->tgt->refcount > 1)
 	  k->tgt->refcount--;
 	else
-	  gomp_unmap_tgt (k->tgt);
+	  ret &= gomp_unmap_tgt (k->tgt);
       }
 
   if (tgt->refcount > 1)
     tgt->refcount--;
   else
-    gomp_unmap_tgt (tgt);
+    ret &= gomp_unmap_tgt (tgt);
 
   gomp_mutex_unlock (&devicep->lock);
+
+  if (!ret)
+    gomp_fatal ("error in unmapping device memory variables");
 }
 
 static void
@@ -879,11 +898,12 @@  GOMP_offload_unregister (const void *host_table, i
 /* This function initializes the target device, specified by DEVICEP.  DEVICEP
    must be locked on entry, and remains locked on return.  */
 
-attribute_hidden void
+attribute_hidden bool
 gomp_init_device (struct gomp_device_descr *devicep)
 {
   int i;
-  devicep->init_device_func (devicep->target_id);
+  if (!devicep->init_device_func (devicep->target_id))
+    return false;
 
   /* Load to device all images registered by the moment.  */
   for (i = 0; i < num_offload_images; i++)
@@ -896,6 +916,7 @@  gomp_init_device (struct gomp_device_descr *device
     }
 
   devicep->is_initialized = true;
+  return true;
 }
 
 attribute_hidden void
@@ -980,11 +1001,15 @@  GOMP_target (int device, void (*fn) (void *), cons
       return;
     }
 
+  bool ret = true;
   gomp_mutex_lock (&devicep->lock);
   if (!devicep->is_initialized)
-    gomp_init_device (devicep);
+    ret = gomp_init_device (devicep);
   gomp_mutex_unlock (&devicep->lock);
 
+  if (!ret)
+    gomp_fatal ("device initialization failed");
+
   void *fn_addr;
 
   if (devicep->capabilities & GOMP_OFFLOAD_CAP_NATIVE_EXEC)
@@ -1048,11 +1073,15 @@  GOMP_target_data (int device, const void *unused,
       return;
     }
 
+  bool ret = true;
   gomp_mutex_lock (&devicep->lock);
   if (!devicep->is_initialized)
-    gomp_init_device (devicep);
+    ret = gomp_init_device (devicep);
   gomp_mutex_unlock (&devicep->lock);
 
+  if (!ret)
+    gomp_fatal ("device initialization failed");
+
   struct target_mem_desc *tgt
     = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, false,
 		     false);
@@ -1083,11 +1112,15 @@  GOMP_target_update (int device, const void *unused
       || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
     return;
 
+  bool ret = true;
   gomp_mutex_lock (&devicep->lock);
   if (!devicep->is_initialized)
-    gomp_init_device (devicep);
+    ret = gomp_init_device (devicep);
   gomp_mutex_unlock (&devicep->lock);
 
+  if (!ret)
+    gomp_fatal ("device initialization failed");
+
   gomp_update (devicep, mapnum, hostaddrs, sizes, kinds, false);
 }