From patchwork Thu Aug 27 13:44:50 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chung-Lin Tang X-Patchwork-Id: 511312 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 5BCED1401C7 for ; Thu, 27 Aug 2015 23:45:08 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=b7dRU61C; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:cc:message-id:date:mime-version:content-type; q=dns; s=default; b=AhWqN2ODRJiGCqrNEGKRMp+vsEnhJbqJxDTFDbq7Kd8COtAzlQ K4ljSJm47jHfo19UasTw31hgOS/8in+Q/q9I8b1YlL7xCk7zMV3W+G/QEcIu67NQ tfHsmlqo/D+ARIFu0MD2Or1IDfxRh0erUh5QY6Y6bB+xQv9/X04AD2QFE= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:cc:message-id:date:mime-version:content-type; s= default; bh=dVXvLMMIUfiZnHWbL8b0GcZgryo=; b=b7dRU61C/F79lRL2aPCr KZf+yD1tZpREfvG+jrsPo9syitk+kMs+Pha1ATuElaCldJ/Q78pW3/6Wy0kKDvgQ nlTfaUsvCUirp9IHbBUgVomgPLcZ8+kkpX86Xl54nB/cA1xBFwHIL9KQ0tJOtzno v6uItvdI8H0qFG3BOKKTIcM= Received: (qmail 98226 invoked by alias); 27 Aug 2015 13:44:59 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 98213 invoked by uid 89); 27 Aug 2015 13:44:58 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.1 required=5.0 tests=AWL, BAYES_50, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=no version=3.3.2 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 27 Aug 2015 13:44:50 +0000 Received: from svr-orw-fem-04.mgc.mentorg.com ([147.34.97.41]) by relay1.mentorg.com with esmtp id 1ZUxTy-0004FI-3p from ChungLin_Tang@mentor.com ; Thu, 27 Aug 2015 06:44:46 -0700 Received: from [0.0.0.0] (147.34.91.1) by svr-orw-fem-04.mgc.mentorg.com (147.34.97.41) with Microsoft SMTP Server id 14.3.224.2; Thu, 27 Aug 2015 06:44:43 -0700 From: Chung-Lin Tang Subject: [PATCH 1/3, libgomp] Adjust offload plugin interface for avoiding deadlock on exit To: gcc-patches CC: Nathan Sidwell , Jakub Jelinek , "Verbin, Ilya" , Thomas Schwinge Message-ID: <55DF1452.9050501@codesourcery.com> Date: Thu, 27 Aug 2015 21:44:50 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 X-IsSubscribed: yes We've discovered that, for several of the libgomp plugin interface routines, if the target specific routine calls exit() (usually upon a fatal condition), deadlock ensues. We found this using nvptx, but it's possible on intelmic as well. This is due to many of the plugin routines are called with the device lock held, and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor tries to iterate through and acquire all device locks to cleanup. Since we already hold one of the device locks, this just gets stuck. Also because gomp_mutex_t is a simple futex based lock implementation (instead of pthreads), we don't have a trylock mechanism to use either. So this patch tries to alleviate this problem by changing the plugin interface; the plugin routines that are called while holding the device lock are adjusted to assume to never fatal exit, but return a value back to libgomp proper to indicate execution results. The core libgomp code then may unlock and call gomp_fatal(). We believe this is the right route to solve the problem, since there's only two accel target plugins so far. Besides the nvptx plugin, I have made some effort to update the intelmic plugin as well, though it's not as thoroughly audited. Intel folks might want to further make sure your plugin code is free of this problem as well. This patch contains the libgomp proper changes. The nvptx and intelmic patches follow. I have tested the libgomp testsuite without regressions for both accel targets, is this okay for trunk? Thanks, Chung-Lin 2015-08-27 Chung-Lin Tang * oacc-host.c (host_init_device): Change return type to bool. (host_fini_device): Likewise. (host_dev2host): Likewise. (host_host2dev): Likewise. (host_free): Likewise. (host_alloc): Change return type to bool, change to use out parameter to return allocated pointer. * oacc-mem.c (acc_malloc): Adjust plugin hook declaration change, handle fatal error. (acc_free): Likewise. (acc_memcpy_to_device): Likewise. (acc_memcpy_from_device): Likewise. * oacc-init.c (acc_init_1): Handle gomp_init_device return code, handle fatal error. (acc_set_device_type): Likewise. (acc_set_device_num): Likewise. * target.c (gomp_map_vars): Adjust alloc_func plugin hook call, add device unlock, handle fatal error. (gomp_unmap_tgt): Change return type to bool, adjust free_func plugin call. (gomp_copy_from_async): Handle dev2host_func return code, handle fatal error. (gomp_unmap_vars): Likewise. (gomp_init_device): Change return type to bool, adjust call to init_device_func plugin hook. (GOMP_target): Adjust call to gomp_init_device, handle fatal error. (GOMP_target_data): Likewise. (GOMP_target_update): Likewise. * libgomp.h (gomp_device_descr.init_device_func): Change return type to bool. (gomp_device_descr.fini_device_func): Likewise. (gomp_device_descr.free_func): Likewise. (gomp_device_descr.dev2host_func): Likewise. (gomp_device_descr.host2dev_func) Likewise. (gomp_device_descr.alloc_func): Change return type to bool, use out parameter to return pointer. (gomp_init_device): Change return type to bool. Index: libgomp/libgomp.h =================================================================== --- libgomp/libgomp.h (revision 227257) +++ libgomp/libgomp.h (working copy) @@ -746,15 +746,15 @@ struct gomp_device_descr unsigned int (*get_caps_func) (void); int (*get_type_func) (void); int (*get_num_devices_func) (void); - void (*init_device_func) (int); - void (*fini_device_func) (int); + bool (*init_device_func) (int); + bool (*fini_device_func) (int); unsigned (*version_func) (void); int (*load_image_func) (int, unsigned, const void *, struct addr_pair **); void (*unload_image_func) (int, unsigned, const void *); - void *(*alloc_func) (int, size_t); - void (*free_func) (int, void *); - void *(*dev2host_func) (int, void *, const void *, size_t); - void *(*host2dev_func) (int, void *, const void *, size_t); + bool (*alloc_func) (int, size_t, void**); + bool (*free_func) (int, void *); + bool (*dev2host_func) (int, void *, const void *, size_t); + bool (*host2dev_func) (int, void *, const void *, size_t); void (*run_func) (int, void *, void *); /* Splay tree containing information about mapped memory regions. */ @@ -780,7 +780,7 @@ extern struct target_mem_desc *gomp_map_vars (stru size_t *, void *, bool, bool); extern void gomp_copy_from_async (struct target_mem_desc *); extern void gomp_unmap_vars (struct target_mem_desc *, bool); -extern void gomp_init_device (struct gomp_device_descr *); +extern bool gomp_init_device (struct gomp_device_descr *); extern void gomp_free_memmap (struct splay_tree_s *); extern void gomp_fini_device (struct gomp_device_descr *); extern void gomp_unload_device (struct gomp_device_descr *); Index: libgomp/oacc-host.c =================================================================== --- libgomp/oacc-host.c (revision 227257) +++ libgomp/oacc-host.c (working copy) @@ -60,14 +60,16 @@ host_get_num_devices (void) return 1; } -static void +static bool host_init_device (int n __attribute__ ((unused))) { + return true; } -static void +static bool host_fini_device (int n __attribute__ ((unused))) { + return true; } static unsigned @@ -92,34 +94,36 @@ host_unload_image (int n __attribute__ ((unused)), { } -static void * -host_alloc (int n __attribute__ ((unused)), size_t s) +static bool +host_alloc (int n __attribute__ ((unused)), size_t s, void **ptr) { - return gomp_malloc (s); + *ptr = gomp_malloc (s); + return true; } -static void +static bool host_free (int n __attribute__ ((unused)), void *p) { free (p); + return true; } -static void * +static bool host_dev2host (int n __attribute__ ((unused)), void *h __attribute__ ((unused)), const void *d __attribute__ ((unused)), size_t s __attribute__ ((unused))) { - return NULL; + return true; } -static void * +static bool host_host2dev (int n __attribute__ ((unused)), void *d __attribute__ ((unused)), const void *h __attribute__ ((unused)), size_t s __attribute__ ((unused))) { - return NULL; + return true; } static void Index: libgomp/oacc-init.c =================================================================== --- libgomp/oacc-init.c (revision 227257) +++ libgomp/oacc-init.c (working copy) @@ -231,8 +231,10 @@ acc_init_1 (acc_device_t d) gomp_fatal ("device already active"); } - gomp_init_device (acc_dev); + bool ret = gomp_init_device (acc_dev); gomp_mutex_unlock (&acc_dev->lock); + if (!ret) + gomp_fatal ("device initialization failed"); return base_dev; } @@ -503,13 +505,17 @@ acc_set_device_type (acc_device_t d) cached_base_dev = base_dev = resolve_device (d, true); acc_dev = &base_dev[goacc_device_num]; + bool ret = true; gomp_mutex_lock (&acc_dev->lock); if (!acc_dev->is_initialized) - gomp_init_device (acc_dev); + ret = gomp_init_device (acc_dev); gomp_mutex_unlock (&acc_dev->lock); gomp_mutex_unlock (&acc_device_lock); + if (!ret) + gomp_fatal ("device initialization failed"); + /* We're changing device type: invalidate the current thread's dev and base_dev pointers. */ if (thr && thr->base_dev != base_dev) @@ -605,13 +611,17 @@ acc_set_device_num (int ord, acc_device_t d) acc_dev = &base_dev[ord]; + bool ret = true; gomp_mutex_lock (&acc_dev->lock); if (!acc_dev->is_initialized) - gomp_init_device (acc_dev); + ret = gomp_init_device (acc_dev); gomp_mutex_unlock (&acc_dev->lock); gomp_mutex_unlock (&acc_device_lock); + if (!ret) + gomp_fatal ("device initialization failed"); + goacc_attach_host_thread_to_device (ord); } Index: libgomp/oacc-mem.c =================================================================== --- libgomp/oacc-mem.c (revision 227257) +++ libgomp/oacc-mem.c (working copy) @@ -105,7 +105,10 @@ acc_malloc (size_t s) assert (thr->dev); - return thr->dev->alloc_func (thr->dev->target_id, s); + void *ptr; + if (!thr->dev->alloc_func (thr->dev->target_id, s, &ptr)) + gomp_fatal ("error in %s", __FUNCTION__); + return ptr; } /* OpenACC 2.0a (3.2.16) doesn't specify what to do in the event @@ -143,7 +146,8 @@ acc_free (void *d) else gomp_mutex_unlock (&acc_dev->lock); - acc_dev->free_func (acc_dev->target_id, d); + if (!acc_dev->free_func (acc_dev->target_id, d)) + gomp_fatal ("error in %s", __FUNCTION__); } void @@ -155,7 +159,8 @@ acc_memcpy_to_device (void *d, void *h, size_t s) assert (thr && thr->dev); - thr->dev->host2dev_func (thr->dev->target_id, d, h, s); + if (!thr->dev->host2dev_func (thr->dev->target_id, d, h, s)) + gomp_fatal ("error in %s", __FUNCTION__); } void @@ -167,7 +172,8 @@ acc_memcpy_from_device (void *h, void *d, size_t s assert (thr && thr->dev); - thr->dev->dev2host_func (thr->dev->target_id, h, d, s); + if (!thr->dev->dev2host_func (thr->dev->target_id, h, d, s)) + gomp_fatal ("error in %s", __FUNCTION__); } /* Return the device pointer that corresponds to host data H. Or NULL Index: libgomp/target.c =================================================================== --- libgomp/target.c (revision 227257) +++ libgomp/target.c (working copy) @@ -313,8 +313,13 @@ gomp_map_vars (struct gomp_device_descr *devicep, /* Allocate tgt_align aligned tgt_size block of memory. */ /* FIXME: Perhaps change interface to allocate properly aligned memory. */ - tgt->to_free = devicep->alloc_func (devicep->target_id, - tgt_size + tgt_align - 1); + if (!devicep->alloc_func (devicep->target_id, tgt_size + tgt_align - 1, + &tgt->to_free)) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("device memory allocation fail"); + } + tgt->tgt_start = (uintptr_t) tgt->to_free; tgt->tgt_start = (tgt->tgt_start + tgt_align - 1) & ~(tgt_align - 1); tgt->tgt_end = tgt->tgt_start + tgt_size; @@ -482,15 +487,17 @@ gomp_map_vars (struct gomp_device_descr *devicep, return tgt; } -static void +static bool gomp_unmap_tgt (struct target_mem_desc *tgt) { /* Deallocate on target the tgt->tgt_start .. tgt->tgt_end region. */ + bool ret = true; if (tgt->tgt_end) - tgt->device_descr->free_func (tgt->device_descr->target_id, tgt->to_free); - + ret = tgt->device_descr->free_func (tgt->device_descr->target_id, + tgt->to_free); free (tgt->array); free (tgt); + return ret; } /* Decrease the refcount for a set of mapped variables, and queue asychronous @@ -504,6 +511,7 @@ gomp_copy_from_async (struct target_mem_desc *tgt) { struct gomp_device_descr *devicep = tgt->device_descr; size_t i; + bool ret = true; gomp_mutex_lock (&devicep->lock); @@ -519,12 +527,17 @@ gomp_copy_from_async (struct target_mem_desc *tgt) { splay_tree_key k = tgt->list[i]; if (k->copy_from) - devicep->dev2host_func (devicep->target_id, (void *) k->host_start, - (void *) (k->tgt->tgt_start + k->tgt_offset), - k->host_end - k->host_start); + ret &= devicep->dev2host_func (devicep->target_id, + (void *) k->host_start, + (void *) (k->tgt->tgt_start + + k->tgt_offset), + k->host_end - k->host_start); } gomp_mutex_unlock (&devicep->lock); + + if (!ret) + gomp_fatal ("device to host copy fail"); } /* Unmap variables described by TGT. If DO_COPYFROM is true, copy relevant @@ -544,6 +557,7 @@ gomp_unmap_vars (struct target_mem_desc *tgt, bool gomp_mutex_lock (&devicep->lock); + bool ret = true; size_t i; for (i = 0; i < tgt->list_count; i++) if (tgt->list[i] == NULL) @@ -556,22 +570,27 @@ gomp_unmap_vars (struct target_mem_desc *tgt, bool { splay_tree_key k = tgt->list[i]; if (k->copy_from && do_copyfrom) - devicep->dev2host_func (devicep->target_id, (void *) k->host_start, - (void *) (k->tgt->tgt_start + k->tgt_offset), - k->host_end - k->host_start); + ret &= devicep->dev2host_func (devicep->target_id, + (void *) k->host_start, + (void *) (k->tgt->tgt_start + + k->tgt_offset), + k->host_end - k->host_start); splay_tree_remove (&devicep->mem_map, k); if (k->tgt->refcount > 1) k->tgt->refcount--; else - gomp_unmap_tgt (k->tgt); + ret &= gomp_unmap_tgt (k->tgt); } if (tgt->refcount > 1) tgt->refcount--; else - gomp_unmap_tgt (tgt); + ret &= gomp_unmap_tgt (tgt); gomp_mutex_unlock (&devicep->lock); + + if (!ret) + gomp_fatal ("error in unmapping device memory variables"); } static void @@ -879,11 +898,12 @@ GOMP_offload_unregister (const void *host_table, i /* This function initializes the target device, specified by DEVICEP. DEVICEP must be locked on entry, and remains locked on return. */ -attribute_hidden void +attribute_hidden bool gomp_init_device (struct gomp_device_descr *devicep) { int i; - devicep->init_device_func (devicep->target_id); + if (!devicep->init_device_func (devicep->target_id)) + return false; /* Load to device all images registered by the moment. */ for (i = 0; i < num_offload_images; i++) @@ -896,6 +916,7 @@ gomp_init_device (struct gomp_device_descr *device } devicep->is_initialized = true; + return true; } attribute_hidden void @@ -980,11 +1001,15 @@ GOMP_target (int device, void (*fn) (void *), cons return; } + bool ret = true; gomp_mutex_lock (&devicep->lock); if (!devicep->is_initialized) - gomp_init_device (devicep); + ret = gomp_init_device (devicep); gomp_mutex_unlock (&devicep->lock); + if (!ret) + gomp_fatal ("device initialization failed"); + void *fn_addr; if (devicep->capabilities & GOMP_OFFLOAD_CAP_NATIVE_EXEC) @@ -1048,11 +1073,15 @@ GOMP_target_data (int device, const void *unused, return; } + bool ret = true; gomp_mutex_lock (&devicep->lock); if (!devicep->is_initialized) - gomp_init_device (devicep); + ret = gomp_init_device (devicep); gomp_mutex_unlock (&devicep->lock); + if (!ret) + gomp_fatal ("device initialization failed"); + struct target_mem_desc *tgt = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, false, false); @@ -1083,11 +1112,15 @@ GOMP_target_update (int device, const void *unused || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)) return; + bool ret = true; gomp_mutex_lock (&devicep->lock); if (!devicep->is_initialized) - gomp_init_device (devicep); + ret = gomp_init_device (devicep); gomp_mutex_unlock (&devicep->lock); + if (!ret) + gomp_fatal ("device initialization failed"); + gomp_update (devicep, mapnum, hostaddrs, sizes, kinds, false); }