From patchwork Wed Jan 23 08:22:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 1029786 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-494578-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=suse.de Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="qXyzcTF1"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 43kyvs6Jzjz9s3q for ; Wed, 23 Jan 2019 19:22:49 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:mime-version:content-type; q=dns; s=default; b=U3qNtnjKecWI/tKYq+xSXHUM2r1Ebadm0BHN/hqBVz1Jd70JPh 8zN0B6maFgtJDJEzDCdWNenPkXhJ+ZtScMUEV5dOQHqyVHPFzNg5D7NUi085NkE8 rH5UVEU2WUJrpIRNiu42urNZc+Ypm19b7/L60u60/FXFqpmR+3xe/4py4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:mime-version:content-type; s= default; bh=rTwu7eqiI8r6I+LzBiOH1TE8O5k=; b=qXyzcTF1v3ZM/TYT5XNG a+dL1MbstGINJs+LtYt9GfoejX29+FIIaeliEP3xCIdgBRkeH8cK0an2criXfKCM VI6Uk283NTyV5d51wEjd1039qVGA/Kk543iacp1RrGCo9Vowvd7IrgBuSWc7WFEQ 8+BTKBYUyIc556AQu0ESBWU= Received: (qmail 97112 invoked by alias); 23 Jan 2019 08:22:33 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 96989 invoked by uid 89); 23 Jan 2019 08:22:33 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.suse.de Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 23 Jan 2019 08:22:31 +0000 Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 59343B003; Wed, 23 Jan 2019 08:22:28 +0000 (UTC) Date: Wed, 23 Jan 2019 09:22:58 +0100 From: Tom de Vries To: gcc-patches@gcc.gnu.org Cc: Thomas Schwinge Subject: [committed][nvptx, libgomp] Fix cuMemAlloc with size zero Message-ID: <20190123082256.GA7888@delia> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.10.1 (2018-07-13) X-IsSubscribed: yes Hi, Consider test-case: ... int main (void) { #pragma acc parallel async ; #pragma acc parallel async ; #pragma acc wait return 0; } ... This fails with: ... libgomp: cuMemAlloc error: invalid argument Segmentation fault (core dumped) ... The cuMemAlloc error is due to the fact that we're try to allocate 0 bytes. Fix this by preventing calling map_push with size zero argument in nvptx_exec. This also has the consequence that for the abort-1.c test-case, we end up calling cuMemFree during map_fini for the struct cuda_map allocated in map_init, which fails because an abort happened. Fix this by calling cuMemFree with CUDA_CALL_NOCHECK in cuda_map_destroy. Committed to trunk. Thanks, - Tom [nvptx, libgomp] Fix cuMemAlloc with size zero 2019-01-22 Tom de Vries PR target/PR88946 * plugin/plugin-nvptx.c (cuda_map_destroy): Use CUDA_CALL_NOCHECK for cuMemFree. (nvptx_exec): Don't call map_push if mapnum == 0. * testsuite/libgomp.oacc-c-c++-common/pr88946.c: New test. --- libgomp/plugin/plugin-nvptx.c | 48 +++++++++++++--------- .../testsuite/libgomp.oacc-c-c++-common/pr88946.c | 15 +++++++ 2 files changed, 43 insertions(+), 20 deletions(-) diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 4a67191932e..ff90b67cb86 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -260,7 +260,7 @@ cuda_map_destroy (struct cuda_map *map) atexit handler (PR83795). */ ; else - CUDA_CALL_ASSERT (cuMemFree, map->d); + CUDA_CALL_NOCHECK (cuMemFree, map->d); free (map); } @@ -1164,7 +1164,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, struct ptx_stream *dev_str; void *kargs[1]; void *hp; - CUdeviceptr dp; + CUdeviceptr dp = 0; struct nvptx_thread *nvthd = nvptx_thread (); int warp_size = nvthd->ptx_dev->warp_size; const char *maybe_abort_msg = "(perhaps abort was called)"; @@ -1361,23 +1361,27 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, dims[GOMP_DIM_VECTOR]); } - /* This reserves a chunk of a pre-allocated page of memory mapped on both - the host and the device. HP is a host pointer to the new chunk, and DP is - the corresponding device pointer. */ - pthread_mutex_lock (&ptx_event_lock); - dp = map_push (dev_str, mapnum * sizeof (void *)); - pthread_mutex_unlock (&ptx_event_lock); - - GOMP_PLUGIN_debug (0, " %s: prepare mappings\n", __FUNCTION__); - - /* Copy the array of arguments to the mapped page. */ - hp = alloca(sizeof(void *) * mapnum); - for (i = 0; i < mapnum; i++) - ((void **) hp)[i] = devaddrs[i]; + if (mapnum > 0) + { + /* This reserves a chunk of a pre-allocated page of memory mapped on both + the host and the device. HP is a host pointer to the new chunk, and DP is + the corresponding device pointer. */ + pthread_mutex_lock (&ptx_event_lock); + dp = map_push (dev_str, mapnum * sizeof (void *)); + pthread_mutex_unlock (&ptx_event_lock); + + GOMP_PLUGIN_debug (0, " %s: prepare mappings\n", __FUNCTION__); + + /* Copy the array of arguments to the mapped page. */ + hp = alloca(sizeof(void *) * mapnum); + for (i = 0; i < mapnum; i++) + ((void **) hp)[i] = devaddrs[i]; + + /* Copy the (device) pointers to arguments to the device */ + CUDA_CALL_ASSERT (cuMemcpyHtoD, dp, hp, + mapnum * sizeof (void *)); + } - /* Copy the (device) pointers to arguments to the device */ - CUDA_CALL_ASSERT (cuMemcpyHtoD, dp, hp, - mapnum * sizeof (void *)); GOMP_PLUGIN_debug (0, " %s: kernel %s: launch" " gangs=%u, workers=%u, vectors=%u\n", __FUNCTION__, targ_fn->launch->fn, dims[GOMP_DIM_GANG], @@ -1422,7 +1426,8 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, CUDA_CALL_ASSERT (cuEventRecord, *e, dev_str->stream); - event_add (PTX_EVT_KNL, e, (void *)dev_str, 0); + if (mapnum > 0) + event_add (PTX_EVT_KNL, e, (void *)dev_str, 0); } #else r = CUDA_CALL_NOCHECK (cuCtxSynchronize, ); @@ -1439,7 +1444,10 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, #ifndef DISABLE_ASYNC if (async < acc_async_noval) #endif - map_pop (dev_str); + { + if (mapnum > 0) + map_pop (dev_str); + } } void * openacc_get_current_cuda_context (void); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr88946.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr88946.c new file mode 100644 index 00000000000..ad56ded1d2b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr88946.c @@ -0,0 +1,15 @@ +/* { dg-do run } */ + +int +main (void) +{ + #pragma acc parallel async + ; + + #pragma acc parallel async + ; + + #pragma acc wait + + return 0; +}