From patchwork Wed Jan 23 08:22:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries <tdevries@suse.de>
X-Patchwork-Id: 1029786
Return-Path: 
 <gcc-patches-return-494578-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
	spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org
	(client-ip=209.132.180.131; helo=sourceware.org;
	envelope-from=gcc-patches-return-494578-incoming=patchwork.ozlabs.org@gcc.gnu.org;
	receiver=<UNKNOWN>)
Authentication-Results: ozlabs.org;
	dmarc=none (p=none dis=none) header.from=suse.de
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org
	header.b="qXyzcTF1"; dkim-atps=neutral
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 43kyvs6Jzjz9s3q
	for <incoming@patchwork.ozlabs.org>;
	Wed, 23 Jan 2019 19:22:49 +1100 (AEDT)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:cc:subject:message-id:mime-version:content-type; q=dns;
	s=default; b=U3qNtnjKecWI/tKYq+xSXHUM2r1Ebadm0BHN/hqBVz1Jd70JPh
	8zN0B6maFgtJDJEzDCdWNenPkXhJ+ZtScMUEV5dOQHqyVHPFzNg5D7NUi085NkE8
	rH5UVEU2WUJrpIRNiu42urNZc+Ypm19b7/L60u60/FXFqpmR+3xe/4py4=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:cc:subject:message-id:mime-version:content-type; s=
	default; bh=rTwu7eqiI8r6I+LzBiOH1TE8O5k=; b=qXyzcTF1v3ZM/TYT5XNG
	a+dL1MbstGINJs+LtYt9GfoejX29+FIIaeliEP3xCIdgBRkeH8cK0an2criXfKCM
	VI6Uk283NTyV5d51wEjd1039qVGA/Kk543iacp1RrGCo9Vowvd7IrgBuSWc7WFEQ
	8+BTKBYUyIc556AQu0ESBWU=
Received: (qmail 97112 invoked by alias); 23 Jan 2019 08:22:33 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 96989 invoked by uid 89); 23 Jan 2019 08:22:33 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0,
	GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
	SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: mx1.suse.de
Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Wed, 23 Jan 2019 08:22:31 +0000
Received: from relay2.suse.de (unknown [195.135.220.254])	by mx1.suse.de
	(Postfix) with ESMTP id 59343B003;
	Wed, 23 Jan 2019 08:22:28 +0000 (UTC)
Date: Wed, 23 Jan 2019 09:22:58 +0100
From: Tom de Vries <tdevries@suse.de>
To: gcc-patches@gcc.gnu.org
Cc: Thomas Schwinge <thomas@schwinge.name>
Subject: [committed][nvptx, libgomp] Fix cuMemAlloc with size zero
Message-ID: <20190123082256.GA7888@delia>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.10.1 (2018-07-13)
X-IsSubscribed: yes

Hi,

Consider test-case:
...
int
main (void)
{
  #pragma acc parallel async
  ;
  #pragma acc parallel async
  ;
  #pragma acc wait

  return 0;
}
...

This fails with:
...
libgomp: cuMemAlloc error: invalid argument
Segmentation fault (core dumped)
...
The cuMemAlloc error is due to the fact that we're trying to allocate 0 bytes.

Fix this by not calling map_push with a size-zero argument in nvptx_exec.
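
As a minimal sketch of that guard, outside of libgomp: mock_dev_alloc and
reserve_args are hypothetical names made up for this illustration, and
mock_dev_alloc only imitates the one property that matters here, namely that
a zero-byte request is rejected the way cuMemAlloc rejects it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device allocator that, like cuMemAlloc,
   rejects a zero-byte request.  Returns 0 on success, -1 on error.  */
static int
mock_dev_alloc (void **ptr, size_t size)
{
  assert (ptr != NULL);
  if (size == 0)
    return -1;
  *ptr = malloc (size);
  return *ptr != NULL ? 0 : -1;
}

/* Reserve room for MAPNUM argument pointers, but only if there is
   something to map.  *DP stays NULL when MAPNUM is zero, so the
   allocator never sees a zero-byte request.  */
static int
reserve_args (void **dp, size_t mapnum)
{
  *dp = NULL;
  if (mapnum > 0)
    return mock_dev_alloc (dp, mapnum * sizeof (void *));
  return 0;
}
```

With mapnum == 0 the allocator is never reached and the pointer keeps its
initial value, which is the same shape as initializing dp to 0 and wrapping
the map_push call in the patch below.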

This also has the consequence that for the abort-1.c test-case, we end up
calling cuMemFree during map_fini for the struct cuda_map allocated in
map_init, which fails because an abort happened.  Fix this by calling
cuMemFree with CUDA_CALL_NOCHECK in cuda_map_destroy.
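
The difference between the two call styles can be sketched as follows.  This
is an illustration only: CALL_CHECKED, CALL_NOCHECK, mock_free and destroy_map
are hypothetical names, not the libgomp macros, and the sketch assumes (going
by the macro names and their use in the patch) that the checked form treats
any error as fatal while the unchecked form just hands back the result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cleanup call; returns nonzero for a negative handle,
   standing in for a free that fails once the device is unusable.  */
static int
mock_free (int handle)
{
  return handle < 0 ? 1 : 0;
}

/* Checked form: any nonzero result is reported and is fatal.  */
#define CALL_CHECKED(fn, ...)                          \
  do                                                   \
    {                                                  \
      int r_ = fn (__VA_ARGS__);                       \
      if (r_ != 0)                                     \
        {                                              \
          fprintf (stderr, #fn " error: %d\n", r_);    \
          abort ();                                    \
        }                                              \
    }                                                  \
  while (0)

/* Unchecked form: evaluates to the result; the caller decides.  */
#define CALL_NOCHECK(fn, ...) fn (__VA_ARGS__)

/* Teardown path: a failing free must not turn into a second fatal
   error, so the result is returned rather than asserted on.  */
static int
destroy_map (int handle)
{
  return CALL_NOCHECK (mock_free, handle);
}
```

Using CALL_CHECKED in destroy_map instead would abort for a negative handle,
which is the analogue of the failure seen in abort-1.c.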

Committed to trunk.

Thanks,
- Tom

[nvptx, libgomp] Fix cuMemAlloc with size zero

2019-01-22  Tom de Vries  <tdevries@suse.de>

	PR target/88946
	* plugin/plugin-nvptx.c (cuda_map_destroy): Use CUDA_CALL_NOCHECK for
	cuMemFree.
	(nvptx_exec): Don't call map_push if mapnum == 0.
	* testsuite/libgomp.oacc-c-c++-common/pr88946.c: New test.
---
 libgomp/plugin/plugin-nvptx.c                      | 48 +++++++++++++---------
 .../testsuite/libgomp.oacc-c-c++-common/pr88946.c  | 15 +++++++
 2 files changed, 43 insertions(+), 20 deletions(-)
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 4a67191932e..ff90b67cb86 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -260,7 +260,7 @@ cuda_map_destroy (struct cuda_map *map)
        atexit handler (PR83795).  */
     ;
   else
-    CUDA_CALL_ASSERT (cuMemFree, map->d);
+    CUDA_CALL_NOCHECK (cuMemFree, map->d);
 
   free (map);
 }
@@ -1164,7 +1164,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   struct ptx_stream *dev_str;
   void *kargs[1];
   void *hp;
-  CUdeviceptr dp;
+  CUdeviceptr dp = 0;
   struct nvptx_thread *nvthd = nvptx_thread ();
   int warp_size = nvthd->ptx_dev->warp_size;
   const char *maybe_abort_msg = "(perhaps abort was called)";
@@ -1361,23 +1361,27 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 			   dims[GOMP_DIM_VECTOR]);
     }
 
-  /* This reserves a chunk of a pre-allocated page of memory mapped on both
-     the host and the device. HP is a host pointer to the new chunk, and DP is
-     the corresponding device pointer.  */
-  pthread_mutex_lock (&ptx_event_lock);
-  dp = map_push (dev_str, mapnum * sizeof (void *));
-  pthread_mutex_unlock (&ptx_event_lock);
-
-  GOMP_PLUGIN_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
-
-  /* Copy the array of arguments to the mapped page.  */
-  hp = alloca(sizeof(void *) * mapnum);
-  for (i = 0; i < mapnum; i++)
-    ((void **) hp)[i] = devaddrs[i];
+  if (mapnum > 0)
+    {
+      /* This reserves a chunk of a pre-allocated page of memory mapped on both
+	 the host and the device. HP is a host pointer to the new chunk, and DP is
+	 the corresponding device pointer.  */
+      pthread_mutex_lock (&ptx_event_lock);
+      dp = map_push (dev_str, mapnum * sizeof (void *));
+      pthread_mutex_unlock (&ptx_event_lock);
+
+      GOMP_PLUGIN_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
+
+      /* Copy the array of arguments to the mapped page.  */
+      hp = alloca(sizeof(void *) * mapnum);
+      for (i = 0; i < mapnum; i++)
+	((void **) hp)[i] = devaddrs[i];
+
+      /* Copy the (device) pointers to arguments to the device */
+      CUDA_CALL_ASSERT (cuMemcpyHtoD, dp, hp,
+			mapnum * sizeof (void *));
+    }
 
-  /* Copy the (device) pointers to arguments to the device */
-  CUDA_CALL_ASSERT (cuMemcpyHtoD, dp, hp,
-		    mapnum * sizeof (void *));
   GOMP_PLUGIN_debug (0, "  %s: kernel %s: launch"
 		     " gangs=%u, workers=%u, vectors=%u\n",
 		     __FUNCTION__, targ_fn->launch->fn, dims[GOMP_DIM_GANG],
@@ -1422,7 +1426,8 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
       CUDA_CALL_ASSERT (cuEventRecord, *e, dev_str->stream);
 
-      event_add (PTX_EVT_KNL, e, (void *)dev_str, 0);
+      if (mapnum > 0)
+	event_add (PTX_EVT_KNL, e, (void *)dev_str, 0);
     }
 #else
   r = CUDA_CALL_NOCHECK (cuCtxSynchronize, );
@@ -1439,7 +1444,10 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 #ifndef DISABLE_ASYNC
   if (async < acc_async_noval)
 #endif
-    map_pop (dev_str);
+    {
+      if (mapnum > 0)
+	map_pop (dev_str);
+    }
 }
 
 void * openacc_get_current_cuda_context (void);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr88946.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr88946.c
new file mode 100644
index 00000000000..ad56ded1d2b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr88946.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+
+int
+main (void)
+{
+  #pragma acc parallel async
+  ;
+
+  #pragma acc parallel async
+  ;
+
+  #pragma acc wait
+
+  return 0;
+}