From patchwork Wed Apr 22 18:42:43 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 463776 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 335BF140134 for ; Thu, 23 Apr 2015 04:43:11 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass reason="1024-bit key; unprotected key" header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=QiyKP5Gi; dkim-adsp=none (unprotected policy); dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=Z1XFYitVXkPa9aLTe77DVEILIAs4GMtKk/Vo0wav63W5+ATIwxzaY /2ZpjmCTslSwBRcCuJwMwXkd45Uh66UP5S7Q4/SF5eiEYQo/xYItATQs6ZU/J73d xJiyjJlociGcJ2ZM652S4cZ+Hfe87IA8W/xxFYnE5i/qb+f/KcpZPI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=jNlyUxmJbg87jqUwdraskjn/3gs=; b=QiyKP5GiKk1ELfEUgfm+ 7cVkvDqDE7H66uMfJpd/GKx3Ys4Osqj3VaicM5DcJl3N21M8/+ww4/JCO75ZmR+M 62lPIFnGvhVruReEfi+xfuNfrcx+oV2togVrwXzgtXfMkoGiJfZ79clljLb+pFPC mII7Q1VLg6V2Vb+pDv6BBzo= Received: (qmail 68114 invoked by alias); 22 Apr 2015 18:43:01 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 68096 invoked by uid 89); 22 Apr 2015 18:43:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 22 Apr 2015 18:42:58 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-01.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1Ykzbp-0004Zq-W7 from Julian_Brown@mentor.com for gcc-patches@gcc.gnu.org; Wed, 22 Apr 2015 11:42:54 -0700 Received: from octopus (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.3.224.2; Wed, 22 Apr 2015 19:42:51 +0100 Date: Wed, 22 Apr 2015 19:42:43 +0100 From: Julian Brown To: , Thomas Schwinge Subject: [PATCH] Tidy up locking for libgomp OpenACC entry points Message-ID: <20150422194243.115f26fa@octopus> MIME-Version: 1.0 X-IsSubscribed: yes Hi, This patch is an attempt to fix some potential race conditions with accesses to shared data structures from multiple concurrent threads in libgomp's OpenACC entry points. The main change is to move locking out of lookup_host and lookup_dev in oacc-mem.c and into their callers (which can then hold the locks for the whole operation that they are performing). Also missing locking has been added for gomp_acc_insert_pointer. Tests look OK (with offloading to NVidia PTX). OK? (For the gomp4 branch, maybe, if trunk's not suitable at the moment?) Thanks, Julian ChangeLog libgomp/ * oacc-mem.c (lookup_host): Remove locking from function. Note locking requirement for caller in function comment. (lookup_dev): Likewise. (acc_free, acc_deviceptr, acc_hostptr, acc_is_present) (acc_map_data, acc_unmap_data, present_create_copy, delete_copyout) (update_dev_host, gomp_acc_insert_pointer, gomp_acc_remove_pointer): Add locking. commit 983e08e46be24380a52095851cd9c6eb481eb47c Author: Julian Brown Date: Tue Apr 21 12:42:17 2015 -0700 More locking in oacc-mem.c diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 89ef5fc..d53af4b 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -35,7 +35,8 @@ #include #include -/* Return block containing [H->S), or NULL if not contained. */ +/* Return block containing [H->S), or NULL if not contained. The device lock + for DEV must be locked on entry, and remains locked on exit. */ static splay_tree_key lookup_host (struct gomp_device_descr *dev, void *h, size_t s) @@ -46,9 +47,7 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s) node.host_start = (uintptr_t) h; node.host_end = (uintptr_t) h + s; - gomp_mutex_lock (&dev->lock); key = splay_tree_lookup (&dev->mem_map, &node); - gomp_mutex_unlock (&dev->lock); return key; } @@ -56,7 +55,8 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s) /* Return block containing [D->S), or NULL if not contained. The list isn't ordered by device address, so we have to iterate over the whole array. This is not expected to be a common - operation. */ + operation. The device lock associated with TGT must be locked on entry, and + remains locked on exit. */ static splay_tree_key lookup_dev (struct target_mem_desc *tgt, void *d, size_t s) @@ -67,16 +67,12 @@ lookup_dev (struct target_mem_desc *tgt, void *d, size_t s) if (!tgt) return NULL; - gomp_mutex_lock (&tgt->device_descr->lock); - for (t = tgt; t != NULL; t = t->prev) { if (t->tgt_start <= (uintptr_t) d && t->tgt_end >= (uintptr_t) d + s) break; } - gomp_mutex_unlock (&tgt->device_descr->lock); - if (!t) return NULL; @@ -120,25 +116,32 @@ acc_free (void *d) { splay_tree_key k; struct goacc_thread *thr = goacc_thread (); + struct gomp_device_descr *acc_dev = thr->dev; if (!d) return; assert (thr && thr->dev); + gomp_mutex_lock (&acc_dev->lock); + /* We don't have to call lazy open here, as the ptr value must have been returned by acc_malloc. It's not permitted to pass NULL in (unless you got that null from acc_malloc). */ - if ((k = lookup_dev (thr->dev->openacc.data_environ, d, 1))) - { - void *offset; + if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1))) + { + void *offset; + + offset = d - k->tgt->tgt_start + k->tgt_offset; - offset = d - k->tgt->tgt_start + k->tgt_offset; + gomp_mutex_unlock (&acc_dev->lock); - acc_unmap_data ((void *)(k->host_start + offset)); - } + acc_unmap_data ((void *)(k->host_start + offset)); + } + else + gomp_mutex_unlock (&acc_dev->lock); - thr->dev->free_func (thr->dev->target_id, d); + acc_dev->free_func (acc_dev->target_id, d); } void @@ -178,16 +181,24 @@ acc_deviceptr (void *h) goacc_lazy_initialize (); struct goacc_thread *thr = goacc_thread (); + struct gomp_device_descr *dev = thr->dev; + + gomp_mutex_lock (&dev->lock); - n = lookup_host (thr->dev, h, 1); + n = lookup_host (dev, h, 1); if (!n) - return NULL; + { + gomp_mutex_unlock (&dev->lock); + return NULL; + } offset = h - n->host_start; d = n->tgt->tgt_start + n->tgt_offset + offset; + gomp_mutex_unlock (&dev->lock); + return d; } @@ -204,16 +215,24 @@ acc_hostptr (void *d) goacc_lazy_initialize (); struct goacc_thread *thr = goacc_thread (); + struct gomp_device_descr *acc_dev = thr->dev; - n = lookup_dev (thr->dev->openacc.data_environ, d, 1); + gomp_mutex_lock (&acc_dev->lock); + + n = lookup_dev (acc_dev->openacc.data_environ, d, 1); if (!n) - return NULL; + { + gomp_mutex_unlock (&acc_dev->lock); + return NULL; + } offset = d - n->tgt->tgt_start + n->tgt_offset; h = n->host_start + offset; + gomp_mutex_unlock (&acc_dev->lock); + return h; } @@ -232,6 +251,8 @@ acc_is_present (void *h, size_t s) struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + gomp_mutex_lock (&acc_dev->lock); + n = lookup_host (acc_dev, h, s); if (n && ((uintptr_t)h < n->host_start @@ -239,6 +260,8 @@ acc_is_present (void *h, size_t s) || s > n->host_end - n->host_start)) n = NULL; + gomp_mutex_unlock (&acc_dev->lock); + return n != NULL; } @@ -274,20 +297,32 @@ acc_map_data (void *h, void *d, size_t s) gomp_fatal ("[%p,+%d]->[%p,+%d] is a bad map", (void *)h, (int)s, (void *)d, (int)s); + gomp_mutex_lock (&acc_dev->lock); + if (lookup_host (acc_dev, h, s)) - gomp_fatal ("host address [%p, +%d] is already mapped", (void *)h, - (int)s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("host address [%p, +%d] is already mapped", (void *)h, + (int)s); + } if (lookup_dev (thr->dev->openacc.data_environ, d, s)) - gomp_fatal ("device address [%p, +%d] is already mapped", (void *)d, - (int)s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("device address [%p, +%d] is already mapped", (void *)d, + (int)s); + } + + gomp_mutex_unlock (&acc_dev->lock); tgt = gomp_map_vars (acc_dev, mapnum, &hostaddrs, &devaddrs, &sizes, &kinds, true, false); } + gomp_mutex_lock (&acc_dev->lock); tgt->prev = acc_dev->openacc.data_environ; acc_dev->openacc.data_environ = tgt; + gomp_mutex_unlock (&acc_dev->lock); } void @@ -299,17 +334,26 @@ acc_unmap_data (void *h) /* No need to call lazy open, as the address must have been mapped. */ size_t host_size; + + gomp_mutex_lock (&acc_dev->lock); + splay_tree_key n = lookup_host (acc_dev, h, 1); struct target_mem_desc *t; if (!n) - gomp_fatal ("%p is not a mapped block", (void *)h); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("%p is not a mapped block", (void *)h); + } host_size = n->host_end - n->host_start; if (n->host_start != (uintptr_t) h) - gomp_fatal ("[%p,%d] surrounds1 %p", - (void *) n->host_start, (int) host_size, (void *) h); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("[%p,%d] surrounds %p", + (void *) n->host_start, (int) host_size, (void *) h); + } t = n->tgt; @@ -323,8 +367,6 @@ acc_unmap_data (void *h) t->tgt_end = 0; t->to_free = 0; - gomp_mutex_lock (&acc_dev->lock); - for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL; tp = t, t = t->prev) if (n->tgt == t) @@ -336,10 +378,10 @@ acc_unmap_data (void *h) break; } - - gomp_mutex_unlock (&acc_dev->lock); } + gomp_mutex_unlock (&acc_dev->lock); + gomp_unmap_vars (t, true); } @@ -361,6 +403,8 @@ present_create_copy (unsigned f, void *h, size_t s) struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + gomp_mutex_lock (&acc_dev->lock); + n = lookup_host (acc_dev, h, s); if (n) { @@ -368,13 +412,22 @@ present_create_copy (unsigned f, void *h, size_t s) d = (void *) (n->tgt->tgt_start + n->tgt_offset); if (!(f & FLAG_PRESENT)) - gomp_fatal ("[%p,+%d] already mapped to [%p,+%d]", - (void *)h, (int)s, (void *)d, (int)s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("[%p,+%d] already mapped to [%p,+%d]", + (void *)h, (int)s, (void *)d, (int)s); + } if ((h + s) > (void *)n->host_end) - gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s); + } + + gomp_mutex_unlock (&acc_dev->lock); } else if (!(f & FLAG_CREATE)) { + gomp_mutex_unlock (&acc_dev->lock); gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s); } else @@ -389,6 +442,8 @@ present_create_copy (unsigned f, void *h, size_t s) else kinds = GOMP_MAP_ALLOC; + gomp_mutex_unlock (&acc_dev->lock); + tgt = gomp_map_vars (acc_dev, mapnum, &hostaddrs, NULL, &s, &kinds, true, false); @@ -439,25 +494,35 @@ delete_copyout (unsigned f, void *h, size_t s) struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + gomp_mutex_lock (&acc_dev->lock); + n = lookup_host (acc_dev, h, s); /* No need to call lazy open, as the data must already have been mapped. */ if (!n) - gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s); + } d = (void *) (n->tgt->tgt_start + n->tgt_offset); host_size = n->host_end - n->host_start; if (n->host_start != (uintptr_t) h || host_size != s) - gomp_fatal ("[%p,%d] surrounds2 [%p,+%d]", - (void *) n->host_start, (int) host_size, (void *) h, (int) s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("[%p,%d] surrounds2 [%p,+%d]", + (void *) n->host_start, (int) host_size, (void *) h, (int) s); + } if (f & FLAG_COPYOUT) acc_dev->dev2host_func (acc_dev->target_id, h, d, s); + gomp_mutex_unlock (&acc_dev->lock); + acc_unmap_data (h); acc_dev->free_func (acc_dev->target_id, d); @@ -482,20 +547,27 @@ update_dev_host (int is_dev, void *h, size_t s) struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + gomp_mutex_lock (&acc_dev->lock); + n = lookup_host (acc_dev, h, s); /* No need to call lazy open, as the data must already have been mapped. */ if (!n) - gomp_fatal ("[%p,%d] is not mapped", h, (int)s); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("[%p,%d] is not mapped", h, (int)s); + } d = (void *) (n->tgt->tgt_start + n->tgt_offset); + gomp_mutex_unlock (&acc_dev->lock); + if (is_dev) acc_dev->host2dev_func (acc_dev->target_id, d, h, s); else - acc_dev->dev2host_func (acc_dev->target_id, h, d, s); + acc_dev->dev2host_func (acc_dev->target_id, h, d, s); } void @@ -522,8 +594,11 @@ gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes, tgt = gomp_map_vars (acc_dev, mapnum, hostaddrs, NULL, sizes, kinds, true, false); gomp_debug (0, " %s: mappings prepared\n", __FUNCTION__); + + gomp_mutex_lock (&acc_dev->lock); tgt->prev = acc_dev->openacc.data_environ; acc_dev->openacc.data_environ = tgt; + gomp_mutex_unlock (&acc_dev->lock); } void @@ -535,10 +610,15 @@ gomp_acc_remove_pointer (void *h, bool force_copyfrom, int async, int mapnum) struct target_mem_desc *t; int minrefs = (mapnum == 1) ? 2 : 3; + gomp_mutex_lock (&acc_dev->lock); + n = lookup_host (acc_dev, h, 1); if (!n) - gomp_fatal ("%p is not a mapped block", (void *)h); + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("%p is not a mapped block", (void *)h); + } gomp_debug (0, " %s: restore mappings\n", __FUNCTION__); @@ -546,8 +626,6 @@ gomp_acc_remove_pointer (void *h, bool force_copyfrom, int async, int mapnum) struct target_mem_desc *tp; - gomp_mutex_lock (&acc_dev->lock); - if (t->refcount == minrefs) { /* This is the last reference, so pull the descriptor off the