From patchwork Fri May 22 22:21:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 1296486 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49TLdx1whmz9sPK for ; Sat, 23 May 2020 08:25:09 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7EBC2398403F; Fri, 22 May 2020 22:25:05 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 591903851C0C for ; Fri, 22 May 2020 22:25:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 591903851C0C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Julian_Brown@mentor.com IronPort-SDR: jrKUMmXawHTiQ7HYq/EFpgscrmOpi+jmR6VnBCuCb0hiwnVWQEY/cnoV48qwNysq35vIyKHPlE ok/F0OpCV3ikbECAXB9Cxnf5ju/GBTic//GfFmb2MB3ImWCHLGDkuiEeZxx/p73U3INtm3/tq9 dPzz7qE/GCRnjMYPStAUpBJS4AXqs3B6u4BZJWD/629pplJHJUL+eZLZd7RJeDJCK3SugnSImb zU2mxq66GkmdXrwpDr+4W1AeqXndt5rX2uG7S1NxS7VvF0rwb+knL9DUdYuo7vFhmJZfC5AZYY JYc= X-IronPort-AV: E=Sophos;i="5.73,423,1583222400"; d="scan'208";a="49127128" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa3.mentor.iphmx.com with ESMTP; 22 May 2020 14:25:00 -0800 IronPort-SDR: n5TZRCN1E6EQ7J7ac7Ywymp4cyUndCY+joQatpudOnNWKWPXh1OhMOf+hNAbRMV5DAPtgv29zm o2v9JSxvXK8a19+njrIy6oK7j+9sEWKgdL0rEQTKhymCcTYDSNEij8OJNHI67yygB37auwKaFd tONOIiLBXvmJxqme3EvDGZgpf4gl0boP/l7cUQioUVAQYYA04WuGJxtPLX0H+dREd+ETtzjd7p d/xdYlp1LieJGaze5KYiWmRe3PFeqfVHsnCqgcdAwBrtfjBEutOKLxjJ3bNOCpLbLSV/HNUmxK d4Q= From: Julian Brown To: Subject: [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version) Date: Fri, 22 May 2020 15:21:44 -0700 Message-ID: <1cc9ea4c5807c5da2df9f17a0a11935e78b0c721.1590182783.git.julian@codesourcery.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: SVR-IES-MBX-08.mgc.mentorg.com (139.181.222.8) To SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) X-Spam-Status: No, score=-13.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Catherine_Moore@mentor.com, jakub@redhat.com, thomas@codesourcery.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" This is a new version of the reference count self-checking code, adjusted to work with the new (old) dynamic_refcount counting scheme. The key observation is that a target_mem_desc that was created from a dynamic data lifetime should not contribute to the structured refcount for splay tree keys in its variable list. We can figure out which target_mem_descs that applies to using the information recorded in the previous patch. In a sense, this takes the "awkward corner cases" from the virtual_refcount ("overhaul") patch, and moves them to the optional self-test code, where they can potentially do less harm. With this, we still have a formal-ish model of what refcounts mean and some confidence that they remain consistent (at least throughout execution of a test run), which I think is a good thing. OK? (We probably want a way of configuring-in this testing automatically, as mentioned previously.) Julian ChangeLog libgomp/ * libgomp.h (RC_CHECKING): New macro, disabled by default, guarding all hunks in this patch. (target_mem_desc): Add refcount_chk, mark fields. (splay_tree_key_s): Add refcount_chk field. (dump_tgt, gomp_rc_check): Add prototypes. * oacc-mem.c (GOACC_enter_exit_data): Add refcount self-check code. * oacc-parallel.c (GOACC_parallel_keyed_internal): Add refcount self-check code. (GOACC_data_start, GOACC_data_end, GOACC_enter_exit_data): Likewise. * target.c (stdio.h): Include. (dump_tgt, rc_check_clear, rc_check_count, rc_check_verify, gomp_rc_check): New functions to consistency-check reference counts. --- libgomp/libgomp.h | 18 ++++ libgomp/oacc-mem.c | 6 ++ libgomp/oacc-parallel.c | 27 ++++++ libgomp/target.c | 185 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 236 insertions(+) diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index 0d1978ffb13..eaa7c6ebb4c 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -960,9 +960,17 @@ struct target_var_desc { uintptr_t length; }; +/* Uncomment to enable reference-count consistency checking (for development + use only). */ +//#define RC_CHECKING 1 + struct target_mem_desc { /* Reference count. */ uintptr_t refcount; +#ifdef RC_CHECKING + uintptr_t refcount_chk; + bool mark; +#endif /* All the splay nodes allocated together. */ splay_tree_node array; /* Start of the target region. */ @@ -1019,6 +1027,10 @@ struct splay_tree_key_s { uintptr_t refcount; /* Dynamic reference count. */ uintptr_t dynamic_refcount; +#ifdef RC_CHECKING + /* The recalculated reference count, for verification. */ + uintptr_t refcount_chk; +#endif struct splay_tree_aux *aux; }; @@ -1174,6 +1186,12 @@ extern void gomp_detach_pointer (struct gomp_device_descr *, struct goacc_asyncqueue *, splay_tree_key, uintptr_t, bool, struct gomp_coalesce_buf *); +#ifdef RC_CHECKING +extern void dump_tgt (const char *, struct target_mem_desc *); +extern void gomp_rc_check (struct gomp_device_descr *, + struct target_mem_desc *); +#endif + extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *, size_t, void **, void **, size_t *, void *, bool, diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 038ab68e8a2..c8ec3c9a7dd 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -1450,4 +1450,10 @@ GOACC_enter_exit_data (int flags_m, size_t mapnum, void **hostaddrs, thr->prof_info = NULL; thr->api_info = NULL; } + +#ifdef RC_CHECKING + gomp_mutex_lock (&acc_dev->lock); + gomp_rc_check (acc_dev, thr->mapped_data); + gomp_mutex_unlock (&acc_dev->lock); +#endif } diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c index c7e46e35bd6..0774cdc7e4f 100644 --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -301,6 +301,15 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *), &api_info); } +#ifdef RC_CHECKING + gomp_mutex_lock (&acc_dev->lock); + assert (tgt); + dump_tgt (__FUNCTION__, tgt); + tgt->prev = thr->mapped_data; + gomp_rc_check (acc_dev, tgt); + gomp_mutex_unlock (&acc_dev->lock); +#endif + devaddrs = gomp_alloca (sizeof (void *) * mapnum); for (i = 0; i < mapnum; i++) devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i); @@ -347,6 +356,12 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *), thr->prof_info = NULL; thr->api_info = NULL; } + +#ifdef RC_CHECKING + gomp_mutex_lock (&acc_dev->lock); + gomp_rc_check (acc_dev, thr->mapped_data); + gomp_mutex_unlock (&acc_dev->lock); +#endif } /* Legacy entry point (GCC 5). Only provide host fallback execution. */ @@ -481,6 +496,12 @@ GOACC_data_start (int flags_m, size_t mapnum, thr->prof_info = NULL; thr->api_info = NULL; } + +#ifdef RC_CHECKING + gomp_mutex_lock (&acc_dev->lock); + gomp_rc_check (acc_dev, thr->mapped_data); + gomp_mutex_unlock (&acc_dev->lock); +#endif } void @@ -554,6 +575,12 @@ GOACC_data_end (void) thr->prof_info = NULL; thr->api_info = NULL; } + +#ifdef RC_CHECKING + gomp_mutex_lock (&thr->dev->lock); + gomp_rc_check (thr->dev, thr->mapped_data); + gomp_mutex_unlock (&thr->dev->lock); +#endif } void diff --git a/libgomp/target.c b/libgomp/target.c index 1d60d0cb573..9a51e1c70f6 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -38,6 +38,9 @@ #include #include #include +#ifdef RC_CHECKING +#include +#endif #ifdef PLUGIN_SUPPORT #include @@ -347,6 +350,188 @@ gomp_free_device_memory (struct gomp_device_descr *devicep, void *devptr) } } +#ifdef RC_CHECKING +void +dump_tgt (const char *where, struct target_mem_desc *tgt) +{ + if (!getenv ("GOMP_DEBUG_TGT")) + return; + + fprintf (stderr, "%s: %s: tgt=%p\n", __FUNCTION__, where, (void*) tgt); + fprintf (stderr, "refcount=%d\n", (int) tgt->refcount); + fprintf (stderr, "tgt_start=%p\n", (void*) tgt->tgt_start); + fprintf (stderr, "tgt_end=%p\n", (void*) tgt->tgt_end); + fprintf (stderr, "to_free=%p\n", tgt->to_free); + fprintf (stderr, "list_count=%d\n", (int) tgt->list_count); + for (int i = 0; i < tgt->list_count; i++) + { + fprintf (stderr, "list item %d:\n", i); + fprintf (stderr, " key: %p\n", (void*) tgt->list[i].key); + if (tgt->list[i].key) + { + fprintf (stderr, " key.host_start=%p\n", + (void*) tgt->list[i].key->host_start); + fprintf (stderr, " key.host_end=%p\n", + (void*) tgt->list[i].key->host_end); + fprintf (stderr, " key.tgt=%p\n", (void*) tgt->list[i].key->tgt); + fprintf (stderr, " key.offset=%d\n", + (int) tgt->list[i].key->tgt_offset); + fprintf (stderr, " key.refcount=%d\n", + (int) tgt->list[i].key->refcount); + fprintf (stderr, " key.dynamic_refcount=%d\n", + (int) tgt->list[i].key->dynamic_refcount); + if (tgt->list[i].key->aux) + { + fprintf (stderr, " key.aux.link_key=%p\n", + (void*) tgt->list[i].key->aux->link_key); + fprintf (stderr, " key.aux.attach_count=%p\n", + (void*) tgt->list[i].key->aux->attach_count); + } + } + } + fprintf (stderr, "\n"); +} + +static void +rc_check_clear (splay_tree_node node) +{ + splay_tree_key k = &node->key; + + k->refcount_chk = 0; + k->tgt->refcount_chk = 0; + k->tgt->mark = false; + + if (node->left) + rc_check_clear (node->left); + if (node->right) + rc_check_clear (node->right); +} + +static void +rc_check_count (splay_tree_node node) +{ + splay_tree_key k = &node->key; + struct target_mem_desc *t; + + /* Add dynamic reference counts ("acc enter data", etc.) for this key. */ + k->refcount_chk += k->dynamic_refcount; + + t = k->tgt; + t->refcount_chk++; + + /* Do not count references from tgt_mem_descs that arise from dynamic data + lifetimes: those are counted already by their keys' dynamic_refcount. */ + if (!t->mark && goacc_marked_dynamic_p (t)) + t->mark = true; + else if (!t->mark) + { + for (int i = 0; i < t->list_count; i++) + if (t->list[i].key) + t->list[i].key->refcount_chk++; + + t->mark = true; + } + + if (node->left) + rc_check_count (node->left); + if (node->right) + rc_check_count (node->right); +} + +static bool +rc_check_verify (splay_tree_node node, bool noisy, bool errors) +{ + splay_tree_key k = &node->key; + struct target_mem_desc *t; + + if (k->refcount != REFCOUNT_INFINITY) + { + if (noisy) + fprintf (stderr, "key %p (%p..+%d): rc=%d/%d, dyn_rc=%d\n", k, + (void *) k->host_start, (int) (k->host_end - k->host_start), + (int) k->refcount, (int) k->refcount_chk, + (int) k->dynamic_refcount); + + if (k->refcount != k->refcount_chk) + { + if (noisy) + fprintf (stderr, " -- key refcount mismatch!\n"); + errors = true; + } + + t = k->tgt; + + if (noisy) + fprintf (stderr, "tgt %p: rc=%d/%d\n", t, (int) t->refcount, + (int) t->refcount_chk); + + if (t->refcount != t->refcount_chk) + { + if (noisy) + fprintf (stderr, + " -- target memory descriptor refcount mismatch!\n"); + errors = true; + } + } + + if (node->left) + errors |= rc_check_verify (node->left, noisy, errors); + if (node->right) + errors |= rc_check_verify (node->right, noisy, errors); + + return errors; +} + +/* Call with device locked. */ + +attribute_hidden void +gomp_rc_check (struct gomp_device_descr *devicep, struct target_mem_desc *tgt) +{ + splay_tree sp = &devicep->mem_map; + + bool noisy = getenv ("GOMP_DEBUG_TGT") != 0; + + if (noisy) + fprintf (stderr, "\n*** GOMP_RC_CHECK ***\n\n"); + + if (sp->root) + { + rc_check_clear (sp->root); + + for (struct target_mem_desc *t = tgt; t; t = t->prev) + { + t->refcount_chk = 0; + t->mark = false; + } + + /* Add references for interconnected splay-tree keys. */ + rc_check_count (sp->root); + + /* Add references for the tgt for a currently-executing kernel and/or + any enclosing data directives. */ + for (struct target_mem_desc *t = tgt; t; t = t->prev) + { + t->refcount_chk++; + + if (!t->mark) + { + for (int i = 0; i < t->list_count; i++) + if (t->list[i].key) + t->list[i].key->refcount_chk++; + + t->mark = true; + } + } + + if (rc_check_verify (sp->root, noisy, false)) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("refcount checking failure"); + } + } +} +#endif + /* Handle the case where gomp_map_lookup, splay_tree_lookup or gomp_map_0len_lookup found oldn for newn. Helper function of gomp_map_vars. */