From patchwork Wed Apr 20 11:35:28 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Thomas Schwinge
X-Patchwork-Id: 612654
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Thomas Schwinge
To: Jakub Jelinek ,
CC: Nathan Sidwell
Subject: libgomp: Make GCC 5 OpenACC offloading executables work (was:
 Openacc launch API)
In-Reply-To: <56099751.6000208@acm.org>
References: <55DC6DC2.5030600@acm.org> <55FA89AF.2020403@t-online.de>
 <55FAD0D9.1050205@acm.org> <55FBD59F.1060700@redhat.com>
 <20150924084034.GC1847@tucnak.redhat.com> <56099751.6000208@acm.org>
Date: Wed, 20 Apr 2016 13:35:28 +0200
Message-ID: <87bn54343j.fsf@kepler.schwinge.homeip.net>

Hi!

On Mon, 28 Sep 2015 15:38:57 -0400, Nathan Sidwell wrote:
> On 09/24/15 04:40, Jakub Jelinek wrote:
> > Iff GCC 5 compiled offloaded OpenACC/PTX code will always do host fallback
> > anyway because of the incompatible PTX version

I do agree that it's reasonable to require users to re-compile their
code when switching between major GCC releases, to retain the offloading
feature, or otherwise resort to host fallback execution. I'll propose
some text along these lines for the GCC 6 release notes.
> > why don't you just
> > do
> >   goacc_save_and_set_bind (acc_device_host);
> >   fn (hostaddrs);
> >   goacc_restore_bind ();
>
> Committed the attached. Thanks for the review.

What we have now doesn't work, for several reasons. GCC 5 OpenACC
offloading executables will just run into SIGSEGV. Here is a patch
(which depends on ).

Unfortunately, we have to jump through some hoops: because GCC 5
compiler-generated OpenACC reductions code emits calls to
acc_get_device_type, and because we'll (have to) always resort to host
fallback execution for GCC 5 executables, we also have to force these
acc_get_device_type calls to return acc_device_host; otherwise
reductions will give bogus results. (I hope I'm correctly
implementing/using the symbol versioning "magic".)

OK for gcc-6-branch and trunk?

Assuming we want this fixed on gcc-6-branch, should it be part of 6.1
(to avoid 6.1 users running into the SIGSEGV), or be delayed until 6.2?

We don't have an easy way to add test cases to make sure we don't break
such legacy interfaces, do we? (So, I just manually checked a few test
cases.)

commit c68c6b8e79176f5dc21684efe2517cbfb83a182e
Author: Thomas Schwinge
Date:   Wed Apr 20 13:08:57 2016 +0200

    libgomp: Make GCC 5 OpenACC offloading executables work

        * libgomp.h: Include "openacc.h".
        (goacc_get_device_type_201, goacc_get_device_type_20): New
        prototypes.
        (oacc_20_201_symver, goacc_get_device_type_201): New macros.
        * libgomp.map: Add acc_get_device_type with OACC_2.0.1 symbol
        version.
        * oacc-init.c (acc_get_device_type): Rename to
        goacc_get_device_type_201.
        (goacc_get_device_type_20): New function.
        * oacc-parallel.c (GOACC_parallel): Call goacc_lazy_initialize.
        * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Refuse version
        0 offload images.
        * target.c (gomp_load_image_to_device): Gracefully handle the case
        that a plugin refuses to load offload images.
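To illustrate what the two acc_get_device_type symbol versions are meant to do, here is a minimal stand-alone sketch. It deliberately does not use .symver (that needs a shared library built with a version script); the choice the dynamic linker would make is modelled by a lookup on the version string. The enum, current_device, and resolve_get_device_type are hypothetical names for this illustration only, not libgomp code.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-in for acc_device_t; the values are not libgomp's.  */
enum model_device { dev_none, dev_host, dev_nvidia };

/* Hypothetical: the device the runtime is currently bound to.  */
static enum model_device current_device = dev_nvidia;

/* Models acc_get_device_type@@OACC_2.0.1 (the default version that newly
   linked programs bind to): report the device actually in use.  */
static enum model_device
get_device_type_201 (void)
{
  return current_device;
}

/* Models acc_get_device_type@OACC_2.0 (what GCC 5 executables are bound
   to): always report the host, matching the forced host fallback.  */
static enum model_device
get_device_type_20 (void)
{
  return dev_host;
}

typedef enum model_device (*get_device_type_fn) (void);

/* Hypothetical stand-in for the dynamic linker's choice: pick the
   implementation matching the version recorded in the executable.  */
static get_device_type_fn
resolve_get_device_type (const char *version)
{
  assert (version != NULL);
  if (strcmp (version, "OACC_2.0") == 0)
    return get_device_type_20;
  return get_device_type_201;
}
```

Calling resolve_get_device_type ("OACC_2.0") () yields dev_host regardless of current_device, which is the property the GCC 5 reduction code relies on; any other version string gets the current device.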
---
 libgomp/libgomp.h             | 10 ++++++++++
 libgomp/libgomp.map           | 10 ++++++++++
 libgomp/oacc-init.c           | 18 +++++++++++++++++-
 libgomp/oacc-parallel.c       | 11 +++++++++++
 libgomp/plugin/plugin-nvptx.c | 10 +++++++++-
 libgomp/target.c              |  6 +++++-
 6 files changed, 62 insertions(+), 3 deletions(-)


Grüße
 Thomas


diff --git libgomp/libgomp.h libgomp/libgomp.h
index 6a05bbc..9fa1cb1 100644
--- libgomp/libgomp.h
+++ libgomp/libgomp.h
@@ -1011,6 +1011,8 @@ gomp_work_share_init_done (void)
 /* Now that we're back to default visibility, include the globals. */
 #include "libgomp_g.h"
 
+#include "openacc.h"
+
 /* Include omp.h by parts. */
 #include "omp-lock.h"
 #define _LIBGOMP_OMP_LOCK_DEFINED 1
@@ -1047,11 +1049,17 @@ extern void gomp_set_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW;
 extern void gomp_unset_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW;
 extern int gomp_test_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW;
 
+extern acc_device_t goacc_get_device_type_201 (void) __GOACC_NOTHROW;
+extern acc_device_t goacc_get_device_type_20 (void) __GOACC_NOTHROW;
+
 # define strong_alias(fn, al) \
   extern __typeof (fn) al __attribute__ ((alias (#fn)));
 # define omp_lock_symver(fn) \
   __asm (".symver g" #fn "_30, " #fn "@@OMP_3.0"); \
   __asm (".symver g" #fn "_25, " #fn "@OMP_1.0");
+# define oacc_20_201_symver(fn) \
+  __asm (".symver go" #fn "_201, " #fn "@@OACC_2.0.1"); \
+  __asm (".symver go" #fn "_20, " #fn "@OACC_2.0");
 #else
 # define gomp_init_lock_30 omp_init_lock
 # define gomp_destroy_lock_30 omp_destroy_lock
@@ -1063,6 +1071,8 @@ extern int gomp_test_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW;
 # define gomp_set_nest_lock_30 omp_set_nest_lock
 # define gomp_unset_nest_lock_30 omp_unset_nest_lock
 # define gomp_test_nest_lock_30 omp_test_nest_lock
+
+# define goacc_get_device_type_201 acc_get_device_type
 #endif
 
 #ifdef HAVE_ATTRIBUTE_VISIBILITY
diff --git libgomp/libgomp.map libgomp/libgomp.map
index 4d42c42..4803aab 100644
--- libgomp/libgomp.map
+++ libgomp/libgomp.map
@@ -304,7 +304,12 @@ OACC_2.0 {
         acc_get_num_devices_h_;
         acc_set_device_type;
         acc_set_device_type_h_;
+#ifdef HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT
+        # If the assembler used lacks the .symver directive or the linker
+        # doesn't support GNU symbol versioning, we have the same symbol in
+        # two versions, which Sun ld chokes on.
         acc_get_device_type;
+#endif
         acc_get_device_type_h_;
         acc_set_device_num;
         acc_set_device_num_h_;
@@ -378,6 +383,11 @@ OACC_2.0 {
         acc_set_cuda_stream;
 };
 
+OACC_2.0.1 {
+  global:
+        acc_get_device_type;
+} OACC_2.0;
+
 GOACC_2.0 {
   global:
         GOACC_data_end;
diff --git libgomp/oacc-init.c libgomp/oacc-init.c
index 42d005d..a7a2243 100644
--- libgomp/oacc-init.c
+++ libgomp/oacc-init.c
@@ -528,7 +528,7 @@ acc_set_device_type (acc_device_t d)
 ialias (acc_set_device_type)
 
 acc_device_t
-acc_get_device_type (void)
+goacc_get_device_type_201 (void)
 {
   acc_device_t res = acc_device_none;
   struct gomp_device_descr *dev;
@@ -552,8 +552,24 @@ acc_get_device_type (void)
   return res;
 }
 
+#ifdef LIBGOMP_GNU_SYMBOL_VERSIONING
+
+/* Legacy entry point (GCC 5). Only provide host fallback execution. */
+
+acc_device_t
+goacc_get_device_type_20 (void)
+{
+  return acc_device_host;
+}
+
+oacc_20_201_symver (acc_get_device_type)
+
+#else /* LIBGOMP_GNU_SYMBOL_VERSIONING */
+
 ialias (acc_get_device_type)
 
+#endif /* LIBGOMP_GNU_SYMBOL_VERSIONING */
+
 int
 acc_get_device_num (acc_device_t d)
 {
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index 9fe5020..321fd66 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -203,6 +203,17 @@ GOACC_parallel (int device, void (*fn) (void *),
                 int num_gangs, int num_workers, int vector_length,
                 int async, int num_waits, ...)
 {
+#ifdef HAVE_INTTYPES_H
+  gomp_debug (0, "%s: mapnum=%"PRIu64", hostaddrs=%p, sizes=%p, kinds=%p, "
+                 "async = %d\n",
+              __FUNCTION__, (uint64_t) mapnum, hostaddrs, sizes, kinds, async);
+#else
+  gomp_debug (0, "%s: mapnum=%lu, hostaddrs=%p, sizes=%p, kinds=%p, async=%d\n",
+              __FUNCTION__, (unsigned long) mapnum, hostaddrs, sizes, kinds,
+              async);
+#endif
+  goacc_lazy_initialize ();
+
   goacc_save_and_set_bind (acc_device_host);
   fn (hostaddrs);
   goacc_restore_bind ();
diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c
index fc5f298..56e6fae 100644
--- libgomp/plugin/plugin-nvptx.c
+++ libgomp/plugin/plugin-nvptx.c
@@ -1537,7 +1537,15 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
     GOMP_PLUGIN_fatal ("Offload data incompatible with PTX plugin"
                        " (expected %u, received %u)",
                        GOMP_VERSION_NVIDIA_PTX, GOMP_VERSION_DEV (version));
-
+  if (GOMP_VERSION_DEV (version) == 0)
+    {
+      /* We no longer support offload data generated by version 0 mkoffload;
+         it won't be used in the legacy GOMP_parallel entry point. */
+      GOMP_PLUGIN_debug (0, "Offload data not loaded (version %u)\n",
+                         GOMP_VERSION_DEV (version));
+      return -1;
+    }
+
   GOMP_OFFLOAD_init_device (ord);
 
   dev = ptx_devices[ord];
diff --git libgomp/target.c libgomp/target.c
index dd6f74d..2fbfa6e 100644
--- libgomp/target.c
+++ libgomp/target.c
@@ -1008,7 +1008,11 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
   num_target_entries
     = devicep->load_image_func (devicep->target_id, version,
                                 target_data, &target_table);
-
+  if (num_target_entries < 0)
+    {
+      /* The plugin refused this offload data. */
+      return;
+    }
   if (num_target_entries != num_funcs + num_vars)
     {
       gomp_mutex_unlock (&devicep->lock);
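
For readers following the plugin-nvptx.c and target.c hunks together, the intended control flow can be sketched roughly as below. This is a simplified model under stated assumptions, not the libgomp implementation: plugin_load_image, load_image_to_device, and the load_result enum are hypothetical names, and locking and the actual mapping tables are left out.

```c
#include <assert.h>

/* Hypothetical outcome codes for this sketch only.  */
typedef enum { LOAD_OK, LOAD_SKIPPED, LOAD_MISMATCH } load_result;

/* Simplified stand-in for a plugin's load_image hook.  Returns the number
   of table entries found in the image, or -1 to refuse the image.  */
static int
plugin_load_image (unsigned version, int entries_in_image)
{
  /* Version 0 images (the GCC 5 format in this model) are refused rather
     than treated as a fatal error.  */
  if (version == 0)
    return -1;
  return entries_in_image;
}

/* Simplified stand-in for the host-side caller.  */
static load_result
load_image_to_device (unsigned version, int entries_in_image,
                      int expected_entries)
{
  assert (expected_entries >= 0);

  int n = plugin_load_image (version, entries_in_image);
  if (n < 0)
    /* The plugin refused this offload data: skip it quietly, so the
       program can still run via host fallback.  */
    return LOAD_SKIPPED;
  if (n != expected_entries)
    /* The real function takes its error path at this point.  */
    return LOAD_MISMATCH;
  return LOAD_OK;
}
```

The point of the negative return value is to keep "refused" distinct from "loaded but inconsistent": only the latter should still be an error, while a refused version 0 image simply leaves nothing mapped on the device.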