From patchwork Fri Oct 30 00:22:46 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nathan Sidwell X-Patchwork-Id: 538086 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4B97F140787 for ; Fri, 30 Oct 2015 11:23:01 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=siCypYzN; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=y4viTTuoy2jgcf9R4xXhzIJ9C/be7pBZ7zlLGnODFuJSParJ0C cl/41BxEdu4bQelV/3CXPP/INkIOSOM2F+w+S1vFgdNwQ1upzilS/Y0FAGgvVpAx AUKUWU6c8BYtMT7aHgNIhMSnXOvroBlo67nMcYja6ZjB4Ec8yqUjDvzks= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; s= default; bh=Tl5EvVEpFL7htG0vboK4ezHmyG8=; b=siCypYzNteyVaE9Ta/zC ExupZQCW0EdAZjrhmJ/8PAJwOTVeI85W5rPX5TzHZVSv9/d9s56uxwnyTP7K5Mb+ yCWhT6UnbmO3LYg4oHTJTeNUFluiunGBt0tm+GDI9v7P1ODTjibpEKJVWHil4DIt 78CnC9x2N+sdbWzsYzg88Zk= Received: (qmail 85218 invoked by alias); 30 Oct 2015 00:22:53 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 85208 invoked by uid 89); 30 Oct 2015 00:22:52 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.9 required=5.0 tests=BAYES_50, FREEMAIL_FROM, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-pa0-f45.google.com Received: from mail-pa0-f45.google.com (HELO mail-pa0-f45.google.com) (209.85.220.45) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 30 Oct 2015 00:22:50 +0000 Received: by pasz6 with SMTP id z6so55311967pas.2 for ; Thu, 29 Oct 2015 17:22:48 -0700 (PDT) X-Received: by 10.68.65.37 with SMTP id u5mr5133694pbs.76.1446164568252; Thu, 29 Oct 2015 17:22:48 -0700 (PDT) Received: from [172.18.5.97] ([208.185.168.98]) by smtp.googlemail.com with ESMTPSA id ha1sm4455173pbc.54.2015.10.29.17.22.47 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 29 Oct 2015 17:22:47 -0700 (PDT) To: GCC Patches From: Nathan Sidwell Subject: [openacc] on_device fix Message-ID: <5632B856.4050509@acm.org> Date: Thu, 29 Oct 2015 17:22:46 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 acc_on_device and it's builtin had a conflict. The function formally takes an enum argument, but the builtin takes an int -- primarily to avoid the compiler having to generate the enum type internally. This works fine for C, where the external declaration of the function (in openacc.h) matches up with the builtin, and we optimize the builtin as expected. It fails for C++ where the builtin doesn't match the declaration in the header. We end up with emitting a call to acc_on_device, which is resolved by libgomp. Unfortunately that means we fail to optimize. We could resolve this in a number of ways 1) make the header file's declaration have an int argument. 2) make the header file have an inline definition fowarding to a differently named function with an int argument, that matched a renamed builtin 3) Do what this patch does. Option 1 would be visible in the type system, if someone took the address of the function (I'm not sure why they'd do that). We used this variant on the gomp4 branch for a long time. Option 2 requires changing the symbols exported from libgomp. Instead of exporting acc_on_device we'd need to export __acc_on_device or something. And we'd need to provide a backwards compatible entry point named acc_on_device anyway. Option 3 leaves things unchanged for C -- declare a function with an enum arg. But for C++ we the extern "C" declaration takes an int -- and therefore matches the builtin. We insert an inline wrapper that takes an enum argument. Because of C++'s overload resolution both the wrapper and the int-taking declaration can have the same source name. We require the enum to be layout compatible with int -- this was an artifact of the earlier implementation anyway. I added enumeration values to acc_device_t to enforce that, just in case someone tries to compile their openacc code with -fshort-enums. Committed to trunk. nathan 2015-10-29 Nathan Sidwell gcc/ * openacc.h (enum acc_device_t): Reformat. Ensure layout compatibility. (enum acc_async_t): Reformat. (acc_on_device): Declare compatible with builtin and provide C++ wrapper. * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: New. gcc/testsuite/ * c-c++-common/goacc/acc_on_device-2-off.c: Delete. * c-c++-common/goacc/acc_on_device-2.c: Delete. Index: libgomp/openacc.h =================================================================== --- libgomp/openacc.h (revision 229535) +++ libgomp/openacc.h (working copy) @@ -47,24 +47,25 @@ extern "C" { #endif /* Types */ -typedef enum acc_device_t - { - /* Keep in sync with include/gomp-constants.h. */ - acc_device_none = 0, - acc_device_default = 1, - acc_device_host = 2, - /* acc_device_host_nonshm = 3 removed. */ - acc_device_not_host = 4, - acc_device_nvidia = 5, - _ACC_device_hwm - } acc_device_t; - -typedef enum acc_async_t - { - /* Keep in sync with include/gomp-constants.h. */ - acc_async_noval = -1, - acc_async_sync = -2 - } acc_async_t; +typedef enum acc_device_t { + /* Keep in sync with include/gomp-constants.h. */ + acc_device_none = 0, + acc_device_default = 1, + acc_device_host = 2, + /* acc_device_host_nonshm = 3 removed. */ + acc_device_not_host = 4, + acc_device_nvidia = 5, + _ACC_device_hwm, + /* Ensure enumeration is layout compatible with int. */ + _ACC_highest = __INT_MAX__, + _ACC_neg = -1 +} acc_device_t; + +typedef enum acc_async_t { + /* Keep in sync with include/gomp-constants.h. */ + acc_async_noval = -1, + acc_async_sync = -2 +} acc_async_t; int acc_get_num_devices (acc_device_t) __GOACC_NOTHROW; void acc_set_device_type (acc_device_t) __GOACC_NOTHROW; @@ -79,7 +80,11 @@ void acc_wait_all (void) __GOACC_NOTHROW void acc_wait_all_async (int) __GOACC_NOTHROW; void acc_init (acc_device_t) __GOACC_NOTHROW; void acc_shutdown (acc_device_t) __GOACC_NOTHROW; -int acc_on_device (acc_device_t) __GOACC_NOTHROW; +#ifdef __cplusplus +int acc_on_device (int __arg) __GOACC_NOTHROW; +#else +int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW; +#endif void *acc_malloc (size_t) __GOACC_NOTHROW; void acc_free (void *) __GOACC_NOTHROW; /* Some of these would be more correct with const qualifiers, but @@ -113,6 +118,13 @@ int acc_set_cuda_stream (int, void *) __ #ifdef __cplusplus } + +/* Forwarding function with correctly typed arg. */ + +inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW +{ + return acc_on_device ((int) __arg); +} #endif #endif /* _OPENACC_H */ Index: libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O2" } */ + +#include + +int Foo (acc_device_t x) +{ + return acc_on_device (x); +} + +/* { dg-final { scan-assembler-not "acc_on_device" } } */ Index: gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c =================================================================== --- gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c (revision 229535) +++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c (working copy) @@ -1,24 +0,0 @@ -/* Have to enable optimizations, as otherwise builtins won't be expanded. */ -/* { dg-additional-options "-O -fdump-rtl-expand -fno-openacc" } */ - -#if __cplusplus -extern "C" { -#endif - -typedef enum acc_device_t { acc_device_X = 123 } acc_device_t; -extern int acc_on_device (acc_device_t); - -#if __cplusplus -} -#endif - -int -f (void) -{ - const acc_device_t dev = acc_device_X; - return acc_on_device (dev); -} - -/* Without -fopenacc, we're expecting one call. - { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 1 "expand" } } */ - Index: gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c =================================================================== --- gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c (revision 229535) +++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c (working copy) @@ -1,28 +0,0 @@ -/* Have to enable optimizations, as otherwise builtins won't be expanded. */ -/* { dg-additional-options "-O -fdump-rtl-expand" } */ - -#if __cplusplus -extern "C" { -#endif - -typedef enum acc_device_t { acc_device_X = 123 } acc_device_t; -extern int acc_on_device (acc_device_t); - -#if __cplusplus -} -#endif - -int -f (void) -{ - const acc_device_t dev = acc_device_X; - return acc_on_device (dev); -} - -/* With -fopenacc, we're expecting the builtin to be expanded, so no calls. - TODO: in C++, even under extern "C", the use of enum for acc_device_t - perturbs expansion as a builtin, which expects an int parameter. It's fine - when changing acc_device_t to plain int, but that's not what we're doing in - . - - { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 "expand" { xfail c++ } } } */