From patchwork Tue Dec 1 15:28:26 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexander Monakov
X-Patchwork-Id: 550931
From: Alexander Monakov
To: gcc-patches@gcc.gnu.org
Cc: Jakub Jelinek, Bernd Schmidt, Dmitry Melnik
Subject: [gomp-nvptx 8/9] libgomp: update gomp_nvptx_main for -mgomp
Date: Tue, 1 Dec 2015 18:28:26 +0300
Message-Id: <1448983707-18854-9-git-send-email-amonakov@ispras.ru>
In-Reply-To: <1448983707-18854-1-git-send-email-amonakov@ispras.ru>
References: <1448983707-18854-1-git-send-email-amonakov@ispras.ru>

Here's how I've updated gomp_nvptx_main to set up the shared memory arrays
__nvptx_stacks and __nvptx_uni for -mgomp.  Since this makes sense only for
the -mgomp multilib, I've wrapped the whole file in an #if that checks the
corresponding built-in macros.

Reaching those shared memory arrays is awkward.  I cannot declare them with
toplevel asms because the compiler implicitly declares them too, and ptxas
does not handle duplicate declarations.
Ideally I'd like to be able to say:

    extern char *__shared __nvptx_stacks[32];

Bernd, is your position on exposing shared memory as a first-class address
space on NVPTX subject to change?  Do you remember what middle-end issues
you encountered when trying that?

	* config/nvptx/team.c (gomp_nvptx_main): Rename to...
	(gomp_nvptx_main_1): ... this and mark noinline.
	(gomp_nvptx_main): Wrap the above, set up __nvptx_uni and
	__nvptx_stacks.
---
 libgomp/config/nvptx/team.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/libgomp/config/nvptx/team.c b/libgomp/config/nvptx/team.c
index 88d1d34..deb0860 100644
--- a/libgomp/config/nvptx/team.c
+++ b/libgomp/config/nvptx/team.c
@@ -24,6 +24,8 @@
 
 /* This file handles the maintainence of threads on NVPTX.  */
 
+#if defined __nvptx_softstack && defined __nvptx_unisimt__
+
 #include "libgomp.h"
 #include
 
@@ -31,15 +33,9 @@ struct gomp_thread *nvptx_thrs;
 
 static void gomp_thread_start (struct gomp_thread_pool *);
 
-void
-gomp_nvptx_main (void (*fn) (void *), void *fn_data)
+static void __attribute__((noinline))
+gomp_nvptx_main_1 (void (*fn) (void *), void *fn_data, int ntids, int tid)
 {
-  int ntids, tid, laneid;
-  asm ("mov.u32 %0, %%laneid;" : "=r" (laneid));
-  if (laneid)
-    return;
-  asm ("mov.u32 %0, %%tid.y;" : "=r" (tid));
-  asm ("mov.u32 %0, %%ntid.y;" : "=r"(ntids));
   if (tid == 0)
     {
       gomp_global_icv.nthreads_var = ntids;
@@ -72,6 +68,30 @@ gomp_nvptx_main (void (*fn) (void *), void *fn_data)
     }
 }
 
+void
+gomp_nvptx_main (void (*fn) (void *), void *fn_data)
+{
+  int tid, ntids;
+  asm ("mov.u32 %0, %%tid.y;" : "=r" (tid));
+  asm ("mov.u32 %0, %%ntid.y;" : "=r"(ntids));
+  char *stacks = 0;
+  int *__nvptx_uni;
+  asm ("cvta.shared.u64 %0, __nvptx_uni;" : "=r" (__nvptx_uni));
+  __nvptx_uni[tid] = 0;
+  if (tid == 0)
+    {
+      size_t stacksize = 131072;
+      stacks = gomp_malloc (stacksize * ntids);
+      char **__nvptx_stacks = 0;
+      asm ("cvta.shared.u64 %0, __nvptx_stacks;" : "=r" (__nvptx_stacks));
+      for (int i = 0; i < ntids; i++)
+	__nvptx_stacks[i] = stacks + stacksize * (i + 1);
+    }
+  asm ("bar.sync 0;");
+  gomp_nvptx_main_1 (fn, fn_data, ntids, tid);
+  free (stacks);
+}
+
 /* This function is a pthread_create entry point.  This contains the idle
    loop in which a thread waits to be called up to become part of a team.  */
 
@@ -160,3 +180,4 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
 }
 
 #include "../../team.c"
+#endif
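
A note on the stack setup in the new gomp_nvptx_main: the loop stores
stacks + stacksize * (i + 1) for slot i, that is, the upper end of each
stacksize-byte region, presumably because the soft stack grows downward.
The host-side sketch below only illustrates that pointer arithmetic in plain
C.  It is not part of the patch: carve_stacks is a hypothetical helper, and
it uses malloc and an ordinary array in place of gomp_malloc and the
shared-memory __nvptx_stacks.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libgomp: carve one allocation of
   stacksize * ntids bytes into ntids equal regions and record the
   upper end of each region in tops[].  Returns the allocation base so
   the caller can free it, or NULL if the allocation failed.  */
static char *
carve_stacks (char **tops, int ntids, size_t stacksize)
{
  char *base = malloc (stacksize * (size_t) ntids);
  if (base == NULL)
    return NULL;
  for (int i = 0; i < ntids; i++)
    /* Region i spans [base + stacksize * i, base + stacksize * (i + 1)).
       The stored pointer is its upper bound, assuming downward growth.  */
    tops[i] = base + stacksize * (size_t) (i + 1);
  return base;
}
```

With ntids = 4 and stacksize = 131072, tops[0] sits 131072 bytes above the
base and tops[3] is one past the end of the allocation, so consecutive
regions do not overlap as long as no stack uses more than stacksize bytes.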