From patchwork Mon Apr 3 14:56:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 1764490 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PqvBx74Qnz1yYb for ; Tue, 4 Apr 2023 00:57:44 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E1E0E38582A4 for ; Mon, 3 Apr 2023 14:57:42 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa4.mentor.iphmx.com (esa4.mentor.iphmx.com [68.232.137.252]) by sourceware.org (Postfix) with ESMTPS id ED0083858D39 for ; Mon, 3 Apr 2023 14:57:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ED0083858D39 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,314,1673942400"; d="scan'208,223";a="1259934" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa4.mentor.iphmx.com with ESMTP; 03 Apr 2023 06:57:17 -0800 IronPort-SDR: 4gUOxudkRU4+oTh6N2cW6iwOkLC4Ev4m1dqcjMCMwbJGRpNyp4txGflhvrnAN+D3LvADL1uPPi ekV9AJk56Fp4ylwnEmZR2uKQk8xR6ayHfN194dLNFVW+ML06Q7mH4sYP/1f2Hghcq+PTuErrbQ 94GP7Eqp3xqCvnRvboJ/N/ayMEA5KzJsMngkRL52hg8M+M+cGcdO1BPR1cL1WEmBtRq1ehgK6G wOObs4sG58ODsLp1lxfGlHLeF5qPkjRKjJLQFFczd5j5cMBBXczUo2icxparh11Rm+2JS8u9JJ OdM= From: Thomas Schwinge To: Andrew Stubbs , CC: Hafiz Abid Qadeer , Jakub Jelinek , Tobias Burnus Subject: [og12] '-foffload-memory=pinned' using offloading device interfaces (was: -foffload-memory=pinned) In-Reply-To: <275a67a6-52ae-7f31-6f28-9018140ee12a@codesourcery.com> References: <20220308113059.688551-1-abidh@codesourcery.com> <20220308113059.688551-2-abidh@codesourcery.com> <87sff9zl3u.fsf@euler.schwinge.homeip.net> <275a67a6-52ae-7f31-6f28-9018140ee12a@codesourcery.com> User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/28.2 (x86_64-pc-linux-gnu) Date: Mon, 3 Apr 2023 16:56:59 +0200 Message-ID: <878rf9xbd0.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_LOTSOFHASH, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Hi! On 2023-02-13T15:20:07+0000, Andrew Stubbs wrote: > On 13/02/2023 14:38, Thomas Schwinge wrote: >> On 2022-03-08T11:30:55+0000, Hafiz Abid Qadeer wrote: >>> From: Andrew Stubbs >>> >>> Add a new option. It will be used in follow-up patches. >> >>> --- a/gcc/doc/invoke.texi >>> +++ b/gcc/doc/invoke.texi >> >>> +@option{-foffload-memory=pinned} forces all host memory to be pinned (this >>> +mode may require the user to increase the ulimit setting for locked memory). >> >> So, this is currently implemented via 'mlockall', which, as discussed, >> (a) has issues ('ulimit -l'), and (b) doesn't actually achieve what it >> meant to achieve (because it doesn't register the page-locked memory with >> the GPU driver). >> [...] >> As '-foffload-memory=pinned', per the name >> of the option, concerns itself with memory used in offloading but not >> host execution generally, why are we actually attempting to "[force] all >> host memory to be pinned" -- why not just the memory that's being used >> with offloading? That is, if '-foffload-memory=pinned' is set, register >> as page-locked with the GPU driver all memory that appears in OMP >> offloading data regions, such as OpenMP 'target' 'map' clauses etc. That >> way, this is directed at the offloading data transfers, as itended, but >> at the same time we don't "waste" page-locked memory for generic host >> memory allocations. What do you think -- you, who've spent a lot more >> time on this topic than I have, so it's likely possible that I fail to >> realize some "details"? > > The main reason it is the way it is is because in general it's not > possible to know what memory is going to be offloaded at the time it is > allocated (and stack/static memory is never allocated that way). > > If there's a way to pin it after the fact then maybe that's not a > terrible idea? [...] I've now pushed to devel/omp/gcc-12 branch my take on this in commit 43095690ea519205bf56fc148b346edaa43e0f0f "'-foffload-memory=pinned' using offloading device interfaces", and for changes related to og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches": commit 694bbd399c1323975b4a6735646e46c6914de63d "'-foffload-memory=pinned' using offloading device interfaces for non-contiguous array support", see attached. Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 From 694bbd399c1323975b4a6735646e46c6914de63d Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Thu, 30 Mar 2023 10:08:12 +0200 Subject: [PATCH 2/2] '-foffload-memory=pinned' using offloading device interfaces for non-contiguous array support Changes related to og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". libgomp/ * target.c (gomp_map_vars_internal) : Handle 'always_pinned_mode'. --- libgomp/ChangeLog.omp | 3 +++ libgomp/target.c | 55 +++++++++++++++++++++++++++++++++++++++---- 2 files changed, 53 insertions(+), 5 deletions(-) diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp index 1b02c057562..09cf9c6f3c1 100644 --- a/libgomp/ChangeLog.omp +++ b/libgomp/ChangeLog.omp @@ -1,5 +1,8 @@ 2023-04-03 Thomas Schwinge + * target.c (gomp_map_vars_internal) + : Handle 'always_pinned_mode'. + * libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_free): Add 'struct goacc_asyncqueue *' formal parameter. (GOMP_OFFLOAD_page_locked_host_register) diff --git a/libgomp/target.c b/libgomp/target.c index ed2fc09cf44..38eb5d1aa5b 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -2087,6 +2087,23 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, k->dynamic_refcount = 0; k->aux = NULL; k->tgt_offset = tgt_size; + k->page_locked_host_p = false; + if (always_pinned_mode) + { + void *ptr = (void *) k->host_start; + size_t size = k->host_end - k->host_start; + int page_locked_host_p = 0; + if (size != 0) + page_locked_host_p = gomp_page_locked_host_register_dev + (devicep, ptr, size, kind & typemask); + if (page_locked_host_p < 0) + { + gomp_mutex_unlock (&devicep->lock); + exit (EXIT_FAILURE); + } + if (page_locked_host_p) + k->page_locked_host_p = true; + } tgt_size += nca->data_row_size; @@ -2120,16 +2137,44 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, accelerator side ptrblock and copy it in. */ if (nca->ptrblock_size) { - void *ptrblock = gomp_malloc (nca->ptrblock_size); + void *ptrblock; + if (always_pinned_mode) + { + ptrblock + = gomp_page_locked_host_alloc_dev (devicep, + nca->ptrblock_size, + false); + if (!ptrblock) + { + gomp_mutex_unlock (&devicep->lock); + exit (EXIT_FAILURE); + } + } + else + ptrblock = gomp_malloc (nca->ptrblock_size); goacc_noncontig_array_create_ptrblock (nca, ptrblock, target_ptrblock); gomp_copy_host2dev (devicep, aq, target_ptrblock, ptrblock, nca->ptrblock_size, false, cbufp); - if (aq) - /* Free once the transfer has completed. */ - devicep->openacc.async.queue_callback_func (aq, free, ptrblock); + if (always_pinned_mode) + { + if (!gomp_page_locked_host_free_dev (devicep, + ptrblock, + aq)) + { + gomp_mutex_unlock (&devicep->lock); + exit (EXIT_FAILURE); + } + } else - free (ptrblock); + { + if (aq) + /* Free once the transfer has completed. */ + devicep->openacc.async.queue_callback_func + (aq, free, ptrblock); + else + free (ptrblock); + } } } } -- 2.25.1