From patchwork Fri May 19 17:18:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tobias Burnus X-Patchwork-Id: 1783895 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QND8R2rj8z20PV for ; Sat, 20 May 2023 03:18:47 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 31C083856DF4 for ; Fri, 19 May 2023 17:18:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id B02873858C41 for ; Fri, 19 May 2023 17:18:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B02873858C41 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="6.00,177,1681200000"; d="diff'?scan'208";a="5884256" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 19 May 2023 09:18:28 -0800 IronPort-SDR: +jFOXHbTafr21sw/mrpmSglK8KCg8rAd94m3ajvOU2oIHpy/G5sxqSTnYxNhXpaBgfD+XI+B51 Dv4O4XiwV9sgsxLDb18oLcuwH2hHF2PzTHaSC9HgN/D0/YV+c8akmYctbYELjtXWcVrfd6vRuP eOEPz07lG5Qxyh/Y8zBuoKT33+uIoOsSL+HHwe1AIFMz41WhW5dB9ONQTqusMMlSAS9MRRgJRi c8VECIJBkRXo6z9TrS6VrEn3RkBRhSR+1oNmj3KeRjkhNthp4lAVSHZmOe414M9pjU2y47YjqR PDw= Message-ID: <212f3744-5ad2-e2c4-5b07-62f1cf0804a6@codesourcery.com> Date: Fri, 19 May 2023 19:18:22 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0 Content-Language: en-US To: gcc-patches From: Tobias Burnus Subject: [Patch] libgomp: Honor OpenMP's nteams-var ICV as upper limit on num teams [PR109875] X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-11.mgc.mentorg.com (139.181.222.11) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" I intent to commit this patch early next week — any comments, questions, concerns? * * * I stumbled over this issue when looking at sollve_vv's pull requests for omp_set_num_teams and omp_get_max_teams testcase (#729 + #728). While the num_teams clause was honored everywhere, the nteams-var ICV did set an upper limit on the implementation-defined number of teams. That's fixed by the attached patch. Testing showed with my device setup 120 teams with GCN and 240 teams with nvptx – i.e. plenty values to choose from to reduce the #teams via nteams-var. Spec wording for OpenMP 5.1: See num_teams description in the 2nd and 3rd paragraph of "Description" at https://www.openmp.org/spec-html/5.1/openmpse15.html Tested on x86-64 without offloading (and working setenv support) and with gcn and nvptx offload (running libgomp w/o setenv support and manually also with setting the env vars). Tobias PS: The omp_get_max_teams routine is a bit odd; the return value is described both as being nteams-var (which can be 0, which is actually the default) and as returning the number (or upper bound?) of the number of teams used. → OpenMP spec issue #3619. ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 libgomp: Honor OpenMP's nteams-var ICV as upper limit on num teams [PR109875] The nteams-var ICV exists per device and can be set either via the routine omp_set_num_teams or as environment variable (OMP_NUM_TEAMS with optional _ALL/_DEV/_DEV_ suffix); it is default-initialized to zero. The number of teams created is described under the num_teams clause. If the clause is absent, the number of teams is implementation defined but at least one team must exist and, if nteams-var is positive, at most nteams-var teams may exist. The latter condition was not honored in a target region before this commit, such that too many teams were created. Also before this commit, the num_teams([lower:]upper) was properly honored and the nteams-var ICV was honored for the host, overriding the default of 3. For host fallback without clause, the default is one such that it was and is valid for any ICV value. PR libgomp/109875 libgomp/ChangeLog: * config/gcn/target.c (GOMP_teams4): Honor nteams-var ICV. * config/nvptx/target.c (GOMP_teams4): Likewise. * testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c: New test. * testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c: New test. * testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c: New test. * testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c: New test. libgomp/config/gcn/target.c | 4 +- libgomp/config/nvptx/target.c | 4 +- .../libgomp.c-c++-common/teams-nteams-icv-1.c | 201 +++++++++++++++++++++ .../libgomp.c-c++-common/teams-nteams-icv-2.c | 5 + .../libgomp.c-c++-common/teams-nteams-icv-3.c | 5 + .../libgomp.c-c++-common/teams-nteams-icv-4.c | 8 + 6 files changed, 225 insertions(+), 2 deletions(-) diff --git a/libgomp/config/gcn/target.c b/libgomp/config/gcn/target.c index c6691fde3c6..ea5eb1ff5ed 100644 --- a/libgomp/config/gcn/target.c +++ b/libgomp/config/gcn/target.c @@ -48,7 +48,9 @@ GOMP_teams4 (unsigned int num_teams_lower, unsigned int num_teams_upper, multiple times at least for some workgroups. */ (void) num_teams_lower; if (!num_teams_upper || num_teams_upper >= num_workgroups) - num_teams_upper = num_workgroups; + num_teams_upper = ((GOMP_ADDITIONAL_ICVS.nteams > 0 + && num_workgroups > GOMP_ADDITIONAL_ICVS.nteams) + ? GOMP_ADDITIONAL_ICVS.nteams : num_workgroups); else if (workgroup_id >= num_teams_upper) return false; gomp_num_teams_var = num_teams_upper - 1; diff --git a/libgomp/config/nvptx/target.c b/libgomp/config/nvptx/target.c index f102d7d02d9..125d92a2ea9 100644 --- a/libgomp/config/nvptx/target.c +++ b/libgomp/config/nvptx/target.c @@ -55,7 +55,9 @@ GOMP_teams4 (unsigned int num_teams_lower, unsigned int num_teams_upper, = thread_limit > INT_MAX ? UINT_MAX : thread_limit; } if (!num_teams_upper) - num_teams_upper = num_blocks; + num_teams_upper = ((GOMP_ADDITIONAL_ICVS.nteams > 0 + && num_blocks > GOMP_ADDITIONAL_ICVS.nteams) + ? GOMP_ADDITIONAL_ICVS.nteams : num_blocks); else if (num_blocks < num_teams_lower) num_teams_upper = num_teams_lower; else if (num_blocks < num_teams_upper) diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c new file mode 100644 index 00000000000..fb562a77ef8 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c @@ -0,0 +1,201 @@ +/* Check that the nteams ICV is honored. */ +/* PR libgomp/109875 */ + +/* This base version of testcases is supposed to be run with all + OMP_NUM_TEAMS* env vars being unset. + + The variants teams-nteams-icv-{2,3,4}.c test it by setting the + various OMP_NUM_TEAMS* env vars. However, as DejaGNU's remote testing + does not handle dg-set-target-env-var, the testcase has been written + such that it will still work in that case. + + Once fixed in DejaGNU, the getenv could be replaced by using #define'd + values. */ + +/* OpenMP currently has: + - nteams-var ICV is initialized to 0; one ICV per device + - OMP_NUM_TEAMS(_DEV(_)) overrides it + OMP_NUM_TEAMS_ALL overrides it + - Number of teams is: + -> the value specific by num_teams([lower:]upper) + with lower := upper if unspecified + -> Otherwise, if nteams-var ICV > 0, #teams <= nteams-var ICV + -> Otherwise, if nteams-var ICV <= 0, #teams > 1 + GCC uses 3 as default on the host and 1 for host fallback. + For offloading, it is device specific >> 1. */ + +#include +#include +#include + +int +main () +{ + int num_teams_env = -1, num_teams_env_dev = -1, *num_teams_env_devs = NULL; + const char *env_s; + + /* Get the OMP_NUM_TEAMS environment variables. */ + + env_s = getenv ("OMP_NUM_TEAMS_ALL"); + if (env_s) + { + num_teams_env = num_teams_env_dev = atoi (env_s); + printf ("DEBUG: OMP_NUM_TEAMS_ALL='%s' -> nteams-var = %d\n", + env_s, num_teams_env); + } + env_s = getenv ("OMP_NUM_TEAMS"); + if (env_s) + { + num_teams_env = atoi (env_s); + printf ("DEBUG: OMP_NUM_TEAMS='%s' -> nteams-var = %d\n", + env_s, num_teams_env); + } + env_s = getenv ("OMP_NUM_TEAMS_DEV"); + if (env_s) + { + num_teams_env_dev = atoi (env_s); + printf ("DEBUG: OMP_NUM_TEAMS_DEV='%s' -> nteams-var = %d\n", + env_s, num_teams_env_dev); + } + if (omp_get_num_devices () > 0) + { + num_teams_env_devs = (int*) malloc (sizeof (int) * omp_get_num_devices ()); + for (int i = 0; i < omp_get_num_devices (); i++) + { + char tmp[18+5+1]; + snprintf (tmp, sizeof (tmp), "OMP_NUM_TEAMS_DEV_%d", i); + env_s = getenv (tmp); + if (env_s) + { + num_teams_env_devs[i] = atoi (env_s); + printf ("DEBUG: %s='%s' -> nteams-var = %d\n", + tmp, env_s, num_teams_env_devs[i]); + } + else if (num_teams_env_dev > 0) + num_teams_env_devs[i] = num_teams_env_dev; + else + num_teams_env_devs[i] = -1; + } + } + + /* Check that the number of teams (initial device and in target) is + >= 1 and, if omp_get_max_teams() > 0, it does not + exceed omp_get_max_teams (). */ + + int nteams, num_teams; + + /* Assume that omp_get_max_teams (); returns the ICV, i.e. 0 as default init + and not the number of teams that would be run; hence: '>='. */ + nteams = omp_get_max_teams (); + if (nteams < 0 || (num_teams_env >= 0 && nteams != num_teams_env)) + abort (); + num_teams = -1; + + #pragma omp teams + if (omp_get_team_num () == 0) + num_teams = omp_get_num_teams (); + if (num_teams < 1 || (nteams > 0 && num_teams > nteams)) + abort (); + + /* GCC hard codes 3 teams - check for it. */ + if (nteams <= 0 && num_teams != 3) + abort (); + + /* For each device, including host fallback. */ + for (int dev = 0; dev <= omp_get_num_devices (); dev++) + { + int num_teams_icv = ((dev == omp_get_num_devices ()) + ? num_teams_env : num_teams_env_devs[dev]); + nteams = -1; + #pragma omp target device(dev) map(from: nteams) + nteams = omp_get_max_teams (); + if (nteams < 0 || (num_teams_icv >= 0 && nteams != num_teams_icv)) + abort (); + + num_teams = -1; + #pragma omp target teams device(dev) map(from: num_teams) + if (omp_get_team_num () == 0) + num_teams = omp_get_num_teams (); + + if (num_teams < 1 || (nteams > 0 && num_teams > nteams)) + abort (); + + /* GCC hard codes 1 team for host fallback - check for it. */ + if (dev == omp_get_num_devices () && num_teams != 1) + abort (); + } + + /* Now set the nteams-var ICV and check that omp_get_max_teams() + returns the set value and that the following holds: + num_teams >= 1 and num_teams <= nteams-var ICV. + + Additionally, implementation defined, assume: + - num_teams == (not '<=') nteams-var ICV, except: + - num_teams == 1 for host fallback. */ + + omp_set_num_teams (5); + + nteams = omp_get_max_teams (); + if (nteams != 5) + abort (); + num_teams = -1; + + #pragma omp teams + if (omp_get_team_num () == 0) + num_teams = omp_get_num_teams (); + if (num_teams != 5) + abort (); + + /* For each device, including host fallback. */ + for (int dev = 0; dev <= omp_get_num_devices (); dev++) + { + #pragma omp target device(dev) firstprivate(dev) + omp_set_num_teams (7 + dev); + + #pragma omp target device(dev) map(from: nteams) + nteams = omp_get_max_teams (); + if (nteams != 7 + dev) + abort (); + + num_teams = -1; + #pragma omp target teams device(dev) map(from: num_teams) + if (omp_get_team_num () == 0) + num_teams = omp_get_num_teams (); + + if (dev == omp_get_num_devices ()) + { + if (num_teams != 1) + abort (); + } + else + { + if (num_teams != 7 + dev) + abort (); + } + } + + /* Now use the num_teams clause explicitly. */ + + num_teams = -1; + #pragma omp teams num_teams(6) + if (omp_get_team_num () == 0) + num_teams = omp_get_num_teams (); + if (num_teams != 6) + abort (); + + /* For each device, including host fallback. */ + for (int dev = 0; dev <= omp_get_num_devices (); dev++) + { + num_teams = -1; + #pragma omp target teams device(dev) map(from: num_teams) num_teams(dev+3) + if (omp_get_team_num () == 0) + num_teams = omp_get_num_teams (); + + /* This must match the set value, also with host fallback. */ + if (num_teams != 3 + dev) + abort (); + } + + free (num_teams_env_devs); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c new file mode 100644 index 00000000000..f3a88234ec3 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c @@ -0,0 +1,5 @@ +/* PR libgomp/109875 */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL 9 } */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV 7 } */ + +#include "teams-nteams-icv-1.c" diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c new file mode 100644 index 00000000000..dffd28aa5b5 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c @@ -0,0 +1,5 @@ +/* PR libgomp/109875 */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL 7 } */ +/* { dg-set-target-env-var OMP_NUM_TEAMS 8 } */ + +#include "teams-nteams-icv-1.c" diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c new file mode 100644 index 00000000000..7fa12a8b9fe --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c @@ -0,0 +1,8 @@ +/* PR libgomp/109875 */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL 7 } */ +/* { dg-set-target-env-var OMP_NUM_TEAMS 4 } */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV 8 } */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV_0 5 } */ +/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV_1 11 } */ + +#include "teams-nteams-icv-1.c"