diff mbox series

libgomp: Honor OpenMP's nteams-var ICV as upper limit on num teams [PR109875]

Message ID 212f3744-5ad2-e2c4-5b07-62f1cf0804a6@codesourcery.com
State New
Headers show
Series libgomp: Honor OpenMP's nteams-var ICV as upper limit on num teams [PR109875] | expand

Commit Message

Tobias Burnus May 19, 2023, 5:18 p.m. UTC
I intent to commit this patch early next week — any comments, questions,
concerns?

* * *

I stumbled over this issue when looking at sollve_vv's pull requests
for  omp_set_num_teams and omp_get_max_teams testcase (#729 + #728).

While the num_teams clause was honored everywhere, the nteams-var ICV
did set an upper limit on the implementation-defined number of teams.

That's fixed by the attached patch. Testing showed with my device setup
120 teams with GCN and 240 teams with nvptx – i.e. plenty values to
choose from to reduce the #teams via nteams-var.


Spec wording for OpenMP 5.1: See num_teams description in the 2nd and
3rd paragraph of "Description" at
https://www.openmp.org/spec-html/5.1/openmpse15.html


Tested on x86-64 without offloading (and working setenv support) and
with gcn and nvptx offload (running libgomp w/o setenv support and
manually also with setting the env vars).

Tobias

PS: The omp_get_max_teams routine is a bit odd; the return value is
described both as being nteams-var (which can be 0, which is actually
the default) and as returning the number (or upper bound?) of the number
of teams used. → OpenMP spec issue #3619.
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Comments

Tobias Burnus May 19, 2023, 5:56 p.m. UTC | #1
I managed to attach an outdated patch.

Namely: After I tested it, I realized that GCC's testsuite setup already
marks testcases as UNSUPPORTED that use all dg-set-target-env-var – if
remote testing is done → see gcc/testsuite/lib/gcc-dg.exp.

Thus, instead of checking getenv, I can directly use #define to state
which environment variable is set, making the testcase more reliable and
avoiding some function calls. Otherwise, unchanged.

Tobias

On 19.05.23 19:18, Tobias Burnus wrote:
> I intent to commit this patch early next week — any comments, questions,
> concerns?
>
> * * *
>
> I stumbled over this issue when looking at sollve_vv's pull requests
> for  omp_set_num_teams and omp_get_max_teams testcase (#729 + #728).
>
> While the num_teams clause was honored everywhere, the nteams-var ICV
> did set an upper limit on the implementation-defined number of teams.
>
> That's fixed by the attached patch. Testing showed with my device setup
> 120 teams with GCN and 240 teams with nvptx – i.e. plenty values to
> choose from to reduce the #teams via nteams-var.
>
>
> Spec wording for OpenMP 5.1: See num_teams description in the 2nd and
> 3rd paragraph of "Description" at
> https://www.openmp.org/spec-html/5.1/openmpse15.html
>
>
> Tested on x86-64 without offloading (and working setenv support) and
> with gcn and nvptx offload (running libgomp w/o setenv support and
> manually also with setting the env vars).
>
> Tobias
>
> PS: The omp_get_max_teams routine is a bit odd; the return value is
> described both as being nteams-var (which can be 0, which is actually
> the default) and as returning the number (or upper bound?) of the number
> of teams used. → OpenMP spec issue #3619.
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße
> 201, 80634 München; Gesellschaft mit beschränkter Haftung;
> Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft:
> München; Registergericht München, HRB 106955
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
diff mbox series

Patch

libgomp: Honor OpenMP's nteams-var ICV as upper limit on num teams [PR109875]

The nteams-var ICV exists per device and can be set either via the routine
omp_set_num_teams or as environment variable (OMP_NUM_TEAMS with optional
_ALL/_DEV/_DEV_<num> suffix); it is default-initialized to zero. The number
of teams created is described under the num_teams clause. If the clause is
absent, the number of teams is implementation defined but at least
one team must exist and, if nteams-var is positive, at most nteams-var
teams may exist.

The latter condition was not honored in a target region before this
commit, such that too many teams were created.

Also before this commit, the num_teams([lower:]upper) was properly
honored and the nteams-var ICV was honored for the host, overriding
the default of 3. For host fallback without clause, the default is one
such that it was and is valid for any ICV value.

	PR libgomp/109875

libgomp/ChangeLog:

	* config/gcn/target.c (GOMP_teams4): Honor nteams-var ICV.
	* config/nvptx/target.c (GOMP_teams4): Likewise.
	* testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c: New test.
	* testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c: New test.
	* testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c: New test.
	* testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c: New test.

 libgomp/config/gcn/target.c                        |   4 +-
 libgomp/config/nvptx/target.c                      |   4 +-
 .../libgomp.c-c++-common/teams-nteams-icv-1.c      | 201 +++++++++++++++++++++
 .../libgomp.c-c++-common/teams-nteams-icv-2.c      |   5 +
 .../libgomp.c-c++-common/teams-nteams-icv-3.c      |   5 +
 .../libgomp.c-c++-common/teams-nteams-icv-4.c      |   8 +
 6 files changed, 225 insertions(+), 2 deletions(-)

diff --git a/libgomp/config/gcn/target.c b/libgomp/config/gcn/target.c
index c6691fde3c6..ea5eb1ff5ed 100644
--- a/libgomp/config/gcn/target.c
+++ b/libgomp/config/gcn/target.c
@@ -48,7 +48,9 @@  GOMP_teams4 (unsigned int num_teams_lower, unsigned int num_teams_upper,
      multiple times at least for some workgroups.  */
   (void) num_teams_lower;
   if (!num_teams_upper || num_teams_upper >= num_workgroups)
-    num_teams_upper = num_workgroups;
+    num_teams_upper = ((GOMP_ADDITIONAL_ICVS.nteams > 0
+			&& num_workgroups > GOMP_ADDITIONAL_ICVS.nteams)
+		       ? GOMP_ADDITIONAL_ICVS.nteams : num_workgroups);
   else if (workgroup_id >= num_teams_upper)
     return false;
   gomp_num_teams_var = num_teams_upper - 1;
diff --git a/libgomp/config/nvptx/target.c b/libgomp/config/nvptx/target.c
index f102d7d02d9..125d92a2ea9 100644
--- a/libgomp/config/nvptx/target.c
+++ b/libgomp/config/nvptx/target.c
@@ -55,7 +55,9 @@  GOMP_teams4 (unsigned int num_teams_lower, unsigned int num_teams_upper,
 	= thread_limit > INT_MAX ? UINT_MAX : thread_limit;
     }
   if (!num_teams_upper)
-    num_teams_upper = num_blocks;
+    num_teams_upper = ((GOMP_ADDITIONAL_ICVS.nteams > 0
+			&& num_blocks > GOMP_ADDITIONAL_ICVS.nteams)
+		       ? GOMP_ADDITIONAL_ICVS.nteams : num_blocks);
   else if (num_blocks < num_teams_lower)
     num_teams_upper = num_teams_lower;
   else if (num_blocks < num_teams_upper)
diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c
new file mode 100644
index 00000000000..fb562a77ef8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c
@@ -0,0 +1,201 @@ 
+/* Check that the nteams ICV is honored. */
+/* PR libgomp/109875  */
+
+/*  This base version of testcases is supposed to be run with all
+    OMP_NUM_TEAMS* env vars being unset.
+
+    The variants teams-nteams-icv-{2,3,4}.c test it by setting the
+    various OMP_NUM_TEAMS* env vars. However, as DejaGNU's remote testing
+    does not handle dg-set-target-env-var, the testcase has been written
+    such that it will still work in that case.
+
+    Once fixed in DejaGNU, the getenv could be replaced by using #define'd
+    values.  */
+
+/* OpenMP currently has:
+   - nteams-var ICV is initialized to 0; one ICV per device
+   - OMP_NUM_TEAMS(_DEV(_<dev-num>)) overrides it
+     OMP_NUM_TEAMS_ALL overrides it
+   - Number of teams is:
+     -> the value specific by num_teams([lower:]upper)
+	with lower := upper if unspecified
+     -> Otherwise, if nteams-var ICV > 0, #teams <= nteams-var ICV
+     -> Otherwise, if nteams-var ICV <= 0, #teams > 1
+ GCC uses 3 as default on the host and 1 for host fallback.
+ For offloading, it is device specific >> 1.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <omp.h>
+
+int
+main ()
+{
+  int num_teams_env = -1, num_teams_env_dev = -1, *num_teams_env_devs = NULL;
+  const char *env_s;
+
+  /* Get the OMP_NUM_TEAMS environment variables.  */
+
+  env_s = getenv ("OMP_NUM_TEAMS_ALL");
+  if (env_s)
+    {
+      num_teams_env = num_teams_env_dev = atoi (env_s);
+      printf ("DEBUG: OMP_NUM_TEAMS_ALL='%s' -> nteams-var = %d\n",
+	      env_s, num_teams_env);
+    }
+  env_s = getenv ("OMP_NUM_TEAMS");
+  if (env_s)
+    {
+      num_teams_env = atoi (env_s);
+      printf ("DEBUG: OMP_NUM_TEAMS='%s' -> nteams-var = %d\n",
+	      env_s, num_teams_env);
+    }
+  env_s = getenv ("OMP_NUM_TEAMS_DEV");
+  if (env_s)
+    {
+      num_teams_env_dev = atoi (env_s);
+      printf ("DEBUG: OMP_NUM_TEAMS_DEV='%s' -> nteams-var = %d\n",
+	      env_s, num_teams_env_dev);
+    }
+  if (omp_get_num_devices () > 0)
+    {
+      num_teams_env_devs = (int*) malloc (sizeof (int) * omp_get_num_devices ());
+      for (int i = 0; i < omp_get_num_devices (); i++)
+	{
+	  char tmp[18+5+1];
+	  snprintf (tmp, sizeof (tmp), "OMP_NUM_TEAMS_DEV_%d", i);
+	  env_s = getenv (tmp);
+	  if (env_s)
+	    {
+	      num_teams_env_devs[i] = atoi (env_s);
+	      printf ("DEBUG: %s='%s' -> nteams-var = %d\n",
+		      tmp, env_s, num_teams_env_devs[i]);
+	    }
+	  else if (num_teams_env_dev > 0)
+	    num_teams_env_devs[i] = num_teams_env_dev;
+	  else
+	    num_teams_env_devs[i] = -1;
+	}
+    }
+
+  /* Check that the number of teams (initial device and in target) is
+     >= 1 and, if omp_get_max_teams() > 0, it does not
+     exceed omp_get_max_teams (). */
+
+  int nteams, num_teams;
+
+  /* Assume that omp_get_max_teams (); returns the ICV, i.e. 0 as default init
+     and not the number of teams that would be run; hence: '>='.  */
+  nteams = omp_get_max_teams ();
+  if (nteams < 0 || (num_teams_env >= 0 && nteams != num_teams_env))
+    abort ();
+  num_teams = -1;
+
+  #pragma omp teams
+   if (omp_get_team_num () == 0)
+     num_teams = omp_get_num_teams ();
+  if (num_teams < 1 || (nteams > 0 && num_teams > nteams))
+    abort ();
+
+  /* GCC hard codes 3 teams - check for it.  */
+  if (nteams <= 0 && num_teams != 3)
+    abort ();
+
+  /* For each device, including host fallback.  */
+  for (int dev = 0; dev <= omp_get_num_devices (); dev++)
+    {
+      int num_teams_icv = ((dev == omp_get_num_devices ())
+			   ? num_teams_env : num_teams_env_devs[dev]);
+      nteams = -1;
+      #pragma omp target device(dev) map(from: nteams)
+	nteams = omp_get_max_teams ();
+      if (nteams < 0 || (num_teams_icv >= 0 && nteams != num_teams_icv))
+	abort ();
+
+      num_teams = -1;
+      #pragma omp target teams device(dev) map(from: num_teams)
+	if (omp_get_team_num () == 0)
+	  num_teams = omp_get_num_teams ();
+
+      if (num_teams < 1 || (nteams > 0 && num_teams > nteams))
+	abort ();
+
+      /* GCC hard codes 1 team for host fallback - check for it.  */
+      if (dev == omp_get_num_devices () && num_teams != 1)
+	abort ();
+    }
+
+  /* Now set the nteams-var ICV and check that omp_get_max_teams()
+     returns the set value and that the following holds:
+     num_teams >= 1 and num_teams <= nteams-var ICV.
+
+     Additionally, implementation defined, assume:
+     - num_teams == (not '<=') nteams-var ICV, except:
+     - num_teams == 1 for host fallback.  */
+
+  omp_set_num_teams (5);
+
+  nteams = omp_get_max_teams ();
+  if (nteams != 5)
+    abort ();
+  num_teams = -1;
+
+  #pragma omp teams
+   if (omp_get_team_num () == 0)
+     num_teams = omp_get_num_teams ();
+  if (num_teams != 5)
+    abort ();
+
+  /* For each device, including host fallback.  */
+  for (int dev = 0; dev <= omp_get_num_devices (); dev++)
+    {
+      #pragma omp target device(dev) firstprivate(dev)
+	omp_set_num_teams (7 + dev);
+
+      #pragma omp target device(dev) map(from: nteams)
+	nteams = omp_get_max_teams ();
+      if (nteams != 7 + dev)
+	abort ();
+
+      num_teams = -1;
+      #pragma omp target teams device(dev) map(from: num_teams)
+	if (omp_get_team_num () == 0)
+	  num_teams = omp_get_num_teams ();
+
+      if (dev == omp_get_num_devices ())
+	{
+	  if (num_teams != 1)
+	    abort ();
+	}
+      else
+	{
+	  if (num_teams != 7 + dev)
+	    abort ();
+	}
+    }
+
+  /* Now use the num_teams clause explicitly.  */
+
+  num_teams = -1;
+  #pragma omp teams num_teams(6)
+   if (omp_get_team_num () == 0)
+     num_teams = omp_get_num_teams ();
+  if (num_teams != 6)
+    abort ();
+
+  /* For each device, including host fallback.  */
+  for (int dev = 0; dev <= omp_get_num_devices (); dev++)
+    {
+      num_teams = -1;
+      #pragma omp target teams device(dev) map(from: num_teams) num_teams(dev+3)
+	if (omp_get_team_num () == 0)
+	  num_teams = omp_get_num_teams ();
+
+      /* This must match the set value, also with host fallback.  */
+      if (num_teams != 3 + dev)
+	abort ();
+    }
+
+  free (num_teams_env_devs);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c
new file mode 100644
index 00000000000..f3a88234ec3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c
@@ -0,0 +1,5 @@ 
+/* PR libgomp/109875  */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL 9 } */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV 7 } */
+
+#include "teams-nteams-icv-1.c"
diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c
new file mode 100644
index 00000000000..dffd28aa5b5
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c
@@ -0,0 +1,5 @@ 
+/* PR libgomp/109875  */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL 7 } */
+/* { dg-set-target-env-var OMP_NUM_TEAMS 8 } */
+
+#include "teams-nteams-icv-1.c"
diff --git a/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c
new file mode 100644
index 00000000000..7fa12a8b9fe
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c
@@ -0,0 +1,8 @@ 
+/* PR libgomp/109875  */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL 7 } */
+/* { dg-set-target-env-var OMP_NUM_TEAMS 4 } */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV 8 } */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV_0 5 } */
+/* { dg-set-target-env-var OMP_NUM_TEAMS_DEV_1 11 } */
+
+#include "teams-nteams-icv-1.c"