diff mbox series

libgomp: Enforce 1-thread limit in subteams

Message ID 8ce3d009-140e-fa5d-8141-7c379f51f78a@codesourcery.com
State New
Headers show
Series libgomp: Enforce 1-thread limit in subteams | expand

Commit Message

Andrew Stubbs Sept. 29, 2020, 7:25 p.m. UTC
My recent patch to fix barriers in nested teams relied on the assumption 
that nested teams would only ever have one thread each.

However, that can be changed by altering the ICVs, via runtime call or 
environment variable (not that the accelerator-side libgomp can see the 
host environment), so it wasn't completely safe.

This patch ensures that the previous assumption is safe, by ignoring the 
relevant ICV on NVPTX and AMD GCN, neither of which can support it.

OK to commit?

Andrew

Comments

Jakub Jelinek Sept. 30, 2020, 7:46 a.m. UTC | #1
On Tue, Sep 29, 2020 at 08:25:44PM +0100, Andrew Stubbs wrote:
> 2020-09-29  Andrew Stubbs  <ams@codesourcery.com>
> 
> 	* parallel.c (gomp_resolve_num_threads): Ignore nest_var on nvptx
> 	and amdgcn targets.
> 
> diff --git a/libgomp/parallel.c b/libgomp/parallel.c
> index 2423f11f44a..0618056a7fe 100644
> --- a/libgomp/parallel.c
> +++ b/libgomp/parallel.c
> @@ -48,7 +48,14 @@ gomp_resolve_num_threads (unsigned specified, unsigned count)
>  
>    if (specified == 1)
>      return 1;
> -  else if (thr->ts.active_level >= 1 && !icv->nest_var)
> +
> +  /* Accelerators with fixed thread counts require this to return 1 for
> +     nested parallel regions.  */
> +  if (thr->ts.active_level >= 1
> +#if !defined(__AMDGCN__) && !defined(__nvptx__)

I think the comment should go right above the #if !defined line, because
it doesn't describe what the whole if is about, just a particular detail in
it.
I think I'd prefer some macro for this, but we already have quite a few
nvptx and AMDGCN ifdefs in libgomp/*.c, so I can live with that.

So ok for trunk with the comment move.

> +      && !icv->nest_var
> +#endif
> +      )
>      return 1;
>    else if (thr->ts.active_level >= gomp_max_active_levels_var)
>      return 1;


	Jakub
diff mbox series

Patch

libgomp: Enforce 1-thread limit in subteams

Accelerators with fixed thread-counts will break if nested teams are expected
to have multiple threads each.

libgomp/ChangeLog:

2020-09-29  Andrew Stubbs  <ams@codesourcery.com>

	* parallel.c (gomp_resolve_num_threads): Ignore nest_var on nvptx
	and amdgcn targets.

diff --git a/libgomp/parallel.c b/libgomp/parallel.c
index 2423f11f44a..0618056a7fe 100644
--- a/libgomp/parallel.c
+++ b/libgomp/parallel.c
@@ -48,7 +48,14 @@  gomp_resolve_num_threads (unsigned specified, unsigned count)
 
   if (specified == 1)
     return 1;
-  else if (thr->ts.active_level >= 1 && !icv->nest_var)
+
+  /* Accelerators with fixed thread counts require this to return 1 for
+     nested parallel regions.  */
+  if (thr->ts.active_level >= 1
+#if !defined(__AMDGCN__) && !defined(__nvptx__)
+      && !icv->nest_var
+#endif
+      )
     return 1;
   else if (thr->ts.active_level >= gomp_max_active_levels_var)
     return 1;