diff mbox series

openmp: Notify team barrier of pending tasks in, omp_fulfill_event

Message ID 7afcdc41-81d6-adc2-1f5d-46ca9176dd3e@codesourcery.com
State New
Headers show
Series openmp: Notify team barrier of pending tasks in, omp_fulfill_event | expand

Commit Message

Kwok Cheung Yeung May 17, 2021, 3:48 p.m. UTC
Hello

This patch fixes the issue where a call to omp_fulfill_event could fail to 
trigger the execution of tasks that were dependent on the task whose completion 
event is being fulfilled.

This mainly (or can only?) occurs when the thread is external to OpenMP, and all 
the barrier threads are sleeping when the omp_fulfill_event is called. 
omp_fulfill_event wakes the appropriate number of threads, but if 
BAR_TASK_PENDING is not set on bar->generation, the threads go back to sleep 
again rather than process new tasks.

I have added a new testcase using a pthread thread to call omp_fulfill_event on 
a suspended task after a short delay. I have not included a Fortran version as 
there doesn't appear to be a standard interface for threading on Fortran.

I have tested all the task-detach-* libgomp tests (which are the only tests that 
call omp_fulfill_event) with no offloading and offloading to Nvidia, with no 
fails. Okay to commit to master, releases/gcc-11 and devel/omp/gcc-11?

Thanks

Kwok
From 348c7cd00e358a8dc0b7563055f367fce2713fa5 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung <kcy@codesourcery.com>
Date: Fri, 14 May 2021 09:59:11 -0700
Subject: [PATCH] openmp: Notify team barrier of pending tasks in
 omp_fulfill_event

The team barrier should be notified of any new tasks that become runnable
as the result of a completing task, otherwise the barrier threads might
not resume processing available tasks, resulting in a hang.

2021-05-17  Kwok Cheung Yeung  <kcy@codesourcery.com>

	libgomp/
	* task.c (omp_fulfill_event): Call gomp_team_barrier_set_task_pending
	if new tasks generated.
	* testsuite/libgomp.c-c++-common/task-detach-13.c: New.
---
 libgomp/task.c                                |  1 +
 .../libgomp.c-c++-common/task-detach-13.c     | 60 +++++++++++++++++++
 2 files changed, 61 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c

Comments

Jakub Jelinek May 17, 2021, 4:03 p.m. UTC | #1
On Mon, May 17, 2021 at 04:48:03PM +0100, Kwok Cheung Yeung wrote:
> 2021-05-17  Kwok Cheung Yeung  <kcy@codesourcery.com>
> 
> 	libgomp/
> 	* task.c (omp_fulfill_event): Call gomp_team_barrier_set_task_pending
> 	if new tasks generated.
> 	* testsuite/libgomp.c-c++-common/task-detach-13.c: New.
> ---
>  libgomp/task.c                                |  1 +
>  .../libgomp.c-c++-common/task-detach-13.c     | 60 +++++++++++++++++++
>  2 files changed, 61 insertions(+)
>  create mode 100644 libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c
> 
> diff --git a/libgomp/task.c b/libgomp/task.c
> index 1c73c759a8d..feb4796a3ac 100644
> --- a/libgomp/task.c
> +++ b/libgomp/task.c
> @@ -2460,6 +2460,7 @@ omp_fulfill_event (omp_event_handle_t event)
>    if (new_tasks > 0)
>      {
>        /* Wake up threads to run new tasks.  */
> +      gomp_team_barrier_set_task_pending (&team->barrier);
>        do_wake = team->nthreads - team->task_running_count;
>        if (do_wake > new_tasks)
>  	do_wake = new_tasks;
> diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c b/libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c
> new file mode 100644
> index 00000000000..4306524526d
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c
> @@ -0,0 +1,60 @@
> +/* { dg-do run } */
> +/* { dg-options "-fopenmp" } */

-fopenmp as dg-options is implicit, please remove it.

> +/* { dg-timeout 10 } */

This will fail on targets that don't have pthreads.
We have already some tests that do use pthread_create,
and those currently use
/* { dg-do run { target *-*-linux* *-*-gnu* *-*-freebsd* } } */
so I'd do the same for this test.
There is also effective target pthread but am not sure if it covers
everything we need to test.

> +
> +
> +  pthread_join (thr, 0);

I'd add return 0;
While we default to C17 which doesn't need it, we don't say anywhere
in the testcase that it is C99+ or C++ only, so I think better make it valid
C89 too.

Otherwise LGTM, thanks.

	Jakub
diff mbox series

Patch

diff --git a/libgomp/task.c b/libgomp/task.c
index 1c73c759a8d..feb4796a3ac 100644
--- a/libgomp/task.c
+++ b/libgomp/task.c
@@ -2460,6 +2460,7 @@  omp_fulfill_event (omp_event_handle_t event)
   if (new_tasks > 0)
     {
       /* Wake up threads to run new tasks.  */
+      gomp_team_barrier_set_task_pending (&team->barrier);
       do_wake = team->nthreads - team->task_running_count;
       if (do_wake > new_tasks)
 	do_wake = new_tasks;
diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c b/libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c
new file mode 100644
index 00000000000..4306524526d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-13.c
@@ -0,0 +1,60 @@ 
+/* { dg-do run } */
+/* { dg-options "-fopenmp" } */
+/* { dg-timeout 10 } */
+
+/* Test that omp_fulfill_event works when called from an external
+   non-OpenMP thread.  */
+
+#include <omp.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <stdio.h>
+
+int finished = 0;
+int event_pending = 0;
+omp_event_handle_t detach_event;
+
+void*
+fulfill_thread (void *)
+{
+  while (!__atomic_load_n (&finished, __ATOMIC_RELAXED))
+    {
+      if (__atomic_load_n (&event_pending, __ATOMIC_ACQUIRE))
+	{
+	  omp_fulfill_event (detach_event);
+	  __atomic_store_n (&event_pending, 0, __ATOMIC_RELEASE);
+	}
+
+      sleep(1);
+    }
+
+  return 0;
+}
+
+int
+main (void)
+{
+  pthread_t thr;
+  int dep;
+  pthread_create (&thr, NULL, fulfill_thread, 0);
+
+  #pragma omp parallel
+    #pragma omp single
+      {
+	omp_event_handle_t ev;
+
+	#pragma omp task depend (out: dep) detach (ev)
+	{
+	  detach_event = ev;
+	  __atomic_store_n (&event_pending, 1, __ATOMIC_RELEASE);
+	}
+
+	#pragma omp task depend (in: dep)
+	{
+	  __atomic_store_n (&finished, 1, __ATOMIC_RELAXED);
+	}
+      }
+
+
+  pthread_join (thr, 0);
+}