
[libgomp,nvptx] Fix hang in gomp_team_barrier_wait_end

Message ID 20210420112344.GA7277@delia

Commit Message

Tom de Vries April 20, 2021, 11:23 a.m. UTC
Hi,

Consider the following omp fragment.
...
  #pragma omp target
  #pragma omp parallel num_threads (2)
  #pragma omp task
    ;
...

This hangs at -O0 for nvptx.

Investigating the behaviour gives us the following trace of events:
- both threads execute GOMP_task, where they:
  - deposit a task, and
  - execute gomp_team_barrier_wake
- thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
  proceeds to wait at the team barrier
- thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
  calls gomp_barrier_handle_tasks, where it:
  - executes both tasks and marks the team barrier done
  - executes a gomp_team_barrier_wake which wakes up thread 1
- thread 1 exits the team barrier
- thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
  the team barrier.
- thread 0 hangs.

To understand why there is a hang here, it's good to understand how things
are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
a copy of the libgomp/config/linux/bar.c implementation, with uses of both
futex_wake and do_wait replaced with uses of nvptx insn bar.sync:
...
  if (bar->total > 1)
    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
...

The point where thread 0 goes to wait at the team barrier corresponds, in
the linux implementation, to a do_wait.  In the linux case, the call to
do_wait doesn't hang, because it's waiting for bar->generation to become
a certain value, and if bar->generation already has that value, it just
proceeds, without any need for coordination with other threads.

In the nvptx case, the bar.sync waits until thread 1 joins it in the same
logical barrier, which never happens: thread 1 is lingering in the
thread pool at the thread pool barrier (using a different logical barrier),
waiting to join a new team.

The easiest way to fix this is to revert to the posix implementation for
bar.{c,h}.

Another way would be to revert to the linux implementation for bar.{c,h},
and implement the primitives futex_wait and do_wait using nvptx insns.

This patch instead implements a minimal fix (which makes the implementation
deviate further from the linux one).

The hang was only observed in gomp_team_barrier_wait_end, but we propagate the
fix to its twin gomp_team_barrier_wait_cancel_end as well.

The fix is based on the assumptions that at the point of the fix, after the
call to gomp_barrier_handle_tasks:
- all tasks are done
  (an assert is added to check this), and consequently:
- the executing thread is the only thread left in the team barrier
  (so it's accurate to set nthreads to 1).

Tested libgomp on x86_64 with nvptx accelerator.

Any comments?

Thanks,
- Tom

[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

libgomp/ChangeLog:

2021-04-20  Tom de Vries  <tdevries@suse.de>

	PR target/99555
	* config/nvptx/bar.c (gomp_team_barrier_wait_end)
	(gomp_team_barrier_wait_cancel_end): Don't try to sync with team threads
	that have left the team barrier.
	* testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
	workarounds.
	* testsuite/libgomp.c/pr99555-1.c: Same.
	* testsuite/libgomp.fortran/task-detach-6.f90: Same.

---
 libgomp/config/nvptx/bar.c                         | 32 ++++++++++++++++------
 .../testsuite/libgomp.c-c++-common/task-detach-6.c |  8 ------
 libgomp/testsuite/libgomp.c/pr99555-1.c            |  8 ------
 .../testsuite/libgomp.fortran/task-detach-6.f90    | 12 --------
 4 files changed, 24 insertions(+), 36 deletions(-)

Comments

Alexander Monakov April 20, 2021, 4:11 p.m. UTC | #1
Hello Tom,

Thank you for the investigation and the detailed writeup. It was difficult for
me to infer the internal API contracts here (and still is), sorry about the
mistake.

Most importantly: does GCN handle this, and if yes, how? I think the solution
should be the same for config/gcn and config/nvptx (I guess this is a question
for Andrew).

Some comments inline below:

On Tue, 20 Apr 2021, Tom de Vries wrote:

> Hi,
> 
> Consider the following omp fragment.
> ...
>   #pragma omp target
>   #pragma omp parallel num_threads (2)
>   #pragma omp task
>     ;
> ...
> 
> This hangs at -O0 for nvptx.
> 
> Investigating the behaviour gives us the following trace of events:
> - both threads execute GOMP_task, where they:
>   - deposit a task, and
>   - execute gomp_team_barrier_wake
> - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
>   proceeds to wait at the team barrier

Shouldn't it try to handle deposited tasks before suspending on the barrier?

I guess you are describing what the code does, I'm just commenting that I'm
confused why it behaves so.

> - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
>   calls gomp_barrier_handle_tasks, where it:
>   - executes both tasks and marks the team barrier done
>   - executes a gomp_team_barrier_wake which wakes up thread 1
> - thread 1 exits the team barrier

Up to this point it looks reasonable.

> - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
>   the team barrier.

At this point the code should realize that the team barrier was already released
and not attempt to wait on it again. Maybe by inspecting the generation counter?

I may be wrong though, I don't understand the overall flow well enough yet.

> - thread 0 hangs.
> 
> To understand why there is a hang here, it's good to understand how things
> are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
> a copy of the libgomp/config/linux/bar.c implementation, with uses of both
> futex_wake and do_wait replaced with uses of nvptx insn bar.sync:
> ...
>   if (bar->total > 1)
>     asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> ...
> 
> The point where thread 0 goes to wait at the team barrier, corresponds in
> the linux implementation with a do_wait.  In the linux case, the call to
> do_wait doesn't hang, because it's waiting for bar->generation to become
> a certain value, and if bar->generation already has that value, it just
> proceeds, without any need for coordination with other threads.
> 
> In the nvptx case, the bar.sync waits until thread 1 joins it in the same
> logical barrier, which never happens: thread 1 is lingering in the
> thread pool at the thread pool barrier (using a different logical barrier),
> waiting to join a new team.
> 
> The easiest way to fix this is to revert to the posix implementation for
> bar.{c,h}.
> 
> Another way would be to revert to the linux implementation for bar.{c,h},
> and implement the primitives futex_wait and do_wait using nvptx insns.

I don't think implementing futex_wait is possible on nvptx.

Alexander

> This patch instead implements a minimal fix (which makes the implementation
> deviate further from the linux one).
> 
> The hang was only observed in gomp_team_barrier_wait_end, but we propagate the
> fix to its twin gomp_team_barrier_wait_cancel_end as well.
> 
> The fix is based on the assumptions that at the point of the fix, after the
> call to gomp_barrier_handle_tasks:
> - all tasks are done
>   (an assert is added to check this), and consequently:
> - the executing thread is the only thread left in the team barrier
>   (so it's accurate to set nthreads to 1).
> 
> Tested libgomp on x86_64 with nvptx accelerator.
> 
> Any comments?
> 
> Thanks,
> - Tom
> 
> [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
> 
> libgomp/ChangeLog:
> 
> 2021-04-20  Tom de Vries  <tdevries@suse.de>
> 
> 	PR target/99555
> 	* config/nvptx/bar.c (gomp_team_barrier_wait_end)
> 	(gomp_team_barrier_wait_cancel_end): Don't try to sync with team threads
> 	that have left the team barrier.
> 	* testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
> 	workarounds.
> 	* testsuite/libgomp.c/pr99555-1.c: Same.
> 	* testsuite/libgomp.fortran/task-detach-6.f90: Same.
> 
> ---
>  libgomp/config/nvptx/bar.c                         | 32 ++++++++++++++++------
>  .../testsuite/libgomp.c-c++-common/task-detach-6.c |  8 ------
>  libgomp/testsuite/libgomp.c/pr99555-1.c            |  8 ------
>  .../testsuite/libgomp.fortran/task-detach-6.f90    | 12 --------
>  4 files changed, 24 insertions(+), 36 deletions(-)
> 
> diff --git a/libgomp/config/nvptx/bar.c b/libgomp/config/nvptx/bar.c
> index c5c2fa8829b..058a8d4d5ca 100644
> --- a/libgomp/config/nvptx/bar.c
> +++ b/libgomp/config/nvptx/bar.c
> @@ -78,6 +78,7 @@ void
>  gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
>  {
>    unsigned int generation, gen;
> +  unsigned int nthreads = bar->total;
>  
>    if (__builtin_expect (state & BAR_WAS_LAST, 0))
>      {
> @@ -90,6 +91,15 @@ gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
>        if (__builtin_expect (team->task_count, 0))
>  	{
>  	  gomp_barrier_handle_tasks (state);
> +	  /* Assert that all tasks have been handled.  */
> +	  if (team->task_count != 0)
> +	    __builtin_abort ();
> +	  /* In gomp_barrier_handle_tasks, the team barrier has been marked
> +	     as done, and all pending threads woken up.  So this is now the
> +	     last and only thread in the barrier.  Adjust nthreads to
> +	     reflect the new situation, to make sure we don't hang
> +	     indefinitely at the bar.sync below.  */
> +	  nthreads = 1;
>  	  state &= ~BAR_WAS_LAST;
>  	}
>        else
> @@ -97,8 +107,8 @@ gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
>  	  state &= ~BAR_CANCELLED;
>  	  state += BAR_INCR - BAR_WAS_LAST;
>  	  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
> -	  if (bar->total > 1)
> -	    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> +	  if (nthreads > 1)
> +	    asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
>  	  return;
>  	}
>      }
> @@ -107,8 +117,8 @@ gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
>    state &= ~BAR_CANCELLED;
>    do
>      {
> -      if (bar->total > 1)
> -	asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> +      if (nthreads > 1)
> +	asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
>        gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
>        if (__builtin_expect (gen & BAR_TASK_PENDING, 0))
>  	{
> @@ -140,6 +150,7 @@ gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
>  				   gomp_barrier_state_t state)
>  {
>    unsigned int generation, gen;
> +  unsigned int nthreads = bar->total;
>  
>    if (__builtin_expect (state & BAR_WAS_LAST, 0))
>      {
> @@ -156,14 +167,19 @@ gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
>        if (__builtin_expect (team->task_count, 0))
>  	{
>  	  gomp_barrier_handle_tasks (state);
> +	  /* Assert that all tasks have been handled.  */
> +	  if (team->task_count != 0)
> +	    __builtin_abort ();
> +	  /* See comment in gomp_team_barrier_wait_end.  */
> +	  nthreads = 1;
>  	  state &= ~BAR_WAS_LAST;
>  	}
>        else
>  	{
>  	  state += BAR_INCR - BAR_WAS_LAST;
>  	  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
> -	  if (bar->total > 1)
> -	    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> +	  if (nthreads > 1)
> +	    asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
>  	  return false;
>  	}
>      }
> @@ -174,8 +190,8 @@ gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
>    generation = state;
>    do
>      {
> -      if (bar->total > 1)
> -	asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> +      if (nthreads > 1)
> +	asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
>        gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
>        if (__builtin_expect (gen & BAR_CANCELLED, 0))
>  	return true;
> diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
> index f18b57bf047..e5c2291e6ff 100644
> --- a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
> +++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
> @@ -2,9 +2,6 @@
>  
>  #include <omp.h>
>  #include <assert.h>
> -#include <unistd.h> // For 'alarm'.
> -
> -#include "on_device_arch.h"
>  
>  /* Test tasks with detach clause on an offload device.  Each device
>     thread spawns off a chain of tasks, that can then be executed by
> @@ -12,11 +9,6 @@
>  
>  int main (void)
>  {
> -  //TODO See '../libgomp.c/pr99555-1.c'.
> -  if (on_device_arch_nvptx ())
> -    alarm (4); /*TODO Until resolved, make sure that we exit quickly, with error status.
> -		 { dg-xfail-run-if "PR99555" { offload_device_nvptx } } */
> -
>    int x = 0, y = 0, z = 0;
>    int thread_count;
>    omp_event_handle_t detach_event1, detach_event2;
> diff --git a/libgomp/testsuite/libgomp.c/pr99555-1.c b/libgomp/testsuite/libgomp.c/pr99555-1.c
> index bd33b93716b..7386e016fd2 100644
> --- a/libgomp/testsuite/libgomp.c/pr99555-1.c
> +++ b/libgomp/testsuite/libgomp.c/pr99555-1.c
> @@ -2,16 +2,8 @@
>  
>  // { dg-additional-options "-O0" }
>  
> -#include <unistd.h> // For 'alarm'.
> -
> -#include "../libgomp.c-c++-common/on_device_arch.h"
> -
>  int main (void)
>  {
> -  if (on_device_arch_nvptx ())
> -    alarm (4); /*TODO Until resolved, make sure that we exit quickly, with error status.
> -		 { dg-xfail-run-if "PR99555" { offload_device_nvptx } } */
> -
>  #pragma omp target
>  #pragma omp parallel // num_threads(1)
>  #pragma omp task
> diff --git a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90 b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> index e4373b4c6f1..03a3b61540d 100644
> --- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> +++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> @@ -1,6 +1,5 @@
>  ! { dg-do run }
>  
> -! { dg-additional-sources on_device_arch.c }
>    ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
>  
>  ! Test tasks with detach clause on an offload device.  Each device
> @@ -14,17 +13,6 @@ program task_detach_6
>    integer :: x = 0, y = 0, z = 0
>    integer :: thread_count
>  
> -  interface
> -    integer function on_device_arch_nvptx() bind(C)
> -    end function on_device_arch_nvptx
> -  end interface
> -
> -  !TODO See '../libgomp.c/pr99555-1.c'.
> -  if (on_device_arch_nvptx () /= 0) then
> -     call alarm (4, 0); !TODO Until resolved, make sure that we exit quickly, with error status.
> -     ! { dg-xfail-run-if "PR99555" { offload_device_nvptx } }
> -  end if
> -
>    !$omp target map (tofrom: x, y, z) map (from: thread_count)
>      !$omp parallel private (detach_event1, detach_event2)
>        !$omp single
>
Tom de Vries April 21, 2021, 4:10 p.m. UTC | #2
On 4/20/21 6:11 PM, Alexander Monakov wrote:
> Hello Tom,
> 
> Thank you for the investigation and the detailed writeup. It was difficult for
> me to infer the internal API contracts here (and still is),

Hi Alexander,

thanks for the review.

Yep, same here.

> sorry about the
> mistake.
> 
> Most importantly: does GCN handle this, and if yes, how? I think the solution
> should be the same for config/gcn and config/nvptx (I guess this is a question
> for Andrew).
> 

I looked into gcn/bar.c at gomp_team_barrier_wait_end and found:
...
  int retry = 100;
  do
    {
      if (retry-- == 0)
        {
          /* It really shouldn't happen that barriers get out of sync, but
             if they do then this will loop until they realign, so we need
             to avoid an infinite loop where the thread just isn't there.  */
          const char msg[]
             = ("Barrier sync failed (another thread died?);"
                " aborting.");
          write (2, msg, sizeof (msg)-1);
          abort();
...
which doesn't look promising.

> Some comments inline below:
> 
> On Tue, 20 Apr 2021, Tom de Vries wrote:
> 
>> Hi,
>>
>> Consider the following omp fragment.
>> ...
>>   #pragma omp target
>>   #pragma omp parallel num_threads (2)
>>   #pragma omp task
>>     ;
>> ...
>>
>> This hangs at -O0 for nvptx.
>>
>> Investigating the behaviour gives us the following trace of events:
>> - both threads execute GOMP_task, where they:
>>   - deposit a task, and
>>   - execute gomp_team_barrier_wake
>> - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
>>   proceeds to wait at the team barrier
> 
> Shouldn't it try to handle deposited tasks before suspending on the barrier?
> 
> I guess you are describing what the code does, I'm just commenting that I'm
> confused why it behaves so.
> 

Ack.  Yeah, sorry, I've got no idea how the OpenMP internals are
supposed to function.

>> - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
>>   calls gomp_barrier_handle_tasks, where it:
>>   - executes both tasks and marks the team barrier done
>>   - executes a gomp_team_barrier_wake which wakes up thread 1
>> - thread 1 exits the team barrier
> 
> Up to this point it looks reasonable.
> 
>> - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
>>   the team barrier.
> 
> At this point the code should realize that the team barrier was already released
> and not attempt to wait on it again. Maybe by inspecting the generation counter?
> 

Perhaps we can indeed piece together a fix like that.

The problem for me is that writing this sort of fix requires a good
understanding of the semantics of the various fields of gomp_barrier_t,
and I don't have that.

> I may be wrong though, I don't understand the overall flow well enough yet.
> 
>> - thread 0 hangs.
>>
>> To understand why there is a hang here, it's good to understand how things
>> are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
>> a copy of the libgomp/config/linux/bar.c implementation, with uses of both
>> futex_wake and do_wait replaced with uses of nvptx insn bar.sync:
>> ...
>>   if (bar->total > 1)
>>     asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
>> ...
>>
>> The point where thread 0 goes to wait at the team barrier, corresponds in
>> the linux implementation with a do_wait.  In the linux case, the call to
>> do_wait doesn't hang, because it's waiting for bar->generation to become
>> a certain value, and if bar->generation already has that value, it just
>> proceeds, without any need for coordination with other threads.
>>
>> In the nvptx case, the bar.sync waits until thread 1 joins it in the same
>> logical barrier, which never happens: thread 1 is lingering in the
>> thread pool at the thread pool barrier (using a different logical barrier),
>> waiting to join a new team.
>>
>> The easiest way to fix this is to revert to the posix implementation for
>> bar.{c,h}.
>>
>> Another way would be to revert to the linux implementation for bar.{c,h},
>> and implement the primitives futex_wait and do_wait using nvptx insns.
> 
> I don't think implementing futex_wait is possible on nvptx.
> 

Well, I gave it a try, attached below.  Can you explain why you think
it's not possible, or pinpoint a problem in the implementation?

[ The benefit of this specific approach for me is separation of
concerns: we copy a working solution as fully as possible, and isolate
the nvptx-specific code to two functions.  This requires us to
understand and provide the semantics of these two functions, and nothing
more. ]

Thanks,
- Tom
Alexander Monakov April 21, 2021, 5:02 p.m. UTC | #3
On Wed, 21 Apr 2021, Tom de Vries wrote:

> > I don't think implementing futex_wait is possible on nvptx.
> > 
> 
> Well, I gave it a try, attached below.  Can you explain why you think
> it's not possible, or pinpoint a problem in the implementation?

Responding only to this for now. When I said futex_wait I really meant
Linux futex wait, where the API is tied to a 32-bit futex control word
and nothing else. Your implementation works with a gomp_barrier_t that
includes more than one field. It would be confusing to call it a
"futex wait", it is not a 1:1 replacement.

(i.e. unlike a proper futex, it can work only for gomp_barrier_t objects)

Alexander
Tom de Vries April 22, 2021, 11:11 a.m. UTC | #4
On 4/21/21 7:02 PM, Alexander Monakov wrote:
> On Wed, 21 Apr 2021, Tom de Vries wrote:
> 
>>> I don't think implementing futex_wait is possible on nvptx.
>>>
>>
>> Well, I gave it a try, attached below.  Can you explain why you think
>> it's not possible, or pinpoint a problem in the implementation?
> 
> Responding only to this for now. When I said futex_wait I really meant
> Linux futex wait, where the API is tied to a 32-bit futex control word
> and nothing else. Your implementation works with a gomp_barrier_t that
> includes more than one field. It would be confusing to call it a
> "futex wait", it is not a 1:1 replacement.
> 
> (i.e. unlike a proper futex, it can work only for gomp_barrier_t objects)

Ah, I see, agreed, that makes sense.  I was afraid there was some
fundamental problem that I overlooked.

Here's an updated version.  I've tried to make it clear that the
futex_wait/wake are locally used versions, not generic functionality.

The main change in structure is that I'm now using the
generation_to_barrier trick from the rtems port, allowing linux/bar.c to
be included rather than copied (because the barrier argument is now
implicit).

Furthermore, I've reviewed the MEMMODELs used for the atomic accesses,
and updated a few.

Also now the cpu_relax from doacross.h is used.

Thanks,
- Tom
Alexander Monakov April 23, 2021, 3:45 p.m. UTC | #5
On Thu, 22 Apr 2021, Tom de Vries wrote:

> Ah, I see, agreed, that makes sense.  I was afraid there was some
> fundamental problem that I overlooked.
> 
> Here's an updated version.  I've tried to make it clear that the
> futex_wait/wake are locally used versions, not generic functionality.

Could you please regenerate the patch passing appropriate flags to
'git format-patch' so it presents a rewrite properly (see documentation
for --patience and --break-rewrites options). The attached patch was mostly
unreadable, I'm afraid.

Alexander
Tom de Vries April 23, 2021, 4:48 p.m. UTC | #6
On 4/23/21 5:45 PM, Alexander Monakov wrote:
> On Thu, 22 Apr 2021, Tom de Vries wrote:
> 
>> Ah, I see, agreed, that makes sense.  I was afraid there was some
>> fundamental problem that I overlooked.
>>
>> Here's an updated version.  I've tried to make it clear that the
>> futex_wait/wake are locally used versions, not generic functionality.
> 
> Could you please regenerate the patch passing appropriate flags to
> 'git format-patch' so it presents a rewrite properly (see documentation
> for --patience and --break-rewrites options). The attached patch was mostly
> unreadable, I'm afraid.

Sure.  I did notice that the patch was not readable, but I didn't know
there were options to improve that, so thanks for pointing that out.

Thanks,
- Tom
Tom de Vries May 19, 2021, 2:52 p.m. UTC | #7
On 4/23/21 6:48 PM, Tom de Vries wrote:
> On 4/23/21 5:45 PM, Alexander Monakov wrote:
>> On Thu, 22 Apr 2021, Tom de Vries wrote:
>>
>>> Ah, I see, agreed, that makes sense.  I was afraid there was some
>>> fundamental problem that I overlooked.
>>>
>>> Here's an updated version.  I've tried to make it clear that the
>>> futex_wait/wake are locally used versions, not generic functionality.
>> Could you please regenerate the patch passing appropriate flags to
>> 'git format-patch' so it presents a rewrite properly (see documentation
>> for --patience and --break-rewrites options). The attached patch was mostly
>> unreadable, I'm afraid.
> Sure.  I did notice that the patch was not readable, but I didn't know
> there were options to improve that, so thanks for pointing that out.
> 

Ping.  Any comments?

Thanks,
- Tom

> 0001-libgomp-nvptx-Fix-hang-in-gomp_team_barrier_wait_end.patch
> 
> From d3053a7ec7444b371ee29097a673e637b0d369d9 Mon Sep 17 00:00:00 2001
> From: Tom de Vries <tdevries@suse.de>
> Date: Tue, 20 Apr 2021 08:47:03 +0200
> Subject: [PATCH 1/4] [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
> 
> Consider the following omp fragment.
> ...
>   #pragma omp target
>   #pragma omp parallel num_threads (2)
>   #pragma omp task
>     ;
> ...
> 
> This hangs at -O0 for nvptx.
> 
> Investigating the behaviour gives us the following trace of events:
> - both threads execute GOMP_task, where they:
>   - deposit a task, and
>   - execute gomp_team_barrier_wake
> - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
>   proceeds to wait at the team barrier
> - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
>   calls gomp_barrier_handle_tasks, where it:
>   - executes both tasks and marks the team barrier done
>   - executes a gomp_team_barrier_wake which wakes up thread 1
> - thread 1 exits the team barrier
> - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
>   the team barrier.
> - thread 0 hangs.
> 
> To understand why there is a hang here, it's good to understand how things
> are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
> a copy of the libgomp/config/linux/bar.c implementation, with uses of both
> futex_wake and do_wait replaced with uses of ptx insn bar.sync:
> ...
>   if (bar->total > 1)
>     asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> ...
> 
> The point where thread 0 goes to wait at the team barrier, corresponds in
> the linux implementation with a do_wait.  In the linux case, the call to
> do_wait doesn't hang, because it's waiting for bar->generation to become
> a certain value, and if bar->generation already has that value, it just
> proceeds, without any need for coordination with other threads.
> 
> In the nvptx case, the bar.sync waits until thread 1 joins it in the same
> logical barrier, which never happens: thread 1 is lingering in the
> thread pool at the thread pool barrier (using a different logical barrier),
> waiting to join a new team.
> 
> The easiest way to fix this is to revert to the posix implementation for
> bar.{c,h}.  That however falls back on a busy-waiting approach, and
> does not take advantage of the ptx bar.sync insn.
> 
> Instead, we revert to the linux implementation for bar.c,
> and implement bar.c local functions futex_wait and futex_wake using the
> bar.sync insn.
> 
> This is a WIP version that does not yet take performance into consideration,
> but instead focuses on copying a working version as completely as possible,
> and isolating the machine-specific changes to as few functions as
> possible.
> 
> The bar.sync insn takes an argument specifying how many threads are
> participating, and that doesn't play well with the futex syntax where it's
> not clear in advance how many threads will be woken up.
> 
> This is solved by waking up all waiting threads each time a futex_wait or
> futex_wake happens, and possibly going back to sleep with an updated thread
> count.
> 
> Tested libgomp on x86_64 with nvptx accelerator, both as-is and with
> do_spin hardcoded to 1.
> 
> libgomp/ChangeLog:
> 
> 2021-04-20  Tom de Vries  <tdevries@suse.de>
> 
> 	PR target/99555
> 	* config/nvptx/bar.c (generation_to_barrier): New function, copied
> 	from config/rtems/bar.c.
> 	(futex_wait, futex_wake): New function.
> 	(do_spin, do_wait): New function, copied from config/linux/wait.h.
> 	(gomp_barrier_wait_end, gomp_barrier_wait_last)
> 	(gomp_team_barrier_wake, gomp_team_barrier_wait_end):
> 	(gomp_team_barrier_wait_cancel_end, gomp_team_barrier_cancel): Remove
> 	and replace with include of config/linux/bar.c.
> 	* config/nvptx/bar.h (gomp_barrier_t): Add fields waiters and lock.
> 	(gomp_barrier_init): Init new fields.
> 	* testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
> 	workarounds.
> 	* testsuite/libgomp.c/pr99555-1.c: Same.
> 	* testsuite/libgomp.fortran/task-detach-6.f90: Same.
> ---
>  libgomp/config/nvptx/bar.c                    | 388 ++++++++----------
>  libgomp/config/nvptx/bar.h                    |   4 +
>  .../libgomp.c-c++-common/task-detach-6.c      |   8 -
>  libgomp/testsuite/libgomp.c/pr99555-1.c       |   8 -
>  .../libgomp.fortran/task-detach-6.f90         |  12 -
>  5 files changed, 180 insertions(+), 240 deletions(-)
>  rewrite libgomp/config/nvptx/bar.c (76%)
> 
> diff --git a/libgomp/config/nvptx/bar.c b/libgomp/config/nvptx/bar.c
> dissimilarity index 76%
> index c5c2fa8829b..e0e6e5ed839 100644
> --- a/libgomp/config/nvptx/bar.c
> +++ b/libgomp/config/nvptx/bar.c
> @@ -1,212 +1,176 @@
> -/* Copyright (C) 2015-2021 Free Software Foundation, Inc.
> -   Contributed by Alexander Monakov <amonakov@ispras.ru>
> -
> -   This file is part of the GNU Offloading and Multi Processing Library
> -   (libgomp).
> -
> -   Libgomp is free software; you can redistribute it and/or modify it
> -   under the terms of the GNU General Public License as published by
> -   the Free Software Foundation; either version 3, or (at your option)
> -   any later version.
> -
> -   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
> -   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> -   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> -   more details.
> -
> -   Under Section 7 of GPL version 3, you are granted additional
> -   permissions described in the GCC Runtime Library Exception, version
> -   3.1, as published by the Free Software Foundation.
> -
> -   You should have received a copy of the GNU General Public License and
> -   a copy of the GCC Runtime Library Exception along with this program;
> -   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -/* This is an NVPTX specific implementation of a barrier synchronization
> -   mechanism for libgomp.  This type is private to the library.  This
> -   implementation uses atomic instructions and bar.sync instruction.  */
> -
> -#include <limits.h>
> -#include "libgomp.h"
> -
> -
> -void
> -gomp_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
> -{
> -  if (__builtin_expect (state & BAR_WAS_LAST, 0))
> -    {
> -      /* Next time we'll be awaiting TOTAL threads again.  */
> -      bar->awaited = bar->total;
> -      __atomic_store_n (&bar->generation, bar->generation + BAR_INCR,
> -			MEMMODEL_RELEASE);
> -    }
> -  if (bar->total > 1)
> -    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> -}
> -
> -void
> -gomp_barrier_wait (gomp_barrier_t *bar)
> -{
> -  gomp_barrier_wait_end (bar, gomp_barrier_wait_start (bar));
> -}
> -
> -/* Like gomp_barrier_wait, except that if the encountering thread
> -   is not the last one to hit the barrier, it returns immediately.
> -   The intended usage is that a thread which intends to gomp_barrier_destroy
> -   this barrier calls gomp_barrier_wait, while all other threads
> -   call gomp_barrier_wait_last.  When gomp_barrier_wait returns,
> -   the barrier can be safely destroyed.  */
> -
> -void
> -gomp_barrier_wait_last (gomp_barrier_t *bar)
> -{
> -  /* Deferring to gomp_barrier_wait does not use the optimization opportunity
> -     allowed by the interface contract for all-but-last participants.  The
> -     original implementation in config/linux/bar.c handles this better.  */
> -  gomp_barrier_wait (bar);
> -}
> -
> -void
> -gomp_team_barrier_wake (gomp_barrier_t *bar, int count)
> -{
> -  if (bar->total > 1)
> -    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> -}
> -
> -void
> -gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
> -{
> -  unsigned int generation, gen;
> -
> -  if (__builtin_expect (state & BAR_WAS_LAST, 0))
> -    {
> -      /* Next time we'll be awaiting TOTAL threads again.  */
> -      struct gomp_thread *thr = gomp_thread ();
> -      struct gomp_team *team = thr->ts.team;
> -
> -      bar->awaited = bar->total;
> -      team->work_share_cancelled = 0;
> -      if (__builtin_expect (team->task_count, 0))
> -	{
> -	  gomp_barrier_handle_tasks (state);
> -	  state &= ~BAR_WAS_LAST;
> -	}
> -      else
> -	{
> -	  state &= ~BAR_CANCELLED;
> -	  state += BAR_INCR - BAR_WAS_LAST;
> -	  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
> -	  if (bar->total > 1)
> -	    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> -	  return;
> -	}
> -    }
> -
> -  generation = state;
> -  state &= ~BAR_CANCELLED;
> -  do
> -    {
> -      if (bar->total > 1)
> -	asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> -      gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
> -      if (__builtin_expect (gen & BAR_TASK_PENDING, 0))
> -	{
> -	  gomp_barrier_handle_tasks (state);
> -	  gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
> -	}
> -      generation |= gen & BAR_WAITING_FOR_TASK;
> -    }
> -  while (gen != state + BAR_INCR);
> -}
> -
> -void
> -gomp_team_barrier_wait (gomp_barrier_t *bar)
> -{
> -  gomp_team_barrier_wait_end (bar, gomp_barrier_wait_start (bar));
> -}
> -
> -void
> -gomp_team_barrier_wait_final (gomp_barrier_t *bar)
> -{
> -  gomp_barrier_state_t state = gomp_barrier_wait_final_start (bar);
> -  if (__builtin_expect (state & BAR_WAS_LAST, 0))
> -    bar->awaited_final = bar->total;
> -  gomp_team_barrier_wait_end (bar, state);
> -}
> -
> -bool
> -gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
> -				   gomp_barrier_state_t state)
> -{
> -  unsigned int generation, gen;
> -
> -  if (__builtin_expect (state & BAR_WAS_LAST, 0))
> -    {
> -      /* Next time we'll be awaiting TOTAL threads again.  */
> -      /* BAR_CANCELLED should never be set in state here, because
> -	 cancellation means that at least one of the threads has been
> -	 cancelled, thus on a cancellable barrier we should never see
> -	 all threads to arrive.  */
> -      struct gomp_thread *thr = gomp_thread ();
> -      struct gomp_team *team = thr->ts.team;
> -
> -      bar->awaited = bar->total;
> -      team->work_share_cancelled = 0;
> -      if (__builtin_expect (team->task_count, 0))
> -	{
> -	  gomp_barrier_handle_tasks (state);
> -	  state &= ~BAR_WAS_LAST;
> -	}
> -      else
> -	{
> -	  state += BAR_INCR - BAR_WAS_LAST;
> -	  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
> -	  if (bar->total > 1)
> -	    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> -	  return false;
> -	}
> -    }
> -
> -  if (__builtin_expect (state & BAR_CANCELLED, 0))
> -    return true;
> -
> -  generation = state;
> -  do
> -    {
> -      if (bar->total > 1)
> -	asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> -      gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
> -      if (__builtin_expect (gen & BAR_CANCELLED, 0))
> -	return true;
> -      if (__builtin_expect (gen & BAR_TASK_PENDING, 0))
> -	{
> -	  gomp_barrier_handle_tasks (state);
> -	  gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
> -	}
> -      generation |= gen & BAR_WAITING_FOR_TASK;
> -    }
> -  while (gen != state + BAR_INCR);
> -
> -  return false;
> -}
> -
> -bool
> -gomp_team_barrier_wait_cancel (gomp_barrier_t *bar)
> -{
> -  return gomp_team_barrier_wait_cancel_end (bar, gomp_barrier_wait_start (bar));
> -}
> -
> -void
> -gomp_team_barrier_cancel (struct gomp_team *team)
> -{
> -  gomp_mutex_lock (&team->task_lock);
> -  if (team->barrier.generation & BAR_CANCELLED)
> -    {
> -      gomp_mutex_unlock (&team->task_lock);
> -      return;
> -    }
> -  team->barrier.generation |= BAR_CANCELLED;
> -  gomp_mutex_unlock (&team->task_lock);
> -  gomp_team_barrier_wake (&team->barrier, INT_MAX);
> -}
> +/* Copyright (C) 2015-2021 Free Software Foundation, Inc.
> +   Contributed by Alexander Monakov <amonakov@ispras.ru>
> +
> +   This file is part of the GNU Offloading and Multi Processing Library
> +   (libgomp).
> +
> +   Libgomp is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +   more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +/* This is an NVPTX specific implementation of a barrier synchronization
> +   mechanism for libgomp.  This type is private to the library.  This
> +   implementation uses atomic instructions and bar.sync instruction.  */
> +
> +#include <limits.h>
> +#include "libgomp.h"
> +
> +/* For cpu_relax.  */
> +#include "doacross.h"
> +
> +/* Assuming ADDR is &bar->generation, return bar.  Copied from
> +   rtems/bar.c.  */
> +
> +static gomp_barrier_t *
> +generation_to_barrier (int *addr)
> +{
> +  char *bar
> +    = (char *) addr - __builtin_offsetof (gomp_barrier_t, generation);
> +  return (gomp_barrier_t *)bar;
> +}
> +
> +/* Implement futex_wait-like behaviour to plug into the linux/bar.c
> +   implementation.  Assumes ADDR is &bar->generation.   */
> +
> +static inline void
> +futex_wait (int *addr, int val)
> +{
> +  gomp_barrier_t *bar = generation_to_barrier (addr);
> +
> +  if (bar->total < 2)
> +    /* A barrier with less than two threads, nop.  */
> +    return;
> +
> +  gomp_mutex_lock (&bar->lock);
> +
> +  /* Futex semantics: only go to sleep if *addr == val.  */
> +  if (__builtin_expect (__atomic_load_n (addr, MEMMODEL_ACQUIRE) != val, 0))
> +    {
> +      gomp_mutex_unlock (&bar->lock);
> +      return;
> +    }
> +
> +  /* Register as waiter.  */
> +  unsigned int waiters
> +    = __atomic_add_fetch (&bar->waiters, 1, MEMMODEL_ACQ_REL);
> +  if (waiters == 0)
> +    __builtin_abort ();
> +  unsigned int waiter_id = waiters;
> +
> +  if (waiters > 1)
> +    {
> +      /* Wake other threads in bar.sync.  */
> +      asm volatile ("bar.sync 1, %0;" : : "r" (32 * waiters));
> +
> +      /* Ensure that they have updated waiters.  */
> +      asm volatile ("bar.sync 1, %0;" : : "r" (32 * waiters));
> +    }
> +
> +  gomp_mutex_unlock (&bar->lock);
> +
> +  while (1)
> +    {
> +      /* Wait for next thread in barrier.  */
> +      asm volatile ("bar.sync 1, %0;" : : "r" (32 * (waiters + 1)));
> +
> +      /* Get updated waiters.  */
> +      unsigned int updated_waiters
> +	= __atomic_load_n (&bar->waiters, MEMMODEL_ACQUIRE);
> +
> +      /* Notify that we have updated waiters.  */
> +      asm volatile ("bar.sync 1, %0;" : : "r" (32 * (waiters + 1)));
> +
> +      waiters = updated_waiters;
> +
> +      if (waiter_id > waiters)
> +	/* A wake happened, and we're in the group of woken threads.  */
> +	break;
> +
> +      /* Continue waiting.  */
> +    }
> +}
> +
> +/* Implement futex_wake-like behaviour to plug into the linux/bar.c
> +   implementation.  Assumes ADDR is &bar->generation.  */
> +
> +static inline void
> +futex_wake (int *addr, int count)
> +{
> +  gomp_barrier_t *bar = generation_to_barrier (addr);
> +
> +  if (bar->total < 2)
> +    /* A barrier with less than two threads, nop.  */
> +    return;
> +
> +  gomp_mutex_lock (&bar->lock);
> +  unsigned int waiters = __atomic_load_n (&bar->waiters, MEMMODEL_ACQUIRE);
> +  if (waiters == 0)
> +    {
> +      /* No threads to wake.  */
> +      gomp_mutex_unlock (&bar->lock);
> +      return;
> +    }
> +
> +  if (count == INT_MAX)
> +    /* Release all threads.  */
> +    __atomic_store_n (&bar->waiters, 0, MEMMODEL_RELEASE);
> +  else if (count < bar->total)
> +    /* Release count threads.  */
> +    __atomic_add_fetch (&bar->waiters, -count, MEMMODEL_ACQ_REL);
> +  else
> +    /* Count has an illegal value.  */
> +    __builtin_abort ();
> +
> +  /* Wake other threads in bar.sync.  */
> +  asm volatile ("bar.sync 1, %0;" : : "r" (32 * (waiters + 1)));
> +
> +  /* Let them get the updated waiters.  */
> +  asm volatile ("bar.sync 1, %0;" : : "r" (32 * (waiters + 1)));
> +
> +  gomp_mutex_unlock (&bar->lock);
> +}
> +
> +/* Copied from linux/wait.h.  */
> +
> +static inline int do_spin (int *addr, int val)
> +{
> +  unsigned long long i, count = gomp_spin_count_var;
> +
> +  if (__builtin_expect (__atomic_load_n (&gomp_managed_threads,
> +					 MEMMODEL_RELAXED)
> +			> gomp_available_cpus, 0))
> +    count = gomp_throttled_spin_count_var;
> +  for (i = 0; i < count; i++)
> +    if (__builtin_expect (__atomic_load_n (addr, MEMMODEL_RELAXED) != val, 0))
> +      return 0;
> +    else
> +      cpu_relax ();
> +  return 1;
> +}
> +
> +/* Copied from linux/wait.h.  */
> +
> +static inline void do_wait (int *addr, int val)
> +{
> +  if (do_spin (addr, val))
> +    futex_wait (addr, val);
> +}
> +
> +/* Reuse the linux implementation.  */
> +#define GOMP_WAIT_H 1
> +#include "../linux/bar.c"
> diff --git a/libgomp/config/nvptx/bar.h b/libgomp/config/nvptx/bar.h
> index 9bf3d914a02..c69426e1629 100644
> --- a/libgomp/config/nvptx/bar.h
> +++ b/libgomp/config/nvptx/bar.h
> @@ -38,6 +38,8 @@ typedef struct
>    unsigned generation;
>    unsigned awaited;
>    unsigned awaited_final;
> +  unsigned waiters;
> +  gomp_mutex_t lock;
>  } gomp_barrier_t;
>  
>  typedef unsigned int gomp_barrier_state_t;
> @@ -57,6 +59,8 @@ static inline void gomp_barrier_init (gomp_barrier_t *bar, unsigned count)
>    bar->awaited = count;
>    bar->awaited_final = count;
>    bar->generation = 0;
> +  bar->waiters = 0;
> +  gomp_mutex_init (&bar->lock);
>  }
>  
>  static inline void gomp_barrier_reinit (gomp_barrier_t *bar, unsigned count)
> diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
> index f18b57bf047..e5c2291e6ff 100644
> --- a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
> +++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
> @@ -2,9 +2,6 @@
>  
>  #include <omp.h>
>  #include <assert.h>
> -#include <unistd.h> // For 'alarm'.
> -
> -#include "on_device_arch.h"
>  
>  /* Test tasks with detach clause on an offload device.  Each device
>     thread spawns off a chain of tasks, that can then be executed by
> @@ -12,11 +9,6 @@
>  
>  int main (void)
>  {
> -  //TODO See '../libgomp.c/pr99555-1.c'.
> -  if (on_device_arch_nvptx ())
> -    alarm (4); /*TODO Until resolved, make sure that we exit quickly, with error status.
> -		 { dg-xfail-run-if "PR99555" { offload_device_nvptx } } */
> -
>    int x = 0, y = 0, z = 0;
>    int thread_count;
>    omp_event_handle_t detach_event1, detach_event2;
> diff --git a/libgomp/testsuite/libgomp.c/pr99555-1.c b/libgomp/testsuite/libgomp.c/pr99555-1.c
> index bd33b93716b..7386e016fd2 100644
> --- a/libgomp/testsuite/libgomp.c/pr99555-1.c
> +++ b/libgomp/testsuite/libgomp.c/pr99555-1.c
> @@ -2,16 +2,8 @@
>  
>  // { dg-additional-options "-O0" }
>  
> -#include <unistd.h> // For 'alarm'.
> -
> -#include "../libgomp.c-c++-common/on_device_arch.h"
> -
>  int main (void)
>  {
> -  if (on_device_arch_nvptx ())
> -    alarm (4); /*TODO Until resolved, make sure that we exit quickly, with error status.
> -		 { dg-xfail-run-if "PR99555" { offload_device_nvptx } } */
> -
>  #pragma omp target
>  #pragma omp parallel // num_threads(1)
>  #pragma omp task
> diff --git a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90 b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> index e4373b4c6f1..03a3b61540d 100644
> --- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> +++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> @@ -1,6 +1,5 @@
>  ! { dg-do run }
>  
> -! { dg-additional-sources on_device_arch.c }
>    ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
>  
>  ! Test tasks with detach clause on an offload device.  Each device
> @@ -14,17 +13,6 @@ program task_detach_6
>    integer :: x = 0, y = 0, z = 0
>    integer :: thread_count
>  
> -  interface
> -    integer function on_device_arch_nvptx() bind(C)
> -    end function on_device_arch_nvptx
> -  end interface
> -
> -  !TODO See '../libgomp.c/pr99555-1.c'.
> -  if (on_device_arch_nvptx () /= 0) then
> -     call alarm (4, 0); !TODO Until resolved, make sure that we exit quickly, with error status.
> -     ! { dg-xfail-run-if "PR99555" { offload_device_nvptx } }
> -  end if
> -
>    !$omp target map (tofrom: x, y, z) map (from: thread_count)
>      !$omp parallel private (detach_event1, detach_event2)
>        !$omp single
> -- 2.28.0
>
Thomas Schwinge May 20, 2021, 9:52 a.m. UTC | #8
Hi Tom!

First, thanks for looking into this PR99555!


I can't comment on the OpenMP/nvptx changes, so just the following:

On 2021-04-23T18:48:01+0200, Tom de Vries <tdevries@suse.de> wrote:
> --- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> +++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> @@ -1,6 +1,5 @@
>  ! { dg-do run }
>
> -! { dg-additional-sources on_device_arch.c }
>    ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }

Please remove the 'dg-prune-output', too.  ;-)


Your changes leave
'libgomp/testsuite/lib/libgomp.exp:check_effective_target_offload_device_nvptx',
'libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h',
'libgomp/testsuite/libgomp.fortran/on_device_arch.c' unused.  Should we
keep those for potential future use (given that they've been tested to
work) or remove them (as now unused, with the danger of bit-rot)?


Grüße
 Thomas
Tom de Vries May 20, 2021, 11:41 a.m. UTC | #9
On 5/20/21 11:52 AM, Thomas Schwinge wrote:
> Hi Tom!
> 
> First, thanks for looking into this PR99555!
> 
> 
> I can't comment on the OpenMP/nvptx changes, so just the following:
> 
> On 2021-04-23T18:48:01+0200, Tom de Vries <tdevries@suse.de> wrote:
>> --- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
>> +++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
>> @@ -1,6 +1,5 @@
>>  ! { dg-do run }
>>
>> -! { dg-additional-sources on_device_arch.c }
>>    ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
> 
> Please remove the 'dg-prune-output', too.  ;-)
> 

Ack, updated patch.

> Your changes leave
> 'libgomp/testsuite/lib/libgomp.exp:check_effective_target_offload_device_nvptx',
> 'libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h',
> 'libgomp/testsuite/libgomp.fortran/on_device_arch.c' unused.  Should we
> keep those for a potential future use (given that they've been tested to
> work) or remove (as now unused, danger of bit-rot)?

I vote to leave them in, they look useful, and I think the danger of
bit-rot is less than the danger of not knowing/remembering that they
once were there and having to start from scratch.

Thanks,
- Tom
Tobias Burnus Nov. 26, 2021, 12:10 p.m. UTC | #10
*PING* - While I am not the patch author, I think it should be fixed.  Thus:

Alexander, since you asked for the updated diff and commented, can you
have a look?

https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568616.html [PR
target/99555]

Thanks,

Tobias

On 23.04.21 18:48, Tom de Vries wrote:
> On 4/23/21 5:45 PM, Alexander Monakov wrote:
>> On Thu, 22 Apr 2021, Tom de Vries wrote:
>>
>>> Ah, I see, agreed, that makes sense.  I was afraid there was some
>>> fundamental problem that I overlooked.
>>>
>>> Here's an updated version.  I've tried to make it clear that the
>>> futex_wait/wake are locally used versions, not generic functionality.
>> Could you please regenerate the patch passing appropriate flags to
>> 'git format-patch' so it presents a rewrite properly (see documentation
>> for --patience and --break-rewrites options). The attached patch was mostly
>> unreadable, I'm afraid.
> Sure.  I did notice that the patch was not readable, but I didn't know
> there were options to improve that, so thanks for pointing that out.
>
> Thanks,
> - Tom
Tom de Vries Feb. 22, 2022, 2:52 p.m. UTC | #11
On 5/19/21 16:52, Tom de Vries wrote:
> On 4/23/21 6:48 PM, Tom de Vries wrote:
>> On 4/23/21 5:45 PM, Alexander Monakov wrote:
>>> On Thu, 22 Apr 2021, Tom de Vries wrote:
>>>
>>>> Ah, I see, agreed, that makes sense.  I was afraid there was some
>>>> fundamental problem that I overlooked.
>>>>
>>>> Here's an updated version.  I've tried to make it clear that the
>>>> futex_wait/wake are locally used versions, not generic functionality.
>>> Could you please regenerate the patch passing appropriate flags to
>>> 'git format-patch' so it presents a rewrite properly (see documentation
>>> for --patience and --break-rewrites options). The attached patch was mostly
>>> unreadable, I'm afraid.
>> Sure.  I did notice that the patch was not readable, but I didn't know
>> there were options to improve that, so thanks for pointing that out.
>>
> 
> Ping.  Any comments?

I've hardcoded do_spin to 1, and tested on:
- turing, pascal, maxwell (510.x driver)
- kepler (470.x driver)

Committed.

Thanks,
- Tom
Patch

diff --git a/libgomp/config/nvptx/bar.c b/libgomp/config/nvptx/bar.c
index c5c2fa8829b..058a8d4d5ca 100644
--- a/libgomp/config/nvptx/bar.c
+++ b/libgomp/config/nvptx/bar.c
@@ -78,6 +78,7 @@  void
 gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
 {
   unsigned int generation, gen;
+  unsigned int nthreads = bar->total;
 
   if (__builtin_expect (state & BAR_WAS_LAST, 0))
     {
@@ -90,6 +91,15 @@  gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
       if (__builtin_expect (team->task_count, 0))
 	{
 	  gomp_barrier_handle_tasks (state);
+	  /* Assert that all tasks have been handled.  */
+	  if (team->task_count != 0)
+	    __builtin_abort ();
+	  /* In gomp_barrier_handle_tasks, the team barrier has been marked
+	     as done, and all pending threads woken up.  So this is now the
+	     last and only thread in the barrier.  Adjust nthreads to
+	     reflect the new situation, to make sure we don't hang
+	     indefinitely at the bar.sync below.  */
+	  nthreads = 1;
 	  state &= ~BAR_WAS_LAST;
 	}
       else
@@ -97,8 +107,8 @@  gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
 	  state &= ~BAR_CANCELLED;
 	  state += BAR_INCR - BAR_WAS_LAST;
 	  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
-	  if (bar->total > 1)
-	    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
+	  if (nthreads > 1)
+	    asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
 	  return;
 	}
     }
@@ -107,8 +117,8 @@  gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
   state &= ~BAR_CANCELLED;
   do
     {
-      if (bar->total > 1)
-	asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
+      if (nthreads > 1)
+	asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
       gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
       if (__builtin_expect (gen & BAR_TASK_PENDING, 0))
 	{
@@ -140,6 +150,7 @@  gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
 				   gomp_barrier_state_t state)
 {
   unsigned int generation, gen;
+  unsigned int nthreads = bar->total;
 
   if (__builtin_expect (state & BAR_WAS_LAST, 0))
     {
@@ -156,14 +167,19 @@  gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
       if (__builtin_expect (team->task_count, 0))
 	{
 	  gomp_barrier_handle_tasks (state);
+	  /* Assert that all tasks have been handled.  */
+	  if (team->task_count != 0)
+	    __builtin_abort ();
+	  /* See comment in gomp_team_barrier_wait_end.  */
+	  nthreads = 1;
 	  state &= ~BAR_WAS_LAST;
 	}
       else
 	{
 	  state += BAR_INCR - BAR_WAS_LAST;
 	  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
-	  if (bar->total > 1)
-	    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
+	  if (nthreads > 1)
+	    asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
 	  return false;
 	}
     }
@@ -174,8 +190,8 @@  gomp_team_barrier_wait_cancel_end (gomp_barrier_t *bar,
   generation = state;
   do
     {
-      if (bar->total > 1)
-	asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
+      if (nthreads > 1)
+	asm ("bar.sync 1, %0;" : : "r" (32 * nthreads));
       gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
       if (__builtin_expect (gen & BAR_CANCELLED, 0))
 	return true;
diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
index f18b57bf047..e5c2291e6ff 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
@@ -2,9 +2,6 @@ 
 
 #include <omp.h>
 #include <assert.h>
-#include <unistd.h> // For 'alarm'.
-
-#include "on_device_arch.h"
 
 /* Test tasks with detach clause on an offload device.  Each device
    thread spawns off a chain of tasks, that can then be executed by
@@ -12,11 +9,6 @@ 
 
 int main (void)
 {
-  //TODO See '../libgomp.c/pr99555-1.c'.
-  if (on_device_arch_nvptx ())
-    alarm (4); /*TODO Until resolved, make sure that we exit quickly, with error status.
-		 { dg-xfail-run-if "PR99555" { offload_device_nvptx } } */
-
   int x = 0, y = 0, z = 0;
   int thread_count;
   omp_event_handle_t detach_event1, detach_event2;
diff --git a/libgomp/testsuite/libgomp.c/pr99555-1.c b/libgomp/testsuite/libgomp.c/pr99555-1.c
index bd33b93716b..7386e016fd2 100644
--- a/libgomp/testsuite/libgomp.c/pr99555-1.c
+++ b/libgomp/testsuite/libgomp.c/pr99555-1.c
@@ -2,16 +2,8 @@ 
 
 // { dg-additional-options "-O0" }
 
-#include <unistd.h> // For 'alarm'.
-
-#include "../libgomp.c-c++-common/on_device_arch.h"
-
 int main (void)
 {
-  if (on_device_arch_nvptx ())
-    alarm (4); /*TODO Until resolved, make sure that we exit quickly, with error status.
-		 { dg-xfail-run-if "PR99555" { offload_device_nvptx } } */
-
 #pragma omp target
 #pragma omp parallel // num_threads(1)
 #pragma omp task
diff --git a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90 b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
index e4373b4c6f1..03a3b61540d 100644
--- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
+++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
@@ -1,6 +1,5 @@ 
 ! { dg-do run }
 
-! { dg-additional-sources on_device_arch.c }
   ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
 
 ! Test tasks with detach clause on an offload device.  Each device
@@ -14,17 +13,6 @@  program task_detach_6
   integer :: x = 0, y = 0, z = 0
   integer :: thread_count
 
-  interface
-    integer function on_device_arch_nvptx() bind(C)
-    end function on_device_arch_nvptx
-  end interface
-
-  !TODO See '../libgomp.c/pr99555-1.c'.
-  if (on_device_arch_nvptx () /= 0) then
-     call alarm (4, 0); !TODO Until resolved, make sure that we exit quickly, with error status.
-     ! { dg-xfail-run-if "PR99555" { offload_device_nvptx } }
-  end if
-
   !$omp target map (tofrom: x, y, z) map (from: thread_count)
     !$omp parallel private (detach_event1, detach_event2)
       !$omp single