diff mbox series

[og9] Really fix og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof"

Message ID yxfpmu83hxno.fsf@hertz.schwinge.homeip.net
State New
Headers show
Series [og9] Really fix og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof" | expand

Commit Message

Thomas Schwinge March 26, 2020, 4:46 p.m. UTC
Hi!

On 2020-03-25T18:09:25+0100, I wrote:
> On 2018-02-22T12:23:25+0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
>> when using cuda 9 nvprof with an openacc executable, the executable hangs.

> What Frederik has discovered today in the hard way... is that the og9
> version of this patch did get its code altered in a way so that it no
> longer resolves the problem it's meant to resolve -- the hang was back.
> On Git-mirror-based openacc-gcc-9-branch that's:
>
>     commit 84af3c5a2fbb5023057e2ca319b0c22f5f7d4795
>     Author:     Julian Brown <julian@codesourcery.com>
>     AuthorDate: Tue Feb 26 16:00:54 2019 -0800
>     Commit:     Kwok Cheung Yeung <kcy@codesourcery.com>
>     CommitDate: Fri May 31 13:40:07 2019 -0700
>
>         Fix hang when running oacc exec with CUDA 9.0 nvprof
>
>         2018-09-20  Tom de Vries  <tdevries@suse.de>
>                     Cesar Philippidis  <cesar@codesourcery.com>
>
>                 libgomp/
>                 [...]
>
> ..., which got cherry-picked (automated, without any review) into current
> devel/omp/gcc-9 in commit f752d880a5abc591a25ad22fb892363f6520bcf1.

OK, I had confused myself here.  I wrongly blamed that commit to be
responsible for the hang being back, when in fact it's only the later og9
"OpenACC Profiling Interface (incomplete)" backport from trunk that
introduced the problem.  On Git-mirror-based openacc-gcc-9-branch that's:

    commit 1246da4f164bcf2ec4430b89686a38c47e55b5f9
    Author:     tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
    AuthorDate: Fri May 17 19:13:36 2019 +0000
    Commit:     Kwok Cheung Yeung <kcy@codesourcery.com>
    CommitDate: Fri Jul 26 14:32:02 2019 -0700

        OpenACC Profiling Interface (incomplete)

                libgomp/
                [...]

..., which got cherry-picked (automated, without any review) into current
devel/omp/gcc-9 in commit 9342531a7fc9f6e368e37bbd4ea9f4d01f43514e.

The confusing thing was that the og9 "Fix hang when running oacc exec
with CUDA 9.0 nvprof" commit appears *before* the og9 "OpenACC Profiling
Interface (incomplete)" backport that it relates to.

And, in addition to that, I pushed the wrong (incomplete) version of my
fix.

> Of course, it would've helped tremendously had the original og7 commit
> included a test case...  :'-/ (... by simply reproducing the nested calls
> that CUDA 9 nvprof seems to be doing.)
>
> Still without a test case, for now I have pushed the attached patch to
> devel/omp/gcc-9 in commit 9ae129017c7fc1fa638d6beedd3802b515ca692b 'Fix
> og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof"'.

..., and now the attached patch to devel/omp/gcc-9 in commit
775f1686a3df68bd20370f1fabc6273883e2c5d2 'Really fix og9 "Fix hang when
running oacc exec with CUDA 9.0 nvprof"'.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

Comments

Frederik Harwath March 27, 2020, 7:06 a.m. UTC | #1
Hi Thomas,

Thomas Schwinge <thomas@codesourcery.com> writes:

> On 2020-03-25T18:09:25+0100, I wrote:
>> On 2018-02-22T12:23:25+0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> when using cuda 9 nvprof with an openacc executable, the executable hangs.
>
>> What Frederik has discovered today in the hard way... [...]
>> -- the hang was back. [...]
> ..., and now the attached patch to devel/omp/gcc-9 in commit
> 775f1686a3df68bd20370f1fabc6273883e2c5d2 'Really fix og9 "Fix hang when
> running oacc exec with CUDA 9.0 nvprof"'.

Thanks for fixing this issue! I can confirm that nvprof now works on
code compiled from devel/omp/gcc-9. I have used nvprof 9.1.85 on Ubuntu
18.04 for testing.

Best regards,
Frederik
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
diff mbox series

Patch

From 775f1686a3df68bd20370f1fabc6273883e2c5d2 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 26 Mar 2020 17:34:01 +0100
Subject: [PATCH] Really fix og9 "Fix hang when running oacc exec with CUDA 9.0
 nvprof"

In my yesterday's commit 9ae129017c7fc1fa638d6beedd3802b515ca692b 'Fix og9 "Fix
hang when running oacc exec with CUDA 9.0 nvprof"', I wrongly blamed the og9
"Fix hang when running oacc exec with CUDA 9.0 nvprof" to be responsible for
the hang being back, when in fact it's only the later og9 "OpenACC Profiling
Interface (incomplete)" backport from trunk that introduced the problem.

The confusing thing was that the og9 "Fix hang when running oacc exec with CUDA
9.0 nvprof" commit appears *before* the og9 "OpenACC Profiling Interface
(incomplete)" backport that it relates to.

And, in addition to that, I pushed the wrong (incomplete) version of my fix.

	libgomp/
	* oacc-init.c (acc_init_1): Move other 'acc_init_state' logic to
	where it belongs.
---
 libgomp/ChangeLog.omp |  5 +++++
 libgomp/oacc-init.c   | 12 ++++++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 75c45917998..922e00fbff5 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,8 @@ 
+2020-03-26  Thomas Schwinge <thomas@codesourcery.com>
+
+	* oacc-init.c (acc_init_1): Move other 'acc_init_state' logic to
+	where it belongs.
+
 2020-03-25  Thomas Schwinge <thomas@codesourcery.com>
 
 	* oacc-init.c (acc_init_1): Move 'acc_init_state' logic to where
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index 765fa2f3b95..40c14fa9bf2 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -317,10 +317,6 @@  acc_init_1 (acc_device_t d, acc_construct_t parent_construct, int implicit)
   gomp_init_device (acc_dev);
   gomp_mutex_unlock (&acc_dev->lock);
 
-  gomp_mutex_lock (&acc_init_state_lock);
-  acc_init_state = initialized;
-  gomp_mutex_unlock (&acc_init_state_lock);
-
   if (profiling_p)
     {
       prof_info.event_type = acc_ev_device_init_end;
@@ -329,6 +325,14 @@  acc_init_1 (acc_device_t d, acc_construct_t parent_construct, int implicit)
 				&api_info);
     }
 
+  /* We're setting 'initialized' *after* 'goacc_profiling_dispatch', so that a
+     nested 'acc_get_device_type' called from a profiling callback still sees
+     'initializing', so that we don't deadlock when it then again tries to lock
+     'goacc_prof_lock'.  See also the discussion in 'acc_get_device_type'.  */
+  gomp_mutex_lock (&acc_init_state_lock);
+  acc_init_state = initialized;
+  gomp_mutex_unlock (&acc_init_state_lock);
+
   return base_dev;
 }
 
-- 
2.17.1