diff mbox

[libgomp] Fix deadlock in acc_set_device_type

Message ID 56F8FD46.1060400@codesourcery.com
State New
Headers show

Commit Message

Chung-Lin Tang March 28, 2016, 9:45 a.m. UTC
Hi Jakub, there's a path for deadlock on acc_device_lock when going
through the acc_set_device_type() OpenACC library function.
Basically, the gomp_init_targets_once() function should not be
called with that held. The attached patch moves it appropriately.

Also in this patch, there are several cases in acc_* functions
where gomp_init_targets_once() is guarded by a test of
!cached_base_dev. Since that function already uses pthread_once() to
call gomp_target_init(), and technically cached_base_dev
is protected by acc_device_lock, the cleanest way should be to
simply drop those "if(!cached_base_dev)" tests.

Tested libgomp without regressions on an nvptx offloaded system,
is this okay for trunk?

Thanks,
Chung-Lin

2016-03-28  Chung-Lin Tang  <cltang@codesourcery.com>

        * oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
        gomp_init_targets_once().
        (acc_set_device_type): Remove !cached_base_dev condition on call to
        gomp_init_targets_once(), move call to before acc_device_lock acquire,
        to avoid deadlock.
        (acc_get_device_num): Remove !cached_base_dev condition on call to
        gomp_init_targets_once().
        (acc_set_device_num): Likewise.

Comments

Chung-Lin Tang April 16, 2016, 7:39 a.m. UTC | #1
Ping.

On 2016/3/28 05:45 PM, Chung-Lin Tang wrote:
> Hi Jakub, there's a path for deadlock on acc_device_lock when going
> through the acc_set_device_type() OpenACC library function.
> Basically, the gomp_init_targets_once() function should not be
> called with that held. The attached patch moves it appropriately.
> 
> Also in this patch, there are several cases in acc_* functions
> where gomp_init_targets_once() is guarded by a test of
> !cached_base_dev. Since that function already uses pthread_once() to
> call gomp_target_init(), and technically cached_base_dev
> is protected by acc_device_lock, the cleanest way should be to
> simply drop those "if(!cached_base_dev)" tests.
> 
> Tested libgomp without regressions on an nvptx offloaded system,
> is this okay for trunk?
> 
> Thanks,
> Chung-Lin
> 
> 2016-03-28  Chung-Lin Tang  <cltang@codesourcery.com>
> 
>         * oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
>         gomp_init_targets_once().
>         (acc_set_device_type): Remove !cached_base_dev condition on call to
>         gomp_init_targets_once(), move call to before acc_device_lock acquire,
>         to avoid deadlock.
>         (acc_get_device_num): Remove !cached_base_dev condition on call to
>         gomp_init_targets_once().
>         (acc_set_device_num): Likewise.
>
Chung-Lin Tang April 19, 2016, 2:30 p.m. UTC | #2
Ping x2.

Hi Jakub,
This patch is fairly straightforward, and solves a easily encountered
deadlock. Please approve for trunk and gcc-6-branch.

Thanks,
Chung-Lin

On 2016/4/16 03:39 PM, Chung-Lin Tang wrote:
> Ping.
> 
> On 2016/3/28 05:45 PM, Chung-Lin Tang wrote:
>> Hi Jakub, there's a path for deadlock on acc_device_lock when going
>> through the acc_set_device_type() OpenACC library function.
>> Basically, the gomp_init_targets_once() function should not be
>> called with that held. The attached patch moves it appropriately.
>>
>> Also in this patch, there are several cases in acc_* functions
>> where gomp_init_targets_once() is guarded by a test of
>> !cached_base_dev. Since that function already uses pthread_once() to
>> call gomp_target_init(), and technically cached_base_dev
>> is protected by acc_device_lock, the cleanest way should be to
>> simply drop those "if(!cached_base_dev)" tests.
>>
>> Tested libgomp without regressions on an nvptx offloaded system,
>> is this okay for trunk?
>>
>> Thanks,
>> Chung-Lin
>>
>> 2016-03-28  Chung-Lin Tang  <cltang@codesourcery.com>
>>
>>         * oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
>>         gomp_init_targets_once().
>>         (acc_set_device_type): Remove !cached_base_dev condition on call to
>>         gomp_init_targets_once(), move call to before acc_device_lock acquire,
>>         to avoid deadlock.
>>         (acc_get_device_num): Remove !cached_base_dev condition on call to
>>         gomp_init_targets_once().
>>         (acc_set_device_num): Likewise.
>>
>
Chung-Lin Tang May 11, 2016, 6:48 a.m. UTC | #3
Ping x3

On 2016/4/19 10:30 PM, Chung-Lin Tang wrote:
> Ping x2.
> 
> Hi Jakub,
> This patch is fairly straightforward, and solves a easily encountered
> deadlock. Please approve for trunk and gcc-6-branch.
> 
> Thanks,
> Chung-Lin
> 
> On 2016/4/16 03:39 PM, Chung-Lin Tang wrote:
>> Ping.
>>
>> On 2016/3/28 05:45 PM, Chung-Lin Tang wrote:
>>> Hi Jakub, there's a path for deadlock on acc_device_lock when going
>>> through the acc_set_device_type() OpenACC library function.
>>> Basically, the gomp_init_targets_once() function should not be
>>> called with that held. The attached patch moves it appropriately.
>>>
>>> Also in this patch, there are several cases in acc_* functions
>>> where gomp_init_targets_once() is guarded by a test of
>>> !cached_base_dev. Since that function already uses pthread_once() to
>>> call gomp_target_init(), and technically cached_base_dev
>>> is protected by acc_device_lock, the cleanest way should be to
>>> simply drop those "if(!cached_base_dev)" tests.
>>>
>>> Tested libgomp without regressions on an nvptx offloaded system,
>>> is this okay for trunk?
>>>
>>> Thanks,
>>> Chung-Lin
>>>
>>> 2016-03-28  Chung-Lin Tang  <cltang@codesourcery.com>
>>>
>>>         * oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
>>>         gomp_init_targets_once().
>>>         (acc_set_device_type): Remove !cached_base_dev condition on call to
>>>         gomp_init_targets_once(), move call to before acc_device_lock acquire,
>>>         to avoid deadlock.
>>>         (acc_get_device_num): Remove !cached_base_dev condition on call to
>>>         gomp_init_targets_once().
>>>         (acc_set_device_num): Likewise.
>>>
>>
>
Jakub Jelinek May 12, 2016, 10:06 a.m. UTC | #4
On Mon, Mar 28, 2016 at 05:45:42PM +0800, Chung-Lin Tang wrote:
> Hi Jakub, there's a path for deadlock on acc_device_lock when going
> through the acc_set_device_type() OpenACC library function.
> Basically, the gomp_init_targets_once() function should not be
> called with that held. The attached patch moves it appropriately.
> 
> Also in this patch, there are several cases in acc_* functions
> where gomp_init_targets_once() is guarded by a test of
> !cached_base_dev. Since that function already uses pthread_once() to
> call gomp_target_init(), and technically cached_base_dev
> is protected by acc_device_lock, the cleanest way should be to
> simply drop those "if(!cached_base_dev)" tests.
> 
> Tested libgomp without regressions on an nvptx offloaded system,
> is this okay for trunk?

Ok, with ChangeLog nits:
> 
> 2016-03-28  Chung-Lin Tang  <cltang@codesourcery.com>
> 
>         * oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
>         gomp_init_targets_once().
>         (acc_set_device_type): Remove !cached_base_dev condition on call to
>         gomp_init_targets_once(), move call to before acc_device_lock acquire,
>         to avoid deadlock.
>         (acc_get_device_num): Remove !cached_base_dev condition on call to
>         gomp_init_targets_once().
>         (acc_set_device_num): Likewise.

Please just use gomp_init_targets_once instead of gomp_init_targets_once()
in the ChangeLog.

> Index: oacc-init.c
> ===================================================================
> --- oacc-init.c	(revision 234502)
> +++ oacc-init.c	(working copy)
> @@ -433,8 +433,7 @@ goacc_attach_host_thread_to_device (int ord)
>  void
>  acc_init (acc_device_t d)
>  {
> -  if (!cached_base_dev)
> -    gomp_init_targets_once ();
> +  gomp_init_targets_once ();
>  
>    gomp_mutex_lock (&acc_device_lock);
>  
> @@ -498,11 +497,10 @@ acc_set_device_type (acc_device_t d)
>    struct gomp_device_descr *base_dev, *acc_dev;
>    struct goacc_thread *thr = goacc_thread ();
>  
> +  gomp_init_targets_once ();
> +
>    gomp_mutex_lock (&acc_device_lock);
>  
> -  if (!cached_base_dev)
> -    gomp_init_targets_once ();
> -
>    cached_base_dev = base_dev = resolve_device (d, true);
>    acc_dev = &base_dev[goacc_device_num];
>  
> @@ -563,8 +561,7 @@ acc_get_device_num (acc_device_t d)
>    if (d >= _ACC_device_hwm)
>      gomp_fatal ("unknown device type %u", (unsigned) d);
>  
> -  if (!cached_base_dev)
> -    gomp_init_targets_once ();
> +  gomp_init_targets_once ();
>  
>    gomp_mutex_lock (&acc_device_lock);
>    dev = resolve_device (d, true);
> @@ -584,8 +581,7 @@ acc_set_device_num (int ord, acc_device_t d)
>    struct gomp_device_descr *base_dev, *acc_dev;
>    int num_devices;
>  
> -  if (!cached_base_dev)
> -    gomp_init_targets_once ();
> +  gomp_init_targets_once ();
>  
>    if (ord < 0)
>      ord = goacc_device_num;


	Jakub
diff mbox

Patch

Index: oacc-init.c
===================================================================
--- oacc-init.c	(revision 234502)
+++ oacc-init.c	(working copy)
@@ -433,8 +433,7 @@  goacc_attach_host_thread_to_device (int ord)
 void
 acc_init (acc_device_t d)
 {
-  if (!cached_base_dev)
-    gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   gomp_mutex_lock (&acc_device_lock);
 
@@ -498,11 +497,10 @@  acc_set_device_type (acc_device_t d)
   struct gomp_device_descr *base_dev, *acc_dev;
   struct goacc_thread *thr = goacc_thread ();
 
+  gomp_init_targets_once ();
+
   gomp_mutex_lock (&acc_device_lock);
 
-  if (!cached_base_dev)
-    gomp_init_targets_once ();
-
   cached_base_dev = base_dev = resolve_device (d, true);
   acc_dev = &base_dev[goacc_device_num];
 
@@ -563,8 +561,7 @@  acc_get_device_num (acc_device_t d)
   if (d >= _ACC_device_hwm)
     gomp_fatal ("unknown device type %u", (unsigned) d);
 
-  if (!cached_base_dev)
-    gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   gomp_mutex_lock (&acc_device_lock);
   dev = resolve_device (d, true);
@@ -584,8 +581,7 @@  acc_set_device_num (int ord, acc_device_t d)
   struct gomp_device_descr *base_dev, *acc_dev;
   int num_devices;
 
-  if (!cached_base_dev)
-    gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   if (ord < 0)
     ord = goacc_device_num;