diff mbox

libgomp: Guard all devices/num_devices/num_devices_openmp access by register_lock (was: libgomp: Guard all offload_images/num_offload_images access by register_lock)

Message ID 87d1x3udrd.fsf@kepler.schwinge.homeip.net
State New
Headers show

Commit Message

Thomas Schwinge Sept. 28, 2015, 8:52 a.m. UTC
Hi!

On Fri, 25 Sep 2015 19:49:50 +0300, Ilya Verbin <iverbin@gmail.com> wrote:
> On Fri, Sep 25, 2015 at 18:21:27 +0200, Thomas Schwinge wrote:
> > On Thu, 26 Mar 2015 23:41:30 +0300, Ilya Verbin <iverbin@gmail.com> wrote:
> > > On Thu, Mar 26, 2015 at 13:09:19 +0100, Jakub Jelinek wrote:
> > > > the current code is majorly broken.  As I've said earlier, e.g. the lack
> > > > of mutex guarding gomp_target_init (which is using pthread_once guaranteed
> > > > to be run just once) vs. concurrent GOMP_offload_register calls
> > > > (if those are run from ctors, then I guess something like dl_load_lock
> > > > ensures at least on glibc that multiple GOMP_offload_register calls aren't
> > > > performed at the same time) in accessing/reallocating offload_images
> > > > and num_offload_images and the lack of support to register further
> > > > images after the gomp_target_init call (if you dlopen further shared
> > > > libraries) is really bad.  And it would be really nice to support the
> > > > unloading.
> > 
> > > Here is the latest patch for libgomp and mic plugin.
> > 
> > What about the scenario where one thread is inside
> > GOMP_offload_register_ver/GOMP_offload_register (say, due to opening a
> > shared library with such a mkoffload-generated constructor) and is
> > modifying offload_images with register_lock held, and another thread is
> > inside a GOMP_target* construct -> gomp_init_device and is accessing
> > offload_images without register_lock held?  Or, why isn't that a
> > reachable scenario?
> > 
> > Would the following patch (untested) do the right thing (locking added to
> > gomp_init_device and gomp_unload_device)?  We can then also remove the
> > is_register_lock parameter from gomp_load_image_to_device, and simplify
> > the code.
> 
> Looks like you're right, and this scenario is possible.

Thanks for your review!  Jakub, OK to commit the patch I had posted?


Then, in context of a similar scenario, I think we'll also want the
following.  Please confirm that my reasoning in gomp_get_num_devices and
resolve_device is correct.  OK for trunk?

commit b0cf4dcc588e432c0a0d19d85727a20210b4d837
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Sat Sep 26 15:48:09 2015 +0200

    libgomp: Guard all devices/num_devices/num_devices_openmp access by register_lock
---
 libgomp/target.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)



Grüße,
 Thomas

Comments

Thomas Schwinge Oct. 6, 2015, 11:48 a.m. UTC | #1
Hi!

On Mon, 28 Sep 2015 10:52:38 +0200, I wrote:
> On Fri, 25 Sep 2015 19:49:50 +0300, Ilya Verbin <iverbin@gmail.com> wrote:
> > On Fri, Sep 25, 2015 at 18:21:27 +0200, Thomas Schwinge wrote:
> > > On Thu, 26 Mar 2015 23:41:30 +0300, Ilya Verbin <iverbin@gmail.com> wrote:
> > > > On Thu, Mar 26, 2015 at 13:09:19 +0100, Jakub Jelinek wrote:
> > > > > the current code is majorly broken.  As I've said earlier, e.g. the lack
> > > > > of mutex guarding gomp_target_init (which is using pthread_once guaranteed
> > > > > to be run just once) vs. concurrent GOMP_offload_register calls
> > > > > (if those are run from ctors, then I guess something like dl_load_lock
> > > > > ensures at least on glibc that multiple GOMP_offload_register calls aren't
> > > > > performed at the same time) in accessing/reallocating offload_images
> > > > > and num_offload_images and the lack of support to register further
> > > > > images after the gomp_target_init call (if you dlopen further shared
> > > > > libraries) is really bad.  And it would be really nice to support the
> > > > > unloading.
> > > 
> > > > Here is the latest patch for libgomp and mic plugin.
> > > 
> > > What about the scenario where one thread is inside
> > > GOMP_offload_register_ver/GOMP_offload_register (say, due to opening a
> > > shared library with such a mkoffload-generated constructor) and is
> > > modifying offload_images with register_lock held, and another thread is
> > > inside a GOMP_target* construct -> gomp_init_device and is accessing
> > > offload_images without register_lock held?  Or, why isn't that a
> > > reachable scenario?
> > > 
> > > Would the following patch (untested) do the right thing (locking added to
> > > gomp_init_device and gomp_unload_device)?  We can then also remove the
> > > is_register_lock parameter from gomp_load_image_to_device, and simplify
> > > the code.
> > 
> > Looks like you're right, and this scenario is possible.
> 
> Thanks for your review!  Jakub, OK to commit the patch I had posted?

Ping.


And, likewise, ping for the following:

> Then, in context of a similar scenario, I think we'll also want the
> following.  Please confirm that my reasoning in gomp_get_num_devices and
> resolve_device is correct.  OK for trunk?
> 
> commit b0cf4dcc588e432c0a0d19d85727a20210b4d837
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Sat Sep 26 15:48:09 2015 +0200
> 
>     libgomp: Guard all devices/num_devices/num_devices_openmp access by register_lock
> ---
>  libgomp/target.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git libgomp/target.c libgomp/target.c
> index 1fbbe31..6f0a339 100644
> --- libgomp/target.c
> +++ libgomp/target.c
> @@ -49,7 +49,7 @@ static void gomp_target_init (void);
>  /* The whole initialization code for offloading plugins is only run one.  */
>  static pthread_once_t gomp_is_initialized = PTHREAD_ONCE_INIT;
>  
> -/* Mutex for offload image registration.  */
> +/* Mutex for offload targets setup and image registration.  */
>  static gomp_mutex_t register_lock;
>  
>  /* This structure describes an offload image.
> @@ -118,6 +118,8 @@ attribute_hidden int
>  gomp_get_num_devices (void)
>  {
>    gomp_init_targets_once ();
> +  /* As it is immutable once it has been initialized, it's safe to access
> +     num_devices_openmp without register_lock held.  */
>    return num_devices_openmp;
>  }
>  
> @@ -133,6 +135,8 @@ resolve_device (int device_id)
>    if (device_id < 0 || device_id >= gomp_get_num_devices ())
>      return NULL;
>  
> +  /* As it is immutable once it has been initialized, it's safe to access
> +     devices without register_lock held.  */
>    return &devices[device_id];
>  }
>  
> @@ -1228,6 +1232,8 @@ gomp_target_init (void)
>    char *plugin_name;
>    int i, new_num_devices;
>  
> +  gomp_mutex_lock (&register_lock);
> +
>    num_devices = 0;
>    devices = NULL;
>  
> @@ -1317,6 +1323,8 @@ gomp_target_init (void)
>        if (devices[i].capabilities & GOMP_OFFLOAD_CAP_OPENACC_200)
>  	goacc_register (&devices[i]);
>      }
> +
> +  gomp_mutex_unlock (&register_lock);
>  }
>  
>  #else /* PLUGIN_SUPPORT */


Grüße,
 Thomas
Bernd Schmidt Oct. 9, 2015, 11:58 a.m. UTC | #2
On 09/28/2015 10:52 AM, Thomas Schwinge wrote:
> On Fri, 25 Sep 2015 19:49:50 +0300, Ilya Verbin <iverbin@gmail.com> wrote:
>>
>> Looks like you're right, and this scenario is possible.
>
> Thanks for your review!  Jakub, OK to commit the patch I had posted?
>
>
> Then, in context of a similar scenario, I think we'll also want the
> following.  Please confirm that my reasoning in gomp_get_num_devices and
> resolve_device is correct.  OK for trunk?

I've looked at that for a while. I don't see anything immediately wrong 
with the patches, but I think it would still be good for Jakub to have a 
look.

One oddity I noticed in target.c is that there are two different 
num_devices variables:

   /* Total number of available devices.  */
   static int num_devices;

   /* Number of GOMP_OFFLOAD_CAP_OPENMP_400 devices.  */
   static int num_devices_openmp;

Confusingly, the get_num_devices function returns num_devices_openmp. 
That function includes a pthread_once call to gomp_target_init, which 
sets up these variables. References to num_devices_openmp through 
get_num_devices are thereforce guaranteed to be initialized. However, 
there are direct references to num_devices, in GOMP_offload_register_ver 
and GOMP_offload_unregister_ver, and they don't seem to enforce any kind 
of initialization:

   /* Load image to all initialized devices.  */
   for (i = 0; i < num_devices; i++)
     {
       struct gomp_device_descr *devicep = &devices[i];
       gomp_mutex_lock (&devicep->lock);
       if (devicep->type == target_type && devicep->is_initialized)
         gomp_load_image_to_device (devicep, version,
                                    host_table, target_data, true);
       gomp_mutex_unlock (&devicep->lock);
     }

I'm guessing this only triggers when dlopening something with an offload 
image after devices have been initialized already, and it looks like we 
have symmetrical code in gomp_init_device. Wouldn't it be 
possible/better to force a gomp_target_init before referencing 
num_devices, and then relying on the code I quoted and deleting the 
image loading from gomp_init_device?


Bernd
Ilya Verbin Oct. 9, 2015, 2:39 p.m. UTC | #3
On Fri, Oct 09, 2015 at 13:58:32 +0200, Bernd Schmidt wrote:
> One oddity I noticed in target.c is that there are two different num_devices
> variables:
> 
>   /* Total number of available devices.  */
>   static int num_devices;
> 
>   /* Number of GOMP_OFFLOAD_CAP_OPENMP_400 devices.  */
>   static int num_devices_openmp;
> 
> Confusingly, the get_num_devices function returns num_devices_openmp. That
> function includes a pthread_once call to gomp_target_init, which sets up
> these variables. References to num_devices_openmp through get_num_devices
> are thereforce guaranteed to be initialized. However, there are direct
> references to num_devices, in GOMP_offload_register_ver and
> GOMP_offload_unregister_ver, and they don't seem to enforce any kind of
> initialization:
> 
>   /* Load image to all initialized devices.  */
>   for (i = 0; i < num_devices; i++)
>     {
>       struct gomp_device_descr *devicep = &devices[i];
>       gomp_mutex_lock (&devicep->lock);
>       if (devicep->type == target_type && devicep->is_initialized)
>         gomp_load_image_to_device (devicep, version,
>                                    host_table, target_data, true);
>       gomp_mutex_unlock (&devicep->lock);
>     }
> 
> I'm guessing this only triggers when dlopening something with an offload
> image after devices have been initialized already, and it looks like we have
> symmetrical code in gomp_init_device.

Right, this code offloads given image to all initialized devices, and similar
code in gomp_init_device offloads all registered images to a given device.

> Wouldn't it be possible/better to
> force a gomp_target_init before referencing num_devices, and then relying on
> the code I quoted and deleting the image loading from gomp_init_device?

gomp_target_init only loads plugins and sets num_devices/num_devices_openmp, but
it doesn't call gomp_init_device, because we wanted to defer device
initialization as much as possible.  So, gomp_init_device is called immediately
before usage of that device.

  -- Ilya
diff mbox

Patch

diff --git libgomp/target.c libgomp/target.c
index 1fbbe31..6f0a339 100644
--- libgomp/target.c
+++ libgomp/target.c
@@ -49,7 +49,7 @@  static void gomp_target_init (void);
 /* The whole initialization code for offloading plugins is only run one.  */
 static pthread_once_t gomp_is_initialized = PTHREAD_ONCE_INIT;
 
-/* Mutex for offload image registration.  */
+/* Mutex for offload targets setup and image registration.  */
 static gomp_mutex_t register_lock;
 
 /* This structure describes an offload image.
@@ -118,6 +118,8 @@  attribute_hidden int
 gomp_get_num_devices (void)
 {
   gomp_init_targets_once ();
+  /* As it is immutable once it has been initialized, it's safe to access
+     num_devices_openmp without register_lock held.  */
   return num_devices_openmp;
 }
 
@@ -133,6 +135,8 @@  resolve_device (int device_id)
   if (device_id < 0 || device_id >= gomp_get_num_devices ())
     return NULL;
 
+  /* As it is immutable once it has been initialized, it's safe to access
+     devices without register_lock held.  */
   return &devices[device_id];
 }
 
@@ -1228,6 +1232,8 @@  gomp_target_init (void)
   char *plugin_name;
   int i, new_num_devices;
 
+  gomp_mutex_lock (&register_lock);
+
   num_devices = 0;
   devices = NULL;
 
@@ -1317,6 +1323,8 @@  gomp_target_init (void)
       if (devices[i].capabilities & GOMP_OFFLOAD_CAP_OPENACC_200)
 	goacc_register (&devices[i]);
     }
+
+  gomp_mutex_unlock (&register_lock);
 }
 
 #else /* PLUGIN_SUPPORT */