Allow building GCC with PTX offloading even without CUDA being installed (gcc and nvptx-tools patches)

Hi!

On Fri, 13 Jan 2017 19:11:23 +0100, Jakub Jelinek <jakub@redhat.com> wrote:
> This is something that has been discussed already during the last Cauldron.
> Especially for distributions it is undesirable to need to have proprietary
> CUDA libraries and headers installed when building GCC.

ACK.

> These two patches allow building GCC without CUDA around in a way that later
> on can offload to PTX if libcuda.so.1 is installed.

Thanks!

I'd like to have some additional changes done; see the attached patch,
and also some further comments below.

> In order to configure gcc to load libcuda.so.1 dynamically,
> one has to either configure it --without-cuda-driver, or without
> --with-cuda-driver=/--with-cuda-driver-lib=/--with-cuda-driver-include=
> options if cuda.h and -lcuda aren't found in the default locations.

Would be good to have that documented ;-) -- done.

> The nvptx-tools change

(I'll get to that later.)

> --- libgomp/plugin/configfrag.ac.jj	2017-01-13 12:07:56.000000000 +0100
> +++ libgomp/plugin/configfrag.ac	2017-01-13 17:33:26.608240936 +0100
> @@ -58,10 +58,12 @@ AC_ARG_WITH(cuda-driver-include,
>  AC_ARG_WITH(cuda-driver-lib,
>  	[AS_HELP_STRING([--with-cuda-driver-lib=PATH],
>  		[specify directory for the installed CUDA driver library])])
> -if test "x$with_cuda_driver" != x; then
> -  CUDA_DRIVER_INCLUDE=$with_cuda_driver/include
> -  CUDA_DRIVER_LIB=$with_cuda_driver/lib
> -fi
> +case "x$with_cuda_driver" in
> +  x | xno) ;;
> +  *) CUDA_DRIVER_INCLUDE=$with_cuda_driver/include
> +     CUDA_DRIVER_LIB=$with_cuda_driver/lib
> +     ;;
> +esac

I (obviously) agree with your intended (?) "--without-cuda-driver"
semantics, but I think a "--with-cuda-driver" option should actually mean
that the system's/installed CUDA driver package *must* be used (and
similar for other "--with-cuda-driver*" options); and I also added
"--with-cuda-driver=check" to allow overriding earlier such options (that
is, restore the default "check" behavior).

I say 'intended (?) "--without-cuda-driver" semantics' because, if I read
your current patch correctly, specifying "--without-cuda-driver" on a
system that does have a CUDA driver installation available still makes
the nvptx libgomp plugin link against that installation instead of
"dlopen"ing it.  So I changed that accordingly.

> +PLUGIN_NVPTX_DYNAMIC=0

I find the name "PLUGIN_NVPTX_DYNAMIC" a bit misleading, as this isn't
about the nvptx plugin being "dynamic" but rather it's about its usage of
the CUDA driver library.  Thus renamed to "CUDA_DRIVER_DYNAMIC".
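
To illustrate what that macro is meant to express, here is a minimal
sketch of one way a single call site can work in both configurations.
Everything in it is an assumption for illustration: "fake_cuInit" is a
stand-in rather than the real driver entry point, and "DRIVER_CALL" and
the "cuda_lib" table layout are hypothetical names, not necessarily what
the plugin uses.

```c
/* Assumption: configure defines CUDA_DRIVER_DYNAMIC to 0 or 1 in config.h.
   Default to 1 here so the sketch compiles stand-alone.  */
#ifndef CUDA_DRIVER_DYNAMIC
# define CUDA_DRIVER_DYNAMIC 1
#endif

/* Stand-in for a driver entry point; hypothetical, not the real cuInit.  */
static int
fake_cuInit (unsigned flags)
{
  return (int) flags;
}

#if CUDA_DRIVER_DYNAMIC
/* Table of function pointers that dlsym would normally fill in; it is
   initialized statically here to keep the sketch self-contained.  */
static struct { int (*fake_cuInit) (unsigned); } cuda_lib = { fake_cuInit };
# define CUDA_CALL_PREFIX cuda_lib.
#else
/* Linked directly against the driver library: call the symbol itself.  */
# define CUDA_CALL_PREFIX
#endif

/* Call sites stay identical in both configurations.  */
#define DRIVER_CALL(fn, ...) CUDA_CALL_PREFIX fn (__VA_ARGS__)
```

With that, "DRIVER_CALL (fake_cuInit, 0)" goes through the pointer table
when CUDA_DRIVER_DYNAMIC is 1 and is a plain function call otherwise, so
the macro really describes how the driver library is reached, not the
plugin itself.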

> @@ -167,9 +170,17 @@ if test x"$enable_offload_targets" != x;
>  	LIBS=$PLUGIN_NVPTX_save_LIBS
>  	case $PLUGIN_NVPTX in
>  	  nvptx*)
> -	    PLUGIN_NVPTX=0
> -	    AC_MSG_ERROR([CUDA driver package required for nvptx support])
> -	    ;;
> +	    if test "x$CUDA_DRIVER_INCLUDE" = x \
> +	       && test "x$CUDA_DRIVER_LIB" = x; then
> +	      PLUGIN_NVPTX=1
> +	      PLUGIN_NVPTX_CPPFLAGS='-I$(srcdir)/plugin/cuda'
> +	      PLUGIN_NVPTX_LIBS='-ldl'
> +	      PLUGIN_NVPTX_DYNAMIC=1
> +	    else
> +	      PLUGIN_NVPTX=0
> +	      AC_MSG_ERROR([CUDA driver package required for nvptx support])
> +	    fi
> +	  ;;
>  	esac

I reworked that logic to accommodate the additional
"--with-cuda-driver" usage.

> --- libgomp/plugin/plugin-nvptx.c.jj	2017-01-13 12:07:56.000000000 +0100
> +++ libgomp/plugin/plugin-nvptx.c	2017-01-13 18:00:39.693284346 +0100

> +/* -1 if init_cuda_lib has not been called yet, false
> +   if it has been and failed, true if it has been and succeeded.  */
> +static char cuda_lib_inited = -1;

Don't we actually have to worry here about multiple threads running into
this in parallel, and thus need locking (or atomic accesses?) when
accessing "cuda_lib_inited"?
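
One possible shape for this, as a sketch only: let pthread_once run the
initialization, so that concurrent callers wait until it has completed
and then all see the same result.  The names "cuda_lib_once",
"cuda_lib_ok", "cuda_lib_handle" and "init_cuda_lib_threadsafe" are
hypothetical, and the symbol lookups are left out.

```c
#include <dlfcn.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not the plugin's actual code.  */
static pthread_once_t cuda_lib_once = PTHREAD_ONCE_INIT;
static bool cuda_lib_ok;
static void *cuda_lib_handle;

/* Runs at most once, even if several threads race to initialize.  */
static void
init_cuda_lib_once (void)
{
  cuda_lib_handle = dlopen ("libcuda.so.1", RTLD_LAZY);
  cuda_lib_ok = (cuda_lib_handle != NULL);
  /* The dlsym lookups would follow here.  */
}

/* Returns true if the library was loaded, false otherwise.  Callers that
   arrive while initialization is in progress block until it is done.  */
static bool
init_cuda_lib_threadsafe (void)
{
  pthread_once (&cuda_lib_once, init_cuda_lib_once);
  return cuda_lib_ok;
}
```

This replaces the three-state "char" with a once-control plus a plain
"bool".  Whether that is preferable to taking an existing lock around
the current code is a separate question.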

> +/* Dynamically load the CUDA runtime library and initialize function

Not "CUDA runtime" but actually "CUDA driver" -- changed.

> +   pointers, return false if unsuccessful, true if successful.  */
> +static bool
> +init_cuda_lib (void)
> +{
> +  if (cuda_lib_inited != -1)
> +    return cuda_lib_inited;
> +  const char *cuda_runtime_lib = "libcuda.so.1";
> +  void *h = dlopen (cuda_runtime_lib, RTLD_LAZY);
> +  cuda_lib_inited = false;
> +  if (h == NULL)
> +    return false;

I'd like some GOMP_PLUGIN_debug output for this and the following "return
false" cases -- added.

> +# undef CUDA_ONE_CALL
> +# define CUDA_ONE_CALL(call) CUDA_ONE_CALL_1 (call)
> +# define CUDA_ONE_CALL_1(call) \
> +  cuda_lib.call = dlsym (h, #call);	\
> +  if (cuda_lib.call == NULL)		\
> +    return false;
> +  CUDA_CALLS
> +  cuda_lib_inited = true;
> +  return true;
>  }
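
For readers not familiar with the pattern used here, the following is a
stand-alone sketch of the same "list of calls" macro technique, including
a diagnostic on each failure path.  To keep it runnable without a CUDA
installation it resolves two libm functions instead of driver entry
points: "libm.so.6" is an assumption about the host system, "MATH_CALLS",
"MATH_ONE_CALL" and "init_math_lib" are hypothetical names, and fprintf
stands in for GOMP_PLUGIN_debug.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stdio.h>

/* List of symbols to resolve; libm stands in for libcuda.so.1.  */
#define MATH_CALLS \
  MATH_ONE_CALL (cos) \
  MATH_ONE_CALL (sin)

/* First expansion: declare one function pointer per listed symbol.  */
#define MATH_ONE_CALL(call) double (*call) (double);
static struct { MATH_CALLS } math_lib;
#undef MATH_ONE_CALL

/* Load the library and fill in the table; report why on failure.  */
static bool
init_math_lib (void)
{
  void *h = dlopen ("libm.so.6", RTLD_LAZY);
  if (h == NULL)
    {
      fprintf (stderr, "dlopen failed: %s\n", dlerror ());
      return false;
    }
  /* Second expansion: look up each listed symbol by its name.  */
#define MATH_ONE_CALL(call) \
  math_lib.call = (double (*) (double)) dlsym (h, #call); \
  if (math_lib.call == NULL) \
    { \
      fprintf (stderr, "dlsym (%s) failed: %s\n", #call, dlerror ()); \
      return false; \
    }
  MATH_CALLS
#undef MATH_ONE_CALL
  return true;
}
```

The list is written once and expanded twice, so adding a symbol to it
updates both the pointer table and the lookup code.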

> --- libgomp/plugin/cuda/cuda.h.jj	2017-01-13 15:58:00.966544147 +0100
> +++ libgomp/plugin/cuda/cuda.h	2017-01-13 17:02:47.355817896 +0100

> +#define CUDA_VERSION 8000

Does that make it compatible with CUDA 8.0 (and later) only?  (Not yet
checked.)

(Have not reviewed this new file any further.)

Currently testing the following patch; OK for trunk?

commit 4ef19c27a9567df03f82282b8ae6608c5d88472d
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Sat Jan 21 15:25:44 2017 +0100

    libgomp: Additional "--with-cuda-driver" changes

            gcc/
            * doc/install.texi: Document "--with-cuda-driver" and related
            options.
            libgomp/
            * plugin/plugin-nvptx.c (init_cuda_lib): Add GOMP_PLUGIN_debug
            calls.
            * plugin/configfrag.ac: Document "--with-cuda-driver" and related
            options.  Handle "--with-cuda-driver", "--with-cuda-driver=check",
            and "--without-cuda-driver" options.
            (PLUGIN_NVPTX_DYNAMIC): Rename to...
            (CUDA_DRIVER_DYNAMIC): ... this.  Adjust all users.
            * config.h.in: Regenerate.
            * configure: Likewise.
---
 gcc/doc/install.texi          |  23 +++++++
 libgomp/config.h.in           |   8 +--
 libgomp/configure             | 146 ++++++++++++++++++++++++++++++------------
 libgomp/plugin/configfrag.ac  | 139 +++++++++++++++++++++++++++-------------
 libgomp/plugin/plugin-nvptx.c |  32 +++++----
 5 files changed, 248 insertions(+), 100 deletions(-)

Regards
 Thomas
