Message ID | 20131217113522.GA23078@msticlxl57.ims.intel.com |
---|---|
State | New |
Headers | show |
Ping. On 17 December 2013 15:35, Michael V. Zolotukhin <michael.v.zolotukhin@gmail.com> wrote: > Hi everybody, > > Here is a set of patches implementing one more piece of offloading support in > GCC. These three patches allow to build a host binary with target image and all > tables embedded. Along with patches for libgomp and libgomp plugin, which > hopefully will be sent soon, that gives a functional and runnable executable (or > DSO) with actual offloading to MIC. > > There is still a lot to do in this area, but this is the necessary basics - with > this we could actually run offloaded code, produced fully by compiler. > > We would like to hear any feedback on these patches: what issues we should > address first before commit (if any), how the patches fit OpenACC work, etc. > > Here is a patch 1/3: Add '-fopenmp_target' option This option tells lto1 to look > for "*.target_lto*" sections instead of usual "*.lto*". That option is passed > to target compiler when we invoke it to build target image. > > Thanks, > Michael > > > --- > gcc/lto/lang.opt | 4 ++++ > gcc/lto/lto-object.c | 5 +++-- > gcc/lto/lto.c | 7 ++++++- > 3 files changed, 13 insertions(+), 3 deletions(-) > > diff --git a/gcc/lto/lang.opt b/gcc/lto/lang.opt > index 7a9aede..cd0098c 100644 > --- a/gcc/lto/lang.opt > +++ b/gcc/lto/lang.opt > @@ -40,4 +40,8 @@ fresolution= > LTO Joined > The resolution file > > +fopenmp_target > +LTO Var(flag_openmp_target) > +Run LTO infrastructure to read target-side bytecode and to build it. > + > ; This comment is to ensure we retain the blank line above. > diff --git a/gcc/lto/lto-object.c b/gcc/lto/lto-object.c > index 19f10cc..64274f3 100644 > --- a/gcc/lto/lto-object.c > +++ b/gcc/lto/lto-object.c > @@ -59,6 +59,8 @@ struct lto_simple_object > > static simple_object_attributes *saved_attributes; > > +extern const char *section_name_prefix; > + > /* Initialize FILE, an LTO file object for FILENAME. */ > > static void > @@ -229,8 +231,7 @@ lto_obj_add_section (void *data, const char *name, off_t offset, > void **slot; > struct lto_section_list *list = loasd->list; > > - if (strncmp (name, LTO_SECTION_NAME_PREFIX, > - strlen (LTO_SECTION_NAME_PREFIX)) != 0) > + if (strncmp (name, section_name_prefix, strlen (section_name_prefix))) > return 1; > > new_name = xstrdup (name); > diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c > index 0211437..dedf8a8 100644 > --- a/gcc/lto/lto.c > +++ b/gcc/lto/lto.c > @@ -49,6 +49,8 @@ along with GCC; see the file COPYING3. If not see > #include "context.h" > #include "pass_manager.h" > > +extern const char *section_name_prefix; > + > /* Vector to keep track of external variables we've seen so far. */ > vec<tree, va_gc> *lto_global_var_decls; > > @@ -2081,7 +2083,7 @@ lto_section_with_id (const char *name, unsigned HOST_WIDE_INT *id) > { > const char *s; > > - if (strncmp (name, LTO_SECTION_NAME_PREFIX, strlen (LTO_SECTION_NAME_PREFIX))) > + if (strncmp (name, section_name_prefix, strlen (section_name_prefix))) > return 0; > s = strrchr (name, '.'); > return s && sscanf (s, "." HOST_WIDE_INT_PRINT_HEX_PURE, id) == 1; > @@ -2757,6 +2759,9 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) > > timevar_push (TV_IPA_LTO_DECL_IN); > > + if (flag_openmp_target) > + section_name_prefix = OMP_SECTION_NAME_PREFIX; > + > real_file_decl_data > = decl_data = ggc_alloc_cleared_vec_lto_file_decl_data_ptr (nfiles + 1); > real_file_count = nfiles; > -- > 1.7.1 > > >
On 12/17/2013 12:35 PM, Michael V. Zolotukhin wrote: > Here is a set of patches implementing one more piece of offloading support in > GCC. These three patches allow to build a host binary with target image and all > tables embedded. Along with patches for libgomp and libgomp plugin, which > hopefully will be sent soon, that gives a functional and runnable executable (or > DSO) with actual offloading to MIC. Do you have a testcase that can be used to see what this does in action? Bernd
Hi Bernd, We have some testcases, but they require XeonPhi hardware and a working libgomp plugin. Our current version of the plugin depends on some libraries, that are not open-sourced yet, so currently we can’t share it. However, you could examine what these patches do, making the following steps: 1) Build GCC with patches: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01484.html http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01485.html http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01486.html http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01896.html 2) Set environment variables (e.g. for two ‘targets’): export OFFLOAD_TARGET_NAMES=mic:hsail (for now names don’t really matter) export OFFLOAD_TARGET_COMPILERS=./gcc:./gcc (use GCC with patches above as target compiler, because it must support the -fopenmp_target option) 3) Build any example with #pragma omp target (e.g. see attachment): ./gcc -flto -fopenmp test.c -o test.exe Options -flto and -fopenmp are necessary for using. Now you have a binary with target images embedded and tables properly filled. You can’t run it due to reasons mentioned above, though you could examine it with objdump/nm/readelf to see new sections and their content: there will be .offload_image_section with ‘target’ code and .offload_func_table_section with ‘target’ function table. On Tue, Jan 21, 2014 at 6:40 PM, Bernd Schmidt <bernds@codesourcery.com> wrote: > On 12/17/2013 12:35 PM, Michael V. Zolotukhin wrote: >> >> Here is a set of patches implementing one more piece of offloading support >> in >> GCC. These three patches allow to build a host binary with target image >> and all >> tables embedded. Along with patches for libgomp and libgomp plugin, which >> hopefully will be sent soon, that gives a functional and runnable >> executable (or >> DSO) with actual offloading to MIC. > > > Do you have a testcase that can be used to see what this does in action? > > > Bernd >
On 01/22/2014 11:53 AM, Andrey Turetskiy wrote: > We have some testcases, but they require XeonPhi hardware and a > working libgomp plugin. Our current version of the plugin depends on > some libraries, that are not open-sourced yet, so currently we can’t > share it. > > However, you could examine what these patches do, making the following steps: > 1) Build GCC with patches: > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01484.html > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01485.html > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01486.html > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01896.html > 2) Set environment variables (e.g. for two ‘targets’): > export OFFLOAD_TARGET_NAMES=mic:hsail (for now > names don’t really matter) > export OFFLOAD_TARGET_COMPILERS=./gcc:./gcc (use GCC with > patches above as target compiler, because it must support the > -fopenmp_target option) > 3) Build any example with #pragma omp target (e.g. see attachment): > ./gcc -flto -fopenmp test.c -o test.exe > Options -flto and -fopenmp are necessary for using. > > Now you have a binary with target images embedded and tables properly > filled. You can’t run it due to reasons mentioned above, though you > could examine it with objdump/nm/readelf to see new sections and their > content: there will be .offload_image_section with ‘target’ code and > .offload_func_table_section with ‘target’ function table. I played around with this for a while last week. To have a slightly more realistic scenario where the offload compiler is for a different target, I built an aarch64-linux compiler and used that in OFFLOAD_TARGET_COMPILERS. This exposed some problems. > + /* Run gcc for target. */ > + obstack_init (&argv_obstack); > + obstack_ptr_grow (&argv_obstack, compiler); > + obstack_ptr_grow (&argv_obstack, "-shared"); > + obstack_ptr_grow (&argv_obstack, "-fPIC"); > + obstack_ptr_grow (&argv_obstack, "-xlto"); > + obstack_ptr_grow (&argv_obstack, "-fopenmp_target"); > + obstack_ptr_grow (&argv_obstack, "-o"); > + obstack_ptr_grow (&argv_obstack, target_image_file_name); Since environment variables such as GCC_EXEC_PREFIX and COMPILER_PATH are set at this point, the compiler we're running here won't find the correct lto1 - best case it doesn't find anything, worst case it finds the lto1 for the host compiler and produces an image for the host, not the target (this fails with an arm compiler since the host assembler doesn't understand -meabi=5, but it could silently do the wrong thing with other offload toolchains). Once I worked around this by unsetting the environment variables around this compiler invocation here, the next problem is exposed - the code tries to link together files compiled for the target (created by the code quoted above) and the host (the _omp_descr file, I believe). Linker errors ensue. As mentioned before, I think all this target-specific code has no place in lto-wrapper to begin with. For ptx, we're going to require some quite different mechanisms, so I think it might be best to invoke a new tool, maybe called $target-gen-offload, which knows how to produce an image that can be linked into the host executable. Different offload targets can then use different strategies to produce such an image. Probably each such image should contain its own code to register itself with libgomp, so that we don't have to construct a table. Some other observations: * is OFFLOAD_TARGET_NAMES actually useful, or would any string generated at link time suffice? * Is the user expected to set OFFLOAD_TARGET_COMPILERS, or should this be done by the gcc driver, possibly based on command line options (I'd much prefer that)? * Do we actually need an -fopenmp-target option? The way I imagine it (and which was somewhat present in the Makefile patches I posted last year) is that an offload compiler is specially configured to know that that's how it will be used, and to know what the host architecture is. A $target-gen-offload could then be built with knowledge of the host architecture and installed in the host compiler's libexec install directory. I think I'll need to implement my own set of mechanisms for ptx, since this code doesn't seem suitable for inclusion in its current state. I'll try to take on board some of the ideas I've found here in the hope that we'll converge on something that works for everybody. Bernd
Hi Bernd, 2014/1/27 Bernd Schmidt <bernds@codesourcery.com>: > Once I worked around this by unsetting the environment variables around this > compiler invocation here, the next problem is exposed - the code tries to > link together files compiled for the target (created by the code quoted > above) and the host (the _omp_descr file, I believe). Linker errors ensue. Thanks, that's a bug. Fortunately, it could be fixed easily. > As mentioned before, I think all this target-specific code has no place in > lto-wrapper to begin with. For ptx, we're going to require some quite > different mechanisms, so I think it might be best to invoke a new tool, > maybe called $target-gen-offload, which knows how to produce an image that > can be linked into the host executable. Different offload targets can then > use different strategies to produce such an image. That's quite a viable way. We added all this stuff to these patches to allow other targets to reuse it as much as possible. I.e. we wasn't aware if other targets support objcopy et al., so we proposed our way so that others could reuse it as-is if everything is available. It turned out, that the targets differ much in this place, so as you suggested, it's better to move all this stuff to target-specific utils. Certainly, these patches don't pretend to be a final version - they are just RFC, and we currently only want to show what we need and find out what other targets need. > Probably each such image > should contain its own code to register itself with libgomp, so that we > don't have to construct a table. If other targets use another mapping scheme, then I think these tables could easily be omitted from host/target executables (e.g. we could add a corresponding flag to the target images descriptor). But personally I believe this part is general enough to satisfy all targets. Could you please describe how functions would be invoked on PTX? > Some other observations: > * is OFFLOAD_TARGET_NAMES actually useful, or would any string > generated at link time suffice? Yep, it might be redundant for now, because all we need is target compilers. Target names aren't necessary. > * Is the user expected to set OFFLOAD_TARGET_COMPILERS, or should > this be done by the gcc driver, possibly based on command line > options (I'd much prefer that)? It's supposed to be set by gcc driver. Initial work in this direction could be found here: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01242.html > * Do we actually need an -fopenmp-target option? The way I imagine it > (and which was somewhat present in the Makefile patches I posted > last year) is that an offload compiler is specially configured to > know that that's how it will be used, and to know what the host > architecture is. A $target-gen-offload could then be built with > knowledge of the host architecture and installed in the host > compiler's libexec install directory. Our idea here was that a single x86 compiler could serve both as host and as target compiler, depending on presence of this option. If these compilers are always different, then indeed this option isn't needed. > Bernd -- Ilya
diff --git a/gcc/lto/lang.opt b/gcc/lto/lang.opt index 7a9aede..cd0098c 100644 --- a/gcc/lto/lang.opt +++ b/gcc/lto/lang.opt @@ -40,4 +40,8 @@ fresolution= LTO Joined The resolution file +fopenmp_target +LTO Var(flag_openmp_target) +Run LTO infrastructure to read target-side bytecode and to build it. + ; This comment is to ensure we retain the blank line above. diff --git a/gcc/lto/lto-object.c b/gcc/lto/lto-object.c index 19f10cc..64274f3 100644 --- a/gcc/lto/lto-object.c +++ b/gcc/lto/lto-object.c @@ -59,6 +59,8 @@ struct lto_simple_object static simple_object_attributes *saved_attributes; +extern const char *section_name_prefix; + /* Initialize FILE, an LTO file object for FILENAME. */ static void @@ -229,8 +231,7 @@ lto_obj_add_section (void *data, const char *name, off_t offset, void **slot; struct lto_section_list *list = loasd->list; - if (strncmp (name, LTO_SECTION_NAME_PREFIX, - strlen (LTO_SECTION_NAME_PREFIX)) != 0) + if (strncmp (name, section_name_prefix, strlen (section_name_prefix))) return 1; new_name = xstrdup (name); diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c index 0211437..dedf8a8 100644 --- a/gcc/lto/lto.c +++ b/gcc/lto/lto.c @@ -49,6 +49,8 @@ along with GCC; see the file COPYING3. If not see #include "context.h" #include "pass_manager.h" +extern const char *section_name_prefix; + /* Vector to keep track of external variables we've seen so far. */ vec<tree, va_gc> *lto_global_var_decls; @@ -2081,7 +2083,7 @@ lto_section_with_id (const char *name, unsigned HOST_WIDE_INT *id) { const char *s; - if (strncmp (name, LTO_SECTION_NAME_PREFIX, strlen (LTO_SECTION_NAME_PREFIX))) + if (strncmp (name, section_name_prefix, strlen (section_name_prefix))) return 0; s = strrchr (name, '.'); return s && sscanf (s, "." HOST_WIDE_INT_PRINT_HEX_PURE, id) == 1; @@ -2757,6 +2759,9 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) timevar_push (TV_IPA_LTO_DECL_IN); + if (flag_openmp_target) + section_name_prefix = OMP_SECTION_NAME_PREFIX; + real_file_decl_data = decl_data = ggc_alloc_cleared_vec_lto_file_decl_data_ptr (nfiles + 1); real_file_count = nfiles;