From patchwork Thu Mar 20 16:50:13 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 332311
Message-ID: <532B1C45.9020308@codesourcery.com>
Date: Thu, 20 Mar 2014 17:50:13 +0100
From: Bernd Schmidt
To: GCC Patches
CC: Ilya Verbin, Michael Zolotukhin
Subject: [gomp4] Add tables generation

This is based on Michael Zolotukhin's patch 2/3 from a while ago. It
adds functionality to build function/variable tables that will allow
libgomp to look up offload target code based on the address of the
corresponding host function. There are two alternatives, one based on
named sections, and one based on a target hook when named sections are
unavailable (as on ptx).

Committed on gomp-4_0-branch.


Bernd

Index: libgcc/ChangeLog
===================================================================
--- libgcc/ChangeLog (revision 208706)
+++ libgcc/ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2014-03-20  Bernd Schmidt
+
+	* crtstuff.c (_omp_func_table, _omp_var_table, _omp_funcs_end,
+	_omp_vars_end): New array fragments.
+	(__OPENMP_TARGET__): New variable.
+
 2014-02-28  Joey Ye
 
 	PR libgcc/60166
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 208720)
+++ gcc/ChangeLog (working copy)
@@ -1,5 +1,20 @@
 2014-03-20  Bernd Schmidt
 
+	Mostly by Michael Zolotukhin:
+	* omp-low.c: Include "common/common-target.h".
+	(expand_omp_target): Pass in address of __OPENMP_TARGET__.
+	(add_decls_addresses_to_decl_constructor, omp_finish_file): New
+	functions.
+	* omp-low.h (omp_finish_file): Declare.
+	* toplev.c: Include "omp-low.h".
+	(compile_file): Call omp_finish_file.
+	* target.def (record_offload_symbol): New hook.
+	* doc/tm.texi.in (TARGET_RECORD_OFFLOAD_SYMBOL): Add.
+	* doc/tm.texi: Regenerate.
+	* configure.ac (ENABLE_OFFLOADING): Define if we have offload_targets.
+	* configure: Regenerate.
+	* config.in: Regenerate.
+
 	* config/darwin.c: Include "lto-section-names.h".
 	(LTO_SEGMENT_NAME): Don't define.
 	* config/i386/winnt.c: Include "lto-section-names.h".
Index: gcc/config.in
===================================================================
--- gcc/config.in (revision 208715)
+++ gcc/config.in (working copy)
@@ -139,6 +139,12 @@
 #endif
 
 
+/* Define this to enable support for offloading. */
+#ifndef USED_FOR_TARGET
+#undef ENABLE_OFFLOADING
+#endif
+
+
 /* Define to enable plugin support. */
 #ifndef USED_FOR_TARGET
 #undef ENABLE_PLUGIN
Index: gcc/configure
===================================================================
--- gcc/configure (revision 208715)
+++ gcc/configure (working copy)
@@ -7363,6 +7363,11 @@ cat >>confdefs.h <<_ACEOF
 #define OFFLOAD_TARGETS "$offload_targets"
 _ACEOF
 
+if test x$offload_targets != x; then
+
+$as_echo "#define ENABLE_OFFLOADING 1" >>confdefs.h
+
+fi
 
 
 # Check whether --with-multilib-list was given.
@@ -18008,7 +18013,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18011 "configure"
+#line 18016 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18114,7 +18119,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18117 "configure"
+#line 18122 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac (revision 208715)
+++ gcc/configure.ac (working copy)
@@ -887,6 +887,10 @@ AC_SUBST(enable_accelerator)
 offload_targets=`echo $offload_targets | sed -e 's#,#:#'`
 AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
  [Define to hold the list of target names suitable for offloading.])
+if test x$offload_targets != x; then
+  AC_DEFINE(ENABLE_OFFLOADING, 1,
+    [Define this to enable support for offloading.])
+fi
 
 AC_ARG_WITH(multilib-list,
 [AS_HELP_STRING([--with-multilib-list], [select multilibs (AArch64, SH and x86-64 only)])],
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 208706)
+++ gcc/doc/tm.texi (working copy)
@@ -11418,3 +11418,9 @@ If defined, this function returns an app
 @deftypefn {Target Hook} void TARGET_ATOMIC_ASSIGN_EXPAND_FENV (tree *@var{hold}, tree *@var{clear}, tree *@var{update})
 ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence. This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence. This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}. The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place. The default implementation leaves all three expressions as @code{NULL_TREE}. The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}.
 @end deftypefn
+
+@deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree)
+Used when offloaded functions are seen in the compilation unit and no named
+sections are available. It is called once for each symbol that must be
+recorded in the offload function and variable table.
+@end deftypefn
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in (revision 208706)
+++ gcc/doc/tm.texi.in (working copy)
@@ -8414,3 +8414,5 @@ and the associated definitions of those
 @hook TARGET_ATOMIC_ALIGN_FOR_MODE
 
 @hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV
+
+@hook TARGET_RECORD_OFFLOAD_SYMBOL
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c (revision 208706)
+++ gcc/omp-low.c (working copy)
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
 #include "optabs.h"
 #include "cfgloop.h"
 #include "target.h"
+#include "common/common-target.h"
 #include "omp-low.h"
 #include "gimple-low.h"
 #include "tree-cfgcleanup.h"
@@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
     }
 
   gimple g;
-  /* FIXME: This will be address of
-     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
-     symbol, as soon as the linker plugin is able to create it for us.  */
-  tree openmp_target = build_zero_cst (ptr_type_node);
+  tree openmp_target
+    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+                  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
+  TREE_PUBLIC (openmp_target) = 1;
+  DECL_EXTERNAL (openmp_target) = 1;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
       tree fnaddr = build_fold_addr_expr (child_fn);
-      g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
-                             device, fnaddr, openmp_target, t1, t2, t3, t4);
+      g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
+                             fnaddr, build_fold_addr_expr (openmp_target),
+                             t1, t2, t3, t4);
     }
   else
-    g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
-                           device, openmp_target, t1, t2, t3, t4);
+    g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
+                           build_fold_addr_expr (openmp_target),
+                           t1, t2, t3, t4);
   gimple_set_location (g, gimple_location (entry_stmt));
   gsi_insert_before (&gsi, g, GSI_SAME_STMT);
   if (kind != GF_OMP_TARGET_KIND_REGION)
@@ -12801,4 +12805,139 @@ make_pass_omp_simd_clone (gcc::context *
   return new pass_omp_simd_clone (ctxt);
 }
 
+/* Helper function for omp_finish_file routine.
+   Takes decls from V_DECLS and adds their addresses and sizes to
+   constructor-vector V_CTOR.  It will be later used as DECL_INIT for decl
+   representing a global symbol for OpenMP descriptor.  */
+static void
+add_decls_addresses_to_decl_constructor (vec<tree, va_gc> *v_decls,
+                                         vec<constructor_elt, va_gc> *v_ctor)
+{
+  unsigned len = vec_safe_length (v_decls);
+  for (unsigned i = 0; i < len; i++)
+    {
+      tree it = (*v_decls)[i];
+      bool is_function = TREE_CODE (it) != VAR_DECL;
+
+      CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, build_fold_addr_expr (it));
+      if (!is_function)
+        CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE,
+                                fold_convert (const_ptr_type_node,
+                                              DECL_SIZE (it)));
+    }
+}
+
+/* Create new symbol containing (address, size) pairs for omp-marked
+   functions and global variables.  */
+void
+omp_finish_file (void)
+{
+  struct cgraph_node *node;
+  struct varpool_node *vnode;
+  const char *funcs_section_name = ".offload_func_table_section";
+  const char *vars_section_name = ".offload_var_table_section";
+  vec<tree, va_gc> *v_funcs, *v_vars;
+
+  vec_alloc (v_vars, 0);
+  vec_alloc (v_funcs, 0);
+
+  /* Collect all omp-target functions.  */
+  FOR_EACH_DEFINED_FUNCTION (node)
+    {
+      /* TODO: This check could fail on functions, created by omp
+         parallel/task pragmas.  It's better to name outlined for offloading
+         functions in some different way and to check here the function name.
+         It could be something like "*_omp_tgtfn" in contrast with "*_omp_fn"
+         for functions from omp parallel/task pragmas.  */
+      if (!lookup_attribute ("omp declare target",
+                             DECL_ATTRIBUTES (node->decl))
+          || !DECL_ARTIFICIAL (node->decl))
+        continue;
+      vec_safe_push (v_funcs, node->decl);
+    }
+  /* Collect all omp-target global variables.  */
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    {
+      if (!lookup_attribute ("omp declare target",
+                             DECL_ATTRIBUTES (vnode->decl))
+          || TREE_CODE (vnode->decl) != VAR_DECL
+          || DECL_SIZE (vnode->decl) == 0)
+        continue;
+
+      vec_safe_push (v_vars, vnode->decl);
+    }
+  unsigned num_vars = vec_safe_length (v_vars);
+  unsigned num_funcs = vec_safe_length (v_funcs);
+
+  if (num_vars == 0 && num_funcs == 0)
+    return;
+
+#ifdef ACCEL_COMPILER
+  /* Decls are placed in reversed order in fat-objects, so we need to
+     revert them back if we compile target.  */
+  for (unsigned i = 0; i < num_funcs / 2; i++)
+    {
+      tree it = (*v_funcs)[i];
+      (*v_funcs)[i] = (*v_funcs)[num_funcs - i - 1];
+      (*v_funcs)[num_funcs - i - 1] = it;
+    }
+  for (unsigned i = 0; i < num_vars / 2; i++)
+    {
+      tree it = (*v_vars)[i];
+      (*v_vars)[i] = (*v_vars)[num_vars - i - 1];
+      (*v_vars)[num_vars - i - 1] = it;
+    }
+#endif
+
+  if (targetm_common.have_named_sections)
+    {
+      vec<constructor_elt, va_gc> *v_f, *v_v;
+      vec_alloc (v_f, num_funcs);
+      vec_alloc (v_v, num_vars * 2);
+
+      add_decls_addresses_to_decl_constructor (v_funcs, v_f);
+      add_decls_addresses_to_decl_constructor (v_vars, v_v);
+
+      tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node,
+                                                    num_vars * 2);
+      tree funcs_decl_type = build_array_type_nelts (pointer_sized_int_node,
+                                                     num_funcs);
+      TYPE_ALIGN (vars_decl_type) = TYPE_ALIGN (pointer_sized_int_node);
+      TYPE_ALIGN (funcs_decl_type) = TYPE_ALIGN (pointer_sized_int_node);
+      tree ctor_v = build_constructor (vars_decl_type, v_v);
+      tree ctor_f = build_constructor (funcs_decl_type, v_f);
+      TREE_CONSTANT (ctor_v) = TREE_CONSTANT (ctor_f) = 1;
+      TREE_STATIC (ctor_v) = TREE_STATIC (ctor_f) = 1;
+      tree funcs_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+                                    get_identifier (".omp_func_table"),
+                                    funcs_decl_type);
+      tree vars_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+                                   get_identifier (".omp_var_table"),
+                                   vars_decl_type);
+      TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1;
+      DECL_INITIAL (funcs_decl) = ctor_f;
+      DECL_INITIAL (vars_decl) = ctor_v;
+      DECL_SECTION_NAME (funcs_decl)
+        = build_string (strlen (funcs_section_name), funcs_section_name);
+      DECL_SECTION_NAME (vars_decl)
+        = build_string (strlen (vars_section_name), vars_section_name);
+
+      varpool_assemble_decl (varpool_node_for_decl (vars_decl));
+      varpool_assemble_decl (varpool_node_for_decl (funcs_decl));
+    }
+  else
+    {
+      for (unsigned i = 0; i < num_funcs; i++)
+        {
+          tree it = (*v_funcs)[i];
+          targetm.record_offload_symbol (it);
+        }
+      for (unsigned i = 0; i < num_vars; i++)
+        {
+          tree it = (*v_vars)[i];
+          targetm.record_offload_symbol (it);
+        }
+    }
+}
+
 #include "gt-omp-low.h"
Index: gcc/omp-low.h
===================================================================
--- gcc/omp-low.h (revision 208706)
+++ gcc/omp-low.h (working copy)
@@ -27,5 +27,6 @@ extern void omp_expand_local (basic_bloc
 extern void free_omp_regions (void);
 extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
+extern void omp_finish_file (void);
 
 #endif /* GCC_OMP_LOW_H */
Index: gcc/target.def
===================================================================
--- gcc/target.def (revision 208706)
+++ gcc/target.def (working copy)
@@ -1772,6 +1772,14 @@ HOOK_VECTOR_END (vectorize)
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_"
 
+DEFHOOK
+(record_offload_symbol,
+ "Used when offloaded functions are seen in the compilation unit and no named\n\
+sections are available. It is called once for each symbol that must be\n\
+recorded in the offload function and variable table.",
+ void, (tree),
+ hook_void_tree)
+
 /* Allow target specific overriding of option settings after options have been
    changed by an attribute or pragma or when it is reset at the end of the code
    affected by an attribute or pragma.  */
Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c (revision 208706)
+++ gcc/toplev.c (working copy)
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.
 #include "context.h"
 #include "pass_manager.h"
 #include "optabs.h"
+#include "omp-low.h"
 
 #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
 #include "dbxout.h"
@@ -577,6 +578,8 @@ compile_file (void)
       if (flag_sanitize & SANITIZE_THREAD)
         tsan_finish_file ();
 
+      omp_finish_file ();
+
       output_shared_constant_pool ();
       output_object_blocks ();
       finish_tm_clone_pairs ();
Index: libgcc/crtstuff.c
===================================================================
--- libgcc/crtstuff.c (revision 208706)
+++ libgcc/crtstuff.c (working copy)
@@ -311,6 +311,15 @@ register_tm_clones (void)
 }
 #endif /* USE_TM_CLONE_REGISTRY */
 
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_func_table[0]
+  __attribute__ ((__used__, visibility ("protected"),
+                  section (".offload_func_table_section"))) = { };
+void *_omp_var_table[0]
+  __attribute__ ((__used__, visibility ("protected"),
+                  section (".offload_var_table_section"))) = { };
+#endif
+
 #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
 
 #ifdef OBJECT_FORMAT_ELF
@@ -752,6 +761,23 @@ __do_global_ctors (void)
 #error "What are you doing with crtstuff.c, then?"
 #endif
 
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_funcs_end[0]
+  __attribute__ ((__used__, visibility ("protected"),
+                  section (".offload_func_table_section"))) = { };
+void *_omp_vars_end[0]
+  __attribute__ ((__used__, visibility ("protected"),
+                  section (".offload_var_table_section"))) = { };
+extern void *_omp_func_table[];
+extern void *_omp_var_table[];
+void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
+{
+  &_omp_func_table, &_omp_funcs_end,
+  &_omp_var_table, &_omp_vars_end
+};
+#endif
+
+
 #else /* ! CRT_BEGIN && ! CRT_END */
 #error "One of CRT_BEGIN or CRT_END must be defined."
 #endif
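
For readers who want to picture the resulting data layout, here is a minimal sketch in plain C. It is not libgomp's lookup code and is not part of the patch. It assumes ordinary arrays in place of the linker sections that crtbegin/crtend bracket, and a struct in place of the flat (address, size) pairs that omp_finish_file emits. All names in it (offload_desc, var_entry, find_func_index, find_var_size, host_fn_a, host_fn_b, host_var_x, host_var_y) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host functions standing in for outlined target regions.  */
static void host_fn_a (void) {}
static void host_fn_b (void) {}

/* Hypothetical offloaded global variables.  */
static int host_var_x;
static double host_var_y[4];

typedef void (*fn_ptr) (void);

/* Function table: one address per entry.  */
static const fn_ptr func_table[] = { host_fn_a, host_fn_b };

/* Variable table: one (address, size) pair per entry.  The patch stores
   these as two consecutive pointer-sized slots; a struct is used here
   only for readability.  */
struct var_entry
{
  const void *addr;
  size_t size;
};

static const struct var_entry var_table[] = {
  { &host_var_x, sizeof host_var_x },
  { host_var_y, sizeof host_var_y },
};

/* Descriptor mirroring the four begin/end pointers of __OPENMP_TARGET__.  */
struct offload_desc
{
  const fn_ptr *funcs_begin, *funcs_end;
  const struct var_entry *vars_begin, *vars_end;
};

static const struct offload_desc desc = {
  func_table, func_table + sizeof func_table / sizeof func_table[0],
  var_table, var_table + sizeof var_table / sizeof var_table[0],
};

/* Return the table index of HOST_FN, or -1 if it is not listed.  */
static ptrdiff_t
find_func_index (const struct offload_desc *d, fn_ptr host_fn)
{
  assert (d != NULL);
  for (const fn_ptr *p = d->funcs_begin; p != d->funcs_end; p++)
    if (*p == host_fn)
      return p - d->funcs_begin;
  return -1;
}

/* Return the recorded size of the variable at ADDR, or 0 if not listed.  */
static size_t
find_var_size (const struct offload_desc *d, const void *addr)
{
  assert (d != NULL);
  for (const struct var_entry *v = d->vars_begin; v != d->vars_end; v++)
    if (v->addr == addr)
      return v->size;
  return 0;
}
```

Under the assumption that the target compiler emits its table in the same order as the host (which is what the ACCEL_COMPILER reversal in omp_finish_file appears to be for), an index returned by a call such as find_func_index (&desc, host_fn_b) could be used to pick the matching entry from the target-side table.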