Message ID: 20110701183145.GD6262@intrepid.com
State: New
On Fri, 1 Jul 2011, Gary Funck wrote:

> * Most of the #ifdef conditionals have been removed. Some target macros
> have been defined and documented in tm.texi. We still have some questions

For each target macro, what is the justification requiring it to be a
macro rather than a hook documented in target.def with just an @hook line
in tm.texi.in? The justification should be one of:

* Macro is used in a case label, defining the values of an enum or some
other such context that requires a constant value, and the code would be
substantially obfuscated by changing it to avoid this. (Normally "if"
statements are preferred to #if conditionals. Simple use in #if is not
sufficient justification for a target macro; you need to explain why it
is necessary to use a preprocessor conditional - for example, if the #if
is there to control values in an enum.)

* Macro is very closely related to an existing set of macros, such that
it does not make sense for some members of the set to be hooks while
others are macros.

* Benchmarking with provided figures has shown significant performance
regressions from use of a hook.

I looked at the first of the documented macros, and it doesn't seem to be
a target macro at all, being defined by configure rather than in tm.h.
Configure-defined macros don't go in tm.texi. But any that are really
target macros need such a justification, and it's still preferable to
define configure-defined macros to 1 or 0 (rather than defined or
undefined) and test them in "if" statements, not #if.
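[Editorial note: for readers unfamiliar with the hook mechanism being requested above, a target hook is declared once in target.def and then tested with an ordinary "if" through the targetm structure. The entry below is a hypothetical sketch with an invented hook name, not an actual GCC hook; the exact DEFHOOK syntax is documented in target.def itself.]

```c
/* In target.def: documentation string, return type, parameter list,
   and a default implementation (hook_bool_tree_false comes from
   hooks.h).  tm.texi.in then needs only the line:
     @hook TARGET_EXAMPLE_PREDICATE  */
DEFHOOK
(example_predicate,
 "True if the target wants the example behavior for @var{decl}.",
 bool, (tree decl),
 hook_bool_tree_false)
```

Callers then write `if (targetm.example_predicate (decl))` instead of guarding code with `#if TARGET_EXAMPLE_PREDICATE`.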
On 07/01/11 19:28:34, Joseph S. Myers wrote:
> On Fri, 1 Jul 2011, Gary Funck wrote:
> GF: * Most of the #ifdef conditionals have been removed. Some target macros
> GF: have been defined and documented in tm.texi. We still have some questions
>
> For each target macro, what is the justification requiring it to be a
> macro rather than a hook documented in target.def with just an @hook line
> in tm.texi.in? The justification should be one of: [...]

Joseph, thanks for the follow-up. I have several questions on the best
method of documenting and packaging UPC's configuration-specific (and
target-specific) definitions. I will describe those questions and some
potential solutions in an email to follow.

- Gary
On Fri, Jul 01, 2011 at 11:31:45AM -0700, Gary Funck wrote:
> @@ -2405,6 +2469,9 @@ struct GTY(()) tree_type_common {
>    alias_set_type alias_set;
>    tree pointer_to;
>    tree reference_to;
> +  /* UPC: for block-distributed arrays */
> +  tree block_factor;

I think this is undesirable. Using a single bit and looking it up in a
hash table might be a better solution; the vast majority of the types
aren't going to be UPC block-distributed arrays.

	Jakub
On 07/01/11 22:42:55, Jakub Jelinek wrote:
> On Fri, Jul 01, 2011 at 11:31:45AM -0700, Gary Funck wrote:
> > @@ -2405,6 +2469,9 @@ struct GTY(()) tree_type_common {
> >    alias_set_type alias_set;
> >    tree pointer_to;
> >    tree reference_to;
> > +  /* UPC: for block-distributed arrays */
> > +  tree block_factor;
>
> I think this is undesirable. Using a single bit and looking
> it up in a hash table might be a better solution, the vast majority
> of the types aren't going to be UPC block distributed arrays.

Jakub, it is true that the vast majority of types will not have a UPC
blocking factor. However, building a hash table will be complicated by
questions of the scope and lifetime of the type that references the
blocking factor in a hash table. Implementing the hash table sounds like
it might be quite a bit of work. Is there precedent for this technique
being used in other GCC front ends?

Some other fields in a tree_type_common node are also likely not used by
very many types (or, in the case of the symtab info, are only used if
debug info is being generated?):

* attributes
* reference_to
* tree_type_symtab
* name

Is there some way to use the language-specific information hook?

    struct GTY(()) tree_type_with_lang_specific {
      struct tree_type_common common;
      /* Points to a structure whose details depend on the language in use.  */
      struct lang_type *lang_specific;
    };

- Gary
On Fri, Jul 01, 2011 at 02:34:41PM -0700, Gary Funck wrote:
> Is there precedent for this technique being used in other GCC front-ends?
Yes, look at DECL_VALUE_EXPR/SET_DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
or DECL_DEBUG_EXPR/SET_DECL_DEBUG_EXPR/DEBUG_EXPR_IS_FROM.
Jakub
On Jul 1, 2011, at 2:34 PM, Gary Funck wrote:
> Is there some way to use the language specific information hook?
>
>     struct GTY(()) tree_type_with_lang_specific {
>       struct tree_type_common common;
>       /* Points to a structure whose details depend on the language in use.  */
>       struct lang_type *lang_specific;
>     };

You can use that to allocate additional data for the frontend's exclusive
use. See the C++ frontend; it uses language-specific data. :-)
On 07/01/11 19:28:34, Joseph S. Myers wrote:
> On Fri, 1 Jul 2011, Gary Funck wrote:
> GF: * Most of the #ifdef conditionals have been removed. Some target macros
> GF: have been defined and documented in tm.texi. We still have some questions
>
> [...]
> I looked at the first of the documented macros, and it doesn't seem to be
> a target macro at all, being defined by configure rather than in tm.h.
> Configure-defined macros don't go in tm.texi. But any that are really
> target macros need such a justification, and it's still preferable to
> define configure-defined macros to 1 or 0 (rather than defined or
> undefined) and test them in "if" statements not #if.

I was uncertain where/how to document the macros used to configure UPC.
(Your explanation in the [...] above was helpful in answering the
technical questions that I had in that regard, thanks.) The patch
submitted for review did document (as target macros) switches that are
set by the 'configure' script. I was not sure whether they should be
#define's at all, nor whether they are target-specific. Hopefully, the
following description will provide enough background to figure out the
best way to handle these configuration items.

The recent changes were motivated by your original suggestion:

* It appears you have a large number of #ifdef conditionals with no
obvious reason for the macros being conditionally defined or not defined.
The use of conditional compilation like this is deprecated for new code.
If something may vary from target to target, use target hooks, not
macros, and document them in tm.texi (if they are in fact documented
there in something outside this patch, perhaps you need to post that
other patch). Conditions using "if" are strongly preferred to those
using "#if" whenever possible.
In the version of GUPC that your comments above relate to, the #define's
were primarily in config/upc-conf.h:

http://gcc.gnu.org/viewcvs/branches/gupc/gcc/config/upc-conf.h?revision=173527&view=markup&pathrev=173545

Some of those definitions, which list the names of UPC runtime functions,
are now in upc/upc-rts-names.h:

http://gcc.gnu.org/viewcvs/branches/gupc/gcc/upc/upc-rts-names.h?revision=173837&view=markup

The definitions that are specific to the representation of the UPC
pointer-to-shared (PTS) type that has been configured by the user were
swept under this corner of the rug in upc/upc-pts.h:

http://gcc.gnu.org/viewcvs/branches/gupc/gcc/upc/upc-pts.h?revision=173837&view=markup

The reason that so many #ifdef's and #define's remain is that I wasn't
sure about the best method of parameterizing those values to avoid the
use of conditional compilation. For example, should a small struct be
defined that has fields for the configuration-specific values?

The switching between the files that implement the packed representation
of a pointer-to-shared and those that implement the struct representation
used to be done in the UPC Makefile fragment Make-lang.in (lines 61 and
62):

http://gcc.gnu.org/viewcvs/branches/gupc/gcc/upc/Make-lang.in?revision=169939&view=markup&pathrev=173545

In that version of GUPC, only one of the two files upc-pts-packed.c or
upc-pts-struct.c was compiled, depending upon the configuration switch
that selected between the two representations. Both files defined the
same exported functions. In an effort to make sure that both
implementations are compiled, a table of PTS operations is defined in the
current version of GUPC in upc-pts.h:

http://gcc.gnu.org/viewcvs/branches/gupc/gcc/upc/upc-pts.h?revision=173837&view=markup

The table of representation-specific functions is filled in with either
the packed PTS routines or the struct representation routines when the
UPC compiler is initialized.
Where the need for this PTS representation-specific logic begins is
controlled by the following UPC configure options:

  --with-upc-pts={struct,packed}
	choose the representation of a UPC pointer-to-shared
  --with-upc-pts-vaddr-order={last,first}
	choose the position of the address field in the UPC
	pointer-to-shared representation
  --with-packed-upc-pts-bits=phase,thread,vaddr
	choose the bit distribution in the packed UPC pointer-to-shared
	representation

The set of configuration choices looks like this:

  PTS Representation
    Packed
      Vaddr: {first, last}
      field sizes: {phase, thread, vaddr}
    Struct
      Vaddr: {first, last}

Although the PTS representation is not target-specific, it has some
aspects of being target-specific: the representation and layout of the
PTS change the binary representation of the PTS in the compiled program
(and runtime), and some representations will perform better or worse on a
specific target (for example, on some targets some UPC programs may run
faster if vaddr is first). Generally, the 'struct' PTS is chosen because
it provides the maximum range of the various fields (at the expense of
using more space to represent the PTS).

Based upon your suggestions/comments, it sounds like:

1. The PTS-related #define's set by 'configure' are not target macros,
   and should not be documented as such.

2. The PTS-related operations that are specific either to the packed or
   the struct PTS representation are not target hooks, and should not be
   documented as such.

3. UPC_PTS_VADDR_FIRST, currently defined in upc-pts.h as:

       #ifdef HAVE_UPC_PTS_VADDR_FIRST
       #define UPC_PTS_VADDR_FIRST 1
       #else
       #define UPC_PTS_VADDR_FIRST 0
       #endif /* HAVE_UPC_PTS_VADDR_FIRST */

   should be set directly to 1 or 0 by 'configure', avoiding the need to
   test HAVE_UPC_PTS_VADDR_FIRST with an #ifdef.

4. The #define's in upc-pts.h that are derived from PTS-specific
   configuration settings:

       #ifdef HAVE_UPC_PTS_PACKED_REP
       /* 'packed' UPC pointer-to-shared representation */
       #define UPC_PTS_SIZE 64
       #else
       /* 'struct' UPC pointer-to-shared representation */
       #define UPC_PTS_SIZE (LONG_TYPE_SIZE + POINTER_SIZE)
       #define UPC_PTS_PHASE_SIZE (LONG_TYPE_SIZE/2)
       #define UPC_PTS_THREAD_SIZE (LONG_TYPE_SIZE/2)
       #define UPC_PTS_VADDR_SIZE POINTER_SIZE
       #define UPC_PTS_PHASE_TYPE ((LONG_TYPE_SIZE == 64) ? "uint32_t" : "uint16_t")
       #define UPC_PTS_THREAD_TYPE ((LONG_TYPE_SIZE == 64) ? "uint32_t" : "uint16_t")
       #define UPC_PTS_VADDR_TYPE "char *"
       #endif /* HAVE_UPC_PTS_PACKED_REP */

       #ifdef HAVE_UPC_PTS_VADDR_FIRST
       #define UPC_PTS_PHASE_SHIFT 0
       #define UPC_PTS_THREAD_SHIFT UPC_PTS_PHASE_SIZE
       #define UPC_PTS_VADDR_SHIFT (UPC_PTS_PHASE_SIZE+UPC_PTS_THREAD_SIZE)
       #else
       #define UPC_PTS_PHASE_SHIFT (UPC_PTS_VADDR_SIZE+UPC_PTS_THREAD_SIZE)
       #define UPC_PTS_THREAD_SHIFT UPC_PTS_VADDR_SIZE
       #define UPC_PTS_VADDR_SHIFT 0
       #endif

       #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG
       #define UPC_PTS_VADDR_MASK ((1L << UPC_PTS_VADDR_SIZE) - 1L)
       #elif HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONGLONG
       #define UPC_PTS_VADDR_MASK ((1LL << UPC_PTS_VADDR_SIZE) - 1LL)
       #else
       #error Unexpected "HOST_BITS_PER_WIDE_INT" value.
       #endif /* HOST_BITS_PER_WIDE_INT */
       #define UPC_PTS_THREAD_MASK ((1 << UPC_PTS_THREAD_SIZE) - 1)
       #define UPC_PTS_PHASE_MASK ((1 << UPC_PTS_PHASE_SIZE) - 1)

   should be moved into a 'struct' that is filled in when the UPC
   compiler is initialized. This can be done by the
   representation-specific 'init' routine.

Does that make sense?

- Gary
On 07/02/11 00:06:07, Jakub Jelinek wrote:
> Yes, look at DECL_VALUE_EXPR/SET_DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
> or DECL_DEBUG_EXPR/SET_DECL_DEBUG_EXPR/DEBUG_EXPR_IS_FROM.

OK, thanks. The UPC front end will be changed to use a similar method to
encode UPC's block size.
On 07/01/11 16:39:55, Mike Stump wrote:
> You can use that to allocate additional data for the frontend's
> exclusive use. See the C++ frontend, it uses language specific data. :-)

Heh. It sounds like this language-specific hook works well for language
front ends that aren't sharing the logic of the "C" front end.
Therefore, it is a no-go for UPC.
On Fri, 1 Jul 2011, Gary Funck wrote:

> The reason that so many #ifdef's and #defines remain is that
> I wasn't sure about the best method of parameterizing those values to
> avoid the use of conditional compilation. For example, should
> a small struct be defined that has fields for the configuration-specific
> values?

You need to make a technical judgement among possible approaches - which
may include such a structure, or a structure of function pointers, or
conditionals at various places in the code.

> Although the PTS representation is not target-specific, it has some
> aspects of being target-specific in that the representation and
> layout of the PTS changes the binary representation of the PTS
> in the compiled program (and runtime), and some representations
> will perform better/worse on a specific target [...]
> Generally, the 'struct PTS' is chosen because it provides the
> maximum range of the various fields (at the expense of using
> more space to represent the PTS).

In general configure options aren't really a good idea in many cases:

* If something is always best on a particular architecture, maybe you
want a target hook (not macro) rather than a configure option, with the
target hook being set appropriately for each target and no option for the
person configuring GCC to override.

* If what's best depends on the particular application, a normal
command-line option for the person using GCC is better (and then any
configure option can just control specs in the driver and not directly
affect the rest of the compiler). That might run into problems if
there's a runtime library included with GCC that can only be built
compatibly with one of the possible ABIs at a time, though it may be
possible to avoid this issue in some cases (for example, by using
different function names for the different ABIs; cf. how libstdc++ works
with both 64-bit and 128-bit long double on various targets). (Actually
multilibbing the library for each ABI variant - in addition to the
existing multilibbing for all GCC's runtime libraries - would be better
in some cases but harder to implement.)

Even if you need a configure option for the default, the command-line
options may still be useful in some cases for testing and debugging
purposes.
On 07/02/11 15:21:32, Joseph S. Myers wrote:
[...]
> In general configure options aren't really a good idea in many cases:
>
> * If something is always best on a particular architecture, maybe you want
> a target hook (not macro) rather than a configure option, with the target
> hook being set appropriately for each target and no option for the person
> configuring GCC to override.

With UPC, the packed pointer-to-shared (PTS) representation takes less
space and is generally useful for most applications. However, it is
sometimes necessary to allocate more bits for the number of threads and
fewer for the block offset, for example, which makes it possible to
continue to use the packed PTS representation in certain applications.
This motivates the 'configure' setting. The 'struct' PTS has essentially
no limitations, but the pointers-to-shared are bigger, and accessing the
sub-fields of this "fat pointer" is somewhat slower. Also, passing
struct-based PTS's as arguments (to the runtime, for example) can be
slower on some architectures.

The difficulty with making this representation choice a compile-time
selection is that the runtime needs to be aware of the pointer-to-shared
layout. Perhaps we could build two separate runtimes (packed PTS and
struct PTS) and parameterize the packed-PTS-based runtime to calculate
the shifts and masks it needs rather than compiling them as constants,
but that would probably impact performance on various commonly called
paths through the runtime.

> Even if you need a configure option for the default, the command-line
> options may still be useful in some cases for testing and debugging
> purposes.

Yes, that would be useful.
Off list, Mike S. pointed out that the Objective C front-end uses the lang-specific extensions to GCC's tree node, and it shares logic with the C and C++ front ends. Therefore, the lang-specific extensions might offer an alternative approach for storing UPC's layout qualifier. The current plan, however, is to use a hash table.
I have been looking at changing UPC's method of recording the blocking
factor so that it uses less space in the tree type node. The suggested
method for achieving this space reduction is to use a hash table to map
pointers to type nodes into UPC blocking factors (for less commonly used
blocking factor values).

In UPC, the default blocking factor for arrays is one, and for UPC shared
scalar/struct variables it is zero. Therefore, the current approach is
to only record blocking factors greater than one in a hash table. The
basics are in place (in tree.h):

    /* Non-zero if the UPC shared type refers to THREADS
       in its array dimension.  */
    #define UPC_TYPE_HAS_THREADS_FACTOR(TYPE) TYPE_LANG_FLAG_3 (TYPE)

    /* Non-zero if the UPC blocking factor is 0.  */
    #define UPC_TYPE_HAS_BLOCK_FACTOR_0(TYPE) TYPE_LANG_FLAG_4 (TYPE)

    /* Non-zero if the UPC blocking factor is greater than 1.
       In this case, the blocking factor value is stored
       in a hash table.  */
    #define UPC_TYPE_HAS_BLOCK_FACTOR_X(TYPE) TYPE_LANG_FLAG_5 (TYPE)

    extern void upc_block_factor_insert (tree, tree);
    extern tree upc_block_factor_lookup (tree);

    /* Return the UPC blocking factor of the type given by NODE.
       The default block factor is one.  The additional flag bits
       over-ride the default.  */
    #define UPC_TYPE_BLOCK_FACTOR(NODE) \
      (UPC_TYPE_HAS_BLOCK_FACTOR_0 (NODE) ? size_zero_node \
       : UPC_TYPE_HAS_BLOCK_FACTOR_X (NODE) ? upc_block_factor_lookup (NODE) \
       : size_one_node)

Note above that the bits used are the TYPE_LANG_FLAG bits. This should
be acceptable for UPC, because it is built as a language variant.
Interestingly, if additional bits were allocated just for UPC, they would
likely increase the size of the tree node, because there are no spare
bits here:

    struct GTY(()) tree_type_common {
      [...]
      unsigned int precision : 10;
      [...]
      unsigned lang_flag_6 : 1;

The bit field sizes in the interval shown above total 32 bits.
Therefore, it seems that any attempt to add new flag bits will likely
increase the size of a tree_type_common node.

The issue that I'm running into, however, is that the re-implementation
of UPC_TYPE_BLOCK_FACTOR() is not plug-and-play with its previous
version, which used a tree pointer in the node to record the blocking
factor. Here's why:

    gcc/tree.c|5681 col 27| warning: passing argument 1 of
      'upc_block_factor_lookup' discards qualifiers from pointer target type
    gcc/tree.h|1196 col 13| note: expected 'tree' but argument is of
      type 'const_tree'

The code referenced above is:

    bool
    check_qualified_type (const_tree cand, const_tree base, int type_quals)
    {
      return (TYPE_QUALS (cand) == type_quals
              [...]
              /* For UPC, the blocking factors have to be equal.  */
              && tree_int_cst_equal (UPC_TYPE_BLOCK_FACTOR (cand),
                                     UPC_TYPE_BLOCK_FACTOR (base))
              [...]
    }

The prototype for upc_block_factor_lookup() accepts a 'tree', not a
'const_tree'. If the parameter is changed to 'const_tree', then the body
of the procedure will encounter another 'const' clash, because a tree_map
is defined as having nodes of type 'tree', not 'const_tree'. For
example:

    struct GTY(()) tree_map_base {
      tree from;
    };

NOTE: I looked at calls to DECL_VALUE_EXPR() to see if it ran into this
issue; however, only 'tree' nodes appear to be passed to this function
macro.

Any suggestions on how to work around this 'const' clash warning?

Thanks,
- Gary
On 08/17/11 08:38:44, Gary Funck wrote:
>
> I have been looking at changing UPC's method of
> recording the blocking factor so that it uses less space
> in the tree type node. The suggested method for
> achieving this space reduction is to use a hash table to
> map pointers to type nodes into UPC blocking factors
> (for less commonly used blocking factor values).
[...]
> The issue that I'm running into, however, is that the
> re-implementation of UPC_TYPE_BLOCK_FACTOR() is not
> plug-and-play with its previous version that used
> a tree pointer in the node to record the blocking factor.
[...]

As a follow-up, I wasn't able to find an alternative method that
preserved the const'ness of the APIs involved in the change, so I ended
up reverting them back to regular tree pointers. The following change
was committed to the GUPC branch.

2011-08-30  Gary Funck  <gary@intrepid.com>

	* tree.h (check_qualified_type): Change 'const_tree' argument
	types back to 'tree' to avoid complaints that assignment drops
	qualifiers, for invocations of the newly implemented
	TYPE_BLOCK_FACTOR() macro, which invokes hash functions with
	'tree' pointer values that are not const-qualified.
	* tree.c (check_qualified_type, check_aligned_type): Ditto.
	* c-typeck.c (comptypes_internal): Ditto.
Recently, a few UPC test programs failed to compile due to mismatches of
parameters in a prototype and its corresponding function definition. The
mismatch was based upon the apparent inequality of UPC layout qualifiers
(blocking factors).

UPC blocking factors are integer constants. They are recorded in a hash
table indexed by the type tree node that they correspond to.

Currently, the test for equality of blocking factors tests only the
pointer to the tree node defining the constant. All blocking factors are
recorded as "sizetype" type'd nodes. Given that integer constants are
hashed by type/value, it seemed safe to assume that a given blocking
factor would map to a single tree node, due to the underlying hash method
that is used when integral constants are created.

Is it valid to assume that pointer equality is sufficient to ensure that
two integer constants are equal as long as their type and values are
equal?

The bug that we ran into occurred because a garbage collection pass was
run between the point that the function prototype tree node was created
and the point at which the function declaration was processed. The
garbage collector decided that the integer constant representing the
blocking factor was no longer in use, because it had not been marked.

In fact, the integer constant was needed because it appeared in the
blocking factor hash table, but not via a direct pointer. Rather it was
referenced by nature of the fact that the blocking factor hash table
referenced the integer constant that is mapped in the integer constant
hash table. Here's a rough diagram:

    tree (type) -> [ blocking factor hash ] -> tree (integer constant)
    tree (integer constant) -> [ integer constant hash ] {unique map}
      -> tree (integer constant)

When the garbage collector deleted the entry from the "integer constant
hash", it forced a new integer constant tree node to be created for the
same (type, value) integral constant blocking factor.
One easy way to address the current issue is to call tree_int_cst_equal()
if the integer constant tree pointers do not match:

    if ((c1 != c2) && !tree_int_cst_equal (c1, c2))
      /* integer constants aren't equal. */

This may be necessary if 'int_cst_hash_table' is viewed as a cache rather
than a persistent, unique mapping.

Another approach would be to somehow mark the node in int_cst_hash_table
as in use when the blocking factor hash table is traversed by the garbage
collector, or to add logic to the hash table delete function associated
with int_cst_hash_table to disallow the delete if the integer constant is
present in the UPC blocking factor hash table.

To effect this change in a modular way, the hash table delete function
associated with 'int_cst_hash_table' would probably have to be
daisy-chained, with the UPC blocking factor check made first. The
difficulty with implementing the daisy-chaining is that
int_cst_hash_table needs to exist before the UPC-related initialization
code is run. One way to handle this might be yet another language hook,
called from the code that creates 'int_cst_hash_table'. That seems
overly complex.

For reference, the current blocking factor mapping table is created as
follows:

    upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
                                                 tree_map_eq, 0);

Summary:

1. Is it valid to assume that pointer equality is sufficient to compare
   two integer constants for equality as long as they have identical
   type and value?

2. Should 'int_cst_hash_table' be viewed as a cache, where the mapping
   of a given (type, value) integer constant may vary over time?

3. If the answer to 1 is "yes" and the answer to 2 is "no", then what is
   the recommended way to ensure that nodes in 'int_cst_hash_table' are
   not removed if the integer constant is being referenced via the
   'upc_block_factor_for_type' hash table?

thanks,
- Gary
On Mon, Oct 10, 2011 at 7:02 PM, Gary Funck <gary@intrepid.com> wrote:
> Recently, a few UPC test programs failed to compile
> due to mismatches of parameters in a prototype and its
> corresponding function definition. The mismatch was
> based upon the apparent inequality of UPC layout qualifiers
> (blocking factors).
[...]
>
> Summary:
>
> 1. Is it valid to assume that pointer equality is sufficient
> to compare two integer constants for equality as long as they
> have identical type and value?

Yes, if both constants are "live".

> 2. Should 'int_cst_hash_table' be viewed as a cache, where
> the mapping of a given (type, value) integer constant may
> vary over time?

Yes, if a constant is unused it may get collected and re-allocated
later. This cannot be observed from any valid use of 1.

> 3. If the answer to 1 is "yes" and the answer to 2 is "no",
> then what is the recommended way to ensure that nodes in
> 'int_cst_hash_table' are not removed if the integer constant
> is being referenced via the 'upc_block_factor_for_type'
> hash table?

You need to ensure the constants are marked properly.

Richard.

> thanks,
> - Gary
> In fact, the integer constant was needed because it appeared
> in the blocking factor hash table, but not via a direct pointer.
> Rather it was referenced by nature of the fact that
> the blocking factor hash table referenced the integer constant
> that is mapped in the integer constant hash table.

You'd need to elaborate here: what does "by nature of the fact that"
mean?

> When the garbage collector deleted the entry from the
> "integer constant hash", it forced a new integer constant tree
> node to be created for the same (type, value) integral constant
> blocking factor.
>
> One easy way to address the current issue is to call
> tree_int_cst_equal() if the integer constant tree pointers
> do not match:
>
>     if ((c1 != c2) && !tree_int_cst_equal (c1, c2))
>       /* integer constants aren't equal. */

You have two objects C1 and C2 for the same constant and you're comparing
them. One was created first, say C1. If C1 was still live when C2 was
created, why was C2 created in the first place? If C1 wasn't live
anymore when C2 was created, why are you still using C1 here?
On 10/11/11 10:24:52, Richard Guenther wrote:
> GF: 1. Is it valid to assume that pointer equality is sufficient
> GF: to compare two integer constants for equality as long as they
> GF: have identical type and value?
>
> Yes, if both constants are "live"

The UPC blocking factor hash table is declared as follows:

    static GTY ((if_marked ("tree_map_marked_p"),
                 param_is (struct tree_map)))
      htab_t upc_block_factor_for_type;
    [...]
    upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
                                                 tree_map_eq, 0);

I had hoped that this would be sufficient to ensure that all integer
constant references recorded in this hash table would be considered
"live" by the GC. Reading the code in tree_map_marked_p(), however, I
see the following:

    #define tree_map_marked_p tree_map_base_marked_p
    [...]
    /* Return true if this tree map structure is marked for garbage
       collection purposes.  We simply return true if the from tree
       is marked, so that this structure goes away when the from
       tree goes away.  */
    int
    tree_map_base_marked_p (const void *p)
    {
      return ggc_marked_p (((const struct tree_map_base *) p)->from);
    }

This takes care of recycling an entry when the '->from' reference goes
away, but it doesn't make sure that the '->to' reference is considered
"live". I don't understand the GC well enough to know when/where the
'->to' entry should be marked as "live". (Note: in the cited test case,
the '->from' pointers in question are known to be "live" and did survive
garbage collection.)

Given that the declaration above tells the GC that the nodes in the
blocking factor hash table are of type 'struct tree_map':

    struct GTY(()) tree_map_base {
      tree from;
    };

    /* Map from a tree to another tree.  */
    struct GTY(()) tree_map {
      struct tree_map_base base;
      unsigned int hash;
      tree to;
    };

I thought that the GC would mark the '->to' nodes as live automatically?
(Note: probably the only direct reference to the integer constant that is
the focus of this discussion is in the upc_block_factor_for_type hash
table. Therefore, if it isn't seen as "live" there, it won't be seen as
live anywhere else.)

I suppose that I could declare a linear "tree list" of mapped integer
constants and let the GC walk that, but that is more of a hack than a
solution.

- Gary
On 10/11/11 11:05:18, Eric Botcazou wrote:
> > One easy way to address the current issue is to call
> > tree_int_cst_equal() if the integer constant tree pointers
> > do not match:
> >
> >   if ((c1 != c2) && !tree_int_cst_equal (c1, c2))
> >     /* integer constants aren't equal. */
>
> You have two objects C1 and C2 for the same constant and you're comparing
> them.  One was created first, say C1.  If C1 was still live when C2 was
> created, why was C2 created in the first place?  If C1 wasn't live
> anymore when C2 was created, why are you still using C1 here?

Eric, this note provides some more detail in addition to my earlier
reply to Richard.

The problem is that the references to objects C1 and C2 live in a hash
table, and although the referenced nodes will be retained by the
garbage collector, their mapping in int_cst_hash_table is deleted by
the GC.  Thus, if we follow the diagram:

  tree (type) -> [ upc_block_factor_for_type ] -> tree (integer constant)
  tree (integer constant) -> [ int_cst_hash_table ] {unique map} -> tree (integer constant)

Consider two tree nodes, P (prototype) and F (function), each
declaring a parameter that is a pointer to a UPC shared object, where
the pointer is declared with a UPC blocking factor of 1000.  Without
garbage collection, the mappings look like this:

  P.param -> C1, F.param -> C1

where C1 is an integer constant of the form (sizetype, 1000).

But when GC kicks in, it decides that the hash table entry in
int_cst_hash_table can be deleted because it doesn't think that C1 is
live.  Therefore the next attempt to map (sizetype, 1000) will yield a
new integral constant tree node, C2.  Then the mapping changes to:

  P.param -> C1, F.param -> C2

and we can no longer use

  TYPE_UPC_BLOCKING_FACTOR (P.param) == TYPE_UPC_BLOCKING_FACTOR (F.param)

to check that the blocking factors of P.param and F.param are equal.
For the GC to know that the int_cst_hash_table entry is needed,
perhaps the upc_block_factor_for_type table needs to be traversed, and
then each integer constant needs to be "marked"; or the constant has
to be hashed into int_cst_hash_table and the actual hash table entry
has to be marked.

I am not familiar with the details of garbage collection and pretty
much just try to use existing code as a model.  Apparently, this
sequence of statements is insufficient to tell the GC that it should
mark the integer constants referenced in this hash table as "in use".

  static GTY ((if_marked ("tree_map_marked_p"),
               param_is (struct tree_map)))
    htab_t upc_block_factor_for_type;
  [...]
  upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
                                               tree_map_eq, 0);

Reading the GC code:

  static int
  ggc_htab_delete (void **slot, void *info)
  {
    const struct ggc_cache_tab *r = (const struct ggc_cache_tab *) info;

    if (! (*r->marked_p) (*slot))
      htab_clear_slot (*r->base, slot);
    else
      (*r->cb) (*slot);

    return 1;
  }

It appears that the int_cst_hash_table entry for C1 needs to be marked
or it will be cleared.  I don't know how to set things up so that the
garbage collection mechanisms are in place to do that, and was hoping
that the tree map hash table would provide the required mechanisms.
Apparently, this is not the case.

I had hoped that this declaration would be sufficient to convince the
GC to consider all mapped integer constant nodes to be "live".  If
not, then perhaps I need a GC hook associated with
upc_block_factor_for_type that does something like the following:

  for t in (each used upc_block_factor_for_type entry):
    c = t.to                # the mapped integer constant
    if is_integer_constant (c):
      h = int_cst_hash_table.hash (c)
      gc_mark_htab (int_cst_hash_table, h)

or perhaps this is sufficient?

  for t in (each used upc_block_factor_for_type entry):
    c = t.to
    gc_mark_tree_node (c)

However, I thought that this would already have been done
automatically by the GC hash tree implementation.
If either of those methods is required, I would appreciate
suggestions/pointers/code that would help me make sure that this
approach is implemented correctly.

thanks,
- Gary
> The problem is that the references to object C1 and C2 live in
> a hash table, and that although the referenced nodes will be retained
> by the garbage collector, their mapping in int_cst_hash_table is
> deleted by the GC.

This isn't a simple hash table, is it?

> I am not familiar with the details of garbage collection and
> pretty much just try to use existing code as a model.  Apparently,
> this sequence of statements is insufficient to tell the GC that it should
> mark the integer constants referenced in this hash table as "in use".
>
>   static GTY ((if_marked ("tree_map_marked_p"),
>                param_is (struct tree_map)))
>     htab_t upc_block_factor_for_type;
>   [...]
>   upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
>                                                tree_map_eq, 0);

This is a map so it is garbage-collected as a map: if the key isn't
marked, then the value isn't either.  Hence 2 questions:
- why use a map and not a simple hash table?
- what is the key and why isn't it marked?
On 10/12/11 09:27:43, Eric Botcazou wrote:
> > The problem is that the references to object C1 and C2 live in
> > a hash table, and that although the referenced nodes will be retained
> > by the garbage collector, their mapping in int_cst_hash_table is
> > deleted by the GC.
>
> This isn't a simple hash table, is it?

At the moment, it is a simple garbage collected hash table, yes.  It
maps a type node into a corresponding integer node that is the
"blocking factor" associated with the type node.  Before the advent of
this hash table, the blocking factor was stored in a dedicated field
in the tree type node.  The suggestion was made to move this into a
hash table to save space.  I chose the "tree map" hash table because I
thought it could do the job.  It does work well, except when the GC
steps in and decides to remove a node from int_cst_hash_table, because
it thinks that the integer constant is no longer needed.

> This is a map so it is garbage-collected as a map: if the key isn't marked,
> then the value isn't either.

The keys are valid.  In the example discussed in this thread, there is
a pointer-to-type node that is used in a parameter declaration of a
function prototype and also in the similarly named parameter of the
function definition.  Both tree pointers are used as keys, and they
are "live" at the point that the GC runs.

> Hence 2 questions:
> - why use a map and not a simple hash table?

I had thought that the garbage collected nature of the map would
perhaps save some space and better track the lifetimes of the type
nodes that have a UPC blocking factor.  It would be easy enough to
switch to a hash table that is not garbage collected and where
entries, once allocated, remain that way until the end of the
compilation.

> - what is the key and why isn't it marked?

The key is a type.  The value is a "sizetype" integer constant.

  /* Return the UPC blocking factor of the type given by NODE.
     The default block factor is one.  The additional flag bits
     over-ride the default.
  */
  #define TYPE_BLOCK_FACTOR(NODE) \
    (TYPE_SHARED (NODE) \
     ? (TYPE_HAS_BLOCK_FACTOR_0 (NODE) ? size_zero_node \
        : TYPE_HAS_BLOCK_FACTOR_X (NODE) ? upc_block_factor_lookup (NODE) \
        : NULL_TREE) \
     : NULL_TREE)

A blocking factor of 0 is a special case; it is encoded as a flag.
Otherwise, if the blocking factor has been set, TYPE_HAS_BLOCK_FACTOR_X()
is true, and the blocking factor for NODE is looked up in a hash table.
> It maps a type node into a corresponding integer node that is
> the "blocking factor" associated with the type node.  Before
> the advent of this hash table the blocking factor was stored
> in a dedicated field in the tree type node.  The suggestion was
> made to move this into a hash table to save space.  I chose
> the "tree map" hash table because I thought it could do the job.

So this isn't a simple hash table, since it is a map.  A simple hash
table doesn't store the key in the slot, only the value; a map does.

> The keys are valid.  In the example discussed in this thread,
> there is a pointer-to-type node that is used in a parameter declaration
> of a function prototype and also in the similarly named parameter
> of the function definition.  Both tree pointers are used as keys,
> and they are "live" at the point that the GC runs.

But somehow they aren't marked by the GC.  You need to find out why,
since the value will be kept only if the key is already marked by the GC.

By the time a GC pass is run, all trees to be kept must be linked to a
GC root.  You said that the pass was run "between the point that the
function prototype tree node was created and the point at which the
function declaration was processed".  To which GC root are the keys
linked between these points?
On Wed, Oct 12, 2011 at 10:29 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> It maps a type node into a corresponding integer node that is
>> the "blocking factor" associated with the type node.  Before
>> the advent of this hash table the blocking factor was stored
>> in a dedicated field in the tree type node.  The suggestion was
>> made to move this into a hash table to save space.  I chose
>> the "tree map" hash table because I thought it could do the job.
>
> So this isn't a simple hash table since this is a map.  A simple hash table
> doesn't store the key in the slot, only the value; a map does.
>
>> The keys are valid.  In the example discussed in this thread,
>> there is a pointer to type node that is used in a parameter declaration
>> of a function prototype and also in the similarly named parameter
>> of the function definition.  Both tree pointers are used as keys,
>> and they are "live" at the point that the GC runs.
>
> But somehow they aren't marked by the GC.  You need to find out why, since the
> value will be kept only if the key is already marked by the GC.
>
> By the time a GC pass is run, all trees to be kept must be linked to a GC root.
> You said that the pass was run "between the point that the function prototype
> tree node was created and the point at which the function declaration was
> processed".  To which GC root are the keys linked between these points?

I think there is an issue when two cache htabs refer to each other
with respect to GC, you might search the list to find out more.

Richard.

> --
> Eric Botcazou
On 10/12/11 14:00:54, Richard Guenther wrote:
> I think there is an issue when two cache htabs refer to each other
> with respect to GC, you might search the list to find out more.

Richard, thanks.  I thought that might be the case, but I don't
understand the GC well enough to make this determination.

- Gary
> I think there is an issue when two cache htabs refer to each other
> with respect to GC, you might search the list to find out more.

I'm not sure this is the case here; there seems to be a clear hierarchy.
2011/10/11 Gary Funck <gary@intrepid.com>:
>   static GTY ((if_marked ("tree_map_marked_p"),
>                param_is (struct tree_map)))
>     htab_t upc_block_factor_for_type;
>   [...]
>   upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
>                                                tree_map_eq, 0);
>
> I had hoped that this would be sufficient to ensure that all
> integer constant references recorded in this hash table would
> be considered "live" by the GC.  Reading the code in tree_map_marked_p(),
> however, I see the following:
>
>   #define tree_map_marked_p tree_map_base_marked_p
>   [...]
>   /* Return true if this tree map structure is marked for garbage collection
>      purposes.  We simply return true if the from tree is marked, so that this
>      structure goes away when the from tree goes away.  */
>
>   int
>   tree_map_base_marked_p (const void *p)
>   {
>     return ggc_marked_p (((const struct tree_map_base *) p)->from);
>   }
>
> This takes care of recycling an entry when the '->from' reference
> goes away, but it doesn't make sure that the '->to' reference is
> considered "live".  I don't understand the GC well enough to
> know when/where the '->to' entry should be marked as "live".

If "->from" is marked, then the GC will mark the whole hash table
entry, including the tree_map::to field.

Currently I believe that your issue is one hash table pointing to
another one.  Basically, all the objects that need to be live must be
known before the hash table marking starts, either by being already
marked in that GC run, or by having mark bits set somewhere where the
if_marked option will check them correctly.

In your case (correct me if I misunderstood something) you have one
hash table, the marking of which will mark more objects which are
required for the correct marking of the second hash table.  GC might
be simply walking the second one first.
On 10/13/11 06:15:31, Laurynas Biveinis wrote:
> [...] In your case (correct me if I misunderstood something)
> you have one hash table, marking of which will mark more objects
> which are required for the correct marking of the second hash table.
> GC might be simply walking the second one first.

Yes, I think that this accurately summarizes the situation, and the
result.

Any suggestions on how to fix this?  It seems that one fix might be to
use a non-garbage-collected hash table for the hash map.

- Gary
2011/10/13 Gary Funck <gary@intrepid.com>:
> On 10/13/11 06:15:31, Laurynas Biveinis wrote:
>> [...] In your case (correct me if I misunderstood something)
>> you have one hash table, marking of which will mark more objects
>> which are required for the correct marking of the second hash table.
>> GC might be simply walking the second one first.
>
> Yes, I think that this accurately summarizes the situation, and the result.
>
> Any suggestions on how to fix this?  It seems that one fix might
> be to use a non garbage-collected hash table for the hash map.

Is it feasible to write an if_marked function for the second hash
table that would duplicate the work of the first hash table's function
and then some?  I.e. it would determine if an entry needs to be marked
based on information outside of both hash tables and independently of
the first one (even if duplicating its logic).
--- gcc/tree.h (.../trunk) (revision 175584) +++ gcc/tree.h (.../branches/gupc) (revision 175683) @@ -462,8 +462,11 @@ struct GTY(()) tree_base { unsigned packed_flag : 1; unsigned user_align : 1; unsigned nameless_flag : 1; + unsigned upc_shared_flag : 1; + unsigned upc_strict_flag : 1; + unsigned upc_relaxed_flag : 1; - unsigned spare : 12; + unsigned spare : 9; @@ -2405,6 +2469,9 @@ struct GTY(()) tree_type_common { alias_set_type alias_set;