From patchwork Mon Nov 7 15:52:20 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aldy Hernandez X-Patchwork-Id: 124118 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id E4C651007D4 for ; Tue, 8 Nov 2011 02:52:43 +1100 (EST) Received: (qmail 1017 invoked by alias); 7 Nov 2011 15:52:42 -0000 Received: (qmail 999 invoked by uid 22791); 7 Nov 2011 15:52:40 -0000 X-SWARE-Spam-Status: No, hits=-7.2 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, SPF_HELO_PASS X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 07 Nov 2011 15:52:23 +0000 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pA7FqNqB015943 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Mon, 7 Nov 2011 10:52:23 -0500 Received: from austin.quesejoda.com (vpn-10-58.rdu.redhat.com [10.11.10.58]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id pA7FqKsW029318; Mon, 7 Nov 2011 10:52:21 -0500 Message-ID: <4EB7FEB4.1040408@redhat.com> Date: Mon, 07 Nov 2011 07:52:20 -0800 From: Aldy Hernandez User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Thunderbird/3.1.10 MIME-Version: 1.0 To: Richard Guenther CC: Richard Henderson , gcc-patches Subject: Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH) References: <4EB2EC3F.6000908@redhat.com> <4EB6D797.4070309@redhat.com> <4EB6EC02.8010907@redhat.com> <4EB772DD.7020904@redhat.com> In-Reply-To: Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org > This won't work - DECL_UIDs are not stable -g vs. -g0 - only their > _order_ is stable - thus you won't get comparison fails with code generated > dependent on DECL_UID order, but you will if you depend on the DECL_UID > value (which you do by using it as a hash). > > And we will still generate different object files based on garbage collector > settings this way - GC can shrink hashtables, causing re-hashing, > which changes the order of the elements. It also causes re-ordering > with slight unrelated code changes (but if you say at runtime we always > sort the thing that might not be an issue). Ah, I get it. > Thus, the patch isn't a fix to get a stable order (you can't get > that for hashtable walks). A quick fix is to collect the elements > into a VEC and qsort that after some stable key (like the DECL_UID). I've done so in the attached patch. Notice I am hijacking the alias_pair object, because it has everything we need, though the DECL_UID goes in a field called emitted_diags. I am avoiding having to create an exact same object with the exact same fields. I can change duplicate this, if it's a big eye sore. Preliminary tests show the TM tests all work, but a full test run is still underway. OK pending tests? Index: ChangeLog.tm-merge =================================================================== --- ChangeLog.tm-merge (revision 181067) +++ ChangeLog.tm-merge (working copy) @@ -120,7 +120,7 @@ (GTFILES): Add trans-mem.c. * omp-low.c (WALK_SUBSTMTS): Add GIMPLE_TRANSACTION. * output.h (record_tm_clone_pair, finish_tm_clone_pairs, - finish_tm_clone_pairs_1, get_tm_clone_pair): Declare. + get_tm_clone_pair): Declare. * params.def (PARAM_TM_MAX_AGGREGATE_SIZE): New. * passes.c (init_optimization_passes): Place transactional memory passes. @@ -190,5 +190,5 @@ is_tm_may_cancel_outer, is_tm_ending_fndecl, record_tm_replacement, tm_malloc_replacement): Declare. * varasm.c (tm_clone_pairs): New. - (record_tm_clone_pair, finish_tm_clone_pairs, finish_tm_clone_pairs_1, - get_tm_clone_pair): New. + (record_tm_clone_pair, finish_tm_clone_pairs, get_tm_clone_pair, + dump_tm_clone_to_vec, dump_tm_clone_pairs, alias_pair_cmp): New. Index: varasm.c =================================================================== --- varasm.c (revision 181067) +++ varasm.c (working copy) @@ -5901,58 +5901,104 @@ get_tm_clone_pair (tree o) return NULL_TREE; } -/* Helper function for finish_tm_clone_pairs. Dump the clone table. */ +/* Helper function for finish_tm_clone_pairs. Dump a hash table entry + into a VEC in INFO. */ -int -finish_tm_clone_pairs_1 (void **slot, void *info ATTRIBUTE_UNUSED) +static int +dump_tm_clone_to_vec (void **slot, void *info) { struct tree_map *map = (struct tree_map *) *slot; - bool *switched = (bool *) info; - tree src = map->base.from; - tree dst = map->to; - struct cgraph_node *src_n = cgraph_get_node (src); - struct cgraph_node *dst_n = cgraph_get_node (dst); - - /* The function ipa_tm_create_version() marks the clone as needed if - the original function was needed. But we also mark the clone as - needed if we ever called the clone indirectly through - TM_GETTMCLONE. If neither of these are true, we didn't generate - a clone, and we didn't call it indirectly... no sense keeping it - in the clone table. */ - if (!dst_n || !dst_n->needed) - return 1; + VEC(alias_pair,gc) **alias_pairs = (VEC(alias_pair, gc) **) info; + alias_pair *p = VEC_safe_push (alias_pair, gc, *alias_pairs, NULL); + p->decl = map->base.from; + p->target = map->to; + p->emitted_diags = DECL_UID (p->decl); + return 1; +} - /* This covers the case where we have optimized the original - function away, and only access the transactional clone. */ - if (!src_n || !src_n->needed) - return 1; +/* Dump the actual pairs to the .tm_clone_table section. */ - if (!*switched) +static void +dump_tm_clone_pairs (VEC(alias_pair,gc) *alias_pairs) +{ + unsigned i; + alias_pair *p; + bool switched = false; + + FOR_EACH_VEC_ELT (alias_pair, alias_pairs, i, p) { - switch_to_section (get_named_section (NULL, ".tm_clone_table", 3)); - assemble_align (POINTER_SIZE); - *switched = true; + tree src = p->decl; + tree dst = p->target; + struct cgraph_node *src_n = cgraph_get_node (src); + struct cgraph_node *dst_n = cgraph_get_node (dst); + + /* The function ipa_tm_create_version() marks the clone as needed if + the original function was needed. But we also mark the clone as + needed if we ever called the clone indirectly through + TM_GETTMCLONE. If neither of these are true, we didn't generate + a clone, and we didn't call it indirectly... no sense keeping it + in the clone table. */ + if (!dst_n || !dst_n->needed) + continue; + + /* This covers the case where we have optimized the original + function away, and only access the transactional clone. */ + if (!src_n || !src_n->needed) + continue; + + if (!switched) + { + switch_to_section (get_named_section (NULL, ".tm_clone_table", 3)); + assemble_align (POINTER_SIZE); + switched = true; + } + + assemble_integer (XEXP (DECL_RTL (src), 0), + POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1); + assemble_integer (XEXP (DECL_RTL (dst), 0), + POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1); } +} - assemble_integer (XEXP (DECL_RTL (src), 0), - POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1); - assemble_integer (XEXP (DECL_RTL (dst), 0), - POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1); - return 1; +/* Helper comparison function for qsorting by the DECL_UID stored in + alias_pair->emitted_diags. */ + +static int +alias_pair_cmp (const void *x, const void *y) +{ + const alias_pair *p1 = (const alias_pair *) x; + const alias_pair *p2 = (const alias_pair *) y; + if (p1->emitted_diags < p2->emitted_diags) + return -1; + if (p1->emitted_diags > p2->emitted_diags) + return 1; + return 0; } void finish_tm_clone_pairs (void) { - bool switched = false; + VEC(alias_pair,gc) *alias_pairs = NULL; if (tm_clone_pairs == NULL) return; - htab_traverse_noresize (tm_clone_pairs, finish_tm_clone_pairs_1, - (void *) &switched); + /* We need a determenistic order for the .tm_clone_table, otherwise + we will get bootstrap comparison failures, so dump the hash table + to a vector, sort it, and dump the vector. */ + + /* Dump the hashtable to a vector. */ + htab_traverse_noresize (tm_clone_pairs, dump_tm_clone_to_vec, + (void *) &alias_pairs); + /* Sort it. */ + VEC_qsort (alias_pair, alias_pairs, alias_pair_cmp); + + /* Dump it. */ + dump_tm_clone_pairs (alias_pairs); + htab_delete (tm_clone_pairs); tm_clone_pairs = NULL; + VEC_free (alias_pair, gc, alias_pairs); } Index: output.h =================================================================== --- output.h (revision 181067) +++ output.h (working copy) @@ -608,7 +608,6 @@ extern void output_section_asm_op (const extern void record_tm_clone_pair (tree, tree); extern void finish_tm_clone_pairs (void); -extern int finish_tm_clone_pairs_1 (void **, void *); extern tree get_tm_clone_pair (tree); extern void default_asm_output_source_filename (FILE *, const char *);