From patchwork Tue Dec 3 12:50:39 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yufeng Zhang X-Patchwork-Id: 296185 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 6C67E2C009A for ; Tue, 3 Dec 2013 23:51:28 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; q=dns; s=default; b=cuyqu88f2GcypW8P4 BgcuFM748k8eweON0IvKmbciObVsGlwzPbPA+1UTE7nzMNi3pUOxFeOkWFQdM61X 0tVPY94r4zMJJdFX0BO441JFPH8mxUfz6DMRav4ewWU0fqDjI6pDlGaUx+5D+nKL PsUQzSh3S3Fah1S3meP30qvja4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; s=default; bh=OXWvXhUeppBPTWTPV5cwkuu lp94=; b=KAyyn+plLLXdrauiGMYzCjlUuEjx5/9KzejawPri2Y2UcEH7CSoRRHQ EMN07Ugx4HELereszUHT9YpGG8hQhFwt511tUGTQXM/kzFZKce5eca1fXD8TyeWt pIWHfkGYRxs4XMdjeWSj2NGhqEEiN0aZ6s4R9jGW0OTDyNzqbPYQ= Received: (qmail 16705 invoked by alias); 3 Dec 2013 12:51:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 16694 invoked by uid 89); 3 Dec 2013 12:51:17 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.8 required=5.0 tests=AWL, BAYES_95, RDNS_NONE, SPF_PASS, URIBL_BLOCKED autolearn=no version=3.3.2 X-HELO: service87.mimecast.com Received: from Unknown (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 03 Dec 2013 12:50:50 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Tue, 03 Dec 2013 12:50:41 +0000 Received: from [10.1.201.52] ([10.1.255.212]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Tue, 3 Dec 2013 12:50:40 +0000 Message-ID: <529DD39F.9080405@arm.com> Date: Tue, 03 Dec 2013 12:50:39 +0000 From: Yufeng Zhang User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:8.0) Gecko/20111105 Thunderbird/8.0 MIME-Version: 1.0 To: Jeff Law CC: Richard Biener , Bill Schmidt , "gcc-patches@gcc.gnu.org" Subject: Re: [PING] [PATCH] Optional alternative base_expr in finding basis for CAND_REFs References: <5277EA58.5020303@arm.com> <1384189786.8213.28.camel@gnopaine> <5282ACE5.8020304@arm.com> <1384365888.8213.65.camel@gnopaine> <5283D3B5.1040300@arm.com> <1384373498.8213.76.camel@gnopaine> <1384376098.8213.94.camel@gnopaine> <52840A69.1020308@arm.com> <5294B7F4.4020106@arm.com> <529CABA5.1040003@arm.com> <529D7ED6.6060408@redhat.com> In-Reply-To: <529D7ED6.6060408@redhat.com> X-MC-Unique: 113120312504102301 X-IsSubscribed: yes On 12/03/13 06:48, Jeff Law wrote: > On 12/02/13 08:47, Yufeng Zhang wrote: >> Ping~ >> >> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03360.html > >> >> Thanks, >> Yufeng >> >> On 11/26/13 15:02, Yufeng Zhang wrote: >>> On 11/26/13 12:45, Richard Biener wrote: >>>> On Thu, Nov 14, 2013 at 12:25 AM, Yufeng >>>> Zhang wrote: >>>>> On 11/13/13 20:54, Bill Schmidt wrote: >>>>>> The second version of your original patch is ok with me with the >>>>>> following changes. Sorry for the little side adventure into the >>>>>> next-interp logic; in the end that's going to hurt more than it >>>>>> helps in >>>>>> this case. Thanks for having a look at it, anyway. Thanks also for >>>>>> cleaning up this version to be less intrusive to common interfaces; I >>>>>> appreciate it. >>>>> >>>>> >>>>> Thanks a lot for the review. I've attached an updated patch with the >>>>> suggested changes incorporated. >>>>> >>>>> For the next-interp adventure, I was quite happy to do the >>>>> experiment; it's >>>>> a good chance of gaining insight into the pass. Many thanks for >>>>> your prompt >>>>> replies and patience in guiding! >>>>> >>>>> >>>>>> Everything else looks OK to me. Please ask Richard for final >>>>>> approval, >>>>>> as I'm not a maintainer. > First a note, I need to check on voting for Bill as the slsr maintainer > from the steering committee. Voting was in progress just before the > close of stage1 development so I haven't tallied the results :-) Looking forward to some good news! :) >>> >>> Yes, you are right about the non-trivial 'base' tree are rarely shared. >>> The cached is introduced mainly because get_alternative_base () may be >>> called twice on the same 'base' tree, once in the >>> find_basis_for_candidate () for look-up and the other time in >>> alloc_cand_and_find_basis () for record_potential_basis (). I'm happy >>> to leave out the cache if you think the benefit is trivial. > Without some sense of how expensive the lookups are vs how often the > cache hits it's awful hard to know if the cache is worth it. > > I'd say take it out unless you have some sense it's really saving time. > It's a pretty minor implementation detail either way. I think the affine tree routines are generally expensive; it is worth having a cache to avoid calling them too many times. I run the slsr-*.c tests under gcc.dg/tree-ssa/ and find out that the cache hit rates range from 55.6% to 90%, with 73.5% as the average. The samples may not well represent the real world scenario, but they do show the fact that the 'base' tree can be shared to some extent. So I'd like to have the cache in the patch. > >>> >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-O2 -fdump-tree-slsr" } */ >>>> + >>>> +typedef int arr_2[50][50]; >>>> + >>>> +void foo (arr_2 a2, int v1) >>>> +{ >>>> + int i, j; >>>> + >>>> + i = v1 + 5; >>>> + j = i; >>>> + a2 [i-10] [j] = 2; >>>> + a2 [i] [j++] = i; >>>> + a2 [i+20] [j++] = i; >>>> + a2 [i-3] [i-1] += 1; >>>> + return; >>>> +} >>>> + >>>> +/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */ >>>> +/* { dg-final { cleanup-tree-dump "slsr" } } */ >>>> >>>> scanning for 5 MEMs looks non-sensical. What transform do >>>> you expect? I see other slsr testcases do similar non-sensical >>>> checking which is bad, too. >>> >>> As the slsr optimizes CAND_REF candidates by simply lowering them to >>> MEM_REF from e.g. ARRAY_REF, I think scanning for the number of MEM_REFs >>> is an effective check. Alternatively, I can add a follow-up patch to >>> add some dumping facility in replace_ref () to print out the replacing >>> actions when -fdump-tree-slsr-details is on. > I think adding some details to the dump and scanning for them would be > better. That's the only change that is required for this to move forward. I've updated to patch to dump more details when -fdump-tree-slsr-details is on. The tests have also been updated to scan for these new dumps instead of MEMs. > > I suggest doing it quickly. We're well past stage1 close at this point. The bootstrapping on x86_64 is still running. OK to commit if it succeeds? Thanks, Yufeng gcc/ * gimple-ssa-strength-reduction.c: Include tree-affine.h. (name_expansions): New static variable. (alt_base_map): Ditto. (get_alternative_base): New function. (find_basis_for_candidate): For CAND_REF, optionally call find_basis_for_base_expr with the returned value from get_alternative_base. (record_potential_basis): Add new parameter 'base' of type 'tree'; add an assertion of non-NULL base; use base to set node->base_expr. (alloc_cand_and_find_basis): Update; call record_potential_basis for CAND_REF with the returned value from get_alternative_base. (replace_refs): Dump details on the replacing. (execute_strength_reduction): Call pointer_map_create for alt_base_map; call free_affine_expand_cache with &name_expansions. gcc/testsuite/ * gcc.dg/tree-ssa/slsr-39.c: Update. * gcc.dg/tree-ssa/slsr-41.c: New test. diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c index 88afc91..bf3362f 100644 --- a/gcc/gimple-ssa-strength-reduction.c +++ b/gcc/gimple-ssa-strength-reduction.c @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3. If not see #include "params.h" #include "hash-table.h" #include "tree-ssa-address.h" +#include "tree-affine.h" /* Information about a strength reduction candidate. Each statement in the candidate table represents an expression of one of the @@ -420,6 +421,42 @@ cand_chain_hasher::equal (const value_type *chain1, const compare_type *chain2) /* Hash table embodying a mapping from base exprs to chains of candidates. */ static hash_table base_cand_map; +/* Pointer map used by tree_to_aff_combination_expand. */ +static struct pointer_map_t *name_expansions; +/* Pointer map embodying a mapping from bases to alternative bases. */ +static struct pointer_map_t *alt_base_map; + +/* Given BASE, use the tree affine combiniation facilities to + find the underlying tree expression for BASE, with any + immediate offset excluded. */ + +static tree +get_alternative_base (tree base) +{ + tree *result = (tree *) pointer_map_contains (alt_base_map, base); + + if (result == NULL) + { + tree expr; + aff_tree aff; + + tree_to_aff_combination_expand (base, TREE_TYPE (base), + &aff, &name_expansions); + aff.offset = tree_to_double_int (integer_zero_node); + expr = aff_combination_to_tree (&aff); + + result = (tree *) pointer_map_insert (alt_base_map, base); + gcc_assert (!*result); + + if (expr == base) + *result = NULL; + else + *result = expr; + } + + return *result; +} + /* Look in the candidate table for a CAND_PHI that defines BASE and return it if found; otherwise return NULL. */ @@ -440,8 +477,9 @@ find_phi_def (tree base) } /* Helper routine for find_basis_for_candidate. May be called twice: - once for the candidate's base expr, and optionally again for the - candidate's phi definition. */ + once for the candidate's base expr, and optionally again either for + the candidate's phi definition or for a CAND_REF's alternative base + expression. */ static slsr_cand_t find_basis_for_base_expr (slsr_cand_t c, tree base_expr) @@ -518,6 +556,13 @@ find_basis_for_candidate (slsr_cand_t c) } } + if (!basis && c->kind == CAND_REF) + { + tree alt_base_expr = get_alternative_base (c->base_expr); + if (alt_base_expr) + basis = find_basis_for_base_expr (c, alt_base_expr); + } + if (basis) { c->sibling = basis->dependent; @@ -528,17 +573,21 @@ find_basis_for_candidate (slsr_cand_t c) return 0; } -/* Record a mapping from the base expression of C to C itself, indicating that - C may potentially serve as a basis using that base expression. */ +/* Record a mapping from BASE to C, indicating that C may potentially serve + as a basis using that base expression. BASE may be the same as + C->BASE_EXPR; alternatively BASE can be a different tree that share the + underlining expression of C->BASE_EXPR. */ static void -record_potential_basis (slsr_cand_t c) +record_potential_basis (slsr_cand_t c, tree base) { cand_chain_t node; cand_chain **slot; + gcc_assert (base); + node = (cand_chain_t) obstack_alloc (&chain_obstack, sizeof (cand_chain)); - node->base_expr = c->base_expr; + node->base_expr = base; node->cand = c; node->next = NULL; slot = base_cand_map.find_slot (node, INSERT); @@ -554,10 +603,18 @@ record_potential_basis (slsr_cand_t c) } /* Allocate storage for a new candidate and initialize its fields. - Attempt to find a basis for the candidate. */ + Attempt to find a basis for the candidate. + + For CAND_REF, an alternative base may also be recorded and used + to find a basis. This helps cases where the expression hidden + behind BASE (which is usually an SSA_NAME) has immediate offset, + e.g. + + a2[i][j] = 1; + a2[i + 20][j] = 2; */ static slsr_cand_t -alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base, +alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base, double_int index, tree stride, tree ctype, unsigned savings) { @@ -583,7 +640,13 @@ alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base, else c->basis = find_basis_for_candidate (c); - record_potential_basis (c); + record_potential_basis (c, base); + if (kind == CAND_REF) + { + tree alt_base = get_alternative_base (base); + if (alt_base) + record_potential_basis (c, alt_base); + } return c; } @@ -1843,6 +1906,12 @@ replace_ref (tree *expr, slsr_cand_t c) static void replace_refs (slsr_cand_t c) { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fputs ("Replacing reference: ", dump_file); + print_gimple_stmt (dump_file, c->cand_stmt, 0, 0); + } + if (gimple_vdef (c->cand_stmt)) { tree *lhs = gimple_assign_lhs_ptr (c->cand_stmt); @@ -1854,6 +1923,13 @@ replace_refs (slsr_cand_t c) replace_ref (rhs, c); } + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fputs ("With: ", dump_file); + print_gimple_stmt (dump_file, c->cand_stmt, 0, 0); + fputs ("\n", dump_file); + } + if (c->sibling) replace_refs (lookup_cand (c->sibling)); @@ -3524,6 +3600,9 @@ execute_strength_reduction (void) /* Allocate the mapping from base expressions to candidate chains. */ base_cand_map.create (500); + /* Allocate the mapping from bases to alternative bases. */ + alt_base_map = pointer_map_create (); + /* Initialize the loop optimizer. We need to detect flow across back edges, and this gives us dominator information as well. */ loop_optimizer_init (AVOID_CFG_MODIFICATIONS); @@ -3539,6 +3618,9 @@ execute_strength_reduction (void) dump_cand_chains (); } + pointer_map_destroy (alt_base_map); + free_affine_expand_cache (&name_expansions); + /* Analyze costs and make appropriate replacements. */ analyze_candidates_and_replace (); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c index 8cc2798..c146219 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c @@ -6,7 +6,7 @@ *PINDEX: C1 + (C2 * C3) + C4 */ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-slsr" } */ +/* { dg-options "-O2 -fdump-tree-slsr-details" } */ typedef int arr_2[50][50]; @@ -22,5 +22,5 @@ void foo (arr_2 a2, int v1) return; } -/* { dg-final { scan-tree-dump-times "MEM" 4 "slsr" } } */ +/* { dg-final { scan-tree-dump-times "Replacing reference: " 4 "slsr" } } */ /* { dg-final { cleanup-tree-dump "slsr" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c new file mode 100644 index 0000000..2c9d908 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c @@ -0,0 +1,24 @@ +/* Verify straight-line strength reduction in using + alternative base expr to record and look for the + potential candidate. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-slsr-details" } */ + +typedef int arr_2[50][50]; + +void foo (arr_2 a2, int v1) +{ + int i, j; + + i = v1 + 5; + j = i; + a2 [i-10] [j] = 2; + a2 [i] [j++] = i; + a2 [i+20] [j++] = i; + a2 [i-3] [i-1] += 1; + return; +} + +/* { dg-final { scan-tree-dump-times "Replacing reference: " 5 "slsr" } } */ +/* { dg-final { cleanup-tree-dump "slsr" } } */