From patchwork Mon Mar 11 09:52:37 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 226506 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id EFD632C02B0 for ; Mon, 11 Mar 2013 20:53:01 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1363600382; h=Comment: DomainKey-Signature:Received:Received:Received:Received: MIME-Version:Received:In-Reply-To:References:Date:Message-ID: Subject:From:To:Cc:Content-Type:Mailing-List:Precedence:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:Sender: Delivered-To; bh=ltph4JTAaDUWBLycJ78c6kZgGiY=; b=mv3MooP3qoWMsbI IP47UWz2oQX4eLzboIhIZGWbvWRD3R7iELcaCvJE3dtnADuISdnDtmcDhroZDGeB dr8xH3TquhHYo6cN8Hkw3sI5Og5A0gEwrN55wHInPUzyhyoszF0r+moozY85Ej3r /moZpvUFCVKA47OMIQgW62ktJpaM= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:MIME-Version:X-Received:Received:In-Reply-To:References:Date:Message-ID:Subject:From:To:Cc:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=uRM0Bd/drPgHf9CTQzWsCUtWNYdGz7VKZeeEKPMZ3Cj5f50YqfYfk+Zdvmlaib Hd9NpcijzUik/KEGDrsBHFYg+H5ofyFgcaB6QILqn2PNm5cvonx/yKWfpG6luyAo JOG1uOs5g1br/3WwBBlqfkQaGeo6ZU8i4KG6WDBB01pHQ=; Received: (qmail 10547 invoked by alias); 11 Mar 2013 09:52:50 -0000 Received: (qmail 10538 invoked by uid 22791); 11 Mar 2013 09:52:49 -0000 X-SWARE-Spam-Status: No, hits=-4.9 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, KHOP_RCVD_TRUST, KHOP_THREADED, RCVD_IN_DNSWL_LOW, RCVD_IN_HOSTKARMA_YE, TW_TM X-Spam-Check-By: sourceware.org Received: from mail-we0-f169.google.com (HELO mail-we0-f169.google.com) (74.125.82.169) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 11 Mar 2013 09:52:41 +0000 Received: by mail-we0-f169.google.com with SMTP id t11so3336455wey.14 for ; Mon, 11 Mar 2013 02:52:39 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.180.100.169 with SMTP id ez9mr11276648wib.3.1362995557522; Mon, 11 Mar 2013 02:52:37 -0700 (PDT) Received: by 10.194.56.100 with HTTP; Mon, 11 Mar 2013 02:52:37 -0700 (PDT) In-Reply-To: References: Date: Mon, 11 Mar 2013 10:52:37 +0100 Message-ID: Subject: Re: [patch] PR middle-end/39326 - limit LIM From: Richard Biener To: Steven Bosscher Cc: GCC Patches , Jakub Jelinek , Zdenek Dvorak X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On Sun, Mar 10, 2013 at 2:08 AM, Steven Bosscher wrote: > Hello, > > The attached patch fixes another one of the scalability problems > reported in PR middle-end/39326. > > This problem is that tree loop-invariant code motion explodes on basic > blocks with many memory references. Compile time is quadratic in the > number of memory references in the basic block, and so are the memory > requirements when the dependences or independences are propagated > bottom-up through the loop tree. > > The fix is to give up on loops with too many memory references to > handle. There is already a param for that for loop dependence > analysis, and this patch makes LIM use the same param. > > Bootstrapped&tested on {x86_64,powerpc64}-unknown-linux-gnu. > OK for trunk? Given + well. Return true if all is well, false if something happened + that is fatal to the rest of the LIM pass. */ -static void +static bool gather_mem_refs_stmt (struct loop *loop, gimple stmt) and FOR_EACH_BB (bb) { ... + for (bsi = gsi_start_bb (bb); + !gsi_end_p (bsi) && all_ok; + gsi_next (&bsi)) + all_ok = gather_mem_refs_stmt (loop, gsi_stmt (bsi)); + + if (! all_ok) + bitmap_set_bit (loops_with_too_many_memrefs, loop->num); + } + + /* Propagate the information about loops with too many memory + references up the loop hierarchy. */ + FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST) + { + struct loop *outer = loop_outer (loop); + if (outer == current_loops->tree_root + || ! bitmap_bit_p (loops_with_too_many_memrefs, loop->num)) + continue; + bitmap_set_bit (loops_with_too_many_memrefs, outer->num); } I don't see how this propagation works correctly as you start to mark BBs as not-ok starting from a "random" basic-block in the loop tree. You of course also end up disqualifying very small loops completely if they happen to be analyzed after a very big one you disqualify. Of course that's partly because memory_accesses contains all memory accesses in the function - so I think rather than limiting on length of memory_accesses you want to limit on the length of memory_accesses.refs_in_loop (well, on memory_accesses.all_refs_in_loop). And you want the initial walk over all BBs to instead walk on BBs FOR_EACH_LOOP and LI_FROM_INNERMOST (you can then do the propagation to fill all_refs_in_loop there, too). I realize there isn't a good existing BB walker for this (with a suitable place to re-set all-ok) - a simple walk over get_loop_body via LI_FROM_INNERMOST would get you visit sub-loop bodies multiple times (easily skipped by a bb->loop_father test, of course, but still ...) That is, sth like the following preparatory patch to change the iteration: At this point this should be stage1 material, eventually backported for 4.8.1. And yes, aside from the above the rest of the patch looks good to me (move loops_with_too_many_memrefs into the memory_accesses struct?) Thanks, Richard. > Ciao! > Steven > > (The ChangeLog is a bit long but the patch is relatively straight-forward.) > > * tree-flow.h (enum move_pos): Moved to tree-ssa-loop-im.c. > * tree-ssa-loop-im.c: Include diagnostic-core.h for warning_at() > (enum move_pos): Moved here. > (movement_possibility): Made static. Reject memory operations in > loops with too many memory references to handle. > (determine_max_movement): Take loops_with_too_many_memrefs argument. > For statements referencing memory, find the outermost superloop > that is not in the loops_with_too_many_memrefs set. > (determine_invariantness_stmt): Take loops_with_too_many_memrefs > via dom_walk_data.global_data, and pass it along where necessary. > Hoist "pos == MOVE_POSSIBLE" test. > (determine_invariantness): Take loops_with_too_many_memrefs argument. > (move_computations): Likewise, but unused for now. > (gather_mem_refs_stmt): Fail if there are too many memory references. > Use PARAM_LOOP_MAX_DATAREFS_FOR_DATADEPS as threshold. Add disabled > optimization warning. > (gather_mem_refs_in_loops): Take loops_with_too_many_memrefs argument. > Propagate it from inner loops to outer loops. Do not propagate > recorded memory references for loops on which memory optimizations > are disabled. > (create_vop_ref_mapping): Take loops_with_too_many_memrefs argument. > Don't create a mapping on loops that are in this set. > (analyze_memory_references): Take loops_with_too_many_memrefs argument > and let subroutines fill it. > (store_motion): Take loops_with_too_many_memrefs argument. > Skip loops that are in this set. > (tree_ssa_lim): Allocate, pass, and free loops_with_too_many_memrefs. Index: gcc/tree-ssa-loop-im.c =================================================================== --- gcc/tree-ssa-loop-im.c (revision 196547) +++ gcc/tree-ssa-loop-im.c (working copy) @@ -1629,29 +1629,30 @@ gather_mem_refs_in_loops (void) loop_iterator li; bitmap lrefs, alrefs, alrefso; - FOR_EACH_BB (bb) - { - loop = bb->loop_father; - if (loop == current_loops->tree_root) - continue; - - for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) - gather_mem_refs_stmt (loop, gsi_stmt (bsi)); - } - - /* Propagate the information about accessed memory references up - the loop hierarchy. */ FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST) { + basic_block bbs = get_loop_body (loop); + for (i = 0; i < loop->num_nodes; ++i) + { + bb = bbs[i]; + if (bb->loop_father != loop) + continue; + for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) + gather_mem_refs_stmt (loop, gsi_stmt (bsi)); + } + free (bbs); + + /* Propagate the information about accessed memory references up + the loop hierarchy. */ lrefs = memory_accesses.refs_in_loop[loop->num]; alrefs = memory_accesses.all_refs_in_loop[loop->num]; bitmap_ior_into (alrefs, lrefs); - if (loop_outer (loop) == current_loops->tree_root) - continue; - - alrefso = memory_accesses.all_refs_in_loop[loop_outer (loop)->num]; - bitmap_ior_into (alrefso, alrefs); + if (loop_outer (loop) != current_loops->tree_root) + { + alrefso = memory_accesses.all_refs_in_loop[loop_outer (loop)->num]; + bitmap_ior_into (alrefso, alrefs); + } } }