From patchwork Fri Dec 17 00:13:52 2010
X-Patchwork-Submitter: "Fang, Changpeng"
X-Patchwork-Id: 75827
From: "Fang, Changpeng"
To: Zdenek Dvorak
CC: Richard Guenther, Xinliang David Li, gcc-patches@gcc.gnu.org
Date: Thu, 16 Dec 2010 18:13:52 -0600
Subject: RE: [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
References: <20101214075629.GA10020@kam.mff.cuni.cz> <20101214210552.GA19633@kam.mff.cuni.cz> <20101215092220.GA9872@kam.mff.cuni.cz> <20101216184722.GA5801@kam.mff.cuni.cz>
In-Reply-To: <20101216184722.GA5801@kam.mff.cuni.cz>

Hi,

Based on the previous discussions, I modified the patch as follows: if a loop is marked as non-rolling, optimize_loop_for_size_p returns true and optimize_loop_for_speed_p returns false, so every caller of these two functions is affected.

With the modified patch, pb05 compilation time decreases 29% and binary size decreases 20%, while a small (0.5%) performance increase was observed, which may be just noise.

The modified patch passed bootstrapping and the GCC regression tests on x86_64-unknown-linux-gnu.

Is it OK to commit to trunk?
Thanks,

Changpeng

From cd8b85bba1b39e108235f44d9d07918179ff3d79 Mon Sep 17 00:00:00 2001
From: Changpeng Fang
Date: Mon, 13 Dec 2010 12:01:49 -0800
Subject: [PATCH] Don't perform certain loop optimizations on pre/post loops

	* basic-block.h (bb_flags): Add a new flag BB_HEADER_OF_NONROLLING_LOOP.
	* cfg.c (clear_bb_flags): Keep the BB_HEADER_OF_NONROLLING_LOOP marker.
	* cfgloop.h (mark_non_rolling_loop): New function declaration.
	(non_rolling_loop_p): New function declaration.
	* predict.c (optimize_loop_for_size_p): Return true if the loop was
	marked non-rolling.
	(optimize_loop_for_speed_p): Return false if the loop was marked
	non-rolling.
	* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Mark the
	non-rolling loop.
	* tree-ssa-loop-niter.c (mark_non_rolling_loop): Implement the new
	function.
	(non_rolling_loop_p): Implement the new function.
	* tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Mark the
	non-rolling loop.
	(vect_do_peeling_for_alignment): Mark the non-rolling loop.
---
 gcc/basic-block.h          |    6 +++++-
 gcc/cfg.c                  |    7 ++++---
 gcc/cfgloop.h              |    2 ++
 gcc/predict.c              |    6 ++++++
 gcc/tree-ssa-loop-manip.c  |    3 +++
 gcc/tree-ssa-loop-niter.c  |   20 ++++++++++++++++++++
 gcc/tree-vect-loop-manip.c |    8 ++++++++
 7 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/gcc/basic-block.h b/gcc/basic-block.h
index be0a1d1..850472d 100644
--- a/gcc/basic-block.h
+++ b/gcc/basic-block.h
@@ -245,7 +245,11 @@ enum bb_flags
 
   /* Set on blocks that cannot be threaded through.
      Only used in cfgcleanup.c.  */
-  BB_NONTHREADABLE_BLOCK = 1 << 11
+  BB_NONTHREADABLE_BLOCK = 1 << 11,
+
+  /* Set on blocks that are headers of non-rolling loops.  */
+  BB_HEADER_OF_NONROLLING_LOOP = 1 << 12
+
 };
 
 /* Dummy flag for convenience in the hot/cold partitioning code.  */

diff --git a/gcc/cfg.c b/gcc/cfg.c
index c8ef799..e59a637 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -425,8 +425,8 @@ redirect_edge_pred (edge e, basic_block new_pred)
   connect_src (e);
 }
 
-/* Clear all basic block flags, with the exception of partitioning and
-   setjmp_target.  */
+/* Clear all basic block flags, with the exception of partitioning,
+   setjmp_target, and the non-rolling loop marker.  */
 void
 clear_bb_flags (void)
 {
@@ -434,7 +434,8 @@ clear_bb_flags (void)
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
     bb->flags = (BB_PARTITION (bb)
-		 | (bb->flags & (BB_DISABLE_SCHEDULE + BB_RTL + BB_NON_LOCAL_GOTO_TARGET)));
+		 | (bb->flags & (BB_DISABLE_SCHEDULE + BB_RTL + BB_NON_LOCAL_GOTO_TARGET
+				 + BB_HEADER_OF_NONROLLING_LOOP)));
 }
 
 /* Check the consistency of profile information.  We can't do that

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index bf2614e..e856a78 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -279,6 +279,8 @@ extern rtx doloop_condition_get (rtx);
 void estimate_numbers_of_iterations_loop (struct loop *, bool);
 HOST_WIDE_INT estimated_loop_iterations_int (struct loop *, bool);
 bool estimated_loop_iterations (struct loop *, bool, double_int *);
+void mark_non_rolling_loop (struct loop *);
+bool non_rolling_loop_p (struct loop *);
 
 /* Loop manipulation.  */
 extern bool can_duplicate_loop_p (const struct loop *loop);

diff --git a/gcc/predict.c b/gcc/predict.c
index c691990..bf729f8 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -279,6 +279,9 @@ optimize_insn_for_speed_p (void)
 bool
 optimize_loop_for_size_p (struct loop *loop)
 {
+  /* Loops marked NON-ROLLING are not likely to be hot.  */
+  if (non_rolling_loop_p (loop))
+    return true;
   return optimize_bb_for_size_p (loop->header);
 }
 
@@ -287,6 +290,9 @@ optimize_loop_for_size_p (struct loop *loop)
 bool
 optimize_loop_for_speed_p (struct loop *loop)
 {
+  /* Loops marked NON-ROLLING are not likely to be hot.  */
+  if (non_rolling_loop_p (loop))
+    return false;
   return optimize_bb_for_speed_p (loop->header);
 }

diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 87b2c0d..bc977bb 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -931,6 +931,9 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor,
   gcc_assert (new_loop != NULL);
   update_ssa (TODO_update_ssa);
 
+  /* NEW_LOOP is a non-rolling loop.  */
+  mark_non_rolling_loop (new_loop);
+
   /* Determine the probability of the exit edge of the unrolled loop.  */
   new_est_niter = est_niter / factor;

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index ee85f6f..1e2e4b2 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3011,6 +3011,26 @@ estimate_numbers_of_iterations (bool use_undefined_p)
   fold_undefer_and_ignore_overflow_warnings ();
 }
 
+/* Mark LOOP as a non-rolling loop.  */
+
+void
+mark_non_rolling_loop (struct loop *loop)
+{
+  gcc_assert (loop && loop->header);
+  loop->header->flags |= BB_HEADER_OF_NONROLLING_LOOP;
+}
+
+/* Return true if LOOP is a non-rolling loop.  */
+
+bool
+non_rolling_loop_p (struct loop *loop)
+{
+  int masked_flags;
+  gcc_assert (loop && loop->header);
+  masked_flags = (loop->header->flags & BB_HEADER_OF_NONROLLING_LOOP);
+  return (masked_flags != 0);
+}
+
 /* Returns true if statement S1 dominates statement S2.  */
 
 bool

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 6ecd304..216de78 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1938,6 +1938,10 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio,
				    cond_expr, cond_expr_stmt_list);
   gcc_assert (new_loop);
   gcc_assert (loop_num == loop->num);
+
+  /* NEW_LOOP is a non-rolling loop.  */
+  mark_non_rolling_loop (new_loop);
+
 #ifdef ENABLE_CHECKING
   slpeel_verify_cfg_after_peeling (loop, new_loop);
 #endif
@@ -2191,6 +2195,10 @@ vect_do_peeling_for_alignment (loop_vec_info loop_vinfo)
					    th, true, NULL_TREE, NULL);
 
   gcc_assert (new_loop);
+
+  /* NEW_LOOP is a non-rolling loop.  */
+  mark_non_rolling_loop (new_loop);
+
 #ifdef ENABLE_CHECKING
   slpeel_verify_cfg_after_peeling (new_loop, loop);
 #endif
-- 
1.6.3.3