From patchwork Thu Apr 28 00:52:58 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sharad Singhai
X-Patchwork-Id: 93134
To: reply@codereview.appspotmail.com, dnovillo@google.com, gcc-patches@gcc.gnu.org
Subject: [google] Use different peeling parameters with available profile (issue4438079)
Message-Id: <20110428005259.1DA3615C1FA@nabu.mtv.corp.google.com>
Date: Wed, 27 Apr 2011 17:52:58 -0700 (PDT)
From: singhai@google.com (Sharad Singhai)

This
patch adds new parameters to control peeling when profile feedback
information is available.

For google/main. Tested: bootstrapped on x86_64.

2011-04-27  Sharad Singhai

	* gcc/params.def: Add new parameters to control peeling.
	* gcc/tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Use
	different peeling parameters when profile feedback is available.
	* gcc/loop-unroll.c (decide_peel_once_rolling): Ditto.
	(decide_peel_completely): Ditto.
	* gcc/doc/invoke.texi: Document new peeling parameters.
	* gcc/testsuite/gcc.dg/vect/O3-vect-pr34223.c: Add new peeling
	parameters.
	* gcc/testsuite/gcc.dg/vect/vect.exp: Allow reading flags in
	individual tests.

---
This patch is available for review at http://codereview.appspot.com/4438079

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 172904)
+++ gcc/doc/invoke.texi	(working copy)
@@ -8523,11 +8523,28 @@
 The maximum number of peelings of a single loop.
 
 @item max-completely-peeled-insns
+@item max-completely-peeled-insns-feedback
 The maximum number of insns of a completely peeled loop.
+The @option{max-completely-peeled-insns-feedback} is used only when profile
+feedback is available and the loop is hot. Because of the real profiles, this
+value may be set larger for hot loops.
 
+
+@item max-once-peeled-insns
+@item max-once-peeled-insns-feedback
+The maximum number of insns of a peeled loop that rolls only once.
+The @option{max-once-peeled-insns-feedback} is used only when profile feedback
+is available and the loop is hot. Because of the real profiles, this value
+may be set larger for hot loops.
+
 @item max-completely-peel-times
+@item max-completely-peel-times-feedback
 The maximum number of iterations of a loop to be suitable for complete peeling.
+The @option{max-completely-peel-times-feedback} is used only when profile feedback
+is available and the loop is hot. Because of the real profiles, this value may
+be set larger for hot loops.
 
+
 @item max-completely-peel-loop-nest-depth
 The maximum depth of a loop nest suitable for complete peeling.
 
Index: gcc/testsuite/gcc.dg/vect/O3-vect-pr34223.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/O3-vect-pr34223.c	(revision 172904)
+++ gcc/testsuite/gcc.dg/vect/O3-vect-pr34223.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-options "[vect_cflags] --param max-completely-peel-times=16" } */
 
 #include "tree-vect.h"
 
Index: gcc/testsuite/gcc.dg/vect/vect.exp
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect.exp	(revision 172904)
+++ gcc/testsuite/gcc.dg/vect/vect.exp	(working copy)
@@ -24,6 +24,12 @@
 global DEFAULT_VECTCFLAGS
 set DEFAULT_VECTCFLAGS ""
 
+# So that we can read flags in individual tests.
+proc vect_cflags { } {
+  global DEFAULT_VECTCFLAGS
+  return $DEFAULT_VECTCFLAGS
+}
+
 # If the target system supports vector instructions, the default action
 # for a test is 'run', otherwise it's 'compile'.  Save current default.
 # Executing vector instructions on a system without hardware vector support
Index: gcc/tree-ssa-loop-ivcanon.c
===================================================================
--- gcc/tree-ssa-loop-ivcanon.c	(revision 172904)
+++ gcc/tree-ssa-loop-ivcanon.c	(working copy)
@@ -326,6 +326,7 @@
                             enum unroll_level ul)
 {
   unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll, unr_insns;
+  unsigned HOST_WIDE_INT max_peeled_insns;
   gimple cond;
   struct loop_size size;
 
@@ -336,9 +337,21 @@
     return false;
   n_unroll = tree_low_cst (niter, 1);
 
-  max_unroll = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
-  if (n_unroll > max_unroll)
+  if (profile_info && flag_branch_probabilities &&
+      optimize_loop_for_speed_p (loop))
+    max_unroll = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES_FEEDBACK);
+  else
+    max_unroll = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
+
+  if (n_unroll > max_unroll) {
+    if (dump_file && (dump_flags & TDF_DETAILS))
+      {
+        fprintf (dump_file, " Not unrolling loop %d limited by max unroll"
+                 " (%d > %d)\n",
+                 loop->num, (int) n_unroll, (int) max_unroll);
+      }
     return false;
+  }
 
   if (n_unroll)
     {
@@ -356,14 +369,20 @@
                    (int) unr_insns);
         }
 
-      if (unr_insns > ninsns
-          && (unr_insns
-              > (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS)))
+      if (profile_info && flag_branch_probabilities &&
+          optimize_loop_for_speed_p (loop))
+        max_peeled_insns =
+          PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS_FEEDBACK);
+      else
+        max_peeled_insns = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS);
+
+      if (unr_insns > max_peeled_insns)
         {
           if (dump_file && (dump_flags & TDF_DETAILS))
             fprintf (dump_file, "Not unrolling loop %d "
-                     "(--param max-completely-peeled-insns limit reached).\n",
-                     loop->num);
+                     "(--param max-completely-peeled-insns(-feedback) limit. "
+                     "(%u > %u)).\n",
+                     loop->num, (unsigned) unr_insns, (unsigned) max_peeled_insns);
           return false;
         }
 
@@ -371,7 +390,8 @@
           && unr_insns > ninsns)
         {
           if (dump_file && (dump_flags & TDF_DETAILS))
-            fprintf (dump_file, "Not unrolling loop %d.\n", loop->num);
+            fprintf (dump_file, "Not unrolling loop %d (NO_GROWTH %d > %d).\n",
+                     loop->num, (int) unr_insns, (int) ninsns);
           return false;
         }
     }
@@ -418,8 +438,9 @@
   update_stmt (cond);
   update_ssa (TODO_update_ssa);
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Unrolled loop %d completely.\n", loop->num);
+  if (dump_file)
+    fprintf (dump_file, "Unrolled loop %d completely by factor %d.\n",
+             loop->num, (int) n_unroll);
 
   return true;
 }
Index: gcc/loop-unroll.c
===================================================================
--- gcc/loop-unroll.c	(revision 172904)
+++ gcc/loop-unroll.c	(working copy)
@@ -324,15 +324,23 @@
 decide_peel_once_rolling (struct loop *loop, int flags ATTRIBUTE_UNUSED)
 {
   struct niter_desc *desc;
+  unsigned max_peeled_insns;
 
+  if (profile_info && flag_branch_probabilities)
+    max_peeled_insns =
+      (unsigned) PARAM_VALUE (PARAM_MAX_ONCE_PEELED_INSNS_FEEDBACK);
+  else
+    max_peeled_insns = (unsigned) PARAM_VALUE (PARAM_MAX_ONCE_PEELED_INSNS);
+
   if (dump_file)
     fprintf (dump_file, "\n;; Considering peeling once rolling loop\n");
 
   /* Is the loop small enough?  */
-  if ((unsigned) PARAM_VALUE (PARAM_MAX_ONCE_PEELED_INSNS) < loop->ninsns)
+  if (max_peeled_insns < loop->ninsns)
     {
       if (dump_file)
-        fprintf (dump_file, ";; Not considering loop, is too big\n");
+        fprintf (dump_file, ";; Not considering loop, is too big (%d > %u)\n",
+                 loop->ninsns, max_peeled_insns);
       return;
     }
 
@@ -362,7 +370,7 @@
 static void
 decide_peel_completely (struct loop *loop, int flags ATTRIBUTE_UNUSED)
 {
-  unsigned npeel;
+  unsigned npeel, max_insns, max_peel;
   struct niter_desc *desc;
 
   if (dump_file)
@@ -393,16 +401,30 @@
       return;
     }
 
+  if (profile_info && flag_branch_probabilities)
+    {
+      max_insns =
+        (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS_FEEDBACK);
+      max_peel =
+        (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES_FEEDBACK);
+    }
+  else
+    {
+      max_insns = (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS);
+      max_peel = (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
+    }
+
   /* npeel = number of iterations to peel.  */
-  npeel = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS) / loop->ninsns;
-  if (npeel > (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES))
-    npeel = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
+  npeel = max_insns / loop->ninsns;
+  if (npeel > max_peel)
+    npeel = max_peel;
 
   /* Is the loop small enough?  */
   if (!npeel)
     {
       if (dump_file)
-        fprintf (dump_file, ";; Not considering loop, is too big\n");
+        fprintf (dump_file, ";; Not considering loop, is too big, npeel=%u.\n",
+                 npeel);
       return;
     }
 
@@ -435,7 +457,7 @@
 
   /* Success.  */
   if (dump_file)
-    fprintf (dump_file, ";; Decided to peel loop completely\n");
+    fprintf (dump_file, ";; Decided to peel loop completely npeel %u\n", npeel);
   loop->lpt_decision.decision = LPT_PEEL_COMPLETELY;
 }
 
Index: gcc/params.def
===================================================================
--- gcc/params.def	(revision 172904)
+++ gcc/params.def	(working copy)
@@ -299,16 +299,37 @@
          "max-completely-peeled-insns",
          "The maximum number of insns of a completely peeled loop",
          400, 0, 0)
+/* The maximum number of insns of a peeled loop, when feedback
+   information is available.  */
+DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS_FEEDBACK,
+         "max-completely-peeled-insns-feedback",
+         "The maximum number of insns of a completely peeled loop when profile "
+         "feedback is available",
+         600, 0, 0)
 /* The maximum number of peelings of a single loop that is peeled completely.  */
 DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES,
         "max-completely-peel-times",
         "The maximum number of peelings of a single loop that is peeled completely",
-        16, 0, 0)
+        8, 0, 0)
+/* The maximum number of peelings of a single loop that is peeled
+   completely, when feedback information is available.  */
+DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES_FEEDBACK,
+        "max-completely-peel-times-feedback",
+        "The maximum number of peelings of a single loop that is peeled "
+        "completely, when profile feedback is available",
+        16, 0, 0)
 /* The maximum number of insns of a peeled loop that rolls only once.  */
 DEFPARAM(PARAM_MAX_ONCE_PEELED_INSNS,
         "max-once-peeled-insns",
         "The maximum number of insns of a peeled loop that rolls only once",
         400, 0, 0)
+/* The maximum number of insns of a peeled loop that rolls only once,
+   when feedback information is available.  */
+DEFPARAM(PARAM_MAX_ONCE_PEELED_INSNS_FEEDBACK,
+        "max-once-peeled-insns-feedback",
+        "The maximum number of insns of a peeled loop that rolls only once, "
+        "when profile feedback is available",
+        600, 0, 0)
 /* The maximum depth of a loop nest we completely peel.  */
 DEFPARAM(PARAM_MAX_UNROLL_ITERATIONS,
          "max-completely-peel-loop-nest-depth",