From patchwork Tue Oct 23 15:18:25 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 193504
Date: Tue, 23 Oct 2012 17:18:25 +0200
From: Jan Hubicka
To: gcc-patches@gcc.gnu.org
Subject: loop-unroll.c TLC 3/4 simple peeling heuristic fix
Message-ID: <20121023151825.GB5020@kam.mff.cuni.cz>

Hi,
the simple peeling heuristic assumes it makes no sense to peel loops with a
known iteration count (because they will be runtime unrolled instead). This
is not true, because the known iteration count is only an upper bound. This
patch fixes that.
To make a testcase possible I had to relax the overactive heuristic on the
number of branches in the loop. It looks like a thinko copied from simple
unrolling, where it makes somewhat more sense. Peeling the first iterations
when the loop is known to execute only a few times makes sense for branch
prediction quality.

Bootstrapped/regtested x86_64-linux, committed.

Honza

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 192717)
+++ ChangeLog	(working copy)
@@ -1,3 +1,11 @@
+2012-10-23  Jan Hubicka
+
+	* loop-unroll.c (decide_peel_simple): Simple peeling makes sense even
+	with simple loops; bound number of branches only when FDO is not
+	available.
+	(decide_unroll_stupid): Mention that num_loop_branches heuristics
+	is off.
+
 2012-10-23  Nick Clifton
 
 	PR target/54660
Index: loop-unroll.c
===================================================================
--- loop-unroll.c	(revision 192717)
+++ loop-unroll.c	(working copy)
@@ -1228,7 +1228,6 @@ static void
 decide_peel_simple (struct loop *loop, int flags)
 {
   unsigned npeel;
-  struct niter_desc *desc;
   double_int iterations;
 
   if (!(flags & UAP_PEEL))
@@ -1253,20 +1252,17 @@ decide_peel_simple (struct loop *loop, i
       return;
     }
 
-  /* Check for simple loops.  */
-  desc = get_simple_loop_desc (loop);
-
-  /* Check number of iterations.  */
-  if (desc->simple_p && !desc->assumptions && desc->const_iter)
-    {
-      if (dump_file)
-        fprintf (dump_file, ";; Loop iterates constant times\n");
-      return;
-    }
-
   /* Do not simply peel loops with branches inside -- it increases number
-     of mispredicts.  */
-  if (num_loop_branches (loop) > 1)
+     of mispredicts.
+     Exception is when we do have profile and we however have good chance
+     to peel proper number of iterations loop will iterate in practice.
+     TODO: this heuristic needs tunning; while for complette unrolling
+     the branch inside loop mostly eliminates any improvements, for
+     peeling it is not the case.  Also a function call inside loop is
+     also branch from branch prediction POV (and probably better reason
+     to not unroll/peel).  */
+  if (num_loop_branches (loop) > 1
+      && profile_status != PROFILE_READ)
     {
       if (dump_file)
         fprintf (dump_file, ";; Not peeling, contains branches\n");
@@ -1435,7 +1431,9 @@ decide_unroll_stupid (struct loop *loop,
     }
 
   /* Do not unroll loops with branches inside -- it increases number
-     of mispredicts.  */
+     of mispredicts.
+     TODO: this heuristic needs tunning; call inside the loop body
+     is also relatively good reason to not unroll.  */
   if (num_loop_branches (loop) > 1)
     {
       if (dump_file)
Index: testsuite/gcc.dg/tree-prof/peel-1.c
===================================================================
--- testsuite/gcc.dg/tree-prof/peel-1.c	(revision 0)
+++ testsuite/gcc.dg/tree-prof/peel-1.c	(revision 0)
@@ -0,0 +1,25 @@
+/* { dg-options "-O3 -fdump-rtl-loop2_unroll -fno-unroll-loops -fpeel-loops" } */
+void abort();
+
+int a[1000];
+int
+__attribute__ ((noinline))
+t()
+{
+  int i;
+  for (i=0;i<1000;i++)
+    if (!a[i])
+      return 1;
+  abort ();
+}
+main()
+{
+  int i;
+  for (i=0;i<1000;i++)
+    t();
+  return 0;
+}
+/* { dg-final-use { scan-rtl-dump "Considering simply peeling loop" "loop2_unroll" } } */
+/* In fact one peeling is enough; we however mispredict number of iterations of the loop
+   at least until loop_ch is schedule ahead of profiling pass.  */
+/* { dg-final-use { cleanup-rtl-dump "Decided to simply peel the loop 2 times" } } */
Index: testsuite/ChangeLog
===================================================================
--- testsuite/ChangeLog	(revision 192717)
+++ testsuite/ChangeLog	(working copy)
@@ -1,7 +1,11 @@
+2012-10-23  Jan Hubicka
+
+	* gcc.dg/tree-prof/peel-1.c: New testcase.
+
 2012-10-23  Dominique d'Humieres
 
 	PR gcc/52945
-	* testsuite/gcc.dg/lto/pr52634_0.c: skip the test on Darwin.
+	* gcc.dg/lto/pr52634_0.c: skip the test on Darwin.
 
 2012-10-23  Joseph Myers
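
As a reading aid only, here is a rough standalone model of the two early-exit checks in decide_peel_simple before and after this patch. It is a sketch and not GCC code: struct loop_model, reject_reason_old, reject_reason_new and peel_model_demo are hypothetical names invented for illustration, and each field merely stands in for the GCC expression named in its comment. The demo assumes a loop shaped like the one in peel-1.c (constant bound plus an early exit, so presumably two branches) compiled with profile feedback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the facts the heuristic consults.
   This is NOT GCC's struct loop.  */
struct loop_model
{
  /* Models desc->simple_p && !desc->assumptions && desc->const_iter.  */
  bool const_iter;
  /* Models num_loop_branches (loop).  */
  unsigned num_branches;
  /* Models profile_status == PROFILE_READ.  */
  bool profile_read;
};

/* Pre-patch checks: dump message for giving up, or NULL to keep going.  */
static const char *
reject_reason_old (const struct loop_model *l)
{
  if (l->const_iter)
    return "Loop iterates constant times";
  if (l->num_branches > 1)
    return "Not peeling, contains branches";
  return NULL;
}

/* Post-patch checks: the constant-iteration test is gone, and the
   branch limit applies only when no profile has been read.  */
static const char *
reject_reason_new (const struct loop_model *l)
{
  if (l->num_branches > 1 && !l->profile_read)
    return "Not peeling, contains branches";
  return NULL;
}

/* Assumed shape of the peel-1.c loop: constant bound, two branches,
   profile feedback available.  */
static void
peel_model_demo (void)
{
  struct loop_model l = { true, 2, true };

  assert (reject_reason_old (&l) != NULL); /* gave up before the patch */
  assert (reject_reason_new (&l) == NULL); /* considered after it */
}
```

In this model, the same loop without profile feedback (profile_read false) is still rejected after the patch because of its branches, which matches the ChangeLog wording that the branch bound now applies only when FDO is not available.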