From patchwork Wed Nov 28 21:28:21 2018
X-Patchwork-Submitter: Pat Haugen
X-Patchwork-Id: 1004845
From: Pat Haugen
Subject: [PATCH] Fix PR68212, unrolled loop no longer aligned
To: GCC Patches
Cc: Jan Hubicka, Bill Schmidt, David Edelsohn
Date: Wed, 28 Nov 2018 15:28:21 -0600

The following patch fixes the case where unrolling in the absence of
profile information can cause a loop to no longer look hot and therefore
not get aligned. In this case, instead of dividing by the unroll factor
we now just scale by profile_probability::likely ().

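To make the difference between the two scale factors concrete, here is a
rough standalone model of the computation. It is only a sketch: it uses
plain doubles instead of GCC's profile_count and profile_probability, the
function names are made up for illustration, and LIKELY_GUESS is an
assumed stand-in (about 0.8) for whatever profile_probability::likely ()
evaluates to.

```c
#include <assert.h>

/* Assumption: stand-in for profile_probability::likely (), taken here
   to be roughly 80%.  Not read from GCC itself.  */
#define LIKELY_GUESS 0.8

/* Hypothetical model of the existing (profile-based) path: accumulate
   the count of every copy into scale_main_den, then return
   count_in / scale_main_den as the scale for the main loop.  */
static double
scale_main_profiled (double count_in, double p,
                     const double *scale_step, int ndupl)
{
  assert (ndupl >= 0 && count_in > 0.0);

  double scale_main_den = count_in;
  for (int i = 0; i < ndupl; i++)
    {
      scale_main_den += count_in * p;  /* models apply_probability (p) */
      p = p * scale_step[i];
    }
  return count_in / scale_main_den;    /* models probability_in () */
}

/* Hypothetical model of the new guessed-profile path: the scale no
   longer depends on the number of copies.  */
static double
scale_main_guessed (void)
{
  return LIKELY_GUESS;
}
```

For example, with count_in = 1000, p = 1 and three copies whose step is 1,
the profiled path yields 1000 / 4000 = 0.25, so the loop body count drops
to a quarter. The guessed path stays at the assumed 0.8 no matter how many
copies are made, which is what keeps the unrolled loop looking hot
relative to the surrounding code.
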
The diff looks worse than what really changed: it just adds an if test
and moves the existing 'scale_main' setting code into the else leg.

Bootstrapped/regtested on powerpc64le with no new regressions. I also ran
a CPU2006 comparison; 4 benchmarks improved 2-3%, with the others in the
noise range. Ok for trunk?

-Pat


2018-11-28  Pat Haugen

	PR rtl-optimization/68212
	* cfgloopmanip.c (duplicate_loop_to_header_edge): Adjust scale factor.

testsuite/ChangeLog:
2018-11-28  Pat Haugen

	PR rtl-optimization/68212
	* gcc.dg/pr68212.c: New test.


Index: gcc/cfgloopmanip.c
===================================================================
--- gcc/cfgloopmanip.c	(revision 266522)
+++ gcc/cfgloopmanip.c	(working copy)
@@ -1242,16 +1242,25 @@ duplicate_loop_to_header_edge (struct lo
 	  profile_probability prob_pass_main = bitmap_bit_p (wont_exit, 0)
 							? prob_pass_wont_exit
 							: prob_pass_thru;
-	  profile_probability p = prob_pass_main;
-	  profile_count scale_main_den = count_in;
-	  for (i = 0; i < ndupl; i++)
+
+	  /* If not using real profile data then don't scale the loop by ndupl.
+	     This can lead to the loop no longer looking hot wrt surrounding
+	     code.  */
+	  if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
+	    scale_main = profile_probability::likely ();
+	  else
 	    {
-	      scale_main_den += count_in.apply_probability (p);
-	      p = p * scale_step[i];
+	      profile_probability p = prob_pass_main;
+	      profile_count scale_main_den = count_in;
+	      for (i = 0; i < ndupl; i++)
+		{
+		  scale_main_den += count_in.apply_probability (p);
+		  p = p * scale_step[i];
+		}
+	      /* If original loop is executed COUNT_IN times, the unrolled
+		 loop will account SCALE_MAIN_DEN times.  */
+	      scale_main = count_in.probability_in (scale_main_den);
 	    }
-	  /* If original loop is executed COUNT_IN times, the unrolled
-	     loop will account SCALE_MAIN_DEN times.  */
-	  scale_main = count_in.probability_in (scale_main_den);
 	  scale_act = scale_main * prob_pass_main;
 	}
       else
Index: gcc/testsuite/gcc.dg/pr68212.c
===================================================================
--- gcc/testsuite/gcc.dg/pr68212.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/pr68212.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vectorize -funroll-loops --param max-unroll-times=4 -fdump-rtl-alignments" } */
+
+void foo(long int *a, long int *b, long int n)
+{
+  long int i;
+
+  for (i = 0; i < n; i++)
+    a[i] = *b;
+}
+
+/* { dg-final { scan-rtl-dump-times "internal loop alignment added" 1 "alignments"} } */