From patchwork Sun Jan 6 23:52:09 2019
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 1021121
Date: Mon, 7 Jan 2019 00:52:09 +0100
From: Jan Hubicka
To: gcc-patches@gcc.gnu.org
Subject: Remove overall growth from badness metrics
Message-ID: <20190106235208.tqx4jqylljlbk3gk@kam.mff.cuni.cz>

Hi,
this patch removes overall growth from the badness metrics. This code was
added at a time when the inliner was purely function-size based, to give a
hint that inlining more different functions into all their callers is better
than inlining one function that is called very many times. With profile we
now have more fine-grained information, and in all tests I did this heuristic
seems to be counter-productive now, and harmful especially on large units
where the growth estimate gets out of date.

I plan to commit the patch tomorrow after re-testing everything after the
bugfixes from today and yesterday.

In addition to this I have found that the current inline-unit-growth is too
small for LTO of large programs (especially Firefox :) and there are important
improvements when it is increased from 20 to 30 or 40. I am re-running C++
benchmarks and other tests to decide on the precise setting.
Finally I plan to increase the new parameters for a bit more inlining at -O2
and -Os.

Bootstrapped/regtested x86_64-linux, will commit it tomorrow.

	* ipa-inline.c (edge_badness): Do not account overall_growth into
	badness metrics.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 267612)
+++ ipa-inline.c	(working copy)
@@ -1082,8 +1082,8 @@ edge_badness (struct cgraph_edge *edge,
   /* When profile is available. Compute badness as:
 
                  time_saved * caller_count
-     goodness =  -------------------------------------------------
-                 growth_of_caller * overall_growth * combined_size
+     goodness =  --------------------------------
+                 growth_of_caller * combined_size
 
      badness = - goodness
 
@@ -1094,7 +1094,6 @@ edge_badness (struct cgraph_edge *edge,
            || caller->count.ipa ().nonzero_p ())
     {
       sreal numerator, denominator;
-      int overall_growth;
       sreal inlined_time = compute_inlined_call_time (edge, edge_time);
 
       numerator = (compute_uninlined_call_time (edge, unspec_edge_time)
@@ -1106,73 +1105,6 @@ edge_badness (struct cgraph_edge *edge,
       else if (caller->count.ipa ().initialized_p ())
         numerator = numerator >> 11;
       denominator = growth;
-
-      overall_growth = callee_info->growth;
-
-      /* Look for inliner wrappers of the form:
-
-         inline_caller ()
-           {
-             do_fast_job...
-             if (need_more_work)
-               noninline_callee ();
-           }
-         Withhout panilizing this case, we usually inline noninline_callee
-         into the inline_caller because overall_growth is small preventing
-         further inlining of inline_caller.
-
-         Penalize only callgraph edges to functions with small overall
-         growth ...
-        */
-      if (growth > overall_growth
-          /* ... and having only one caller which is not inlined ...  */
-          && callee_info->single_caller
-          && !edge->caller->global.inlined_to
-          /* ... and edges executed only conditionally ...  */
-          && edge->sreal_frequency () < 1
-          /* ... consider case where callee is not inline but caller is ...  */
-          && ((!DECL_DECLARED_INLINE_P (edge->callee->decl)
-               && DECL_DECLARED_INLINE_P (caller->decl))
-              /* ... or when early optimizers decided to split and edge
-                 frequency still indicates splitting is a win ... */
-              || (callee->split_part && !caller->split_part
-                  && edge->sreal_frequency () * 100
-                     < PARAM_VALUE
-                          (PARAM_PARTIAL_INLINING_ENTRY_PROBABILITY)
-                  /* ... and do not overwrite user specified hints.   */
-                  && (!DECL_DECLARED_INLINE_P (edge->callee->decl)
-                      || DECL_DECLARED_INLINE_P (caller->decl)))))
-        {
-          ipa_fn_summary *caller_info = ipa_fn_summaries->get (caller);
-          int caller_growth = caller_info->growth;
-
-          /* Only apply the penalty when caller looks like inline candidate,
-             and it is not called once and.  */
-          if (!caller_info->single_caller && overall_growth < caller_growth
-              && caller_info->inlinable
-              && caller_info->size
-                 < (DECL_DECLARED_INLINE_P (caller->decl)
-                    ? MAX_INLINE_INSNS_SINGLE : MAX_INLINE_INSNS_AUTO))
-            {
-              if (dump)
-                fprintf (dump_file,
-                         "     Wrapper penalty. Increasing growth %i to %i\n",
-                         overall_growth, caller_growth);
-              overall_growth = caller_growth;
-            }
-        }
-      if (overall_growth > 0)
-        {
-          /* Strongly preffer functions with few callers that can be inlined
-             fully.  The square root here leads to smaller binaries at average.
-             Watch however for extreme cases and return to linear function
-             when growth is large.  */
-          if (overall_growth < 256)
-            overall_growth *= overall_growth;
-          else
-            overall_growth += 256 * 256 - 256;
-          denominator *= overall_growth;
-        }
       denominator *= ipa_fn_summaries->get (caller)->self_size + growth;
 
       badness = - numerator / denominator;
@@ -1182,18 +1114,14 @@ edge_badness (struct cgraph_edge *edge,
           fprintf (dump_file,
                    "      %f: guessed profile. frequency %f, count %" PRId64
                    " caller count %" PRId64
-                   " time w/o inlining %f, time with inlining %f"
-                   " overall growth %i (current) %i (original)"
-                   " %i (compensated)\n",
+                   " time w/o inlining %f, time with inlining %f\n",
                    badness.to_double (),
                    edge->sreal_frequency ().to_double (),
                    edge->count.ipa ().initialized_p () ? edge->count.ipa ().to_gcov_type () : -1,
                    caller->count.ipa ().initialized_p () ? caller->count.ipa ().to_gcov_type () : -1,
                    compute_uninlined_call_time (edge,
                                                 unspec_edge_time).to_double (),
-                   inlined_time.to_double (),
-                   estimate_growth (callee),
-                   callee_info->growth, overall_growth);
+                   inlined_time.to_double ());
         }
     }
   /* When function local profile is not available or it does not give