From patchwork Mon Nov 25 12:44:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Graham Markall X-Patchwork-Id: 1200382 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-514524-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=embecosm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="huyDq5QB"; dkim=pass (2048-bit key; unprotected) header.d=embecosm.com header.i=@embecosm.com header.b="RgXsykoa"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47M6Dn3Cltz9s7T for ; Mon, 25 Nov 2019 23:44:39 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; q=dns; s=default; b=R5A JEDZoyYg46GkessDcgpYqAkx8nv8EDc3NgIIAs6Lz6RS5G7LxwZii1N8L7to8IWj ZQpAUiPNQd6WdKt/XFih1jKlE3vQnUUhG5JxEWKPDck2LC4csR5+m5+ino/OrY7E zLujdzsSF+rJ34ZdUYP4TrCm3r1HA0fNYQDXQoB0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=default; bh=GBJpDWOMA FsR7lADld/Wu40WDh4=; b=huyDq5QB/SdlocasUgao9OGvwRF2SxZ4VdkSshlaA HjtvzDEXvfdvrb3TzXR0RBWn9CdtXIlttrZilICDYQ73qhTCwAmDO6rczUSwGXhw NWtNQtwymljbVEOi+LEBvk8xnN2lZua6t1TcLGl0odcmcNzlR5fG38MEo6iRW0ES OM= Received: (qmail 98658 invoked by alias); 25 Nov 2019 12:44:32 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 98650 invoked by uid 89); 25 Nov 2019 12:44:32 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-14.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.1 spammy=resulted, H*r:in-addr.arpa, gained, proprietary X-HELO: mail-wr1-f66.google.com Received: from mail-wr1-f66.google.com (HELO mail-wr1-f66.google.com) (209.85.221.66) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 25 Nov 2019 12:44:30 +0000 Received: by mail-wr1-f66.google.com with SMTP id n1so17862564wra.10 for ; Mon, 25 Nov 2019 04:44:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=anWDSEwg2mKRzAms5H/LspaLWWySrYKislJzbaJX48k=; b=RgXsykoagyRW4Vrq5VZVZpfqV3JTA6pLBcS6+ZetZ0KT4jqeZLz437kNY1U38P6xM9 F0f+yDhuMQE5PXoLSkEZ/70xbpSsg8yiV3ZC4N5MGxresG4AWcEVAYDCuh/mcrdO/VEp O7JU2fHETGS3xRGocTuBwVRFUdX/acew4WyCrMMRWY/PteOHA4+dCiD6FyLoC3h8rUZt E+u/r1xEyMe3CVYotqslZ/Qg/xaADtXQG3bridhK1O19eQEzP17FrAnO42ZFhG0V79o1 UWUrqz4yTh6h8wj80x1OvFF40KsFFdfiFv76JLxULK84BbPovIFgl3SKm+CTjwePweLL kM5Q== Received: from pepper.vegetable-garden.lan (14.27.187.81.in-addr.arpa. [81.187.27.14]) by smtp.gmail.com with ESMTPSA id u18sm10489869wrp.14.2019.11.25.04.44.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Nov 2019 04:44:27 -0800 (PST) From: Graham Markall To: gcc-patches@gcc.gnu.org Cc: ofer.shinaar@wdc.com, nidal.faour@wdc.com, craig.blackmore@embecosm.com, richard.guenther@gmail.com, law@redhat.com, Graham Markall Subject: [PATCH v2] Add inline growth bias param Date: Mon, 25 Nov 2019 12:44:17 +0000 Message-Id: <20191125124417.706-1-graham.markall@embecosm.com> In-Reply-To: References: MIME-Version: 1.0 X-IsSubscribed: yes Hi Richard, Many thanks for the suggestion of an alternative implementation. I tried implementing the suggestion, and I had a couple of observations: 1. As well as applying the bias in compute_fn_summary, it seemed to also be necessary to apply it in ip_update_overall_fn_summary to avoid an ICE resulting from size_info->size and size_info->self_size differing at the end of compute_fn_summary. 2. The results that it produced were exactly the same as those using the param uninlined-function-insns, so it would probably be redundant to add the additional parameter implemented this way. The implementation I tried is at: https://github.com/gmarkall/gcc/commit/1d22f65b8a392cffc235ecb49143a93d1720b91c When I benchmarked the code size again using Embench with both the inline-growth-bias param implemented this way, and the uninlined-function-insns param the results were, in both cases: Benchmark 0 1 2 3 4 --------- ------ ------ ------ ------ ------ aha-mont64 99.05 99.05 99.05 99.05 99.05 crc32 100.00 100.00 100.00 100.00 100.00 cubic 94.66 94.66 94.66 94.66 94.66 edn 96.14 96.14 96.14 96.14 96.14 huffbench 99.88 99.88 99.88 99.88 99.88 matmult-int 100.00 100.00 100.00 100.00 100.00 minver 100.56 100.56 100.56 100.56 100.56 nbody 101.13 101.13 101.13 101.13 101.13 nettle-aes 99.03 99.03 99.03 99.03 99.03 nettle-sha256 99.89 99.89 99.89 99.89 99.89 nsichneu 100.00 100.00 100.00 100.00 100.00 picojpeg 99.35 99.35 100.10 100.10 100.10 qrduino 101.81 101.81 101.81 101.81 101.81 sglib-combined 100.00 100.00 100.00 100.00 100.00 slre 99.09 99.42 99.42 100.49 100.49 st 96.36 96.36 96.36 96.36 96.36 statemate 100.22 100.22 100.22 100.22 100.22 ud 100.00 100.00 100.00 100.00 100.00 wikisort 99.48 99.48 99.48 99.48 99.48 Mean 99.28 99.30 99.34 99.40 99.40 These results show that: 1. Setting the uninlined-function-insns value to 0 rather than 2 generally seems to yield an improvement in code size on RISC-V, without making any individual benchmark larger than the current default of 2. 2. There are two cases where the patch in the original version of the patch (v1) makes a slightly greater improvement than using uninlined-function-insns or the patch linked above (ufi). These are: Benchmark ufi v1 --------- --- -- qrduino 101.81 101.61 sglib-combined 100.00 95.87 (note that the v1 values are slightly different to those posted in the original patch - I rebased the original patch on a more recent version of the master branch and the values changed slightly.) Additionally, for a proprietary application that we tested the flags with, the code sizes were for using neither flag (Original) for using uninlined-function-insns, and using the original patch, with the parameter values that resulted in the smallest code size, were: Original ufi (val=0) v1 (val=-2) 57380 57184 56756 (100.0%) (99.66%) (98.91%) Patch v2 -------- In conclusion, it seems that in some cases there is some additional code size saving that can be gained in specific cases using the original implementation of the patch. I have tidied up the original version of the patch and adjusted the option so that it is a param instead of a flag as suggested by both Jeff and yourself. I chose a default param value of 5 to allow for a reasonable scope for setting negative values - although I haven't seen any result where setting the value beneath an effective setting of -2 yielded an improvement, this does allow a little more room in case there are cases where going below -2 would help. Is there sufficient data to indicate that the additional parameter as implemented in the attached patch is worth adding? Many thanks, Graham. --- gcc/ipa-inline.h | 21 ++++++++++++++++++++- gcc/params.def | 5 +++++ 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/gcc/ipa-inline.h b/gcc/ipa-inline.h index 18c8e1eebd0..fb10677ea8f 100644 --- a/gcc/ipa-inline.h +++ b/gcc/ipa-inline.h @@ -21,6 +21,8 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_IPA_INLINE_H #define GCC_IPA_INLINE_H +#include "params.h" + /* Data we cache about callgraph edges during inlining to avoid expensive re-computations during the greedy algorithm. */ class edge_growth_cache_entry @@ -84,7 +86,24 @@ estimate_edge_growth (struct cgraph_edge *edge) { ipa_call_summary *s = ipa_call_summaries->get (edge); gcc_checking_assert (s->call_stmt_size || !edge->callee->analyzed); - return (estimate_edge_size (edge) - s->call_stmt_size); + + int growth = (estimate_edge_size (edge) - s->call_stmt_size); + + /* Bias function growth according to the bias parameter. The default is + parameter value is 5 to allow for slight negative biases, so we subtract 5 + to allow an effective default value of 0. */ + growth += PARAM_VALUE (PARAM_INLINE_GROWTH_BIAS) - 5; + + /* Function size cannot be negative, so if the growth is negative to the point + that it will reduce function size below 0, then we cap the growth such that + it makes the function size exactly zero. */ + struct cgraph_node *caller = edge->caller; + ipa_size_summary *fs = ipa_size_summaries->get (caller); + if (fs->size + growth < 0) { + growth = -fs->size; + } + + return growth; } /* Return estimated callee runtime increase after inlining diff --git a/gcc/params.def b/gcc/params.def index 7928f6f071e..4274d8f36e0 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -112,6 +112,11 @@ DEFPARAM (PARAM_INLINE_HEURISTICS_HINT_PERCENT_O2, "The scale (in percents) applied to inline-insns-single and auto limits when heuristics hints that inlining is very profitable.", 200, 100, 1000000) +DEFPARAM (PARAM_INLINE_GROWTH_BIAS, + "inline-growth-bias", + "Bias of inlining growth overhead (default 5 is equal to no bias).", + 5, 0, 1000000) + DEFPARAM (PARAM_MAX_INLINE_INSNS_SIZE, "max-inline-insns-size", "The maximum number of instructions when inlining for size.",