From patchwork Sun Jul 14 12:02:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 1131723 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-505066-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=ucw.cz Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="qINmV0fT"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45mlfY1DWVz9sND for ; Sun, 14 Jul 2019 22:02:56 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=SBKrNxCGq59PFghpenuzFJEpK0fGQzn9bHnwHDZB5vSzqokN8NghH dvi2cMEGQzwN/iJ5cVT/A1I3z/pf7Oa4yaMzTOG4HZC4Y8BsfrWuBe7l43wop7SK eCO5R0HtTiuXvVxtgJuvYIk57Y1D97sAwATeXbH0iGSOxOhhduLFWo= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=vRwUfYYLWPUmzEIu3eTQwKRaRy4=; b=qINmV0fTjakBRM9X93eq ZXGogXT4ggT5fsaMnJDfcu4lx+0JYcFJ50cbWvIxdWLuO/nF4gt4cWuJHX2t85CV 4zETNFzRzq7yp9mzBlNWcLyTw9ms7OejcRp6ILd7ZuXi9I5+3dgg/uLMWhy/xb/g ksGzUmyBp+hXcc7cxjPk4jQ= Received: (qmail 106180 invoked by alias); 14 Jul 2019 12:02:47 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 106171 invoked by uid 89); 14 Jul 2019 12:02:47 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-12.1 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS autolearn=ham version=3.3.1 spammy=fbi, IPA, relation, experiment X-HELO: nikam.ms.mff.cuni.cz Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 14 Jul 2019 12:02:44 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 6184A28252B; Sun, 14 Jul 2019 14:02:41 +0200 (CEST) Date: Sun, 14 Jul 2019 14:02:41 +0200 From: Jan Hubicka To: gcc-patches@gcc.gnu.org Subject: Remove array_index inliner hint Message-ID: <20190714120241.ineijlikfs2e36c2@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Disposition: inline User-Agent: NeoMutt/20170113 (1.7.2) Hi, array_index hint marks functions contains an array reference that is indexed by value that will become constant after inlining. This hint is later used by ipa-inline in rather agressive way updating inline-insns-auto to inline-insns-single for such functions. While tuning -finline-functions for -O2 I have noticed that this leads to 30% code size growth of SPEC versions of GCC. This is because all predicates like register_operand contains such array references based on mode parameter and this makes GCC to conclude about agressively inlining them all which bloats insn-* a lot. This hint was quite early experiment with propagating additional stuff through inliner. I think it is better to actualy account the instructions which will be generated for array indexing rather then handling this specially. This patch makes us to accound 1 instruction for every non-constant array access. This is still probably making inliner overly optimistic about inlining benefits since these accesses will later be CSEed, but it kills bit of magic and makes things more robust. This will make inliner notice that function will simplify and give it higher priority for inlining possibly still bypassing the bounds if big speedup is achieved. This is however a lot more rare than before. Bootstrapped/regtested x86_64-linux, comitted. Honza * ipa-fnsummary.c (ipa_dump_hints): Do not dump array_index. (ipa_fn_summary::~ipa_fn_summary): Do not destroy array_index. (ipa_fn_summary_t::duplicate): Do not duplicate array_index. (array_index_predicate): Remove. (analyze_function_body): Account cost for variable ofsetted array indexing. (estimate_node_size_and_time): Do not compute array index hint. (ipa_merge_fn_summary_after_inlining): Do not merge array index hint. (inline_read_section): Do not read array index hint. (ipa_fn_summary_write): Do not write array index hint. * doc/invoke.texi (ipa-cp-array-index-hint-bonus): Remove. * ipa-cp.c (hint_time_bonus): Remove. * ipa-fnsummary.h (ipa_hints_vals): Remove array_index. (ipa_fnsummary): Remove array_index. * ipa-inline.c (want_inline_small_function_p): Do not use array_index. (edge_badness): Likewise. * params.def (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS): Remove. Index: doc/invoke.texi =================================================================== --- doc/invoke.texi (revision 273450) +++ doc/invoke.texi (working copy) @@ -11895,12 +11895,6 @@ of iterations of a loop known, it adds a @option{ipa-cp-loop-hint-bonus} to the profitability score of the candidate. -@item ipa-cp-array-index-hint-bonus -When IPA-CP determines that a cloning candidate would make the index of -an array access known, it adds a bonus of -@option{ipa-cp-array-index-hint-bonus} to the profitability -score of the candidate. - @item ipa-max-aa-steps During its analysis of function bodies, IPA-CP employs alias analysis in order to track values pointed to by function parameters. In order Index: ipa-cp.c =================================================================== --- ipa-cp.c (revision 273450) +++ ipa-cp.c (working copy) @@ -2607,8 +2607,6 @@ hint_time_bonus (ipa_hints hints) int result = 0; if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride)) result += PARAM_VALUE (PARAM_IPA_CP_LOOP_HINT_BONUS); - if (hints & INLINE_HINT_array_index) - result += PARAM_VALUE (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS); return result; } Index: ipa-fnsummary.c =================================================================== --- ipa-fnsummary.c (revision 273450) +++ ipa-fnsummary.c (working copy) @@ -134,11 +134,6 @@ ipa_dump_hints (FILE *f, ipa_hints hints hints &= ~INLINE_HINT_declared_inline; fprintf (f, " declared_inline"); } - if (hints & INLINE_HINT_array_index) - { - hints &= ~INLINE_HINT_array_index; - fprintf (f, " array_index"); - } if (hints & INLINE_HINT_known_hot) { hints &= ~INLINE_HINT_known_hot; @@ -549,8 +544,6 @@ ipa_fn_summary::~ipa_fn_summary () edge_predicate_pool.remove (loop_iterations); if (loop_stride) edge_predicate_pool.remove (loop_stride); - if (array_index) - edge_predicate_pool.remove (array_index); vec_free (conds); vec_free (size_time_table); } @@ -703,8 +696,6 @@ ipa_fn_summary_t::duplicate (cgraph_node possible_truths); remap_hint_predicate_after_duplication (&info->loop_stride, possible_truths); - remap_hint_predicate_after_duplication (&info->array_index, - possible_truths); /* If inliner or someone after inliner will ever start producing non-trivial clones, we will get trouble with lack of information @@ -727,12 +718,6 @@ ipa_fn_summary_t::duplicate (cgraph_node info->loop_stride = NULL; set_hint_predicate (&info->loop_stride, p); } - if (info->array_index) - { - predicate p = *info->array_index; - info->array_index = NULL; - set_hint_predicate (&info->array_index, p); - } } if (!dst->global.inlined_to) ipa_update_overall_fn_summary (dst); @@ -894,11 +879,6 @@ ipa_dump_fn_summary (FILE *f, struct cgr fprintf (f, " loop stride:"); s->loop_stride->dump (f, s->conds); } - if (s->array_index) - { - fprintf (f, " array index:"); - s->array_index->dump (f, s->conds); - } fprintf (f, " calls:\n"); dump_ipa_call_summary (f, 4, node, s); fprintf (f, "\n"); @@ -1824,27 +1804,6 @@ predicate_for_phi_result (class ipa_fn_s nonconstant_names[SSA_NAME_VERSION (gimple_phi_result (phi))] = *p; } -/* Return predicate specifying when array index in access OP becomes non-constant. */ - -static predicate -array_index_predicate (ipa_fn_summary *info, - vec< predicate> nonconstant_names, tree op) -{ - predicate p = false; - while (handled_component_p (op)) - { - if (TREE_CODE (op) == ARRAY_REF || TREE_CODE (op) == ARRAY_RANGE_REF) - { - if (TREE_CODE (TREE_OPERAND (op, 1)) == SSA_NAME) - p = p.or_with (info->conds, - nonconstant_names[SSA_NAME_VERSION - (TREE_OPERAND (op, 1))]); - } - op = TREE_OPERAND (op, 0); - } - return p; -} - /* For a typical usage of __builtin_expect (a nonconstant_names = vNULL; int nblocks, n; int *order; - predicate array_index = true; gimple *fix_builtin_expect_stmt; gcc_assert (my_function && my_function->cfg); @@ -2146,26 +2104,6 @@ analyze_function_body (struct cgraph_nod this_time); } - if (gimple_assign_load_p (stmt) && nonconstant_names.exists ()) - { - predicate this_array_index; - this_array_index = - array_index_predicate (info, nonconstant_names, - gimple_assign_rhs1 (stmt)); - if (this_array_index != false) - array_index &= this_array_index; - } - if (gimple_store_p (stmt) && nonconstant_names.exists ()) - { - predicate this_array_index; - this_array_index = - array_index_predicate (info, nonconstant_names, - gimple_get_lhs (stmt)); - if (this_array_index != false) - array_index &= this_array_index; - } - - if (is_gimple_call (stmt) && !gimple_call_internal_p (stmt)) { @@ -2273,14 +2211,40 @@ analyze_function_body (struct cgraph_nod if (dump_file) fprintf (dump_file, " fp_expression set\n"); } + } - gcc_assert (time >= 0); - gcc_assert (size >= 0); + /* Account cost of address calculations in the statements. */ + for (unsigned int i = 0; i < gimple_num_ops (stmt); i++) + { + for (tree op = gimple_op (stmt, i); + op && handled_component_p (op); + op = TREE_OPERAND (op, 0)) + if ((TREE_CODE (op) == ARRAY_REF + || TREE_CODE (op) == ARRAY_RANGE_REF) + && TREE_CODE (TREE_OPERAND (op, 1)) == SSA_NAME) + { + predicate p = bb_predicate; + if (fbi.info) + p = p & will_be_nonconstant_expr_predicate + (&fbi, info, TREE_OPERAND (op, 1), + nonconstant_names); + if (p != false) + { + time += freq; + size += 1; + if (dump_file) + fprintf (dump_file, + "\t\tAccounting address calculation.\n"); + info->account_size_time (ipa_fn_summary::size_scale, + freq, + bb_predicate, + p); + } + } } + } } - set_hint_predicate (&ipa_fn_summaries->get_create (node)->array_index, - array_index); free (order); if (nonconstant_names.exists () && !early) @@ -2783,9 +2747,6 @@ estimate_node_size_and_time (struct cgra if (info->loop_stride && !info->loop_stride->evaluate (possible_truths)) hints |= INLINE_HINT_loop_stride; - if (info->array_index - && !info->array_index->evaluate (possible_truths)) - hints |= INLINE_HINT_array_index; if (info->scc_no) hints |= INLINE_HINT_in_scc; if (DECL_DECLARED_INLINE_P (node->decl)) @@ -3106,9 +3067,6 @@ ipa_merge_fn_summary_after_inlining (str remap_hint_predicate (info, callee_info, &callee_info->loop_stride, operand_map, offset_map, clause, &toplev_predicate); - remap_hint_predicate (info, callee_info, - &callee_info->array_index, - operand_map, offset_map, clause, &toplev_predicate); ipa_call_summary *s = ipa_call_summaries->get (edge); inline_update_callee_summaries (edge->callee, s->loop_depth); @@ -3366,9 +3324,6 @@ inline_read_section (struct lto_file_dec p.stream_in (&ib); if (info) set_hint_predicate (&info->loop_stride, p); - p.stream_in (&ib); - if (info) - set_hint_predicate (&info->array_index, p); for (e = node->callees; e; e = e->next_callee) read_ipa_call_summary (&ib, e, info != NULL); for (e = node->indirect_calls; e; e = e->next_callee) @@ -3517,10 +3472,6 @@ ipa_fn_summary_write (void) info->loop_stride->stream_out (ob); else streamer_write_uhwi (ob, 0); - if (info->array_index) - info->array_index->stream_out (ob); - else - streamer_write_uhwi (ob, 0); for (edge = cnode->callees; edge; edge = edge->next_callee) write_ipa_call_summary (ob, edge); for (edge = cnode->indirect_calls; edge; edge = edge->next_callee) Index: ipa-fnsummary.h =================================================================== --- ipa-fnsummary.h (revision 273450) +++ ipa-fnsummary.h (working copy) @@ -48,11 +48,8 @@ enum ipa_hints_vals { if functions are in different modules, inlining may not be so important. Set by simple_edge_hints in ipa-inline-analysis.c. */ INLINE_HINT_cross_module = 64, - /* If array indexes of loads/stores become known there may be room for - further optimization. */ - INLINE_HINT_array_index = 128, /* We know that the callee is hot by profile. */ - INLINE_HINT_known_hot = 256 + INLINE_HINT_known_hot = 128 }; typedef int ipa_hints; @@ -97,7 +94,7 @@ public: fp_expressions (false), estimated_stack_size (false), stack_frame_offset (false), time (0), size (0), conds (NULL), size_time_table (NULL), loop_iterations (NULL), loop_stride (NULL), - array_index (NULL), growth (0), scc_no (0) + growth (0), scc_no (0) { } @@ -111,7 +108,7 @@ public: stack_frame_offset (s.stack_frame_offset), time (s.time), size (s.size), conds (s.conds), size_time_table (s.size_time_table), loop_iterations (s.loop_iterations), loop_stride (s.loop_stride), - array_index (s.array_index), growth (s.growth), scc_no (s.scc_no) + growth (s.growth), scc_no (s.scc_no) {} /* Default constructor. */ @@ -157,8 +154,6 @@ public: /* Predicate on when some loop in the function becomes to have known stride. */ predicate * GTY((skip)) loop_stride; - /* Predicate on when some array indexes become constants. */ - predicate * GTY((skip)) array_index; /* Estimated growth for inlining all copies of the function before start of small functions inlining. This value will get out of date as the callers are duplicated, but Index: ipa-inline.c =================================================================== --- ipa-inline.c (revision 273450) +++ ipa-inline.c (working copy) @@ -807,7 +807,6 @@ want_inline_small_function_p (struct cgr || (!(hints & (INLINE_HINT_indirect_call | INLINE_HINT_known_hot | INLINE_HINT_loop_iterations - | INLINE_HINT_array_index | INLINE_HINT_loop_stride)) && !(big_speedup = big_speedup_p (e))))) { @@ -833,7 +832,6 @@ want_inline_small_function_p (struct cgr && !(hints & INLINE_HINT_known_hot) && growth >= ((hints & (INLINE_HINT_indirect_call | INLINE_HINT_loop_iterations - | INLINE_HINT_array_index | INLINE_HINT_loop_stride)) ? MAX (MAX_INLINE_INSNS_AUTO, MAX_INLINE_INSNS_SINGLE) @@ -1227,7 +1225,6 @@ edge_badness (struct cgraph_edge *edge, badness = badness.shift (badness > 0 ? 4 : -4); if ((hints & (INLINE_HINT_indirect_call | INLINE_HINT_loop_iterations - | INLINE_HINT_array_index | INLINE_HINT_loop_stride)) || callee_info->growth <= 0) badness = badness.shift (badness > 0 ? -2 : 2); Index: params.def =================================================================== --- params.def (revision 273450) +++ params.def (working copy) @@ -1109,12 +1109,6 @@ DEFPARAM (PARAM_IPA_CP_LOOP_HINT_BONUS, "bounds or strides known.", 64, 0, 0) -DEFPARAM (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS, - "ipa-cp-array-index-hint-bonus", - "Compile-time bonus IPA-CP assigns to candidates which make an array " - "index known.", - 48, 0, 0) - DEFPARAM (PARAM_IPA_MAX_AA_STEPS, "ipa-max-aa-steps", "Maximum number of statements that will be visited by IPA formal "