From patchwork Tue Feb 11 06:50:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 1236189 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-519311-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha1 header.s=default header.b=mkttYatH; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48GtjG6Pndz9sRQ for ; Tue, 11 Feb 2020 17:51:28 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:cc:references:from:date:mime-version:in-reply-to :content-type:message-id; q=dns; s=default; b=wXm6uLAY/G9LjE6omD oiJ0+5nuckG4HQG0EZml7FhGbGBV/11rh3MXzjDwUB1nxSfGrKX0K4W3NmFyGp2E FUoMwP+kWfq1uPAMzQBdVyuRJhRy/mlXuSwVThkeYs9t2cIhUbRQBrOI5mDRyRoC P6p9Y3ENXVRJPhIVHOdLCTcpU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:cc:references:from:date:mime-version:in-reply-to :content-type:message-id; s=default; bh=2ItD1Son0kvYcY2e67NVEBBJ iPY=; b=mkttYatH6RmSSygJqjjAv0bcXt1gjA4yXM4D1dgRRJFmOGa4eOBKVhlX w3EbCdTcPovA2zj9OPUu5zmJ0alBorNCDkJHim1RVmwtwMtLz938F3JzcnrjywAO s/CkYz3IP+zYnSLntKontvf8X2FGkNDNye6QzpOR/KWGRCrnfSY= Received: (qmail 74653 invoked by alias); 11 Feb 2020 06:51:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 74639 invoked by uid 89); 11 Feb 2020 06:51:20 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-20.3 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, MIME_CHARSET_FARAWAY, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.1 spammy=7 X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0b-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.158.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 11 Feb 2020 06:51:17 +0000 Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01B6pDZQ045719 for ; Tue, 11 Feb 2020 01:51:16 -0500 Received: from e06smtp07.uk.ibm.com (e06smtp07.uk.ibm.com [195.75.94.103]) by mx0a-001b2d01.pphosted.com with ESMTP id 2y1tp24new-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 11 Feb 2020 01:51:15 -0500 Received: from localhost by e06smtp07.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 11 Feb 2020 06:50:17 -0000 Received: from b06cxnps4076.portsmouth.uk.ibm.com (9.149.109.198) by e06smtp07.uk.ibm.com (192.168.101.137) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 11 Feb 2020 06:50:15 -0000 Received: from b06wcsmtp001.portsmouth.uk.ibm.com (b06wcsmtp001.portsmouth.uk.ibm.com [9.149.105.160]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 01B6oEOM43647044 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 11 Feb 2020 06:50:14 GMT Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id EF22EA4065; Tue, 11 Feb 2020 06:50:13 +0000 (GMT) Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 17E93A4060; Tue, 11 Feb 2020 06:50:08 +0000 (GMT) Received: from KewenLins-MacBook-Pro.local (unknown [9.200.55.248]) by b06wcsmtp001.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 11 Feb 2020 06:50:07 +0000 (GMT) Subject: [PATCH 1/4 v3 GCC11] Add middle-end unroll factor estimation To: Segher Boessenkool Cc: GCC Patches , Bill Schmidt , "bin.cheng" , Richard Guenther References: <131a3294-1951-3678-453b-085744366af6@linux.ibm.com> <20200120130249.GW3191@gate.crashing.org> <2aefe945-ac95-a472-a6b2-3b2ae3e63f4d@linux.ibm.com> <20200210233446.GM22482@gate.crashing.org> From: "Kewen.Lin" Date: Tue, 11 Feb 2020 14:50:03 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <20200210233446.GM22482@gate.crashing.org> x-cbid: 20021106-0028-0000-0000-000003D969B1 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 20021106-0029-0000-0000-0000249DD7AC Message-Id: <7c2d065c-eb40-18c7-704a-266836443f6c@linux.ibm.com> X-IsSubscribed: yes Hi, v3 changes: - Updated _uf to _unroll for some function names. By the way, should I guard the current i386/s390 loop_unroll_adjust ealy return with (current_pass->type != RTL_PASS)? I'm inclined not to, since this analysis isn't enabled by default, if those targets want to adopt this analysis, they will get ICE immediately then users can notice it and make up the gimple part check. A guard can make ICE disappear but the hook works poorly silently, users probably miss to update it. BR, Kewen ----- gcc/ChangeLog 2020-02-11 Kewen Lin * cfgloop.h (struct loop): New field estimated_unroll. * tree-ssa-loop-manip.c (decide_unroll_const_iter): New function. (decide_unroll_runtime_iter): Likewise. (decide_unroll_stupid): Likewise. (estimate_unroll_factor): Likewise. * tree-ssa-loop-manip.h (estimate_unroll_factor): New declaration. * tree-ssa-loop.c (tree_average_num_loop_insns): New function. * tree-ssa-loop.h (tree_average_num_loop_insns): New declaration. on 2020/2/11 上午7:34, Segher Boessenkool wrote: > Hi! > > On Mon, Feb 10, 2020 at 02:20:17PM +0800, Kewen.Lin wrote: >> * tree-ssa-loop-manip.c (decide_uf_const_iter): New function. >> (decide_uf_runtime_iter): Likewise. >> (decide_uf_stupid): Likewise. > > These names still use "uf". (Those are the last I see). > Good catch! >> * tree-ssa-loop-manip.h (estimate_unroll_factor): New declare. > > "New declaration." > Done. --- gcc/cfgloop.h | 3 + gcc/tree-ssa-loop-manip.c | 253 ++++++++++++++++++++++++++++++++++++++++++++++ gcc/tree-ssa-loop-manip.h | 3 +- gcc/tree-ssa-loop.c | 33 ++++++ gcc/tree-ssa-loop.h | 2 + 5 files changed, 292 insertions(+), 2 deletions(-) diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h index 11378ca..c5bcca7 100644 --- a/gcc/cfgloop.h +++ b/gcc/cfgloop.h @@ -232,6 +232,9 @@ public: Other values means unroll with the given unrolling factor. */ unsigned short unroll; + /* Like unroll field above, but it's estimated in middle-end. */ + unsigned short estimated_unroll; + /* If this loop was inlined the main clique of the callee which does not need remapping when copying the loop body. */ unsigned short owned_clique; diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c index 120b35b..8a5a1a9 100644 --- a/gcc/tree-ssa-loop-manip.c +++ b/gcc/tree-ssa-loop-manip.c @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see #include "system.h" #include "coretypes.h" #include "backend.h" +#include "target.h" #include "tree.h" #include "gimple.h" #include "cfghooks.h" @@ -42,6 +43,7 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "tree-scalar-evolution.h" #include "tree-inline.h" +#include "wide-int.h" /* All bitmaps for rewriting into loop-closed SSA go on this obstack, so that we can free them all at once. */ @@ -1592,3 +1594,254 @@ canonicalize_loop_ivs (class loop *loop, tree *nit, bool bump_in_latch) return var_before; } + +/* Try to determine estimated unroll factor for given LOOP with constant number + of iterations, mainly refer to decide_unroll_constant_iterations. + - NITER_DESC holds number of iteration description if it isn't NULL. + - NUNROLL holds a unroll factor value computed with instruction numbers. + - ITER holds estimated or likely max loop iterations. + Return true if it succeeds, also update estimated_unroll. */ + +static bool +decide_unroll_const_iter (class loop *loop, const tree_niter_desc *niter_desc, + unsigned nunroll, const widest_int *iter) +{ + /* Skip big loops. */ + if (nunroll <= 1) + return false; + + gcc_assert (niter_desc && niter_desc->assumptions); + + /* Check number of iterations is constant, return false if no. */ + if ((niter_desc->may_be_zero && !integer_zerop (niter_desc->may_be_zero)) + || !tree_fits_uhwi_p (niter_desc->niter)) + return false; + + unsigned HOST_WIDE_INT const_niter = tree_to_uhwi (niter_desc->niter); + + /* If unroll factor is set explicitly, use it as estimated_unroll. */ + if (loop->unroll > 0 && loop->unroll < USHRT_MAX) + { + /* It should have been peeled instead. */ + if (const_niter == 0 || (unsigned) loop->unroll > const_niter - 1) + loop->estimated_unroll = 1; + else + loop->estimated_unroll = loop->unroll; + return true; + } + + /* Check whether the loop rolls enough to consider. */ + if (const_niter < 2 * nunroll || wi::ltu_p (*iter, 2 * nunroll)) + return false; + + /* Success; now compute number of iterations to unroll. */ + unsigned best_unroll = 0, n_copies = 0; + unsigned best_copies = 2 * nunroll + 10; + unsigned i = 2 * nunroll + 2; + + if (i > const_niter - 2) + i = const_niter - 2; + + for (; i >= nunroll - 1; i--) + { + unsigned exit_mod = const_niter % (i + 1); + + if (!empty_block_p (loop->latch)) + n_copies = exit_mod + i + 1; + else if (exit_mod != i) + n_copies = exit_mod + i + 2; + else + n_copies = i + 1; + + if (n_copies < best_copies) + { + best_copies = n_copies; + best_unroll = i; + } + } + + loop->estimated_unroll = best_unroll + 1; + return true; +} + +/* Try to determine estimated unroll factor for given LOOP with countable but + non-constant number of iterations, mainly refer to + decide_unroll_runtime_iterations. + - NITER_DESC holds number of iteration description if it isn't NULL. + - NUNROLL_IN holds a unroll factor value computed with instruction numbers. + - ITER holds estimated or likely max loop iterations. + Return true if it succeeds, also update estimated_unroll. */ + +static bool +decide_unroll_runtime_iter (class loop *loop, const tree_niter_desc *niter_desc, + unsigned nunroll_in, const widest_int *iter) +{ + unsigned nunroll = nunroll_in; + if (loop->unroll > 0 && loop->unroll < USHRT_MAX) + nunroll = loop->unroll; + + /* Skip big loops. */ + if (nunroll <= 1) + return false; + + gcc_assert (niter_desc && niter_desc->assumptions); + + /* Skip constant number of iterations. */ + if ((!niter_desc->may_be_zero || !integer_zerop (niter_desc->may_be_zero)) + && tree_fits_uhwi_p (niter_desc->niter)) + return false; + + /* Check whether the loop rolls. */ + if (wi::ltu_p (*iter, 2 * nunroll)) + return false; + + /* Success; now force nunroll to be power of 2. */ + unsigned i; + for (i = 1; 2 * i <= nunroll; i *= 2) + continue; + + loop->estimated_unroll = i; + return true; +} + +/* Try to determine estimated unroll factor for given LOOP with uncountable + number of iterations, mainly refer to decide_unroll_stupid. + - NITER_DESC holds number of iteration description if it isn't NULL. + - NUNROLL_IN holds a unroll factor value computed with instruction numbers. + - ITER holds estimated or likely max loop iterations. + Return true if it succeeds, also update estimated_unroll. */ + +static bool +decide_unroll_stupid (class loop *loop, const tree_niter_desc *niter_desc, + unsigned nunroll_in, const widest_int *iter) +{ + if (!flag_unroll_all_loops && !loop->unroll) + return false; + + unsigned nunroll = nunroll_in; + if (loop->unroll > 0 && loop->unroll < USHRT_MAX) + nunroll = loop->unroll; + + /* Skip big loops. */ + if (nunroll <= 1) + return false; + + gcc_assert (!niter_desc || !niter_desc->assumptions); + + /* Skip loop with multiple branches for now. */ + if (num_loop_branches (loop) > 1) + return false; + + /* Check whether the loop rolls. */ + if (wi::ltu_p (*iter, 2 * nunroll)) + return false; + + /* Success; now force nunroll to be power of 2. */ + unsigned i; + for (i = 1; 2 * i <= nunroll; i *= 2) + continue; + + loop->estimated_unroll = i; + return true; +} + +/* Try to estimate whether this given LOOP can be unrolled or not, and compute + its estimated unroll factor if it can. To avoid duplicated computation, you + can pass number of iterations information by DESC. The heuristics mainly + refer to decide_unrolling in loop-unroll.c. */ + +void +estimate_unroll_factor (class loop *loop, tree_niter_desc *desc) +{ + /* Return the existing estimated unroll factor. */ + if (loop->estimated_unroll) + return; + + /* Don't unroll explicitly. */ + if (loop->unroll == 1) + { + loop->estimated_unroll = loop->unroll; + return; + } + + /* Like decide_unrolling, don't unroll if: + 1) the loop is cold. + 2) the loop can't be manipulated. + 3) the loop isn't innermost. */ + if (optimize_loop_for_size_p (loop) || !can_duplicate_loop_p (loop) + || loop->inner != NULL) + { + loop->estimated_unroll = 1; + return; + } + + /* Don't unroll without explicit information. */ + if (!loop->unroll && !flag_unroll_loops && !flag_unroll_all_loops) + { + loop->estimated_unroll = 1; + return; + } + + /* Check for instruction number and average instruction number. */ + loop->ninsns = tree_num_loop_insns (loop, &eni_size_weights); + loop->av_ninsns = tree_average_num_loop_insns (loop, &eni_size_weights); + unsigned nunroll = param_max_unrolled_insns / loop->ninsns; + unsigned nunroll_by_av = param_max_average_unrolled_insns / loop->av_ninsns; + + if (nunroll > nunroll_by_av) + nunroll = nunroll_by_av; + if (nunroll > (unsigned) param_max_unroll_times) + nunroll = param_max_unroll_times; + + if (targetm.loop_unroll_adjust) + nunroll = targetm.loop_unroll_adjust (nunroll, loop); + + tree_niter_desc *niter_desc = NULL; + bool desc_need_delete = false; + + /* Compute number of iterations if need. */ + if (!desc) + { + /* For now, use single_dom_exit for simplicity. TODO: Support multiple + exits like find_simple_exit if we finds some profitable cases. */ + niter_desc = XNEW (class tree_niter_desc); + gcc_assert (niter_desc); + edge exit = single_dom_exit (loop); + if (!exit || !number_of_iterations_exit (loop, exit, niter_desc, true)) + { + XDELETE (niter_desc); + niter_desc = NULL; + } + else + desc_need_delete = true; + } + else + niter_desc = desc; + + /* For checking the loop rolls enough to consider, also consult loop bounds + and profile. */ + widest_int iterations; + if (!get_estimated_loop_iterations (loop, &iterations) + && !get_likely_max_loop_iterations (loop, &iterations)) + iterations = 0; + + if (niter_desc && niter_desc->assumptions) + { + /* For countable loops. */ + if (!decide_unroll_const_iter (loop, niter_desc, nunroll, &iterations) + && !decide_unroll_runtime_iter (loop, niter_desc, nunroll, &iterations)) + loop->estimated_unroll = 1; + } + else + { + if (!decide_unroll_stupid (loop, niter_desc, nunroll, &iterations)) + loop->estimated_unroll = 1; + } + + if (desc_need_delete) + { + XDELETE (niter_desc); + niter_desc = NULL; + } +} + diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h index e789e4f..773a2b3 100644 --- a/gcc/tree-ssa-loop-manip.h +++ b/gcc/tree-ssa-loop-manip.h @@ -55,7 +55,6 @@ extern void tree_transform_and_unroll_loop (class loop *, unsigned, extern void tree_unroll_loop (class loop *, unsigned, edge, class tree_niter_desc *); extern tree canonicalize_loop_ivs (class loop *, tree *, bool); - - +extern void estimate_unroll_factor (class loop *, tree_niter_desc *); #endif /* GCC_TREE_SSA_LOOP_MANIP_H */ diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c index 5e8365d..25320fb 100644 --- a/gcc/tree-ssa-loop.c +++ b/gcc/tree-ssa-loop.c @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see #include "diagnostic-core.h" #include "stringpool.h" #include "attribs.h" +#include "sreal.h" /* A pass making sure loops are fixed up. */ @@ -790,5 +791,37 @@ tree_num_loop_insns (class loop *loop, eni_weights *weights) return size; } +/* Computes an estimated number of insns on average per iteration in LOOP, + weighted by WEIGHTS. Refer to function average_num_loop_insns. */ +unsigned +tree_average_num_loop_insns (class loop *loop, eni_weights *weights) +{ + basic_block *body = get_loop_body (loop); + gimple_stmt_iterator gsi; + unsigned bb_size, i; + sreal nsize = 0; + + for (i = 0; i < loop->num_nodes; i++) + { + bb_size = 0; + for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi); gsi_next (&gsi)) + bb_size += estimate_num_insns (gsi_stmt (gsi), weights); + nsize += (sreal) bb_size + * body[i]->count.to_sreal_scale (loop->header->count); + /* Avoid overflows. */ + if (nsize > 1000000) + { + free (body); + return 1000000; + } + } + free (body); + + unsigned ret = nsize.to_int (); + if (!ret) + ret = 1; /* To avoid division by zero. */ + + return ret; +} diff --git a/gcc/tree-ssa-loop.h b/gcc/tree-ssa-loop.h index 9e35125..af36177 100644 --- a/gcc/tree-ssa-loop.h +++ b/gcc/tree-ssa-loop.h @@ -67,6 +67,8 @@ public: extern bool for_each_index (tree *, bool (*) (tree, tree *, void *), void *); extern char *get_lsm_tmp_name (tree ref, unsigned n, const char *suffix = NULL); extern unsigned tree_num_loop_insns (class loop *, struct eni_weights *); +extern unsigned tree_average_num_loop_insns (class loop *, + struct eni_weights *); /* Returns the loop of the statement STMT. */