From patchwork Thu Oct 17 08:23:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Feng Xue OS X-Patchwork-Id: 1178357 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-511190-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=os.amperecomputing.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="Mj1c3Vb4"; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=os.amperecomputing.com header.i=@os.amperecomputing.com header.b="BOHuqTnB"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46v2J339wrz9sP3 for ; Thu, 17 Oct 2019 19:24:01 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:content-type :content-transfer-encoding:mime-version; q=dns; s=default; b=L6p 2NOx2lGa6XGWC5D8HOXTzhe/K3hXW2rsHpduX4ciKwmaCRZuK2W+gmowPfVwPYFG ds5s0oKG8ET9Go7hvnQ/8XPOeGjNuSrv8RIdSDVztdrHWAql4GJQm5XqFiDmVkzv uQNJspmwlx+T7bwDuBjRvjmkNTZ1ZVxa4g2yRRrc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:content-type :content-transfer-encoding:mime-version; s=default; bh=Zhb2v5Pxx ykh1AF3UWSsIA/zxrs=; b=Mj1c3Vb4y1z5yq57xfeAUaYvbbBo4X10wNOOmimWL 4ZSIVRT6wyqXuIaBbG2YZTjziQ/YQVR6/6UpY+RmuL5VsT8GzbLtw1Huju1U1vlf cG0OElzCsKWruYhAa+t9lhpHKCN+REUZ9HhzcsBBJTqrtQH2PcJQggVa5Fb9ltew YU= Received: (qmail 44588 invoked by alias); 17 Oct 2019 08:23:52 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 44578 invoked by uid 89); 17 Oct 2019 08:23:52 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-23.7 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.1 spammy=circular X-HELO: NAM01-BN3-obe.outbound.protection.outlook.com Received: from mail-eopbgr740131.outbound.protection.outlook.com (HELO NAM01-BN3-obe.outbound.protection.outlook.com) (40.107.74.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 17 Oct 2019 08:23:50 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=UVW9OXEyYG/rqitXIIyRjTBTDLDFnfgtGFf+JDkVsB2iWfDNRkIdDHD6Kb7crMUsz6/Fr6zz+pKmxoAQthY+LVHfvnDwKf3Epc6X5wUYBYqS11UpY5wDmpbh/wnxED/rLFLmgcF1b7M3cK00rpzh9wz3FyzKxjAnNXSsTYpbc91g3jYyWqLzDfgeAhS0k211nbB7zF286lhhsiHIrmuqbQptvwg6ELmuaIhRDU+oyQVPOt/lXQ2vEdLMJ7QHLIKcB/xh87lFVLSXL3+sya0mBxXKDxiNO07SEMxXEooCQUaj75Yk1odpQByiHR0qvO4e5T1UsBWT6vq3udh26nM1HA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=h8OWuLV/1PTqfwsKFPDC6JY2d58UJly0ZxKdP2KOSMU=; b=I0N62/cZ2nD0kuRWIZ6HXpaRsAXmbKmM4+Isq2QuVbIHtlelGdZ/2cP+Z+kRhKiWjxx1kFUoC9rSLdvObZg1PkZSudqzy/OM90VZk2jkVVdWB46RIUMLZ/Nz2714XozcT33QPBrAR4EIMjxhUvcSZnsG64nVyV7Uf3rztZFzy2X6pclS63xSUmXY/PmRgZlgk6JxrRVs+JOPz2MJtVuqbTQqBvo5/4fEOxzlMhya8dIy1TfW5Neld6kzbJLImVnotSJ9ZBtbWnIC9+mGa3SJz04XzmIDopl2GcoShf/Kl2QnBR1051n3ec8gzCBV+1e1TTZ5c7cO+Y75zyb92V4rHA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=os.amperecomputing.com; dmarc=pass action=none header.from=os.amperecomputing.com; dkim=pass header.d=os.amperecomputing.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=os.amperecomputing.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=h8OWuLV/1PTqfwsKFPDC6JY2d58UJly0ZxKdP2KOSMU=; b=BOHuqTnBQx3PgEB/3A0iZFrE3zGe/HPJ7V0WVJGyqtJdsFuvrUUJWJ9dfjxmYulvZCarVI2HpdylSiFnkCYM1GpiKadAYISQJzqY/LQDveaatLzvmBmU/lhRcVqZYqGeBiPW0z4ACA1v5PYx8xC7S5BjTWa8aDc/GzQvMTIo5mY= Received: from BYAPR01MB4869.prod.exchangelabs.com (20.177.226.139) by BYAPR01MB4823.prod.exchangelabs.com (20.177.226.29) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2347.16; Thu, 17 Oct 2019 08:23:44 +0000 Received: from BYAPR01MB4869.prod.exchangelabs.com ([fe80::a984:9c86:cc04:f50c]) by BYAPR01MB4869.prod.exchangelabs.com ([fe80::a984:9c86:cc04:f50c%5]) with mapi id 15.20.2347.023; Thu, 17 Oct 2019 08:23:44 +0000 From: Feng Xue OS To: "gcc-patches@gcc.gnu.org" , Jan Hubicka , Martin Jambor Subject: [PATCH] Support multi-versioning on self-recursive function (ipa/92133) Date: Thu, 17 Oct 2019 08:23:43 +0000 Message-ID: authentication-results: spf=none (sender IP is ) smtp.mailfrom=fxue@os.amperecomputing.com; x-ms-oob-tlc-oobclassifiers: OLM:5797; received-spf: None (protection.outlook.com: os.amperecomputing.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-ms-exchange-transport-forked: True MIME-Version: 1.0 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: Gr2f4Mp+52/PsZGp5Zsa/La7j7/nSXbHSFzgYLvOb96b6gCbd3VvgOU0siHLYyMr0fY2Sur67GjVCFTvw2XHoHJ9/eywVPeBAkT0F69ViBA= X-IsSubscribed: yes IPA does not allow constant propagation on parameter that is used to control function recursion. recur_fn (i) { if ( !terminate_recursion (i)) { ... recur_fn (i + 1); ... } ... } This patch is composed to enable multi-versioning for self-recursive function, and versioning copies is limited by a specified option. Feng diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 045072e02ec..6255a766e4d 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -229,7 +229,9 @@ public: inline bool set_contains_variable (); bool add_value (valtype newval, cgraph_edge *cs, ipcp_value *src_val = NULL, - int src_idx = 0, HOST_WIDE_INT offset = -1); + int src_idx = 0, HOST_WIDE_INT offset = -1, + ipcp_value **val_pos_p = NULL, + bool unlimited = false); void print (FILE * f, bool dump_sources, bool dump_benefits); }; @@ -1579,22 +1581,37 @@ allocate_and_init_ipcp_value (ipa_polymorphic_call_context source) /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it. CS, SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same meaning. OFFSET -1 means the source is scalar and not a part of an - aggregate. */ + aggregate. If non-NULL, VAL_POS_P specifies position in value list, + after which newly created ipcp_value will be inserted, and it is also + used to record address of the added ipcp_value before function returns. + UNLIMITED means whether value count should not exceed the limit given + by PARAM_IPA_CP_VALUE_LIST_SIZE. */ template bool ipcp_lattice::add_value (valtype newval, cgraph_edge *cs, ipcp_value *src_val, - int src_idx, HOST_WIDE_INT offset) + int src_idx, HOST_WIDE_INT offset, + ipcp_value **val_pos_p, + bool unlimited) { ipcp_value *val; + if (val_pos_p) + { + for (val = values; val && val != *val_pos_p; val = val->next); + gcc_checking_assert (val); + } + if (bottom) return false; for (val = values; val; val = val->next) if (values_equal_for_ipcp_p (val->value, newval)) { + if (val_pos_p) + *val_pos_p = val; + if (ipa_edge_within_scc (cs)) { ipcp_value_source *s; @@ -1609,7 +1626,8 @@ ipcp_lattice::add_value (valtype newval, cgraph_edge *cs, return false; } - if (values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE)) + if (!unlimited + && values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE)) { /* We can only free sources, not the values themselves, because sources of other values in this SCC might point to them. */ @@ -1623,6 +1641,9 @@ ipcp_lattice::add_value (valtype newval, cgraph_edge *cs, } } + if (val_pos_p) + *val_pos_p = NULL; + values = NULL; return set_to_bottom (); } @@ -1630,8 +1651,54 @@ ipcp_lattice::add_value (valtype newval, cgraph_edge *cs, values_count++; val = allocate_and_init_ipcp_value (newval); val->add_source (cs, src_val, src_idx, offset); - val->next = values; - values = val; + if (val_pos_p) + { + val->next = (*val_pos_p)->next; + (*val_pos_p)->next = val; + *val_pos_p = val; + } + else + { + val->next = values; + values = val; + } + + return true; +} + +/* Return true, if a ipcp_value VAL is orginated from parameter value of + self-feeding recursive function by applying non-passthrough arithmetic + transformation. */ + +static bool +self_recursively_generated_p (ipcp_value *val) +{ + class ipa_node_params *info = NULL; + + for (ipcp_value_source *src = val->sources; src; src = src->next) + { + cgraph_edge *cs = src->cs; + + if (!src->val || cs->caller != cs->callee->function_symbol () + || src->val == val) + return false; + + if (!info) + info = IPA_NODE_REF (cs->caller); + + class ipcp_param_lattices *plats = ipa_get_parm_lattices (info, + src->index); + ipcp_lattice *src_lat = src->offset == -1 ? &plats->itself + : plats->aggs; + ipcp_value *src_val; + + for (src_val = src_lat->values; src_val && src_val != val; + src_val = src_val->next); + + if (!src_val) + return false; + } + return true; } @@ -1649,20 +1716,72 @@ propagate_vals_across_pass_through (cgraph_edge *cs, ipa_jump_func *jfunc, ipcp_value *src_val; bool ret = false; - /* Do not create new values when propagating within an SCC because if there - are arithmetic functions with circular dependencies, there is infinite - number of them and we would just make lattices bottom. If this condition - is ever relaxed we have to detect self-feeding recursive calls in - cgraph_edge_brings_value_p in a smarter way. */ + /* Due to circular dependencies, propagating within an SCC through arithmetic + transformation would create infinite number of values. But for + self-feeding recursive function, we could allow propagation in a limited + count, and this can enable a simple kind of recursive function versioning. + For other scenario, we would just make lattices bottom. */ if ((ipa_get_jf_pass_through_operation (jfunc) != NOP_EXPR) && ipa_edge_within_scc (cs)) - ret = dest_lat->set_contains_variable (); + { + if (src_lat != dest_lat) + return dest_lat->set_contains_variable (); + + auto_vec *, 8> val_seeds; + + for (src_val = src_lat->values; src_val; src_val = src_val->next) + { + /* Now we do not use self-recursively generated value as propagation + source, this is absolutely conservative, but could avoid explosion + of lattice's value space, especially when one recursive function + calls another recursive. */ + if (self_recursively_generated_p (src_val)) + { + /* If the lattice has already been propagated for the call site, + no need to do that again. */ + for (ipcp_value_source *s = src_val->sources; s; + s = s->next) + if (s->cs == cs) + return dest_lat->set_contains_variable (); + } + else + val_seeds.safe_push (src_val); + } + + int clone_copies = PARAM_VALUE (PARAM_IPA_CP_MAX_RECURSION_COPIES); + int i; + + /* Recursively generate lattice values with a limited count. */ + FOR_EACH_VEC_ELT (val_seeds, i, src_val) + { + for (int j = 1; j < clone_copies; j++) + { + tree cstval = ipa_get_jf_pass_through_result (jfunc, + src_val->value, + parm_type); + if (!cstval) + break; + + /* Try to place the new lattice value after its source, which + can decrease iterations of propagate stage. */ + ret |= dest_lat->add_value (cstval, cs, src_val, src_idx, + -1, &src_val, true); + gcc_checking_assert (src_val); + } + } + ret |= dest_lat->set_contains_variable (); + } else for (src_val = src_lat->values; src_val; src_val = src_val->next) { + /* Now we do not use self-recursively generated value as propagation + source, otherwise it is easy to make value space of normal lattice + overflow. */ + if (self_recursively_generated_p (src_val)) + continue; + tree cstval = ipa_get_jf_pass_through_result (jfunc, src_val->value, parm_type); - if (cstval) ret |= dest_lat->add_value (cstval, cs, src_val, src_idx); else @@ -3635,6 +3754,9 @@ get_info_about_necessary_edges (ipcp_value *val, cgraph_node *dest, hot |= cs->maybe_hot_p (); if (cs->caller != dest) non_self_recursive = true; + else if (src->val) + gcc_assert (values_equal_for_ipcp_p (src->val->value, + val->value)); } cs = get_next_cgraph_edge_clone (cs); } diff --git a/gcc/params.def b/gcc/params.def index 322c37f8b96..3e242e09f01 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1135,6 +1135,11 @@ DEFPARAM (PARAM_IPA_CP_RECURSION_PENALTY, "are evaluated for cloning.", 40, 0, 100) +DEFPARAM (PARAM_IPA_CP_MAX_RECURSION_COPIES, + "ipa-cp-max-recursion-copies", + "Maximum function versioning copies for a self-recursive function.", + 8, 0, 0) + DEFPARAM (PARAM_IPA_CP_SINGLE_CALL_PENALTY, "ipa-cp-single-call-penalty", "Percentage penalty functions containing a single call to another " diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c b/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c new file mode 100644 index 00000000000..7c1c01047c6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details -fno-early-inlining" } */ + +int fn(); + +int data[100]; + +int recur_fn (int i) +{ + int j; + + if (i == 6) + { + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + return 10; + } + + data[i] = i; + + for (j = 0; j < 100; j++) + recur_fn (i + 1); + + return i; +} + +int main () +{ + int i; + + for (i = 0; i < 100; i++) + recur_fn (1) + recur_fn (-5); + + return 1; +} + +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of recur_fn/\[0-9\]*\\." 12 "cp" } } */