From patchwork Mon Sep 2 03:06:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Feng Xue OS X-Patchwork-Id: 1156403 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-508130-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=os.amperecomputing.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="WPJwGHZf"; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=os.amperecomputing.com header.i=@os.amperecomputing.com header.b="EsbooWhD"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46MFNR6YRjz9sDB for ; Mon, 2 Sep 2019 13:06:28 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:references:in-reply-to:content-type :content-transfer-encoding:mime-version; q=dns; s=default; b=Ptu htG8vy42U+cl9okWv8pgwyID87zWND5c2UhSDmtXwG4P3DiNrkTIxgZTNFSqMF9V jksqK27w0hxQR3cxB5dFBwuy38eQ1R3ICmrh9U9jt3XrcUgIJ2GdncOPb3lXCnIq PDE1QShEm1Tww57sITqFbHNKjC+L0sGC784LFuBI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:references:in-reply-to:content-type :content-transfer-encoding:mime-version; s=default; bh=FuXlMrlv0 zJHysHWepGQDqORkyk=; b=WPJwGHZf1/KYdmfUAAbZLBHR2ZUOdV83eatgI1pLm 7aBDcbp+v4dwV1i3sqvmp6vaFa5RdBtiaSvpw4y9ouMBya726XzwWqqiaXe5833E /X7JmGtKcKc3xtFbKBoUMI8wBt/xduvG87iHz6VqP5gG5QsNDCOHgH6erDdLpaf2 Kc= Received: (qmail 47367 invoked by alias); 2 Sep 2019 03:06:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 47356 invoked by uid 89); 2 Sep 2019 03:06:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.6 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.1 spammy=connected, nearly, fnoinline, fno-inline X-HELO: NAM02-BL2-obe.outbound.protection.outlook.com Received: from mail-eopbgr750138.outbound.protection.outlook.com (HELO NAM02-BL2-obe.outbound.protection.outlook.com) (40.107.75.138) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 02 Sep 2019 03:06:17 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JwaJ1FTJU5mO2c+TD7j+TaIIZYNogcWEjZN90YSgK9Fr9lmgc3BxhIcecwBWwu4TLAA+ucibB3wHh+c/aeptx9QCBDwRknD8ouYgnHwj6cUSoTxO6B1GCC+KJOGETWEAVAKutugYuyLk5G38VNQZ9KXeg+tZh9adZ8uyRHWHOs08QJ2v7pssNaRjuptAVcmhqDrRTtUQcNXqtCbyFQ/5ukb4I5aL+M5gSBUfiN+dF3y8I9VY9uBX1yah7nAE3GXnhzkZJHDBjVSUAfoosdX68KClcN9Qa0w2urBEQkeQltY2HS1Ot0eYzhGQ/bERwTG+i9tyN7V7aru9e1U9qB31Uw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4VUS1RQHjmnk/sf0xUcmXE03hEPkgeydUjlxoaHKGJ0=; b=fcRCI1jw9UJNrHOgrdAm1CbzdaFinzXjAVkU22t6aP1fuUZKZ1Z5hkTWzMOx44vKewSXOOSwePJjs3jUXDR3/Xsm9/Au5I5hud7lTx0Z8x93BIrmifSNz661Ch4wakCpfNVU+q0IIne7NOaJKBEzObiJDlqocTQtRhZpU9tWL8xOPRNJaiChsnLETSfpZDCraay1+HAVZM/BTXr2l2g784cx8h1V+TZYxtC8XpLY72VbmUXITrRFd8kGHwmFaM2ZbMNuCd/e0BCI+FMb42wku4iL42i0W3K4o1u3YXt+JWk8ppf43KSeHvY6AZvfAg1Wb62+TQeP3c4KHmB+lWIPnQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=os.amperecomputing.com; dmarc=pass action=none header.from=os.amperecomputing.com; dkim=pass header.d=os.amperecomputing.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=os.amperecomputing.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4VUS1RQHjmnk/sf0xUcmXE03hEPkgeydUjlxoaHKGJ0=; b=EsbooWhDx1Mj+gEfnwvExE71BRtUtSN9GOnXlnBz4qWCKxL0v39eGxc3AXIWXckyADUCxA/VvkZL+ceLIVV+zYLC8yk+rDjQ3j625Rfn8qzBjWwGrs1X0nHZh084n3c7r0iC8VP94ERPmX+CiteaDPq+xk9Qu+G0N1gIUJp1ThE= Received: from BYAPR01MB4869.prod.exchangelabs.com (20.177.228.18) by BYAPR01MB4261.prod.exchangelabs.com (52.135.239.84) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2220.19; Mon, 2 Sep 2019 03:06:13 +0000 Received: from BYAPR01MB4869.prod.exchangelabs.com ([fe80::60eb:f69d:f5b6:cc27]) by BYAPR01MB4869.prod.exchangelabs.com ([fe80::60eb:f69d:f5b6:cc27%2]) with mapi id 15.20.2199.021; Mon, 2 Sep 2019 03:06:13 +0000 From: Feng Xue OS To: Martin Jambor , Jan Hubicka , "gcc-patches@gcc.gnu.org" Subject: [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089) Date: Mon, 2 Sep 2019 03:06:12 +0000 Message-ID: References: , In-Reply-To: authentication-results: spf=none (sender IP is ) smtp.mailfrom=fxue@os.amperecomputing.com; x-ms-oob-tlc-oobclassifiers: OLM:10000; received-spf: None (protection.outlook.com: os.amperecomputing.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-ms-exchange-transport-forked: True MIME-Version: 1.0 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: tjETyMh4RcOXHkhGJE20UgL1TZgYMgfhUJ7ICZerBAlcQ83xdxL4MQe+PEYvE5KxDxeYj4n5E/MgCRP/b6JFSVQLKpVokmjW5atEwcEtj54= > I have read through the patch and it looks OK to me but I cannot approve > it, you have to ping Honza for that. Since you decided to use the value > range info, it would be nice if you could also add a testcase where it > plays a role. It is somewhat hard to write a testcase to show role of range info only with this patch. If another patch "Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will update this testcase to show that. And this new version also resolves a problem that we might generate very complicated predicate for basic block in convergence point reached from all switch cases. switch (index) / | \ case1 case2 ... default \ | / convergence point For switch/if statement, current algorithm synthesizes predicate of convergence point by OR-combining predicates of all its cases/branches, which might give us a very complicated one. Actually, we can get simplified predicate by using the fact that predicate for a basic block must also hold true for its post dominators. To be specific, convergence point should include (at least equal to) predicate of the switch/if statement. Honza & Martin, Would please review this new patch when you have time? Thanks. Feng ----- diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 278bf606661..dc5752fc005 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi, /* invert_tree_comparison will return ERROR_MARK on FP comparsions that are not EQ/NE instead of returning proper unordered one. Be sure it is not confused with NON_CONSTANT. */ - if (this_code != ERROR_MARK) + if (this_code != ERROR_MARK + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) { predicate p = add_condition (summary, index, size, &aggpos, this_code, @@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos)) return; + auto_vec > ranges; + tree type = TREE_TYPE (op); + tree one_cst = build_one_cst (type); + int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS); + int bound_count = 0; + value_range_base vr; + + get_range_info (op, vr); + FOR_EACH_EDGE (e, ei, bb->succs) { e->aux = edge_predicate_pool.allocate (); *(predicate *) e->aux = false; } + + e = gimple_switch_edge (cfun, last, 0); + /* Set BOUND_COUNT to maximum count to bypass computing predicate for + default case if its target basic block is in convergence point of all + switch cases, which can be determined by checking whether it + post-dominates the switch statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) + bound_count = INT_MAX; + n = gimple_switch_num_labels (last); - for (case_idx = 0; case_idx < n; ++case_idx) + for (case_idx = 1; case_idx < n; ++case_idx) { tree cl = gimple_switch_label (last, case_idx); tree min, max; @@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, min = CASE_LOW (cl); max = CASE_HIGH (cl); - /* For default we might want to construct predicate that none - of cases is met, but it is bit hard to do not having negations - of conditionals handy. */ - if (!min && !max) + /* The case's target basic block is in convergence point of all switch + cases, its predicate should be at least as that of the switch + statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) p = true; else if (!max) p = add_condition (summary, index, size, &aggpos, EQ_EXPR, @@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, } *(class predicate *) e->aux = p.or_with (summary->conds, *(class predicate *) e->aux); + + /* If there are too many disjoint case ranges, predicate for default + case might become too complicated. So add a limit here. */ + if (bound_count > bound_limit) + continue; + + bool new_range = true; + + if (!ranges.is_empty ()) + { + tree last_max_plus_one + = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst); + + /* Merge case ranges if they are continuous. */ + if (tree_int_cst_equal (min, last_max_plus_one)) + new_range = false; + else if (vr.kind () == VR_ANTI_RANGE) + { + tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst); + + /* If two disjoint case ranges can be connected by anti-range + of switch index, combine them to one range. */ + if (tree_int_cst_lt (vr.max (), min_minus_one)) + vr.set_undefined (); + else if (tree_int_cst_le (vr.min (), last_max_plus_one)) + new_range = false; + } + } + + if (!max) + max = min; + + /* Create/extend a case range. And we count endpoints of range set, + this number nearly equals to number of conditions that we will create + for predicate of default case. */ + if (new_range) + { + bound_count += (min == max) ? 1 : 2; + ranges.safe_push (std::make_pair (min, max)); + } + else + { + bound_count += (ranges.last ().first == ranges.last ().second); + ranges.last ().second = max; + } + } + + e = gimple_switch_edge (cfun, last, 0); + if (bound_count > bound_limit) + { + *(class predicate *) e->aux = true; + return; + } + + tree vr_min = vr.kind () == VR_RANGE ? vr.min () : TYPE_MIN_VALUE (type); + tree vr_max = vr.kind () == VR_RANGE ? vr.max () : TYPE_MAX_VALUE (type); + predicate p_seg = true; + predicate p_all = false; + + /* Construct predicate to represent default range set that is negation of + all case ranges. Case range is classified as containing single/non-single + values. Suppose a piece of case ranges in the following. + + [D1...D2] [S1] ... [Sn] [D3...D4] + + To represent default case's range sets between two non-single value + case ranges (From D2 to D3), we construct predicate as: + + D2 < x < D3 && x != S1 && ... && x != Sn + */ + for (size_t i = 0; i < ranges.length (); i++) + { + tree min = ranges[i].first; + tree max = ranges[i].second; + + if (min == max) + p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR, + unshare_expr_without_location (min)); + else + { + /* Do not create sub-predicate for range that is beyond low bound + of switch index. */ + if (tree_int_cst_lt (vr_min, min)) + { + p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR, + unshare_expr_without_location (min)); + p_all = p_all.or_with (summary->conds, p_seg); + } + + /* Do not create sub-predicate for range that is beyond up bound + of switch index. */ + if (tree_int_cst_le (vr_max, max)) + { + p_seg = false; + break; + } + + p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR, + unshare_expr_without_location (max)); + } } + + p_all = p_all.or_with (summary->conds, p_seg); + *(class predicate *) e->aux + = p_all.or_with (summary->conds, *(class predicate *) e->aux); } @@ -1354,10 +1477,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, break; } } - if (p == false) - gcc_checking_assert (!bb->aux); - else + if (p != false) { + basic_block pdom_bb; + if (!bb->aux) { done = false; @@ -1376,6 +1499,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, *((predicate *) bb->aux) = p; } } + + /* For switch/if statement, we can OR-combine predicates of all + its cases/branches to get predicate for basic block in their + convergence point, but sometimes this will generate very + complicated predicate. Actually, we can get simplified + predicate in another way by using the fact that predicate + for a basic block must also hold true for its post dominators. + To be specific, basic block in convergence point of + conditional statement should include predicate of the + statement. */ + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb) + ; + else if (!pdom_bb->aux) + { + done = false; + pdom_bb->aux = edge_predicate_pool.allocate (); + *((predicate *) pdom_bb->aux) = p; + } + else if (p != *(predicate *) pdom_bb->aux) + { + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux); + if (p != *(predicate *) pdom_bb->aux) + { + done = false; + *((predicate *) pdom_bb->aux) = p; + } + } } } } @@ -1980,6 +2131,7 @@ analyze_function_body (struct cgraph_node *node, bool early) if (opt_for_fn (node->decl, optimize)) { calculate_dominance_info (CDI_DOMINATORS); + calculate_dominance_info (CDI_POST_DOMINATORS); if (!early) loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); else @@ -2360,6 +2512,7 @@ analyze_function_body (struct cgraph_node *node, bool early) else if (!ipa_edge_args_sum) ipa_free_all_node_params (); free_dominance_info (CDI_DOMINATORS); + free_dominance_info (CDI_POST_DOMINATORS); } if (dump_file) { diff --git a/gcc/params.def b/gcc/params.def index 13001a7bb2d..1461aa4849d 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS, "parameter analysis based on alias analysis in any given function.", 25000, 0, 0) +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, + "ipa-max-switch-predicate-bounds", + "Maximal number of boundary endpoints of case ranges of switch " + "statement. For switch exceeding this limit, do not construct cost-" + "evaluating predicate for default case during IPA function summary" + "generation.", + 5, 0, 0) + /* WHOPR partitioning configuration. */ DEFPARAM (PARAM_LTO_PARTITIONS, diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c new file mode 100644 index 00000000000..dbd9b8c94fe --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c @@ -0,0 +1,111 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */ + +int fn(); + +int data; + +int callee1(int i) +{ + switch (i) + { + case -126: return i + 13; + case -127: return i + 5; + case -8: return i * i; + case 0: return i % 9; + case 5: + case 7: + case 6: return 3; + default: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + } + + return data += i; +} + +int fn2(); + +int callee2(int i) +{ + switch (i ) + { + case 0: + fn(); + fn(); + fn(); + case 1: + fn(); + fn(); + case -1: + fn(); + case -2: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + data += i; + break; + } + + if (i == 1000) + { + int j; + + for (j = 0; j < 100; j++) + fn2(); + } + return i + 3; +} + +int caller() +{ + return callee1 (-127) + + callee1 (-126) + + callee1 (-8) + + callee1 (0) + + callee1 (5) + + callee1 (6) + + callee1 (7) + + callee1 (100) + + callee2 (9); + callee2 (13); +} + +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */ +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */