From patchwork Thu Oct 12 12:35:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ajit Agarwal
X-Patchwork-Id: 1847438
Message-ID: <4f7c0c8d-f16c-2fe8-c2e0-2ef4ef01c735@linux.ibm.com>
Date: Thu, 12 Oct 2023 18:05:36 +0530
To: gcc-patches
Cc: Richard Biener, Jeff Law, Segher Boessenkool, Peter Bergner
From: Ajit Agarwal
Subject: [PATCH v9] tree-ssa-sink: Improve code sinking pass
List-Id: Gcc-patches mailing list

This patch improves the code sinking pass to sink statements before calls,
which reduces register pressure.

Review comments are incorporated. The patch is synced with the latest trunk
sources, and the sinking pass is modified accordingly.

For example:

void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
  int l;
  l = a + b + c + d + e + f;
  if (a != 5)
    {
      bar();
      j = l;
    }
}

Code sinking does the following:

void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
  int l;
  if (a != 5)
    {
      l = a + b + c + d + e + f;
      bar();
      j = l;
    }
}

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit

tree-ssa-sink: Improve code sinking pass

Currently, code sinking will sink code after function calls. This increases
register pressure for callee-saved registers. The following patch improves
code sinking by placing the sunk code before calls in the use block or in
the immediate dominator of the use blocks.

2023-10-12  Ajit Kumar Agarwal

gcc/ChangeLog:

	PR tree-optimization/81953
	* tree-ssa-sink.cc (statement_sink_location): Move statements before
	calls.
	(select_best_block): Add heuristics to select the best blocks in the
	immediate post dominator.

gcc/testsuite/ChangeLog:

	PR tree-optimization/81953
	* gcc.dg/tree-ssa/ssa-sink-21.c: New test.
	* gcc.dg/tree-ssa/ssa-sink-22.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 15 ++++++++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c | 19 ++++++++++
 gcc/tree-ssa-sink.cc                        | 39 ++++++++++++---------
 3 files changed, 56 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
new file mode 100644
index 00000000000..d3b79ca5803
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink-stats" } */
+void bar();
+int j;
+void foo(int a, int b, int c, int d, int e, int f)
+{
+  int l;
+  l = a + b + c + d +e + f;
+  if (a != 5)
+    {
+      bar();
+      j = l;
+    }
+}
+/* { dg-final { scan-tree-dump {l_12\s+=\s+_4\s+\+\s+f_11\(D\);\n\s+bar\s+\(\)} sink1 } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
new file mode 100644
index 00000000000..84e7938c54f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink-stats" } */
+void bar();
+int j, x;
+void foo(int a, int b, int c, int d, int e, int f)
+{
+  int l;
+  l = a + b + c + d +e + f;
+  if (a != 5)
+    {
+      bar();
+      if (b != 3)
+        x = 3;
+      else
+        x = 5;
+      j = l;
+    }
+}
+/* { dg-final { scan-tree-dump {l_13\s+=\s+_4\s+\+\s+f_12\(D\);\n\s+bar\s+\(\)} sink1 } } */
diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
index a360c5cdd6e..95298bc8402 100644
--- a/gcc/tree-ssa-sink.cc
+++ b/gcc/tree-ssa-sink.cc
@@ -174,7 +174,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
 
 /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
    tree, return the best basic block between them (inclusive) to place
-   statements.
+   statements. The best basic block should be an immediate dominator of
+   best basic block if the use stmt is after the call.
 
    We want the most control dependent block in the shallowest loop nest.
 
@@ -196,6 +197,16 @@ select_best_block (basic_block early_bb,
   basic_block best_bb = late_bb;
   basic_block temp_bb = late_bb;
   int threshold;
+  /* Get the sinking threshold.  If the statement to be moved has memory
+     operands, then increase the threshold by 7% as those are even more
+     profitable to avoid, clamping at 100%.  */
+  threshold = param_sink_frequency_threshold;
+  if (gimple_vuse (stmt) || gimple_vdef (stmt))
+    {
+      threshold += 7;
+      if (threshold > 100)
+        threshold = 100;
+    }
 
   while (temp_bb != early_bb)
     {
@@ -204,6 +215,14 @@ select_best_block (basic_block early_bb,
       if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb))
         best_bb = temp_bb;
 
+      /* If temp_bb is post dominated by the use block, then the immediate
+         dominator would be our best block.  */
+      if (!gimple_vuse (stmt)
+          && bb_loop_depth (temp_bb) == bb_loop_depth (early_bb)
+          && !(temp_bb->count * 100 >= early_bb->count * threshold)
+          && dominated_by_p (CDI_DOMINATORS, late_bb, temp_bb))
+        best_bb = temp_bb;
+
       /* Walk up the dominator tree, hopefully we'll find a shallower
          loop nest.  */
       temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
@@ -233,17 +252,6 @@ select_best_block (basic_block early_bb,
       && !dominated_by_p (CDI_DOMINATORS, best_bb->loop_father->latch, best_bb))
     return early_bb;
 
-  /* Get the sinking threshold.  If the statement to be moved has memory
-     operands, then increase the threshold by 7% as those are even more
-     profitable to avoid, clamping at 100%.  */
-  threshold = param_sink_frequency_threshold;
-  if (gimple_vuse (stmt) || gimple_vdef (stmt))
-    {
-      threshold += 7;
-      if (threshold > 100)
-        threshold = 100;
-    }
-
   /* If BEST_BB is at the same nesting level, then require it to have
      significantly lower execution frequency to avoid gratuitous movement.  */
   if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
@@ -430,6 +438,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
             continue;
           break;
         }
+
       use = USE_STMT (one_use);
 
       if (gimple_code (use) != GIMPLE_PHI)
@@ -439,10 +448,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
           if (sinkbb == frombb)
             return false;
 
-          if (sinkbb == gimple_bb (use))
-            *togsi = gsi_for_stmt (use);
-          else
-            *togsi = gsi_after_labels (sinkbb);
+          *togsi = gsi_after_labels (sinkbb);
 
           return true;
         }
@@ -825,7 +831,6 @@ pass_sink_code::execute (function *fun)
   mark_dfs_back_edges (fun);
   memset (&sink_stats, 0, sizeof (sink_stats));
   calculate_dominance_info (CDI_DOMINATORS);
-
   virtual_operand_live vop_live;
 
   int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
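
Reviewer note (not part of the patch): the threshold logic that the patch
hoists to the top of select_best_block, and the frequency test it reuses
inside the dominator walk, can be tried out in isolation. The standalone
sketch below is illustrative only. The names sink_threshold and
significantly_colder are hypothetical, the base value 75 is an assumed
stand-in for param_sink_frequency_threshold, and plain integers replace
GCC's profile counts, so the sketch has no "unknown" comparison result.

```c
#include <stdbool.h>

/* Assumed stand-in for param_sink_frequency_threshold; the value is
   chosen for illustration and is not taken from the patch.  */
enum { ASSUMED_BASE_THRESHOLD = 75 };

/* Mirrors the hoisted computation: statements with memory operands get
   a threshold 7 points higher, clamped at 100.  */
static int
sink_threshold (int base, bool has_memory_operands)
{
  int threshold = base;
  if (has_memory_operands)
    {
      threshold += 7;
      if (threshold > 100)
        threshold = 100;
    }
  return threshold;
}

/* Simplified form of the test
     !(bb->count * 100 >= early_bb->count * threshold)
   on plain integer counts.  Returns true when the candidate block runs
   less often than THRESHOLD percent of the early block.  */
static bool
significantly_colder (long candidate_count, long early_count, int threshold)
{
  return !(candidate_count * 100 >= early_count * threshold);
}
```

Under these assumptions, a statement with memory operands and a base of 75
gets a threshold of 82, and a candidate block executed 50 times against an
early block executed 100 times counts as significantly colder. The negated
>= form is kept only to match the source, where it makes an unknown
comparison result favour the early block.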