From patchwork Wed May 8 10:09:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: JunMa X-Patchwork-Id: 1096871 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-500302-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="dB5ocutF"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44zXJc5WdZz9s7h for ; Wed, 8 May 2019 20:09:36 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=vPOKgk4+jYwk6mGEWmNDFu9jWliT8NKp0ZE3VkNLcDrCavBUUh batTXsvAOTVZ1s7WM8BJ9yRlwTzIKQdqsCYda8Covn/UQghU71zJNGv4/oi7Ds+e 8lqiruK12jPu1jsvBKqYgx7gC8Zw+PkXuf+6pa8GMs/xB9dC+WdUK3gp4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; s= default; bh=gprpx0pqVjyq0G4Jl3+WS14o0Gc=; b=dB5ocutFNPDizyVpJ+Ua drhyswuScaFp6wegjkm7NRcXL+myrCIVtJK0dNgMptI+P7cWMq98RbCC/7kkNOM1 gVNLvdN+Clf2htJYM2fpA/dWly7TEZMCV0KHy54Qp6IPJasIILdlDMksNzyAva54 eGJzhqFradWpAUMwe9vCdQg= Received: (qmail 110058 invoked by alias); 8 May 2019 10:09:28 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 110050 invoked by uid 89); 8 May 2019 10:09:28 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, SPF_FAIL, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 spammy=H*r:TLS1.0 X-HELO: eggs.gnu.org Received: from eggs.gnu.org (HELO eggs.gnu.org) (209.51.188.92) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 08 May 2019 10:09:25 +0000 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hOJVp-0002MV-Ot for gcc-patches@gcc.gnu.org; Wed, 08 May 2019 06:09:23 -0400 Received: from out4436.biz.mail.alibaba.com ([47.88.44.36]:6531) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hOJVp-0002KK-1i for gcc-patches@gcc.gnu.org; Wed, 08 May 2019 06:09:21 -0400 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R161e4; CH=green; DM=||false|; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04426; MF=junma@linux.alibaba.com; NM=1; PH=DS; RN=1; SR=0; TI=SMTPD_---0TRApW2S_1557310146; Received: from MacBook-Pro-3.local(mailfrom:JunMa@linux.alibaba.com fp:SMTPD_---0TRApW2S_1557310146) by smtp.aliyun-inc.com(127.0.0.1); Wed, 08 May 2019 18:09:07 +0800 To: gcc-patches From: JunMa Subject: [PATCH][PR90106] Builtin call transformation changes in cdce pass Message-ID: Date: Wed, 8 May 2019 18:09:06 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 47.88.44.36 X-IsSubscribed: yes Hi As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), when gcc meets builtin function call like:   y = sqrt (x); The cdce pass tries to transform the call into an internal function call and conditionally executes call with a simple range check on the arguments which can detect most cases and the errno does not need to be set. It looks like:   y = IFN_SQRT (x);   if (__builtin_isless (x, 0))     sqrt (x); However, If the call is in tail position, for example:   y =  sqrt (x);   return y; will become:   y = IFN_SQRT (x);   if (__builtin_isless (x, 0))     sqrt (x);   return y; This transformation breaks tailcall pattern, and prevents later tailcall optimizations. So This patch transform builtin call with return value into if-then-else part, which looks like:   y =  sqrt (x);    ==>   if (__builtin_isless (x, 0))     y = sqrt (x);   else     y = IFN_SQRT (x); BTW, y = sqrt (x) can also transform like:   y = IFN_SQRT (x);   if (__builtin_isless (x, 0))     y = sqrt (x); We don‘t choose this pattern because it emits worse assemble code(more move instruction and use more registers) in x86_64. Bootstrapped/regtested on x86_64-linux, ok for trunk? Regards JunMa gcc/ChangeLog 2019-05-07  Jun Ma     PR tree-optimization/90106     * tree-call-cdce.c (shrink_wrap_one_built_in_call_with_conds): Add     new parameter as new internal function call, also move it to new     basic block.     (use_internal_fn): Pass internal function call to     shrink_wrap_one_built_in_call_with_conds. gcc/testsuite/ChangeLog 2019-05-07  Jun Ma     PR tree-optimization/90106     * gcc.dg/cdce1.c: Check tailcall code generation after cdce pass.     * gcc.dg/cdce2.c: Likewise. --- gcc/testsuite/gcc.dg/cdce1.c | 3 +- gcc/testsuite/gcc.dg/cdce2.c | 3 +- gcc/tree-call-cdce.c | 90 +++++++++++++++++++++++++++++++++----------- 3 files changed, 71 insertions(+), 25 deletions(-) diff --git a/gcc/testsuite/gcc.dg/cdce1.c b/gcc/testsuite/gcc.dg/cdce1.c index b23ad63..424d80f 100644 --- a/gcc/testsuite/gcc.dg/cdce1.c +++ b/gcc/testsuite/gcc.dg/cdce1.c @@ -1,7 +1,8 @@ /* { dg-do run } */ /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ /* { dg-require-effective-target int32plus } */ -/* { dg-final { scan-tree-dump "cdce1.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-assembler "jmp pow" } } */ /* { dg-require-effective-target large_double } */ #include diff --git a/gcc/testsuite/gcc.dg/cdce2.c b/gcc/testsuite/gcc.dg/cdce2.c index 30e7cb1..2af2893 100644 --- a/gcc/testsuite/gcc.dg/cdce2.c +++ b/gcc/testsuite/gcc.dg/cdce2.c @@ -1,7 +1,8 @@ /* { dg-do run } */ /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ -/* { dg-final { scan-tree-dump "cdce2.c:15: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-assembler "jmp log" } } */ #include #include diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c index 2e482b3..9e3372f 100644 --- a/gcc/tree-call-cdce.c +++ b/gcc/tree-call-cdce.c @@ -93,10 +93,10 @@ along with GCC; see the file COPYING3. If not see y = sqrt (x); ==> - y = IFN_SQRT (x); if (__builtin_isless (x, 0)) - sqrt (x); - + y = sqrt (x); + else + y = IFN_SQRT (x); In the vast majority of cases we should then never need to call sqrt. Note that library functions are not supposed to clear errno to zero without @@ -793,14 +793,16 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec conds, } /* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS - conditions in CONDS is false. */ + conditions in CONDS is false. Also move BI_NEWCALL to a new basic block + when it is non-null, it is called while all of the CONDS are true. */ static void shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec conds, - unsigned int nconds) + unsigned int nconds, + gcall *bi_newcall = NULL) { gimple_stmt_iterator bi_call_bsi; - basic_block bi_call_bb, join_tgt_bb, guard_bb; + basic_block bi_call_bb, bi_newcall_bb, join_tgt_bb, guard_bb; edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru; edge bi_call_in_edge0, guard_bb_in_edge; unsigned tn_cond_stmts; @@ -809,27 +811,26 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec conds, gimple *cond_expr_start; /* The cfg we want to create looks like this: - - [guard n-1] <- guard_bb (old block) - | \ - | [guard n-2] } - | / \ } - | / ... } new blocks - | / [guard 0] } - | / / | } - [ call ] | <- bi_call_bb } - | \ | - | \ | - | [ join ] <- join_tgt_bb (old iff call must end bb) - | + [guard n-1] <- guard_bb (old block) + | \ + | [guard n-2] } + | / \ } + | / ... } new blocks + | / [guard 0] } + | / / | } + [call] | <- bi_call_bb } + \ [newcall] <-bi_newcall_bb} + \ | + [join] <- join_tgt_bb (old iff call must end bb) possible EH edges (only if [join] is old) When [join] is new, the immediate dominators for these blocks are: 1. [guard n-1]: unchanged 2. [call]: [guard n-1] - 3. [guard m]: [guard m+1] for 0 <= m <= n-2 - 4. [join]: [guard n-1] + 3. [newcall]: [guard 0] + 4. [guard m]: [guard m+1] for 0 <= m <= n-2 + 5. [join]: [guard n-1] We punt for the more complex case case of [join] being old and simply free the dominance info. We also punt on postdominators, @@ -927,6 +928,47 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec conds, edges.quick_push (edge_pair (bi_call_in_edge, guard_bb_in_edge)); } + /* Move BI_NEWCALL to new basic block when it is non-null. */ + if (bi_newcall) + { + /* Get bi_newcall_bb by split join_tgt_in_edge_fall_thru edge, + and move BI_NEWCALL to bi_newcall_bb. */ + bi_newcall_bb = split_edge (join_tgt_in_edge_fall_thru); + gimple_stmt_iterator to_gsi = gsi_start_bb (bi_newcall_bb); + gimple_stmt_iterator from_gsi = gsi_for_stmt (bi_newcall); + gsi_move_before (&from_gsi, &to_gsi); + join_tgt_in_edge_fall_thru = EDGE_SUCC (bi_newcall_bb, 0); + join_tgt_bb = join_tgt_in_edge_fall_thru->dest; + + tree bi_newcall_lhs = gimple_call_lhs (bi_newcall); + tree bi_call_lhs = gimple_call_lhs (bi_call); + if (!bi_call_lhs) + { + bi_call_lhs = copy_ssa_name (bi_newcall_lhs); + gimple_call_set_lhs (bi_call, bi_call_lhs); + SSA_NAME_DEF_STMT (bi_call_lhs) = bi_call; + } + + /* Create phi node for lhs of BI_CALL and BI_NEWCALL. */ + gphi *new_phi = create_phi_node (copy_ssa_name (bi_newcall_lhs), + join_tgt_bb); + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (new_phi)) + = SSA_NAME_OCCURS_IN_ABNORMAL_PHI (bi_newcall_lhs); + add_phi_arg (new_phi, bi_call_lhs, join_tgt_in_edge_from_call, + gimple_location (bi_call)); + add_phi_arg (new_phi, bi_newcall_lhs, join_tgt_in_edge_fall_thru, + gimple_location (bi_newcall)); + + /* Replace all use of original return value with result of phi node. */ + use_operand_p use_p; + gimple *use_stmt; + imm_use_iterator iterator; + FOR_EACH_IMM_USE_STMT (use_stmt, iterator, bi_newcall_lhs) + if (use_stmt != new_phi) + FOR_EACH_IMM_USE_ON_STMT (use_p, iterator) + SET_USE (use_p, PHI_RESULT (new_phi)); + } + /* Now update the probability and profile information, processing the guards in order of execution. @@ -1030,9 +1072,11 @@ use_internal_fn (gcall *call) unsigned nconds = 0; auto_vec conds; + bool is_arg_conds = false; if (can_test_argument_range (call)) { gen_shrink_wrap_conditions (call, conds, &nconds); + is_arg_conds = true; gcc_assert (nconds != 0); } else @@ -1082,8 +1126,8 @@ use_internal_fn (gcall *call) call = new_call; } } - - shrink_wrap_one_built_in_call_with_conds (call, conds, nconds); + shrink_wrap_one_built_in_call_with_conds (call, conds, nconds, + is_arg_conds ? new_call : NULL); } /* The top level function for conditional dead code shrink