From patchwork Fri Oct 13 19:43:10 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 825707 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-464190-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="GIynYBWT"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3yDJ6j6Bgwz9sRV for ; Sat, 14 Oct 2017 06:43:24 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; q=dns; s=default; b=wD/pTjo6rJVpOte1dTZntfg4sWyWv YRhyncT3fNd/9momP0vsZkihIfjvHJ38+TdxJA34N4206S+4EheNkL3B4QzvgYxT ovn3kJYQuA2PWE7C5lr1rNuSNHpHHD3MbJ4KHjxxXe7jzmFSiukZLBWkCKGajqnw J3lIg2sHJKC4vc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; s=default; bh=sWb6AsKAm/kTLyFfN0QAmDwPuxw=; b=GIy nYBWTX7RoUX+BnAuneKQ1Whacb6k5y/+fwE4WtJXT1ZoggZ43YjZRREqJixmgp8f vU4t1ay/nsSrEBfNtgeW4KoTr5M+IAoN32fFlAwa32QdMHvb+hLJ4Xl4yFlddSJD 9Pb87AIOW1lvURShkP2/DZcoU36yHqKkH71420Zg= Received: (qmail 120131 invoked by alias); 13 Oct 2017 19:43:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 120121 invoked by uid 89); 13 Oct 2017 19:43:17 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-11.9 required=5.0 tests=BAYES_00, GIT_PATCH_2, GIT_PATCH_3, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 13 Oct 2017 19:43:15 +0000 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B6283820FE; Fri, 13 Oct 2017 19:43:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com B6283820FE Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=jakub@redhat.com Received: from tucnak.zalov.cz (ovpn-116-223.ams2.redhat.com [10.36.116.223]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 5EDBB5C543; Fri, 13 Oct 2017 19:43:14 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id v9DJhBnc016405; Fri, 13 Oct 2017 21:43:12 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id v9DJhAxe016404; Fri, 13 Oct 2017 21:43:10 +0200 Date: Fri, 13 Oct 2017 21:43:10 +0200 From: Jakub Jelinek To: Richard Biener Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] Slightly improve phiopt value_replacement optimization (PR middle-end/62263, PR middle-end/82498) Message-ID: <20171013194310.GM14653@tucnak> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.7.1 (2016-10-04) X-IsSubscribed: yes Hi! First of all, there was a typo, we are optimizing (x != 0) ? x + y : y into x + y rather than y. And, as the comment mentions, the condition that there is just a single stmt is too restrictive and in the various patterns posted in the various rotate related PRs there are cases where in addition to the last assign stmt there are some preparation statements (sometimes conversion, sometimes bit masking with constant, sometimes both). I think it is undesirable to turn the conditional code into always executed one if it is too large, so this patch allows just very few very cheap preparation statements which feed each other and it is easily possible to propagate the cond_rhs constant through those statements to compute what the result from those would be (and make sure no UB). With this patch on top of the previous one, on rotate-8.c testcase on x86_64-linux and i686-linux we get the most efficient code in all cases - just a rol/ror instruction with perhaps some moves into registers needed to perform that, but without any conditionals, masking etc. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2017-10-13 Jakub Jelinek PR middle-end/62263 PR middle-end/82498 * tree-ssa-phiopt.c (value_replacement): Comment fix. Handle up to 2 preparation statements for ASSIGN in MIDDLE_BB. * c-c++-common/rotate-8.c: Expect no PHIs in optimized dump. Jakub --- gcc/tree-ssa-phiopt.c.jj 2017-10-10 22:04:08.000000000 +0200 +++ gcc/tree-ssa-phiopt.c 2017-10-13 17:55:01.578450763 +0200 @@ -995,11 +995,13 @@ value_replacement (basic_block cond_bb, } - /* Now optimize (x != 0) ? x + y : y to just y. - The following condition is too restrictive, there can easily be another - stmt in middle_bb, for instance a CONVERT_EXPR for the second argument. */ - gimple *assign = last_and_only_stmt (middle_bb); - if (!assign || gimple_code (assign) != GIMPLE_ASSIGN + /* Now optimize (x != 0) ? x + y : y to just x + y. */ + gsi = gsi_last_nondebug_bb (middle_bb); + if (gsi_end_p (gsi)) + return 0; + + gimple *assign = gsi_stmt (gsi); + if (!is_gimple_assign (assign) || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)) && !POINTER_TYPE_P (TREE_TYPE (arg0)))) @@ -1009,6 +1011,71 @@ value_replacement (basic_block cond_bb, if (!gimple_seq_empty_p (phi_nodes (middle_bb))) return 0; + /* Allow up to 2 cheap preparation statements that prepare argument + for assign, e.g.: + if (y_4 != 0) + goto ; + else + goto ; + : + _1 = (int) y_4; + iftmp.0_6 = x_5(D) r<< _1; + : + # iftmp.0_2 = PHI + or: + if (y_3(D) == 0) + goto ; + else + goto ; + : + y_4 = y_3(D) & 31; + _1 = (int) y_4; + _6 = x_5(D) r<< _1; + : + # _2 = PHI */ + gimple *prep_stmt[2] = { NULL, NULL }; + int prep_cnt; + for (prep_cnt = 0; ; prep_cnt++) + { + gsi_prev_nondebug (&gsi); + if (gsi_end_p (gsi)) + break; + + gimple *g = gsi_stmt (gsi); + if (gimple_code (g) == GIMPLE_LABEL) + break; + + if (prep_cnt == 2 || !is_gimple_assign (g)) + return 0; + + tree lhs = gimple_assign_lhs (g); + tree rhs1 = gimple_assign_rhs1 (g); + use_operand_p use_p; + gimple *use_stmt; + if (TREE_CODE (lhs) != SSA_NAME + || TREE_CODE (rhs1) != SSA_NAME + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + || !INTEGRAL_TYPE_P (TREE_TYPE (rhs1)) + || !single_imm_use (lhs, &use_p, &use_stmt) + || use_stmt != (prep_cnt ? prep_stmt[prep_cnt - 1] : assign)) + return 0; + switch (gimple_assign_rhs_code (g)) + { + CASE_CONVERT: + break; + case PLUS_EXPR: + case BIT_AND_EXPR: + case BIT_IOR_EXPR: + case BIT_XOR_EXPR: + if (TREE_CODE (gimple_assign_rhs2 (g)) != INTEGER_CST) + return 0; + break; + default: + return 0; + } + prep_stmt[prep_cnt] = g; + } + /* Only transform if it removes the condition. */ if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, e1)) return 0; @@ -1019,7 +1086,7 @@ value_replacement (basic_block cond_bb, && profile_status_for_fn (cfun) != PROFILE_ABSENT && EDGE_PRED (middle_bb, 0)->probability < profile_probability::even () /* If assign is cheap, there is no point avoiding it. */ - && estimate_num_insns (assign, &eni_time_weights) + && estimate_num_insns (bb_seq (middle_bb), &eni_time_weights) >= 3 * estimate_num_insns (cond, &eni_time_weights)) return 0; @@ -1030,6 +1097,32 @@ value_replacement (basic_block cond_bb, tree cond_lhs = gimple_cond_lhs (cond); tree cond_rhs = gimple_cond_rhs (cond); + /* Propagate the cond_rhs constant through preparation stmts, + make sure UB isn't invoked while doing that. */ + for (int i = prep_cnt - 1; i >= 0; --i) + { + gimple *g = prep_stmt[i]; + tree grhs1 = gimple_assign_rhs1 (g); + if (!operand_equal_for_phi_arg_p (cond_lhs, grhs1)) + return 0; + cond_lhs = gimple_assign_lhs (g); + cond_rhs = fold_convert (TREE_TYPE (grhs1), cond_rhs); + if (TREE_CODE (cond_rhs) != INTEGER_CST + || TREE_OVERFLOW (cond_rhs)) + return 0; + if (gimple_assign_rhs_class (g) == GIMPLE_BINARY_RHS) + { + cond_rhs = int_const_binop (gimple_assign_rhs_code (g), cond_rhs, + gimple_assign_rhs2 (g)); + if (TREE_OVERFLOW (cond_rhs)) + return 0; + } + cond_rhs = fold_convert (TREE_TYPE (cond_lhs), cond_rhs); + if (TREE_CODE (cond_rhs) != INTEGER_CST + || TREE_OVERFLOW (cond_rhs)) + return 0; + } + if (((code == NE_EXPR && e1 == false_edge) || (code == EQ_EXPR && e1 == true_edge)) && arg0 == lhs @@ -1071,7 +1164,15 @@ value_replacement (basic_block cond_bb, duplicate_ssa_name_range_info (lhs, SSA_NAME_RANGE_TYPE (phires), phires_range_info); } - gimple_stmt_iterator gsi_from = gsi_for_stmt (assign); + gimple_stmt_iterator gsi_from; + for (int i = prep_cnt - 1; i >= 0; --i) + { + tree plhs = gimple_assign_lhs (prep_stmt[i]); + SSA_NAME_RANGE_INFO (plhs) = NULL; + gsi_from = gsi_for_stmt (prep_stmt[i]); + gsi_move_before (&gsi_from, &gsi); + } + gsi_from = gsi_for_stmt (assign); gsi_move_before (&gsi_from, &gsi); replace_phi_edge_with_variable (cond_bb, e1, phi, lhs); return 2; --- gcc/testsuite/c-c++-common/rotate-8.c.jj 2017-10-13 15:55:32.000000000 +0200 +++ gcc/testsuite/c-c++-common/rotate-8.c 2017-10-13 17:57:32.861627651 +0200 @@ -3,6 +3,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */ /* { dg-final { scan-tree-dump-times "r\[<>]\[<>]" 23 "optimized" } } */ +/* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */ unsigned int f1 (unsigned int x, unsigned char y)