From patchwork Mon May 23 18:06:31 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bill Schmidt X-Patchwork-Id: 97020 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id D3A91B6FB4 for ; Tue, 24 May 2011 04:07:16 +1000 (EST) Received: (qmail 27392 invoked by alias); 23 May 2011 18:07:14 -0000 Received: (qmail 27380 invoked by uid 22791); 23 May 2011 18:07:12 -0000 X-SWARE-Spam-Status: No, hits=-1.0 required=5.0 tests=AWL, BAYES_00, MAY_BE_FORGED, TW_CF, TW_TM, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from e8.ny.us.ibm.com (HELO e8.ny.us.ibm.com) (32.97.182.138) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 23 May 2011 18:06:55 +0000 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e8.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p4NDeFbE015720 for ; Mon, 23 May 2011 09:40:15 -0400 Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com [9.17.195.85]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p4NI6r6o110166 for ; Mon, 23 May 2011 14:06:53 -0400 Received: from d03av05.boulder.ibm.com (loopback [127.0.0.1]) by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p4NI6X3Y025431 for ; Mon, 23 May 2011 12:06:33 -0600 Received: from [9.10.86.209] (tepot-pc.rchland.ibm.com [9.10.86.209] (may be forged)) by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id p4NI6WoR025361; Mon, 23 May 2011 12:06:32 -0600 Subject: Re: [PATCH] Add powi-to-multiply expansion to cse_sincos pass From: "William J. Schmidt" To: Richard Guenther Cc: gcc-patches@gcc.gnu.org In-Reply-To: References: <1305813841.29239.3.camel@gnopaine> Date: Mon, 23 May 2011 13:06:31 -0500 Message-Id: <1306173991.4821.7.camel@L3G5336.ibm.com> Mime-Version: 1.0 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Richard, thanks for the excellent comments. I really appreciate your help. I've implemented them all, and bootstrap/regtest succeeds on powerpc target. Here's the revised patch for your consideration. Thanks, Bill 2011-05-23 Bill Schmidt * tree-ssa-math-opts.c (powi_table): New. (powi_lookup_cost): New. (powi_cost): New. (powi_as_mults_1): New. (powi_as_mults): New. (gimple_expand_builtin_powi): New. (execute_cse_sincos): Add switch case for BUILT_IN_POWI. (gate_cse_sincos): Remove sincos/cexp restriction. Index: gcc/ChangeLog =================================================================== --- gcc/ChangeLog (revision 174075) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,14 @@ +2011-05-23 Bill Schmidt + + * tree-ssa-math-opts.c (powi_table): New. + (powi_lookup_cost): New. + (powi_cost): New. + (powi_as_mults_1): New. + (powi_as_mults): New. + (gimple_expand_builtin_powi): New. + (execute_cse_sincos): Add switch case for BUILT_IN_POWI. + (gate_cse_sincos): Remove sincos/cexp restriction. + 2011-05-23 Richard Guenther * gimple.c (gimple_types_compatible_p_1): Always compare type names. Index: gcc/tree-ssa-math-opts.c =================================================================== --- gcc/tree-ssa-math-opts.c (revision 174075) +++ gcc/tree-ssa-math-opts.c (working copy) @@ -795,8 +795,243 @@ execute_cse_sincos_1 (tree name) return cfg_changed; } +/* To evaluate powi(x,n), the floating point value x raised to the + constant integer exponent n, we use a hybrid algorithm that + combines the "window method" with look-up tables. For an + introduction to exponentiation algorithms and "addition chains", + see section 4.6.3, "Evaluation of Powers" of Donald E. Knuth, + "Seminumerical Algorithms", Vol. 2, "The Art of Computer Programming", + 3rd Edition, 1998, and Daniel M. Gordon, "A Survey of Fast Exponentiation + Methods", Journal of Algorithms, Vol. 27, pp. 129-146, 1998. */ + +/* Provide a default value for POWI_MAX_MULTS, the maximum number of + multiplications to inline before calling the system library's pow + function. powi(x,n) requires at worst 2*bits(n)-2 multiplications, + so this default never requires calling pow, powf or powl. */ + +#ifndef POWI_MAX_MULTS +#define POWI_MAX_MULTS (2*HOST_BITS_PER_WIDE_INT-2) +#endif + +/* The size of the "optimal power tree" lookup table. All + exponents less than this value are simply looked up in the + powi_table below. This threshold is also used to size the + cache of pseudo registers that hold intermediate results. */ +#define POWI_TABLE_SIZE 256 + +/* The size, in bits of the window, used in the "window method" + exponentiation algorithm. This is equivalent to a radix of + (1<= POWI_TABLE_SIZE) + { + if (val & 1) + { + digit = val & ((1 << POWI_WINDOW_SIZE) - 1); + result += powi_lookup_cost (digit, cache) + + POWI_WINDOW_SIZE + 1; + val >>= POWI_WINDOW_SIZE; + } + else + { + val >>= 1; + result++; + } + } + + return result + powi_lookup_cost (val, cache); +} + +/* Recursive subroutine of powi_as_mults. This function takes the + array, CACHE, of already calculated exponents and an exponent N and + returns a tree that corresponds to CACHE[1]**N, with type TYPE. */ + +static tree +powi_as_mults_1 (gimple_stmt_iterator *gsi, location_t loc, tree type, + HOST_WIDE_INT n, tree *cache, tree target) +{ + tree op0, op1, ssa_target; + unsigned HOST_WIDE_INT digit; + gimple mult_stmt; + + if (n < POWI_TABLE_SIZE) + { + if (cache[n]) + return cache[n]; + + ssa_target = make_ssa_name (target, NULL); + cache[n] = ssa_target; + + op0 = powi_as_mults_1 (gsi, loc, type, n - powi_table[n], cache, target); + op1 = powi_as_mults_1 (gsi, loc, type, powi_table[n], cache, target); + } + else if (n & 1) + { + digit = n & ((1 << POWI_WINDOW_SIZE) - 1); + op0 = powi_as_mults_1 (gsi, loc, type, n - digit, cache, target); + op1 = powi_as_mults_1 (gsi, loc, type, digit, cache, target); + ssa_target = make_ssa_name (target, NULL); + } + else + { + op0 = powi_as_mults_1 (gsi, loc, type, n >> 1, cache, target); + op1 = op0; + ssa_target = make_ssa_name (target, NULL); + } + + mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target, op0, op1); + gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT); + + return ssa_target; +} + +/* Convert ARG0**N to a tree of multiplications of ARG0 with itself. + This function needs to be kept in sync with powi_cost above. */ + +static tree +powi_as_mults (gimple_stmt_iterator *gsi, location_t loc, + tree arg0, HOST_WIDE_INT n) +{ + tree cache[POWI_TABLE_SIZE], result, type = TREE_TYPE (arg0), target; + gimple div_stmt; + + if (n == 0) + return build_real (type, dconst1); + + memset (cache, 0, sizeof (cache)); + cache[1] = arg0; + + target = create_tmp_var (type, "powmult"); + add_referenced_var (target); + + result = powi_as_mults_1 (gsi, loc, type, (n < 0) ? -n : n, cache, target); + + if (n >= 0) + return result; + + /* If the original exponent was negative, reciprocate the result. */ + target = create_tmp_var (type, "powmult"); + add_referenced_var (target); + target = make_ssa_name (target, NULL); + + div_stmt = gimple_build_assign_with_ops (RDIV_EXPR, target, + build_real (type, dconst1), + result); + gsi_insert_before (gsi, div_stmt, GSI_SAME_STMT); + + return target; +} + +/* ARG0 and N are the two arguments to a powi builtin in GSI with + location info LOC. If the arguments are appropriate, create an + equivalent sequence of statements prior to GSI using an optimal + number of multiplications, and return an expession holding the + result. */ + +static tree +gimple_expand_builtin_powi (gimple_stmt_iterator *gsi, location_t loc, + tree arg0, HOST_WIDE_INT n) +{ + /* Avoid largest negative number. */ + if (n != -n + && ((n >= -1 && n <= 2) + || (optimize_function_for_speed_p (cfun) + && powi_cost (n) <= POWI_MAX_MULTS))) + return powi_as_mults (gsi, loc, arg0, n); + + return NULL_TREE; +} + /* Go through all calls to sin, cos and cexpi and call execute_cse_sincos_1 - on the SSA_NAME argument of each of them. */ + on the SSA_NAME argument of each of them. Also expand powi(x,n) into + an optimal number of multiplies, when n is a constant. */ static unsigned int execute_cse_sincos (void) @@ -821,7 +1056,9 @@ execute_cse_sincos (void) && (fndecl = gimple_call_fndecl (stmt)) && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL) { - tree arg; + tree arg, arg0, arg1, result; + HOST_WIDE_INT n, n_hi; + location_t loc; switch (DECL_FUNCTION_CODE (fndecl)) { @@ -833,6 +1070,29 @@ execute_cse_sincos (void) cfg_changed |= execute_cse_sincos_1 (arg); break; + CASE_FLT_FN (BUILT_IN_POWI): + arg0 = gimple_call_arg (stmt, 0); + arg1 = gimple_call_arg (stmt, 1); + if (!host_integerp (arg1, 0)) + break; + + n = TREE_INT_CST_LOW (arg1); + n_hi = TREE_INT_CST_HIGH (arg1); + if (n_hi != 0 && n_hi != -1) + break; + + loc = gimple_location (stmt); + result = gimple_expand_builtin_powi (&gsi, loc, arg0, n); + + if (result) + { + tree lhs = gimple_get_lhs (stmt); + gimple new_stmt = gimple_build_assign (lhs, result); + gimple_set_location (new_stmt, loc); + gsi_replace (&gsi, new_stmt, true); + } + break; + default:; } } @@ -849,10 +1109,9 @@ execute_cse_sincos (void) static bool gate_cse_sincos (void) { - /* Make sure we have either sincos or cexp. */ - return (TARGET_HAS_SINCOS - || TARGET_C99_FUNCTIONS) - && optimize; + /* We no longer require either sincos or cexp, since powi expansion + piggybacks on this pass. */ + return optimize; } struct gimple_opt_pass pass_cse_sincos =