From patchwork Thu Jul 14 14:23:35 2011
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 104684
Message-ID: <4E1EFBE7.6040606@codesourcery.com>
Date: Thu, 14 Jul 2011 15:23:35 +0100
From: Andrew Stubbs
CC: gcc-patches@gcc.gnu.org, patches@linaro.org
Subject: Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
In-Reply-To: <4E1C5509.6030505@codesourcery.com>

On 12/07/11 15:07, Andrew Stubbs wrote:
> This update does the same thing as before, but updated for the changes
> earlier in the patch series. In particular, the build_and_insert_cast
> function and find_widening_optab_handler_and_mode changes have been
> moved up to patch 2.

This update also changes the way the casts are handled, partly because
the code got unwieldy towards the end of the patch series, and partly
because I found a few bugs.

I've also ensured that it checks the precision of the types, rather than
the mode size, so that it is bitfield safe.

OK?

Andrew

2011-07-14  Andrew Stubbs

	gcc/
	* tree-ssa-math-opts.c (convert_mult_to_widen): Convert
	unsupported unsigned multiplies to signed.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-6.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+  return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2067,12 +2067,13 @@ is_widening_mult_p (gimple stmt,
 static bool
 convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 {
-  tree lhs, rhs1, rhs2, type, type1, type2, tmp;
+  tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL;
   enum insn_code handler;
   enum machine_mode to_mode, from_mode, actual_mode;
   optab op;
   int actual_precision;
   location_t loc = gimple_location (stmt);
+  bool from_unsigned1, from_unsigned2;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2084,10 +2085,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 
   to_mode = TYPE_MODE (type);
   from_mode = TYPE_MODE (type1);
+  from_unsigned1 = TYPE_UNSIGNED (type1);
+  from_unsigned2 = TYPE_UNSIGNED (type2);
 
-  if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
+  if (from_unsigned1 && from_unsigned2)
     op = umul_widen_optab;
-  else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
+  else if (!from_unsigned1 && !from_unsigned2)
     op = smul_widen_optab;
   else
     op = usmul_widen_optab;
@@ -2096,22 +2099,45 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
                                                   0, &actual_mode);
 
   if (handler == CODE_FOR_nothing)
-    return false;
+    {
+      if (op != smul_widen_optab)
+        {
+          from_mode = GET_MODE_WIDER_MODE (from_mode);
+          if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+            return false;
+
+          op = smul_widen_optab;
+          handler = find_widening_optab_handler_and_mode (op, to_mode,
+                                                          from_mode, 0,
+                                                          &actual_mode);
+
+          if (handler == CODE_FOR_nothing)
+            return false;
+
+          from_unsigned1 = from_unsigned2 = false;
+        }
+      else
+        return false;
+    }
 
   /* Ensure that the inputs to the handler are in the correct precison for
      the opcode.  This will be the full mode size.  */
   actual_precision = GET_MODE_PRECISION (actual_mode);
-  if (actual_precision != TYPE_PRECISION (type1))
+  if (actual_precision != TYPE_PRECISION (type1)
+      || from_unsigned1 != TYPE_UNSIGNED (type1))
     {
       tmp = create_tmp_var (build_nonstandard_integer_type
-                                (actual_precision, TYPE_UNSIGNED (type1)),
+                                (actual_precision, from_unsigned1),
                             NULL);
       rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
-
+    }
+  if (actual_precision != TYPE_PRECISION (type2)
+      || from_unsigned2 != TYPE_UNSIGNED (type2))
+    {
       /* Reuse the same type info, if possible.  */
-      if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+      if (!tmp || from_unsigned1 != from_unsigned2)
         tmp = create_tmp_var (build_nonstandard_integer_type
-                                (actual_precision, TYPE_UNSIGNED (type2)),
+                                (actual_precision, from_unsigned2),
                               NULL);
       rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
     }
@@ -2136,7 +2162,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 {
   gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
   gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
-  tree type, type1, type2, tmp;
+  tree type, type1, type2, optype, tmp = NULL;
   tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
   enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
   optab this_optab;
@@ -2145,6 +2171,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   enum machine_mode to_mode, from_mode, actual_mode;
   location_t loc = gimple_location (stmt);
   int actual_precision;
+  bool from_unsigned1, from_unsigned2;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2238,9 +2265,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 
   to_mode = TYPE_MODE (type);
   from_mode = TYPE_MODE (type1);
+  from_unsigned1 = TYPE_UNSIGNED (type1);
+  from_unsigned2 = TYPE_UNSIGNED (type2);
 
-  if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
-    return false;
+  /* There's no such thing as a mixed sign madd yet, so use a wider mode.  */
+  if (from_unsigned1 != from_unsigned2)
+    {
+      enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
+      if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+        {
+          from_mode = mode;
+          from_unsigned1 = from_unsigned2 = false;
+        }
+      else
+        return false;
+    }
 
   /* If there was a conversion between the multiply and addition
      then we need to make sure it fits a multiply-and-accumulate.
@@ -2248,6 +2287,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
      value.  */
   if (conv_stmt)
     {
+      /* We use the original, unmodified data types for this.  */
       tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
       tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
       int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
@@ -2272,7 +2312,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   /* Verify that the machine can perform a widening multiply
      accumulate in this mode/signedness combination, otherwise
      this transformation is likely to pessimize code.  */
-  this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
+  optype = build_nonstandard_integer_type (from_mode, from_unsigned1);
+  this_optab = optab_for_tree_code (wmult_code, optype, optab_default);
   handler = find_widening_optab_handler_and_mode (this_optab, to_mode,
                                                   from_mode, 0, &actual_mode);
 
@@ -2282,13 +2323,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   /* Ensure that the inputs to the handler are in the correct precison for
      the opcode.  This will be the full mode size.  */
   actual_precision = GET_MODE_PRECISION (actual_mode);
-  if (actual_precision != TYPE_PRECISION (type1))
+  if (actual_precision != TYPE_PRECISION (type1)
+      || from_unsigned1 != TYPE_UNSIGNED (type1))
     {
       tmp = create_tmp_var (build_nonstandard_integer_type
-                                (actual_precision, TYPE_UNSIGNED (type1)),
+                                (actual_precision, from_unsigned1),
                             NULL);
-
       mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
+    }
+  if (actual_precision != TYPE_PRECISION (type2)
+      || from_unsigned2 != TYPE_UNSIGNED (type2))
+    {
+      if (!tmp || from_unsigned1 != from_unsigned2)
+        tmp = create_tmp_var (build_nonstandard_integer_type
+                                (actual_precision, from_unsigned2),
+                              NULL);
       mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
     }
 
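
For readers who want the arithmetic behind the transformation, here is a
minimal sketch in plain C. It is not part of the patch. The function names
smul_widen_16, usmul_via_signed, count_mismatches and mac_mixed are
hypothetical and exist only for this illustration; smul_widen_16 stands in
for whatever signed widening multiply the target provides. The sketch
assumes 8-bit inputs, a 16-bit intermediate mode and a 32-bit result, which
mirrors the requirement that the wider input mode still be narrower than
the result mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a signed 16x16->32 widening multiply
   instruction; not a GCC or target API.  */
int32_t
smul_widen_16 (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Mixed-sign 8-bit multiply expressed through the wider signed
   multiply.  The unsigned operand is zero-extended and the signed
   operand is sign-extended; both values fit in int16_t exactly, so
   treating them as signed 16-bit inputs loses nothing.  */
int32_t
usmul_via_signed (uint8_t u, int8_t s)
{
  return smul_widen_16 ((int16_t) u, (int16_t) s);
}

/* Exhaustively compare the rewritten form against a reference
   computed in int.  Returns the number of disagreeing pairs.  */
int
count_mismatches (void)
{
  int bad = 0;
  for (int u = 0; u <= UINT8_MAX; u++)
    for (int s = INT8_MIN; s <= INT8_MAX; s++)
      if (usmul_via_signed ((uint8_t) u, (int8_t) s) != (int32_t) (u * s))
        bad++;
  return bad;
}

/* Same shape as foo in wmul-6.c, with the product routed through the
   signed widening multiply before the accumulate.  */
long long
mac_mixed (long long a, const unsigned char *b, const signed char *c)
{
  return a + usmul_via_signed (*b, *c);
}
```

The same reasoning covers the unsigned-by-unsigned case: a zero-extended
8-bit value is a non-negative 16-bit signed value, so the signed multiply
gives the same product. The trick stops working once the widened inputs are
as wide as the result, which is why the patch gives up when no suitable
wider mode exists.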