From patchwork Mon Sep 1 19:49:48 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Segher Boessenkool
X-Patchwork-Id: 384913
From: Segher Boessenkool
To: gcc-patches@gcc.gnu.org
Cc: dje.gcc@gmail.com, Segher Boessenkool
Subject: [PATCH 2/4] rs6000: Merge and improve highpart and widening muls
Date: Mon, 1 Sep 2014 12:49:48 -0700
Message-Id: <4787074296dde2e3c21416541f78ee475c588df6.1409581755.git.segher@kernel.crashing.org>

This is a little more complex.  The highpart muls generate a
"truncate lshiftrt" pattern that is not canonical when widening to two
registers, so this doesn't optimise well with combine.  This patch changes
it to use the canonical subreg patterns instead, which means we need
separate patterns for LE mode.  Oh well.

Tested as usual.  This regresses gcc.dg/sms-8.c with -m32: SMS now _does_
succeed; from my shallow investigation, that is because there are now
subregs and SMS explicitly looks for those.  I didn't look further because
other SMS tests are failing (without the patch) as well.

Is this okay to apply?


Segher


2014-09-01  Segher Boessenkool

gcc/
	* config/rs6000/rs6000.md (any_extend): New code iterator.
	(u, su): New code attributes.
	(dmode, DMODE): New mode attributes.
	(<su>mul<mode>3_highpart): New.
	(*<su>mul<mode>3_highpart): New.
	(<su>mulsi3_highpart_le): New.
	(<su>muldi3_highpart_le): New.
	(<su>mulsi3_highpart_64): New.
	(<u>mul<mode><dmode>3): New.
	(mulsidi3, umulsidi3, smulsi3_highpart, umulsi3_highpart, and two
	splitters): Delete.
	(mulditi3, umulditi3, smuldi3_highpart, umuldi3_highpart, and two
	splitters): Delete.
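For readers unfamiliar with the two RTL forms mentioned above, here is a
rough C analogy, not GCC code.  It computes the high word of a widening
unsigned multiply in two ways: by shifting and truncating, and by picking
the word at a byte offset inside the double-width product.  All function
names are made up for illustration, and the memcpy is only a stand-in for
how a subreg byte offset selects a word.  It assumes a host whose byte and
word order agree.

```c
#include <stdint.h>
#include <string.h>

/* Illustration only; none of these helpers exist in GCC.  */

/* Returns 1 on a big-endian host, 0 on a little-endian one.  */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Shift-and-truncate form, analogous to
   (truncate:SI (lshiftrt:DI (mult:DI ...) (const_int 32))).  */
static uint32_t umul_high_by_shift(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;
    return (uint32_t)(product >> 32);
}

/* Offset form, loosely analogous to (subreg:SI (mult:DI ...) N):
   read the 32-bit word at byte offset N of the 64-bit product.
   The high word sits at offset 0 on big-endian and at offset 4 on
   little-endian, which is why the two cases need different offsets.  */
static uint32_t umul_high_by_offset(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;
    size_t offset = host_is_big_endian() ? 0 : 4;
    uint32_t high;
    memcpy(&high, (const unsigned char *)&product + offset, sizeof high);
    return high;
}
```

Both functions should return the same value for any pair of inputs; only
the offset used by the second one depends on endianness, mirroring the
0 versus 4 (SImode) and 0 versus 8 (DImode) subreg offsets in the patterns.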
---
 gcc/config/rs6000/rs6000.md | 247 ++++++++++++++++++--------------------------
 1 file changed, 103 insertions(+), 144 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d903e4a..f9e1eba 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -431,6 +431,11 @@ (define_code_attr return_pred [(return "direct_return ()")
                                (simple_return "1")])
 (define_code_attr return_str [(return "") (simple_return "simple_")])
 
+; Signed/unsigned variants of ops.
+(define_code_iterator any_extend [sign_extend zero_extend])
+(define_code_attr u [(sign_extend "") (zero_extend "u")])
+(define_code_attr su [(sign_extend "s") (zero_extend "u")])
+
 ; Various instructions that come in SI and DI forms.
 ; A generic w/d attribute, for things like cmpw/cmpd.
 (define_mode_attr wd [(QI "b")
@@ -454,6 +459,10 @@ (define_mode_attr sel [(SI "") (DI "64")])
 ;; Bitmask for shift instructions
 (define_mode_attr hH [(SI "h") (DI "H")])
 
+;; A mode twice the size of the given mode
+(define_mode_attr dmode [(SI "di") (DI "ti")])
+(define_mode_attr DMODE [(SI "DI") (DI "TI")])
+
 ;; Suffix for reload patterns
 (define_mode_attr ptrsize [(SI "32bit")
                            (DI "64bit")])
@@ -2767,6 +2776,100 @@ (define_insn_and_split "*mul<mode>3_dot2"
    (set_attr "length" "4,8")])
 
 
+(define_expand "<su>mul<mode>3_highpart"
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+        (subreg:GPR
+          (mult:<DMODE> (any_extend:<DMODE>
+                          (match_operand:GPR 1 "gpc_reg_operand"))
+                        (any_extend:<DMODE>
+                          (match_operand:GPR 2 "gpc_reg_operand")))
+         0))]
+  ""
+{
+  if (<MODE>mode == SImode && TARGET_POWERPC64)
+    {
+      emit_insn (gen_<su>mulsi3_highpart_64 (operands[0], operands[1],
+                                             operands[2]));
+      DONE;
+    }
+
+  if (!WORDS_BIG_ENDIAN)
+    {
+      emit_insn (gen_<su>mul<mode>3_highpart_le (operands[0], operands[1],
+                                                 operands[2]));
+      DONE;
+    }
+})
+
+(define_insn "*<su>mul<mode>3_highpart"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (subreg:GPR
+          (mult:<DMODE> (any_extend:<DMODE>
+                          (match_operand:GPR 1 "gpc_reg_operand" "r"))
+                        (any_extend:<DMODE>
+                          (match_operand:GPR 2 "gpc_reg_operand" "r")))
+         0))]
+  "WORDS_BIG_ENDIAN && !(<MODE>mode == SImode && TARGET_POWERPC64)"
+  "mulh<wd><u> %0,%1,%2"
+  [(set_attr "type" "mul")
+   (set_attr "size" "<bits>")])
+
+(define_insn "<su>mulsi3_highpart_le"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (subreg:SI
+          (mult:DI (any_extend:DI
+                     (match_operand:SI 1 "gpc_reg_operand" "r"))
+                   (any_extend:DI
+                     (match_operand:SI 2 "gpc_reg_operand" "r")))
+         4))]
+  "!WORDS_BIG_ENDIAN && !TARGET_POWERPC64"
+  "mulhw<u> %0,%1,%2"
+  [(set_attr "type" "mul")])
+
+(define_insn "<su>muldi3_highpart_le"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (subreg:DI
+          (mult:TI (any_extend:TI
+                     (match_operand:DI 1 "gpc_reg_operand" "r"))
+                   (any_extend:TI
+                     (match_operand:DI 2 "gpc_reg_operand" "r")))
+         8))]
+  "!WORDS_BIG_ENDIAN && TARGET_POWERPC64"
+  "mulhd<u> %0,%1,%2"
+  [(set_attr "type" "mul")
+   (set_attr "size" "64")])
+
+(define_insn "<su>mulsi3_highpart_64"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (truncate:SI
+          (lshiftrt:DI
+            (mult:DI (any_extend:DI
+                       (match_operand:SI 1 "gpc_reg_operand" "r"))
+                     (any_extend:DI
+                       (match_operand:SI 2 "gpc_reg_operand" "r")))
+            (const_int 32))))]
+  "TARGET_POWERPC64"
+  "mulhw<u> %0,%1,%2"
+  [(set_attr "type" "mul")])
+
+(define_expand "<u>mul<mode><dmode>3"
+  [(set (match_operand:<DMODE> 0 "gpc_reg_operand")
+        (mult:<DMODE> (any_extend:<DMODE>
+                        (match_operand:GPR 1 "gpc_reg_operand"))
+                      (any_extend:<DMODE>
+                        (match_operand:GPR 2 "gpc_reg_operand"))))]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx l = gen_reg_rtx (<MODE>mode);
+  rtx h = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_mul<mode>3 (l, operands[1], operands[2]));
+  emit_insn (gen_<su>mul<mode>3_highpart (h, operands[1], operands[2]));
+  emit_move_insn (gen_lowpart (<MODE>mode, operands[0]), l);
+  emit_move_insn (gen_highpart (<MODE>mode, operands[0]), h);
+  DONE;
+})
+
+
 (define_insn "udiv<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (udiv:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
@@ -6622,96 +6725,6 @@ (define_insn "*negdi2_noppc64"
   [(set_attr "type" "two")
    (set_attr "length" "8")])
 
-(define_insn "mulsidi3"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
-        (mult:DI (sign_extend:DI (match_operand:SI 1 "gpc_reg_operand" "%r"))
-                 (sign_extend:DI (match_operand:SI 2 "gpc_reg_operand" "r"))))]
-  "! TARGET_POWERPC64"
-{
-  return (WORDS_BIG_ENDIAN)
-    ? \"mulhw %0,%1,%2\;mullw %L0,%1,%2\"
-    : \"mulhw %L0,%1,%2\;mullw %0,%1,%2\";
-}
-  [(set_attr "type" "mul")
-   (set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
-        (mult:DI (sign_extend:DI (match_operand:SI 1 "gpc_reg_operand" ""))
-                 (sign_extend:DI (match_operand:SI 2 "gpc_reg_operand" ""))))]
-  "! TARGET_POWERPC64 && reload_completed"
-  [(set (match_dup 3)
-        (truncate:SI
-         (lshiftrt:DI (mult:DI (sign_extend:DI (match_dup 1))
-                               (sign_extend:DI (match_dup 2)))
-                      (const_int 32))))
-   (set (match_dup 4)
-        (mult:SI (match_dup 1)
-                 (match_dup 2)))]
-  "
-{
-  int endian = (WORDS_BIG_ENDIAN == 0);
-  operands[3] = operand_subword (operands[0], endian, 0, DImode);
-  operands[4] = operand_subword (operands[0], 1 - endian, 0, DImode);
-}")
-
-(define_insn "umulsidi3"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
-        (mult:DI (zero_extend:DI (match_operand:SI 1 "gpc_reg_operand" "%r"))
-                 (zero_extend:DI (match_operand:SI 2 "gpc_reg_operand" "r"))))]
-  "! TARGET_POWERPC64"
-  "*
-{
-  return (WORDS_BIG_ENDIAN)
-    ? \"mulhwu %0,%1,%2\;mullw %L0,%1,%2\"
-    : \"mulhwu %L0,%1,%2\;mullw %0,%1,%2\";
-}"
-  [(set_attr "type" "mul")
-   (set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
-        (mult:DI (zero_extend:DI (match_operand:SI 1 "gpc_reg_operand" ""))
-                 (zero_extend:DI (match_operand:SI 2 "gpc_reg_operand" ""))))]
-  "! TARGET_POWERPC64 && reload_completed"
-  [(set (match_dup 3)
-        (truncate:SI
-         (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 1))
-                               (zero_extend:DI (match_dup 2)))
-                      (const_int 32))))
-   (set (match_dup 4)
-        (mult:SI (match_dup 1)
-                 (match_dup 2)))]
-  "
-{
-  int endian = (WORDS_BIG_ENDIAN == 0);
-  operands[3] = operand_subword (operands[0], endian, 0, DImode);
-  operands[4] = operand_subword (operands[0], 1 - endian, 0, DImode);
-}")
-
-(define_insn "smulsi3_highpart"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
-        (truncate:SI
-         (lshiftrt:DI (mult:DI (sign_extend:DI
-                                (match_operand:SI 1 "gpc_reg_operand" "%r"))
-                               (sign_extend:DI
-                                (match_operand:SI 2 "gpc_reg_operand" "r")))
-                      (const_int 32))))]
-  ""
-  "mulhw %0,%1,%2"
-  [(set_attr "type" "mul")])
-
-(define_insn "umulsi3_highpart"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
-        (truncate:SI
-         (lshiftrt:DI (mult:DI (zero_extend:DI
-                                (match_operand:SI 1 "gpc_reg_operand" "%r"))
-                               (zero_extend:DI
-                                (match_operand:SI 2 "gpc_reg_operand" "r")))
-                      (const_int 32))))]
-  ""
-  "mulhwu %0,%1,%2"
-  [(set_attr "type" "mul")])
 
 ;; Shift by a variable amount is too complex to be worth open-coding.  We
 ;; just handle shifts by constants.
@@ -6758,60 +6771,6 @@ (define_insn "*ashrdisi3_noppc64be"
 
 ;; PowerPC64 DImode operations.
 
-(define_insn "smuldi3_highpart"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-        (truncate:DI
-         (lshiftrt:TI (mult:TI (sign_extend:TI
-                                (match_operand:DI 1 "gpc_reg_operand" "%r"))
-                               (sign_extend:TI
-                                (match_operand:DI 2 "gpc_reg_operand" "r")))
-                      (const_int 64))))]
-  "TARGET_POWERPC64"
-  "mulhd %0,%1,%2"
-  [(set_attr "type" "mul")
-   (set_attr "size" "64")])
-
-(define_insn "umuldi3_highpart"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-        (truncate:DI
-         (lshiftrt:TI (mult:TI (zero_extend:TI
-                                (match_operand:DI 1 "gpc_reg_operand" "%r"))
-                               (zero_extend:TI
-                                (match_operand:DI 2 "gpc_reg_operand" "r")))
-                      (const_int 64))))]
-  "TARGET_POWERPC64"
-  "mulhdu %0,%1,%2"
-  [(set_attr "type" "mul")
-   (set_attr "size" "64")])
-
-(define_expand "mulditi3"
-  [(set (match_operand:TI 0 "gpc_reg_operand")
-        (mult:TI (sign_extend:TI (match_operand:DI 1 "gpc_reg_operand"))
-                 (sign_extend:TI (match_operand:DI 2 "gpc_reg_operand"))))]
-  "TARGET_POWERPC64"
-{
-  rtx l = gen_reg_rtx (DImode), h = gen_reg_rtx (DImode);
-  emit_insn (gen_muldi3 (l, operands[1], operands[2]));
-  emit_insn (gen_smuldi3_highpart (h, operands[1], operands[2]));
-  emit_move_insn (gen_lowpart (DImode, operands[0]), l);
-  emit_move_insn (gen_highpart (DImode, operands[0]), h);
-  DONE;
-})
-
-(define_expand "umulditi3"
-  [(set (match_operand:TI 0 "gpc_reg_operand")
-        (mult:TI (zero_extend:TI (match_operand:DI 1 "gpc_reg_operand"))
-                 (zero_extend:TI (match_operand:DI 2 "gpc_reg_operand"))))]
-  "TARGET_POWERPC64"
-{
-  rtx l = gen_reg_rtx (DImode), h = gen_reg_rtx (DImode);
-  emit_insn (gen_muldi3 (l, operands[1], operands[2]));
-  emit_insn (gen_umuldi3_highpart (h, operands[1], operands[2]));
-  emit_move_insn (gen_lowpart (DImode, operands[0]), l);
-  emit_move_insn (gen_highpart (DImode, operands[0]), h);
-  DONE;
-})
-
 (define_insn "*rotldi3_internal4"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
         (and:DI (rotate:DI (match_operand:DI 1 "gpc_reg_operand" "r")