From patchwork Fri Jan 15 21:36:49 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 568490
To: gcc-patches@gcc.gnu.org, Marcus Shawcroft, Richard Earnshaw
From: Richard Henderson
Subject: [aarch64] Fix target/69176
Message-ID: <56996671.3040402@twiddle.net>
Date: Fri, 15 Jan 2016 13:36:49 -0800

See the PR for details, but basically, the plus operations are special so you
can't just split out one of the alternatives to a different pattern.

This merges the two-instruction add case back into the main plus pattern, and
then adds peepholes and splitters to generate the same code as before.

Ok?


r~

	* config/aarch64/aarch64.md (add<mode>3): Move long immediate
	operands to pseudo only if CSE is expected.  Split long immediate
	operands only after reload, and for the stack pointer.
	(*add<mode>3_pluslong): Remove.
	(*addsi3_aarch64, *adddi3_aarch64): Merge into...
	(*add<mode>3_aarch64): ... here.  Add r/rk/Upl alternative.
	(*addsi3_aarch64_uxtw): Add r/rk/Upl alternative.
	(*add<mode>3 peepholes): New.
	(*add<mode>3 splitters): New.
	* config/aarch64/constraints.md (Upl): New.
	* config/aarch64/predicates.md (aarch64_pluslong_strict_immedate): New.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f6c8eb1..bde231b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1590,96 +1590,120 @@
         (plus:GPI (match_operand:GPI 1 "register_operand" "")
                   (match_operand:GPI 2 "aarch64_pluslong_operand" "")))]
   ""
-  "
-  if (!aarch64_plus_operand (operands[2], VOIDmode))
+{
+  if (aarch64_pluslong_strict_immedate (operands[2], <MODE>mode))
     {
-      if (can_create_pseudo_p ())
-        {
-          rtx tmp = gen_reg_rtx (<MODE>mode);
-          emit_move_insn (tmp, operands[2]);
-          operands[2] = tmp;
-        }
-      else
+      /* Give CSE the opportunity to share this constant across additions.  */
+      if (!cse_not_expected && can_create_pseudo_p ())
+        operands[2] = force_reg (<MODE>mode, operands[2]);
+
+      /* Split will refuse to operate on a modification to the stack pointer.
+         Aid the prologue and epilogue expanders by splitting this now.  */
+      else if (reload_completed && operands[0] == stack_pointer_rtx)
         {
-          HOST_WIDE_INT imm = INTVAL (operands[2]);
-          imm = imm >= 0 ? imm & 0xfff : -(-imm & 0xfff);
-          emit_insn (gen_add<mode>3 (operands[0], operands[1],
-                                      GEN_INT (INTVAL (operands[2]) - imm)));
+          HOST_WIDE_INT i = INTVAL (operands[2]);
+          HOST_WIDE_INT s = (i >= 0 ? i & 0xfff : -(-i & 0xfff));
+          emit_insn (gen_rtx_SET (operands[0],
+                                  gen_rtx_PLUS (<MODE>mode, operands[1],
+                                                GEN_INT (i - s))));
           operands[1] = operands[0];
-          operands[2] = GEN_INT (imm);
+          operands[2] = GEN_INT (s);
         }
     }
-  "
-)
-
-;; Find add with a 2-instruction immediate and merge into 2 add instructions.
-
-(define_insn_and_split "*add<mode>3_pluslong"
-  [(set
-    (match_operand:GPI 0 "register_operand" "=r")
-    (plus:GPI (match_operand:GPI 1 "register_operand" "r")
-              (match_operand:GPI 2 "aarch64_pluslong_immediate" "i")))]
-  "!aarch64_plus_operand (operands[2], VOIDmode)
-   && !aarch64_move_imm (INTVAL (operands[2]), <MODE>mode)"
-  "#"
-  "&& true"
-  [(set (match_dup 0) (plus:GPI (match_dup 1) (match_dup 3)))
-   (set (match_dup 0) (plus:GPI (match_dup 0) (match_dup 4)))]
-  "
-  {
-    HOST_WIDE_INT imm = INTVAL (operands[2]);
-    imm = imm >= 0 ? imm & 0xfff : -(-imm & 0xfff);
-    operands[3] = GEN_INT (INTVAL (operands[2]) - imm);
-    operands[4] = GEN_INT (imm);
-  }
-  "
-)
+})
 
-(define_insn "*addsi3_aarch64"
+(define_insn "*add<mode>3_aarch64"
   [(set
-    (match_operand:SI 0 "register_operand" "=rk,rk,w,rk")
-    (plus:SI
-     (match_operand:SI 1 "register_operand" "%rk,rk,w,rk")
-     (match_operand:SI 2 "aarch64_plus_operand" "I,r,w,J")))]
+    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r")
+    (plus:GPI
+     (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk")
+     (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Upl")))]
   ""
   "@
-  add\\t%w0, %w1, %2
-  add\\t%w0, %w1, %w2
-  add\\t%0.2s, %1.2s, %2.2s
-  sub\\t%w0, %w1, #%n2"
-  [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm")
-   (set_attr "simd" "*,*,yes,*")]
+  add\\t%<w>0, %<w>1, %2
+  add\\t%<w>0, %<w>1, %<w>2
+  add\\t%<rtn>0<vas>, %<rtn>1<vas>, %<rtn>2<vas>
+  sub\\t%<w>0, %<w>1, #%n2
+  #"
+  [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm,multiple")
+   (set_attr "simd" "*,*,yes,*,*")]
 )
 
 ;; zero_extend version of above
 (define_insn "*addsi3_aarch64_uxtw"
   [(set
-    (match_operand:DI 0 "register_operand" "=rk,rk,rk")
+    (match_operand:DI 0 "register_operand" "=rk,rk,rk,r")
     (zero_extend:DI
-     (plus:SI (match_operand:SI 1 "register_operand" "%rk,rk,rk")
-              (match_operand:SI 2 "aarch64_plus_operand" "I,r,J"))))]
+     (plus:SI (match_operand:SI 1 "register_operand" "%rk,rk,rk,rk")
+              (match_operand:SI 2 "aarch64_pluslong_operand" "I,r,J,Upl"))))]
   ""
   "@
   add\\t%w0, %w1, %2
   add\\t%w0, %w1, %w2
-  sub\\t%w0, %w1, #%n2"
-  [(set_attr "type" "alu_imm,alu_sreg,alu_imm")]
+  sub\\t%w0, %w1, #%n2
+  #"
+  [(set_attr "type" "alu_imm,alu_sreg,alu_imm,multiple")]
 )
 
-(define_insn "*adddi3_aarch64"
-  [(set
-    (match_operand:DI 0 "register_operand" "=rk,rk,rk,w")
-    (plus:DI
-     (match_operand:DI 1 "register_operand" "%rk,rk,rk,w")
-     (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))]
-  ""
-  "@
-  add\\t%x0, %x1, %2
-  add\\t%x0, %x1, %x2
-  sub\\t%x0, %x1, #%n2
-  add\\t%d0, %d1, %d2"
-  [(set_attr "type" "alu_imm,alu_sreg,alu_imm,neon_add")
-   (set_attr "simd" "*,*,*,yes")]
+;; If there's a free register, and we can load the constant with a
+;; single instruction, do so.  This has a chance to improve scheduling.
+(define_peephole2
+  [(match_scratch:GPI 3 "r")
+   (set (match_operand:GPI 0 "register_operand")
+        (plus:GPI
+          (match_operand:GPI 1 "register_operand")
+          (match_operand:GPI 2 "aarch64_pluslong_strict_immedate")))]
+  "aarch64_move_imm (INTVAL (operands[2]), <MODE>mode)"
+  [(set (match_dup 3) (match_dup 2))
+   (set (match_dup 0) (plus:GPI (match_dup 1) (match_dup 3)))]
+)
+
+(define_peephole2
+  [(match_scratch:SI 3 "r")
+   (set (match_operand:DI 0 "register_operand")
+        (zero_extend:DI
+          (plus:SI
+            (match_operand:SI 1 "register_operand")
+            (match_operand:SI 2 "aarch64_pluslong_strict_immedate"))))]
+  "aarch64_move_imm (INTVAL (operands[2]), SImode)"
+  [(set (match_dup 3) (match_dup 2))
+   (set (match_dup 0) (zero_extend:DI (plus:SI (match_dup 1) (match_dup 3))))]
+)
+
+;; After peephole2 has had a chance to run, split any remaining long
+;; additions into two add immediates.
+(define_split
+  [(set (match_operand:GPI 0 "register_operand")
+        (plus:GPI
+          (match_operand:GPI 1 "register_operand")
+          (match_operand:GPI 2 "aarch64_pluslong_strict_immedate")))]
+  "epilogue_completed"
+  [(set (match_dup 0) (plus:GPI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (plus:GPI (match_dup 0) (match_dup 4)))]
+  {
+    HOST_WIDE_INT i = INTVAL (operands[2]);
+    HOST_WIDE_INT s = (i >= 0 ? i & 0xfff : -(-i & 0xfff));
+    operands[3] = GEN_INT (i - s);
+    operands[4] = GEN_INT (s);
+  }
+)
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (zero_extend:DI
+          (plus:SI
+            (match_operand:SI 1 "register_operand")
+            (match_operand:SI 2 "aarch64_pluslong_strict_immedate"))))]
+  "epilogue_completed"
+  [(set (match_dup 5) (plus:SI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (zero_extend:DI (plus:SI (match_dup 5) (match_dup 4))))]
+  {
+    HOST_WIDE_INT i = INTVAL (operands[2]);
+    HOST_WIDE_INT s = (i >= 0 ? i & 0xfff : -(-i & 0xfff));
+    operands[3] = GEN_INT (i - s);
+    operands[4] = GEN_INT (s);
+    operands[5] = gen_lowpart (SImode, operands[0]);
+  }
 )
 
 (define_expand "addti3"
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 9b77291..0208b25 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -35,6 +35,11 @@
  (and (match_code "const_int")
       (match_test "aarch64_uimm12_shift (ival)")))
 
+(define_constraint "Upl"
+  "A constraint that matches two uses of add instructions."
+  (and (match_code "const_int")
+       (match_test "aarch64_pluslong_strict_immedate (op, VOIDmode)")))
+
 (define_constraint "J"
  "A constant that can be used with a SUB operation (once negated)."
  (and (match_code "const_int")
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index a2eb69c..f3b514b 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -107,6 +107,10 @@
   (and (match_code "const_int")
        (match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))
 
+(define_predicate "aarch64_pluslong_strict_immedate"
+  (and (match_operand 0 "aarch64_pluslong_immediate")
+       (not (match_operand 0 "aarch64_plus_immediate"))))
+
 (define_predicate "aarch64_pluslong_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "aarch64_pluslong_immediate")))
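
The immediate split used by the expander and by the new splitters (s takes the low 12 bits of the magnitude with the sign of i, and i - s is the remainder) can be checked in isolation. The following is a minimal standalone sketch and is not part of the patch. The helper names low_part, fits_add_imm and split_addend are hypothetical, and fits_add_imm assumes the usual AArch64 rule that an ADD/SUB immediate is an unsigned 12-bit value, optionally shifted left by 12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the patch's expression for "s":
   the low 12 bits of |i|, carrying the sign of i.  */
static int64_t low_part(int64_t i)
{
    return i >= 0 ? (i & 0xfff) : -(-i & 0xfff);
}

/* Assumption: a value is usable by a single ADD or SUB when its magnitude
   is a 12-bit unsigned number, optionally shifted left by 12 bits.  */
static bool fits_add_imm(int64_t v)
{
    uint64_t m = v < 0 ? (uint64_t)(-v) : (uint64_t)v;
    return (m & ~(uint64_t)0xfff) == 0
        || (m & ~((uint64_t)0xfff << 12)) == 0;
}

/* Split i into hi + lo as the splitters do: first add hi, then add lo.
   Only meaningful for -0xffffff < i < 0xffffff, the range accepted by
   aarch64_pluslong_immediate.  */
static void split_addend(int64_t i, int64_t *hi, int64_t *lo)
{
    *lo = low_part(i);
    *hi = i - *lo;
    assert(*hi + *lo == i);                           /* nothing is lost */
    assert(fits_add_imm(*hi) && fits_add_imm(*lo));   /* each half is one insn */
}
```

For example, 0x123456 splits into 0x123000 and 0x456, and -0x123456 into -0x123000 and -0x456, so both halves keep the sign of the original constant and each can be emitted as one add or sub immediate.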