From patchwork Wed Oct 10 08:00:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Claudiu Zissulescu Ianculescu
X-Patchwork-Id: 981743
From: Claudiu Zissulescu
To: gcc-patches@gcc.gnu.org
Cc: andrew.burgess@embecosm.com, fbedard@synopsys.com, claziss@synopsys.com
Subject: [PATCH 4/6] [ARC] Add peephole rules to combine store/loads into double store/loads
Date: Wed, 10 Oct 2018 11:00:14 +0300
Message-Id: <20181010080016.12317-5-claziss@gmail.com>
In-Reply-To: <20181010080016.12317-1-claziss@gmail.com>
References: <20181010080016.12317-1-claziss@gmail.com>

Simple peephole rules which combine multiple ld/st instructions into
64-bit load/store instructions.  They only apply to architectures
that have the double load/store option enabled.

gcc/
        Claudiu Zissulescu

        * config/arc/arc-protos.h (gen_operands_ldd_std): Add.
        * config/arc/arc.c (operands_ok_ldd_std): New function.
        (mem_ok_for_ldd_std): Likewise.
        (gen_operands_ldd_std): Likewise.
        * config/arc/arc.md: Add peephole2 rules for std/ldd.
---
 gcc/config/arc/arc-protos.h |   1 +
 gcc/config/arc/arc.c        | 163 ++++++++++++++++++++++++++++++++++++
 gcc/config/arc/arc.md       |  67 +++++++++++++++
 3 files changed, 231 insertions(+)

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index 24bea6e1efb..55f8ed4c643 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -46,6 +46,7 @@ extern int arc_return_address_register (unsigned int);
 extern unsigned int arc_compute_function_type (struct function *);
 extern bool arc_is_uncached_mem_p (rtx);
 extern bool arc_lra_p (void);
+extern bool gen_operands_ldd_std (rtx *operands, bool load, bool commute);
 #endif /* RTX_CODE */
 
 extern unsigned int arc_compute_frame_size (int);
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 18dd0de6af7..9bc69e9fbc9 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -10803,6 +10803,169 @@ arc_cannot_substitute_mem_equiv_p (rtx)
   return true;
 }
 
+/* Checks whether the operands are valid for use in an LDD/STD
+   instruction.  Assumes that RT, RT2, and RN are REG.  This is
+   guaranteed by the patterns.  Assumes that the address in the base
+   register RN is word aligned.  Pattern guarantees that both memory
+   accesses use the same base register, the offsets are constants
+   within the range, and the gap between the offsets is 4.  If reload
+   is complete, then check that the registers are legal.
+   OFFSET is the lower of the two constant offsets.  */
+
+static bool
+operands_ok_ldd_std (rtx rt, rtx rt2, rtx rn ATTRIBUTE_UNUSED,
+                     HOST_WIDE_INT offset)
+{
+  unsigned int t, t2;
+
+  if (!reload_completed)
+    return true;
+
+  if (!(SMALL_INT_RANGE (offset, (GET_MODE_SIZE (DImode) - 1) & -4,
+                         (offset & (GET_MODE_SIZE (DImode) - 1) & 3
+                          ? 0 : -(-GET_MODE_SIZE (DImode) | -4) >> 1))))
+    return false;
+
+  t = REGNO (rt);
+  t2 = REGNO (rt2);
+
+  if ((t2 == 63)
+      || (t % 2 != 0)   /* First destination register is not even.  */
+      || (t2 != t + 1))
+    return false;
+
+  return true;
+}
+
+/* Helper for gen_operands_ldd_std.  Returns true iff the memory
+   operand MEM's address contains an immediate offset from the base
+   register and has no side effects, in which case it sets BASE and
+   OFFSET accordingly.  */
+
+static bool
+mem_ok_for_ldd_std (rtx mem, rtx *base, rtx *offset)
+{
+  rtx addr;
+
+  gcc_assert (base != NULL && offset != NULL);
+
+  /* TODO: Handle more general memory operand patterns, such as
+     PRE_DEC and PRE_INC.  */
+
+  if (side_effects_p (mem))
+    return false;
+
+  /* Can't deal with subregs.  */
+  if (GET_CODE (mem) == SUBREG)
+    return false;
+
+  gcc_assert (MEM_P (mem));
+
+  *offset = const0_rtx;
+
+  addr = XEXP (mem, 0);
+
+  /* If addr isn't valid for DImode, then we can't handle it.  */
+  if (!arc_legitimate_address_p (DImode, addr,
+                                 reload_in_progress || reload_completed))
+    return false;
+
+  if (REG_P (addr))
+    {
+      *base = addr;
+      return true;
+    }
+  else if (GET_CODE (addr) == PLUS || GET_CODE (addr) == MINUS)
+    {
+      *base = XEXP (addr, 0);
+      *offset = XEXP (addr, 1);
+      return (REG_P (*base) && CONST_INT_P (*offset));
+    }
+
+  return false;
+}
+
+/* Called from peephole2 to replace two word-size accesses with a
+   single LDD/STD instruction.  Returns true iff we can generate a new
+   instruction sequence.  That is, both accesses use the same base
+   register and the gap between constant offsets is 4.
+   OPERANDS are the operands found by the peephole matcher; OPERANDS[0,1]
+   are registers and OPERANDS[2,3] the corresponding memory operands.  LOAD
+   is true for a load; COMMUTE allows swapping the two loaded registers.  */
+
+bool
+gen_operands_ldd_std (rtx *operands, bool load, bool commute)
+{
+  int i, gap;
+  HOST_WIDE_INT offsets[2], offset;
+  int nops = 2;
+  rtx cur_base, cur_offset, tmp;
+  rtx base = NULL_RTX;
+
+  /* Check that the memory references are immediate offsets from the
+     same base register.  Extract the base register, the destination
+     registers, and the corresponding memory offsets.  */
+  for (i = 0; i < nops; i++)
+    {
+      if (!mem_ok_for_ldd_std (operands[nops + i], &cur_base, &cur_offset))
+        return false;
+
+      if (i == 0)
+        base = cur_base;
+      else if (REGNO (base) != REGNO (cur_base))
+        return false;
+
+      offsets[i] = INTVAL (cur_offset);
+      if (GET_CODE (operands[i]) == SUBREG)
+        {
+          tmp = SUBREG_REG (operands[i]);
+          gcc_assert (GET_MODE (operands[i]) == GET_MODE (tmp));
+          operands[i] = tmp;
+        }
+    }
+
+  /* Make sure there is no dependency between the individual loads.  */
+  if (load && REGNO (operands[0]) == REGNO (base))
+    return false; /* RAW */
+
+  if (load && REGNO (operands[0]) == REGNO (operands[1]))
+    return false; /* WAW */
+
+  /* Make sure the instructions are ordered with lower memory access first.  */
+  if (offsets[0] > offsets[1])
+    {
+      gap = offsets[0] - offsets[1];
+      offset = offsets[1];
+
+      /* Swap the instructions such that lower memory is accessed first.  */
+      std::swap (operands[0], operands[1]);
+      std::swap (operands[2], operands[3]);
+    }
+  else
+    {
+      gap = offsets[1] - offsets[0];
+      offset = offsets[0];
+    }
+
+  /* Make sure accesses are to consecutive memory locations.  */
+  if (gap != 4)
+    return false;
+
+  /* Make sure we generate legal instructions.  */
+  if (operands_ok_ldd_std (operands[0], operands[1], base, offset))
+    return true;
+
+  if (load && commute)
+    {
+      /* Try reordering registers.  */
+      std::swap (operands[0], operands[1]);
+      if (operands_ok_ldd_std (operands[0], operands[1], base, offset))
+        return true;
+    }
+
+  return false;
+}
+
 #undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
 #define TARGET_USE_ANCHORS_FOR_SYMBOL_P arc_use_anchors_for_symbol_p
 
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 1ed230fa5f0..b968022e64a 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -6363,6 +6363,73 @@ archs4x, archs4xd, archs4xd_slow"
   [(set (reg:CC CC_REG) (compare:CC (match_dup 3)
                                     (ashift:SI (match_dup 1) (match_dup 2))))])
 
+(define_peephole2 ; std
+  [(set (match_operand:SI 2 "memory_operand" "")
+        (match_operand:SI 0 "register_operand" ""))
+   (set (match_operand:SI 3 "memory_operand" "")
+        (match_operand:SI 1 "register_operand" ""))]
+  "TARGET_LL64"
+  [(const_int 0)]
+{
+  if (!gen_operands_ldd_std (operands, false, false))
+    FAIL;
+  operands[0] = gen_rtx_REG (DImode, REGNO (operands[0]));
+  operands[2] = adjust_address (operands[2], DImode, 0);
+  emit_insn (gen_rtx_SET (operands[2], operands[0]));
+  DONE;
+})
+
+(define_peephole2 ; ldd
+  [(set (match_operand:SI 0 "register_operand" "")
+        (match_operand:SI 2 "memory_operand" ""))
+   (set (match_operand:SI 1 "register_operand" "")
+        (match_operand:SI 3 "memory_operand" ""))]
+  "TARGET_LL64"
+  [(const_int 0)]
+{
+  if (!gen_operands_ldd_std (operands, true, false))
+    FAIL;
+  operands[0] = gen_rtx_REG (DImode, REGNO (operands[0]));
+  operands[2] = adjust_address (operands[2], DImode, 0);
+  emit_insn (gen_rtx_SET (operands[0], operands[2]));
+  DONE;
+})
+
+;; We require consecutive registers for LDD instruction.  Check if we
+;; can reorder them and use an LDD.
+
+(define_peephole2 ; swap the destination registers of two loads
+                  ; before a commutative operation.
+  [(set (match_operand:SI 0 "register_operand" "")
+        (match_operand:SI 2 "memory_operand" ""))
+   (set (match_operand:SI 1 "register_operand" "")
+        (match_operand:SI 3 "memory_operand" ""))
+   (set (match_operand:SI 4 "register_operand" "")
+        (match_operator:SI 5 "commutative_operator"
+                           [(match_operand 6 "register_operand" "")
+                            (match_operand 7 "register_operand" "")]))]
+  "TARGET_LL64
+   && (((rtx_equal_p (operands[0], operands[6]))
+        && (rtx_equal_p (operands[1], operands[7])))
+       || ((rtx_equal_p (operands[0], operands[7]))
+           && (rtx_equal_p (operands[1], operands[6]))))
+   && (peep2_reg_dead_p (3, operands[0]) || rtx_equal_p (operands[0], operands[4]))
+   && (peep2_reg_dead_p (3, operands[1]) || rtx_equal_p (operands[1], operands[4]))"
+  [(set (match_dup 0) (match_dup 2))
+   (set (match_dup 4) (match_op_dup 5 [(match_dup 6) (match_dup 7)]))]
+  {
+    if (!gen_operands_ldd_std (operands, true, true))
+      {
+        FAIL;
+      }
+    else
+      {
+        operands[0] = gen_rtx_REG (DImode, REGNO (operands[0]));
+        operands[2] = adjust_address (operands[2], DImode, 0);
+      }
+  }
+)
+
 ;; include the arc-FPX instructions
 (include "fpx.md")
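
Note for reviewers: the pairing rule implemented by gen_operands_ldd_std
and operands_ok_ldd_std can be sketched outside of GCC as below.  This
is only an illustration, not part of the patch.  The names Access,
regs_ok and can_pair are hypothetical.  The sketch assumes both accesses
already share a base register and that reload has completed, and it
leaves out the base-register (RAW) check and the SMALL_INT_RANGE offset
check.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for one word-sized access: a register number
// and a constant byte offset from a base register shared by both accesses.
struct Access {
  unsigned regno;
  long offset;
};

// Mirrors the register test in operands_ok_ldd_std: the first register
// must be even, the second must be the next one, and the second must
// not be register 63.
static bool regs_ok (unsigned t, unsigned t2)
{
  return t2 != 63 && t % 2 == 0 && t2 == t + 1;
}

// Returns true if the two accesses could be merged into one 64-bit
// access.  On success A describes the access at the lower address.
static bool can_pair (Access &a, Access &b, bool load, bool commute)
{
  if (load && a.regno == b.regno)
    return false;                 // WAW: both loads write the same register.

  if (a.offset > b.offset)
    std::swap (a, b);             // Order so the lower address comes first.

  if (b.offset - a.offset != 4)
    return false;                 // Not two consecutive words.

  if (regs_ok (a.regno, b.regno))
    return true;

  if (load && commute)
    {
      // Try the destination registers the other way round.
      std::swap (a.regno, b.regno);
      return regs_ok (a.regno, b.regno);
    }

  return false;
}
```

For example, loads into r2 and r3 from offsets 0 and 4 pair directly,
while loads into r3 and r2 from the same offsets pair only when COMMUTE
is set, which is the case the third peephole covers.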