From patchwork Fri Sep 6 06:50:41 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 273091
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 5 Sep 2013 23:50:41 -0700
Message-Id: <1378450242-27080-20-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1378450242-27080-1-git-send-email-rth@twiddle.net>
References: <1378450242-27080-1-git-send-email-rth@twiddle.net>
Cc: aurelien@aurel32.net, Richard Henderson
Subject: [Qemu-devel] [PATCH 19/19] tcg-ia64: Move part of softmmu slow path out of line

Signed-off-by: Richard Henderson
---
 tcg/ia64/tcg-target.c | 156 +++++++++++++++++++++++++++++++-------------------
 tcg/ia64/tcg-target.h |   2 +-
 2 files changed, 97 insertions(+), 61 deletions(-)

diff --git a/tcg/ia64/tcg-target.c b/tcg/ia64/tcg-target.c
index ea24e83..9fd176d 100644
--- a/tcg/ia64/tcg-target.c
+++ b/tcg/ia64/tcg-target.c
@@ -219,6 +219,7 @@ enum {
     OPC_ALLOC_M34             = 0x02c00000000ull,
     OPC_BR_DPTK_FEW_B1        = 0x08400000000ull,
     OPC_BR_SPTK_MANY_B1       = 0x08000001000ull,
+    OPC_BR_CALL_SPNT_FEW_B3   = 0x0a200000000ull,
     OPC_BR_SPTK_MANY_B4       = 0x00100001000ull,
     OPC_BR_CALL_SPTK_MANY_B5  = 0x02100001000ull,
     OPC_BR_RET_SPTK_MANY_B4   = 0x00108001100ull,
@@ -355,6 +356,15 @@ static inline uint64_t tcg_opc_b1(int qp, uint64_t opc, uint64_t imm)
            | (qp & 0x3f);
 }
 
+static inline uint64_t tcg_opc_b3(int qp, uint64_t opc, int b1, uint64_t imm)
+{
+    return opc
+           | ((imm & 0x100000) << 16) /* s */
+           | ((imm & 0x0fffff) << 13) /* imm20b */
+           | ((b1 & 0x7) << 6)
+           | (qp & 0x3f);
+}
+
 static inline uint64_t tcg_opc_b4(int qp, uint64_t opc, int b2)
 {
     return opc
@@ -1633,14 +1643,70 @@ static inline void tcg_out_qemu_tlb(TCGContext *s, TCGReg addr_reg,
                    bswap2);
 }
 
-/* helper signature: helper_ld_mmu(CPUState *env, target_ulong addr,
-   int mmu_idx, uintptr_t retaddr) */
-static const void * const qemu_ld_helpers[4] = {
-    helper_ret_ldub_mmu,
-    helper_le_lduw_mmu,
-    helper_le_ldul_mmu,
-    helper_le_ldq_mmu,
-};
+static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOp opc,
+                                uint8_t *label_ptr)
+{
+    TCGLabelQemuLdst *l = &s->qemu_ldst_labels[s->nb_qemu_ldst_labels++];
+
+    assert(s->nb_qemu_ldst_labels <= TCG_MAX_QEMU_LDST);
+
+    /* We don't need most of the items in the generic structure.  */
+    memset(l, 0, sizeof(*l));
+    l->is_ld = is_ld;
+    l->opc = opc & MO_SIZE;
+    l->label_ptr[0] = label_ptr;
+}
+
+void tcg_out_tb_finalize(TCGContext *s)
+{
+    static const void * const helpers[8] = {
+        helper_ret_stb_mmu,
+        helper_le_stw_mmu,
+        helper_le_stl_mmu,
+        helper_le_stq_mmu,
+        helper_ret_ldub_mmu,
+        helper_le_lduw_mmu,
+        helper_le_ldul_mmu,
+        helper_le_ldq_mmu,
+    };
+    uintptr_t thunks[8] = { };
+    size_t i, n = s->nb_qemu_ldst_labels;
+
+    for (i = 0; i < n; i++) {
+        TCGLabelQemuLdst *l = &s->qemu_ldst_labels[i];
+        long x = l->is_ld * 4 + l->opc;
+        uintptr_t dest = thunks[x];
+
+        /* The out-of-line thunks are all the same; load the return address
+           from B0, load the GP, and branch to the code.  Note that we are
+           always post-call, so the register window has rolled, so we're
+           using incoming parameter register numbers, not outgoing.  */
+        if (dest == 0) {
+            uintptr_t disp, *desc = (uintptr_t *)helpers[x];
+
+            thunks[x] = dest = (uintptr_t)s->code_ptr;
+
+            tcg_out_bundle(s, mlx,
+                           INSN_NOP_M,
+                           tcg_opc_l2 (desc[1]),
+                           tcg_opc_x2 (TCG_REG_P0, OPC_MOVL_X2,
+                                       TCG_REG_R1, desc[1]));
+            tcg_out_bundle(s, mii,
+                           INSN_NOP_M,
+                           INSN_NOP_I,
+                           tcg_opc_i22(TCG_REG_P0, OPC_MOV_I22,
+                                       l->is_ld ? TCG_REG_R35 : TCG_REG_R36,
+                                       TCG_REG_B0));
+            disp = (desc[0] - (uintptr_t)s->code_ptr) >> 4;
+            tcg_out_bundle(s, mLX,
+                           INSN_NOP_M,
+                           tcg_opc_l3 (disp),
+                           tcg_opc_x3 (TCG_REG_P0, OPC_BRL_SPTK_MANY_X3, disp));
+        }
+
+        reloc_pcrel21b(l->label_ptr[0], dest);
+    }
+}
 
 static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args,
                                    TCGMemOp opc)
@@ -1650,7 +1716,8 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args,
     };
     int addr_reg, data_reg, mem_index;
     TCGMemOp s_bits;
-    uint64_t fin1, fin2, *desc, func, gp, here;
+    uint64_t fin1, fin2;
+    uint8_t *label_ptr;
 
     data_reg = *args++;
     addr_reg = *args++;
@@ -1677,31 +1744,20 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args,
         fin1 = tcg_opc_ext_i(TCG_REG_P0, opc, data_reg, TCG_REG_R8);
     }
 
-    desc = (uintptr_t *)qemu_ld_helpers[s_bits];
-    func = desc[0];
-    gp = desc[1];
-    here = (uintptr_t)s->code_ptr;
-
-    tcg_out_bundle(s, mlx,
+    tcg_out_bundle(s, mmI,
                    tcg_opc_mov_a(TCG_REG_P7, TCG_REG_R56, TCG_AREG0),
-                   tcg_opc_l2 (here),
-                   tcg_opc_x2 (TCG_REG_P7, OPC_MOVL_X2, TCG_REG_R59, here));
-    tcg_out_bundle(s, mLX,
                    tcg_opc_a1 (TCG_REG_P6, OPC_ADD_A1, TCG_REG_R2,
                                TCG_REG_R2, TCG_REG_R57),
-                   tcg_opc_l2 (gp),
-                   tcg_opc_x2 (TCG_REG_P7, OPC_MOVL_X2, TCG_REG_R1, gp));
-    tcg_out_bundle(s, mmi,
+                   tcg_opc_movi_a(TCG_REG_P7, TCG_REG_R58, mem_index));
+    label_ptr = s->code_ptr + 2;
+    tcg_out_bundle(s, miB,
                    tcg_opc_m1 (TCG_REG_P6, opc_ld_m1[s_bits],
                                TCG_REG_R8, TCG_REG_R2),
-                   tcg_opc_movi_a(TCG_REG_P7, TCG_REG_R58, mem_index),
-                   INSN_NOP_I);
-    func -= (uintptr_t)s->code_ptr;
-    tcg_out_bundle(s, mLX,
-                   INSN_NOP_M,
-                   tcg_opc_l4 (func >> 4),
-                   tcg_opc_x4 (TCG_REG_P7, OPC_BRL_CALL_SPNT_MANY_X4,
-                               TCG_REG_B0, func >> 4));
+                   INSN_NOP_I,
+                   tcg_opc_b3 (TCG_REG_P7, OPC_BR_CALL_SPNT_FEW_B3, TCG_REG_B0,
+                               get_reloc_pcrel21b(label_ptr)));
+
+    add_qemu_ldst_label(s, 1, opc, label_ptr);
 
     /* Note that we always use LE helper functions, so the bswap insns
        here for the fast path also apply to the slow path.  */
@@ -1711,15 +1767,6 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args,
                    fin2 ? fin2 : INSN_NOP_I);
 }
 
-/* helper signature: helper_st_mmu(CPUState *env, target_ulong addr,
-   uintxx_t val, int mmu_idx, uintptr_t retaddr) */
-static const void * const qemu_st_helpers[4] = {
-    helper_ret_stb_mmu,
-    helper_le_stw_mmu,
-    helper_le_stl_mmu,
-    helper_le_stq_mmu,
-};
-
 static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args,
                                    TCGMemOp opc)
 {
@@ -1728,8 +1775,9 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args,
     };
     TCGReg addr_reg, data_reg;
     int mem_index;
-    uint64_t pre1, pre2, *desc, func, gp, here;
+    uint64_t pre1, pre2;
     TCGMemOp s_bits;
+    uint8_t *label_ptr;
 
     data_reg = *args++;
     addr_reg = *args++;
@@ -1758,32 +1806,20 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args,
                    pre1, pre2);
 
     /* P6 is the fast path, and P7 the slow path */
-
-    desc = (uintptr_t *)qemu_st_helpers[s_bits];
-    func = desc[0];
-    gp = desc[1];
-    here = (uintptr_t)s->code_ptr;
-
-    tcg_out_bundle(s, mlx,
+    tcg_out_bundle(s, mmI,
                    tcg_opc_mov_a(TCG_REG_P7, TCG_REG_R56, TCG_AREG0),
-                   tcg_opc_l2 (here),
-                   tcg_opc_x2 (TCG_REG_P7, OPC_MOVL_X2, TCG_REG_R60, here));
-    tcg_out_bundle(s, mLX,
                    tcg_opc_a1 (TCG_REG_P6, OPC_ADD_A1, TCG_REG_R2,
                                TCG_REG_R2, TCG_REG_R57),
-                   tcg_opc_l2 (gp),
-                   tcg_opc_x2 (TCG_REG_P7, OPC_MOVL_X2, TCG_REG_R1, gp));
-    tcg_out_bundle(s, mmi,
+                   tcg_opc_movi_a(TCG_REG_P7, TCG_REG_R59, mem_index));
+    label_ptr = s->code_ptr + 2;
+    tcg_out_bundle(s, miB,
                    tcg_opc_m4 (TCG_REG_P6, opc_st_m4[s_bits],
                                TCG_REG_R58, TCG_REG_R2),
-                   tcg_opc_movi_a(TCG_REG_P7, TCG_REG_R59, mem_index),
-                   INSN_NOP_I);
-    func -= (uintptr_t)s->code_ptr;
-    tcg_out_bundle(s, mLX,
-                   INSN_NOP_M,
-                   tcg_opc_l4 (func >> 4),
-                   tcg_opc_x4 (TCG_REG_P7, OPC_BRL_CALL_SPNT_MANY_X4,
-                               TCG_REG_B0, func >> 4));
+                   INSN_NOP_I,
+                   tcg_opc_b3 (TCG_REG_P7, OPC_BR_CALL_SPNT_FEW_B3, TCG_REG_B0,
+                               get_reloc_pcrel21b(label_ptr)));
+
+    add_qemu_ldst_label(s, 0, opc, label_ptr);
 }
 
 #else /* !CONFIG_SOFTMMU */
diff --git a/tcg/ia64/tcg-target.h b/tcg/ia64/tcg-target.h
index 65897f2..9e1e8ba 100644
--- a/tcg/ia64/tcg-target.h
+++ b/tcg/ia64/tcg-target.h
@@ -25,7 +25,7 @@
 #ifndef TCG_TARGET_IA64
 #define TCG_TARGET_IA64 1
 
-#undef TCG_QEMU_LDST_OPTIMIZATION
+#define TCG_QEMU_LDST_OPTIMIZATION
 
 /* We only map the first 64 registers */
 #define TCG_TARGET_NB_REGS 64
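
The thunk sharing done in tcg_out_tb_finalize can be sketched in isolation. What follows is a minimal, hypothetical model and not QEMU code: the Sketch* and sketch_* names are invented for illustration, code addresses are plain integers, and a thunk is assumed to occupy 48 bytes (the three 16-byte IA-64 bundles the patch emits). As in the patch, a zero entry in the thunk table means "not emitted yet", so the starting code address must be nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TCG structures; not the real QEMU types. */
enum {
    SKETCH_NUM_HELPERS = 8,   /* 4 store sizes + 4 load sizes */
    SKETCH_THUNK_SIZE = 48    /* assumed: 3 bundles of 16 bytes each */
};

typedef struct {
    int is_ld;          /* 1 for a load, 0 for a store */
    int size_log2;      /* 0..3, mirroring opc & MO_SIZE in the patch */
    uintptr_t target;   /* filled in below: address of the thunk to call */
} SketchLabel;

/* Index into the helper table: stores occupy 0..3, loads 4..7. */
static int sketch_helper_index(int is_ld, int size_log2)
{
    return is_ld * 4 + size_log2;
}

/* Resolve every recorded slow-path branch.  A thunk is "emitted" (here:
   merely assigned the current code address) the first time its helper is
   needed; later labels for the same helper reuse that address.  Returns
   the number of thunks emitted.  */
static size_t sketch_finalize(SketchLabel *labels, size_t n,
                              uintptr_t *code_ptr)
{
    uintptr_t thunks[SKETCH_NUM_HELPERS] = { 0 };
    size_t i, emitted = 0;

    for (i = 0; i < n; i++) {
        int x = sketch_helper_index(labels[i].is_ld, labels[i].size_log2);

        if (thunks[x] == 0) {
            thunks[x] = *code_ptr;
            *code_ptr += SKETCH_THUNK_SIZE;
            emitted++;
        }
        /* Corresponds to reloc_pcrel21b(l->label_ptr[0], dest).  */
        labels[i].target = thunks[x];
    }
    return emitted;
}
```

With two 32-bit loads and one 32-bit store recorded in a translation block, this model emits only two thunks, and both loads branch to the same one. That is the space saving the patch aims for compared with materializing the helper address and GP inline at every memory access.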