From patchwork Tue Oct 9 20:30:48 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aurelien Jarno
X-Patchwork-Id: 190434
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
 (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by ozlabs.org (Postfix) with ESMTPS id 658982C0096
 for ; Wed, 10 Oct 2012 07:31:32 +1100 (EST)
Received: from localhost ([::1]:51437 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1TLgSe-0003mp-0d
 for incoming@patchwork.ozlabs.org; Tue, 09 Oct 2012 16:31:28 -0400
Received: from eggs.gnu.org ([208.118.235.92]:50584)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1TLgSC-0003Ny-3P
 for qemu-devel@nongnu.org; Tue, 09 Oct 2012 16:31:05 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1TLgS9-0006Fz-Ih
 for qemu-devel@nongnu.org; Tue, 09 Oct 2012 16:30:59 -0400
Received: from hall.aurel32.net ([88.191.126.93]:42146)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1TLgS9-0006EY-AQ; Tue, 09 Oct 2012 16:30:57 -0400
Received: from [2001:470:d4ed:0:ea11:32ff:fea1:831a] (helo=ohm.aurel32.net)
 by hall.aurel32.net with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16)
 (Exim 4.72) (envelope-from ) id 1TLgS7-00079u-6z;
 Tue, 09 Oct 2012 22:30:55 +0200
Received: from aurel32 by ohm.aurel32.net with local (Exim 4.80)
 (envelope-from ) id 1TLgS5-0005ot-H8; Tue, 09 Oct 2012 22:30:53 +0200
From: Aurelien Jarno
To: qemu-devel@nongnu.org
Date: Tue, 9 Oct 2012 22:30:48 +0200
Message-Id: <1349814652-22325-2-git-send-email-aurelien@aurel32.net>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1349814652-22325-1-git-send-email-aurelien@aurel32.net>
References: <1349814652-22325-1-git-send-email-aurelien@aurel32.net>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 88.191.126.93
Cc: Peter Maydell, qemu-stable@nongnu.org, Aurelien Jarno
Subject: [Qemu-devel] [PATCH 1/5] tcg/arm: fix TLB access in qemu-ld/st ops
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

The TCG arm backend assumes that the offset to the TLB entries is
unlikely to exceed 12 bits for mem_index = 0. In practice this is not
true for at least the MIPS target.

This patch fixes that by loading bits 23-12 with a separate
instruction and using loads with address writeback, independently of
the value of mem_index. In total this allows a 24-bit offset, which is
a lot more than needed.

Cc: Andrzej Zaborowski
Cc: Peter Maydell
Cc: qemu-stable@nongnu.org
Signed-off-by: Aurelien Jarno
---
 tcg/arm/tcg-target.c |   73 +++++++++++++++++++++++++-------------------------
 1 file changed, 37 insertions(+), 36 deletions(-)

diff --git a/tcg/arm/tcg-target.c b/tcg/arm/tcg-target.c
index 737200e..6cde512 100644
--- a/tcg/arm/tcg-target.c
+++ b/tcg/arm/tcg-target.c
@@ -624,6 +624,19 @@ static inline void tcg_out_ld32_12(TCGContext *s, int cond,
                         (rn << 16) | (rd << 12) | ((-im) & 0xfff));
 }
 
+/* Offset pre-increment with base writeback.  */
+static inline void tcg_out_ld32_12wb(TCGContext *s, int cond,
+                                     int rd, int rn, tcg_target_long im)
+{
+    if (im >= 0) {
+        tcg_out32(s, (cond << 28) | 0x05b00000 |
+                        (rn << 16) | (rd << 12) | (im & 0xfff));
+    } else {
+        tcg_out32(s, (cond << 28) | 0x05300000 |
+                        (rn << 16) | (rd << 12) | ((-im) & 0xfff));
+    }
+}
+
 static inline void tcg_out_st32_12(TCGContext *s, int cond,
                 int rd, int rn, tcg_target_long im)
 {
@@ -1056,7 +1069,7 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
 {
     int addr_reg, data_reg, data_reg2, bswap;
 #ifdef CONFIG_SOFTMMU
-    int mem_index, s_bits;
+    int mem_index, s_bits, tlb_offset;
     TCGReg argreg;
 # if TARGET_LONG_BITS == 64
     int addr_reg2;
@@ -1096,19 +1109,14 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
                     TCG_REG_R0, TCG_REG_R8, CPU_TLB_SIZE - 1);
     tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_R0,
                     TCG_AREG0, TCG_REG_R0, SHIFT_IMM_LSL(CPU_TLB_ENTRY_BITS));
-    /* In the
-     *  ldr r1 [r0, #(offsetof(CPUArchState, tlb_table[mem_index][0].addr_read))]
-     * below, the offset is likely to exceed 12 bits if mem_index != 0 and
-     * not exceed otherwise, so use an
-     *  add r0, r0, #(mem_index * sizeof *CPUArchState.tlb_table)
-     * before.
-     */
-    if (mem_index)
+    /* We assume that the offset is contained within 24 bits.  */
+    tlb_offset = offsetof(CPUArchState, tlb_table[mem_index][0].addr_read);
+    if (tlb_offset > 0xfff) {
         tcg_out_dat_imm(s, COND_AL, ARITH_ADD, TCG_REG_R0, TCG_REG_R0,
-                        (mem_index << (TLB_SHIFT & 1)) |
-                        ((16 - (TLB_SHIFT >> 1)) << 8));
-    tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R0,
-                    offsetof(CPUArchState, tlb_table[0][0].addr_read));
+                        0xa00 | (tlb_offset >> 12));
+        tlb_offset &= 0xfff;
+    }
+    tcg_out_ld32_12wb(s, COND_AL, TCG_REG_R1, TCG_REG_R0, tlb_offset);
     tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0,
                     TCG_REG_R1, TCG_REG_R8, SHIFT_IMM_LSL(TARGET_PAGE_BITS));
     /* Check alignment.  */
@@ -1116,15 +1124,14 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
         tcg_out_dat_imm(s, COND_EQ, ARITH_TST,
                         0, addr_reg, (1 << s_bits) - 1);
 # if TARGET_LONG_BITS == 64
-    /* XXX: possibly we could use a block data load or writeback in
-     * the first access.  */
-    tcg_out_ld32_12(s, COND_EQ, TCG_REG_R1, TCG_REG_R0,
-                    offsetof(CPUArchState, tlb_table[0][0].addr_read) + 4);
+    /* XXX: possibly we could use a block data load in the first access.  */
+    tcg_out_ld32_12(s, COND_EQ, TCG_REG_R1, TCG_REG_R0, 4);
     tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0,
                     TCG_REG_R1, addr_reg2, SHIFT_IMM_LSL(0));
 # endif
     tcg_out_ld32_12(s, COND_EQ, TCG_REG_R1, TCG_REG_R0,
-                    offsetof(CPUArchState, tlb_table[0][0].addend));
+                    offsetof(CPUTLBEntry, addend)
+                    - offsetof(CPUTLBEntry, addr_read));
 
     switch (opc) {
     case 0:
@@ -1273,7 +1280,7 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
 {
     int addr_reg, data_reg, data_reg2, bswap;
 #ifdef CONFIG_SOFTMMU
-    int mem_index, s_bits;
+    int mem_index, s_bits, tlb_offset;
     TCGReg argreg;
 # if TARGET_LONG_BITS == 64
     int addr_reg2;
@@ -1310,19 +1317,14 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
                     TCG_REG_R0, TCG_REG_R8, CPU_TLB_SIZE - 1);
     tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_R0,
                     TCG_AREG0, TCG_REG_R0, SHIFT_IMM_LSL(CPU_TLB_ENTRY_BITS));
-    /* In the
-     *  ldr r1 [r0, #(offsetof(CPUArchState, tlb_table[mem_index][0].addr_write))]
-     * below, the offset is likely to exceed 12 bits if mem_index != 0 and
-     * not exceed otherwise, so use an
-     *  add r0, r0, #(mem_index * sizeof *CPUArchState.tlb_table)
-     * before.
-     */
-    if (mem_index)
+    /* We assume that the offset is contained within 24 bits.  */
+    tlb_offset = offsetof(CPUArchState, tlb_table[mem_index][0].addr_write);
+    if (tlb_offset > 0xfff) {
         tcg_out_dat_imm(s, COND_AL, ARITH_ADD, TCG_REG_R0, TCG_REG_R0,
-                        (mem_index << (TLB_SHIFT & 1)) |
-                        ((16 - (TLB_SHIFT >> 1)) << 8));
-    tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R0,
-                    offsetof(CPUArchState, tlb_table[0][0].addr_write));
+                        0xa00 | (tlb_offset >> 12));
+        tlb_offset &= 0xfff;
+    }
+    tcg_out_ld32_12wb(s, COND_AL, TCG_REG_R1, TCG_REG_R0, tlb_offset);
     tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0,
                     TCG_REG_R1, TCG_REG_R8, SHIFT_IMM_LSL(TARGET_PAGE_BITS));
     /* Check alignment.  */
@@ -1330,15 +1332,14 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
         tcg_out_dat_imm(s, COND_EQ, ARITH_TST,
                         0, addr_reg, (1 << s_bits) - 1);
 # if TARGET_LONG_BITS == 64
-    /* XXX: possibly we could use a block data load or writeback in
-     * the first access.  */
-    tcg_out_ld32_12(s, COND_EQ, TCG_REG_R1, TCG_REG_R0,
-                    offsetof(CPUArchState, tlb_table[0][0].addr_write) + 4);
+    /* XXX: possibly we could use a block data load in the first access.  */
+    tcg_out_ld32_12(s, COND_EQ, TCG_REG_R1, TCG_REG_R0, 4);
     tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0,
                     TCG_REG_R1, addr_reg2, SHIFT_IMM_LSL(0));
 # endif
     tcg_out_ld32_12(s, COND_EQ, TCG_REG_R1, TCG_REG_R0,
-                    offsetof(CPUArchState, tlb_table[0][0].addend));
+                    offsetof(CPUTLBEntry, addend)
+                    - offsetof(CPUTLBEntry, addr_write));
 
     switch (opc) {
     case 0:
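
Note on the offset split: the sketch below is a standalone model, not part of the patch, of how the TLB offset is divided between the ADD immediate (0xa00 | (tlb_offset >> 12)) and the 12-bit immediate of the writeback load. The names ror32, arm_imm_value, TlbOffsetSplit, split_tlb_offset, effective_offset and encode_ldr_wb are hypothetical helpers introduced only for illustration. It assumes the usual ARM data-processing immediate layout (a 4-bit rotation in bits 11-8 and an 8-bit constant in bits 7-0). Under that assumption the ADD can carry only 8 bits of the offset, so the sketch accepts offsets below 1 << 20; that bound is a property of this model and may be worth checking against the 24-bit figure quoted in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32). */
static uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}

/* Value of an ARM data-processing immediate operand, assuming the usual
 * layout: bits 11-8 are a rotation in units of 2 bits, bits 7-0 the constant. */
static uint32_t arm_imm_value(uint32_t operand2)
{
    return ror32(operand2 & 0xff, 2 * ((operand2 >> 8) & 0xf));
}

/* Hypothetical result type: how one TLB offset is split across two insns. */
typedef struct {
    int      needs_add;     /* nonzero if an ADD must precede the load */
    uint32_t add_operand2;  /* immediate operand of that ADD */
    uint32_t ldr_imm12;     /* 12-bit immediate of the writeback load */
} TlbOffsetSplit;

/* Mirrors the patch's logic: upper bits go into the ADD, low 12 bits into
 * the load. Rotation 0xa means ROR 20, i.e. the constant lands at bit 12. */
static TlbOffsetSplit split_tlb_offset(uint32_t tlb_offset)
{
    TlbOffsetSplit r = { 0, 0, tlb_offset };

    /* Assumption of this sketch: the upper part must fit in 8 bits. */
    assert(tlb_offset < (1u << 20));
    if (tlb_offset > 0xfff) {
        r.needs_add = 1;
        r.add_operand2 = 0xa00 | (tlb_offset >> 12);
        r.ldr_imm12 = tlb_offset & 0xfff;
    }
    return r;
}

/* Total displacement applied to the base register by ADD plus load. */
static uint32_t effective_offset(TlbOffsetSplit r)
{
    return (r.needs_add ? arm_imm_value(r.add_operand2) : 0) + r.ldr_imm12;
}

/* Word emitted by the im >= 0 branch of the patch's tcg_out_ld32_12wb:
 * LDR rd, [rn, #imm12]! (pre-indexed, add offset, base writeback). */
static uint32_t encode_ldr_wb(unsigned cond, unsigned rd, unsigned rn,
                              uint32_t imm12)
{
    return (cond << 28) | 0x05b00000u | (rn << 16) | (rd << 12)
           | (imm12 & 0xfff);
}
```

For example, split_tlb_offset(0x2a34) gives an ADD operand of 0xa02 and a load immediate of 0xa34, and effective_offset() on that result returns 0x2a34 again. Because the load writes the final address back to the base register, the later loads in the patch can use small offsets relative to the addr_read or addr_write field.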