From patchwork Mon Jan 6 11:30:26 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 307190 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 0A3DD2C00CD for ; Mon, 6 Jan 2014 22:41:44 +1100 (EST) Received: from localhost ([::1]:34141 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W08Yt-0002yc-Sm for incoming@patchwork.ozlabs.org; Mon, 06 Jan 2014 06:41:40 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43239) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W08P1-0000hj-Nz for qemu-devel@nongnu.org; Mon, 06 Jan 2014 06:31:29 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W08Oz-0004Ve-Sr for qemu-devel@nongnu.org; Mon, 06 Jan 2014 06:31:27 -0500 Received: from mnementh.archaic.org.uk ([2001:8b0:1d0::1]:44223) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W08Oz-0004QZ-J8 for qemu-devel@nongnu.org; Mon, 06 Jan 2014 06:31:25 -0500 Received: from pm215 by mnementh.archaic.org.uk with local (Exim 4.80) (envelope-from ) id 1W08OY-0003qf-87; Mon, 06 Jan 2014 11:30:58 +0000 From: Peter Maydell To: Anthony Liguori Date: Mon, 6 Jan 2014 11:30:26 +0000 Message-Id: <1389007857-14649-22-git-send-email-peter.maydell@linaro.org> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1389007857-14649-1-git-send-email-peter.maydell@linaro.org> References: <1389007857-14649-1-git-send-email-peter.maydell@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:8b0:1d0::1 Cc: Blue Swirl , qemu-devel@nongnu.org, Aurelien Jarno Subject: [Qemu-devel] [PULL 21/52] target-arm: A64: support for ld/st/cl exclusive X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Michael Matz This implement exclusive loads/stores for aarch64 along the lines of arm32 and ppc implementations. The exclusive load remembers the address and loaded value. The exclusive store throws an an exception which uses those values to check for equality in a proper exclusive region. This is not actually the architecture mandated semantics (for either AArch32 or AArch64) but it is close enough for typical guest code sequences to work correctly, and saves us from having to monitor all guest stores. It's fairly easy to come up with test cases where we don't behave like hardware - we don't for example model cache line behaviour. However in the common patterns this works, and the existing 32 bit ARM exclusive access implementation has the same limitations. AArch64 also implements new acquire/release loads/stores (which may be either exclusive or non-exclusive). These imposes extra ordering constraints on memory operations (ie they act as if they have an implicit barrier built into them). As TCG is single-threaded all our barriers are no-ops, so these just behave like normal loads and stores. Signed-off-by: Michael Matz Signed-off-by: Alex Bennée Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- linux-user/main.c | 127 +++++++++++++++++++++++++++++++++++- target-arm/translate-a64.c | 156 ++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 277 insertions(+), 6 deletions(-) diff --git a/linux-user/main.c b/linux-user/main.c index 20f9832..cabc9e1 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -585,8 +585,8 @@ do_kernel_trap(CPUARMState *env) return 0; } -#endif +/* Store exclusive handling for AArch32 */ static int do_strex(CPUARMState *env) { uint64_t val; @@ -670,7 +670,6 @@ done: return segv; } -#ifdef TARGET_ABI32 void cpu_loop(CPUARMState *env) { CPUState *cs = CPU(arm_env_get_cpu(env)); @@ -885,6 +884,122 @@ void cpu_loop(CPUARMState *env) #else +/* + * Handle AArch64 store-release exclusive + * + * rs = gets the status result of store exclusive + * rt = is the register that is stored + * rt2 = is the second register store (in STP) + * + */ +static int do_strex_a64(CPUARMState *env) +{ + uint64_t val; + int size; + bool is_pair; + int rc = 1; + int segv = 0; + uint64_t addr; + int rs, rt, rt2; + + start_exclusive(); + /* size | is_pair << 2 | (rs << 4) | (rt << 9) | (rt2 << 14)); */ + size = extract32(env->exclusive_info, 0, 2); + is_pair = extract32(env->exclusive_info, 2, 1); + rs = extract32(env->exclusive_info, 4, 5); + rt = extract32(env->exclusive_info, 9, 5); + rt2 = extract32(env->exclusive_info, 14, 5); + + addr = env->exclusive_addr; + + if (addr != env->exclusive_test) { + goto finish; + } + + switch (size) { + case 0: + segv = get_user_u8(val, addr); + break; + case 1: + segv = get_user_u16(val, addr); + break; + case 2: + segv = get_user_u32(val, addr); + break; + case 3: + segv = get_user_u64(val, addr); + break; + default: + abort(); + } + if (segv) { + env->cp15.c6_data = addr; + goto error; + } + if (val != env->exclusive_val) { + goto finish; + } + if (is_pair) { + if (size == 2) { + segv = get_user_u32(val, addr + 4); + } else { + segv = get_user_u64(val, addr + 8); + } + if (segv) { + env->cp15.c6_data = addr + (size == 2 ? 4 : 8); + goto error; + } + if (val != env->exclusive_high) { + goto finish; + } + } + val = env->xregs[rt]; + switch (size) { + case 0: + segv = put_user_u8(val, addr); + break; + case 1: + segv = put_user_u16(val, addr); + break; + case 2: + segv = put_user_u32(val, addr); + break; + case 3: + segv = put_user_u64(val, addr); + break; + } + if (segv) { + goto error; + } + if (is_pair) { + val = env->xregs[rt2]; + if (size == 2) { + segv = put_user_u32(val, addr + 4); + } else { + segv = put_user_u64(val, addr + 8); + } + if (segv) { + env->cp15.c6_data = addr + (size == 2 ? 4 : 8); + goto error; + } + } + rc = 0; +finish: + env->pc += 4; + /* rs == 31 encodes a write to the ZR, thus throwing away + * the status return. This is rather silly but valid. + */ + if (rs < 31) { + env->xregs[rs] = rc; + } +error: + /* instruction faulted, PC does not advance */ + /* either way a strex releases any exclusive lock we have */ + env->exclusive_addr = -1; + end_exclusive(); + return segv; +} + /* AArch64 main loop */ void cpu_loop(CPUARMState *env) { @@ -944,7 +1059,7 @@ void cpu_loop(CPUARMState *env) } break; case EXCP_STREX: - if (do_strex(env)) { + if (do_strex_a64(env)) { addr = env->cp15.c6_data; goto do_segv; } @@ -956,6 +1071,12 @@ void cpu_loop(CPUARMState *env) abort(); } process_pending_signals(env); + /* Exception return on AArch64 always clears the exclusive monitor, + * so any return to running guest code implies this. + * A strex (successful or otherwise) also clears the monitor, so + * we don't need to specialcase EXCP_STREX. + */ + env->exclusive_addr = -1; } } #endif /* ndef TARGET_ABI32 */ diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c index 6197441..40c6fc4 100644 --- a/target-arm/translate-a64.c +++ b/target-arm/translate-a64.c @@ -38,6 +38,15 @@ static TCGv_i64 cpu_X[32]; static TCGv_i64 cpu_pc; static TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF; +/* Load/store exclusive handling */ +static TCGv_i64 cpu_exclusive_addr; +static TCGv_i64 cpu_exclusive_val; +static TCGv_i64 cpu_exclusive_high; +#ifdef CONFIG_USER_ONLY +static TCGv_i64 cpu_exclusive_test; +static TCGv_i32 cpu_exclusive_info; +#endif + static const char *regnames[] = { "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9", "x10", "x11", "x12", "x13", "x14", "x15", @@ -70,6 +79,19 @@ void a64_translate_init(void) cpu_ZF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, ZF), "ZF"); cpu_CF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, CF), "CF"); cpu_VF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, VF), "VF"); + + cpu_exclusive_addr = tcg_global_mem_new_i64(TCG_AREG0, + offsetof(CPUARMState, exclusive_addr), "exclusive_addr"); + cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0, + offsetof(CPUARMState, exclusive_val), "exclusive_val"); + cpu_exclusive_high = tcg_global_mem_new_i64(TCG_AREG0, + offsetof(CPUARMState, exclusive_high), "exclusive_high"); +#ifdef CONFIG_USER_ONLY + cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0, + offsetof(CPUARMState, exclusive_test), "exclusive_test"); + cpu_exclusive_info = tcg_global_mem_new_i32(TCG_AREG0, + offsetof(CPUARMState, exclusive_info), "exclusive_info"); +#endif } void aarch64_cpu_dump_state(CPUState *cs, FILE *f, @@ -767,6 +789,11 @@ static void handle_hint(DisasContext *s, uint32_t insn, } } +static void gen_clrex(DisasContext *s, uint32_t insn) +{ + tcg_gen_movi_i64(cpu_exclusive_addr, -1); +} + /* CLREX, DSB, DMB, ISB */ static void handle_sync(DisasContext *s, uint32_t insn, unsigned int op1, unsigned int op2, unsigned int crm) @@ -778,7 +805,7 @@ static void handle_sync(DisasContext *s, uint32_t insn, switch (op2) { case 2: /* CLREX */ - unsupported_encoding(s, insn); + gen_clrex(s, insn); return; case 4: /* DSB */ case 5: /* DMB */ @@ -1106,10 +1133,133 @@ static void disas_b_exc_sys(DisasContext *s, uint32_t insn) } } -/* Load/store exclusive */ +/* + * Load/Store exclusive instructions are implemented by remembering + * the value/address loaded, and seeing if these are the same + * when the store is performed. This is not actually the architecturally + * mandated semantics, but it works for typical guest code sequences + * and avoids having to monitor regular stores. + * + * In system emulation mode only one CPU will be running at once, so + * this sequence is effectively atomic. In user emulation mode we + * throw an exception and handle the atomic operation elsewhere. + */ +static void gen_load_exclusive(DisasContext *s, int rt, int rt2, + TCGv_i64 addr, int size, bool is_pair) +{ + TCGv_i64 tmp = tcg_temp_new_i64(); + TCGMemOp memop = MO_TE + size; + + g_assert(size <= 3); + tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), memop); + + if (is_pair) { + TCGv_i64 addr2 = tcg_temp_new_i64(); + TCGv_i64 hitmp = tcg_temp_new_i64(); + + g_assert(size >= 2); + tcg_gen_addi_i64(addr2, addr, 1 << size); + tcg_gen_qemu_ld_i64(hitmp, addr2, get_mem_index(s), memop); + tcg_temp_free_i64(addr2); + tcg_gen_mov_i64(cpu_exclusive_high, hitmp); + tcg_gen_mov_i64(cpu_reg(s, rt2), hitmp); + tcg_temp_free_i64(hitmp); + } + + tcg_gen_mov_i64(cpu_exclusive_val, tmp); + tcg_gen_mov_i64(cpu_reg(s, rt), tmp); + + tcg_temp_free_i64(tmp); + tcg_gen_mov_i64(cpu_exclusive_addr, addr); +} + +#ifdef CONFIG_USER_ONLY +static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, + TCGv_i64 addr, int size, int is_pair) +{ + tcg_gen_mov_i64(cpu_exclusive_test, addr); + tcg_gen_movi_i32(cpu_exclusive_info, + size | is_pair << 2 | (rd << 4) | (rt << 9) | (rt2 << 14)); + gen_exception_insn(s, 4, EXCP_STREX); +} +#else +static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, + TCGv_i64 addr, int size, int is_pair) +{ + qemu_log_mask(LOG_UNIMP, + "%s:%d: system mode store_exclusive unsupported " + "at pc=%016" PRIx64 "\n", + __FILE__, __LINE__, s->pc - 4); +} +#endif + +/* C3.3.6 Load/store exclusive + * + * 31 30 29 24 23 22 21 20 16 15 14 10 9 5 4 0 + * +-----+-------------+----+---+----+------+----+-------+------+------+ + * | sz | 0 0 1 0 0 0 | o2 | L | o1 | Rs | o0 | Rt2 | Rn | Rt | + * +-----+-------------+----+---+----+------+----+-------+------+------+ + * + * sz: 00 -> 8 bit, 01 -> 16 bit, 10 -> 32 bit, 11 -> 64 bit + * L: 0 -> store, 1 -> load + * o2: 0 -> exclusive, 1 -> not + * o1: 0 -> single register, 1 -> register pair + * o0: 1 -> load-acquire/store-release, 0 -> not + * + * o0 == 0 AND o2 == 1 is un-allocated + * o1 == 1 is un-allocated except for 32 and 64 bit sizes + */ static void disas_ldst_excl(DisasContext *s, uint32_t insn) { - unsupported_encoding(s, insn); + int rt = extract32(insn, 0, 5); + int rn = extract32(insn, 5, 5); + int rt2 = extract32(insn, 10, 5); + int is_lasr = extract32(insn, 15, 1); + int rs = extract32(insn, 16, 5); + int is_pair = extract32(insn, 21, 1); + int is_store = !extract32(insn, 22, 1); + int is_excl = !extract32(insn, 23, 1); + int size = extract32(insn, 30, 2); + TCGv_i64 tcg_addr; + + if ((!is_excl && !is_lasr) || + (is_pair && size < 2)) { + unallocated_encoding(s); + return; + } + + if (rn == 31) { + gen_check_sp_alignment(s); + } + tcg_addr = read_cpu_reg_sp(s, rn, 1); + + /* Note that since TCG is single threaded load-acquire/store-release + * semantics require no extra if (is_lasr) { ... } handling. + */ + + if (is_excl) { + if (!is_store) { + gen_load_exclusive(s, rt, rt2, tcg_addr, size, is_pair); + } else { + gen_store_exclusive(s, rs, rt, rt2, tcg_addr, size, is_pair); + } + } else { + TCGv_i64 tcg_rt = cpu_reg(s, rt); + if (is_store) { + do_gpr_st(s, tcg_rt, tcg_addr, size); + } else { + do_gpr_ld(s, tcg_rt, tcg_addr, size, false, false); + } + if (is_pair) { + TCGv_i64 tcg_rt2 = cpu_reg(s, rt); + tcg_gen_addi_i64(tcg_addr, tcg_addr, 1 << size); + if (is_store) { + do_gpr_st(s, tcg_rt2, tcg_addr, size); + } else { + do_gpr_ld(s, tcg_rt2, tcg_addr, size, false, false); + } + } + } } /*