From patchwork Tue Feb 11 00:40:43 2020
X-Patchwork-Submitter: Taylor Simpson
X-Patchwork-Id: 1236097
From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: riku.voipio@iki.fi, richard.henderson@linaro.org, laurent@vivier.eu,
 Taylor Simpson, philmd@redhat.com, aleksandar.m.mail@gmail.com
Subject: [RFC PATCH 65/66] Hexagon HVX translation
Date: Mon, 10 Feb 2020 18:40:43 -0600
Message-Id: <1581381644-13678-66-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1581381644-13678-1-git-send-email-tsimpson@quicinc.com>
References: <1581381644-13678-1-git-send-email-tsimpson@quicinc.com>

Changes to packet semantics to support HVX

Signed-off-by: Taylor Simpson
---
 target/hexagon/translate.c | 174 +++++++++++++++++++++++++++++++++++++++++++++
 target/hexagon/translate.h |  30 ++++++++
 2 files changed, 204 insertions(+)

diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 2fe4dcb..eebe9b4 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -51,6 +51,10 @@ TCGv llsc_val;
 TCGv_i64 llsc_val_i64;
 TCGv hex_is_gather_store_insn;
 TCGv hex_gather_issued;
+TCGv hex_VRegs_updated_tmp;
+TCGv hex_VRegs_updated;
+TCGv hex_VRegs_select;
+TCGv hex_QRegs_updated;
 
 static const char * const hexagon_prednames[] = {
   "p0", "p1", "p2", "p3"
@@ -136,6 +140,10 @@ static void gen_start_packet(DisasContext *ctx, packet_t *pkt)
     /* Clear out the disassembly context */
     ctx->ctx_reg_log_idx = 0;
     ctx->ctx_preg_log_idx = 0;
+    ctx->ctx_temp_vregs_idx = 0;
+    ctx->ctx_temp_qregs_idx = 0;
+    ctx->ctx_vreg_log_idx = 0;
+    ctx->ctx_qreg_log_idx = 0;
     for (i = 0; i < STORES_MAX; i++) {
         ctx->ctx_store_width[i] = 0;
     }
@@ -154,6 +162,15 @@ static void gen_start_packet(DisasContext *ctx, packet_t *pkt)
     for (i = 0; i < NUM_PREGS; i++) {
         tcg_gen_movi_tl(hex_pred_written[i], 0);
     }
+
+    if (pkt->pkt_has_hvx) {
+        tcg_gen_movi_tl(hex_VRegs_updated_tmp, 0);
+        tcg_gen_movi_tl(hex_VRegs_updated, 0);
+        tcg_gen_movi_tl(hex_VRegs_select, 0);
+        tcg_gen_movi_tl(hex_QRegs_updated, 0);
+        tcg_gen_movi_tl(hex_is_gather_store_insn, 0);
+        tcg_gen_movi_tl(hex_gather_issued, 0);
+    }
 }
 
 static int is_gather_store_insn(insn_t *insn)
@@ -445,10 +462,149 @@ static bool process_change_of_flow(DisasContext *ctx, packet_t *pkt)
     return false;
 }
 
+void gen_memcpy(TCGv_ptr dest, TCGv_ptr src, size_t n)
+{
+    TCGv_ptr d = tcg_temp_new_ptr();
+    TCGv_ptr s = tcg_temp_new_ptr();
+    int i;
+
+    tcg_gen_addi_ptr(d, dest, 0);
+    tcg_gen_addi_ptr(s, src, 0);
+    if (n % 8 == 0) {
+        TCGv_i64 temp = tcg_temp_new_i64();
+        for (i = 0; i < n / 8; i++) {
+            tcg_gen_ld_i64(temp, s, 0);
+            tcg_gen_st_i64(temp, d, 0);
+            tcg_gen_addi_ptr(s, s, 8);
+            tcg_gen_addi_ptr(d, d, 8);
+        }
+        tcg_temp_free_i64(temp);
+    } else if (n % 4 == 0) {
+        TCGv temp = tcg_temp_new();
+        for (i = 0; i < n / 4; i++) {
+            tcg_gen_ld32u_tl(temp, s, 0);
+            tcg_gen_st32_tl(temp, d, 0);
+            tcg_gen_addi_ptr(s, s, 4);
+            tcg_gen_addi_ptr(d, d, 4);
+        }
+        tcg_temp_free(temp);
+    } else if (n % 2 == 0) {
+        TCGv temp = tcg_temp_new();
+        for (i = 0; i < n / 2; i++) {
+            tcg_gen_ld16u_tl(temp, s, 0);
+            tcg_gen_st16_tl(temp, d, 0);
+            tcg_gen_addi_ptr(s, s, 2);
+            tcg_gen_addi_ptr(d, d, 2);
+        }
+        tcg_temp_free(temp);
+    } else {
+        TCGv temp = tcg_temp_new();
+        for (i = 0; i < n; i++) {
+            tcg_gen_ld8u_tl(temp, s, 0);
+            tcg_gen_st8_tl(temp, d, 0);
+            tcg_gen_addi_ptr(s, s, 1);
+            tcg_gen_addi_ptr(d, d, 1);
+        }
+        tcg_temp_free(temp);
+    }
+
+    tcg_temp_free_ptr(d);
+    tcg_temp_free_ptr(s);
+}
+
+static inline void gen_vec_copy(intptr_t dstoff, intptr_t srcoff, size_t size)
+{
+    TCGv_ptr src = tcg_temp_new_ptr();
+    TCGv_ptr dst = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(src, cpu_env, srcoff);
+    tcg_gen_addi_ptr(dst, cpu_env, dstoff);
+    gen_memcpy(dst, src, size);
+    tcg_temp_free_ptr(src);
+    tcg_temp_free_ptr(dst);
+}
+
+static void gen_commit_hvx(DisasContext *ctx)
+{
+    int i;
+
+    /*
+     *    for (i = 0; i < ctx->ctx_vreg_log_idx; i++) {
+     *        int rnum = ctx->ctx_vreg_log[i];
+     *        if (ctx->ctx_vreg_is_predicated[i]) {
+     *            if (env->VRegs_updated & (1 << rnum)) {
+     *                env->VRegs[rnum] = env->future_VRegs[rnum];
+     *            }
+     *        } else {
+     *            env->VRegs[rnum] = env->future_VRegs[rnum];
+     *        }
+     *    }
+     */
+    for (i = 0; i < ctx->ctx_vreg_log_idx; i++) {
+        int rnum = ctx->ctx_vreg_log[i];
+        int is_predicated = ctx->ctx_vreg_is_predicated[i];
+        intptr_t dstoff = offsetof(CPUHexagonState, VRegs[rnum]);
+        intptr_t srcoff = offsetof(CPUHexagonState, future_VRegs[rnum]);
+        size_t size = sizeof(mmvector_t);
+
+        if (is_predicated) {
+            TCGv cmp = tcg_temp_local_new();
+            TCGLabel *label_skip = gen_new_label();
+
+            tcg_gen_andi_tl(cmp, hex_VRegs_updated, 1 << rnum);
+            tcg_gen_brcondi_tl(TCG_COND_EQ, cmp, 0, label_skip);
+            {
+                gen_vec_copy(dstoff, srcoff, size);
+            }
+            gen_set_label(label_skip);
+            tcg_temp_free(cmp);
+        } else {
+            gen_vec_copy(dstoff, srcoff, size);
+        }
+    }
+
+    /*
+     *    for (i = 0; i < ctx->ctx_qreg_log_idx; i++) {
+     *        int rnum = ctx->ctx_qreg_log[i];
+     *        if (ctx->ctx_qreg_is_predicated[i]) {
+     *            if (env->QRegs_updated & (1 << rnum)) {
+     *                env->QRegs[rnum] = env->future_QRegs[rnum];
+     *            }
+     *        } else {
+     *            env->QRegs[rnum] = env->future_QRegs[rnum];
+     *        }
+     *    }
+     */
+    for (i = 0; i < ctx->ctx_qreg_log_idx; i++) {
+        int rnum = ctx->ctx_qreg_log[i];
+        int is_predicated = ctx->ctx_qreg_is_predicated[i];
+        intptr_t dstoff = offsetof(CPUHexagonState, QRegs[rnum]);
+        intptr_t srcoff = offsetof(CPUHexagonState, future_QRegs[rnum]);
+        size_t size = sizeof(mmqreg_t);
+
+        if (is_predicated) {
+            TCGv cmp = tcg_temp_local_new();
+            TCGLabel *label_skip = gen_new_label();
+
+            tcg_gen_andi_tl(cmp, hex_QRegs_updated, 1 << rnum);
+            tcg_gen_brcondi_tl(TCG_COND_EQ, cmp, 0, label_skip);
+            {
+                gen_vec_copy(dstoff, srcoff, size);
+            }
+            gen_set_label(label_skip);
+            tcg_temp_free(cmp);
+        } else {
+            gen_vec_copy(dstoff, srcoff, size);
+        }
+    }
+
+    gen_helper_commit_hvx_stores(cpu_env);
+}
+
 static void gen_exec_counters(packet_t *pkt)
 {
     int num_insns = pkt->num_insns;
     int num_real_insns = 0;
+    int num_hvx_insns = 0;
     int i;
 
     for (i = 0; i < num_insns; i++) {
@@ -457,6 +613,9 @@ static void gen_exec_counters(packet_t *pkt)
             !GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
             num_real_insns++;
         }
+        if (pkt->insn[i].hvx_resource) {
+            num_hvx_insns++;
+        }
     }
 
     tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
@@ -465,6 +624,10 @@ static void gen_exec_counters(packet_t *pkt)
         tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
                         hex_gpr[HEX_REG_QEMU_INSN_CNT], num_real_insns);
     }
+    if (num_hvx_insns) {
+        tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_HVX_CNT],
+                        hex_gpr[HEX_REG_QEMU_HVX_CNT], num_hvx_insns);
+    }
 }
 
 static void gen_commit_packet(DisasContext *ctx, packet_t *pkt)
@@ -476,6 +639,9 @@ static void gen_commit_packet(DisasContext *ctx, packet_t *pkt)
     process_store_log(ctx, pkt);
     process_dczeroa(ctx, pkt);
     end_tb |= process_change_of_flow(ctx, pkt);
+    if (pkt->pkt_has_hvx) {
+        gen_commit_hvx(ctx);
+    }
     gen_exec_counters(pkt);
 #if HEX_DEBUG
     {
@@ -706,6 +872,14 @@ void hexagon_translate_init(void)
         "is_gather_store_insn");
     hex_gather_issued = tcg_global_mem_new(cpu_env,
         offsetof(CPUHexagonState, gather_issued), "gather_issued");
+    hex_VRegs_updated_tmp = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, VRegs_updated_tmp), "VRegs_updated_tmp");
+    hex_VRegs_updated = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, VRegs_updated), "VRegs_updated");
+    hex_VRegs_select = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, VRegs_select), "VRegs_select");
+    hex_QRegs_updated = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, QRegs_updated), "QRegs_updated");
     for (i = 0; i < STORES_MAX; i++) {
         sprintf(store_addr_names[i], "store_addr_%d", i);
         hex_store_addr[i] = tcg_global_mem_new(cpu_env,
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 260ff3c..6bfc376 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -32,6 +32,14 @@ typedef struct DisasContext {
     int ctx_preg_log[PRED_WRITES_MAX];
     int ctx_preg_log_idx;
     uint8_t ctx_store_width[STORES_MAX];
+    int ctx_temp_vregs_idx;
+    int ctx_temp_qregs_idx;
+    int ctx_vreg_log[NUM_VREGS];
+    int ctx_vreg_is_predicated[NUM_VREGS];
+    int ctx_vreg_log_idx;
+    int ctx_qreg_log[NUM_QREGS];
+    int ctx_qreg_is_predicated[NUM_QREGS];
+    int ctx_qreg_log_idx;
 } DisasContext;
 
 static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
@@ -54,6 +62,22 @@ static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
     ctx->ctx_preg_log_idx++;
 }
 
+static inline void ctx_log_vreg_write(DisasContext *ctx,
+                                      int rnum, int is_predicated)
+{
+    ctx->ctx_vreg_log[ctx->ctx_vreg_log_idx] = rnum;
+    ctx->ctx_vreg_is_predicated[ctx->ctx_vreg_log_idx] = is_predicated;
+    ctx->ctx_vreg_log_idx++;
+}
+
+static inline void ctx_log_qreg_write(DisasContext *ctx,
+                                      int rnum, int is_predicated)
+{
+    ctx->ctx_qreg_log[ctx->ctx_qreg_log_idx] = rnum;
+    ctx->ctx_qreg_is_predicated[ctx->ctx_qreg_log_idx] = is_predicated;
+    ctx->ctx_qreg_log_idx++;
+}
+
 extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
 extern TCGv hex_pred[NUM_PREGS];
 extern TCGv hex_next_PC;
@@ -74,9 +98,15 @@ extern TCGv llsc_val;
 extern TCGv_i64 llsc_val_i64;
 extern TCGv hex_is_gather_store_insn;
 extern TCGv hex_gather_issued;
+extern TCGv hex_VRegs_updated_tmp;
+extern TCGv hex_VRegs_updated;
+extern TCGv hex_VRegs_select;
+extern TCGv hex_QRegs_updated;
 
 void hexagon_translate_init(void);
 extern void gen_exception(int excp);
 extern void gen_exception_debug(void);
 
+extern void gen_memcpy(TCGv_ptr dest, TCGv_ptr src, size_t n);
+
 #endif
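
Illustrative note, not part of the patch: the commit rule that gen_commit_hvx() emits as TCG ops can be modeled in plain C without any QEMU dependency. The sketch below is a host-side model only. The names ModelState, ModelCtx, model_log_vreg_write and model_commit_hvx are hypothetical, and the register count and vector width are assumed values chosen for illustration. It follows the pseudocode comment in the patch: an unpredicated write always copies the staged ("future") value into the register file, while a predicated write copies only if the matching bit in the updated mask was set at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NUM_VREGS  32   /* assumed register count, for illustration */
#define MODEL_VREG_BYTES 128  /* assumed vector width in bytes */

/* Hypothetical stand-in for the HVX part of the CPU state. */
typedef struct {
    uint8_t vregs[MODEL_NUM_VREGS][MODEL_VREG_BYTES];        /* committed */
    uint8_t future_vregs[MODEL_NUM_VREGS][MODEL_VREG_BYTES]; /* staged */
    uint32_t vregs_updated;  /* bit n set: predicated write to Vn happened */
} ModelState;

/* Hypothetical stand-in for the per-packet write log in DisasContext. */
typedef struct {
    int log[MODEL_NUM_VREGS];
    int is_predicated[MODEL_NUM_VREGS];
    int log_idx;
} ModelCtx;

/* Record that the current packet writes register rnum. */
static void model_log_vreg_write(ModelCtx *ctx, int rnum, int is_predicated)
{
    assert(ctx->log_idx < MODEL_NUM_VREGS);
    assert(rnum >= 0 && rnum < MODEL_NUM_VREGS);
    ctx->log[ctx->log_idx] = rnum;
    ctx->is_predicated[ctx->log_idx] = is_predicated;
    ctx->log_idx++;
}

/* Commit every logged write, skipping predicated writes that did not fire. */
static void model_commit_hvx(ModelState *env, const ModelCtx *ctx)
{
    for (int i = 0; i < ctx->log_idx; i++) {
        int rnum = ctx->log[i];

        if (ctx->is_predicated[i] &&
            !(env->vregs_updated & (UINT32_C(1) << rnum))) {
            continue;  /* predicate was false: keep the old value */
        }
        memcpy(env->vregs[rnum], env->future_vregs[rnum], MODEL_VREG_BYTES);
    }
}
```

In the patch itself the same decision has to be made by generated code, because the updated mask is only known when the translated block runs. That is why the predicated path uses tcg_gen_brcondi_tl() with a skip label instead of a C-level if.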