From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, peter.maydell@linaro.org, bcain@quicinc.com, quic_mathbern@quicinc.com, stefanha@redhat.com, ale@rev.ng, anjo@rev.ng, quic_mliebel@quicinc.com
Subject: [PULL 20/44] Hexagon (target/hexagon) Short-circuit packet register writes
Date: Fri, 12 May 2023 14:46:42 -0700
Message-Id: <20230512214706.946068-21-tsimpson@quicinc.com>
In-Reply-To: <20230512214706.946068-1-tsimpson@quicinc.com>
References: <20230512214706.946068-1-tsimpson@quicinc.com>
X-Patchwork-Submitter: Taylor Simpson
X-Patchwork-Id: 1780829

In certain cases, we can avoid the overhead of writing to hex_new_value
and write directly to hex_gpr.
We add need_commit field to DisasContext indicating if the end-of-packet
commit is needed. If it is not needed, get_result_gpr() and
get_result_gpr_pair() can return hex_gpr.

We pass the ctx->need_commit to helpers when needed.

Finally, we can early-exit from gen_reg_writes during packet commit.

There are a few instructions whose semantics write to the result before
reading all the inputs. Therefore, the idef-parser generated code is
incompatible with short-circuit. We tell idef-parser to skip them.

For debugging purposes, we add a cpu property to turn off short-circuit.
When the short-circuit property is false, we skip the analysis and force
the end-of-packet commit.

Here's a simple example of the TCG generated for
0x004000b4: 0x7800c020 { R0 = #0x1 }

BEFORE:
 ---- 004000b4
 movi_i32 new_r0,$0x1
 mov_i32 r0,new_r0

AFTER:
 ---- 004000b4
 movi_i32 r0,$0x1

This patch reintroduces a use of check_for_attrib, so we remove the
G_GNUC_UNUSED added earlier in this series.

Signed-off-by: Taylor Simpson
Reviewed-by: Richard Henderson
Reviewed-by: Brian Cain
Message-Id: <20230427230012.3800327-12-tsimpson@quicinc.com>
---
 target/hexagon/cpu.h                    |  1 +
 target/hexagon/gen_tcg.h                |  3 +-
 target/hexagon/genptr.h                 |  2 +
 target/hexagon/helper.h                 |  2 +-
 target/hexagon/macros.h                 | 13 ++++-
 target/hexagon/translate.h              |  2 +
 target/hexagon/arch.c                   |  3 +-
 target/hexagon/cpu.c                    |  3 ++
 target/hexagon/genptr.c                 | 30 ++++-------
 target/hexagon/op_helper.c              |  5 +-
 target/hexagon/translate.c              | 67 ++++++++++++++++++++++++-
 target/hexagon/gen_helper_funcs.py      |  2 +
 target/hexagon/gen_helper_protos.py     | 10 +++-
 target/hexagon/gen_idef_parser_funcs.py |  7 +++
 target/hexagon/gen_tcg_funcs.py         |  5 ++
 target/hexagon/hex_common.py            |  3 ++
 16 files changed, 128 insertions(+), 30 deletions(-)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 4d8981d862..631bfdbe9c 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -150,6 +150,7 @@ struct ArchCPU {
 
     bool lldb_compat;
     target_ulong lldb_stack_adjust;
+    bool short_circuit;
 };
 
 #include "cpu_bits.h"
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 099a6cc47f..7e070c35bd 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -592,7 +592,8 @@
 #define fGEN_TCG_A5_ACS(SHORTCODE) \
     do { \
         gen_helper_vacsh_pred(PeV, cpu_env, RxxV, RssV, RttV); \
-        gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV); \
+        gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV, \
+                             tcg_constant_tl(ctx->need_commit)); \
     } while (0)
 
 #define fGEN_TCG_S2_cabacdecbin(SHORTCODE) \
diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
index 75d0fc262d..420867f934 100644
--- a/target/hexagon/genptr.h
+++ b/target/hexagon/genptr.h
@@ -58,4 +58,6 @@ void gen_set_half(int N, TCGv result, TCGv src);
 void gen_set_half_i64(int N, TCGv_i64 result, TCGv src);
 void probe_noshuf_load(TCGv va, int s, int mi);
 
+extern const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS];
+
 #endif
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 73849e3d49..4b750d0351 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -29,7 +29,7 @@ DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
 DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
 DEF_HELPER_2(sfinvsqrta, i64, env, f32)
-DEF_HELPER_4(vacsh_val, s64, env, s64, s64, s64)
+DEF_HELPER_5(vacsh_val, s64, env, s64, s64, s64, i32)
 DEF_HELPER_FLAGS_4(vacsh_pred, TCG_CALL_NO_RWG_SE, s32, env, s64, s64, s64)
 DEF_HELPER_FLAGS_2(cabacdecbin_val, TCG_CALL_NO_RWG_SE, s64, s64, s64)
 DEF_HELPER_FLAGS_2(cabacdecbin_pred, TCG_CALL_NO_RWG_SE, s32, s64, s64)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 24c78fe80a..54562cccb0 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -44,8 +44,17 @@
                  reg_field_info[FIELD].offset)
 
 #define SET_USR_FIELD(FIELD, VAL) \
-    fINSERT_BITS(env->new_value[HEX_REG_USR], reg_field_info[FIELD].width, \
-                 reg_field_info[FIELD].offset, (VAL))
+    do { \
+        if (pkt_need_commit) { \
+            fINSERT_BITS(env->new_value[HEX_REG_USR], \
+                         reg_field_info[FIELD].width, \
+                         reg_field_info[FIELD].offset, (VAL)); \
+        } else { \
+            fINSERT_BITS(env->gpr[HEX_REG_USR], \
+                         reg_field_info[FIELD].width, \
+                         reg_field_info[FIELD].offset, (VAL)); \
+        } \
+    } while (0)
 #endif
 
 #ifdef QEMU_GENERATE
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index f72228859f..3f6fd3452c 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -62,10 +62,12 @@ typedef struct DisasContext {
     int qreg_log_idx;
     DECLARE_BITMAP(qregs_read, NUM_QREGS);
     bool pre_commit;
+    bool need_commit;
     TCGCond branch_cond;
     target_ulong branch_dest;
     bool is_tight_loop;
     bool need_pkt_has_store_s1;
+    bool short_circuit;
 } DisasContext;
 
 static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index da79b41c4d..d053d68487 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -224,6 +224,7 @@ void arch_fpop_start(CPUHexagonState *env)
 
 void arch_fpop_end(CPUHexagonState *env)
 {
+    const bool pkt_need_commit = true;
     int flags = get_float_exception_flags(&env->fp_status);
     if (flags != 0) {
         SOFTFLOAT_TEST_FLAG(float_flag_inexact, FPINPF, FPINPE);
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index c78fe25c9f..d4dfc382ab 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -54,6 +54,8 @@ static Property hexagon_lldb_compat_property =
 static Property hexagon_lldb_stack_adjust_property =
     DEFINE_PROP_UNSIGNED("lldb-stack-adjust", HexagonCPU, lldb_stack_adjust, 0,
                          qdev_prop_uint32, target_ulong);
+static Property hexagon_short_circuit_property =
+    DEFINE_PROP_BOOL("short-circuit", HexagonCPU, short_circuit, true);
 
 const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
    "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
@@ -330,6 +332,7 @@ static void hexagon_cpu_init(Object *obj)
     cpu_set_cpustate_pointers(cpu);
     qdev_property_add_static(DEVICE(obj), &hexagon_lldb_compat_property);
     qdev_property_add_static(DEVICE(obj), &hexagon_lldb_stack_adjust_property);
+    qdev_property_add_static(DEVICE(obj), &hexagon_short_circuit_property);
 }
 
 #include "hw/core/tcg-cpu-ops.h"
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 3c7e0dafaf..9858d7bc35 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -45,7 +45,7 @@ TCGv gen_read_preg(TCGv pred, uint8_t num)
 
 #define IMMUTABLE (~0)
 
-static const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
+const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
     [HEX_REG_USR] = 0xc13000c0,
     [HEX_REG_PC] = IMMUTABLE,
     [HEX_REG_GP] = 0x3f,
@@ -70,14 +70,18 @@ static inline void gen_masked_reg_write(TCGv new_val, TCGv cur_val,
 
 static TCGv get_result_gpr(DisasContext *ctx, int rnum)
 {
-    return hex_new_value[rnum];
+    if (ctx->need_commit) {
+        return hex_new_value[rnum];
+    } else {
+        return hex_gpr[rnum];
+    }
 }
 
 static TCGv_i64 get_result_gpr_pair(DisasContext *ctx, int rnum)
 {
     TCGv_i64 result = tcg_temp_new_i64();
-    tcg_gen_concat_i32_i64(result, hex_new_value[rnum],
-                           hex_new_value[rnum + 1]);
+    tcg_gen_concat_i32_i64(result, get_result_gpr(ctx, rnum),
+                           get_result_gpr(ctx, rnum + 1));
     return result;
 }
 
@@ -86,7 +90,7 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
     const target_ulong reg_mask = reg_immut_masks[rnum];
 
     gen_masked_reg_write(val, hex_gpr[rnum], reg_mask);
-    tcg_gen_mov_tl(hex_new_value[rnum], val);
+    tcg_gen_mov_tl(get_result_gpr(ctx, rnum), val);
     if (HEX_DEBUG) {
         /* Do this so HELPER(debug_commit_end) will know */
         tcg_gen_movi_tl(hex_reg_written[rnum], 1);
@@ -95,27 +99,15 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
 
 static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
 {
-    const target_ulong reg_mask_low = reg_immut_masks[rnum];
-    const target_ulong reg_mask_high = reg_immut_masks[rnum + 1];
     TCGv val32 = tcg_temp_new();
 
     /* Low word */
     tcg_gen_extrl_i64_i32(val32, val);
-    gen_masked_reg_write(val32, hex_gpr[rnum], reg_mask_low);
-    tcg_gen_mov_tl(hex_new_value[rnum], val32);
-    if (HEX_DEBUG) {
-        /* Do this so HELPER(debug_commit_end) will know */
-        tcg_gen_movi_tl(hex_reg_written[rnum], 1);
-    }
+    gen_log_reg_write(ctx, rnum, val32);
 
     /* High word */
     tcg_gen_extrh_i64_i32(val32, val);
-    gen_masked_reg_write(val32, hex_gpr[rnum + 1], reg_mask_high);
-    tcg_gen_mov_tl(hex_new_value[rnum + 1], val32);
-    if (HEX_DEBUG) {
-        /* Do this so HELPER(debug_commit_end) will know */
-        tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
-    }
+    gen_log_reg_write(ctx, rnum + 1, val32);
 }
 
 void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 46ccc59106..fc5c30a141 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -220,7 +220,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
                 reg_printed = true;
             }
             HEX_DEBUG_LOG("\tr%d = " TARGET_FMT_ld " (0x" TARGET_FMT_lx ")\n",
-                          i, env->new_value[i], env->new_value[i]);
+                          i, env->gpr[i], env->gpr[i]);
         }
     }
 
@@ -352,7 +352,8 @@ uint64_t HELPER(sfinvsqrta)(CPUHexagonState *env, float32 RsV)
 }
 
 int64_t HELPER(vacsh_val)(CPUHexagonState *env,
-                          int64_t RxxV, int64_t RssV, int64_t RttV)
+                          int64_t RxxV, int64_t RssV, int64_t RttV,
+                          uint32_t pkt_need_commit)
 {
     for (int i = 0; i < 4; i++) {
         int xv = sextract64(RxxV, i * 16, 16);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index e84bd34618..6fa885cf16 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -27,6 +27,7 @@
 #include "insn.h"
 #include "decode.h"
 #include "translate.h"
+#include "genptr.h"
 #include "printinsn.h"
 
 #include "analyze_funcs_generated.c.inc"
@@ -239,7 +240,7 @@ static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
     return nwords;
 }
 
-static G_GNUC_UNUSED bool check_for_attrib(Packet *pkt, int attrib)
+static bool check_for_attrib(Packet *pkt, int attrib)
 {
     for (int i = 0; i < pkt->num_insns; i++) {
         if (GET_ATTRIB(pkt->insn[i].opcode, attrib)) {
@@ -336,6 +337,58 @@ static void mark_implicit_pred_writes(DisasContext *ctx)
     mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3);
 }
 
+static bool pkt_raises_exception(Packet *pkt)
+{
+    if (check_for_attrib(pkt, A_LOAD) ||
+        check_for_attrib(pkt, A_STORE)) {
+        return true;
+    }
+    return false;
+}
+
+static bool need_commit(DisasContext *ctx)
+{
+    Packet *pkt = ctx->pkt;
+
+    /*
+     * If the short-circuit property is set to false, we'll always do the commit
+     */
+    if (!ctx->short_circuit) {
+        return true;
+    }
+
+    if (pkt_raises_exception(pkt)) {
+        return true;
+    }
+
+    /* Registers with immutability flags require new_value */
+    for (int i = 0; i < ctx->reg_log_idx; i++) {
+        int rnum = ctx->reg_log[i];
+        if (reg_immut_masks[rnum]) {
+            return true;
+        }
+    }
+
+    /* Floating point instructions are hard-coded to use new_value */
+    if (check_for_attrib(pkt, A_FPOP)) {
+        return true;
+    }
+
+    if (pkt->num_insns == 1) {
+        return false;
+    }
+
+    /* Check for overlap between register reads and writes */
+    for (int i = 0; i < ctx->reg_log_idx; i++) {
+        int rnum = ctx->reg_log[i];
+        if (test_bit(rnum, ctx->regs_read)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
 static void mark_implicit_pred_read(DisasContext *ctx, int attrib, int pnum)
 {
     if (GET_ATTRIB(ctx->insn->opcode, attrib)) {
@@ -365,6 +418,8 @@ static void analyze_packet(DisasContext *ctx)
         mark_implicit_pred_writes(ctx);
         mark_implicit_pred_reads(ctx);
     }
+
+    ctx->need_commit = need_commit(ctx);
 }
 
 static void gen_start_packet(DisasContext *ctx)
@@ -434,7 +489,8 @@ static void gen_start_packet(DisasContext *ctx)
     }
 
     /* Preload the predicated registers into hex_new_value[i] */
-    if (!bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
+    if (ctx->need_commit &&
+        !bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
         int i = find_first_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS);
         while (i < TOTAL_PER_THREAD_REGS) {
             tcg_gen_mov_tl(hex_new_value[i], hex_gpr[i]);
@@ -544,6 +600,11 @@ static void gen_reg_writes(DisasContext *ctx)
 {
     int i;
 
+    /* Early exit if not needed */
+    if (!ctx->need_commit) {
+        return;
+    }
+
     for (i = 0; i < ctx->reg_log_idx; i++) {
         int reg_num = ctx->reg_log[i];
 
@@ -922,6 +983,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
                                           CPUState *cs)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
+    HexagonCPU *hex_cpu = env_archcpu(cs->env_ptr);
     uint32_t hex_flags = dcbase->tb->flags;
 
     ctx->mem_idx = MMU_USER_IDX;
@@ -930,6 +992,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
     ctx->num_hvx_insns = 0;
     ctx->branch_cond = TCG_COND_NEVER;
     ctx->is_tight_loop = FIELD_EX32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP);
+    ctx->short_circuit = hex_cpu->short_circuit;
 }
 
 static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index c73d792580..e259ea3d03 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -287,6 +287,8 @@ def gen_helper_function(f, tag, tagregs, tagimms):
 
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write(", uint32_t pkt_has_multi_cof")
+        if (hex_common.need_pkt_need_commit(tag)):
+            f.write(", uint32_t pkt_need_commit")
 
         if hex_common.need_PC(tag):
             if i > 0:
diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
index 187cd6e04e..c5ecb85294 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -86,6 +86,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
             def_helper_size = len(regs) + len(imms) + numscalarreadwrite + 1
             if hex_common.need_pkt_has_multi_cof(tag):
                 def_helper_size += 1
+            if hex_common.need_pkt_need_commit(tag):
+                def_helper_size += 1
             if hex_common.need_part1(tag):
                 def_helper_size += 1
             if hex_common.need_slot(tag):
@@ -103,6 +105,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
             def_helper_size = len(regs) + len(imms) + numscalarreadwrite
             if hex_common.need_pkt_has_multi_cof(tag):
                 def_helper_size += 1
+            if hex_common.need_pkt_need_commit(tag):
+                def_helper_size += 1
             if hex_common.need_part1(tag):
                 def_helper_size += 1
             if hex_common.need_slot(tag):
@@ -156,10 +160,12 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
         for immlett, bits, immshift in imms:
             f.write(", s32")
 
-        ## Add the arguments for the instruction pkt_has_multi_cof, slot and
-        ## part1 (if needed)
+        ## Add the arguments for the instruction pkt_has_multi_cof,
+        ## pkt_needs_commit, PC, next_PC, slot, and part1 (if needed)
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write(", i32")
+        if hex_common.need_pkt_need_commit(tag):
+            f.write(', i32')
         if hex_common.need_PC(tag):
             f.write(", i32")
         if hex_common.helper_needs_next_PC(tag):
diff --git a/target/hexagon/gen_idef_parser_funcs.py b/target/hexagon/gen_idef_parser_funcs.py
index dc9e396b52..ad2e5c04d3 100644
--- a/target/hexagon/gen_idef_parser_funcs.py
+++ b/target/hexagon/gen_idef_parser_funcs.py
@@ -111,6 +111,13 @@ def main():
                 continue
             if ( tag.startswith('R6_release_') ):
                 continue
+            ## Skip instructions that are incompatible with short-circuit
+            ## packet register writes
+            if ( tag == 'S2_insert' or
+                 tag == 'S2_insert_rp' or
+                 tag == 'S2_asr_r_svw_trun' or
+                 tag == 'A2_swiz' ):
+                continue
 
             regs = tagregs[tag]
             imms = tagimms[tag]
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index d9ccbe63f6..0e45d43685 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -550,6 +550,9 @@ def gen_tcg_func(f, tag, regs, imms):
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write("    TCGv pkt_has_multi_cof = ")
             f.write("tcg_constant_tl(ctx->pkt->pkt_has_multi_cof);\n")
+        if hex_common.need_pkt_need_commit(tag):
+            f.write("    TCGv pkt_need_commit = ")
+            f.write("tcg_constant_tl(ctx->need_commit);\n")
         if hex_common.need_part1(tag):
             f.write("    TCGv part1 = tcg_constant_tl(insn->part1);\n")
         if hex_common.need_slot(tag):
@@ -596,6 +599,8 @@ def gen_tcg_func(f, tag, regs, imms):
 
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write(", pkt_has_multi_cof")
+        if hex_common.need_pkt_need_commit(tag):
+            f.write(", pkt_need_commit")
         if hex_common.need_PC(tag):
             f.write(", PC")
         if hex_common.helper_needs_next_PC(tag):
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 232c6e2c20..29c0508f66 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -276,6 +276,9 @@ def need_pkt_has_multi_cof(tag):
     return "A_COF" in attribdict[tag]
 
 
+def need_pkt_need_commit(tag):
+    return 'A_IMPLICIT_WRITES_USR' in attribdict[tag]
+
 def need_condexec_reg(tag, regs):
     if "A_CONDEXEC" in attribdict[tag]:
         for regtype, regid, toss, numregs in regs:
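
As a reading aid (not part of the patch), the decision made by need_commit() in target/hexagon/translate.c can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions: PacketModel, its field names, IMMUT_MASKS, and the use of register-name strings are hypothetical stand-ins for the Packet/DisasContext state. Only the order of the checks and the two mask values come from the diff above.

```python
from dataclasses import dataclass, field

# Two masks copied from reg_immut_masks in genptr.c. Treating every other
# register as 0 is a simplification made for this sketch.
IMMUT_MASKS = {"usr": 0xC13000C0, "gp": 0x3F}


@dataclass
class PacketModel:
    """Hypothetical stand-in for the state need_commit() consults."""
    num_insns: int
    regs_written: list = field(default_factory=list)  # models ctx->reg_log
    regs_read: set = field(default_factory=set)       # models ctx->regs_read
    has_load_or_store: bool = False                   # A_LOAD or A_STORE present
    has_fpop: bool = False                            # A_FPOP present


def need_commit(pkt, short_circuit=True, immut_masks=IMMUT_MASKS):
    """Return True when writes must go via new_value and be committed later."""
    if not short_circuit:
        return True       # the short-circuit property is off: always commit
    if pkt.has_load_or_store:
        return True       # pkt_raises_exception() returns true for these
    if any(immut_masks.get(r, 0) for r in pkt.regs_written):
        return True       # registers with immutable bits need new_value
    if pkt.has_fpop:
        return True       # floating point is hard-coded to use new_value
    if pkt.num_insns == 1:
        return False      # a single instruction can write hex_gpr directly
    # Multi-instruction packet: commit only if a written register is also read
    return any(r in pkt.regs_read for r in pkt.regs_written)


# { R0 = #0x1 }, the example from the commit message: no commit needed
print(need_commit(PacketModel(1, ["r0"])))                       # False
# { R0 = R1; R1 = R0 }: reads and writes overlap, so commit
print(need_commit(PacketModel(2, ["r0", "r1"], {"r0", "r1"})))   # True
```

In this model, as in the C code, the single-instruction check comes before the read/write overlap check, so a one-instruction packet that reads and writes the same register still short-circuits.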