From patchwork Mon Jun 6 22:23:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Simpson
X-Patchwork-Id: 1639646
From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org, ale@rev.ng, bcain@quicinc.com, mlambert@quicinc.com
Subject: [PATCH] Hexagon (target/hexagon) move store size tracking to translation
Date: Mon, 6 Jun 2022 15:23:24 -0700
Message-Id: <20220606222327.7682-3-tsimpson@quicinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220606222327.7682-1-tsimpson@quicinc.com>
References: <20220606222327.7682-1-tsimpson@quicinc.com>

The store width is needed for packet commit, so it is stored in
ctx->store_width.  Currently, it is set when a store has a TCG
override instead of a QEMU helper.
In the QEMU helper case, ctx->store_width is not set, so we invoke a
helper during packet commit that uses the runtime store width.

This patch ensures ctx->store_width is set for all store instructions,
so performance is improved because packet commit can generate the proper
TCG store rather than the generic helper.

We do this as follows:
- Create new attributes to indicate the store size
- During gen_semantics, convert the fSTORE instances to fSTORE<size>
- Assign the new attributes to the new macros
- Add definitions for the new macros
- Use the attributes from the instructions during translation to
  set ctx->store_width
- Remove setting of ctx->store_width from genptr.c

Signed-off-by: Taylor Simpson
---
 target/hexagon/macros.h          | 16 ++++++++++----
 target/hexagon/attribs_def.h.inc |  4 ++++
 target/hexagon/gen_semantics.c   | 26 +++++++++++++++++++++++
 target/hexagon/genptr.c          | 36 +++++++++++---------------------
 target/hexagon/translate.c       | 26 +++++++++++++++++++++++
 5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index a78e84faa4..1d26f59fea 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -139,7 +139,7 @@
         __builtin_choose_expr(TYPE_TCGV(X), \
             gen_store1, (void)0))
 #define MEM_STORE1(VA, DATA, SLOT) \
-    MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE2_FUNC(X) \
     __builtin_choose_expr(TYPE_INT(X), \
@@ -147,7 +147,7 @@
         __builtin_choose_expr(TYPE_TCGV(X), \
             gen_store2, (void)0))
 #define MEM_STORE2(VA, DATA, SLOT) \
-    MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE4_FUNC(X) \
     __builtin_choose_expr(TYPE_INT(X), \
@@ -155,7 +155,7 @@
         __builtin_choose_expr(TYPE_TCGV(X), \
             gen_store4, (void)0))
 #define MEM_STORE4(VA, DATA, SLOT) \
-    MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE8_FUNC(X) \
     __builtin_choose_expr(TYPE_INT(X), \
@@ -163,7 +163,7 @@
         __builtin_choose_expr(TYPE_TCGV_I64(X), \
             gen_store8, (void)0))
 #define MEM_STORE8(VA, DATA, SLOT) \
-    MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 #else
 #define MEM_LOAD1s(VA) ((int8_t)mem_load1(env, slot, VA))
 #define MEM_LOAD1u(VA) ((uint8_t)mem_load1(env, slot, VA))
@@ -600,8 +600,16 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
 
 #ifdef QEMU_GENERATE
 #define fSTORE(NUM, SIZE, EA, SRC) MEM_STORE##SIZE(EA, SRC, insn->slot)
+#define fSTORE1(EA, SRC) MEM_STORE1(EA, SRC, insn->slot)
+#define fSTORE2(EA, SRC) MEM_STORE2(EA, SRC, insn->slot)
+#define fSTORE4(EA, SRC) MEM_STORE4(EA, SRC, insn->slot)
+#define fSTORE8(EA, SRC) MEM_STORE8(EA, SRC, insn->slot)
 #else
 #define fSTORE(NUM, SIZE, EA, SRC) MEM_STORE##SIZE(EA, SRC, slot)
+#define fSTORE1(EA, SRC) MEM_STORE1(EA, SRC, slot)
+#define fSTORE2(EA, SRC) MEM_STORE2(EA, SRC, slot)
+#define fSTORE4(EA, SRC) MEM_STORE4(EA, SRC, slot)
+#define fSTORE8(EA, SRC) MEM_STORE8(EA, SRC, slot)
 #endif
 
 #ifdef QEMU_GENERATE
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index dc890a557f..9c19e08dd7 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -38,6 +38,10 @@ DEF_ATTRIB(SUBINSN, "sub-instruction", "", "")
 /* Load and Store attributes */
 DEF_ATTRIB(LOAD, "Loads from memory", "", "")
 DEF_ATTRIB(STORE, "Stores to memory", "", "")
+DEF_ATTRIB(STORE_SIZE1, "Stores 1 byte to memory", "", "")
+DEF_ATTRIB(STORE_SIZE2, "Stores 2 bytes to memory", "", "")
+DEF_ATTRIB(STORE_SIZE4, "Stores 4 bytes to memory", "", "")
+DEF_ATTRIB(STORE_SIZE8, "Stores 8 bytes to memory", "", "")
 DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
 DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
 
diff --git a/target/hexagon/gen_semantics.c b/target/hexagon/gen_semantics.c
index 4a2bdd70e9..b4bbd66006 100644
--- a/target/hexagon/gen_semantics.c
+++ b/target/hexagon/gen_semantics.c
@@ -78,6 +78,10 @@ int main(int argc, char *argv[])
                          ")\n", \
                 #TAG, STRINGIZE(ATTRIBS)); \
     } while (0);
+
+/* Change the store macros so we can track the size during translation */
+#define fSTORE(NUM, SIZE, EA, SRC) fSTORE##SIZE(EA, SRC)
+
 #include "imported/allidefs.def"
 #undef Q6INSN
 #undef EXTINSN
@@ -101,6 +105,28 @@ int main(int argc, char *argv[])
                      ")\n", \
             #MNAME, STRINGIZE(BEH), STRINGIZE(ATTRS));
 #include "imported/macros.def"
+
+/* These macros give the size of the store used during translation */
+DEF_MACRO(fSTORE1,
+    QEMU_ONLY,
+    (A_STORE, A_MEMLIKE, A_STORE_SIZE1)
+)
+
+DEF_MACRO(fSTORE2,
+    QEMU_ONLY,
+    (A_STORE, A_MEMLIKE, A_STORE_SIZE2)
+)
+
+DEF_MACRO(fSTORE4,
+    QEMU_ONLY,
+    (A_STORE, A_MEMLIKE, A_STORE_SIZE4)
+)
+
+DEF_MACRO(fSTORE8,
+    QEMU_ONLY,
+    (A_STORE, A_MEMLIKE, A_STORE_SIZE8)
+)
+
 #undef DEF_MACRO
 
 /*
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index cd6af4bceb..4e41a94293 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -401,62 +401,50 @@ static inline void gen_store32(TCGv vaddr, TCGv src, int width, int slot)
     tcg_gen_mov_tl(hex_store_val32[slot], src);
 }
 
-static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
     gen_store32(vaddr, src, 1, slot);
-    ctx->store_width[slot] = 1;
 }
 
-static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
     TCGv tmp = tcg_constant_tl(src);
-    gen_store1(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store1(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
     gen_store32(vaddr, src, 2, slot);
-    ctx->store_width[slot] = 2;
 }
 
-static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
     TCGv tmp = tcg_constant_tl(src);
-    gen_store2(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store2(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
     gen_store32(vaddr, src, 4, slot);
-    ctx->store_width[slot] = 4;
 }
 
-static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
     TCGv tmp = tcg_constant_tl(src);
-    gen_store4(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store4(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src, int slot)
 {
     tcg_gen_mov_tl(hex_store_addr[slot], vaddr);
     tcg_gen_movi_tl(hex_store_width[slot], 8);
     tcg_gen_mov_i64(hex_store_val64[slot], src);
-    ctx->store_width[slot] = 8;
 }
 
-static inline void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src, int slot)
 {
     TCGv_i64 tmp = tcg_constant_i64(src);
-    gen_store8(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store8(cpu_env, vaddr, tmp, slot);
 }
 
 static TCGv gen_8bitsof(TCGv result, TCGv value)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index b6f541ecb2..43ceafe98a 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -327,6 +327,31 @@ static void mark_implicit_pred_writes(DisasContext *ctx, Insn *insn)
     mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P3, 3);
 }
 
+static void mark_store_width(DisasContext *ctx, Insn *insn)
+{
+    uint16_t opcode = insn->opcode;
+    uint32_t slot = insn->slot;
+
+    if (GET_ATTRIB(opcode, A_STORE)) {
+        if (GET_ATTRIB(opcode, A_STORE_SIZE1)) {
+            ctx->store_width[slot] = 1;
+            return;
+        }
+        if (GET_ATTRIB(opcode, A_STORE_SIZE2)) {
+            ctx->store_width[slot] = 2;
+            return;
+        }
+        if (GET_ATTRIB(opcode, A_STORE_SIZE4)) {
+            ctx->store_width[slot] = 4;
+            return;
+        }
+        if (GET_ATTRIB(opcode, A_STORE_SIZE8)) {
+            ctx->store_width[slot] = 8;
+            return;
+        }
+    }
+}
+
 static void gen_insn(CPUHexagonState *env, DisasContext *ctx,
                      Insn *insn, Packet *pkt)
 {
@@ -334,6 +359,7 @@ static void gen_insn(CPUHexagonState *env, DisasContext *ctx,
         mark_implicit_reg_writes(ctx, insn);
         insn->generate(env, ctx, insn, pkt);
         mark_implicit_pred_writes(ctx, insn);
+        mark_store_width(ctx, insn);
     } else {
         gen_exception_end_tb(ctx, HEX_EXCP_INVALID_OPCODE);
     }
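
Note for readers: the two mechanisms this patch relies on can be tried in isolation. The sketch below is not QEMU code. It assumes a plain bitmask in place of the real opcode attribute table, and ATTR_*, SEMANTICS_TEXT, rewritten_store_text, store_width_from_attribs and self_check are hypothetical names made up for the illustration. It shows (1) how a token-pasting rewrite of the same shape as the new fSTORE macro turns a sized store into fSTORE4 in the stringized semantics, and (2) how a width can be derived from size attributes in the manner of mark_store_width.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute bits standing in for A_STORE and A_STORE_SIZE<n>. */
enum {
    ATTR_STORE       = 1u << 0,
    ATTR_STORE_SIZE1 = 1u << 1,
    ATTR_STORE_SIZE2 = 1u << 2,
    ATTR_STORE_SIZE4 = 1u << 3,
    ATTR_STORE_SIZE8 = 1u << 4
};

/* Same shape as the rewrite the patch adds to gen_semantics.c. */
#define fSTORE(NUM, SIZE, EA, SRC) fSTORE##SIZE(EA, SRC)

/*
 * The outer macro expands its argument before STRINGIZE sees it, so the
 * string contains the rewritten fSTORE4 rather than the original fSTORE.
 * fSTORE4 is deliberately left undefined here, so expansion stops at it.
 */
#define STRINGIZE(X) #X
#define SEMANTICS_TEXT(SEM) STRINGIZE(SEM)

const char *rewritten_store_text(void)
{
    return SEMANTICS_TEXT(fSTORE(1, 4, EA, RtV));
}

/* Map attribute bits to a store width; 0 means "no sized store". */
int store_width_from_attribs(uint32_t attribs)
{
    if (!(attribs & ATTR_STORE)) {
        return 0;
    }
    if (attribs & ATTR_STORE_SIZE1) {
        return 1;
    }
    if (attribs & ATTR_STORE_SIZE2) {
        return 2;
    }
    if (attribs & ATTR_STORE_SIZE4) {
        return 4;
    }
    if (attribs & ATTR_STORE_SIZE8) {
        return 8;
    }
    return 0;
}

/* Small usage check exercising both pieces. */
void self_check(void)
{
    assert(strstr(rewritten_store_text(), "fSTORE4") != NULL);
    assert(store_width_from_attribs(ATTR_STORE | ATTR_STORE_SIZE2) == 2);
    assert(store_width_from_attribs(ATTR_STORE_SIZE2) == 0);
}
```

The lookup returns 0 when the store attribute is absent or no size attribute is set. That mirrors mark_store_width, which leaves ctx->store_width untouched in those cases.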