From patchwork Tue Mar 19 11:28:35 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mateja Marjanovic
X-Patchwork-Id: 1058358
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Tue, 19 Mar 2019 12:28:35 +0100
Message-Id: <1552994915-7185-3-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1552994915-7185-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1552994915-7185-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH 2/2] target/mips: Optimize ILVEV. MSA instructions
Cc: arikalo@wavecomp.com, amarkovic@wavecomp.com, aurelien@aurel32.net

From: Mateja Marjanovic

Optimize the ILVEV set of MSA instructions by operating directly on TCG
registers (mask, shift and OR on the 64-bit halves) instead of calling
helpers.

instr    ||   before    ||   after
======================================
ilvev.b  ||  126.92 ms  ||  26.41 ms
ilvev.h  ||   93.67 ms  ||  25.79 ms
ilvev.w  ||  117.86 ms  ||  24.42 ms
ilvev.d  ||   45.49 ms  ||  20.28 ms

Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   1 -
 target/mips/msa_helper.c |   9 ----
 target/mips/translate.c  | 111 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 110 insertions(+), 11 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index d162836..2f23b0d 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -864,7 +864,6 @@ DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index 9e52a31..a500c59 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1197,15 +1197,6 @@ MSA_FN_DF(ilvl_df)
     } while (0)
 MSA_FN_DF(ilvr_df)
 #undef MSA_DO
-
-#define MSA_DO(DF)                          \
-    do {                                    \
-        pwx->DF[2*i]   = pwt->DF[2*i];      \
-        pwx->DF[2*i+1] = pws->DF[2*i];      \
-    } while (0)
-MSA_FN_DF(ilvev_df)
-#undef MSA_DO
-
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 10c5c55..d65db46 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28978,6 +28978,100 @@ static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
 }
 
+/*
+ * [MSA] ILVEV.B wd, ws, wt
+ *
+ * Vector Interleave Even (byte data elements)
+ *
+ */
+static inline void gen_ilvev_b(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    const uint64_t mask = 0x00ff00ff00ff00ffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_andi_i64(t0, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t0, t0, 8);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t0);
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_andi_i64(t0, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t0, t0, 8);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t0);
+
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+}
+
+/*
+ * [MSA] ILVEV.H wd, ws, wt
+ *
+ * Vector Interleave Even (halfword data elements)
+ *
+ */
+static inline void gen_ilvev_h(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    const uint64_t mask = 0x0000ffff0000ffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_andi_i64(t0, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t0, t0, 16);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t0);
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_andi_i64(t0, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t0, t0, 16);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t0);
+
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+}
+
+/*
+ * [MSA] ILVEV.W wd, ws, wt
+ *
+ * Vector Interleave Even (word data elements)
+ *
+ */
+static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    const uint64_t mask = 0x00000000ffffffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_andi_i64(t0, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t0, t0, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t0);
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_andi_i64(t0, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t0, t0, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t0);
+
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+}
+
+/*
+ * [MSA] ILVEV.D wd, ws, wt
+ *
+ * Vector Interleave Even (doubleword data elements)
+ *
+ */
+static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
+}
+
 static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
 {
 #define MASK_MSA_3R(op)    (MASK_MSA_MINOR(op) | (op & (0x7 << 23)))
@@ -29134,7 +29228,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_mod_s_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVEV_df:
-        gen_helper_msa_ilvev_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvev_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvev_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvev_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvev_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BINSR_df:
         gen_helper_msa_binsr_df(cpu_env, tdf, twd, tws, twt);
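
Note for reviewers: the mask/shift/OR formulation can be checked outside QEMU
against the element-wise semantics of the removed helper macro. The following
is only a rough stand-alone sketch, not QEMU code. The names wr_model, get_b,
set_b, ilvev_b_ref, ilvev_b_mask, ilvev_d_inplace and ilvev_selfcheck are
hypothetical; wr_model stands in for the two msa_wr_d[] halves of one register,
and the test values are arbitrary. One case worth checking by hand is ILVEV.D
when wd names the same register as ws: the upper half of wd must receive the
original ws[0], so ws[0] has to be read before the lower half of wd is
overwritten.

```c
#include <stdint.h>

/* Hypothetical model of one 128-bit MSA register as two 64-bit halves,
 * mirroring the msa_wr_d[reg * 2] / msa_wr_d[reg * 2 + 1] indexing. */
typedef struct { uint64_t d[2]; } wr_model;

/* Read byte element i (0..15), counting from the least significant byte. */
static uint8_t get_b(const wr_model *w, int i)
{
    return (uint8_t)(w->d[i / 8] >> (8 * (i % 8)));
}

/* Write byte element i (0..15). */
static void set_b(wr_model *w, int i, uint8_t v)
{
    int sh = 8 * (i % 8);
    w->d[i / 8] = (w->d[i / 8] & ~(0xffULL << sh)) | ((uint64_t)v << sh);
}

/* Reference ILVEV.B: wd[2i] = wt[2i], wd[2i+1] = ws[2i], as in the
 * helper macro that the patch removes. */
static wr_model ilvev_b_ref(wr_model ws, wr_model wt)
{
    wr_model wd = { { 0, 0 } };
    for (int i = 0; i < 8; i++) {
        set_b(&wd, 2 * i, get_b(&wt, 2 * i));
        set_b(&wd, 2 * i + 1, get_b(&ws, 2 * i));
    }
    return wd;
}

/* Same operation with the mask/shift/OR sequence used by gen_ilvev_b. */
static wr_model ilvev_b_mask(wr_model ws, wr_model wt)
{
    const uint64_t mask = 0x00ff00ff00ff00ffULL;
    wr_model wd;
    for (int h = 0; h < 2; h++) {
        uint64_t t1 = wt.d[h] & mask;         /* even bytes of wt stay put */
        uint64_t t0 = (ws.d[h] & mask) << 8;  /* even bytes of ws move up  */
        wd.d[h] = t1 | t0;
    }
    return wd;
}

/* ILVEV.D performed in place on a small register file. The upper half is
 * written first so that ws[0] is read before wd[0] is overwritten. */
static void ilvev_d_inplace(wr_model *regs, int wd, int ws, int wt)
{
    regs[wd].d[1] = regs[ws].d[0];
    regs[wd].d[0] = regs[wt].d[0];
}

/* Returns 1 if both checks pass, 0 otherwise. */
static int ilvev_selfcheck(void)
{
    wr_model ws = { { 0x0123456789abcdefULL, 0xfedcba9876543210ULL } };
    wr_model wt = { { 0x0f1e2d3c4b5a6978ULL, 0x8796a5b4c3d2e1f0ULL } };
    wr_model a = ilvev_b_ref(ws, wt);
    wr_model b = ilvev_b_mask(ws, wt);
    wr_model regs[2];

    if (a.d[0] != b.d[0] || a.d[1] != b.d[1]) {
        return 0;
    }

    regs[0] = ws;
    regs[1] = wt;
    ilvev_d_inplace(regs, 0, 0, 1);  /* wd aliases ws */
    return regs[0].d[0] == wt.d[0] && regs[0].d[1] == ws.d[0];
}
```

Calling ilvev_selfcheck() from a small main() and checking for a return value
of 1 is enough to exercise both cases. The halfword and word variants can be
checked the same way by changing the mask and the shift amount.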