From patchwork Thu Jun 10 07:58:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490224 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xH35VkXz9sPf for ; Thu, 10 Jun 2021 18:00:31 +1000 (AEST) Received: from localhost ([::1]:36778 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFc5-0003yt-G6 for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:00:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40164) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFbV-0003tm-Jp; Thu, 10 Jun 2021 03:59:53 -0400 Received: from out28-3.mail.aliyun.com ([115.124.28.3]:37139) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFbS-0005xl-VE; Thu, 10 Jun 2021 03:59:53 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436305|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_regular_dialog|0.00422116-8.37222e-05-0.995695; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047209; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=8; RT=7; SR=0; TI=SMTPD_---.KQMMEgG_1623311984; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMMEgG_1623311984) by smtp.aliyun-inc.com(10.147.41.187); Thu, 10 Jun 2021 15:59:44 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 01/37] target/riscv: 
implementation-defined constant parameters
Date: Thu, 10 Jun 2021 15:58:32 +0800
Message-Id: <20210610075908.3305506-2-zhiwei_liu@c-sky.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
MIME-Version: 1.0
Received-SPF: none client-ip=115.124.28.3; envelope-from=zhiwei_liu@c-sky.com; helo=out28-3.mail.aliyun.com
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis, LIU Zhiwei
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

ext_psfoperand indicates whether the Zpsfoperand sub-extension is supported.
pext_ver is the packed-extension specification version; it defaults to v0.9.4.
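As a rough illustration of the version selection described above, the following C sketch maps a spec string to the version constant. It is a minimal sketch, not the code in this patch: the helper name `parse_pext_spec` and its signature are hypothetical, and only `PEXT_VERSION_0_09_4` and the string "v0.9.4" are taken from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Version encoding as defined in cpu.h by this patch. */
#define PEXT_VERSION_0_09_4 0x00000904

/*
 * Hypothetical helper (not part of the patch): translate a user-supplied
 * spec string into a version constant. A NULL string selects the default
 * version, mirroring the "not specified" branch in riscv_cpu_realize().
 * Returns false for an unsupported version string.
 */
static bool parse_pext_spec(const char *spec, uint32_t *version)
{
    assert(version != NULL);

    if (spec == NULL || strcmp(spec, "v0.9.4") == 0) {
        *version = PEXT_VERSION_0_09_4;
        return true;
    }
    return false;
}
```

In the patch itself the unsupported case is reported through error_setg() and realize returns early; the boolean return here only stands in for that behaviour.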
Signed-off-by: LIU Zhiwei
Reviewed-by: Alistair Francis
---
 target/riscv/cpu.c       | 31 +++++++++++++++++++++++++++++++
 target/riscv/cpu.h       |  6 ++++++
 target/riscv/translate.c |  2 ++
 3 files changed, 39 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 991a6bb760..9d8cf60a1c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -137,6 +137,11 @@ static void set_vext_version(CPURISCVState *env, int vext_ver)
     env->vext_ver = vext_ver;
 }
 
+static void set_pext_version(CPURISCVState *env, int pext_ver)
+{
+    env->pext_ver = pext_ver;
+}
+
 static void set_feature(CPURISCVState *env, int feature)
 {
     env->features |= (1ULL << feature);
@@ -395,6 +400,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
     int priv_version = PRIV_VERSION_1_11_0;
     int bext_version = BEXT_VERSION_0_93_0;
     int vext_version = VEXT_VERSION_0_07_1;
+    int pext_version = PEXT_VERSION_0_09_4;
     target_ulong target_misa = env->misa;
     Error *local_err = NULL;
 
@@ -420,6 +426,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
     set_priv_version(env, priv_version);
     set_bext_version(env, bext_version);
     set_vext_version(env, vext_version);
+    set_pext_version(env, pext_version);
 
     if (cpu->cfg.mmu) {
         set_feature(env, RISCV_FEATURE_MMU);
@@ -553,6 +560,30 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
             }
             set_vext_version(env, vext_version);
         }
+        if (cpu->cfg.ext_p) {
+            target_misa |= RVP;
+            if (cpu->cfg.pext_spec) {
+                if (!g_strcmp0(cpu->cfg.pext_spec, "v0.9.4")) {
+                    pext_version = PEXT_VERSION_0_09_4;
+                } else {
+                    error_setg(errp,
+                               "Unsupported packed spec version '%s'",
+                               cpu->cfg.pext_spec);
+                    return;
+                }
+            } else {
+                qemu_log("packed version is not specified, "
+                         "use the default value v0.9.4\n");
+            }
+            if (env->misa == RV64) {
+                if (!cpu->cfg.ext_psfoperand) {
+                    error_setg(errp, "The Zpsfoperand "
+                               "sub-extension is required for RV64P.");
+                    return;
+                }
+            }
+            set_pext_version(env, pext_version);
+        }
 
         set_misa(env, target_misa);
     }
diff --git
a/target/riscv/cpu.h b/target/riscv/cpu.h index bf1c899c00..4d20afb267 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -63,6 +63,7 @@ #define RVF RV('F') #define RVD RV('D') #define RVV RV('V') +#define RVP RV('P') #define RVC RV('C') #define RVS RV('S') #define RVU RV('U') @@ -85,6 +86,7 @@ enum { #define BEXT_VERSION_0_93_0 0x00009300 #define VEXT_VERSION_0_07_1 0x00000701 +#define PEXT_VERSION_0_09_4 0x00000904 enum { TRANSLATE_SUCCESS, @@ -135,6 +137,7 @@ struct CPURISCVState { target_ulong priv_ver; target_ulong bext_ver; target_ulong vext_ver; + target_ulong pext_ver; target_ulong misa; target_ulong misa_mask; @@ -293,14 +296,17 @@ struct RISCVCPU { bool ext_u; bool ext_h; bool ext_v; + bool ext_p; bool ext_counters; bool ext_ifencei; bool ext_icsr; + bool ext_psfoperand; char *priv_spec; char *user_spec; char *bext_spec; char *vext_spec; + char *pext_spec; uint16_t vlen; uint16_t elen; bool mmu; diff --git a/target/riscv/translate.c b/target/riscv/translate.c index c6e8739614..0e6ede4d71 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -56,6 +56,7 @@ typedef struct DisasContext { to reset this known value. 
*/ int frm; bool ext_ifencei; + bool ext_psfoperand; bool hlsx; /* vector extension */ bool vill; @@ -965,6 +966,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs) ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL); ctx->mlen = 1 << (ctx->sew + 3 - ctx->lmul); ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX); + ctx->ext_psfoperand = cpu->cfg.ext_psfoperand; ctx->cs = cs; } From patchwork Thu Jun 10 07:58:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490225 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xJd63Pbz9sPf for ; Thu, 10 Jun 2021 18:01:53 +1000 (AEST) Received: from localhost ([::1]:40930 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFdP-0006tS-Pc for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:01:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40262) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFc2-0005EZ-9X; Thu, 10 Jun 2021 04:00:26 -0400 Received: from out28-50.mail.aliyun.com ([115.124.28.50]:33130) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFbx-0006Iy-SU; Thu, 10 Jun 2021 04:00:26 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436282|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_regular_dialog|0.0644709-0.0018413-0.933688; 
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Subject: [PATCH v2 02/37] target/riscv: Make the vector helper functions public
Date: Thu, 10 Jun 2021 15:58:33 +0800
Message-Id: <20210610075908.3305506-3-zhiwei_liu@c-sky.com>
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis, LIU Zhiwei

The saturating add, subtract, and shift functions can also be used by the
packed extension, so hoist them into internals.h. The endianness fixup macros
are hoisted as well.
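For readers unfamiliar with the saturating helpers being hoisted, here is a minimal standalone sketch of 16-bit saturating addition. It reuses the overflow tests visible in the diff below (`(res ^ a) & (res ^ b) & INT16_MIN` for signed, `res < a` for unsigned). The `SatState` type, the function names, and the `bool` flag are assumptions made for illustration; the real helpers take a `CPURISCVState *` and a `vxrm` argument instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the sticky saturation flag kept in CPU state. */
typedef struct {
    bool sat;
} SatState;

/* Signed 16-bit saturating add, using the same overflow test as sadd16. */
static int16_t sat_add_s16(SatState *st, int16_t a, int16_t b)
{
    /* The sum is computed in int and truncated back to 16 bits. */
    int16_t res = (int16_t)(a + b);

    /* Overflow occurred if the result's sign differs from both operands. */
    if ((res ^ a) & (res ^ b) & INT16_MIN) {
        res = a > 0 ? INT16_MAX : INT16_MIN;
        st->sat = true;
    }
    return res;
}

/* Unsigned 16-bit saturating add, using the same overflow test as saddu16. */
static uint16_t sat_add_u16(SatState *st, uint16_t a, uint16_t b)
{
    uint16_t res = (uint16_t)(a + b);

    /* A wrapped unsigned sum is always smaller than either operand. */
    if (res < a) {
        res = UINT16_MAX;
        st->sat = true;
    }
    return res;
}

/* Small usage example exercising both the clamped and unclamped paths. */
static void sat_demo(void)
{
    SatState st = { .sat = false };

    assert(sat_add_s16(&st, 100, 200) == 300 && !st.sat);
    assert(sat_add_s16(&st, INT16_MAX, 1) == INT16_MAX && st.sat);
}
```

Keeping the flag in a small struct passed by pointer mirrors how the real helpers record saturation in CPU state rather than returning it.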
Signed-off-by: LIU Zhiwei Reviewed-by: Alistair Francis --- target/riscv/internals.h | 50 ++++++++++++++++++++++ target/riscv/vector_helper.c | 82 +++++++++++------------------------- 2 files changed, 74 insertions(+), 58 deletions(-) diff --git a/target/riscv/internals.h b/target/riscv/internals.h index b15ad394bb..698158e116 100644 --- a/target/riscv/internals.h +++ b/target/riscv/internals.h @@ -58,4 +58,54 @@ static inline float32 check_nanbox_s(uint64_t f) } } +/* + * Note that vector data is stored in host-endian 64-bit chunks, + * so addressing units smaller than that needs a host-endian fixup. + */ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#define H8(x) ((x)) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#define H8(x) (x) +#endif + +/* share functions about saturation */ +int8_t sadd8(CPURISCVState *, int vxrm, int8_t, int8_t); +int16_t sadd16(CPURISCVState *, int vxrm, int16_t, int16_t); +int32_t sadd32(CPURISCVState *, int vxrm, int32_t, int32_t); +int64_t sadd64(CPURISCVState *, int vxrm, int64_t, int64_t); + +uint8_t saddu8(CPURISCVState *, int vxrm, uint8_t, uint8_t); +uint16_t saddu16(CPURISCVState *, int vxrm, uint16_t, uint16_t); +uint32_t saddu32(CPURISCVState *, int vxrm, uint32_t, uint32_t); +uint64_t saddu64(CPURISCVState *, int vxrm, uint64_t, uint64_t); + +int8_t ssub8(CPURISCVState *, int vxrm, int8_t, int8_t); +int16_t ssub16(CPURISCVState *, int vxrm, int16_t, int16_t); +int32_t ssub32(CPURISCVState *, int vxrm, int32_t, int32_t); +int64_t ssub64(CPURISCVState *, int vxrm, int64_t, int64_t); + +uint8_t ssubu8(CPURISCVState *, int vxrm, uint8_t, uint8_t); +uint16_t ssubu16(CPURISCVState *, int vxrm, uint16_t, uint16_t); +uint32_t ssubu32(CPURISCVState *, int vxrm, uint32_t, uint32_t); +uint64_t ssubu64(CPURISCVState *, int vxrm, uint64_t, uint64_t); + +/* share shift 
functions */ +int8_t vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b); +int16_t vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b); +int32_t vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b); +int64_t vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b); +uint8_t vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b); +uint16_t vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b); +uint32_t vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b); +uint64_t vssrl64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b); #endif diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 12c31aa4b4..c720e7b1fc 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -56,26 +56,6 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, return vl; } -/* - * Note that vector data is stored in host-endian 64-bit chunks, - * so addressing units smaller than that needs a host-endian fixup. - */ -#ifdef HOST_WORDS_BIGENDIAN -#define H1(x) ((x) ^ 7) -#define H1_2(x) ((x) ^ 6) -#define H1_4(x) ((x) ^ 4) -#define H2(x) ((x) ^ 3) -#define H4(x) ((x) ^ 1) -#define H8(x) ((x)) -#else -#define H1(x) (x) -#define H1_2(x) (x) -#define H1_4(x) (x) -#define H2(x) (x) -#define H4(x) (x) -#define H8(x) (x) -#endif - static inline uint32_t vext_nf(uint32_t desc) { return FIELD_EX32(simd_data(desc), VDATA, NF); @@ -2195,7 +2175,7 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ do_##NAME, CLEAR_FN); \ } -static inline uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) +uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) { uint8_t res = a + b; if (res < a) { @@ -2205,8 +2185,7 @@ static inline uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) return res; } -static inline uint16_t saddu16(CPURISCVState *env, int vxrm, uint16_t a, - uint16_t b) +uint16_t saddu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) { uint16_t 
res = a + b; if (res < a) { @@ -2216,8 +2195,7 @@ static inline uint16_t saddu16(CPURISCVState *env, int vxrm, uint16_t a, return res; } -static inline uint32_t saddu32(CPURISCVState *env, int vxrm, uint32_t a, - uint32_t b) +uint32_t saddu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) { uint32_t res = a + b; if (res < a) { @@ -2227,8 +2205,7 @@ static inline uint32_t saddu32(CPURISCVState *env, int vxrm, uint32_t a, return res; } -static inline uint64_t saddu64(CPURISCVState *env, int vxrm, uint64_t a, - uint64_t b) +uint64_t saddu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b) { uint64_t res = a + b; if (res < a) { @@ -2324,7 +2301,7 @@ GEN_VEXT_VX_RM(vsaddu_vx_h, 2, 2, clearh) GEN_VEXT_VX_RM(vsaddu_vx_w, 4, 4, clearl) GEN_VEXT_VX_RM(vsaddu_vx_d, 8, 8, clearq) -static inline int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) +int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) { int8_t res = a + b; if ((res ^ a) & (res ^ b) & INT8_MIN) { @@ -2334,7 +2311,7 @@ static inline int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) return res; } -static inline int16_t sadd16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) +int16_t sadd16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) { int16_t res = a + b; if ((res ^ a) & (res ^ b) & INT16_MIN) { @@ -2344,7 +2321,7 @@ static inline int16_t sadd16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) return res; } -static inline int32_t sadd32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) +int32_t sadd32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) { int32_t res = a + b; if ((res ^ a) & (res ^ b) & INT32_MIN) { @@ -2354,7 +2331,7 @@ static inline int32_t sadd32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) return res; } -static inline int64_t sadd64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) +int64_t sadd64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) { int64_t res = a + b; if ((res ^ a) & (res ^ b) & INT64_MIN) { @@ -2382,7 
+2359,7 @@ GEN_VEXT_VX_RM(vsadd_vx_h, 2, 2, clearh) GEN_VEXT_VX_RM(vsadd_vx_w, 4, 4, clearl) GEN_VEXT_VX_RM(vsadd_vx_d, 8, 8, clearq) -static inline uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) +uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) { uint8_t res = a - b; if (res > a) { @@ -2392,8 +2369,7 @@ static inline uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) return res; } -static inline uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, - uint16_t b) +uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) { uint16_t res = a - b; if (res > a) { @@ -2403,8 +2379,7 @@ static inline uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, return res; } -static inline uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, - uint32_t b) +uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) { uint32_t res = a - b; if (res > a) { @@ -2414,8 +2389,7 @@ static inline uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, return res; } -static inline uint64_t ssubu64(CPURISCVState *env, int vxrm, uint64_t a, - uint64_t b) +uint64_t ssubu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b) { uint64_t res = a - b; if (res > a) { @@ -2443,7 +2417,7 @@ GEN_VEXT_VX_RM(vssubu_vx_h, 2, 2, clearh) GEN_VEXT_VX_RM(vssubu_vx_w, 4, 4, clearl) GEN_VEXT_VX_RM(vssubu_vx_d, 8, 8, clearq) -static inline int8_t ssub8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) +int8_t ssub8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) { int8_t res = a - b; if ((res ^ a) & (a ^ b) & INT8_MIN) { @@ -2453,7 +2427,7 @@ static inline int8_t ssub8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) return res; } -static inline int16_t ssub16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) +int16_t ssub16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) { int16_t res = a - b; if ((res ^ a) & (a ^ b) & INT16_MIN) { @@ -2463,7 +2437,7 @@ static inline int16_t ssub16(CPURISCVState *env, 
int vxrm, int16_t a, int16_t b) return res; } -static inline int32_t ssub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) +int32_t ssub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) { int32_t res = a - b; if ((res ^ a) & (a ^ b) & INT32_MIN) { @@ -2473,7 +2447,7 @@ static inline int32_t ssub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) return res; } -static inline int64_t ssub64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) +int64_t ssub64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) { int64_t res = a - b; if ((res ^ a) & (a ^ b) & INT64_MIN) { @@ -2914,8 +2888,7 @@ GEN_VEXT_VX_RM(vwsmaccus_vx_h, 2, 4, clearl) GEN_VEXT_VX_RM(vwsmaccus_vx_w, 4, 8, clearq) /* Vector Single-Width Scaling Shift Instructions */ -static inline uint8_t -vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) +uint8_t vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) { uint8_t round, shift = b & 0x7; uint8_t res; @@ -2924,8 +2897,7 @@ vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) res = (a >> shift) + round; return res; } -static inline uint16_t -vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) +uint16_t vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) { uint8_t round, shift = b & 0xf; uint16_t res; @@ -2934,8 +2906,7 @@ vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) res = (a >> shift) + round; return res; } -static inline uint32_t -vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) +uint32_t vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) { uint8_t round, shift = b & 0x1f; uint32_t res; @@ -2944,8 +2915,7 @@ vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) res = (a >> shift) + round; return res; } -static inline uint64_t -vssrl64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b) +uint64_t vssrl64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b) { uint8_t round, shift = b & 0x3f; uint64_t res; @@ -2972,8 +2942,7 @@ GEN_VEXT_VX_RM(vssrl_vx_h, 2, 
2, clearh) GEN_VEXT_VX_RM(vssrl_vx_w, 4, 4, clearl) GEN_VEXT_VX_RM(vssrl_vx_d, 8, 8, clearq) -static inline int8_t -vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) +int8_t vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) { uint8_t round, shift = b & 0x7; int8_t res; @@ -2982,8 +2951,7 @@ vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) res = (a >> shift) + round; return res; } -static inline int16_t -vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) +int16_t vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) { uint8_t round, shift = b & 0xf; int16_t res; @@ -2992,8 +2960,7 @@ vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) res = (a >> shift) + round; return res; } -static inline int32_t -vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) +int32_t vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) { uint8_t round, shift = b & 0x1f; int32_t res; @@ -3002,8 +2969,7 @@ vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) res = (a >> shift) + round; return res; } -static inline int64_t -vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) +int64_t vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) { uint8_t round, shift = b & 0x3f; int64_t res; From patchwork Thu Jun 10 07:58:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490228 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xLJ3QQcz9sPf 
for ; Thu, 10 Jun 2021 18:03:20 +1000 (AEST) Received: from localhost ([::1]:44030 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFeo-0000Yd-Cr for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:03:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40506) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFcW-0006VY-Qd; Thu, 10 Jun 2021 04:00:58 -0400 Received: from out28-146.mail.aliyun.com ([115.124.28.146]:58256) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFcS-0006cO-GM; Thu, 10 Jun 2021 04:00:56 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436282|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.0281819-0.00439224-0.967426; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047208; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMfAu._1623312045; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMfAu._1623312045) by smtp.aliyun-inc.com(10.147.43.230); Thu, 10 Jun 2021 16:00:45 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 03/37] target/riscv: 16-bit Addition & Subtraction Instructions Date: Thu, 10 Jun 2021 15:58:34 +0800 Message-Id: <20210610075908.3305506-4-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.146; envelope-from=zhiwei_liu@c-sky.com; helo=out28-146.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include 5 groups: Wrap-around (dropping overflow), Signed Halving, Unsigned Halving, Signed Saturation, and Unsigned Saturation. Signed-off-by: LIU Zhiwei Reviewed-by: Richard Henderson --- include/tcg/tcg-op-gvec.h | 10 + target/riscv/helper.h | 30 ++ target/riscv/insn32.decode | 32 +++ target/riscv/insn_trans/trans_rvp.c.inc | 117 ++++++++ target/riscv/meson.build | 1 + target/riscv/packed_helper.c | 354 ++++++++++++++++++++++++ target/riscv/translate.c | 1 + tcg/tcg-op-gvec.c | 28 ++ 8 files changed, 573 insertions(+) create mode 100644 target/riscv/insn_trans/trans_rvp.c.inc create mode 100644 target/riscv/packed_helper.c diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h index c69a7de984..2dae9e78d0 100644 --- a/include/tcg/tcg-op-gvec.h +++ b/include/tcg/tcg-op-gvec.h @@ -386,10 +386,12 @@ void tcg_gen_vec_neg32_i64(TCGv_i64 d, TCGv_i64 a); void tcg_gen_vec_add8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_add16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void tcg_gen_vec_add16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_add32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_sub8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_sub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void tcg_gen_vec_sub16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_shl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); @@ -401,4 +403,12 @@ void tcg_gen_vec_sar16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_rotl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); +#if 
TARGET_LONG_BITS == 64 +#define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i64 +#define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i64 +#else +#define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i32 +#define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i32 +#endif + #endif diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 415e37bc37..b6a71ade33 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1149,3 +1149,33 @@ DEF_HELPER_6(vcompress_vm_b, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32) + +/* P extension function */ +DEF_HELPER_3(radd16, tl, env, tl, tl) +DEF_HELPER_3(uradd16, tl, env, tl, tl) +DEF_HELPER_3(kadd16, tl, env, tl, tl) +DEF_HELPER_3(ukadd16, tl, env, tl, tl) +DEF_HELPER_3(rsub16, tl, env, tl, tl) +DEF_HELPER_3(ursub16, tl, env, tl, tl) +DEF_HELPER_3(ksub16, tl, env, tl, tl) +DEF_HELPER_3(uksub16, tl, env, tl, tl) +DEF_HELPER_3(cras16, tl, env, tl, tl) +DEF_HELPER_3(rcras16, tl, env, tl, tl) +DEF_HELPER_3(urcras16, tl, env, tl, tl) +DEF_HELPER_3(kcras16, tl, env, tl, tl) +DEF_HELPER_3(ukcras16, tl, env, tl, tl) +DEF_HELPER_3(crsa16, tl, env, tl, tl) +DEF_HELPER_3(rcrsa16, tl, env, tl, tl) +DEF_HELPER_3(urcrsa16, tl, env, tl, tl) +DEF_HELPER_3(kcrsa16, tl, env, tl, tl) +DEF_HELPER_3(ukcrsa16, tl, env, tl, tl) +DEF_HELPER_3(stas16, tl, env, tl, tl) +DEF_HELPER_3(rstas16, tl, env, tl, tl) +DEF_HELPER_3(urstas16, tl, env, tl, tl) +DEF_HELPER_3(kstas16, tl, env, tl, tl) +DEF_HELPER_3(ukstas16, tl, env, tl, tl) +DEF_HELPER_3(stsa16, tl, env, tl, tl) +DEF_HELPER_3(rstsa16, tl, env, tl, tl) +DEF_HELPER_3(urstsa16, tl, env, tl, tl) +DEF_HELPER_3(kstsa16, tl, env, tl, tl) +DEF_HELPER_3(ukstsa16, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index f09f8d5faf..57f72fabf6 100644 --- a/target/riscv/insn32.decode +++ 
b/target/riscv/insn32.decode @@ -732,3 +732,35 @@ greviw 0110100 .......... 101 ..... 0011011 @sh5 gorciw 0010100 .......... 101 ..... 0011011 @sh5 slli_uw 00001. ........... 001 ..... 0011011 @sh + +# *** RV32P Extension *** +add16 0100000 ..... ..... 000 ..... 1110111 @r +radd16 0000000 ..... ..... 000 ..... 1110111 @r +uradd16 0010000 ..... ..... 000 ..... 1110111 @r +kadd16 0001000 ..... ..... 000 ..... 1110111 @r +ukadd16 0011000 ..... ..... 000 ..... 1110111 @r +sub16 0100001 ..... ..... 000 ..... 1110111 @r +rsub16 0000001 ..... ..... 000 ..... 1110111 @r +ursub16 0010001 ..... ..... 000 ..... 1110111 @r +ksub16 0001001 ..... ..... 000 ..... 1110111 @r +uksub16 0011001 ..... ..... 000 ..... 1110111 @r +cras16 0100010 ..... ..... 000 ..... 1110111 @r +rcras16 0000010 ..... ..... 000 ..... 1110111 @r +urcras16 0010010 ..... ..... 000 ..... 1110111 @r +kcras16 0001010 ..... ..... 000 ..... 1110111 @r +ukcras16 0011010 ..... ..... 000 ..... 1110111 @r +crsa16 0100011 ..... ..... 000 ..... 1110111 @r +rcrsa16 0000011 ..... ..... 000 ..... 1110111 @r +urcrsa16 0010011 ..... ..... 000 ..... 1110111 @r +kcrsa16 0001011 ..... ..... 000 ..... 1110111 @r +ukcrsa16 0011011 ..... ..... 000 ..... 1110111 @r +stas16 1111010 ..... ..... 010 ..... 1110111 @r +rstas16 1011010 ..... ..... 010 ..... 1110111 @r +urstas16 1101010 ..... ..... 010 ..... 1110111 @r +kstas16 1100010 ..... ..... 010 ..... 1110111 @r +ukstas16 1110010 ..... ..... 010 ..... 1110111 @r +stsa16 1111011 ..... ..... 010 ..... 1110111 @r +rstsa16 1011011 ..... ..... 010 ..... 1110111 @r +urstsa16 1101011 ..... ..... 010 ..... 1110111 @r +kstsa16 1100011 ..... ..... 010 ..... 1110111 @r +ukstsa16 1110011 ..... ..... 010 ..... 
1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
new file mode 100644
index 0000000000..43f395657a
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -0,0 +1,117 @@
+/*
+ * RISC-V translation routines for the RVP Standard Extension.
+ *
+ * Copyright (c) 2021 T-Head Semiconductor Co., Ltd. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see .
+ */
+
+#include "tcg/tcg-op-gvec.h"
+#include "tcg/tcg-gvec-desc.h"
+#include "tcg/tcg.h"
+
+/*
+ *** SIMD Data Processing Instructions
+ */
+
+/* 16-bit Addition & Subtraction Instructions */
+
+/*
+ * For some instructions, such as add16, an observation can be exploited:
+ * 1) If any reg is zero, it can be reduced to an inline op on the whole reg.
+ * 2) Otherwise, it can be accelerated by a vec op.
+ */ +static inline bool +r_inline(DisasContext *ctx, arg_r *a, + void (* vecop)(TCGv, TCGv, TCGv), + void (* op)(TCGv, TCGv, TCGv)) +{ + if (!has_ext(ctx, RVP)) { + return false; + } + if (a->rd && a->rs1 && a->rs2) { + vecop(cpu_gpr[a->rd], cpu_gpr[a->rs1], cpu_gpr[a->rs2]); + } else { + gen_arith(ctx, a, op); + } + return true; +} + +/* Complete inline implementation */ +#define GEN_RVP_R_INLINE(NAME, VECOP, OP) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + return r_inline(s, a, VECOP, OP); \ +} + +GEN_RVP_R_INLINE(add16, tcg_gen_vec_add16_tl, tcg_gen_add_tl); +GEN_RVP_R_INLINE(sub16, tcg_gen_vec_sub16_tl, tcg_gen_sub_tl); + +/* Out of line helpers for R format packed instructions */ +static inline bool +r_ool(DisasContext *ctx, arg_r *a, void (* fn)(TCGv, TCGv_ptr, TCGv, TCGv)) +{ + TCGv src1, src2, dst; + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new(); + src2 = tcg_temp_new(); + dst = tcg_temp_new(); + + gen_get_gpr(src1, a->rs1); + gen_get_gpr(src2, a->rs2); + fn(dst, cpu_env, src1, src2); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(src2); + tcg_temp_free(dst); + return true; +} + +#define GEN_RVP_R_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + return r_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP_R_OOL(radd16); +GEN_RVP_R_OOL(uradd16); +GEN_RVP_R_OOL(kadd16); +GEN_RVP_R_OOL(ukadd16); +GEN_RVP_R_OOL(rsub16); +GEN_RVP_R_OOL(ursub16); +GEN_RVP_R_OOL(ksub16); +GEN_RVP_R_OOL(uksub16); +GEN_RVP_R_OOL(cras16); +GEN_RVP_R_OOL(rcras16); +GEN_RVP_R_OOL(urcras16); +GEN_RVP_R_OOL(kcras16); +GEN_RVP_R_OOL(ukcras16); +GEN_RVP_R_OOL(crsa16); +GEN_RVP_R_OOL(rcrsa16); +GEN_RVP_R_OOL(urcrsa16); +GEN_RVP_R_OOL(kcrsa16); +GEN_RVP_R_OOL(ukcrsa16); +GEN_RVP_R_OOL(stas16); +GEN_RVP_R_OOL(rstas16); +GEN_RVP_R_OOL(urstas16); +GEN_RVP_R_OOL(kstas16); +GEN_RVP_R_OOL(ukstas16); +GEN_RVP_R_OOL(stsa16); +GEN_RVP_R_OOL(rstsa16); +GEN_RVP_R_OOL(urstsa16); +GEN_RVP_R_OOL(kstsa16); 
+GEN_RVP_R_OOL(ukstsa16); diff --git a/target/riscv/meson.build b/target/riscv/meson.build index d5e0bc93ea..cc169e1b2c 100644 --- a/target/riscv/meson.build +++ b/target/riscv/meson.build @@ -17,6 +17,7 @@ riscv_ss.add(files( 'op_helper.c', 'vector_helper.c', 'bitmanip_helper.c', + 'packed_helper.c', 'translate.c', )) diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c new file mode 100644 index 0000000000..b84abaaf25 --- /dev/null +++ b/target/riscv/packed_helper.c @@ -0,0 +1,354 @@ +/* + * RISC-V P Extension Helpers for QEMU. + * + * Copyright (c) 2021 T-Head Semiconductor Co., Ltd. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . 
+ */ +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/exec-all.h" +#include "exec/helper-proto.h" +#include "exec/cpu_ldst.h" +#include "fpu/softfloat.h" +#include +#include "internals.h" + +/* + *** SIMD Data Processing Instructions + */ + +/* 16-bit Addition & Subtraction Instructions */ +typedef void PackedFn3i(CPURISCVState *, void *, void *, void *, uint8_t); + +/* Define a common function to loop elements in packed register */ +static inline target_ulong +rvpr(CPURISCVState *env, target_ulong a, target_ulong b, + uint8_t step, uint8_t size, PackedFn3i *fn) +{ + int i, passes = sizeof(target_ulong) / size; + target_ulong result = 0; + + for (i = 0; i < passes; i += step) { + fn(env, &result, &a, &b, i); + } + return result; +} + +#define RVPR(NAME, STEP, SIZE) \ +target_ulong HELPER(NAME)(CPURISCVState *env, target_ulong a, \ + target_ulong b) \ +{ \ + return rvpr(env, a, b, STEP, SIZE, (PackedFn3i *)do_##NAME);\ +} + +static inline int32_t hadd32(int32_t a, int32_t b) +{ + return ((int64_t)a + b) >> 1; +} + +static inline void do_radd16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[i] = hadd32(a[i], b[i]); +} + +RVPR(radd16, 1, 2); + +static inline uint32_t haddu32(uint32_t a, uint32_t b) +{ + return ((uint64_t)a + b) >> 1; +} + +static inline void do_uradd16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[i] = haddu32(a[i], b[i]); +} + +RVPR(uradd16, 1, 2); + +static inline void do_kadd16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[i] = sadd16(env, 0, a[i], b[i]); +} + +RVPR(kadd16, 1, 2); + +static inline void do_ukadd16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[i] = saddu16(env, 0, a[i], b[i]); +} + +RVPR(ukadd16, 1, 2); + +static inline int32_t hsub32(int32_t a, int32_t b) +{ + return ((int64_t)a - 
b) >> 1; +} + +static inline int64_t hsub64(int64_t a, int64_t b) +{ + int64_t res = a - b; + int64_t over = (res ^ a) & (a ^ b) & INT64_MIN; + + /* With signed overflow, bit 64 is inverse of bit 63. */ + return (res >> 1) ^ over; +} + +static inline void do_rsub16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[i] = hsub32(a[i], b[i]); +} + +RVPR(rsub16, 1, 2); + +static inline uint64_t hsubu64(uint64_t a, uint64_t b) +{ + return (a - b) >> 1; +} + +static inline void do_ursub16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[i] = hsubu64(a[i], b[i]); +} + +RVPR(ursub16, 1, 2); + +static inline void do_ksub16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[i] = ssub16(env, 0, a[i], b[i]); +} + +RVPR(ksub16, 1, 2); + +static inline void do_uksub16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[i] = ssubu16(env, 0, a[i], b[i]); +} + +RVPR(uksub16, 1, 2); + +static inline void do_cras16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = a[H2(i)] - b[H2(i + 1)]; + d[H2(i + 1)] = a[H2(i + 1)] + b[H2(i)]; +} + +RVPR(cras16, 2, 2); + +static inline void do_rcras16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = hsub32(a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = hadd32(a[H2(i + 1)], b[H2(i)]); +} + +RVPR(rcras16, 2, 2); + +static inline void do_urcras16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = hsubu64(a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = haddu32(a[H2(i + 1)], b[H2(i)]); +} + +RVPR(urcras16, 2, 2); + +static inline void do_kcras16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = 
vb; + d[H2(i)] = ssub16(env, 0, a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = sadd16(env, 0, a[H2(i + 1)], b[H2(i)]); +} + +RVPR(kcras16, 2, 2); + +static inline void do_ukcras16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = ssubu16(env, 0, a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = saddu16(env, 0, a[H2(i + 1)], b[H2(i)]); +} + +RVPR(ukcras16, 2, 2); + +static inline void do_crsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = a[H2(i)] + b[H2(i + 1)]; + d[H2(i + 1)] = a[H2(i + 1)] - b[H2(i)]; +} + +RVPR(crsa16, 2, 2); + +static inline void do_rcrsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = hadd32(a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = hsub32(a[H2(i + 1)], b[H2(i)]); +} + +RVPR(rcrsa16, 2, 2); + +static inline void do_urcrsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = haddu32(a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = hsubu64(a[H2(i + 1)], b[H2(i)]); +} + +RVPR(urcrsa16, 2, 2); + +static inline void do_kcrsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = sadd16(env, 0, a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = ssub16(env, 0, a[H2(i + 1)], b[H2(i)]); +} + +RVPR(kcrsa16, 2, 2); + +static inline void do_ukcrsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = saddu16(env, 0, a[H2(i)], b[H2(i + 1)]); + d[H2(i + 1)] = ssubu16(env, 0, a[H2(i + 1)], b[H2(i)]); +} + +RVPR(ukcrsa16, 2, 2); + +static inline void do_stas16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = a[H2(i)] - b[H2(i)]; + d[H2(i + 1)] = a[H2(i + 1)] + b[H2(i + 1)]; +} + +RVPR(stas16, 2, 2); + +static inline void 
do_rstas16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = hsub32(a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = hadd32(a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(rstas16, 2, 2); + +static inline void do_urstas16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = hsubu64(a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = haddu32(a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(urstas16, 2, 2); + +static inline void do_kstas16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = ssub16(env, 0, a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = sadd16(env, 0, a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(kstas16, 2, 2); + +static inline void do_ukstas16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = ssubu16(env, 0, a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = saddu16(env, 0, a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(ukstas16, 2, 2); + +static inline void do_stsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = a[H2(i)] + b[H2(i)]; + d[H2(i + 1)] = a[H2(i + 1)] - b[H2(i + 1)]; +} + +RVPR(stsa16, 2, 2); + +static inline void do_rstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = hadd32(a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = hsub32(a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(rstsa16, 2, 2); + +static inline void do_urstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = haddu32(a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = hsubu64(a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(urstsa16, 2, 2); + +static inline void do_kstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = sadd16(env, 0, a[H2(i)], b[H2(i)]); + 
d[H2(i + 1)] = ssub16(env, 0, a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(kstsa16, 2, 2); + +static inline void do_ukstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = saddu16(env, 0, a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = ssubu16(env, 0, a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(ukstsa16, 2, 2); diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 0e6ede4d71..51b144e9be 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -908,6 +908,7 @@ static bool gen_unary(DisasContext *ctx, arg_r2 *a, #include "insn_trans/trans_rvh.c.inc" #include "insn_trans/trans_rvv.c.inc" #include "insn_trans/trans_rvb.c.inc" +#include "insn_trans/trans_rvp.c.inc" #include "insn_trans/trans_privileged.c.inc" /* Include the auto-generated decoder for 16 bit insn */ diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 498a959839..a8898ba7bf 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -1742,6 +1742,20 @@ void tcg_gen_vec_add16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) gen_addv_mask(d, a, b, m); } +void tcg_gen_vec_add16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t1 = tcg_temp_new_i32(); + TCGv_i32 t2 = tcg_temp_new_i32(); + + tcg_gen_andi_i32(t1, a, ~0xffff); + tcg_gen_add_i32(t2, a, b); + tcg_gen_add_i32(t1, t1, b); + tcg_gen_deposit_i32(d, t1, t2, 0, 16); + + tcg_temp_free_i32(t1); + tcg_temp_free_i32(t2); +} + void tcg_gen_vec_add32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) { TCGv_i64 t1 = tcg_temp_new_i64(); @@ -1892,6 +1906,20 @@ void tcg_gen_vec_sub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) gen_subv_mask(d, a, b, m); } +void tcg_gen_vec_sub16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t1 = tcg_temp_new_i32(); + TCGv_i32 t2 = tcg_temp_new_i32(); + + tcg_gen_andi_i32(t1, b, ~0xffff); + tcg_gen_sub_i32(t2, a, b); + tcg_gen_sub_i32(t1, a, t1); + tcg_gen_deposit_i32(d, t1, t2, 0, 16); + + tcg_temp_free_i32(t1); + tcg_temp_free_i32(t2); +} + void 
tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) { TCGv_i64 t1 = tcg_temp_new_i64(); From patchwork Thu Jun 10 07:58:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490226 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xJw50Ksz9sRN for ; Thu, 10 Jun 2021 18:02:08 +1000 (AEST) Received: from localhost ([::1]:41194 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFde-00074x-9R for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:02:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40646) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFd1-000742-Vo; Thu, 10 Jun 2021 04:01:28 -0400 Received: from out28-123.mail.aliyun.com ([115.124.28.123]:53006) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFcy-0006vH-EA; Thu, 10 Jun 2021 04:01:27 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07443932|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.0313833-0.00351235-0.965104; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047202; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=9; RT=8; SR=0; TI=SMTPD_---.KQMfBQW_1623312076; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMfBQW_1623312076) by smtp.aliyun-inc.com(10.147.43.230); Thu, 10 Jun 2021 16:01:16 +0800 From: LIU Zhiwei To: 
qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 04/37] target/riscv: 8-bit Addition & Subtraction Instruction Date: Thu, 10 Jun 2021 15:58:35 +0800 Message-Id: <20210610075908.3305506-5-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.123; envelope-from=zhiwei_liu@c-sky.com; helo=out28-123.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: bin.meng@windriver.com, Palmer Dabbelt , richard.henderson@linaro.org, palmer@dabbelt.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Add the 8-bit addition and subtraction instructions in five groups: wrap-around (overflow is dropped), signed halving, unsigned halving, signed saturating, and unsigned saturating.
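As an illustration of the five groups named above, here is a minimal standalone sketch of the addition variants on a 32-bit word holding four 8-bit lanes. It is not code from this patch: the names map_bytes, lane_* and add8_swar are hypothetical, and the saturating variants here do not record overflow in vxsat as the real helpers do. add8_swar follows the same mask idea as the gen_addv_mask_i32 routine added in tcg/tcg-op-gvec.c.

```c
#include <stdint.h>

/* Hypothetical helper: apply a per-byte operation to all four lanes. */
typedef uint8_t (*lane_op)(uint8_t, uint8_t);

static uint32_t map_bytes(uint32_t a, uint32_t b, lane_op op)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        r |= (uint32_t)op(x, y) << (8 * i);
    }
    return r;
}

/* Read a byte as a signed value without implementation-defined casts. */
static int to_s8(uint8_t x) { return x >= 128 ? (int)x - 256 : (int)x; }

/* Wrap-around: the carry out of each lane is dropped. */
static uint8_t lane_add(uint8_t x, uint8_t y) { return (uint8_t)(x + y); }

/* Signed halving: floor((x + y) / 2), computed in a wider type. */
static uint8_t lane_radd(uint8_t x, uint8_t y)
{
    int sum = to_s8(x) + to_s8(y);
    return (uint8_t)(sum < 0 ? (sum - 1) / 2 : sum / 2);
}

/* Unsigned halving: (x + y) >> 1, computed in a wider type. */
static uint8_t lane_uradd(uint8_t x, uint8_t y)
{
    return (uint8_t)(((unsigned)x + y) >> 1);
}

/* Signed saturating: clamp the sum to [-128, 127]. */
static uint8_t lane_kadd(uint8_t x, uint8_t y)
{
    int sum = to_s8(x) + to_s8(y);
    if (sum > 127) sum = 127;
    if (sum < -128) sum = -128;
    return (uint8_t)sum;
}

/* Unsigned saturating: clamp the sum to 255. */
static uint8_t lane_ukadd(uint8_t x, uint8_t y)
{
    unsigned sum = (unsigned)x + y;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/*
 * Wrap-around add on the whole word at once: add the low 7 bits of every
 * lane (no carry can cross a lane boundary), then restore the top bit of
 * each lane with an XOR.
 */
static uint32_t add8_swar(uint32_t a, uint32_t b)
{
    const uint32_t m = 0x80808080u;
    uint32_t low = (a & ~m) + (b & ~m);
    return low ^ ((a ^ b) & m);
}
```

For example, map_bytes(0x7F80FF01, 0x01800102, lane_kadd) saturates the top two lanes and yields 0x7F800003, while lane_add on the same inputs wraps to 0x80000003.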
Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis Reviewed-by: Palmer Dabbelt --- include/tcg/tcg-op-gvec.h | 6 ++ target/riscv/helper.h | 9 +++ target/riscv/insn32.decode | 11 ++++ target/riscv/insn_trans/trans_rvp.c.inc | 13 +++++ target/riscv/packed_helper.c | 73 +++++++++++++++++++++++++ tcg/tcg-op-gvec.c | 47 ++++++++++++++++ 6 files changed, 159 insertions(+) diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h index 2dae9e78d0..392c0f95a4 100644 --- a/include/tcg/tcg-op-gvec.h +++ b/include/tcg/tcg-op-gvec.h @@ -385,11 +385,13 @@ void tcg_gen_vec_neg16_i64(TCGv_i64 d, TCGv_i64 a); void tcg_gen_vec_neg32_i64(TCGv_i64 d, TCGv_i64 a); void tcg_gen_vec_add8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void tcg_gen_vec_add8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_add16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_add16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_add32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_sub8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void tcg_gen_vec_sub8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_sub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_sub16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); @@ -406,9 +408,13 @@ void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); #if TARGET_LONG_BITS == 64 #define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i64 #define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i64 +#define tcg_gen_vec_add8_tl tcg_gen_vec_add8_i64 +#define tcg_gen_vec_sub8_tl tcg_gen_vec_sub8_i64 #else #define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i32 #define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i32 +#define tcg_gen_vec_add8_tl tcg_gen_vec_add8_i32 +#define tcg_gen_vec_sub8_tl tcg_gen_vec_sub8_i32 #endif #endif diff --git a/target/riscv/helper.h b/target/riscv/helper.h index b6a71ade33..629ff13402 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ 
-1179,3 +1179,12 @@ DEF_HELPER_3(rstsa16, tl, env, tl, tl) DEF_HELPER_3(urstsa16, tl, env, tl, tl) DEF_HELPER_3(kstsa16, tl, env, tl, tl) DEF_HELPER_3(ukstsa16, tl, env, tl, tl) + +DEF_HELPER_3(radd8, tl, env, tl, tl) +DEF_HELPER_3(uradd8, tl, env, tl, tl) +DEF_HELPER_3(kadd8, tl, env, tl, tl) +DEF_HELPER_3(ukadd8, tl, env, tl, tl) +DEF_HELPER_3(rsub8, tl, env, tl, tl) +DEF_HELPER_3(ursub8, tl, env, tl, tl) +DEF_HELPER_3(ksub8, tl, env, tl, tl) +DEF_HELPER_3(uksub8, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 57f72fabf6..13e1222296 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -764,3 +764,14 @@ rstsa16 1011011 ..... ..... 010 ..... 1110111 @r urstsa16 1101011 ..... ..... 010 ..... 1110111 @r kstsa16 1100011 ..... ..... 010 ..... 1110111 @r ukstsa16 1110011 ..... ..... 010 ..... 1110111 @r + +add8 0100100 ..... ..... 000 ..... 1110111 @r +radd8 0000100 ..... ..... 000 ..... 1110111 @r +uradd8 0010100 ..... ..... 000 ..... 1110111 @r +kadd8 0001100 ..... ..... 000 ..... 1110111 @r +ukadd8 0011100 ..... ..... 000 ..... 1110111 @r +sub8 0100101 ..... ..... 000 ..... 1110111 @r +rsub8 0000101 ..... ..... 000 ..... 1110111 @r +ursub8 0010101 ..... ..... 000 ..... 1110111 @r +ksub8 0001101 ..... ..... 000 ..... 1110111 @r +uksub8 0011101 ..... ..... 000 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 43f395657a..80bec35ac9 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -115,3 +115,16 @@ GEN_RVP_R_OOL(rstsa16); GEN_RVP_R_OOL(urstsa16); GEN_RVP_R_OOL(kstsa16); GEN_RVP_R_OOL(ukstsa16); + +/* 8-bit Addition & Subtraction Instructions */ +GEN_RVP_R_INLINE(add8, tcg_gen_vec_add8_tl, tcg_gen_add_tl); +GEN_RVP_R_INLINE(sub8, tcg_gen_vec_sub8_tl, tcg_gen_sub_tl); + +GEN_RVP_R_OOL(radd8); +GEN_RVP_R_OOL(uradd8); +GEN_RVP_R_OOL(kadd8); +GEN_RVP_R_OOL(ukadd8); +GEN_RVP_R_OOL(rsub8); +GEN_RVP_R_OOL(ursub8); +GEN_RVP_R_OOL(ksub8); +GEN_RVP_R_OOL(uksub8); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index b84abaaf25..62db072204 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -352,3 +352,76 @@ static inline void do_ukstsa16(CPURISCVState *env, void *vd, void *va, } RVPR(ukstsa16, 2, 2); + +/* 8-bit Addition & Subtraction Instructions */ +static inline void do_radd8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, *b = vb; + d[i] = hadd32(a[i], b[i]); +} + +RVPR(radd8, 1, 1); + +static inline void do_uradd8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = haddu32(a[i], b[i]); +} + +RVPR(uradd8, 1, 1); + +static inline void do_kadd8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, *b = vb; + d[i] = sadd8(env, 0, a[i], b[i]); +} + +RVPR(kadd8, 1, 1); + +static inline void do_ukadd8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = saddu8(env, 0, a[i], b[i]); +} + +RVPR(ukadd8, 1, 1); + +static inline void do_rsub8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, *b = vb; + d[i] = hsub32(a[i], b[i]); +} + 
+RVPR(rsub8, 1, 1); + +static inline void do_ursub8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = hsubu64(a[i], b[i]); +} + +RVPR(ursub8, 1, 1); + +static inline void do_ksub8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, *b = vb; + d[i] = ssub8(env, 0, a[i], b[i]); +} + +RVPR(ksub8, 1, 1); + +static inline void do_uksub8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = ssubu8(env, 0, a[i], b[i]); +} + +RVPR(uksub8, 1, 1); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index a8898ba7bf..484ced3054 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -1736,6 +1736,30 @@ void tcg_gen_vec_add8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) gen_addv_mask(d, a, b, m); } +static void gen_addv_mask_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b, TCGv_i32 m) +{ + TCGv_i32 t1 = tcg_temp_new_i32(); + TCGv_i32 t2 = tcg_temp_new_i32(); + TCGv_i32 t3 = tcg_temp_new_i32(); + + tcg_gen_andc_i32(t1, a, m); + tcg_gen_andc_i32(t2, b, m); + tcg_gen_xor_i32(t3, a, b); + tcg_gen_add_i32(d, t1, t2); + tcg_gen_and_i32(t3, t3, m); + tcg_gen_xor_i32(d, d, t3); + + tcg_temp_free_i32(t1); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); +} + +void tcg_gen_vec_add8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 m = tcg_constant_i32((int32_t)dup_const(MO_8, 0x80)); + gen_addv_mask_i32(d, a, b, m); +} + void tcg_gen_vec_add16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) { TCGv_i64 m = tcg_constant_i64(dup_const(MO_16, 0x8000)); @@ -1900,6 +1924,29 @@ void tcg_gen_vec_sub8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) gen_subv_mask(d, a, b, m); } +static void gen_subv_mask_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b, TCGv_i32 m) +{ + TCGv_i32 t1 = tcg_temp_new_i32(); + TCGv_i32 t2 = tcg_temp_new_i32(); + TCGv_i32 t3 = tcg_temp_new_i32(); + + tcg_gen_or_i32(t1, a, m); + tcg_gen_andc_i32(t2, b, m); + tcg_gen_eqv_i32(t3, a, b); + 
tcg_gen_sub_i32(d, t1, t2); + tcg_gen_and_i32(t3, t3, m); + tcg_gen_xor_i32(d, d, t3); + + tcg_temp_free_i32(t1); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); +} + +void tcg_gen_vec_sub8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 m = tcg_constant_i32((int32_t)dup_const(MO_8, 0x80)); + gen_subv_mask_i32(d, a, b, m); +} void tcg_gen_vec_sub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) { TCGv_i64 m = tcg_constant_i64(dup_const(MO_16, 0x8000)); From patchwork Thu Jun 10 07:58:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490227 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xL373L9z9sPf for ; Thu, 10 Jun 2021 18:03:07 +1000 (AEST) Received: from localhost ([::1]:42836 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFeb-0008Ad-C7 for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:03:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40694) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFdS-00087x-Ev; Thu, 10 Jun 2021 04:01:54 -0400 Received: from out28-217.mail.aliyun.com ([115.124.28.217]:51464) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFdP-0007Ne-4V; Thu, 10 Jun 2021 04:01:54 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436282|-1; CH=green; DM=|CONTINUE|false|; 
DS=CONTINUE|ham_system_inform|0.387086-0.000726307-0.612188; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047202; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMTrU3_1623312106; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMTrU3_1623312106) by smtp.aliyun-inc.com(10.147.44.129); Thu, 10 Jun 2021 16:01:46 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 05/37] target/riscv: SIMD 16-bit Shift Instructions Date: Thu, 10 Jun 2021 15:58:36 +0800 Message-Id: <20210610075908.3305506-6-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.217; envelope-from=zhiwei_liu@c-sky.com; helo=out28-217.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Add arithmetic right shift, logical right shift, and left shift instructions on 16-bit elements. The shift amount is either an immediate or taken from a scalar register. The right shifts have rounding variants, and the left shift has a saturating variant.
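To make the rounding and saturation behaviour described above concrete, here is a minimal standalone sketch on a 32-bit word holding two 16-bit lanes. It is not code from this patch: map_halves, to_s16, asr and the lane_* functions are hypothetical names, rounding is assumed to mean adding 1 at the most significant discarded bit, and the saturating shift does not record overflow in vxsat as the real helper does.

```c
#include <stdint.h>

/* Hypothetical helper: apply a per-lane shift to both 16-bit lanes. */
typedef uint16_t (*shift_op)(uint16_t, unsigned);

static uint32_t map_halves(uint32_t a, unsigned sh, shift_op op)
{
    uint32_t r = 0;
    for (int i = 0; i < 2; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        r |= (uint32_t)op(x, sh & 0xf) << (16 * i);   /* 4-bit shift amount */
    }
    return r;
}

/* Read a lane as a signed value without implementation-defined casts. */
static int32_t to_s16(uint16_t x)
{
    return x >= 0x8000 ? (int32_t)x - 0x10000 : (int32_t)x;
}

/* Arithmetic right shift written as floor division by 2^n. */
static int32_t asr(int32_t v, unsigned n)
{
    return v >= 0 ? v >> n : -((-v + (1 << n) - 1) >> n);
}

/* Arithmetic right shift (sign bits shifted in). */
static uint16_t lane_sra16(uint16_t x, unsigned sh)
{
    return (uint16_t)asr(to_s16(x), sh);
}

/* Logical right shift (zero bits shifted in). */
static uint16_t lane_srl16(uint16_t x, unsigned sh)
{
    return (uint16_t)(x >> sh);
}

/* Rounding arithmetic right shift: add 1 at the last discarded bit. */
static uint16_t lane_sra16_u(uint16_t x, unsigned sh)
{
    if (sh == 0) {
        return x;
    }
    return (uint16_t)asr(asr(to_s16(x), sh - 1) + 1, 1);
}

/* Rounding logical right shift. */
static uint16_t lane_srl16_u(uint16_t x, unsigned sh)
{
    if (sh == 0) {
        return x;
    }
    return (uint16_t)((((uint32_t)x >> (sh - 1)) + 1) >> 1);
}

/* Saturating left shift: clamp to the signed 16-bit range on overflow. */
static uint16_t lane_ksll16(uint16_t x, unsigned sh)
{
    int32_t wide = to_s16(x) * (int32_t)(1 << sh);   /* fits in 32 bits */
    if (wide > INT16_MAX) {
        return (uint16_t)INT16_MAX;
    }
    if (wide < INT16_MIN) {
        return (uint16_t)INT16_MIN;
    }
    return (uint16_t)wide;
}
```

For example, map_halves(0x4000FFFF, 1, lane_ksll16) saturates the upper lane and yields 0x7FFFFFFE, and map_halves(0x8005FFFB, 1, lane_sra16_u) rounds both lanes to give 0xC003FFFE where the plain arithmetic shift gives 0xC002FFFD.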
Signed-off-by: LIU Zhiwei Reviewed-by: Richard Henderson --- include/tcg/tcg-op-gvec.h | 9 ++ target/riscv/helper.h | 9 ++ target/riscv/insn32.decode | 17 ++++ target/riscv/insn_trans/trans_rvp.c.inc | 59 ++++++++++++++ target/riscv/packed_helper.c | 104 ++++++++++++++++++++++++ tcg/tcg-op-gvec.c | 28 +++++++ 6 files changed, 226 insertions(+) diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h index 392c0f95a4..72cf697646 100644 --- a/include/tcg/tcg-op-gvec.h +++ b/include/tcg/tcg-op-gvec.h @@ -398,10 +398,13 @@ void tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_shl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_shl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); +void tcg_gen_vec_shl16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_shr8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_shr16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); +void tcg_gen_vec_shr16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_sar8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_sar16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); +void tcg_gen_vec_sar16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_rotl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); @@ -410,11 +413,17 @@ void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); #define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i64 #define tcg_gen_vec_add8_tl tcg_gen_vec_add8_i64 #define tcg_gen_vec_sub8_tl tcg_gen_vec_sub8_i64 +#define tcg_gen_vec_shl16i_tl tcg_gen_vec_shl16i_i64 +#define tcg_gen_vec_shr16i_tl tcg_gen_vec_shr16i_i64 +#define tcg_gen_vec_sar16i_tl tcg_gen_vec_sar16i_i64 #else #define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i32 #define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i32 #define tcg_gen_vec_add8_tl tcg_gen_vec_add8_i32 #define tcg_gen_vec_sub8_tl tcg_gen_vec_sub8_i32 +#define tcg_gen_vec_shl16i_tl tcg_gen_vec_shl16i_i32 +#define tcg_gen_vec_shr16i_tl tcg_gen_vec_shr16i_i32 
+#define tcg_gen_vec_sar16i_tl tcg_gen_vec_sar16i_i32 #endif #endif diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 629ff13402..de7b4fc17d 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1188,3 +1188,12 @@ DEF_HELPER_3(rsub8, tl, env, tl, tl) DEF_HELPER_3(ursub8, tl, env, tl, tl) DEF_HELPER_3(ksub8, tl, env, tl, tl) DEF_HELPER_3(uksub8, tl, env, tl, tl) + +DEF_HELPER_3(sra16, tl, env, tl, tl) +DEF_HELPER_3(sra16_u, tl, env, tl, tl) +DEF_HELPER_3(srl16, tl, env, tl, tl) +DEF_HELPER_3(srl16_u, tl, env, tl, tl) +DEF_HELPER_3(sll16, tl, env, tl, tl) +DEF_HELPER_3(ksll16, tl, env, tl, tl) +DEF_HELPER_3(kslra16, tl, env, tl, tl) +DEF_HELPER_3(kslra16_u, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 13e1222296..44c497f28a 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -24,6 +24,7 @@ %sh5 20:5 %sh7 20:7 +%sh4 20:4 %csr 20:12 %rm 12:3 %nf 29:3 !function=ex_plus_1 @@ -61,6 +62,7 @@ @j .................... ..... ....... &j imm=%imm_j %rd @sh ...... ...... ..... ... ..... ....... &shift shamt=%sh7 %rs1 %rd +@sh4 ...... ...... ..... ... ..... ....... &shift shamt=%sh4 %rs1 %rd @csr ............ ..... ... ..... ....... %csr %rs1 %rd @atom_ld ..... aq:1 rl:1 ..... ........ ..... ....... &atomic rs2=0 %rs1 %rd @@ -775,3 +777,18 @@ rsub8 0000101 ..... ..... 000 ..... 1110111 @r ursub8 0010101 ..... ..... 000 ..... 1110111 @r ksub8 0001101 ..... ..... 000 ..... 1110111 @r uksub8 0011101 ..... ..... 000 ..... 1110111 @r + +sra16 0101000 ..... ..... 000 ..... 1110111 @r +sra16_u 0110000 ..... ..... 000 ..... 1110111 @r +srai16 0111000 0.... ..... 000 ..... 1110111 @sh4 +srai16_u 0111000 1.... ..... 000 ..... 1110111 @sh4 +srl16 0101001 ..... ..... 000 ..... 1110111 @r +srl16_u 0110001 ..... ..... 000 ..... 1110111 @r +srli16 0111001 0.... ..... 000 ..... 1110111 @sh4 +srli16_u 0111001 1.... ..... 000 ..... 1110111 @sh4 +sll16 0101010 ..... ..... 000 ..... 
1110111 @r +slli16 0111010 0.... ..... 000 ..... 1110111 @sh4 +ksll16 0110010 ..... ..... 000 ..... 1110111 @r +kslli16 0111010 1.... ..... 000 ..... 1110111 @sh4 +kslra16 0101011 ..... ..... 000 ..... 1110111 @r +kslra16_u 0110011 ..... ..... 000 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 80bec35ac9..afafa49824 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -128,3 +128,62 @@ GEN_RVP_R_OOL(rsub8); GEN_RVP_R_OOL(ursub8); GEN_RVP_R_OOL(ksub8); GEN_RVP_R_OOL(uksub8); + +/* 16-bit Shift Instructions */ +GEN_RVP_R_OOL(sra16); +GEN_RVP_R_OOL(srl16); +GEN_RVP_R_OOL(sll16); +GEN_RVP_R_OOL(sra16_u); +GEN_RVP_R_OOL(srl16_u); +GEN_RVP_R_OOL(ksll16); +GEN_RVP_R_OOL(kslra16); +GEN_RVP_R_OOL(kslra16_u); + +static bool +rvp_shifti_ool(DisasContext *ctx, arg_shift *a, + void (* fn)(TCGv, TCGv_ptr, TCGv, TCGv)) +{ + TCGv src1, dst, shift; + + src1 = tcg_temp_new(); + dst = tcg_temp_new(); + + gen_get_gpr(src1, a->rs1); + shift = tcg_const_tl(a->shamt); + fn(dst, cpu_env, src1, shift); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(dst); + tcg_temp_free(shift); + return true; +} + +static inline bool +rvp_shifti(DisasContext *ctx, arg_shift *a, + void (* vecop)(TCGv, TCGv, target_long), + void (* op)(TCGv, TCGv_ptr, TCGv, TCGv)) +{ + if (!has_ext(ctx, RVP)) { + return false; + } + + if (a->rd && a->rs1 && vecop) { + vecop(cpu_gpr[a->rd], cpu_gpr[a->rs1], a->shamt); + return true; + } + return rvp_shifti_ool(ctx, a, op); +} + +#define GEN_RVP_SHIFTI(NAME, VECOP, OP) \ +static bool trans_##NAME(DisasContext *s, arg_shift *a) \ +{ \ + return rvp_shifti(s, a, VECOP, OP); \ +} + +GEN_RVP_SHIFTI(srai16, tcg_gen_vec_sar16i_tl, gen_helper_sra16); +GEN_RVP_SHIFTI(srli16, tcg_gen_vec_shr16i_tl, gen_helper_srl16); +GEN_RVP_SHIFTI(slli16, tcg_gen_vec_shl16i_tl, gen_helper_sll16); +GEN_RVP_SHIFTI(srai16_u, NULL, gen_helper_sra16_u); 
+GEN_RVP_SHIFTI(srli16_u, NULL, gen_helper_srl16_u); +GEN_RVP_SHIFTI(kslli16, NULL, gen_helper_ksll16); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 62db072204..7e31c2fe46 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -425,3 +425,107 @@ static inline void do_uksub8(CPURISCVState *env, void *vd, void *va, } RVPR(uksub8, 1, 1); + +/* 16-bit Shift Instructions */ +static inline void do_sra16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0xf; + d[i] = a[i] >> shift; +} + +RVPR(sra16, 1, 2); + +static inline void do_srl16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0xf; + d[i] = a[i] >> shift; +} + +RVPR(srl16, 1, 2); + +static inline void do_sll16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0xf; + d[i] = a[i] << shift; +} + +RVPR(sll16, 1, 2); + +static inline void do_sra16_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0xf; + + d[i] = vssra16(env, 0, a[i], shift); +} + +RVPR(sra16_u, 1, 2); + +static inline void do_srl16_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0xf; + + d[i] = vssrl16(env, 0, a[i], shift); +} + +RVPR(srl16_u, 1, 2); + +static inline void do_ksll16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, result; + uint8_t shift = *(uint8_t *)vb & 0xf; + + result = a[i] << shift; + if (shift > (clrsb32(a[i]) - 16)) { + env->vxsat = 0x1; + d[i] = (a[i] & INT16_MIN) ? 
INT16_MIN : INT16_MAX; + } else { + d[i] = result; + } +} + +RVPR(ksll16, 1, 2); + +static inline void do_kslra16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va; + int32_t shift = sextract32((*(target_ulong *)vb), 0, 5); + + if (shift >= 0) { + do_ksll16(env, vd, va, vb, i); + } else { + shift = -shift; + shift = (shift == 16) ? 15 : shift; + d[i] = a[i] >> shift; + } +} + +RVPR(kslra16, 1, 2); + +static inline void do_kslra16_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va; + int32_t shift = sextract32((*(uint32_t *)vb), 0, 5); + + if (shift >= 0) { + do_ksll16(env, vd, va, vb, i); + } else { + shift = -shift; + shift = (shift == 16) ? 15 : shift; + d[i] = vssra16(env, 0, a[i], shift); + } +} + +RVPR(kslra16_u, 1, 2); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 484ced3054..cf1357cee1 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -2687,6 +2687,13 @@ void tcg_gen_vec_shl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) tcg_gen_andi_i64(d, d, mask); } +void tcg_gen_vec_shl16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t c) +{ + uint32_t mask = dup_const(MO_16, 0xffff << c); + tcg_gen_shli_i32(d, a, c); + tcg_gen_andi_i32(d, d, mask); +} + void tcg_gen_gvec_shli(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz) { @@ -2738,6 +2745,13 @@ void tcg_gen_vec_shr16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) tcg_gen_andi_i64(d, d, mask); } +void tcg_gen_vec_shr16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t c) +{ + uint32_t mask = dup_const(MO_16, 0xffff >> c); + tcg_gen_shri_i32(d, a, c); + tcg_gen_andi_i32(d, d, mask); +} + void tcg_gen_gvec_shri(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz) { @@ -2803,6 +2817,20 @@ void tcg_gen_vec_sar16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) tcg_temp_free_i64(s); } +void tcg_gen_vec_sar16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t c) +{ + uint32_t s_mask = 
dup_const(MO_16, 0x8000 >> c); + uint32_t c_mask = dup_const(MO_16, 0xffff >> c); + TCGv_i32 s = tcg_temp_new_i32(); + + tcg_gen_shri_i32(d, a, c); + tcg_gen_andi_i32(s, d, s_mask); /* isolate (shifted) sign bit */ + tcg_gen_andi_i32(d, d, c_mask); /* clear out bits above sign */ + tcg_gen_muli_i32(s, s, (2 << c) - 2); /* replicate isolated signs */ + tcg_gen_or_i32(d, d, s); /* include sign extension */ + tcg_temp_free_i32(s); +} + void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz) { From patchwork Thu Jun 10 07:58:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490231 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xNv6TKGz9sPf for ; Thu, 10 Jun 2021 18:05:35 +1000 (AEST) Received: from localhost ([::1]:49800 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFgz-0004Wf-Ri for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:05:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40804) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFdy-0000mp-Ov; Thu, 10 Jun 2021 04:02:28 -0400 Received: from out28-52.mail.aliyun.com ([115.124.28.52]:40709) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFdv-0007gu-3c; Thu, 10 Jun 2021 04:02:26 -0400 X-Alimail-AntiSpam: 
AC=CONTINUE; BC=0.07436283|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.333611-0.000687618-0.665701; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047209; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=9; RT=8; SR=0; TI=SMTPD_---.KQMTs2F_1623312137; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMTs2F_1623312137) by smtp.aliyun-inc.com(10.147.44.129); Thu, 10 Jun 2021 16:02:17 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 06/37] target/riscv: SIMD 8-bit Shift Instructions Date: Thu, 10 Jun 2021 15:58:37 +0800 Message-Id: <20210610075908.3305506-7-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.52; envelope-from=zhiwei_liu@c-sky.com; helo=out28-52.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: bin.meng@windriver.com, Palmer Dabbelt , richard.henderson@linaro.org, palmer@dabbelt.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The instructions cover arithmetic right shift, logical right shift, and left shift. The shift amount is either an immediate or a scalar register value. The right shifts have rounding variants, and the left shift has saturating variants.
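As a reading aid for the rounding and saturating variants, here is a minimal per-lane sketch in plain C. It is not the helper code from this patch: the names sra8_round and ksll8_sat are hypothetical, rounding is assumed to be round-to-nearest-up (add half an LSB before shifting), and right shift of a negative value is assumed to be arithmetic, as it is on the compilers QEMU supports.

```c
#include <stdint.h>

/*
 * Rounding arithmetic right shift of one signed 8-bit lane.
 * Hypothetical model: add half an LSB, then shift.
 */
static int8_t sra8_round(int8_t a, unsigned shift)
{
    shift &= 0x7;                     /* only the low 3 bits are used */
    if (shift == 0) {
        return a;                     /* nothing to round */
    }
    int32_t wide = (int32_t)a + (1 << (shift - 1));
    return (int8_t)(wide >> shift);   /* assumes arithmetic shift */
}

/*
 * Saturating left shift of one signed 8-bit lane.
 * Sets *sat to 1 when the result had to be clamped.
 */
static int8_t ksll8_sat(int8_t a, unsigned shift, int *sat)
{
    shift &= 0x7;
    /* Multiply instead of shifting to avoid shifting a negative value. */
    int32_t wide = (int32_t)a * (1 << shift);
    if (wide > INT8_MAX) {
        *sat = 1;
        return INT8_MAX;
    }
    if (wide < INT8_MIN) {
        *sat = 1;
        return INT8_MIN;
    }
    return (int8_t)wide;
}
```

For example, sra8_round(5, 1) yields 3 and sra8_round(-5, 1) yields -2, while ksll8_sat(1, 7, &sat) clamps to 127 and sets sat.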
Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis Reviewed-by: Palmer Dabbelt --- include/tcg/tcg-op-gvec.h | 9 +++ target/riscv/helper.h | 9 +++ target/riscv/insn32.decode | 17 ++++ target/riscv/insn_trans/trans_rvp.c.inc | 16 ++++ target/riscv/packed_helper.c | 102 ++++++++++++++++++++++++ tcg/tcg-op-gvec.c | 28 +++++++ 6 files changed, 181 insertions(+) diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h index 72cf697646..91531ecb0b 100644 --- a/include/tcg/tcg-op-gvec.h +++ b/include/tcg/tcg-op-gvec.h @@ -397,12 +397,15 @@ void tcg_gen_vec_sub16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void tcg_gen_vec_shl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); +void tcg_gen_vec_shl8i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_shl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_shl16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_shr8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); +void tcg_gen_vec_shr8i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_shr16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_shr16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_sar8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); +void tcg_gen_vec_sar8i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_sar16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t); void tcg_gen_vec_sar16i_i32(TCGv_i32 d, TCGv_i32 a, int32_t); void tcg_gen_vec_rotl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); @@ -416,6 +419,9 @@ void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); #define tcg_gen_vec_shl16i_tl tcg_gen_vec_shl16i_i64 #define tcg_gen_vec_shr16i_tl tcg_gen_vec_shr16i_i64 #define tcg_gen_vec_sar16i_tl tcg_gen_vec_sar16i_i64 +#define tcg_gen_vec_shl8i_tl tcg_gen_vec_shl8i_i64 +#define tcg_gen_vec_shr8i_tl tcg_gen_vec_shr8i_i64 +#define tcg_gen_vec_sar8i_tl tcg_gen_vec_sar8i_i64 #else #define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i32 #define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i32 @@ 
-424,6 +430,9 @@ void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); #define tcg_gen_vec_shl16i_tl tcg_gen_vec_shl16i_i32 #define tcg_gen_vec_shr16i_tl tcg_gen_vec_shr16i_i32 #define tcg_gen_vec_sar16i_tl tcg_gen_vec_sar16i_i32 +#define tcg_gen_vec_shl8i_tl tcg_gen_vec_shl8i_i32 +#define tcg_gen_vec_shr8i_tl tcg_gen_vec_shr8i_i32 +#define tcg_gen_vec_sar8i_tl tcg_gen_vec_sar8i_i32 #endif #endif diff --git a/target/riscv/helper.h b/target/riscv/helper.h index de7b4fc17d..1b365135ff 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1197,3 +1197,12 @@ DEF_HELPER_3(sll16, tl, env, tl, tl) DEF_HELPER_3(ksll16, tl, env, tl, tl) DEF_HELPER_3(kslra16, tl, env, tl, tl) DEF_HELPER_3(kslra16_u, tl, env, tl, tl) + +DEF_HELPER_3(sra8, tl, env, tl, tl) +DEF_HELPER_3(sra8_u, tl, env, tl, tl) +DEF_HELPER_3(srl8, tl, env, tl, tl) +DEF_HELPER_3(srl8_u, tl, env, tl, tl) +DEF_HELPER_3(sll8, tl, env, tl, tl) +DEF_HELPER_3(ksll8, tl, env, tl, tl) +DEF_HELPER_3(kslra8, tl, env, tl, tl) +DEF_HELPER_3(kslra8_u, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 44c497f28a..8b78fb24bc 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -25,6 +25,7 @@ %sh7 20:7 %sh4 20:4 +%sh3 20:3 %csr 20:12 %rm 12:3 %nf 29:3 !function=ex_plus_1 @@ -63,6 +64,7 @@ @sh ...... ...... ..... ... ..... ....... &shift shamt=%sh7 %rs1 %rd @sh4 ...... ...... ..... ... ..... ....... &shift shamt=%sh4 %rs1 %rd +@sh3 ...... ...... ..... ... ..... ....... &shift shamt=%sh3 %rs1 %rd @csr ............ ..... ... ..... ....... %csr %rs1 %rd @atom_ld ..... aq:1 rl:1 ..... ........ ..... ....... &atomic rs2=0 %rs1 %rd @@ -792,3 +794,18 @@ ksll16 0110010 ..... ..... 000 ..... 1110111 @r kslli16 0111010 1.... ..... 000 ..... 1110111 @sh4 kslra16 0101011 ..... ..... 000 ..... 1110111 @r kslra16_u 0110011 ..... ..... 000 ..... 1110111 @r + +sra8 0101100 ..... ..... 000 ..... 1110111 @r +sra8_u 0110100 ..... ..... 000 ..... 
1110111 @r +srai8 0111100 00... ..... 000 ..... 1110111 @sh3 +srai8_u 0111100 01... ..... 000 ..... 1110111 @sh3 +srl8 0101101 ..... ..... 000 ..... 1110111 @r +srl8_u 0110101 ..... ..... 000 ..... 1110111 @r +srli8 0111101 00... ..... 000 ..... 1110111 @sh3 +srli8_u 0111101 01... ..... 000 ..... 1110111 @sh3 +sll8 0101110 ..... ..... 000 ..... 1110111 @r +slli8 0111110 00... ..... 000 ..... 1110111 @sh3 +ksll8 0110110 ..... ..... 000 ..... 1110111 @r +kslli8 0111110 01... ..... 000 ..... 1110111 @sh3 +kslra8 0101111 ..... ..... 000 ..... 1110111 @r +kslra8_u 0110111 ..... ..... 000 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index afafa49824..e6c5f2ddf5 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -187,3 +187,19 @@ GEN_RVP_SHIFTI(slli16, tcg_gen_vec_shl16i_tl, gen_helper_sll16); GEN_RVP_SHIFTI(srai16_u, NULL, gen_helper_sra16_u); GEN_RVP_SHIFTI(srli16_u, NULL, gen_helper_srl16_u); GEN_RVP_SHIFTI(kslli16, NULL, gen_helper_ksll16); + +/* SIMD 8-bit Shift Instructions */ +GEN_RVP_R_OOL(sra8); +GEN_RVP_R_OOL(srl8); +GEN_RVP_R_OOL(sll8); +GEN_RVP_R_OOL(sra8_u); +GEN_RVP_R_OOL(srl8_u); +GEN_RVP_R_OOL(ksll8); +GEN_RVP_R_OOL(kslra8); +GEN_RVP_R_OOL(kslra8_u); +GEN_RVP_SHIFTI(srai8, tcg_gen_vec_sar8i_tl, gen_helper_sra8); +GEN_RVP_SHIFTI(srli8, tcg_gen_vec_shr8i_tl, gen_helper_srl8); +GEN_RVP_SHIFTI(slli8, tcg_gen_vec_shl8i_tl, gen_helper_sll8); +GEN_RVP_SHIFTI(srai8_u, NULL, gen_helper_sra8_u); +GEN_RVP_SHIFTI(srli8_u, NULL, gen_helper_srl8_u); +GEN_RVP_SHIFTI(kslli8, NULL, gen_helper_ksll8); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 7e31c2fe46..ab9ebc472b 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -529,3 +529,105 @@ static inline void do_kslra16_u(CPURISCVState *env, void *vd, void *va, } RVPR(kslra16_u, 1, 2); + +/* SIMD 8-bit Shift Instructions */ +static inline void 
do_sra8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x7; + d[i] = a[i] >> shift; +} + +RVPR(sra8, 1, 1); + +static inline void do_srl8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x7; + d[i] = a[i] >> shift; +} + +RVPR(srl8, 1, 1); + +static inline void do_sll8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x7; + d[i] = a[i] << shift; +} + +RVPR(sll8, 1, 1); + +static inline void do_sra8_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x7; + d[i] = vssra8(env, 0, a[i], shift); +} + +RVPR(sra8_u, 1, 1); + +static inline void do_srl8_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x7; + d[i] = vssrl8(env, 0, a[i], shift); +} + +RVPR(srl8_u, 1, 1); + +static inline void do_ksll8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, result; + uint8_t shift = *(uint8_t *)vb & 0x7; + + result = a[i] << shift; + if (shift > (clrsb32(a[i]) - 24)) { + env->vxsat = 0x1; + d[i] = (a[i] & INT8_MIN) ? INT8_MIN : INT8_MAX; + } else { + d[i] = result; + } +} + +RVPR(ksll8, 1, 1); + +static inline void do_kslra8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va; + int32_t shift = sextract32((*(uint32_t *)vb), 0, 4); + + if (shift >= 0) { + do_ksll8(env, vd, va, vb, i); + } else { + shift = -shift; + shift = (shift == 8) ? 
7 : shift; + d[i] = a[i] >> shift; + } +} + +RVPR(kslra8, 1, 1); + +static inline void do_kslra8_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va; + int32_t shift = sextract32((*(uint32_t *)vb), 0, 4); + + if (shift >= 0) { + do_ksll8(env, vd, va, vb, i); + } else { + shift = -shift; + shift = (shift == 8) ? 7 : shift; + d[i] = vssra8(env, 0, a[i], shift); + } +} + +RVPR(kslra8_u, 1, 1); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index cf1357cee1..f8d00a7ffa 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -2680,6 +2680,13 @@ void tcg_gen_vec_shl8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) tcg_gen_andi_i64(d, d, mask); } +void tcg_gen_vec_shl8i_i32(TCGv_i32 d, TCGv_i32 a, int32_t c) +{ + uint32_t mask = dup_const(MO_8, 0xff << c); + tcg_gen_shli_i32(d, a, c); + tcg_gen_andi_i32(d, d, mask); +} + void tcg_gen_vec_shl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) { uint64_t mask = dup_const(MO_16, 0xffff << c); @@ -2738,6 +2745,13 @@ void tcg_gen_vec_shr8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) tcg_gen_andi_i64(d, d, mask); } +void tcg_gen_vec_shr8i_i32(TCGv_i32 d, TCGv_i32 a, int32_t c) +{ + uint32_t mask = dup_const(MO_8, 0xff >> c); + tcg_gen_shri_i32(d, a, c); + tcg_gen_andi_i32(d, d, mask); +} + void tcg_gen_vec_shr16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) { uint64_t mask = dup_const(MO_16, 0xffff >> c); @@ -2803,6 +2817,20 @@ void tcg_gen_vec_sar8i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) tcg_temp_free_i64(s); } +void tcg_gen_vec_sar8i_i32(TCGv_i32 d, TCGv_i32 a, int32_t c) +{ + uint32_t s_mask = dup_const(MO_8, 0x80 >> c); + uint32_t c_mask = dup_const(MO_8, 0xff >> c); + TCGv_i32 s = tcg_temp_new_i32(); + + tcg_gen_shri_i32(d, a, c); + tcg_gen_andi_i32(s, d, s_mask); /* isolate (shifted) sign bit */ + tcg_gen_muli_i32(s, s, (2 << c) - 2); /* replicate isolated signs */ + tcg_gen_andi_i32(d, d, c_mask); /* clear out bits above sign */ + tcg_gen_or_i32(d, d, s); /* include sign extension */ + 
tcg_temp_free_i32(s); +} + void tcg_gen_vec_sar16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) { uint64_t s_mask = dup_const(MO_16, 0x8000 >> c); From patchwork Thu Jun 10 07:58:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490239 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xVD1YbMz9s1l for ; Thu, 10 Jun 2021 18:10:12 +1000 (AEST) Received: from localhost ([::1]:58112 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFlR-00021f-Vd for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:10:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40896) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFeW-00017h-9C; Thu, 10 Jun 2021 04:03:00 -0400 Received: from out28-125.mail.aliyun.com ([115.124.28.125]:56897) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFeQ-0007wR-96; Thu, 10 Jun 2021 04:02:58 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.3716359|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_regular_dialog|0.77407-0.00350292-0.222427; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047211; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMqYkU_1623312167; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMqYkU_1623312167) by smtp.aliyun-inc.com(10.147.42.22); Thu, 10 Jun 2021 
16:02:48 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 07/37] target/riscv: SIMD 16-bit Compare Instructions Date: Thu, 10 Jun 2021 15:58:38 +0800 Message-Id: <20210610075908.3305506-8-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.125; envelope-from=zhiwei_liu@c-sky.com; helo=out28-125.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This patch adds five 16-bit compare instructions: equal, signed less than, signed less than or equal, unsigned less than, and unsigned less than or equal.
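The sketch below models three of these compares on a 32-bit packed word, to show how signed and unsigned variants differ on the same bits. It assumes each 16-bit lane of the result is set to 0xffff when the comparison holds and to 0 otherwise, by analogy with the 0xff/0x0 pattern of the 8-bit helpers in this series. The function names are hypothetical and this is not the helper code itself.

```c
#include <stdint.h>

/* Equal: each 16-bit lane becomes 0xffff if a == b, else 0. */
static uint32_t cmpeq16(uint32_t a, uint32_t b)
{
    uint32_t d = 0;
    for (int i = 0; i < 2; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        uint16_t y = (uint16_t)(b >> (16 * i));
        d |= (uint32_t)(x == y ? 0xffffu : 0u) << (16 * i);
    }
    return d;
}

/* Signed less than: lanes are reinterpreted as int16_t. */
static uint32_t scmplt16(uint32_t a, uint32_t b)
{
    uint32_t d = 0;
    for (int i = 0; i < 2; i++) {
        /* Assumes two's complement wrap on the narrowing cast. */
        int16_t x = (int16_t)(uint16_t)(a >> (16 * i));
        int16_t y = (int16_t)(uint16_t)(b >> (16 * i));
        d |= (uint32_t)(x < y ? 0xffffu : 0u) << (16 * i);
    }
    return d;
}

/* Unsigned less than: lanes are compared as uint16_t. */
static uint32_t ucmplt16(uint32_t a, uint32_t b)
{
    uint32_t d = 0;
    for (int i = 0; i < 2; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        uint16_t y = (uint16_t)(b >> (16 * i));
        d |= (uint32_t)(x < y ? 0xffffu : 0u) << (16 * i);
    }
    return d;
}
```

With a = 0x80000001 and b = 0x00000002, scmplt16 returns 0xffffffff because 0x8000 is negative as a signed lane, while ucmplt16 returns 0x0000ffff.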
Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis X-Patchwork-Id: 1490232 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xQ62Sl0z9sPf for ; Thu, 10 Jun 2021 18:06:38 +1000 (AEST) Received: from localhost ([::1]:51910 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFi0-0005zh-AC for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:06:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41016) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFew-0002mb-3j; Thu, 10 Jun 2021 04:03:26 -0400 Received: from out28-1.mail.aliyun.com ([115.124.28.1]:35454) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFet-0008Cx-9O; Thu, 10 Jun 2021 04:03:25 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07664509|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.664607-0.0175856-0.317808; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047207; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=8; RT=7; SR=0; TI=SMTPD_---.KQMleyO_1623312198; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMleyO_1623312198) by smtp.aliyun-inc.com(10.147.44.145); Thu, 10 Jun 2021 16:03:18 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 08/37] target/riscv: SIMD 8-bit Compare Instructions Date: Thu, 10 Jun 2021 15:58:39 +0800 Message-Id: <20210610075908.3305506-9-zhiwei_liu@c-sky.com> 
X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.1; envelope-from=zhiwei_liu@c-sky.com; helo=out28-1.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This patch adds five 8-bit compare instructions: equal, signed less than, signed less than or equal, unsigned less than, and unsigned less than or equal.
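Because each result lane is an all-ones or all-zeros byte, the output of a compare can serve directly as a select mask. The following sketch illustrates that use with a per-byte unsigned maximum. It is an illustration only: ucmplt8 here is a plain C model on a 32-bit word, and umax8_via_mask is a hypothetical name that does not appear in this patch.

```c
#include <stdint.h>

/* Unsigned less than: each byte lane becomes 0xff if a < b, else 0. */
static uint32_t ucmplt8(uint32_t a, uint32_t b)
{
    uint32_t d = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        d |= (uint32_t)(x < y ? 0xffu : 0u) << (8 * i);
    }
    return d;
}

/* Per-byte unsigned maximum, built from the compare mask. */
static uint32_t umax8_via_mask(uint32_t a, uint32_t b)
{
    uint32_t m = ucmplt8(a, b);   /* 0xff where b is the larger lane */
    return (a & ~m) | (b & m);    /* take b there, a everywhere else */
}
```

For a = 0x01FF0305 and b = 0x02000304 the mask is 0xFF000000, so only the top byte is taken from b and the result is 0x02FF0305.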
Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis --- target/riscv/helper.h | 6 ++++ target/riscv/insn32.decode | 6 ++++ target/riscv/insn_trans/trans_rvp.c.inc | 7 ++++ target/riscv/packed_helper.c | 46 +++++++++++++++++++++++++ 4 files changed, 65 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 830845761b..c424e45fe5 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1212,3 +1212,9 @@ DEF_HELPER_3(scmplt16, tl, env, tl, tl) DEF_HELPER_3(scmple16, tl, env, tl, tl) DEF_HELPER_3(ucmplt16, tl, env, tl, tl) DEF_HELPER_3(ucmple16, tl, env, tl, tl) + +DEF_HELPER_3(cmpeq8, tl, env, tl, tl) +DEF_HELPER_3(scmplt8, tl, env, tl, tl) +DEF_HELPER_3(scmple8, tl, env, tl, tl) +DEF_HELPER_3(ucmplt8, tl, env, tl, tl) +DEF_HELPER_3(ucmple8, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 5031cebf1f..fdbf3798c7 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -815,3 +815,9 @@ scmplt16 0000110 ..... ..... 000 ..... 1110111 @r scmple16 0001110 ..... ..... 000 ..... 1110111 @r ucmplt16 0010110 ..... ..... 000 ..... 1110111 @r ucmple16 0011110 ..... ..... 000 ..... 1110111 @r + +cmpeq8 0100111 ..... ..... 000 ..... 1110111 @r +scmplt8 0000111 ..... ..... 000 ..... 1110111 @r +scmple8 0001111 ..... ..... 000 ..... 1110111 @r +ucmplt8 0010111 ..... ..... 000 ..... 1110111 @r +ucmple8 0011111 ..... ..... 000 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 65199ffb5a..aa432701c8 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -210,3 +210,10 @@ GEN_RVP_R_OOL(scmplt16); GEN_RVP_R_OOL(scmple16); GEN_RVP_R_OOL(ucmplt16); GEN_RVP_R_OOL(ucmple16); + +/* SIMD 8-bit Compare Instructions */ +GEN_RVP_R_OOL(cmpeq8); +GEN_RVP_R_OOL(scmplt8); +GEN_RVP_R_OOL(scmple8); +GEN_RVP_R_OOL(ucmplt8); +GEN_RVP_R_OOL(ucmple8); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 30b916b5ad..ff86e015e4 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -677,3 +677,49 @@ static inline void do_ucmple16(CPURISCVState *env, void *vd, void *va, } RVPR(ucmple16, 1, 2); + +/* SIMD 8-bit Compare Instructions */ +static inline void do_cmpeq8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = (a[i] == b[i]) ? 0xff : 0x0; +} + +RVPR(cmpeq8, 1, 1); + +static inline void do_scmplt8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, *b = vb; + d[i] = (a[i] < b[i]) ? 0xff : 0x0; +} + +RVPR(scmplt8, 1, 1); + +static inline void do_scmple8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int8_t *d = vd, *a = va, *b = vb; + d[i] = (a[i] <= b[i]) ? 0xff : 0x0; +} + +RVPR(scmple8, 1, 1); + +static inline void do_ucmplt8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = (a[i] < b[i]) ? 0xff : 0x0; +} + +RVPR(ucmplt8, 1, 1); + +static inline void do_ucmple8(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint8_t *d = vd, *a = va, *b = vb; + d[i] = (a[i] <= b[i]) ? 
0xff : 0x0; +} + +RVPR(ucmple8, 1, 1); From patchwork Thu Jun 10 07:58:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490230 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xNr1Bftz9sW7 for ; Thu, 10 Jun 2021 18:05:30 +1000 (AEST) Received: from localhost ([::1]:49286 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFgt-0004AM-Qf for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:05:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41092) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFfU-0003kd-Df; Thu, 10 Jun 2021 04:04:02 -0400 Received: from out28-4.mail.aliyun.com ([115.124.28.4]:39469) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFfN-00007N-Tp; Thu, 10 Jun 2021 04:04:00 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436734|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.469349-9.41521e-05-0.530557; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047206; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMTtgQ_1623312229; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMTtgQ_1623312229) by smtp.aliyun-inc.com(10.147.44.129); Thu, 10 Jun 2021 16:03:49 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 
09/37] target/riscv: SIMD 16-bit Multiply Instructions Date: Thu, 10 Jun 2021 15:58:40 +0800 Message-Id: <20210610075908.3305506-10-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.4; envelope-from=zhiwei_liu@c-sky.com; helo=out28-4.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This patch adds six instructions: 16-bit signed and unsigned multiply, 16-bit signed and unsigned crossed multiply, and Q15 signed saturating multiply in plain and crossed forms.
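The plain and crossed widening multiplies differ only in which lanes are paired. The sketch below models the signed forms on the low 32 bits of each source and returns the two 32-bit products packed into a 64-bit value, with the lane 0 product in the low half. It also shows how that value would be split across a register pair on RV32 (low half in rd, high half in rd + 1). The function names mirror the instruction names but are stand-alone models, and split_pair is a hypothetical helper; none of this is the TCG code from the patch.

```c
#include <stdint.h>

/* Plain form: lane 0 times lane 0, lane 1 times lane 1. */
static uint64_t smul16(uint32_t a, uint32_t b)
{
    /* Assumes two's complement wrap on the narrowing casts. */
    int32_t lo = (int32_t)(int16_t)(a & 0xffff) * (int16_t)(b & 0xffff);
    int32_t hi = (int32_t)(int16_t)(a >> 16) * (int16_t)(b >> 16);
    return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
}

/* Crossed form: lane 0 of a times lane 1 of b, and vice versa. */
static uint64_t smulx16(uint32_t a, uint32_t b)
{
    int32_t lo = (int32_t)(int16_t)(a & 0xffff) * (int16_t)(b >> 16);
    int32_t hi = (int32_t)(int16_t)(a >> 16) * (int16_t)(b & 0xffff);
    return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
}

/* RV32 view: the 64-bit result occupies two 32-bit registers. */
static void split_pair(uint64_t v, uint32_t *rd_lo, uint32_t *rd_hi)
{
    *rd_lo = (uint32_t)v;          /* would go to rd */
    *rd_hi = (uint32_t)(v >> 32);  /* would go to rd + 1 */
}
```

For a = 0x00030002 and b = 0x00050004, smul16 gives 0x0000000F00000008 (3 * 5 and 2 * 4) and smulx16 gives 0x0000000C0000000A (3 * 4 and 2 * 5).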
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 7 ++ target/riscv/insn32.decode | 7 ++ target/riscv/insn_trans/trans_rvp.c.inc | 69 ++++++++++++++++ target/riscv/packed_helper.c | 104 ++++++++++++++++++++++++ 4 files changed, 187 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index c424e45fe5..d13b84f165 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1218,3 +1218,10 @@ DEF_HELPER_3(scmplt8, tl, env, tl, tl) DEF_HELPER_3(scmple8, tl, env, tl, tl) DEF_HELPER_3(ucmplt8, tl, env, tl, tl) DEF_HELPER_3(ucmple8, tl, env, tl, tl) + +DEF_HELPER_3(smul16, i64, env, tl, tl) +DEF_HELPER_3(smulx16, i64, env, tl, tl) +DEF_HELPER_3(umul16, i64, env, tl, tl) +DEF_HELPER_3(umulx16, i64, env, tl, tl) +DEF_HELPER_3(khm16, tl, env, tl, tl) +DEF_HELPER_3(khmx16, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index fdbf3798c7..cbee995229 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -821,3 +821,10 @@ scmplt8 0000111 ..... ..... 000 ..... 1110111 @r scmple8 0001111 ..... ..... 000 ..... 1110111 @r ucmplt8 0010111 ..... ..... 000 ..... 1110111 @r ucmple8 0011111 ..... ..... 000 ..... 1110111 @r + +smul16 1010000 ..... ..... 000 ..... 1110111 @r +smulx16 1010001 ..... ..... 000 ..... 1110111 @r +umul16 1011000 ..... ..... 000 ..... 1110111 @r +umulx16 1011001 ..... ..... 000 ..... 1110111 @r +khm16 1000011 ..... ..... 000 ..... 1110111 @r +khmx16 1001011 ..... ..... 000 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index aa432701c8..b93ba63dd8 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -217,3 +217,72 @@ GEN_RVP_R_OOL(scmplt8); GEN_RVP_R_OOL(scmple8); GEN_RVP_R_OOL(ucmplt8); GEN_RVP_R_OOL(ucmple8); + +/* SIMD 16-bit Multiply Instructions */ +static void set_pair_regs(DisasContext *ctx, TCGv_i64 dst, int rd) +{ + TCGv t1, t2; + + t1 = tcg_temp_new(); + t2 = tcg_temp_new(); + + if (is_32bit(ctx)) { + TCGv_i32 lo, hi; + + lo = tcg_temp_new_i32(); + hi = tcg_temp_new_i32(); + tcg_gen_extr_i64_i32(lo, hi, dst); + + tcg_gen_ext_i32_tl(t1, lo); + tcg_gen_ext_i32_tl(t2, hi); + + gen_set_gpr(rd, t1); + gen_set_gpr(rd + 1, t2); + tcg_temp_free_i32(lo); + tcg_temp_free_i32(hi); + } else { + tcg_gen_trunc_i64_tl(t1, dst); + gen_set_gpr(rd, t1); + } + tcg_temp_free(t1); + tcg_temp_free(t2); +} + +static inline bool +r_d64_ool(DisasContext *ctx, arg_r *a, + void (* fn)(TCGv_i64, TCGv_ptr, TCGv, TCGv)) +{ + TCGv t1, t2; + TCGv_i64 t3; + + if (!has_ext(ctx, RVP) || !ctx->ext_psfoperand) { + return false; + } + + t1 = tcg_temp_new(); + t2 = tcg_temp_new(); + t3 = tcg_temp_new_i64(); + + gen_get_gpr(t1, a->rs1); + gen_get_gpr(t2, a->rs2); + fn(t3, cpu_env, t1, t2); + set_pair_regs(ctx, t3, a->rd); + + tcg_temp_free(t1); + tcg_temp_free(t2); + tcg_temp_free_i64(t3); + return true; +} + +#define GEN_RVP_R_D64_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + return r_d64_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP_R_D64_OOL(smul16); +GEN_RVP_R_D64_OOL(smulx16); +GEN_RVP_R_D64_OOL(umul16); +GEN_RVP_R_D64_OOL(umulx16); +GEN_RVP_R_OOL(khm16); +GEN_RVP_R_OOL(khmx16); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index ff86e015e4..13fed2c4d1 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -723,3 +723,107 @@ static inline void do_ucmple8(CPURISCVState *env, 
void *vd, void *va, } RVPR(ucmple8, 1, 1); + +/* SIMD 16-bit Multiply Instructions */ +typedef void PackedFn3(CPURISCVState *, void *, void *, void *); +static inline uint64_t rvpr64(CPURISCVState *env, target_ulong a, + target_ulong b, PackedFn3 *fn) +{ + uint64_t result; + + fn(env, &result, &a, &b); + return result; +} + +#define RVPR64(NAME) \ +uint64_t HELPER(NAME)(CPURISCVState *env, target_ulong a, \ + target_ulong b) \ +{ \ + return rvpr64(env, a, b, (PackedFn3 *)do_##NAME); \ +} + +static inline void do_smul16(CPURISCVState *env, void *vd, void *va, void *vb) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(0)] = (int32_t)a[H2(0)] * b[H2(0)]; + d[H4(1)] = (int32_t)a[H2(1)] * b[H2(1)]; +} + +RVPR64(smul16); + +static inline void do_smulx16(CPURISCVState *env, void *vd, void *va, void *vb) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(0)] = (int32_t)a[H2(0)] * b[H2(1)]; + d[H4(1)] = (int32_t)a[H2(1)] * b[H2(0)]; +} + +RVPR64(smulx16); + +static inline void do_umul16(CPURISCVState *env, void *vd, void *va, + void *vb) +{ + uint32_t *d = vd; + uint16_t *a = va, *b = vb; + d[H4(0)] = (uint32_t)a[H2(0)] * b[H2(0)]; + d[H4(1)] = (uint32_t)a[H2(1)] * b[H2(1)]; +} + +RVPR64(umul16); + +static inline void do_umulx16(CPURISCVState *env, void *vd, void *va, + void *vb) +{ + uint32_t *d = vd; + uint16_t *a = va, *b = vb; + d[H4(0)] = (uint32_t)a[H2(0)] * b[H2(1)]; + d[H4(1)] = (uint32_t)a[H2(1)] * b[H2(0)]; +} + +RVPR64(umulx16); + +static inline void do_khm16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + + if (a[i] == INT16_MIN && b[i] == INT16_MIN) { + env->vxsat = 1; + d[i] = INT16_MAX; + } else { + d[i] = (int32_t)a[i] * b[i] >> 15; + } +} + +RVPR(khm16, 1, 2); + +static inline void do_khmx16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + + /* + * t[x] = ra.H[x] s* rb.H[y]; + * rt.H[x] = SAT.Q15(t[x] s>> 15); 
+     *
+     * (RV32: (x,y)=(1,0),(0,1),
+     *  RV64: (x,y)=(3,2),(2,3),
+     *                (1,0),(0,1)
+     */
+    if (a[H2(i)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) {
+        env->vxsat = 1;
+        d[H2(i)] = INT16_MAX;
+    } else {
+        d[H2(i)] = (int32_t)a[H2(i)] * b[H2(i + 1)] >> 15;
+    }
+    if (a[H2(i + 1)] == INT16_MIN && b[H2(i)] == INT16_MIN) {
+        env->vxsat = 1;
+        d[H2(i + 1)] = INT16_MAX;
+    } else {
+        d[H2(i + 1)] = (int32_t)a[H2(i + 1)] * b[H2(i)] >> 15;
+    }
+}
+
+RVPR(khmx16, 2, 2);

From patchwork Thu Jun 10 07:58:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490241
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Subject: [PATCH v2 10/37] target/riscv: SIMD 8-bit Multiply Instructions
Date: Thu, 10 Jun 2021 15:58:41 +0800
Message-Id: <20210610075908.3305506-11-zhiwei_liu@c-sky.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis, LIU Zhiwei

There are 6 instructions: 8-bit signed and unsigned multiply, 8-bit
signed and unsigned crossed multiply, and Q7 signed saturating multiply
in plain and crossed forms.
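
For illustration only (not part of the patch): a minimal standalone sketch
of the per-lane arithmetic described above. The names smul8_lane, khm8_lane,
khmx8_pair and the global vxsat are hypothetical stand-ins for the QEMU
helpers and env->vxsat; the sketch ignores host endianness (the H1/H2
macros) and works on plain arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the env->vxsat overflow flag. */
static int vxsat;

/* Widening multiply of one lane (the smul8 case): 8-bit x 8-bit -> 16-bit. */
static int16_t smul8_lane(int8_t a, int8_t b)
{
    return (int16_t)(a * b);
}

/*
 * Q7 saturating multiply of one lane (the khm8 case).
 * Only -128 * -128 overflows Q7, so that case saturates to INT8_MAX.
 * The right shift of a negative product is assumed to be arithmetic.
 */
static int8_t khm8_lane(int8_t a, int8_t b)
{
    if (a == INT8_MIN && b == INT8_MIN) {
        vxsat = 1;
        return INT8_MAX;
    }
    return (int8_t)((a * b) >> 7);
}

/* Crossed form (the khmx8 case) on one byte pair: lane 0 uses b[1], lane 1 uses b[0]. */
static void khmx8_pair(int8_t d[2], const int8_t a[2], const int8_t b[2])
{
    d[0] = khm8_lane(a[0], b[1]);
    d[1] = khm8_lane(a[1], b[0]);
}
```

In Q7 terms, 64 represents 0.5, so khm8_lane(64, 64) yields 32, i.e. 0.25.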

Signed-off-by: LIU Zhiwei
Acked-by: Alistair Francis
---
 target/riscv/helper.h                   |  7 ++
 target/riscv/insn32.decode              |  7 ++
 target/riscv/insn_trans/trans_rvp.c.inc |  8 +++
 target/riscv/packed_helper.c            | 93 +++++++++++++++++++++++++
 4 files changed, 115 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index d13b84f165..4d0918b9a9 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1225,3 +1225,10 @@ DEF_HELPER_3(umul16, i64, env, tl, tl)
 DEF_HELPER_3(umulx16, i64, env, tl, tl)
 DEF_HELPER_3(khm16, tl, env, tl, tl)
 DEF_HELPER_3(khmx16, tl, env, tl, tl)
+
+DEF_HELPER_3(smul8, i64, env, tl, tl)
+DEF_HELPER_3(smulx8, i64, env, tl, tl)
+DEF_HELPER_3(umul8, i64, env, tl, tl)
+DEF_HELPER_3(umulx8, i64, env, tl, tl)
+DEF_HELPER_3(khm8, tl, env, tl, tl)
+DEF_HELPER_3(khmx8, tl, env, tl, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index cbee995229..05c3e67477 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -828,3 +828,10 @@ umul16 1011000 ..... ..... 000 ..... 1110111 @r
 umulx16    1011001 ..... ..... 000 ..... 1110111 @r
 khm16      1000011 ..... ..... 000 ..... 1110111 @r
 khmx16     1001011 ..... ..... 000 ..... 1110111 @r
+
+smul8      1010100 ..... ..... 000 ..... 1110111 @r
+smulx8     1010101 ..... ..... 000 ..... 1110111 @r
+umul8      1011100 ..... ..... 000 ..... 1110111 @r
+umulx8     1011101 ..... ..... 000 ..... 1110111 @r
+khm8       1000111 ..... ..... 000 ..... 1110111 @r
+khmx8      1001111 ..... ..... 000 ..... 1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index b93ba63dd8..2188de8505 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -286,3 +286,11 @@ GEN_RVP_R_D64_OOL(umul16);
 GEN_RVP_R_D64_OOL(umulx16);
 GEN_RVP_R_OOL(khm16);
 GEN_RVP_R_OOL(khmx16);
+
+/* SIMD 8-bit Multiply Instructions */
+GEN_RVP_R_D64_OOL(smul8);
+GEN_RVP_R_D64_OOL(smulx8);
+GEN_RVP_R_D64_OOL(umul8);
+GEN_RVP_R_D64_OOL(umulx8);
+GEN_RVP_R_OOL(khm8);
+GEN_RVP_R_OOL(khmx8);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 13fed2c4d1..56baefeb8e 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -827,3 +827,96 @@ static inline void do_khmx16(CPURISCVState *env, void *vd, void *va,
 }
 
 RVPR(khmx16, 2, 2);
+
+/* SIMD 8-bit Multiply Instructions */
+static inline void do_smul8(CPURISCVState *env, void *vd, void *va, void *vb)
+{
+    int16_t *d = vd;
+    int8_t *a = va, *b = vb;
+    d[H2(0)] = (int16_t)a[H1(0)] * b[H1(0)];
+    d[H2(1)] = (int16_t)a[H1(1)] * b[H1(1)];
+    d[H2(2)] = (int16_t)a[H1(2)] * b[H1(2)];
+    d[H2(3)] = (int16_t)a[H1(3)] * b[H1(3)];
+}
+
+RVPR64(smul8);
+
+static inline void do_smulx8(CPURISCVState *env, void *vd, void *va, void *vb)
+{
+    int16_t *d = vd;
+    int8_t *a = va, *b = vb;
+    d[H2(0)] = (int16_t)a[H1(0)] * b[H1(1)];
+    d[H2(1)] = (int16_t)a[H1(1)] * b[H1(0)];
+    d[H2(2)] = (int16_t)a[H1(2)] * b[H1(3)];
+    d[H2(3)] = (int16_t)a[H1(3)] * b[H1(2)];
+}
+
+RVPR64(smulx8);
+
+static inline void do_umul8(CPURISCVState *env, void *vd, void *va, void *vb)
+{
+    uint16_t *d = vd;
+    uint8_t *a = va, *b = vb;
+    d[H2(0)] = (uint16_t)a[H1(0)] * b[H1(0)];
+    d[H2(1)] = (uint16_t)a[H1(1)] * b[H1(1)];
+    d[H2(2)] = (uint16_t)a[H1(2)] * b[H1(2)];
+    d[H2(3)] = (uint16_t)a[H1(3)] * b[H1(3)];
+}
+
+RVPR64(umul8);
+
+static inline void do_umulx8(CPURISCVState *env, void *vd, void *va, void *vb)
+{
+    uint16_t *d = vd;
+    uint8_t *a = va, *b = vb;
+    d[H2(0)] = (uint16_t)a[H1(0)] * b[H1(1)];
+    d[H2(1)] = (uint16_t)a[H1(1)] * b[H1(0)];
+    d[H2(2)] = (uint16_t)a[H1(2)] * b[H1(3)];
+    d[H2(3)] = (uint16_t)a[H1(3)] * b[H1(2)];
+}
+
+RVPR64(umulx8);
+
+static inline void do_khm8(CPURISCVState *env, void *vd, void *va,
+                           void *vb, uint8_t i)
+{
+    int8_t *d = vd, *a = va, *b = vb;
+
+    if (a[i] == INT8_MIN && b[i] == INT8_MIN) {
+        env->vxsat = 1;
+        d[i] = INT8_MAX;
+    } else {
+        d[i] = (int16_t)a[i] * b[i] >> 7;
+    }
+}
+
+RVPR(khm8, 1, 1);
+
+static inline void do_khmx8(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    int8_t *d = vd, *a = va, *b = vb;
+    /*
+     * t[x] = ra.B[x] s* rb.B[y];
+     * rt.B[x] = SAT.Q7(t[x] s>> 7);
+     *
+     * (RV32: (x,y)=(3,2),(2,3),
+     *                (1,0),(0,1),
+     * (RV64: (x,y)=(7,6),(6,7),(5,4),(4,5),
+     *                (3,2),(2,3),(1,0),(0,1))
+     */
+    if (a[H1(i)] == INT8_MIN && b[H1(i + 1)] == INT8_MIN) {
+        env->vxsat = 1;
+        d[H1(i)] = INT8_MAX;
+    } else {
+        d[H1(i)] = (int16_t)a[H1(i)] * b[H1(i + 1)] >> 7;
+    }
+    if (a[H1(i + 1)] == INT8_MIN && b[H1(i)] == INT8_MIN) {
+        env->vxsat = 1;
+        d[H1(i + 1)] = INT8_MAX;
+    } else {
+        d[H1(i + 1)] = (int16_t)a[H1(i + 1)] * b[H1(i)] >> 7;
+    }
+}
+
+RVPR(khmx8, 2, 1);

From patchwork Thu Jun 10 07:58:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490246
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Subject: [PATCH v2 11/37] target/riscv: SIMD 16-bit Miscellaneous Instructions
Date: Thu, 10 Jun 2021 15:58:42 +0800
Message-Id: <20210610075908.3305506-12-zhiwei_liu@c-sky.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis, LIU Zhiwei

There are 11 instructions, including signed or unsigned minimum,
maximum, clip value, absolute value, and leading zero, leading one
count instructions.

Signed-off-by: LIU Zhiwei
Acked-by: Alistair Francis
---
 target/riscv/helper.h                   |  11 ++
 target/riscv/insn32.decode              |  11 ++
 target/riscv/insn_trans/trans_rvp.c.inc |  41 ++++++
 target/riscv/packed_helper.c            | 158 ++++++++++++++++++++++++
 4 files changed, 221 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 4d0918b9a9..88035aafad 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1232,3 +1232,14 @@ DEF_HELPER_3(umul8, i64, env, tl, tl)
 DEF_HELPER_3(umulx8, i64, env, tl, tl)
 DEF_HELPER_3(khm8, tl, env, tl, tl)
 DEF_HELPER_3(khmx8, tl, env, tl, tl)
+
+DEF_HELPER_3(smin16, tl, env, tl, tl)
+DEF_HELPER_3(umin16, tl, env, tl, tl)
+DEF_HELPER_3(smax16, tl, env, tl, tl)
+DEF_HELPER_3(umax16, tl, env, tl, tl)
+DEF_HELPER_3(sclip16, tl, env, tl, tl)
+DEF_HELPER_3(uclip16, tl, env, tl, tl)
+DEF_HELPER_2(kabs16, tl, env, tl)
+DEF_HELPER_2(clrs16, tl, env, tl)
+DEF_HELPER_2(clz16, tl, env, tl)
+DEF_HELPER_2(clo16, tl, env, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 05c3e67477..847c796874 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -835,3 +835,14 @@ umul8 1011100 ..... ..... 000 ..... 1110111 @r
 umulx8     1011101 ..... ..... 000 ..... 1110111 @r
 khm8       1000111 ..... ..... 000 ..... 1110111 @r
 khmx8      1001111 ..... ..... 000 ..... 1110111 @r
+
+smin16     1000000 ..... ..... 000 ..... 1110111 @r
+umin16     1001000 ..... ..... 000 ..... 1110111 @r
+smax16     1000001 ..... ..... 000 ..... 1110111 @r
+umax16     1001001 ..... ..... 000 ..... 1110111 @r
+sclip16    1000010 0.... ..... 000 ..... 1110111 @sh4
+uclip16    1000010 1.... ..... 000 ..... 1110111 @sh4
+kabs16     1010110 10001 ..... 000 ..... 1110111 @r2
+clrs16     1010111 01000 ..... 000 ..... 1110111 @r2
+clz16      1010111 01001 ..... 000 ..... 1110111 @r2
+clo16      1010111 01011 ..... 000 ..... 1110111 @r2
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index 2188de8505..3e6307cdc3 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -294,3 +294,44 @@ GEN_RVP_R_D64_OOL(umul8);
 GEN_RVP_R_D64_OOL(umulx8);
 GEN_RVP_R_OOL(khm8);
 GEN_RVP_R_OOL(khmx8);
+
+/* SIMD 16-bit Miscellaneous Instructions */
+GEN_RVP_R_OOL(smin16);
+GEN_RVP_R_OOL(umin16);
+GEN_RVP_R_OOL(smax16);
+GEN_RVP_R_OOL(umax16);
+GEN_RVP_SHIFTI(sclip16, NULL, gen_helper_sclip16);
+GEN_RVP_SHIFTI(uclip16, NULL, gen_helper_uclip16);
+
+/* Out of line helpers for R2 format */
+static bool
+r2_ool(DisasContext *ctx, arg_r2 *a,
+       void (* fn)(TCGv, TCGv_ptr, TCGv))
+{
+    TCGv src1, dst;
+    if (!has_ext(ctx, RVP)) {
+        return false;
+    }
+
+    src1 = tcg_temp_new();
+    dst = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    fn(dst, cpu_env, src1);
+    gen_set_gpr(a->rd, dst);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(dst);
+    return true;
+}
+
+#define GEN_RVP_R2_OOL(NAME)                           \
+static bool trans_##NAME(DisasContext *s, arg_r2 *a)   \
+{                                                      \
+    return r2_ool(s, a, gen_helper_##NAME);            \
+}
+
+GEN_RVP_R2_OOL(kabs16);
+GEN_RVP_R2_OOL(clrs16);
+GEN_RVP_R2_OOL(clz16);
+GEN_RVP_R2_OOL(clo16);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 56baefeb8e..e4a9463135 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -920,3 +920,161 @@ static inline void do_khmx8(CPURISCVState *env, void *vd, void *va,
 }
 
 RVPR(khmx8, 2, 1);
+
+/* SIMD 16-bit Miscellaneous Instructions */
+static inline void do_smin16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] < b[i]) ?
+        a[i] : b[i];
+}
+
+RVPR(smin16, 1, 2);
+
+static inline void do_umin16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] < b[i]) ? a[i] : b[i];
+}
+
+RVPR(umin16, 1, 2);
+
+static inline void do_smax16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] > b[i]) ? a[i] : b[i];
+}
+
+RVPR(smax16, 1, 2);
+
+static inline void do_umax16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] > b[i]) ? a[i] : b[i];
+}
+
+RVPR(umax16, 1, 2);
+
+static int64_t sat64(CPURISCVState *env, int64_t a, uint8_t shift)
+{
+    int64_t max = shift >= 64 ? INT64_MAX : (1ull << shift) - 1;
+    int64_t min = shift >= 64 ? INT64_MIN : -(1ull << shift);
+    int64_t result;
+
+    if (a > max) {
+        result = max;
+        env->vxsat = 0x1;
+    } else if (a < min) {
+        result = min;
+        env->vxsat = 0x1;
+    } else {
+        result = a;
+    }
+    return result;
+}
+
+static inline void do_sclip16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va;
+    uint8_t shift = *(uint8_t *)vb & 0xf;
+
+    d[i] = sat64(env, a[i], shift);
+}
+
+RVPR(sclip16, 1, 2);
+
+static uint64_t satu64(CPURISCVState *env, uint64_t a, uint8_t shift)
+{
+    uint64_t max = shift >= 64 ? UINT64_MAX : (1ull << shift) - 1;
+    uint64_t result;
+
+    if (a > max) {
+        result = max;
+        env->vxsat = 0x1;
+    } else {
+        result = a;
+    }
+    return result;
+}
+
+static inline void do_uclip16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va;
+    uint8_t shift = *(uint8_t *)vb & 0xf;
+
+    if (a[i] < 0) {
+        d[i] = 0;
+        env->vxsat = 0x1;
+    } else {
+        d[i] = satu64(env, a[i], shift);
+    }
+}
+
+RVPR(uclip16, 1, 2);
+
+typedef void PackedFn2i(CPURISCVState *, void *, void *, uint8_t);
+
+static inline target_ulong rvpr2(CPURISCVState *env, target_ulong a,
+                                 uint8_t step, uint8_t size, PackedFn2i *fn)
+{
+    int i, passes = sizeof(target_ulong) / size;
+    target_ulong result;
+
+    for (i = 0; i < passes; i += step) {
+        fn(env, &result, &a, i);
+    }
+    return result;
+}
+
+#define RVPR2(NAME, STEP, SIZE)                                  \
+target_ulong HELPER(NAME)(CPURISCVState *env, target_ulong a)    \
+{                                                                \
+    return rvpr2(env, a, STEP, SIZE, (PackedFn2i *)do_##NAME);   \
+}
+
+static inline void do_kabs16(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int16_t *d = vd, *a = va;
+
+    if (a[i] == INT16_MIN) {
+        d[i] = INT16_MAX;
+        env->vxsat = 0x1;
+    } else {
+        d[i] = abs(a[i]);
+    }
+}
+
+RVPR2(kabs16, 1, 2);
+
+static inline void do_clrs16(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int16_t *d = vd, *a = va;
+    d[i] = clrsb32(a[i]) - 16;
+}
+
+RVPR2(clrs16, 1, 2);
+
+static inline void do_clz16(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int16_t *d = vd, *a = va;
+    d[i] = (a[i] < 0) ? 0 : (clz32(a[i]) - 16);
+}
+
+RVPR2(clz16, 1, 2);
+
+static inline void do_clo16(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int16_t *d = vd, *a = va;
+    d[i] = (a[i] >= 0) ?
+        0 : (clo32(a[i]) - 16);
+}
+
+RVPR2(clo16, 1, 2);

From patchwork Thu Jun 10 07:58:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490244
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Subject: [PATCH v2 12/37] target/riscv: SIMD 8-bit Miscellaneous Instructions
Date: Thu, 10 Jun 2021 15:58:43 +0800
Message-Id: <20210610075908.3305506-13-zhiwei_liu@c-sky.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis, LIU Zhiwei

Instructions include signed or unsigned minimum, maximum, clip value,
absolute value, and leading zero, leading one count instructions.
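
For illustration only (not part of the patch): a minimal standalone sketch
of the clip and leading-zero-count behaviour on a single 8-bit lane. The
names sclip8_lane, uclip8_lane, clz8_lane and the global vxsat are
hypothetical; the sketch assumes the clip range is [-2^imm, 2^imm - 1] for
the signed form and [0, 2^imm - 1] for the unsigned form, with imm in 0..7,
which is what the sat64/satu64 helpers compute.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the env->vxsat overflow flag. */
static int vxsat;

/* Signed clip of one lane into [-2^imm, 2^imm - 1]; imm is expected in 0..7. */
static int8_t sclip8_lane(int8_t a, unsigned imm)
{
    int max = (1 << imm) - 1;
    int min = -(1 << imm);

    if (a > max) {
        vxsat = 1;
        return (int8_t)max;
    }
    if (a < min) {
        vxsat = 1;
        return (int8_t)min;
    }
    return a;
}

/* Unsigned clip of one signed lane into [0, 2^imm - 1]; negatives clip to 0. */
static int8_t uclip8_lane(int8_t a, unsigned imm)
{
    int max = (1 << imm) - 1;

    if (a < 0) {
        vxsat = 1;
        return 0;
    }
    if (a > max) {
        vxsat = 1;
        return (int8_t)max;
    }
    return a;
}

/* Leading zero count of one 8-bit lane; returns 8 for a zero input. */
static unsigned clz8_lane(uint8_t a)
{
    unsigned n = 0;

    for (uint8_t mask = 0x80; mask != 0 && !(a & mask); mask >>= 1) {
        n++;
    }
    return n;
}
```

The bit-scanning loop in clz8_lane is a deliberately simple substitute for
the clz32(a) - 24 form used by the helper; both give 8 for a zero lane.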

Signed-off-by: LIU Zhiwei
Acked-by: Alistair Francis
---
 target/riscv/helper.h                   |  12 +++
 target/riscv/insn32.decode              |  12 +++
 target/riscv/insn_trans/trans_rvp.c.inc |  13 +++
 target/riscv/packed_helper.c            | 115 ++++++++++++++++++++++++
 4 files changed, 152 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 88035aafad..240df8b766 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1243,3 +1243,15 @@ DEF_HELPER_2(kabs16, tl, env, tl)
 DEF_HELPER_2(clrs16, tl, env, tl)
 DEF_HELPER_2(clz16, tl, env, tl)
 DEF_HELPER_2(clo16, tl, env, tl)
+
+DEF_HELPER_3(smin8, tl, env, tl, tl)
+DEF_HELPER_3(umin8, tl, env, tl, tl)
+DEF_HELPER_3(smax8, tl, env, tl, tl)
+DEF_HELPER_3(umax8, tl, env, tl, tl)
+DEF_HELPER_3(sclip8, tl, env, tl, tl)
+DEF_HELPER_3(uclip8, tl, env, tl, tl)
+DEF_HELPER_2(kabs8, tl, env, tl)
+DEF_HELPER_2(clrs8, tl, env, tl)
+DEF_HELPER_2(clz8, tl, env, tl)
+DEF_HELPER_2(clo8, tl, env, tl)
+DEF_HELPER_2(swap8, tl, env, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 847c796874..4c34f0f4f4 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -846,3 +846,15 @@ kabs16 1010110 10001 ..... 000 ..... 1110111 @r2
 clrs16     1010111 01000 ..... 000 ..... 1110111 @r2
 clz16      1010111 01001 ..... 000 ..... 1110111 @r2
 clo16      1010111 01011 ..... 000 ..... 1110111 @r2
+
+smin8      1000100 ..... ..... 000 ..... 1110111 @r
+umin8      1001100 ..... ..... 000 ..... 1110111 @r
+smax8      1000101 ..... ..... 000 ..... 1110111 @r
+umax8      1001101 ..... ..... 000 ..... 1110111 @r
+sclip8     1000110 00... ..... 000 ..... 1110111 @sh3
+uclip8     1000110 10... ..... 000 ..... 1110111 @sh3
+kabs8      1010110 10000 ..... 000 ..... 1110111 @r2
+clrs8      1010111 00000 ..... 000 ..... 1110111 @r2
+clz8       1010111 00001 ..... 000 ..... 1110111 @r2
+clo8       1010111 00011 ..... 000 ..... 1110111 @r2
+swap8      1010110 11000 ..... 000 ..... 1110111 @r2
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index 3e6307cdc3..c5ec530fd7 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -335,3 +335,16 @@ GEN_RVP_R2_OOL(kabs16);
 GEN_RVP_R2_OOL(clrs16);
 GEN_RVP_R2_OOL(clz16);
 GEN_RVP_R2_OOL(clo16);
+
+/* SIMD 8-bit Miscellaneous Instructions */
+GEN_RVP_R_OOL(smin8);
+GEN_RVP_R_OOL(umin8);
+GEN_RVP_R_OOL(smax8);
+GEN_RVP_R_OOL(umax8);
+GEN_RVP_SHIFTI(sclip8, NULL, gen_helper_sclip8);
+GEN_RVP_SHIFTI(uclip8, NULL, gen_helper_uclip8);
+GEN_RVP_R2_OOL(kabs8);
+GEN_RVP_R2_OOL(clrs8);
+GEN_RVP_R2_OOL(clz8);
+GEN_RVP_R2_OOL(clo8);
+GEN_RVP_R2_OOL(swap8);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index e4a9463135..3d3d2bf3e4 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -1078,3 +1078,118 @@ static inline void do_clo16(CPURISCVState *env, void *vd, void *va, uint8_t i)
 }
 
 RVPR2(clo16, 1, 2);
+
+/* SIMD 8-bit Miscellaneous Instructions */
+static inline void do_smin8(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    int8_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] < b[i]) ? a[i] : b[i];
+}
+
+RVPR(smin8, 1, 1);
+
+static inline void do_umin8(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    uint8_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] < b[i]) ? a[i] : b[i];
+}
+
+RVPR(umin8, 1, 1);
+
+static inline void do_smax8(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    int8_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] > b[i]) ? a[i] : b[i];
+}
+
+RVPR(smax8, 1, 1);
+
+static inline void do_umax8(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    uint8_t *d = vd, *a = va, *b = vb;
+
+    d[i] = (a[i] > b[i]) ?
+        a[i] : b[i];
+}
+
+RVPR(umax8, 1, 1);
+
+static inline void do_sclip8(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+    uint8_t shift = *(uint8_t *)vb & 0x7;
+
+    d[i] = sat64(env, a[i], shift);
+}
+
+RVPR(sclip8, 1, 1);
+
+static inline void do_uclip8(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+    uint8_t shift = *(uint8_t *)vb & 0x7;
+
+    if (a[i] < 0) {
+        d[i] = 0;
+        env->vxsat = 0x1;
+    } else {
+        d[i] = satu64(env, a[i], shift);
+    }
+}
+
+RVPR(uclip8, 1, 1);
+
+static inline void do_kabs8(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+
+    if (a[i] == INT8_MIN) {
+        d[i] = INT8_MAX;
+        env->vxsat = 0x1;
+    } else {
+        d[i] = abs(a[i]);
+    }
+}
+
+RVPR2(kabs8, 1, 1);
+
+static inline void do_clrs8(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+    d[i] = clrsb32(a[i]) - 24;
+}
+
+RVPR2(clrs8, 1, 1);
+
+static inline void do_clz8(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+    d[i] = (a[i] < 0) ? 0 : (clz32(a[i]) - 24);
+}
+
+RVPR2(clz8, 1, 1);
+
+static inline void do_clo8(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+    d[i] = (a[i] >= 0) ?
+        0 : (clo32(a[i]) - 24);
+}
+
+RVPR2(clo8, 1, 1);
+
+static inline void do_swap8(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *d = vd, *a = va;
+    d[H1(i)] = a[H1(i + 1)];
+    d[H1(i + 1)] = a[H1(i)];
+}
+
+RVPR2(swap8, 2, 1);

From patchwork Thu Jun 10 07:58:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490240
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Subject: [PATCH v2 13/37] target/riscv: 8-bit Unpacking Instructions
Date: Thu, 10 Jun 2021 15:58:44 +0800
Message-Id: <20210610075908.3305506-14-zhiwei_liu@c-sky.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis, LIU Zhiwei

Sign-extend or zero-extend selected 8-bit elements to 16-bit elements.
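
For illustration only (not part of the patch): a minimal standalone sketch
of the unpacking on one 32-bit word. In the {s,z}unpkd8xy naming, byte x
goes to the upper 16-bit half and byte y to the lower half. The functions
sunpkd8, zunpkd8 and sext8_to_u16 are hypothetical and take x and y as
run-time arguments, whereas the patch uses one helper per (x, y) pair.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend an 8-bit value (held in the low byte) to a 16-bit pattern. */
static uint32_t sext8_to_u16(uint32_t byte)
{
    return (byte & 0x80) ? (0xff00u | byte) : byte;
}

/* Sign-extending unpack: byte x -> bits 31:16, byte y -> bits 15:0. */
static uint32_t sunpkd8(uint32_t w, unsigned x, unsigned y)
{
    uint32_t hi = sext8_to_u16((w >> (8 * x)) & 0xff);
    uint32_t lo = sext8_to_u16((w >> (8 * y)) & 0xff);

    return (hi << 16) | lo;
}

/* Zero-extending unpack: byte x -> bits 31:16, byte y -> bits 15:0. */
static uint32_t zunpkd8(uint32_t w, unsigned x, unsigned y)
{
    uint32_t hi = (w >> (8 * x)) & 0xff;
    uint32_t lo = (w >> (8 * y)) & 0xff;

    return (hi << 16) | lo;
}
```

For the word 0x80FF7F01, sunpkd8(w, 3, 1) selects bytes 0x80 and 0x7F and
gives 0xFF80007F, while zunpkd8(w, 3, 1) gives 0x0080007F.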

Signed-off-by: LIU Zhiwei
Acked-by: Alistair Francis
---
 target/riscv/helper.h                   |  11 +++
 target/riscv/insn32.decode              |  11 +++
 target/riscv/insn_trans/trans_rvp.c.inc |  12 +++
 target/riscv/packed_helper.c            | 121 ++++++++++++++++++++++++
 4 files changed, 155 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 240df8b766..9fd2a70f7d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1255,3 +1255,14 @@ DEF_HELPER_2(clrs8, tl, env, tl)
 DEF_HELPER_2(clz8, tl, env, tl)
 DEF_HELPER_2(clo8, tl, env, tl)
 DEF_HELPER_2(swap8, tl, env, tl)
+
+DEF_HELPER_2(sunpkd810, tl, env, tl)
+DEF_HELPER_2(sunpkd820, tl, env, tl)
+DEF_HELPER_2(sunpkd830, tl, env, tl)
+DEF_HELPER_2(sunpkd831, tl, env, tl)
+DEF_HELPER_2(sunpkd832, tl, env, tl)
+DEF_HELPER_2(zunpkd810, tl, env, tl)
+DEF_HELPER_2(zunpkd820, tl, env, tl)
+DEF_HELPER_2(zunpkd830, tl, env, tl)
+DEF_HELPER_2(zunpkd831, tl, env, tl)
+DEF_HELPER_2(zunpkd832, tl, env, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 4c34f0f4f4..9b8ea0f9ab 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -858,3 +858,14 @@ clrs8 1010111 00000 ..... 000 ..... 1110111 @r2
 clz8       1010111 00001 ..... 000 ..... 1110111 @r2
 clo8       1010111 00011 ..... 000 ..... 1110111 @r2
 swap8      1010110 11000 ..... 000 ..... 1110111 @r2
+
+sunpkd810  1010110 01000 ..... 000 ..... 1110111 @r2
+sunpkd820  1010110 01001 ..... 000 ..... 1110111 @r2
+sunpkd830  1010110 01010 ..... 000 ..... 1110111 @r2
+sunpkd831  1010110 01011 ..... 000 ..... 1110111 @r2
+sunpkd832  1010110 10011 ..... 000 ..... 1110111 @r2
+zunpkd810  1010110 01100 ..... 000 ..... 1110111 @r2
+zunpkd820  1010110 01101 ..... 000 ..... 1110111 @r2
+zunpkd830  1010110 01110 ..... 000 ..... 1110111 @r2
+zunpkd831  1010110 01111 ..... 000 ..... 1110111 @r2
+zunpkd832  1010110 10111 ..... 000 ..... 1110111 @r2
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index c5ec530fd7..5af2c7c2cc 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -348,3 +348,15 @@ GEN_RVP_R2_OOL(clrs8);
 GEN_RVP_R2_OOL(clz8);
 GEN_RVP_R2_OOL(clo8);
 GEN_RVP_R2_OOL(swap8);
+
+/* 8-bit Unpacking Instructions */
+GEN_RVP_R2_OOL(sunpkd810);
+GEN_RVP_R2_OOL(sunpkd820);
+GEN_RVP_R2_OOL(sunpkd830);
+GEN_RVP_R2_OOL(sunpkd831);
+GEN_RVP_R2_OOL(sunpkd832);
+GEN_RVP_R2_OOL(zunpkd810);
+GEN_RVP_R2_OOL(zunpkd820);
+GEN_RVP_R2_OOL(zunpkd830);
+GEN_RVP_R2_OOL(zunpkd831);
+GEN_RVP_R2_OOL(zunpkd832);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 3d3d2bf3e4..8226dbd079 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -1193,3 +1193,124 @@ static inline void do_swap8(CPURISCVState *env, void *vd, void *va, uint8_t i)
 }
 
 RVPR2(swap8, 2, 1);
+
+/* 8-bit Unpacking Instructions */
+static inline void
+do_sunpkd810(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *a = va;
+    int16_t *d = vd;
+
+    d[H2(i / 2)] = a[H1(i)];
+    d[H2(i / 2 + 1)] = a[H1(i + 1)];
+}
+
+RVPR2(sunpkd810, 4, 1);
+
+static inline void
+do_sunpkd820(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *a = va;
+    int16_t *d = vd;
+
+    d[H2(i / 2)] = a[H1(i)];
+    d[H2(i / 2 + 1)] = a[H1(i + 2)];
+}
+
+RVPR2(sunpkd820, 4, 1);
+
+static inline void
+do_sunpkd830(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *a = va;
+    int16_t *d = vd;
+
+    d[H2(i / 2)] = a[H1(i)];
+    d[H2(i / 2 + 1)] = a[H1(i + 3)];
+}
+
+RVPR2(sunpkd830, 4, 1);
+
+static inline void
+do_sunpkd831(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *a = va;
+    int16_t *d = vd;
+
+    d[H2(i / 2)] = a[H1(i + 1)];
+    d[H2(i / 2 + 1)] = a[H1(i + 3)];
+}
+
+RVPR2(sunpkd831, 4, 1);
+
+static inline void
+do_sunpkd832(CPURISCVState *env, void *vd, void *va, uint8_t i)
+{
+    int8_t *a = va;
+ int16_t *d = vd; + + d[H2(i / 2)] = a[H1(i + 2)]; + d[H2(i / 2 + 1)] = a[H1(i + 3)]; +} + +RVPR2(sunpkd832, 4, 1); + +static inline void +do_zunpkd810(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + uint8_t *a = va; + uint16_t *d = vd; + + d[H2(i / 2)] = a[H1(i)]; + d[H2(i / 2 + 1)] = a[H1(i + 1)]; +} + +RVPR2(zunpkd810, 4, 1); + +static inline void +do_zunpkd820(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + uint8_t *a = va; + uint16_t *d = vd; + + d[H2(i / 2)] = a[H1(i)]; + d[H2(i / 2 + 1)] = a[H1(i + 2)]; +} + +RVPR2(zunpkd820, 4, 1); + +static inline void +do_zunpkd830(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + uint8_t *a = va; + uint16_t *d = vd; + + d[H2(i / 2)] = a[H1(i)]; + d[H2(i / 2 + 1)] = a[H1(i + 3)]; +} + +RVPR2(zunpkd830, 4, 1); + +static inline void +do_zunpkd831(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + uint8_t *a = va; + uint16_t *d = vd; + + d[H2(i / 2)] = a[H1(i + 1)]; + d[H2(i / 2 + 1)] = a[H1(i + 3)]; +} + +RVPR2(zunpkd831, 4, 1); + +static inline void +do_zunpkd832(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + uint8_t *a = va; + uint16_t *d = vd; + + d[H2(i / 2)] = a[H1(i + 2)]; + d[H2(i / 2 + 1)] = a[H1(i + 3)]; +} + +RVPR2(zunpkd832, 4, 1); From patchwork Thu Jun 10 07:58:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490248 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xfX0VyNz9sPf for ; Thu, 10 
Jun 2021 18:17:24 +1000 (AEST) Received: from localhost ([::1]:47584 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFsQ-0005rH-1Q for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:17:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41718) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFhv-0007fe-5j; Thu, 10 Jun 2021 04:06:31 -0400 Received: from out28-195.mail.aliyun.com ([115.124.28.195]:50098) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFhr-0001UQ-HW; Thu, 10 Jun 2021 04:06:30 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.076982|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.471962-0.0122629-0.515775; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047207; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=8; RT=7; SR=0; TI=SMTPD_---.KQMe8-x_1623312381; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMe8-x_1623312381) by smtp.aliyun-inc.com(10.147.41.138); Thu, 10 Jun 2021 16:06:21 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 14/37] target/riscv: 16-bit Packing Instructions Date: Thu, 10 Jun 2021 15:58:45 +0800 Message-Id: <20210610075908.3305506-15-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.195; envelope-from=zhiwei_liu@c-sky.com; helo=out28-195.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 
Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Concat 16-bit elements from source register to 32-bit element in destination register. Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis --- target/riscv/helper.h | 5 +++ target/riscv/insn32.decode | 5 +++ target/riscv/insn_trans/trans_rvp.c.inc | 9 +++++ target/riscv/packed_helper.c | 45 +++++++++++++++++++++++++ 4 files changed, 64 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 9fd2a70f7d..9872f5efbd 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1266,3 +1266,8 @@ DEF_HELPER_2(zunpkd820, tl, env, tl) DEF_HELPER_2(zunpkd830, tl, env, tl) DEF_HELPER_2(zunpkd831, tl, env, tl) DEF_HELPER_2(zunpkd832, tl, env, tl) + +DEF_HELPER_3(pkbb16, tl, env, tl, tl) +DEF_HELPER_3(pkbt16, tl, env, tl, tl) +DEF_HELPER_3(pktt16, tl, env, tl, tl) +DEF_HELPER_3(pktb16, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 9b8ea0f9ab..0b6830c76e 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -869,3 +869,8 @@ zunpkd820 1010110 01101 ..... 000 ..... 1110111 @r2 zunpkd830 1010110 01110 ..... 000 ..... 1110111 @r2 zunpkd831 1010110 01111 ..... 000 ..... 1110111 @r2 zunpkd832 1010110 10111 ..... 000 ..... 1110111 @r2 + +pkbb16 0000111 ..... ..... 001 ..... 1110111 @r +pkbt16 0001111 ..... ..... 001 ..... 1110111 @r +pktt16 0010111 ..... ..... 001 ..... 1110111 @r +pktb16 0011111 ..... ..... 001 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 5af2c7c2cc..b5bd8b1406 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -360,3 +360,12 @@ GEN_RVP_R2_OOL(zunpkd820); GEN_RVP_R2_OOL(zunpkd830); GEN_RVP_R2_OOL(zunpkd831); GEN_RVP_R2_OOL(zunpkd832); + +/* + *** Partial-SIMD Data Processing Instruction + */ +/* 16-bit Packing Instructions */ +GEN_RVP_R_OOL(pkbb16); +GEN_RVP_R_OOL(pkbt16); +GEN_RVP_R_OOL(pktt16); +GEN_RVP_R_OOL(pktb16); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 8226dbd079..f6cea654b2 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -1314,3 +1314,48 @@ do_zunpkd832(CPURISCVState *env, void *vd, void *va, uint8_t i) } RVPR2(zunpkd832, 4, 1); + +/* + *** Partial-SIMD Data Processing Instructions + */ + +/* 16-bit Packing Instructions */ +static inline void do_pkbb16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i + 1)] = a[H2(i)]; + d[H2(i)] = b[H2(i)]; +} + +RVPR(pkbb16, 2, 2); + +static inline void do_pkbt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i + 1)] = a[H2(i)]; + d[H2(i)] = b[H2(i + 1)]; +} + +RVPR(pkbt16, 2, 2); + +static inline void do_pktt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i + 1)] = a[H2(i + 1)]; + d[H2(i)] = b[H2(i + 1)]; +} + +RVPR(pktt16, 2, 2); + +static inline void do_pktb16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i + 1)] = a[H2(i + 1)]; + d[H2(i)] = b[H2(i)]; +} + +RVPR(pktb16, 2, 2); From patchwork Thu Jun 10 07:58:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490245 
Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xZK0knXz9s1l for ; Thu, 10 Jun 2021 18:13:45 +1000 (AEST) Received: from localhost ([::1]:38940 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFos-0008MM-Vo for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:13:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41814) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFiT-0008Sp-TV; Thu, 10 Jun 2021 04:07:10 -0400 Received: from out28-145.mail.aliyun.com ([115.124.28.145]:33821) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFiO-0001nV-2s; Thu, 10 Jun 2021 04:07:05 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436317|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.620553-0.00353106-0.375916; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047207; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMZbrT_1623312412; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMZbrT_1623312412) by smtp.aliyun-inc.com(10.147.42.197); Thu, 10 Jun 2021 16:06:52 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 15/37] target/riscv: Signed MSW 32x32 Multiply and Add Instructions Date: Thu, 10 Jun 2021 15:58:46 +0800 Message-Id: <20210610075908.3305506-16-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: 
<20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.145; envelope-from=zhiwei_liu@c-sky.com; helo=out28-145.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" These instructions always perform a 32x32 multiplication. The most significant word of the product is used either as the result or as an operand for an add or subtract operation with rounding or saturation. 
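As a rough illustration only (not part of the patch), the "most significant word" computation can be sketched in plain C. The function names msw_mul and msw_mul_round are hypothetical and chosen for this note; the sketch also assumes that right-shifting a negative signed value is an arithmetic shift, as it is with the compilers QEMU supports.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Truncating form: keep the upper 32 bits of the signed 64-bit product. */
int32_t msw_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 32);
}

/* Rounding form: add 2^31 (half of the discarded low word) before shifting. */
int32_t msw_mul_round(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b + (INT64_C(1) << 31)) >> 32);
}
```

For example, 0x40000000 * 2 equals 2^31, so the truncating form yields 0 while the rounding form yields 1.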
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 9 ++ target/riscv/insn32.decode | 9 ++ target/riscv/insn_trans/trans_rvp.c.inc | 44 ++++++++++ target/riscv/packed_helper.c | 109 ++++++++++++++++++++++++ 4 files changed, 171 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 9872f5efbd..600e8dee44 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1271,3 +1271,12 @@ DEF_HELPER_3(pkbb16, tl, env, tl, tl) DEF_HELPER_3(pkbt16, tl, env, tl, tl) DEF_HELPER_3(pktt16, tl, env, tl, tl) DEF_HELPER_3(pktb16, tl, env, tl, tl) + +DEF_HELPER_3(smmul, tl, env, tl, tl) +DEF_HELPER_3(smmul_u, tl, env, tl, tl) +DEF_HELPER_4(kmmac, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmac_u, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmsb, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmsb_u, tl, env, tl, tl, tl) +DEF_HELPER_3(kwmmul, tl, env, tl, tl) +DEF_HELPER_3(kwmmul_u, tl, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 0b6830c76e..0484de140b 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -874,3 +874,12 @@ pkbb16 0000111 ..... ..... 001 ..... 1110111 @r pkbt16 0001111 ..... ..... 001 ..... 1110111 @r pktt16 0010111 ..... ..... 001 ..... 1110111 @r pktb16 0011111 ..... ..... 001 ..... 1110111 @r + +smmul 0100000 ..... ..... 001 ..... 1110111 @r +smmul_u 0101000 ..... ..... 001 ..... 1110111 @r +kmmac 0110000 ..... ..... 001 ..... 1110111 @r +kmmac_u 0111000 ..... ..... 001 ..... 1110111 @r +kmmsb 0100001 ..... ..... 001 ..... 1110111 @r +kmmsb_u 0101001 ..... ..... 001 ..... 1110111 @r +kwmmul 0110001 ..... ..... 001 ..... 1110111 @r +kwmmul_u 0111001 ..... ..... 001 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index b5bd8b1406..073558b950 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -369,3 +369,47 @@ GEN_RVP_R_OOL(pkbb16); GEN_RVP_R_OOL(pkbt16); GEN_RVP_R_OOL(pktt16); GEN_RVP_R_OOL(pktb16); + +/* Most Significant Word “32x32” Multiply & Add Instructions */ +GEN_RVP_R_OOL(smmul); +GEN_RVP_R_OOL(smmul_u); + +/* Function to accumulate destination register */ +static inline bool r_acc_ool(DisasContext *ctx, arg_r *a, + void (* fn)(TCGv, TCGv_ptr, TCGv, TCGv, TCGv)) +{ + TCGv src1, src2, src3, dst; + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new(); + src2 = tcg_temp_new(); + src3 = tcg_temp_new(); + dst = tcg_temp_new(); + + gen_get_gpr(src1, a->rs1); + gen_get_gpr(src2, a->rs2); + gen_get_gpr(src3, a->rd); + fn(dst, cpu_env, src1, src2, src3); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(src2); + tcg_temp_free(src3); + tcg_temp_free(dst); + return true; +} + +#define GEN_RVP_R_ACC_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + return r_acc_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP_R_ACC_OOL(kmmac); +GEN_RVP_R_ACC_OOL(kmmac_u); +GEN_RVP_R_ACC_OOL(kmmsb); +GEN_RVP_R_ACC_OOL(kmmsb_u); +GEN_RVP_R_OOL(kwmmul); +GEN_RVP_R_OOL(kwmmul_u); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index f6cea654b2..465cb5a3b3 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -1359,3 +1359,112 @@ static inline void do_pktb16(CPURISCVState *env, void *vd, void *va, } RVPR(pktb16, 2, 2); + +/* Most Significant Word “32x32” Multiply & Add Instructions */ +static inline void do_smmul(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[i] = (int64_t)a[i] * b[i] >> 32; +} + +RVPR(smmul, 1, 4); + +static inline void do_smmul_u(CPURISCVState *env, void *vd, 
void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[i] = ((int64_t)a[i] * b[i] + (uint32_t)INT32_MIN) >> 32; +} + +RVPR(smmul_u, 1, 4); + +typedef void PackedFn4i(CPURISCVState *, void *, void *, + void *, void *, uint8_t); + +static inline target_ulong +rvpr_acc(CPURISCVState *env, target_ulong a, + target_ulong b, target_ulong c, + uint8_t step, uint8_t size, PackedFn4i *fn) +{ + int i, passes = sizeof(target_ulong) / size; + target_ulong result = 0; + + for (i = 0; i < passes; i += step) { + fn(env, &result, &a, &b, &c, i); + } + return result; +} + +#define RVPR_ACC(NAME, STEP, SIZE) \ +target_ulong HELPER(NAME)(CPURISCVState *env, target_ulong a, \ + target_ulong b, target_ulong c) \ +{ \ + return rvpr_acc(env, a, b, c, STEP, SIZE, (PackedFn4i *)do_##NAME);\ +} + +static inline void do_kmmac(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb, *c = vc; + d[i] = sadd32(env, 0, ((int64_t)a[i] * b[i]) >> 32, c[i]); +} + +RVPR_ACC(kmmac, 1, 4); + +static inline void do_kmmac_u(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb, *c = vc; + d[i] = sadd32(env, 0, ((int64_t)a[i] * b[i] + + (uint32_t)INT32_MIN) >> 32, c[i]); +} + +RVPR_ACC(kmmac_u, 1, 4); + +static inline void do_kmmsb(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb, *c = vc; + d[i] = ssub32(env, 0, c[i], (int64_t)a[i] * b[i] >> 32); +} + +RVPR_ACC(kmmsb, 1, 4); + +static inline void do_kmmsb_u(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb, *c = vc; + d[i] = ssub32(env, 0, c[i], ((int64_t)a[i] * b[i] + + (uint32_t)INT32_MIN) >> 32); +} + +RVPR_ACC(kmmsb_u, 1, 4); + +static inline void do_kwmmul(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + if (a[i] == INT32_MIN && b[i] == 
INT32_MIN) { + env->vxsat = 0x1; + d[i] = INT32_MAX; + } else { + d[i] = (int64_t)a[i] * b[i] >> 31; + } +} + +RVPR(kwmmul, 1, 4); + +static inline void do_kwmmul_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + if (a[i] == INT32_MIN && b[i] == INT32_MIN) { + env->vxsat = 0x1; + d[i] = INT32_MAX; + } else { + d[i] = ((int64_t)a[i] * b[i] + (1ull << 30)) >> 31; + } +} + +RVPR(kwmmul_u, 1, 4); From patchwork Thu Jun 10 07:58:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490253 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xjr28sxz9sPf for ; Thu, 10 Jun 2021 18:20:16 +1000 (AEST) Received: from localhost ([::1]:57192 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFvC-0003s6-9j for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:20:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41914) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFix-0000Ed-A0; Thu, 10 Jun 2021 04:07:37 -0400 Received: from out28-124.mail.aliyun.com ([115.124.28.124]:52926) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFir-0001zl-Ag; Thu, 10 Jun 2021 04:07:34 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436302|-1; CH=green; DM=|CONTINUE|false|; 
DS=CONTINUE|ham_system_inform|0.552192-0.00466643-0.443141; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047213; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=8; RT=7; SR=0; TI=SMTPD_---.KQMljOg_1623312442; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMljOg_1623312442) by smtp.aliyun-inc.com(10.147.44.145); Thu, 10 Jun 2021 16:07:23 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 16/37] target/riscv: Signed MSW 32x16 Multiply and Add Instructions Date: Thu, 10 Jun 2021 15:58:47 +0800 Message-Id: <20210610075908.3305506-17-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.124; envelope-from=zhiwei_liu@c-sky.com; helo=out28-124.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" These instructions always perform a 32x16 multiplication. The most significant word of the product is used either as the result or as an operand for an add or subtract operation with rounding or saturation. 
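As a rough illustration only (not part of the patch), the 32x16 case can be sketched in plain C. The names msw_mul_32x16 and msw_mul_32x16_x2 are hypothetical, the 16-bit operand is passed directly rather than selected from the bottom or top half of a register, and a bool stands in for the overflow flag. Arithmetic right shift of negative values is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Plain form: bits [47:16] of the 48-bit signed product. */
int32_t msw_mul_32x16(int32_t a, int16_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}

/*
 * Doubling form: the product is effectively multiplied by 2 (shift by 15
 * instead of 16). Only INT32_MIN * INT16_MIN overflows 32 bits, so that
 * single case saturates and raises the flag.
 */
int32_t msw_mul_32x16_x2(int32_t a, int16_t b, bool *sat)
{
    if (a == INT32_MIN && b == INT16_MIN) {
        *sat = true;
        return INT32_MAX;
    }
    return (int32_t)(((int64_t)a * b) >> 15);
}
```

With a = 0x10000 and b = 3 the plain form gives 3 and the doubling form gives 6; INT32_MIN with INT16_MIN would produce 2^31, which does not fit, hence the saturation to INT32_MAX.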
Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis --- target/riscv/helper.h | 17 ++ target/riscv/insn32.decode | 17 ++ target/riscv/insn_trans/trans_rvp.c.inc | 18 ++ target/riscv/packed_helper.c | 208 ++++++++++++++++++++++++ 4 files changed, 260 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 600e8dee44..854f48d385 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1280,3 +1280,20 @@ DEF_HELPER_4(kmmsb, tl, env, tl, tl, tl) DEF_HELPER_4(kmmsb_u, tl, env, tl, tl, tl) DEF_HELPER_3(kwmmul, tl, env, tl, tl) DEF_HELPER_3(kwmmul_u, tl, env, tl, tl) + +DEF_HELPER_3(smmwb, tl, env, tl, tl) +DEF_HELPER_3(smmwb_u, tl, env, tl, tl) +DEF_HELPER_3(smmwt, tl, env, tl, tl) +DEF_HELPER_3(smmwt_u, tl, env, tl, tl) +DEF_HELPER_4(kmmawb, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmawb_u, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmawt, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmawt_u, tl, env, tl, tl, tl) +DEF_HELPER_3(kmmwb2, tl, env, tl, tl) +DEF_HELPER_3(kmmwb2_u, tl, env, tl, tl) +DEF_HELPER_3(kmmwt2, tl, env, tl, tl) +DEF_HELPER_3(kmmwt2_u, tl, env, tl, tl) +DEF_HELPER_4(kmmawb2, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmawb2_u, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmawt2, tl, env, tl, tl, tl) +DEF_HELPER_4(kmmawt2_u, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 0484de140b..e5a8f663dc 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -883,3 +883,20 @@ kmmsb 0100001 ..... ..... 001 ..... 1110111 @r kmmsb_u 0101001 ..... ..... 001 ..... 1110111 @r kwmmul 0110001 ..... ..... 001 ..... 1110111 @r kwmmul_u 0111001 ..... ..... 001 ..... 1110111 @r + +smmwb 0100010 ..... ..... 001 ..... 1110111 @r +smmwb_u 0101010 ..... ..... 001 ..... 1110111 @r +smmwt 0110010 ..... ..... 001 ..... 1110111 @r +smmwt_u 0111010 ..... ..... 001 ..... 1110111 @r +kmmawb 0100011 ..... ..... 001 ..... 1110111 @r +kmmawb_u 0101011 ..... ..... 001 ..... 1110111 @r +kmmawt 0110011 ..... ..... 001 ..... 
1110111 @r +kmmawt_u 0111011 ..... ..... 001 ..... 1110111 @r +kmmwb2 1000111 ..... ..... 001 ..... 1110111 @r +kmmwb2_u 1001111 ..... ..... 001 ..... 1110111 @r +kmmwt2 1010111 ..... ..... 001 ..... 1110111 @r +kmmwt2_u 1011111 ..... ..... 001 ..... 1110111 @r +kmmawb2 1100111 ..... ..... 001 ..... 1110111 @r +kmmawb2_u 1101111 ..... ..... 001 ..... 1110111 @r +kmmawt2 1110111 ..... ..... 001 ..... 1110111 @r +kmmawt2_u 1111111 ..... ..... 001 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 073558b950..af490a5ef0 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -413,3 +413,21 @@ GEN_RVP_R_ACC_OOL(kmmsb); GEN_RVP_R_ACC_OOL(kmmsb_u); GEN_RVP_R_OOL(kwmmul); GEN_RVP_R_OOL(kwmmul_u); + +/* Most Significant Word “32x16” Multiply & Add Instructions */ +GEN_RVP_R_OOL(smmwb); +GEN_RVP_R_OOL(smmwb_u); +GEN_RVP_R_OOL(smmwt); +GEN_RVP_R_OOL(smmwt_u); +GEN_RVP_R_ACC_OOL(kmmawb); +GEN_RVP_R_ACC_OOL(kmmawb_u); +GEN_RVP_R_ACC_OOL(kmmawt); +GEN_RVP_R_ACC_OOL(kmmawt_u); +GEN_RVP_R_OOL(kmmwb2); +GEN_RVP_R_OOL(kmmwb2_u); +GEN_RVP_R_OOL(kmmwt2); +GEN_RVP_R_OOL(kmmwt2_u); +GEN_RVP_R_ACC_OOL(kmmawb2); +GEN_RVP_R_ACC_OOL(kmmawb2_u); +GEN_RVP_R_ACC_OOL(kmmawt2); +GEN_RVP_R_ACC_OOL(kmmawt2_u); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 465cb5a3b3..868a1a71ba 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -1468,3 +1468,211 @@ static inline void do_kwmmul_u(CPURISCVState *env, void *vd, void *va, } RVPR(kwmmul_u, 1, 4); + +/* Most Significant Word “32x16” Multiply & Add Instructions */ +static inline void do_smmwb(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + d[H4(i)] = (int64_t)a[H4(i)] * b[H2(2 * i)] >> 16; +} + +RVPR(smmwb, 1, 4); + +static inline void do_smmwb_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + 
int32_t *d = vd, *a = va; + int16_t *b = vb; + d[H4(i)] = ((int64_t)a[H4(i)] * b[H2(2 * i)] + (1ull << 15)) >> 16; +} + +RVPR(smmwb_u, 1, 4); + +static inline void do_smmwt(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + d[H4(i)] = (int64_t)a[H4(i)] * b[H2(2 * i + 1)] >> 16; +} + +RVPR(smmwt, 1, 4); + +static inline void do_smmwt_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + d[H4(i)] = ((int64_t)a[H4(i)] * b[H2(2 * i + 1)] + (1ull << 15)) >> 16; +} + +RVPR(smmwt_u, 1, 4); + +static inline void do_kmmawb(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc; + int16_t *b = vb; + d[H4(i)] = sadd32(env, 0, (int64_t)a[H4(i)] * b[H2(2 * i)] >> 16, c[H4(i)]); +} + +RVPR_ACC(kmmawb, 1, 4); + +static inline void do_kmmawb_u(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc; + int16_t *b = vb; + d[H4(i)] = sadd32(env, 0, ((int64_t)a[H4(i)] * b[H2(2 * i)] + + (1ull << 15)) >> 16, c[H4(i)]); +} + +RVPR_ACC(kmmawb_u, 1, 4); + +static inline void do_kmmawt(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc; + int16_t *b = vb; + d[H4(i)] = sadd32(env, 0, (int64_t)a[H4(i)] * b[H2(2 * i + 1)] >> 16, + c[H4(i)]); +} + +RVPR_ACC(kmmawt, 1, 4); + +static inline void do_kmmawt_u(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc; + int16_t *b = vb; + d[H4(i)] = sadd32(env, 0, ((int64_t)a[H4(i)] * b[H2(2 * i + 1)] + + (1ull << 15)) >> 16, c[H4(i)]); +} + +RVPR_ACC(kmmawt_u, 1, 4); + +static inline void do_kmmwb2(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i)] == INT16_MIN) { + env->vxsat = 0x1; + d[H4(i)] 
= INT32_MAX; + } else { + d[H4(i)] = (int64_t)a[H4(i)] * b[H2(2 * i)] >> 15; + } +} + +RVPR(kmmwb2, 1, 4); + +static inline void do_kmmwb2_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i)] == INT16_MIN) { + env->vxsat = 0x1; + d[H4(i)] = INT32_MAX; + } else { + d[H4(i)] = ((int64_t)a[H4(i)] * b[H2(2 * i)] + (1ull << 14)) >> 15; + } +} + +RVPR(kmmwb2_u, 1, 4); + +static inline void do_kmmwt2(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + env->vxsat = 0x1; + d[H4(i)] = INT32_MAX; + } else { + d[H4(i)] = (int64_t)a[H4(i)] * b[H2(2 * i + 1)] >> 15; + } +} + +RVPR(kmmwt2, 1, 4); + +static inline void do_kmmwt2_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + env->vxsat = 0x1; + d[H4(i)] = INT32_MAX; + } else { + d[H4(i)] = ((int64_t)a[H4(i)] * b[H2(2 * i + 1)] + (1ull << 14)) >> 15; + } +} + +RVPR(kmmwt2_u, 1, 4); + +static inline void do_kmmawb2(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc, result; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i)] == INT16_MIN) { + env->vxsat = 0x1; + result = INT32_MAX; + } else { + result = (int64_t)a[H4(i)] * b[H2(2 * i)] >> 15; + } + d[H4(i)] = sadd32(env, 0, result, c[H4(i)]); +} + +RVPR_ACC(kmmawb2, 1, 4); + +static inline void do_kmmawb2_u(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc, result; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i)] == INT16_MIN) { + env->vxsat = 0x1; + result = INT32_MAX; + } else { + result = ((int64_t)a[H4(i)] * b[H2(2 * i)] + (1ull << 14)) >> 15; + } + d[H4(i)] = sadd32(env, 
0, result, c[H4(i)]); +} + +RVPR_ACC(kmmawb2_u, 1, 4); + +static inline void do_kmmawt2(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc, result; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + env->vxsat = 0x1; + result = INT32_MAX; + } else { + result = (int64_t)a[H4(i)] * b[H2(2 * i + 1)] >> 15; + } + d[H4(i)] = sadd32(env, 0, result, c[H4(i)]); +} + +RVPR_ACC(kmmawt2, 1, 4); + +static inline void do_kmmawt2_u(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *a = va, *c = vc, result; + int16_t *b = vb; + if (a[H4(i)] == INT32_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + env->vxsat = 0x1; + result = INT32_MAX; + } else { + result = ((int64_t)a[H4(i)] * b[H2(2 * i + 1)] + (1ull << 14)) >> 15; + } + d[H4(i)] = sadd32(env, 0, result, c[H4(i)]); +} + +RVPR_ACC(kmmawt2_u, 1, 4); From patchwork Thu Jun 10 07:58:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490270 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xmZ3JByz9sRN for ; Thu, 10 Jun 2021 18:22:38 +1000 (AEST) Received: from localhost ([::1]:37598 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFxU-0001LS-EH for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:22:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42136) by lists.gnu.org 
with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFjQ-0000Xy-Bd; Thu, 10 Jun 2021 04:08:05 -0400 Received: from out28-124.mail.aliyun.com ([115.124.28.124]:57990) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFjM-0002FC-3g; Thu, 10 Jun 2021 04:08:04 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07609424|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.547673-7.80372e-05-0.452249; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047213; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQN4uzq_1623312473; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQN4uzq_1623312473) by smtp.aliyun-inc.com(10.147.40.2); Thu, 10 Jun 2021 16:07:53 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 17/37] target/riscv: Signed 16-bit Multiply 32-bit Add/Subtract Instructions Date: Thu, 10 Jun 2021 15:58:48 +0800 Message-Id: <20210610075908.3305506-18-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.124; envelope-from=zhiwei_liu@c-sky.com; helo=out28-124.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org 
Sender: "Qemu-devel" These instructions always contain a signed 16x16 multiply; the 32-bit result can be written to the destination register or used as an operand for an add/subtract operation. Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 19 ++ target/riscv/insn32.decode | 19 ++ target/riscv/insn_trans/trans_rvp.c.inc | 20 ++ target/riscv/packed_helper.c | 268 ++++++++++++++++++++++++ 4 files changed, 326 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 854f48d385..5aac6ba578 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1297,3 +1297,22 @@ DEF_HELPER_4(kmmawb2, tl, env, tl, tl, tl) DEF_HELPER_4(kmmawb2_u, tl, env, tl, tl, tl) DEF_HELPER_4(kmmawt2, tl, env, tl, tl, tl) DEF_HELPER_4(kmmawt2_u, tl, env, tl, tl, tl) + +DEF_HELPER_3(smbb16, tl, env, tl, tl) +DEF_HELPER_3(smbt16, tl, env, tl, tl) +DEF_HELPER_3(smtt16, tl, env, tl, tl) +DEF_HELPER_3(kmda, tl, env, tl, tl) +DEF_HELPER_3(kmxda, tl, env, tl, tl) +DEF_HELPER_3(smds, tl, env, tl, tl) +DEF_HELPER_3(smdrs, tl, env, tl, tl) +DEF_HELPER_3(smxds, tl, env, tl, tl) +DEF_HELPER_4(kmabb, tl, env, tl, tl, tl) +DEF_HELPER_4(kmabt, tl, env, tl, tl, tl) +DEF_HELPER_4(kmatt, tl, env, tl, tl, tl) +DEF_HELPER_4(kmada, tl, env, tl, tl, tl) +DEF_HELPER_4(kmaxda, tl, env, tl, tl, tl) +DEF_HELPER_4(kmads, tl, env, tl, tl, tl) +DEF_HELPER_4(kmadrs, tl, env, tl, tl, tl) +DEF_HELPER_4(kmaxds, tl, env, tl, tl, tl) +DEF_HELPER_4(kmsda, tl, env, tl, tl, tl) +DEF_HELPER_4(kmsxda, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index e5a8f663dc..f590880750 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -900,3 +900,22 @@ kmmawb2 1100111 ..... ..... 001 ..... 1110111 @r kmmawb2_u 1101111 ..... ..... 001 ..... 1110111 @r kmmawt2 1110111 ..... ..... 001 ..... 1110111 @r kmmawt2_u 1111111 ..... ..... 001 ..... 1110111 @r + +smbb16 0000100 ..... ..... 001 ..... 1110111 @r +smbt16 0001100 ..... ..... 001 .....
1110111 @r +smtt16 0010100 ..... ..... 001 ..... 1110111 @r +kmda 0011100 ..... ..... 001 ..... 1110111 @r +kmxda 0011101 ..... ..... 001 ..... 1110111 @r +smds 0101100 ..... ..... 001 ..... 1110111 @r +smdrs 0110100 ..... ..... 001 ..... 1110111 @r +smxds 0111100 ..... ..... 001 ..... 1110111 @r +kmabb 0101101 ..... ..... 001 ..... 1110111 @r +kmabt 0110101 ..... ..... 001 ..... 1110111 @r +kmatt 0111101 ..... ..... 001 ..... 1110111 @r +kmada 0100100 ..... ..... 001 ..... 1110111 @r +kmaxda 0100101 ..... ..... 001 ..... 1110111 @r +kmads 0101110 ..... ..... 001 ..... 1110111 @r +kmadrs 0110110 ..... ..... 001 ..... 1110111 @r +kmaxds 0111110 ..... ..... 001 ..... 1110111 @r +kmsda 0100110 ..... ..... 001 ..... 1110111 @r +kmsxda 0100111 ..... ..... 001 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index af490a5ef0..308fc223db 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -431,3 +431,23 @@ GEN_RVP_R_ACC_OOL(kmmawb2); GEN_RVP_R_ACC_OOL(kmmawb2_u); GEN_RVP_R_ACC_OOL(kmmawt2); GEN_RVP_R_ACC_OOL(kmmawt2_u); + +/* Signed 16-bit Multiply with 32-bit Add/Subtract Instructions */ +GEN_RVP_R_OOL(smbb16); +GEN_RVP_R_OOL(smbt16); +GEN_RVP_R_OOL(smtt16); +GEN_RVP_R_OOL(kmda); +GEN_RVP_R_OOL(kmxda); +GEN_RVP_R_OOL(smds); +GEN_RVP_R_OOL(smdrs); +GEN_RVP_R_OOL(smxds); +GEN_RVP_R_ACC_OOL(kmabb); +GEN_RVP_R_ACC_OOL(kmabt); +GEN_RVP_R_ACC_OOL(kmatt); +GEN_RVP_R_ACC_OOL(kmada); +GEN_RVP_R_ACC_OOL(kmaxda); +GEN_RVP_R_ACC_OOL(kmads); +GEN_RVP_R_ACC_OOL(kmadrs); +GEN_RVP_R_ACC_OOL(kmaxds); +GEN_RVP_R_ACC_OOL(kmsda); +GEN_RVP_R_ACC_OOL(kmsxda); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 868a1a71ba..88509fd118 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -1676,3 +1676,271 @@ static inline void do_kmmawt2_u(CPURISCVState *env, void *vd, void *va, } RVPR_ACC(kmmawt2_u, 1, 4); + +/* Signed 16-bit 
Multiply with 32-bit Add/Subtract Instructions */ +static inline void do_smbb16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(i)] = (int32_t)a[H2(2 * i)] * b[H2(2 * i)]; +} + +RVPR(smbb16, 1, 4); + +static inline void do_smbt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(i)] = (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)]; +} + +RVPR(smbt16, 1, 4); + +static inline void do_smtt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(i)] = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; +} + +RVPR(smtt16, 1, 4); + +static inline void do_kmda(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + if (a[H2(2 * i)] == INT16_MIN && a[H2(2 * i + 1)] == INT16_MIN && + b[H2(2 * i)] == INT16_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + d[H4(i)] = INT32_MAX; + env->vxsat = 0x1; + } else { + d[H4(i)] = (int32_t)a[H2(2 * i)] * b[H2(2 * i)] + + (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; + } +} + +RVPR(kmda, 1, 4); + +static inline void do_kmxda(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + if (a[H2(2 * i)] == INT16_MIN && a[H2(2 * i + 1)] == INT16_MIN && + b[H2(2 * i)] == INT16_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + d[H4(i)] = INT32_MAX; + env->vxsat = 0x1; + } else { + d[H4(i)] = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i)] + + (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)]; + } +} + +RVPR(kmxda, 1, 4); + +static inline void do_smds(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(i)] = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)] - + (int32_t)a[H2(2 * i)] * b[H2(2 * i)]; +} + +RVPR(smds, 1, 4); + +static inline void do_smdrs(CPURISCVState *env, void *vd, void *va, + void *vb,
uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(i)] = (int32_t)a[H2(2 * i)] * b[H2(2 * i)] - + (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; +} + +RVPR(smdrs, 1, 4); + +static inline void do_smxds(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + d[H4(i)] = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i)] - + (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)]; +} + +RVPR(smxds, 1, 4); + +static inline void do_kmabb(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + d[H4(i)] = sadd32(env, 0, (int32_t)a[H2(2 * i)] * b[H2(2 * i)], c[H4(i)]); +} + +RVPR_ACC(kmabb, 1, 4); + +static inline void do_kmabt(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + d[H4(i)] = sadd32(env, 0, (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)], + c[H4(i)]); +} + +RVPR_ACC(kmabt, 1, 4); + +static inline void do_kmatt(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + d[H4(i)] = sadd32(env, 0, (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)], + c[H4(i)]); +} + +RVPR_ACC(kmatt, 1, 4); + +static inline void do_kmada(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i)] * b[H2(2 * i)]; + p2 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; + + if (a[H2(2 * i)] == INT16_MIN && a[H2(2 * i + 1)] == INT16_MIN && + b[H2(2 * i)] == INT16_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + if (c[H4(i)] < 0) { + d[H4(i)] = INT32_MAX + c[H4(i)] + 1ll; + } else { + env->vxsat = 0x1; + d[H4(i)] = INT32_MAX; + } + } else { + d[H4(i)] = sadd32(env, 0, p1 + p2, c[H4(i)]); + } +} + +RVPR_ACC(kmada, 1, 4); + +static inline void do_kmaxda(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ +
int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i)]; + p2 = (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)]; + + if (a[H2(2 * i)] == INT16_MIN && a[H2(2 * i + 1)] == INT16_MIN && + b[H2(2 * i)] == INT16_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + if (c[H4(i)] < 0) { + d[H4(i)] = INT32_MAX + c[H4(i)] + 1ll; + } else { + env->vxsat = 0x1; + d[H4(i)] = INT32_MAX; + } + } else { + d[H4(i)] = sadd32(env, 0, p1 + p2, c[H4(i)]); + } +} + +RVPR_ACC(kmaxda, 1, 4); + +static inline void do_kmads(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; + p2 = (int32_t)a[H2(2 * i)] * b[H2(2 * i)]; + + d[H4(i)] = sadd32(env, 0, p1 - p2, c[H4(i)]); +} + +RVPR_ACC(kmads, 1, 4); + +static inline void do_kmadrs(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i)] * b[H2(2 * i)]; + p2 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; + + d[H4(i)] = sadd32(env, 0, p1 - p2, c[H4(i)]); +} + +RVPR_ACC(kmadrs, 1, 4); + +static inline void do_kmaxds(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i)]; + p2 = (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)]; + + d[H4(i)] = sadd32(env, 0, p1 - p2, c[H4(i)]); +} + +RVPR_ACC(kmaxds, 1, 4); + +static inline void do_kmsda(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i)] * b[H2(2 * i)]; + p2 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i + 1)]; + + if (a[H2(2 * i)] == INT16_MIN && a[H2(2 * i + 1)] == INT16_MIN && + b[H2(2 * i)] == INT16_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + if
(c[H4(i)] < 0) { + env->vxsat = 0x1; + d[H4(i)] = INT32_MIN; + } else { + d[H4(i)] = c[H4(i)] - 1ll - INT32_MAX; + } + } else { + d[H4(i)] = ssub32(env, 0, c[H4(i)], p1 + p2); + } +} + +RVPR_ACC(kmsda, 1, 4); + +static inline void do_kmsxda(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int32_t *d = vd, *c = vc; + int16_t *a = va, *b = vb; + int32_t p1, p2; + p1 = (int32_t)a[H2(2 * i)] * b[H2(2 * i + 1)]; + p2 = (int32_t)a[H2(2 * i + 1)] * b[H2(2 * i)]; + + if (a[H2(2 * i)] == INT16_MIN && a[H2(2 * i + 1)] == INT16_MIN && + b[H2(2 * i)] == INT16_MIN && b[H2(2 * i + 1)] == INT16_MIN) { + if (c[H4(i)] < 0) { + env->vxsat = 0x1; + d[H4(i)] = INT32_MIN; + } else { + d[H4(i)] = c[H4(i)] - 1ll - INT32_MAX; + } + } else { + d[H4(i)] = ssub32(env, 0, c[H4(i)], p1 + p2); + } +} + +RVPR_ACC(kmsxda, 1, 4); From patchwork Thu Jun 10 07:58:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490251 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xjX4b9Zz9sPf for ; Thu, 10 Jun 2021 18:20:00 +1000 (AEST) Received: from localhost ([::1]:56444 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFuw-0003Mb-Bi for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:19:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42258) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFjx-0001OG-DU;
Thu, 10 Jun 2021 04:08:39 -0400 Received: from out28-5.mail.aliyun.com ([115.124.28.5]:33162) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFjo-0002Y6-Lk; Thu, 10 Jun 2021 04:08:37 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07646764|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.797667-0.00574743-0.196586; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047201; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQN4vXq_1623312503; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQN4vXq_1623312503) by smtp.aliyun-inc.com(10.147.40.2); Thu, 10 Jun 2021 16:08:24 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 18/37] target/riscv: Signed 16-bit Multiply 64-bit Add/Subtract Instructions Date: Thu, 10 Jun 2021 15:58:49 +0800 Message-Id: <20210610075908.3305506-19-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.5; envelope-from=zhiwei_liu@c-sky.com; helo=out28-5.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" "16x16" with 64-bit Signed Addition(64 = 64 + 16x16). 
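For illustration only, the RV32 form of the "64 = 64 + 16x16" operation can be modelled in plain C as below. This is a sketch, not the helper added by this patch: the name smal_ref32 is hypothetical, and it assumes the two multiplicands are the low and high signed halfwords of the second source operand and that the 64-bit addition wraps without saturation.

```c
#include <stdint.h>

/*
 * Hypothetical reference model (not the QEMU helper): add the product of
 * the two signed 16-bit halves of rs2 to a 64-bit accumulator.
 * Assumption: the addition wraps modulo 2^64, no saturation.
 */
static int64_t smal_ref32(int64_t acc, uint32_t rs2)
{
    /* Sign-extend each halfword by hand to avoid implementation-defined casts. */
    int32_t lo = (int32_t)(rs2 & 0xffff);
    int32_t hi = (int32_t)(rs2 >> 16);
    if (lo >= 0x8000) {
        lo -= 0x10000;
    }
    if (hi >= 0x8000) {
        hi -= 0x10000;
    }
    int64_t prod = (int64_t)lo * hi;
    /* Do the addition on unsigned values so that wrap-around is well defined. */
    return (int64_t)((uint64_t)acc + (uint64_t)prod);
}
```

A model like this may be handy as an oracle when testing the helper against random operands.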
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 2 + target/riscv/insn32.decode | 2 + target/riscv/insn_trans/trans_rvp.c.inc | 51 +++++++++++++++++++++++++ target/riscv/packed_helper.c | 25 ++++++++++++ 4 files changed, 80 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 5aac6ba578..a37b023c53 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1316,3 +1316,5 @@ DEF_HELPER_4(kmadrs, tl, env, tl, tl, tl) DEF_HELPER_4(kmaxds, tl, env, tl, tl, tl) DEF_HELPER_4(kmsda, tl, env, tl, tl, tl) DEF_HELPER_4(kmsxda, tl, env, tl, tl, tl) + +DEF_HELPER_3(smal, i64, env, i64, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index f590880750..233df941b4 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -919,3 +919,5 @@ kmadrs 0110110 ..... ..... 001 ..... 1110111 @r kmaxds 0111110 ..... ..... 001 ..... 1110111 @r kmsda 0100110 ..... ..... 001 ..... 1110111 @r kmsxda 0100111 ..... ..... 001 ..... 1110111 @r + +smal 0101111 ..... ..... 001 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 308fc223db..8b0728fc5a 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -451,3 +451,54 @@ GEN_RVP_R_ACC_OOL(kmadrs); GEN_RVP_R_ACC_OOL(kmaxds); GEN_RVP_R_ACC_OOL(kmsda); GEN_RVP_R_ACC_OOL(kmsxda); + +/* Signed 16-bit Multiply with 64-bit Add/Subtract Instructions */ +static bool +r_d64_s64_ool(DisasContext *ctx, arg_r *a, + void (* fn)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv)) +{ + TCGv src2; + TCGv_i64 src1, dst; + + if (!has_ext(ctx, RVP) || !ctx->ext_psfoperand) { + return false; + } + + src1 = tcg_temp_new_i64(); + src2 = tcg_temp_new(); + dst = tcg_temp_new_i64(); + + if (is_32bit(ctx)) { + TCGv t0, t1; + t0 = tcg_temp_new(); + t1 = tcg_temp_new(); + gen_get_gpr(t0, a->rs1); + gen_get_gpr(t1, a->rs1 + 1); + tcg_gen_concat_tl_i64(src1, t0, t1); + tcg_temp_free(t0); + tcg_temp_free(t1); + } else { + TCGv t0; + t0 = tcg_temp_new(); + gen_get_gpr(t0, a->rs1); + tcg_gen_ext_tl_i64(src1, t0); + tcg_temp_free(t0); + } + + gen_get_gpr(src2, a->rs2); + fn(dst, cpu_env, src1, src2); + set_pair_regs(ctx, dst, a->rd); + + tcg_temp_free_i64(src1); + tcg_temp_free_i64(dst); + tcg_temp_free(src2); + return true; +} + +#define GEN_RVP_R_D64_S64_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + return r_d64_s64_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP_R_D64_S64_OOL(smal); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 88509fd118..1f9a5d620f 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -1944,3 +1944,28 @@ static inline void do_kmsxda(CPURISCVState *env, void *vd, void *va, } RVPR_ACC(kmsxda, 1, 4); + +/* Signed 16-bit Multiply with 64-bit Add/Subtract Instructions */ +static inline void do_smal(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd, *a = va; + int16_t *b = vb; + + if (i == 0) { + 
*d = *a; + } + + *d += b[H2(i)] * b[H2(i + 1)]; +} + +uint64_t helper_smal(CPURISCVState *env, uint64_t a, target_ulong b) +{ + int i; + int64_t result = 0; + + for (i = 0; i < sizeof(target_ulong) / 2; i += 2) { + do_smal(env, &result, &a, &b, i); + } + return result; +} From patchwork Thu Jun 10 07:58:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490269 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xmX4D48z9sRK for ; Thu, 10 Jun 2021 18:22:36 +1000 (AEST) Received: from localhost ([::1]:37450 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFxS-0001EL-Hu for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:22:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42370) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFkT-0001nt-Ft; Thu, 10 Jun 2021 04:09:09 -0400 Received: from out28-77.mail.aliyun.com ([115.124.28.77]:48623) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFkJ-0002n0-Ul; Thu, 10 Jun 2021 04:09:09 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07441651|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.733612-0.00738679-0.259001; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047190; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=8; RT=7; SR=0; TI=SMTPD_---.KQMll4Z_1623312534; Received: from 
localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMll4Z_1623312534) by smtp.aliyun-inc.com(10.147.44.145); Thu, 10 Jun 2021 16:08:54 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 19/37] target/riscv: Partial-SIMD Miscellaneous Instructions Date: Thu, 10 Jun 2021 15:58:50 +0800 Message-Id: <20210610075908.3305506-20-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.77; envelope-from=zhiwei_liu@c-sky.com; helo=out28-77.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" 32-bit signed or unsigned clip value. 32-bit leading redundant sign, leading zero, leading one count. Parallel byte sum of absolute difference or parallel byte sum of absolute difference accumulation. 
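As a rough illustration of two of the operations listed above, a plain C model of the signed clip and of the byte-wise sum of absolute differences might look as follows. The names sclip32_ref and pbsad_ref32 are hypothetical, and the clip range [-2^n, 2^n - 1] is an assumption about the intended semantics, not something taken from the helpers in this patch.

```c
#include <stdint.h>

/*
 * Hypothetical reference model: clip x into [-2^n, 2^n - 1] for 0 <= n <= 31
 * and set *sat when the value had to be clipped (assumed semantics).
 */
static int32_t sclip32_ref(int32_t x, unsigned n, int *sat)
{
    int64_t max = ((int64_t)1 << n) - 1;
    int64_t min = -((int64_t)1 << n);

    if (x > max) {
        *sat = 1;
        return (int32_t)max;
    }
    if (x < min) {
        *sat = 1;
        return (int32_t)min;
    }
    return x;
}

/* Sum of absolute differences over the four unsigned bytes of two words. */
static uint32_t pbsad_ref32(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;

    for (int i = 0; i < 4; i++) {
        int da = (int)((a >> (8 * i)) & 0xff);
        int db = (int)((b >> (8 * i)) & 0xff);
        sum += (uint32_t)(da > db ? da - db : db - da);
    }
    return sum;
}
```

The accumulating variant would simply add the previous destination value to the returned sum.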
Signed-off-by: LIU Zhiwei Acked-by: Alistair Francis --- target/riscv/helper.h | 8 +++ target/riscv/insn32.decode | 8 +++ target/riscv/insn_trans/trans_rvp.c.inc | 9 +++ target/riscv/packed_helper.c | 75 +++++++++++++++++++++++++ 4 files changed, 100 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index a37b023c53..35c8c61b00 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1318,3 +1318,11 @@ DEF_HELPER_4(kmsda, tl, env, tl, tl, tl) DEF_HELPER_4(kmsxda, tl, env, tl, tl, tl) DEF_HELPER_3(smal, i64, env, i64, tl) + +DEF_HELPER_3(sclip32, tl, env, tl, tl) +DEF_HELPER_3(uclip32, tl, env, tl, tl) +DEF_HELPER_2(clrs32, tl, env, tl) +DEF_HELPER_2(clz32, tl, env, tl) +DEF_HELPER_2(clo32, tl, env, tl) +DEF_HELPER_3(pbsad, tl, env, tl, tl) +DEF_HELPER_4(pbsada, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 233df941b4..ce8bdee34b 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -921,3 +921,11 @@ kmsda 0100110 ..... ..... 001 ..... 1110111 @r kmsxda 0100111 ..... ..... 001 ..... 1110111 @r smal 0101111 ..... ..... 001 ..... 1110111 @r + +sclip32 1110010 ..... ..... 000 ..... 1110111 @sh5 +uclip32 1111010 ..... ..... 000 ..... 1110111 @sh5 +clrs32 1010111 11000 ..... 000 ..... 1110111 @r2 +clz32 1010111 11001 ..... 000 ..... 1110111 @r2 +clo32 1010111 11011 ..... 000 ..... 1110111 @r2 +pbsad 1111110 ..... ..... 000 ..... 1110111 @r +pbsada 1111111 ..... ..... 000 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 8b0728fc5a..43e7e5a75d 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -502,3 +502,12 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ } GEN_RVP_R_D64_S64_OOL(smal); + +/* Partial-SIMD Miscellaneous Instructions */ +GEN_RVP_SHIFTI(sclip32, NULL, gen_helper_sclip32); +GEN_RVP_SHIFTI(uclip32, NULL, gen_helper_uclip32); +GEN_RVP_R2_OOL(clrs32); +GEN_RVP_R2_OOL(clz32); +GEN_RVP_R2_OOL(clo32); +GEN_RVP_R_OOL(pbsad); +GEN_RVP_R_ACC_OOL(pbsada); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 1f9a5d620f..1f2b90c394 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -1969,3 +1969,78 @@ uint64_t helper_smal(CPURISCVState *env, uint64_t a, target_ulong b) } return result; } + +/* Partial-SIMD Miscellaneous Instructions */ +static inline void do_sclip32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + + d[i] = sat64(env, a[i], shift); +} + +RVPR(sclip32, 1, 4); + +static inline void do_uclip32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + + if (a[i] < 0) { + d[i] = 0; + env->vxsat = 0x1; + } else { + d[i] = satu64(env, a[i], shift); + } +} + +RVPR(uclip32, 1, 4); + +static inline void do_clrs32(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + int32_t *d = vd, *a = va; + d[i] = clrsb32(a[i]); +} + +RVPR2(clrs32, 1, 4); + +static inline void do_clz32(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + int32_t *d = vd, *a = va; + d[i] = clz32(a[i]); +} + +RVPR2(clz32, 1, 4); + +static inline void do_clo32(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + int32_t *d = vd, *a = va; + d[i] = clo32(a[i]); +} + +RVPR2(clo32, 1, 4); + +static inline void 
do_pbsad(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_ulong *d = vd; + uint8_t *a = va, *b = vb; + *d += abs(a[i] - b[i]); +} + +RVPR(pbsad, 1, 1); + +static inline void do_pbsada(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + target_ulong *d = vd, *c = vc; + uint8_t *a = va, *b = vb; + if (i == 0) { + *d += *c; + } + *d += abs(a[i] - b[i]); +} + +RVPR_ACC(pbsada, 1, 1); From patchwork Thu Jun 10 07:58:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490249 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xfk4SDxz9sPf for ; Thu, 10 Jun 2021 18:17:34 +1000 (AEST) Received: from localhost ([::1]:48140 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFsa-0006FI-H1 for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:17:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42522) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFkr-0002uQ-5D; Thu, 10 Jun 2021 04:09:33 -0400 Received: from out28-77.mail.aliyun.com ([115.124.28.77]:38064) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFkn-00032Y-Di; Thu, 10 Jun 2021 04:09:32 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.1146224|-1; CH=green; DM=|CONTINUE|false|; 
DS=CONTINUE|ham_system_inform|0.533591-0.0146714-0.451738; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047192; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMfKN9_1623312564; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMfKN9_1623312564) by smtp.aliyun-inc.com(10.147.43.230); Thu, 10 Jun 2021 16:09:24 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 20/37] target/riscv: 8-bit Multiply with 32-bit Add Instructions Date: Thu, 10 Jun 2021 15:58:51 +0800 Message-Id: <20210610075908.3305506-21-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.77; envelope-from=zhiwei_liu@c-sky.com; helo=out28-77.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Four "signed or unsigned 8 x signed or unsigned 8" with 32-bit addition (32 = 32 + 8x8 + 8x8 + 8x8 + 8x8). 
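To make the "32 = 32 + 8x8 + 8x8 + 8x8 + 8x8" shape concrete, here is a small C sketch of the all-signed case for a single 32-bit element. The name smaqa_ref32 is hypothetical, and the sketch assumes the accumulation wraps modulo 2^32 without saturation; it is not the helper from this patch.

```c
#include <stdint.h>

/* Sign-extend the i-th byte (0 = least significant) of a 32-bit word. */
static int32_t sbyte(uint32_t w, int i)
{
    int32_t v = (int32_t)((w >> (8 * i)) & 0xff);
    return v >= 0x80 ? v - 0x100 : v;
}

/*
 * Hypothetical reference model: acc + sum of four signed 8x8 products.
 * Assumption: the result wraps modulo 2^32 (no saturation).
 */
static int32_t smaqa_ref32(int32_t acc, uint32_t rs1, uint32_t rs2)
{
    uint32_t sum = (uint32_t)acc;

    for (int i = 0; i < 4; i++) {
        sum += (uint32_t)(sbyte(rs1, i) * sbyte(rs2, i));
    }
    return (int32_t)sum;
}
```

The unsigned and signed-by-unsigned variants would differ only in how each byte is extended before the multiply.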
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 4 +++ target/riscv/insn32.decode | 4 +++ target/riscv/insn_trans/trans_rvp.c.inc | 5 +++ target/riscv/packed_helper.c | 44 +++++++++++++++++++++++++ 4 files changed, 57 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 35c8c61b00..a0e3131512 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1326,3 +1326,7 @@ DEF_HELPER_2(clz32, tl, env, tl) DEF_HELPER_2(clo32, tl, env, tl) DEF_HELPER_3(pbsad, tl, env, tl, tl) DEF_HELPER_4(pbsada, tl, env, tl, tl, tl) + +DEF_HELPER_4(smaqa, tl, env, tl, tl, tl) +DEF_HELPER_4(umaqa, tl, env, tl, tl, tl) +DEF_HELPER_4(smaqa_su, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index ce8bdee34b..96288370a6 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -929,3 +929,7 @@ clz32 1010111 11001 ..... 000 ..... 1110111 @r2 clo32 1010111 11011 ..... 000 ..... 1110111 @r2 pbsad 1111110 ..... ..... 000 ..... 1110111 @r pbsada 1111111 ..... ..... 000 ..... 1110111 @r + +smaqa 1100100 ..... ..... 000 ..... 1110111 @r +umaqa 1100110 ..... ..... 000 ..... 1110111 @r +smaqa_su 1100101 ..... ..... 000 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 43e7e5a75d..1a10f13318 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -511,3 +511,8 @@ GEN_RVP_R2_OOL(clz32); GEN_RVP_R2_OOL(clo32); GEN_RVP_R_OOL(pbsad); GEN_RVP_R_ACC_OOL(pbsada); + +/* 8-bit Multiply with 32-bit Add Instructions */ +GEN_RVP_R_ACC_OOL(smaqa); +GEN_RVP_R_ACC_OOL(umaqa); +GEN_RVP_R_ACC_OOL(smaqa_su); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 1f2b90c394..02178d6e61 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -2044,3 +2044,47 @@ static inline void do_pbsada(CPURISCVState *env, void *vd, void *va, } RVPR_ACC(pbsada, 1, 1); + +/* 8-bit Multiply with 32-bit Add Instructions */ +static inline void do_smaqa(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int8_t *a = va, *b = vb; + int32_t *d = vd, *c = vc; + + d[H4(i)] = c[H4(i)] + a[H1(i * 4)] * b[H1(i * 4)] + + a[H1(i * 4 + 1)] * b[H1(i * 4 + 1)] + + a[H1(i * 4 + 2)] * b[H1(i * 4 + 2)] + + a[H1(i * 4 + 3)] * b[H1(i * 4 + 3)]; +} + +RVPR_ACC(smaqa, 1, 4); + +static inline void do_umaqa(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + uint8_t *a = va, *b = vb; + uint32_t *d = vd, *c = vc; + + d[H4(i)] = c[H4(i)] + a[H1(i * 4)] * b[H1(i * 4)] + + a[H1(i * 4 + 1)] * b[H1(i * 4 + 1)] + + a[H1(i * 4 + 2)] * b[H1(i * 4 + 2)] + + a[H1(i * 4 + 3)] * b[H1(i * 4 + 3)]; +} + +RVPR_ACC(umaqa, 1, 4); + +static inline void do_smaqa_su(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int8_t *a = va; + uint8_t *b = vb; + int32_t *d = vd, *c = vc; + + d[H4(i)] = c[H4(i)] + a[H1(i * 4)] * b[H1(i * 4)] + + a[H1(i * 4 + 1)] * b[H1(i * 4 + 1)] + + a[H1(i * 4 + 2)] * b[H1(i * 4 + 2)] + + a[H1(i * 4 + 3)] * b[H1(i * 4 + 3)]; +} + +RVPR_ACC(smaqa_su, 1, 4); From patchwork Thu Jun 10 07:58:52 2021 
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490280
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei
Subject: [PATCH v2 21/37] target/riscv: 64-bit Add/Subtract Instructions
Date: Thu, 10 Jun 2021 15:58:52 +0800
Message-Id: <20210610075908.3305506-22-zhiwei_liu@c-sky.com>
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>

Add the 64-bit add/subtract instructions in plain, halving, and
saturating variants.
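
The halving and saturating behaviours can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not the code in this patch: the function names are hypothetical, the saturation flag is a plain `bool` standing in for the `vxsat` state, and the halving add uses the carry-free identity `(a >> 1) + (b >> 1) + (a & b & 1)` rather than the overflow-bit fix-up used by `hadd64()` in the diff. It also assumes that `>>` on a negative `int64_t` is an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed halving add: floor((a + b) / 2) without overflowing int64_t. */
static int64_t halving_add_s64(int64_t a, int64_t b)
{
    /* Halve each operand, then add back the carry of the two low bits. */
    return (a >> 1) + (b >> 1) + (a & b & 1);
}

/* Unsigned halving add: (a + b) / 2 with the carry-out kept as bit 63. */
static uint64_t halving_add_u64(uint64_t a, uint64_t b)
{
    return (a >> 1) + (b >> 1) + (a & b & 1);
}

/* Signed saturating add: clamp to [INT64_MIN, INT64_MAX], flag on clamp. */
static int64_t saturating_add_s64(int64_t a, int64_t b, bool *sat)
{
    if (b > 0 && a > INT64_MAX - b) {
        *sat = true;
        return INT64_MAX;
    }
    if (b < 0 && a < INT64_MIN - b) {
        *sat = true;
        return INT64_MIN;
    }
    return a + b;
}

/* Unsigned saturating add: clamp to UINT64_MAX, flag on clamp. */
static uint64_t saturating_add_u64(uint64_t a, uint64_t b, bool *sat)
{
    uint64_t res = a + b;   /* wraps modulo 2^64 */

    if (res < a) {          /* wrapped, so the true sum did not fit */
        *sat = true;
        return UINT64_MAX;
    }
    return res;
}
```

The halving forms never form the full 65-bit sum, which is why they need no overflow correction; the saturating forms test for overflow before (signed) or after (unsigned) the addition.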
Signed-off-by: LIU Zhiwei
---
 target/riscv/helper.h                   |  11 ++
 target/riscv/insn32.decode              |  11 ++
 target/riscv/insn_trans/trans_rvp.c.inc |  74 +++++++++++++
 target/riscv/packed_helper.c            | 132 ++++++++++++++++++++++++
 4 files changed, 228 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index a0e3131512..192ef42d2a 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1330,3 +1330,14 @@ DEF_HELPER_4(pbsada, tl, env, tl, tl, tl)
 DEF_HELPER_4(smaqa, tl, env, tl, tl, tl)
 DEF_HELPER_4(umaqa, tl, env, tl, tl, tl)
 DEF_HELPER_4(smaqa_su, tl, env, tl, tl, tl)
+
+DEF_HELPER_3(add64, i64, env, i64, i64)
+DEF_HELPER_3(radd64, i64, env, i64, i64)
+DEF_HELPER_3(uradd64, i64, env, i64, i64)
+DEF_HELPER_3(kadd64, i64, env, i64, i64)
+DEF_HELPER_3(ukadd64, i64, env, i64, i64)
+DEF_HELPER_3(sub64, i64, env, i64, i64)
+DEF_HELPER_3(rsub64, i64, env, i64, i64)
+DEF_HELPER_3(ursub64, i64, env, i64, i64)
+DEF_HELPER_3(ksub64, i64, env, i64, i64)
+DEF_HELPER_3(uksub64, i64, env, i64, i64)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 96288370a6..5156fa060e 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -933,3 +933,14 @@ pbsada     1111111 ..... ..... 000 ..... 1110111 @r
 smaqa      1100100 ..... ..... 000 ..... 1110111 @r
 umaqa      1100110 ..... ..... 000 ..... 1110111 @r
 smaqa_su   1100101 ..... ..... 000 ..... 1110111 @r
+
+add64      1100000 ..... ..... 001 ..... 1110111 @r
+radd64     1000000 ..... ..... 001 ..... 1110111 @r
+uradd64    1010000 ..... ..... 001 ..... 1110111 @r
+kadd64     1001000 ..... ..... 001 ..... 1110111 @r
+ukadd64    1011000 ..... ..... 001 ..... 1110111 @r
+sub64      1100001 ..... ..... 001 ..... 1110111 @r
+rsub64     1000001 ..... ..... 001 ..... 1110111 @r
+ursub64    1010001 ..... ..... 001 ..... 1110111 @r
+ksub64     1001001 ..... ..... 001 ..... 1110111 @r
+uksub64    1011001 ..... ..... 001 ..... 1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index 1a10f13318..e04c79931d 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -516,3 +516,77 @@ GEN_RVP_R_ACC_OOL(pbsada);
 GEN_RVP_R_ACC_OOL(smaqa);
 GEN_RVP_R_ACC_OOL(umaqa);
 GEN_RVP_R_ACC_OOL(smaqa_su);
+
+/*
+ *** 64-bit Profile Instructions
+ */
+/* 64-bit Addition & Subtraction Instructions */
+static bool
+r_d64_s64_s64_ool(DisasContext *ctx, arg_r *a,
+                  void (* fn)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64))
+{
+    TCGv t1, t2;
+    TCGv_i64 src1, src2, dst;
+
+    if (!has_ext(ctx, RVP) || !ctx->ext_psfoperand) {
+        return false;
+    }
+
+    src1 = tcg_temp_new_i64();
+    src2 = tcg_temp_new_i64();
+    dst = tcg_temp_new_i64();
+
+    if (is_32bit(ctx)) {
+        TCGv a0, a1, b0, b1;
+        a0 = tcg_temp_new();
+        a1 = tcg_temp_new();
+        b0 = tcg_temp_new();
+        b1 = tcg_temp_new();
+
+        gen_get_gpr(a0, a->rs1);
+        gen_get_gpr(a1, a->rs1 + 1);
+        tcg_gen_concat_tl_i64(src1, a0, a1);
+        gen_get_gpr(b0, a->rs2);
+        gen_get_gpr(b1, a->rs2 + 1);
+        tcg_gen_concat_tl_i64(src2, b0, b1);
+
+        tcg_temp_free(a0);
+        tcg_temp_free(a1);
+        tcg_temp_free(b0);
+        tcg_temp_free(b1);
+    } else {
+        t1 = tcg_temp_new();
+        t2 = tcg_temp_new();
+        gen_get_gpr(t1, a->rs1);
+        tcg_gen_ext_tl_i64(src1, t1);
+        gen_get_gpr(t2, a->rs2);
+        tcg_gen_ext_tl_i64(src2, t2);
+        tcg_temp_free(t1);
+        tcg_temp_free(t2);
+    }
+
+    fn(dst, cpu_env, src1, src2);
+    set_pair_regs(ctx, dst, a->rd);
+
+    tcg_temp_free_i64(src1);
+    tcg_temp_free_i64(src2);
+    tcg_temp_free_i64(dst);
+    return true;
+}
+
+#define GEN_RVP_R_D64_S64_S64_OOL(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_r *a) \
+{ \
+    return r_d64_s64_s64_ool(s, a, gen_helper_##NAME); \
+}
+
+GEN_RVP_R_D64_S64_S64_OOL(add64);
+GEN_RVP_R_D64_S64_S64_OOL(radd64);
+GEN_RVP_R_D64_S64_S64_OOL(uradd64);
+GEN_RVP_R_D64_S64_S64_OOL(kadd64);
+GEN_RVP_R_D64_S64_S64_OOL(ukadd64);
+GEN_RVP_R_D64_S64_S64_OOL(sub64);
+GEN_RVP_R_D64_S64_S64_OOL(rsub64);
+GEN_RVP_R_D64_S64_S64_OOL(ursub64);
+GEN_RVP_R_D64_S64_S64_OOL(ksub64);
+GEN_RVP_R_D64_S64_S64_OOL(uksub64);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 02178d6e61..b8be234d97 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -2088,3 +2088,135 @@ static inline void do_smaqa_su(CPURISCVState *env, void *vd, void *va,
 }
 
 RVPR_ACC(smaqa_su, 1, 4);
+
+/*
+ *** 64-bit Profile Instructions
+ */
+/* 64-bit Addition & Subtraction Instructions */
+
+/* Define a common function to loop elements in packed register */
+static inline uint64_t
+rvpr64_64_64(CPURISCVState *env, uint64_t a, uint64_t b,
+             uint8_t step, uint8_t size, PackedFn3i *fn)
+{
+    int i, passes = sizeof(uint64_t) / size;
+    uint64_t result = 0;
+
+    for (i = 0; i < passes; i += step) {
+        fn(env, &result, &a, &b, i);
+    }
+    return result;
+}
+
+#define RVPR64_64_64(NAME, STEP, SIZE) \
+uint64_t HELPER(NAME)(CPURISCVState *env, uint64_t a, uint64_t b) \
+{ \
+    return rvpr64_64_64(env, a, b, STEP, SIZE, (PackedFn3i *)do_##NAME); \
+}
+
+static inline void do_add64(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    int64_t *d = vd, *a = va, *b = vb;
+    *d = *a + *b;
+}
+
+RVPR64_64_64(add64, 1, 8);
+
+static inline int64_t hadd64(int64_t a, int64_t b)
+{
+    int64_t res = a + b;
+    int64_t over = (res ^ a) & (res ^ b) & INT64_MIN;
+
+    /* With signed overflow, bit 64 is inverse of bit 63. */
+    return (res >> 1) ^ over;
+}
+
+static inline void do_radd64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int64_t *d = vd, *a = va, *b = vb;
+    *d = hadd64(*a, *b);
+}
+
+RVPR64_64_64(radd64, 1, 8);
+
+static inline uint64_t haddu64(uint64_t a, uint64_t b)
+{
+    uint64_t res = a + b;
+    bool over = res < a;
+
+    return over ? ((res >> 1) | INT64_MIN) : (res >> 1);
+}
+
+static inline void do_uradd64(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint64_t *d = vd, *a = va, *b = vb;
+    *d = haddu64(*a, *b);
+}
+
+RVPR64_64_64(uradd64, 1, 8);
+
+static inline void do_kadd64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int64_t *d = vd, *a = va, *b = vb;
+    *d = sadd64(env, 0, *a, *b);
+}
+
+RVPR64_64_64(kadd64, 1, 8);
+
+static inline void do_ukadd64(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint64_t *d = vd, *a = va, *b = vb;
+    *d = saddu64(env, 0, *a, *b);
+}
+
+RVPR64_64_64(ukadd64, 1, 8);
+
+static inline void do_sub64(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    int64_t *d = vd, *a = va, *b = vb;
+    *d = *a - *b;
+}
+
+RVPR64_64_64(sub64, 1, 8);
+
+static inline void do_rsub64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int64_t *d = vd, *a = va, *b = vb;
+    *d = hsub64(*a, *b);
+}
+
+RVPR64_64_64(rsub64, 1, 8);
+
+static inline void do_ursub64(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint64_t *d = vd, *a = va, *b = vb;
+    *d = hsubu64(*a, *b);
+}
+
+RVPR64_64_64(ursub64, 1, 8);
+
+static inline void do_ksub64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int64_t *d = vd, *a = va, *b = vb;
+    *d = ssub64(env, 0, *a, *b);
+}
+
+RVPR64_64_64(ksub64, 1, 8);
+
+static inline void do_uksub64(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint64_t *d = vd, *a = va, *b = vb;
+    *d = ssubu64(env, 0, *a, *b);
+}
+
+RVPR64_64_64(uksub64, 1, 8);

From patchwork Thu Jun 10 07:58:53 2021
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490252
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei
Subject: [PATCH v2 22/37] target/riscv: 32-bit Multiply 64-bit Add/Subtract Instructions
Date: Thu, 10 Jun 2021 15:58:53 +0800
Message-Id: <20210610075908.3305506-23-zhiwei_liu@c-sky.com>
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>

Add instructions that multiply 32-bit elements and add the 64-bit
products to, or subtract them from, a 64-bit accumulator, with or
without saturation.

Signed-off-by: LIU Zhiwei
---
 target/riscv/helper.h                   |   9 ++
 target/riscv/insn32.decode              |   9 ++
 target/riscv/insn_trans/trans_rvp.c.inc |  67 ++++++++++
 target/riscv/packed_helper.c            | 155 ++++++++++++++++++++++++
 4 files changed, 240 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 192ef42d2a..c3c086bed0 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1341,3 +1341,12 @@ DEF_HELPER_3(rsub64, i64, env, i64, i64)
 DEF_HELPER_3(ursub64, i64, env, i64, i64)
 DEF_HELPER_3(ksub64, i64, env, i64, i64)
 DEF_HELPER_3(uksub64, i64, env, i64, i64)
+
+DEF_HELPER_4(smar64, i64, env, tl, tl, i64)
+DEF_HELPER_4(smsr64, i64, env, tl, tl, i64)
+DEF_HELPER_4(umar64, i64, env, tl, tl, i64)
+DEF_HELPER_4(umsr64, i64, env, tl, tl, i64)
+DEF_HELPER_4(kmar64, i64, env, tl, tl, i64)
+DEF_HELPER_4(kmsr64, i64, env, tl, tl, i64)
+DEF_HELPER_4(ukmar64, i64, env, tl, tl, i64)
+DEF_HELPER_4(ukmsr64, i64, env, tl, tl, i64)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 5156fa060e..5d123bbb97 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -944,3 +944,12 @@ rsub64     1000001 ..... ..... 001 ..... 1110111 @r
 ursub64    1010001 ..... ..... 001 ..... 1110111 @r
 ksub64     1001001 ..... ..... 001 ..... 1110111 @r
 uksub64    1011001 ..... ..... 001 ..... 1110111 @r
+
+smar64     1000010 ..... ..... 001 ..... 1110111 @r
+smsr64     1000011 ..... ..... 001 ..... 1110111 @r
+umar64     1010010 ..... ..... 001 ..... 1110111 @r
+umsr64     1010011 ..... ..... 001 ..... 1110111 @r
+kmar64     1001010 ..... ..... 001 ..... 1110111 @r
+kmsr64     1001011 ..... ..... 001 ..... 1110111 @r
+ukmar64    1011010 ..... ..... 001 ..... 1110111 @r
+ukmsr64    1011011 ..... ..... 001 ..... 1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index e04c79931d..63b6810227 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -590,3 +590,70 @@ GEN_RVP_R_D64_S64_S64_OOL(rsub64);
 GEN_RVP_R_D64_S64_S64_OOL(ursub64);
 GEN_RVP_R_D64_S64_S64_OOL(ksub64);
 GEN_RVP_R_D64_S64_S64_OOL(uksub64);
+
+/* 32-bit Multiply with 64-bit Add/Subtract Instructions */
+
+/* Function to accumulate 64bit destination register */
+static bool
+r_d64_acc_ool(DisasContext *ctx, arg_r *a,
+              void (* fn)(TCGv_i64, TCGv_ptr, TCGv, TCGv, TCGv_i64))
+{
+    TCGv src1, src2;
+    TCGv_i64 dst, src3;
+
+    if (!has_ext(ctx, RVP) || !ctx->ext_psfoperand) {
+        return false;
+    }
+
+    src1 = tcg_temp_new();
+    src2 = tcg_temp_new();
+    src3 = tcg_temp_new_i64();
+    dst = tcg_temp_new_i64();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+
+    if (is_32bit(ctx)) {
+        TCGv t0, t1;
+        t0 = tcg_temp_new();
+        t1 = tcg_temp_new();
+
+        gen_get_gpr(t0, a->rd);
+        gen_get_gpr(t1, a->rd + 1);
+        tcg_gen_concat_tl_i64(src3, t0, t1);
+        tcg_temp_free(t0);
+        tcg_temp_free(t1);
+    } else {
+        TCGv t0;
+        t0 = tcg_temp_new();
+
+        gen_get_gpr(t0, a->rd);
+        tcg_gen_ext_tl_i64(src3, t0);
+        tcg_temp_free(t0);
+    }
+
+    fn(dst, cpu_env, src1, src2, src3);
+
+    set_pair_regs(ctx, dst, a->rd);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+    tcg_temp_free_i64(src3);
+    tcg_temp_free_i64(dst);
+    return true;
+}
+
+#define GEN_RVP_R_D64_ACC_OOL(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_r *a) \
+{ \
+    return r_d64_acc_ool(s, a, gen_helper_##NAME); \
+}
+
+GEN_RVP_R_D64_ACC_OOL(smar64);
+GEN_RVP_R_D64_ACC_OOL(smsr64);
+GEN_RVP_R_D64_ACC_OOL(umar64);
+GEN_RVP_R_D64_ACC_OOL(umsr64);
+GEN_RVP_R_D64_ACC_OOL(kmar64);
+GEN_RVP_R_D64_ACC_OOL(kmsr64);
+GEN_RVP_R_D64_ACC_OOL(ukmar64);
+GEN_RVP_R_D64_ACC_OOL(ukmsr64);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index b8be234d97..59a06c604d 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -2220,3 +2220,158 @@ static inline void do_uksub64(CPURISCVState *env, void *vd, void *va,
 }
 
 RVPR64_64_64(uksub64, 1, 8);
+
+/* 32-bit Multiply with 64-bit Add/Subtract Instructions */
+static inline uint64_t
+rvpr64_acc(CPURISCVState *env, target_ulong a,
+           target_ulong b, uint64_t c,
+           uint8_t step, uint8_t size, PackedFn4i *fn)
+{
+    int i, passes = sizeof(target_ulong) / size;
+    uint64_t result = 0;
+
+    for (i = 0; i < passes; i += step) {
+        fn(env, &result, &a, &b, &c, i);
+    }
+    return result;
+}
+
+#define RVPR64_ACC(NAME, STEP, SIZE) \
+uint64_t HELPER(NAME)(CPURISCVState *env, target_ulong a, \
+                      target_ulong b, uint64_t c) \
+{ \
+    return rvpr64_acc(env, a, b, c, STEP, SIZE, (PackedFn4i *)do_##NAME);\
+}
+
+static inline void do_smar64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int32_t *a = va, *b = vb;
+    int64_t *d = vd, *c = vc;
+    if (i == 0) {
+        *d = *c;
+    }
+    *d += (int64_t)a[H4(i)] * b[H4(i)];
+}
+
+RVPR64_ACC(smar64, 1, 4);
+
+static inline void do_smsr64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int32_t *a = va, *b = vb;
+    int64_t *d = vd, *c = vc;
+    if (i == 0) {
+        *d = *c;
+    }
+    *d -= (int64_t)a[H4(i)] * b[H4(i)];
+}
+
+RVPR64_ACC(smsr64, 1, 4);
+
+static inline void do_umar64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    uint32_t *a = va, *b = vb;
+    uint64_t *d = vd, *c = vc;
+    if (i == 0) {
+        *d = *c;
+    }
+    *d += (uint64_t)a[H4(i)] * b[H4(i)];
+}
+
+RVPR64_ACC(umar64, 1, 4);
+
+static inline void do_umsr64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    uint32_t *a = va, *b = vb;
+    uint64_t *d = vd, *c = vc;
+    if (i == 0) {
+        *d = *c;
+    }
+    *d -= (uint64_t)a[H4(i)] * b[H4(i)];
+}
+
+RVPR64_ACC(umsr64, 1, 4);
+
+static inline void do_kmar64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int32_t *a = va, *b = vb;
+    int64_t *d = vd, *c = vc;
+    int64_t m0 = (int64_t)a[H4(i)] * b[H4(i)];
+    if (!riscv_cpu_is_32bit(env)) {
+        int64_t m1 = (int64_t)a[H4(i + 1)] * b[H4(i + 1)];
+        if (a[H4(i)] == INT32_MIN && b[H4(i)] == INT32_MIN &&
+            a[H4(i + 1)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) {
+            if (*c >= 0) {
+                *d = INT64_MAX;
+                env->vxsat = 1;
+            } else {
+                *d = sadd64(env, 0, *c + m0, m1);
+            }
+        } else {
+            *d = sadd64(env, 0, *c, m0 + m1);
+        }
+    } else {
+        *d = sadd64(env, 0, *c, m0);
+    }
+}
+
+RVPR64_ACC(kmar64, 1, sizeof(target_ulong));
+
+static inline void do_kmsr64(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int32_t *a = va, *b = vb;
+    int64_t *d = vd, *c = vc;
+
+    int64_t m0 = (int64_t)a[H4(i)] * b[H4(i)];
+    if (!riscv_cpu_is_32bit(env)) {
+        int64_t m1 = (int64_t)a[H4(i + 1)] * b[H4(i + 1)];
+        if (a[H4(i)] == INT32_MIN && b[H4(i)] == INT32_MIN &&
+            a[H4(i + 1)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) {
+            if (*c <= 0) {
+                *d = INT64_MIN;
+                env->vxsat = 1;
+            } else {
+                *d = ssub64(env, 0, *c - m0, m1);
+            }
+        } else {
+            *d = ssub64(env, 0, *c, m0 + m1);
+        }
+    } else {
+        *d = ssub64(env, 0, *c, m0);
+    }
+}
+
+RVPR64_ACC(kmsr64, 1, sizeof(target_ulong));
+
+static inline void do_ukmar64(CPURISCVState *env, void *vd, void *va,
+                              void *vb, void *vc, uint8_t i)
+{
+    uint32_t *a = va, *b = vb;
+    uint64_t *d = vd, *c = vc;
+
+    if (i == 0) {
+        *d = *c;
+    }
+    *d = saddu64(env, 0, *d, (uint64_t)a[H4(i)] * b[H4(i)]);
+}
+
+RVPR64_ACC(ukmar64, 1, 4);
+
+static inline void do_ukmsr64(CPURISCVState *env, void *vd, void *va,
+                              void *vb, void *vc, uint8_t i)
+{
+    uint32_t *a = va, *b = vb;
+    uint64_t *d = vd, *c = vc;
+
+    if (i == 0) {
+        *d = *c;
+    }
+    *d = ssubu64(env, 0, *d, (uint64_t)a[H4(i)] * b[H4(i)]);
+}
+
+RVPR64_ACC(ukmsr64, 1, 4);

From patchwork Thu Jun 10 07:58:54 2021
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490268
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei
Subject: [PATCH v2 23/37] target/riscv: Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
Date: Thu, 10 Jun 2021 15:58:54 +0800
Message-Id: <20210610075908.3305506-24-zhiwei_liu@c-sky.com>
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>

Add instructions that add one or two signed 16x16 products to, or
subtract them from, a 64-bit operand.
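
The operand selection can be sketched in plain C for a single 32-bit source word per register (the RV32 view). This is an illustrative sketch, not the helper code in this patch: the function names are hypothetical, and the mapping in the comments (bottom = halfword 0, top = halfword 1) follows the element indices used by the `do_smalbb`, `do_smalbt`, `do_smalda`, `do_smalxda`, and `do_smslda` helpers in the diff. The accumulation wraps modulo 2^64, and the casts back to signed types assume the usual two's-complement conversion, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Low and high signed 16-bit halves of a 32-bit register value. */
static int16_t bottom16(uint32_t x) { return (int16_t)(uint16_t)(x & 0xffffu); }
static int16_t top16(uint32_t x)    { return (int16_t)(uint16_t)(x >> 16); }

/* Wrapping 64-bit add, done in unsigned arithmetic to avoid signed overflow. */
static int64_t wrap_add(int64_t acc, int64_t v)
{
    return (int64_t)((uint64_t)acc + (uint64_t)v);
}

/* SMALBB-style: acc + bottom(a) * bottom(b) */
static int64_t mac_bb(int64_t acc, uint32_t a, uint32_t b)
{
    return wrap_add(acc, (int64_t)bottom16(a) * bottom16(b));
}

/* SMALBT-style: acc + bottom(a) * top(b) */
static int64_t mac_bt(int64_t acc, uint32_t a, uint32_t b)
{
    return wrap_add(acc, (int64_t)bottom16(a) * top16(b));
}

/* SMALDA-style: acc + bottom(a) * bottom(b) + top(a) * top(b) */
static int64_t mac_da(int64_t acc, uint32_t a, uint32_t b)
{
    int64_t sum = (int64_t)bottom16(a) * bottom16(b) +
                  (int64_t)top16(a) * top16(b);
    return wrap_add(acc, sum);
}

/* SMALXDA-style: acc + bottom(a) * top(b) + top(a) * bottom(b) */
static int64_t mac_xda(int64_t acc, uint32_t a, uint32_t b)
{
    int64_t sum = (int64_t)bottom16(a) * top16(b) +
                  (int64_t)top16(a) * bottom16(b);
    return wrap_add(acc, sum);
}

/* SMSLDA-style: acc - (bottom(a) * bottom(b) + top(a) * top(b)) */
static int64_t msub_da(int64_t acc, uint32_t a, uint32_t b)
{
    int64_t sum = (int64_t)bottom16(a) * bottom16(b) +
                  (int64_t)top16(a) * top16(b);
    return wrap_add(acc, -sum);
}
```

For example, with `a = 0x0003FFFE` (top 3, bottom -2) and `b = 0x00050004` (top 5, bottom 4), `mac_da(10, a, b)` adds -8 and 15 to the accumulator, while `mac_xda(10, a, b)` adds -10 and 12.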
Signed-off-by: LIU Zhiwei
---
 target/riscv/helper.h                   |  11 ++
 target/riscv/insn32.decode              |  11 ++
 target/riscv/insn_trans/trans_rvp.c.inc |  12 ++
 target/riscv/packed_helper.c            | 151 ++++++++++++++++++++++++
 4 files changed, 185 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c3c086bed0..87a0779842 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1350,3 +1350,14 @@ DEF_HELPER_4(kmar64, i64, env, tl, tl, i64)
 DEF_HELPER_4(kmsr64, i64, env, tl, tl, i64)
 DEF_HELPER_4(ukmar64, i64, env, tl, tl, i64)
 DEF_HELPER_4(ukmsr64, i64, env, tl, tl, i64)
+
+DEF_HELPER_4(smalbb, i64, env, tl, tl, i64)
+DEF_HELPER_4(smalbt, i64, env, tl, tl, i64)
+DEF_HELPER_4(smaltt, i64, env, tl, tl, i64)
+DEF_HELPER_4(smalda, i64, env, tl, tl, i64)
+DEF_HELPER_4(smalxda, i64, env, tl, tl, i64)
+DEF_HELPER_4(smalds, i64, env, tl, tl, i64)
+DEF_HELPER_4(smalxds, i64, env, tl, tl, i64)
+DEF_HELPER_4(smaldrs, i64, env, tl, tl, i64)
+DEF_HELPER_4(smslda, i64, env, tl, tl, i64)
+DEF_HELPER_4(smslxda, i64, env, tl, tl, i64)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 5d123bbb97..d1668b34cb 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -953,3 +953,14 @@ kmar64     1001010 ..... ..... 001 ..... 1110111 @r
 kmsr64     1001011 ..... ..... 001 ..... 1110111 @r
 ukmar64    1011010 ..... ..... 001 ..... 1110111 @r
 ukmsr64    1011011 ..... ..... 001 ..... 1110111 @r
+
+smalbb     1000100 ..... ..... 001 ..... 1110111 @r
+smalbt     1001100 ..... ..... 001 ..... 1110111 @r
+smaltt     1010100 ..... ..... 001 ..... 1110111 @r
+smalda     1000110 ..... ..... 001 ..... 1110111 @r
+smalxda    1001110 ..... ..... 001 ..... 1110111 @r
+smalds     1000101 ..... ..... 001 ..... 1110111 @r
+smaldrs    1001101 ..... ..... 001 ..... 1110111 @r
+smalxds    1010101 ..... ..... 001 ..... 1110111 @r
+smslda     1010110 ..... ..... 001 ..... 1110111 @r
+smslxda    1011110 ..... ..... 001 ..... 1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index 63b6810227..7c91bdc888 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -657,3 +657,15 @@ GEN_RVP_R_D64_ACC_OOL(kmar64);
 GEN_RVP_R_D64_ACC_OOL(kmsr64);
 GEN_RVP_R_D64_ACC_OOL(ukmar64);
 GEN_RVP_R_D64_ACC_OOL(ukmsr64);
+
+/* Signed 16-bit Multiply with 64-bit Add/Subtract Instructions */
+GEN_RVP_R_D64_ACC_OOL(smalbb);
+GEN_RVP_R_D64_ACC_OOL(smalbt);
+GEN_RVP_R_D64_ACC_OOL(smaltt);
+GEN_RVP_R_D64_ACC_OOL(smalda);
+GEN_RVP_R_D64_ACC_OOL(smalxda);
+GEN_RVP_R_D64_ACC_OOL(smalds);
+GEN_RVP_R_D64_ACC_OOL(smaldrs);
+GEN_RVP_R_D64_ACC_OOL(smalxds);
+GEN_RVP_R_D64_ACC_OOL(smslda);
+GEN_RVP_R_D64_ACC_OOL(smslxda);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 59a06c604d..3330a2ecec 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -2375,3 +2375,154 @@ static inline void do_ukmsr64(CPURISCVState *env, void *vd, void *va,
 }
 
 RVPR64_ACC(ukmsr64, 1, 4);
+
+/* Signed 16-bit Multiply with 64-bit Add/Subtract Instructions */
+static inline void do_smalbb(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i)] * b[H2(i)];
+}
+
+RVPR64_ACC(smalbb, 2, 2);
+
+static inline void do_smalbt(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smalbt, 2, 2);
+
+static inline void do_smaltt(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i + 1)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smaltt, 2, 2);
+
+static inline void do_smalda(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i)] * b[H2(i)] + (int64_t)a[H2(i + 1)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smalda, 2, 2);
+
+static inline void do_smalxda(CPURISCVState *env, void *vd, void *va,
+                              void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i)] * b[H2(i + 1)] + (int64_t)a[H2(i + 1)] * b[H2(i)];
+}
+
+RVPR64_ACC(smalxda, 2, 2);
+
+static inline void do_smalds(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i + 1)] * b[H2(i + 1)] - (int64_t)a[H2(i)] * b[H2(i)];
+}
+
+RVPR64_ACC(smalds, 2, 2);
+
+static inline void do_smaldrs(CPURISCVState *env, void *vd, void *va,
+                              void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i)] * b[H2(i)] - (int64_t)a[H2(i + 1)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smaldrs, 2, 2);
+
+static inline void do_smalxds(CPURISCVState *env, void *vd, void *va,
+                              void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d += (int64_t)a[H2(i + 1)] * b[H2(i)] - (int64_t)a[H2(i)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smalxds, 2, 2);
+
+static inline void do_smslda(CPURISCVState *env, void *vd, void *va,
+                             void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d -= (int64_t)a[H2(i)] * b[H2(i)] + (int64_t)a[H2(i + 1)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smslda, 2, 2);
+
+static inline void do_smslxda(CPURISCVState *env, void *vd, void *va,
+                              void *vb, void *vc, uint8_t i)
+{
+    int64_t *d = vd, *c = vc;
+    int16_t *a = va, *b = vb;
+
+    if (i == 0) {
+        *d = *c;
+    }
+
+    *d -= (int64_t)a[H2(i + 1)] * b[H2(i)] + (int64_t)a[H2(i)] * b[H2(i + 1)];
+}
+
+RVPR64_ACC(smslxda, 2, 2);

From patchwork Thu Jun 10 07:58:55 2021
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490286
From: LIU Zhiwei
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei
Subject: [PATCH v2 24/37] target/riscv: Non-SIMD Q15 saturation ALU Instructions
Date: Thu, 10 Jun 2021 15:58:55 +0800
Message-Id: <20210610075908.3305506-25-zhiwei_liu@c-sky.com>
In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>

Q15 saturation limits the result to the range [INT16_MIN, INT16_MAX].
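
As a worked illustration of that clamp, here is a small C sketch of Q15 saturation applied to an add and to a fractional multiply. It is a sketch under stated assumptions, not the helper code in this patch: the names `sat_q15`, `q15_add`, and `q15_mul` are hypothetical, the saturation flag is a plain `bool` standing in for the `vxsat` state, and `>>` on a negative value is assumed to be an arithmetic shift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clamp a wide intermediate to the Q15 range and report whether it clamped. */
static int16_t sat_q15(int64_t x, bool *sat)
{
    if (x > INT16_MAX) {
        *sat = true;
        return INT16_MAX;
    }
    if (x < INT16_MIN) {
        *sat = true;
        return INT16_MIN;
    }
    return (int16_t)x;
}

/* KADDH-style: add two 32-bit values, then saturate the sum to Q15. */
static int16_t q15_add(int32_t a, int32_t b, bool *sat)
{
    return sat_q15((int64_t)a + b, sat);
}

/* KHMBB-style: multiply two Q15 values, drop 15 fraction bits, saturate. */
static int16_t q15_mul(int16_t a, int16_t b, bool *sat)
{
    return sat_q15(((int64_t)a * b) >> 15, sat);
}
```

The only multiply that saturates is `INT16_MIN * INT16_MIN`: the product is 2^30, which becomes 32768 after the shift and is clamped to `INT16_MAX`.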
Signed-off-by: LIU Zhiwei
---
 target/riscv/helper.h                   |  8 +++
 target/riscv/insn32.decode              |  8 +++
 target/riscv/insn_trans/trans_rvp.c.inc | 12 ++++
 target/riscv/packed_helper.c            | 78 +++++++++++++++++++++++++
 4 files changed, 106 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 87a0779842..6ce22a186e 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1361,3 +1361,11 @@ DEF_HELPER_4(smalxds, i64, env, tl, tl, i64)
 DEF_HELPER_4(smaldrs, i64, env, tl, tl, i64)
 DEF_HELPER_4(smslda, i64, env, tl, tl, i64)
 DEF_HELPER_4(smslxda, i64, env, tl, tl, i64)
+
+DEF_HELPER_3(kaddh, tl, env, tl, tl)
+DEF_HELPER_3(ksubh, tl, env, tl, tl)
+DEF_HELPER_3(khmbb, tl, env, tl, tl)
+DEF_HELPER_3(khmbt, tl, env, tl, tl)
+DEF_HELPER_3(khmtt, tl, env, tl, tl)
+DEF_HELPER_3(ukaddh, tl, env, tl, tl)
+DEF_HELPER_3(uksubh, tl, env, tl, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index d1668b34cb..f465851f03 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -964,3 +964,11 @@ smaldrs    1001101 ..... ..... 001 ..... 1110111 @r
 smalxds    1010101 ..... ..... 001 ..... 1110111 @r
 smslda     1010110 ..... ..... 001 ..... 1110111 @r
 smslxda    1011110 ..... ..... 001 ..... 1110111 @r
+
+kaddh      0000010 ..... ..... 001 ..... 1110111 @r
+ksubh      0000011 ..... ..... 001 ..... 1110111 @r
+khmbb      0000110 ..... ..... 001 ..... 1110111 @r
+khmbt      0001110 ..... ..... 001 ..... 1110111 @r
+khmtt      0010110 ..... ..... 001 ..... 1110111 @r
+ukaddh     0001010 ..... ..... 001 ..... 1110111 @r
+uksubh     0001011 ..... ..... 001 ..... 1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index 7c91bdc888..48eb190bc6 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -669,3 +669,15 @@ GEN_RVP_R_D64_ACC_OOL(smaldrs);
 GEN_RVP_R_D64_ACC_OOL(smalxds);
 GEN_RVP_R_D64_ACC_OOL(smslda);
 GEN_RVP_R_D64_ACC_OOL(smslxda);
+
+/*
+ *** Non-SIMD Instructions
+ */
+/* Non-SIMD Q15 saturation ALU Instructions */
+GEN_RVP_R_OOL(kaddh);
+GEN_RVP_R_OOL(ksubh);
+GEN_RVP_R_OOL(khmbb);
+GEN_RVP_R_OOL(khmbt);
+GEN_RVP_R_OOL(khmtt);
+GEN_RVP_R_OOL(ukaddh);
+GEN_RVP_R_OOL(uksubh);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 3330a2ecec..171f88face 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -2526,3 +2526,81 @@ static inline void do_smslxda(CPURISCVState *env, void *vd, void *va,
 }
 
 RVPR64_ACC(smslxda, 2, 2);
+
+/* Q15 saturation instructions */
+static inline void do_kaddh(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    int32_t *a = va, *b = vb;
+
+    *d = sat64(env, (int64_t)a[H4(i)] + b[H4(i)], 15);
+}
+
+RVPR(kaddh, 2, 4);
+
+static inline void do_ksubh(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    int32_t *a = va, *b = vb;
+
+    *d = sat64(env, (int64_t)a[H4(i)] - b[H4(i)], 15);
+}
+
+RVPR(ksubh, 2, 4);
+
+static inline void do_khmbb(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    int16_t *a = va, *b = vb;
+
+    *d = sat64(env, (int64_t)a[H2(i)] * b[H2(i)] >> 15, 15);
+}
+
+RVPR(khmbb, 4, 2);
+
+static inline void do_khmbt(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    int16_t *a = va, *b = vb;
+
+    *d = sat64(env, (int64_t)a[H2(i)] * b[H2(i + 1)] >> 15, 15);
+}
+
+RVPR(khmbt, 4, 2);
+
+static inline void do_khmtt(CPURISCVState *env, void *vd, void *va,
+                            void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    int16_t *a = va, *b = vb;
+
+    *d = sat64(env, (int64_t)a[H2(i + 1)] * b[H2(i + 1)] >> 15, 15);
+}
+
+RVPR(khmtt, 4, 2);
+
+static inline void do_ukaddh(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    uint32_t *a = va, *b = vb;
+
+    *d = (int16_t)satu64(env, saddu32(env, 0, a[H4(i)], b[H4(i)]), 16);
+}
+
+RVPR(ukaddh, 2, 4);
+
+static inline void do_uksubh(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    target_long *d = vd;
+    uint32_t *a = va, *b = vb;
+
+    *d = (int16_t)satu64(env, ssubu32(env, 0, a[H4(i)], b[H4(i)]), 16);
+}
+
+RVPR(uksubh, 2, 4);

From patchwork Thu Jun 10 07:58:56 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490289
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xwm6nGYz9sPf for ; Thu, 10 Jun 2021 18:29:44 +1000 (AEST)
Received: from localhost ([::1]:35468 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrG4M-0001py-VO for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:29:42 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:43162) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFnI-0006Wr-Tt; Thu, 10 Jun 2021 04:12:04 -0400
Received: from out28-195.mail.aliyun.com ([115.124.28.195]:58044) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1)
(envelope-from ) id 1lrFnF-0004hI-Uo; Thu, 10 Jun 2021 04:12:04 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436283|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.524607-0.00375863-0.471634; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047203; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMkHEM_1623312717; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMkHEM_1623312717) by smtp.aliyun-inc.com(10.147.42.16); Thu, 10 Jun 2021 16:11:57 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 25/37] target/riscv: Non-SIMD Q31 saturation ALU Instructions Date: Thu, 10 Jun 2021 15:58:56 +0800 Message-Id: <20210610075908.3305506-26-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.195; envelope-from=zhiwei_liu@c-sky.com; helo=out28-195.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Q31 saturation is to limit the result to the range [INT32_MIN, INT32_MAX]. 
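As a rough illustration of Q31 saturation, the standalone C sketch below clamps a 64-bit intermediate to the int32_t range. The names sat_q31(), kaddw_sketch(), kdm_sketch() and kabsw_sketch() are hypothetical and are not the helpers added by this patch; the bool flag is an assumed stand-in for the vxsat bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: clamp a 64-bit intermediate to the Q31 range. */
static int32_t sat_q31(int64_t v, bool *ov)
{
    if (v > INT32_MAX) {
        *ov = true;            /* stands in for setting vxsat */
        return INT32_MAX;
    }
    if (v < INT32_MIN) {
        *ov = true;
        return INT32_MIN;
    }
    return (int32_t)v;
}

/* kaddw-like: saturating 32-bit signed addition. */
static int32_t kaddw_sketch(int32_t a, int32_t b, bool *ov)
{
    return sat_q31((int64_t)a + b, ov);
}

/*
 * kdmbb-like: doubling multiply of two Q15 values. Multiplying by 2
 * (rather than shifting left) avoids undefined behaviour for negative
 * products. Only INT16_MIN * INT16_MIN exceeds the Q31 range.
 */
static int32_t kdm_sketch(int16_t a, int16_t b, bool *ov)
{
    return sat_q31((int64_t)a * b * 2, ov);
}

/* kabsw-like: absolute value, with INT32_MIN clamped to INT32_MAX. */
static int32_t kabsw_sketch(int32_t a, bool *ov)
{
    int64_t w = a;

    return sat_q31(w < 0 ? -w : w, ov);
}
```

Widening to 64 bits and clamping once gives the same results as testing the INT16_MIN and INT32_MIN corner cases explicitly, which is the approach the helpers in the diff take.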
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 15 ++ target/riscv/insn32.decode | 16 ++ target/riscv/insn_trans/trans_rvp.c.inc | 17 ++ target/riscv/packed_helper.c | 214 ++++++++++++++++++++++++ 4 files changed, 262 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 6ce22a186e..b3485f95a2 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1369,3 +1369,18 @@ DEF_HELPER_3(khmbt, tl, env, tl, tl) DEF_HELPER_3(khmtt, tl, env, tl, tl) DEF_HELPER_3(ukaddh, tl, env, tl, tl) DEF_HELPER_3(uksubh, tl, env, tl, tl) + +DEF_HELPER_3(kaddw, tl, env, tl, tl) +DEF_HELPER_3(ukaddw, tl, env, tl, tl) +DEF_HELPER_3(ksubw, tl, env, tl, tl) +DEF_HELPER_3(uksubw, tl, env, tl, tl) +DEF_HELPER_3(kdmbb, tl, env, tl, tl) +DEF_HELPER_3(kdmbt, tl, env, tl, tl) +DEF_HELPER_3(kdmtt, tl, env, tl, tl) +DEF_HELPER_3(kslraw, tl, env, tl, tl) +DEF_HELPER_3(kslraw_u, tl, env, tl, tl) +DEF_HELPER_3(ksllw, tl, env, tl, tl) +DEF_HELPER_4(kdmabb, tl, env, tl, tl, tl) +DEF_HELPER_4(kdmabt, tl, env, tl, tl, tl) +DEF_HELPER_4(kdmatt, tl, env, tl, tl, tl) +DEF_HELPER_2(kabsw, tl, env, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index f465851f03..a25294baab 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -972,3 +972,19 @@ khmbt 0001110 ..... ..... 001 ..... 1110111 @r khmtt 0010110 ..... ..... 001 ..... 1110111 @r ukaddh 0001010 ..... ..... 001 ..... 1110111 @r uksubh 0001011 ..... ..... 001 ..... 1110111 @r + +kaddw 0000000 ..... ..... 001 ..... 1110111 @r +ukaddw 0001000 ..... ..... 001 ..... 1110111 @r +ksubw 0000001 ..... ..... 001 ..... 1110111 @r +uksubw 0001001 ..... ..... 001 ..... 1110111 @r +kdmbb 0000101 ..... ..... 001 ..... 1110111 @r +kdmbt 0001101 ..... ..... 001 ..... 1110111 @r +kdmtt 0010101 ..... ..... 001 ..... 1110111 @r +kslraw 0110111 ..... ..... 001 ..... 1110111 @r +kslraw_u 0111111 ..... ..... 001 ..... 1110111 @r +ksllw 0010011 ..... ..... 001 ..... 
1110111 @r +kslliw 0011011 ..... ..... 001 ..... 1110111 @sh5 +kdmabb 1101001 ..... ..... 001 ..... 1110111 @r +kdmabt 1110001 ..... ..... 001 ..... 1110111 @r +kdmatt 1111001 ..... ..... 001 ..... 1110111 @r +kabsw 1010110 10100 ..... 000 ..... 1110111 @r2 diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 48eb190bc6..d2c7ab1440 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -681,3 +681,20 @@ GEN_RVP_R_OOL(khmbt); GEN_RVP_R_OOL(khmtt); GEN_RVP_R_OOL(ukaddh); GEN_RVP_R_OOL(uksubh); + +/* Non-SIMD Q31 saturation ALU Instructions */ +GEN_RVP_R_OOL(kaddw); +GEN_RVP_R_OOL(ukaddw); +GEN_RVP_R_OOL(ksubw); +GEN_RVP_R_OOL(uksubw); +GEN_RVP_R_OOL(kdmbb); +GEN_RVP_R_OOL(kdmbt); +GEN_RVP_R_OOL(kdmtt); +GEN_RVP_R_OOL(kslraw); +GEN_RVP_R_OOL(kslraw_u); +GEN_RVP_R_OOL(ksllw); +GEN_RVP_SHIFTI(kslliw, NULL, gen_helper_ksllw); +GEN_RVP_R_ACC_OOL(kdmabb); +GEN_RVP_R_ACC_OOL(kdmabt); +GEN_RVP_R_ACC_OOL(kdmatt); +GEN_RVP_R2_OOL(kabsw); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 171f88face..89d203730d 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -2604,3 +2604,217 @@ static inline void do_uksubh(CPURISCVState *env, void *vd, void *va, } RVPR(uksubh, 2, 4); + +/* Q31 saturation Instructions */ +static inline void do_kaddw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va, *b = vb; + + *d = sadd32(env, 0, a[H4(i)], b[H4(i)]); +} + +RVPR(kaddw, 2, 4); + +static inline void do_ukaddw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + uint32_t *a = va, *b = vb; + + *d = (int32_t)saddu32(env, 0, a[H4(i)], b[H4(i)]); +} + +RVPR(ukaddw, 2, 4); + +static inline void do_ksubw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va, *b = vb; + + *d = ssub32(env, 0, a[H4(i)], 
b[H4(i)]); +} + +RVPR(ksubw, 2, 4); + +static inline void do_uksubw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + uint32_t *a = va, *b = vb; + + *d = (int32_t)ssubu32(env, 0, a[H4(i)], b[H4(i)]); +} + +RVPR(uksubw, 2, 4); + +static inline void do_kdmbb(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int16_t *a = va, *b = vb; + + if (a[H2(i)] == INT16_MIN && b[H2(i)] == INT16_MIN) { + *d = INT32_MAX; + env->vxsat = 0x1; + } else { + *d = (int64_t)a[H2(i)] * b[H2(i)] << 1; + } +} + +RVPR(kdmbb, 4, 2); + +static inline void do_kdmbt(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int16_t *a = va, *b = vb; + + if (a[H2(i)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + *d = INT32_MAX; + env->vxsat = 0x1; + } else { + *d = (int64_t)a[H2(i)] * b[H2(i + 1)] << 1; + } +} + +RVPR(kdmbt, 4, 2); + +static inline void do_kdmtt(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int16_t *a = va, *b = vb; + + if (a[H2(i + 1)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + *d = INT32_MAX; + env->vxsat = 0x1; + } else { + *d = (int64_t)a[H2(i + 1)] * b[H2(i + 1)] << 1; + } +} + +RVPR(kdmtt, 4, 2); + +static inline void do_kslraw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va; + int32_t shift = sextract32((*(uint32_t *)vb), 0, 6); + + if (shift >= 0) { + *d = (int32_t)sat64(env, (int64_t)a[H4(i)] << shift, 31); + } else { + shift = -shift; + shift = (shift == 32) ? 
31 : shift; + *d = a[H4(i)] >> shift; + } +} + +RVPR(kslraw, 2, 4); + +static inline void do_kslraw_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va; + int32_t shift = sextract32((*(uint32_t *)vb), 0, 6); + + if (shift >= 0) { + *d = (int32_t)sat64(env, (int64_t)a[H4(i)] << shift, 31); + } else { + shift = -shift; + shift = (shift == 32) ? 31 : shift; + *d = vssra32(env, 0, a[H4(i)], shift); + } +} + +RVPR(kslraw_u, 2, 4); + +static inline void do_ksllw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + + *d = (int32_t)sat64(env, (int64_t)a[H4(i)] << shift, 31); +} + +RVPR(ksllw, 2, 4); + +static inline void do_kdmabb(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) + +{ + target_long *d = vd; + int16_t *a = va, *b = vb; + int32_t *c = vc, m0; + + if (a[H2(i)] == INT16_MIN && b[H2(i)] == INT16_MIN) { + m0 = INT32_MAX; + env->vxsat = 0x1; + } else { + m0 = (int32_t)a[H2(i)] * b[H2(i)] << 1; + } + *d = sadd32(env, 0, c[H4(i)], m0); +} + +RVPR_ACC(kdmabb, 4, 2); + +static inline void do_kdmabt(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) + +{ + target_long *d = vd; + int16_t *a = va, *b = vb; + int32_t *c = vc, m0; + + if (a[H2(i)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + m0 = INT32_MAX; + env->vxsat = 0x1; + } else { + m0 = (int32_t)a[H2(i)] * b[H2(i + 1)] << 1; + } + *d = sadd32(env, 0, c[H4(i)], m0); +} + +RVPR_ACC(kdmabt, 4, 2); + +static inline void do_kdmatt(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) + +{ + target_long *d = vd; + int16_t *a = va, *b = vb; + int32_t *c = vc, m0; + + if (a[H2(i + 1)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + m0 = INT32_MAX; + env->vxsat = 0x1; + } else { + m0 = (int32_t)a[H2(i + 1)] * b[H2(i + 1)] << 1; + } + *d = sadd32(env, 0, c[H4(i)], m0); +} + +RVPR_ACC(kdmatt, 4, 2); + +static 
inline void do_kabsw(CPURISCVState *env, void *vd, void *va, uint8_t i) + +{ + target_long *d = vd; + int32_t *a = va; + + if (a[H4(i)] == INT32_MIN) { + *d = INT32_MAX; + env->vxsat = 0x1; + } else { + *d = (int32_t)abs(a[H4(i)]); + } +} + +RVPR2(kabsw, 2, 4); From patchwork Thu Jun 10 07:58:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490243 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xZF1Nzxz9s1l for ; Thu, 10 Jun 2021 18:13:41 +1000 (AEST) Received: from localhost ([::1]:38568 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFop-00086s-5W for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:13:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43312) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFnp-0007UX-58; Thu, 10 Jun 2021 04:12:37 -0400 Received: from out28-3.mail.aliyun.com ([115.124.28.3]:46747) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFnm-0004zS-Qw; Thu, 10 Jun 2021 04:12:36 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07645152|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.47609-0.00722773-0.516682; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047199; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMlovd_1623312747; Received: from 
localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMlovd_1623312747) by smtp.aliyun-inc.com(10.147.44.145); Thu, 10 Jun 2021 16:12:27 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 26/37] target/riscv: 32-bit Computation Instructions Date: Thu, 10 Jun 2021 15:58:57 +0800 Message-Id: <20210610075908.3305506-27-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.3; envelope-from=zhiwei_liu@c-sky.com; helo=out28-3.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" 32-bit halving addition or subtraction, maximum, minimum, or multiply. 
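The halving and widening-multiply operations can be illustrated with a minimal standalone C sketch. The *_sketch names are hypothetical and are not the hadd32/haddu32/hsub32 helpers the patch relies on; the point is only that the intermediate is computed in 64 bits so the carry is not lost before the shift.

```c
#include <stdint.h>

/*
 * Signed halving add/sub. Assumes >> on a negative value is an
 * arithmetic shift (implementation-defined in C, true for GCC/Clang).
 * The halved result always fits in 32 bits, so it cannot overflow.
 */
static int32_t raddw_sketch(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a + b) >> 1);
}

static int32_t rsubw_sketch(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a - b) >> 1);
}

/* Unsigned halving add: the 33rd bit of the sum survives the shift. */
static uint32_t uraddw_sketch(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a + b) >> 1);
}

/* mulsr64-like: signed 32 x 32 -> 64-bit product. */
static int64_t mulsr64_sketch(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}

/* mulr64-like: unsigned 32 x 32 -> 64-bit product. */
static uint64_t mulr64_sketch(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```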
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 9 +++ target/riscv/insn32.decode | 9 +++ target/riscv/insn_trans/trans_rvp.c.inc | 10 +++ target/riscv/packed_helper.c | 92 +++++++++++++++++++++++++ 4 files changed, 120 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index b3485f95a2..3063b583f3 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1384,3 +1384,12 @@ DEF_HELPER_4(kdmabb, tl, env, tl, tl, tl) DEF_HELPER_4(kdmabt, tl, env, tl, tl, tl) DEF_HELPER_4(kdmatt, tl, env, tl, tl, tl) DEF_HELPER_2(kabsw, tl, env, tl) + +DEF_HELPER_3(raddw, tl, env, tl, tl) +DEF_HELPER_3(uraddw, tl, env, tl, tl) +DEF_HELPER_3(rsubw, tl, env, tl, tl) +DEF_HELPER_3(ursubw, tl, env, tl, tl) +DEF_HELPER_3(maxw, tl, env, tl, tl) +DEF_HELPER_3(minw, tl, env, tl, tl) +DEF_HELPER_3(mulr64, i64, env, tl, tl) +DEF_HELPER_3(mulsr64, i64, env, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index a25294baab..9cfe5570b0 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -988,3 +988,12 @@ kdmabb 1101001 ..... ..... 001 ..... 1110111 @r kdmabt 1110001 ..... ..... 001 ..... 1110111 @r kdmatt 1111001 ..... ..... 001 ..... 1110111 @r kabsw 1010110 10100 ..... 000 ..... 1110111 @r2 + +raddw 0010000 ..... ..... 001 ..... 1110111 @r +uraddw 0011000 ..... ..... 001 ..... 1110111 @r +rsubw 0010001 ..... ..... 001 ..... 1110111 @r +ursubw 0011001 ..... ..... 001 ..... 1110111 @r +maxw 1111001 ..... ..... 000 ..... 1110111 @r +minw 1111000 ..... ..... 000 ..... 1110111 @r +mulr64 1111000 ..... ..... 001 ..... 1110111 @r +mulsr64 1110000 ..... ..... 001 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index d2c7ab1440..b720c6e037 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -698,3 +698,13 @@ GEN_RVP_R_ACC_OOL(kdmabb); GEN_RVP_R_ACC_OOL(kdmabt); GEN_RVP_R_ACC_OOL(kdmatt); GEN_RVP_R2_OOL(kabsw); + +/* 32-bit Computation Instructions */ +GEN_RVP_R_OOL(raddw); +GEN_RVP_R_OOL(uraddw); +GEN_RVP_R_OOL(rsubw); +GEN_RVP_R_OOL(ursubw); +GEN_RVP_R_OOL(minw); +GEN_RVP_R_OOL(maxw); +GEN_RVP_R_D64_OOL(mulr64); +GEN_RVP_R_D64_OOL(mulsr64); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 89d203730d..c0e3b6bbdb 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -2818,3 +2818,95 @@ static inline void do_kabsw(CPURISCVState *env, void *vd, void *va, uint8_t i) } RVPR2(kabsw, 2, 4); + +/* 32-bit Computation Instructions */ +static inline void do_raddw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *a = va, *b = vb; + target_long *d = vd; + + *d = hadd32(a[H4(i)], b[H4(i)]); +} + +RVPR(raddw, 2, 4); + +static inline void do_uraddw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *a = va, *b = vb; + target_long *d = vd; + + *d = (int32_t)haddu32(a[H4(i)], b[H4(i)]); +} + +RVPR(uraddw, 2, 4); + +static inline void do_rsubw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *a = va, *b = vb; + target_long *d = vd; + + *d = hsub32(a[H4(i)], b[H4(i)]); +} + +RVPR(rsubw, 2, 4); + +static inline void do_ursubw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *a = va, *b = vb; + target_long *d = vd; + + *d = (int32_t)hsubu64(a[H4(i)], b[H4(i)]); +} + +RVPR(ursubw, 2, 4); + +static inline void do_maxw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va, *b = vb; + + *d = (a[H4(i)] > b[H4(i)]) ? 
a[H4(i)] : b[H4(i)]; +} + +RVPR(maxw, 2, 4); + +static inline void do_minw(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int32_t *a = va, *b = vb; + + *d = (a[H4(i)] < b[H4(i)]) ? a[H4(i)] : b[H4(i)]; +} + +RVPR(minw, 2, 4); + +static inline void do_mulr64(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint64_t *d = vd; + uint32_t *a = va, *b = vb; + + *d = (uint64_t)a[H4(0)] * b[H4(0)]; +} + +RVPR64(mulr64); + +static inline void do_mulsr64(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int64_t result; + int32_t *a = va, *b = vb; + + result = (int64_t)a[H4(0)] * b[H4(0)]; + d[H4(1)] = result >> 32; + d[H4(0)] = result & UINT32_MAX; +} + +RVPR64(mulsr64); From patchwork Thu Jun 10 07:58:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490291 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0y0P5V4nz9sRN for ; Thu, 10 Jun 2021 18:32:53 +1000 (AEST) Received: from localhost ([::1]:42278 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrG7P-0006gg-Oj for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:32:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43392) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFoH-00085i-B8; Thu, 10 Jun 2021 04:13:05 -0400 Received: from 
out28-217.mail.aliyun.com ([115.124.28.217]:35614) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFoE-0005Gu-CP; Thu, 10 Jun 2021 04:13:05 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436282|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.773602-0.00839421-0.218004; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047201; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMkIIM_1623312778; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMkIIM_1623312778) by smtp.aliyun-inc.com(10.147.42.16); Thu, 10 Jun 2021 16:12:58 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 27/37] target/riscv: Non-SIMD Miscellaneous Instructions Date: Thu, 10 Jun 2021 15:58:58 +0800 Message-Id: <20210610075908.3305506-28-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.217; envelope-from=zhiwei_liu@c-sky.com; helo=out28-217.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Bit reverse, average, rounding shift, extract and insert byte instructions. 
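The operations listed above can be sketched in standalone C on 32-bit operands. This is a hedged illustration only: the *_sketch names and the fixed 32-bit width are assumptions made for the example, whereas the helpers in the diff operate on target_long/target_ulong and go through TCG.

```c
#include <stdint.h>

/* bpick-like: take bits from a where mask c is 1, from b where it is 0. */
static uint32_t bpick_sketch(uint32_t a, uint32_t b, uint32_t c)
{
    return (c & a) | (~c & b);
}

/*
 * ave-like: (a + b + 1) >> 1 computed in 64 bits so the sum cannot
 * overflow. Assumes >> on a negative value is an arithmetic shift.
 */
static int32_t ave_sketch(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a + b + 1) >> 1);
}

/* sra_u-like: arithmetic right shift, rounding half up. */
static int32_t sra_u_sketch(int32_t a, unsigned n)
{
    n &= 31;
    if (n == 0) {
        return a;
    }
    return (int32_t)((((int64_t)a >> (n - 1)) + 1) >> 1);
}

/* bitrev-like: reverse the low (msb + 1) bits of a, clear the rest. */
static uint32_t bitrev_sketch(uint32_t a, unsigned msb)
{
    uint32_t r = 0;

    for (unsigned i = 0; i <= (msb & 31); i++) {
        r = (r << 1) | ((a >> i) & 1u);
    }
    return r;
}

/* insb-like: insert the low byte of src into byte position pos of dst. */
static uint32_t insb_sketch(uint32_t dst, uint32_t src, unsigned pos)
{
    unsigned sh = (pos & 3) * 8;

    return (dst & ~(0xffu << sh)) | ((src & 0xffu) << sh);
}
```

The bit-reverse loop is written for clarity; the helper in the diff reaches the same result by reversing the whole word with revbit64() and shifting right.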
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 6 + target/riscv/insn32.decode | 16 ++ target/riscv/insn_trans/trans_rvp.c.inc | 241 ++++++++++++++++++++++++ target/riscv/packed_helper.c | 77 ++++++++ 4 files changed, 340 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 3063b583f3..bdd5ca1251 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1393,3 +1393,9 @@ DEF_HELPER_3(maxw, tl, env, tl, tl) DEF_HELPER_3(minw, tl, env, tl, tl) DEF_HELPER_3(mulr64, i64, env, tl, tl) DEF_HELPER_3(mulsr64, i64, env, tl, tl) + +DEF_HELPER_3(ave, tl, env, tl, tl) +DEF_HELPER_3(sra_u, tl, env, tl, tl) +DEF_HELPER_3(bitrev, tl, env, tl, tl) +DEF_HELPER_3(wext, tl, env, i64, tl) +DEF_HELPER_4(bpick, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 9cfe5570b0..b70f6f0dc2 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -26,6 +26,7 @@ %sh7 20:7 %sh4 20:4 %sh3 20:3 +%sh6 20:6 %csr 20:12 %rm 12:3 %nf 29:3 !function=ex_plus_1 @@ -44,6 +45,7 @@ &j imm rd &r rd rs1 rs2 &r2 rd rs1 +&r4 rd rs1 rs2 rs3 &s imm rs1 rs2 &u imm rd &shift shamt rs1 rd @@ -65,6 +67,7 @@ @sh ...... ...... ..... ... ..... ....... &shift shamt=%sh7 %rs1 %rd @sh4 ...... ...... ..... ... ..... ....... &shift shamt=%sh4 %rs1 %rd @sh3 ...... ...... ..... ... ..... ....... &shift shamt=%sh3 %rs1 %rd +@sh6 ...... ...... ..... ... ..... ....... &shift shamt=%sh6 %rs1 %rd @csr ............ ..... ... ..... ....... %csr %rs1 %rd @atom_ld ..... aq:1 rl:1 ..... ........ ..... ....... &atomic rs2=0 %rs1 %rd @@ -74,6 +77,7 @@ @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd @r2 ....... ..... ..... ... ..... ....... &r2 %rs1 %rd +@r4 ..... .. ..... ..... ... ..... ....... %rs3 %rs2 %rs1 %rd @r2_nfvm ... ... vm:1 ..... ..... ... ..... ....... &r2nfvm %nf %rs1 %rd @r2_vm ...... vm:1 ..... ..... ... ..... ....... &rmr %rs2 %rd @r1_vm ...... 
vm:1 ..... ..... ... ..... ....... %rd @@ -997,3 +1001,15 @@ maxw 1111001 ..... ..... 000 ..... 1110111 @r minw 1111000 ..... ..... 000 ..... 1110111 @r mulr64 1111000 ..... ..... 001 ..... 1110111 @r mulsr64 1110000 ..... ..... 001 ..... 1110111 @r + +ave 1110000 ..... ..... 000 ..... 1110111 @r +sra_u 0010010 ..... ..... 001 ..... 1110111 @r +srai_u 110101 ...... ..... 001 ..... 1110111 @sh6 +bitrev 1110011 ..... ..... 000 ..... 1110111 @r +bitrevi 111010 ...... ..... 000 ..... 1110111 @sh6 +wext 1100111 ..... ..... 000 ..... 1110111 @r +wexti 1101111 ..... ..... 000 ..... 1110111 @sh5 +bpick .....00 ..... ..... 011 ..... 1110111 @r4 +insb 1010110 00 ... ..... 000 ..... 1110111 @sh3 +maddr32 1100010 ..... ..... 001 ..... 1110111 @r +msubr32 1100011 ..... ..... 001 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index b720c6e037..51e140d157 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -708,3 +708,244 @@ GEN_RVP_R_OOL(minw); GEN_RVP_R_OOL(maxw); GEN_RVP_R_D64_OOL(mulr64); GEN_RVP_R_D64_OOL(mulsr64); + +/* Non-SIMD Miscellaneous Instructions */ +GEN_RVP_R_OOL(ave); +GEN_RVP_R_OOL(sra_u); +GEN_RVP_SHIFTI(srai_u, NULL, gen_helper_sra_u); +GEN_RVP_R_OOL(bitrev); +GEN_RVP_SHIFTI(bitrevi, NULL, gen_helper_bitrev); + +static bool +r_s64_ool(DisasContext *ctx, arg_r *a, + void (* fn)(TCGv, TCGv_ptr, TCGv_i64, TCGv)) +{ + TCGv_i64 src1; + TCGv src2, dst; + + if (!has_ext(ctx, RVP) || !ctx->ext_psfoperand) { + return false; + } + + src1 = tcg_temp_new_i64(); + src2 = tcg_temp_new(); + dst = tcg_temp_new(); + + if (is_32bit(ctx)) { + TCGv t0, t1; + t0 = tcg_temp_new(); + t1 = tcg_temp_new(); + gen_get_gpr(t0, a->rs1); + gen_get_gpr(t1, a->rs1 + 1); + tcg_gen_concat_tl_i64(src1, t0, t1); + tcg_temp_free(t0); + tcg_temp_free(t1); + } else { + TCGv t0; + t0 = tcg_temp_new(); + gen_get_gpr(t0, a->rs1); + tcg_gen_ext_tl_i64(src1, t0); + tcg_temp_free(t0); + } + 
gen_get_gpr(src2, a->rs2); + fn(dst, cpu_env, src1, src2); + gen_set_gpr(a->rd, dst); + + tcg_temp_free_i64(src1); + tcg_temp_free(src2); + tcg_temp_free(dst); + return true; +} + +#define GEN_RVP_R_S64_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + return r_s64_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP_R_S64_OOL(wext); + +static bool rvp_shifti_s64_ool(DisasContext *ctx, arg_shift *a, + void (* fn)(TCGv, TCGv_ptr, TCGv_i64, TCGv)) +{ + TCGv_i64 src1; + TCGv shift, dst; + + if (!has_ext(ctx, RVP) || !ctx->ext_psfoperand) { + return false; + } + + src1 = tcg_temp_new_i64(); + dst = tcg_temp_new(); + + if (is_32bit(ctx)) { + TCGv t0, t1; + t0 = tcg_temp_new(); + t1 = tcg_temp_new(); + gen_get_gpr(t0, a->rs1); + gen_get_gpr(t1, a->rs1 + 1); + tcg_gen_concat_tl_i64(src1, t0, t1); + tcg_temp_free(t0); + tcg_temp_free(t1); + } else { + TCGv t0; + t0 = tcg_temp_new(); + gen_get_gpr(t0, a->rs1); + tcg_gen_ext_tl_i64(src1, t0); + tcg_temp_free(t0); + } + shift = tcg_const_tl(a->shamt); + fn(dst, cpu_env, src1, shift); + gen_set_gpr(a->rd, dst); + + tcg_temp_free_i64(src1); + tcg_temp_free(shift); + tcg_temp_free(dst); + return true; +} + +#define GEN_RVP_SHIFTI_S64_OOL(NAME, OP) \ +static bool trans_##NAME(DisasContext *s, arg_shift *a) \ +{ \ + return rvp_shifti_s64_ool(s, a, gen_helper_##OP); \ +} + +GEN_RVP_SHIFTI_S64_OOL(wexti, wext); + +typedef void gen_helper_rvp_r4(TCGv, TCGv_ptr, TCGv, TCGv, TCGv); + +static bool r4_ool(DisasContext *ctx, arg_r4 *a, gen_helper_rvp_r4 *fn) +{ + TCGv src1, src2, src3, dst; + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new(); + src2 = tcg_temp_new(); + src3 = tcg_temp_new(); + dst = tcg_temp_new(); + + gen_get_gpr(src1, a->rs1); + gen_get_gpr(src2, a->rs2); + gen_get_gpr(src3, a->rs3); + fn(dst, cpu_env, src1, src2, src3); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(src2); + tcg_temp_free(src3); + tcg_temp_free(dst); + return true; +} + +#define 
GEN_RVP_R4_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r4 *a) \ +{ \ + return r4_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP_R4_OOL(bpick); + +static bool trans_insb(DisasContext *ctx, arg_shift *a) +{ + TCGv src1, dst, b0; + uint8_t shift; + if (!has_ext(ctx, RVP)) { + return false; + } + if (is_32bit(ctx)) { + shift = a->shamt & 0x3; + } else { + shift = a->shamt; + } + src1 = tcg_temp_new(); + dst = tcg_temp_new(); + b0 = tcg_temp_new(); + + gen_get_gpr(src1, a->rs1); + gen_get_gpr(dst, a->rd); + + tcg_gen_andi_tl(b0, src1, 0xff); + tcg_gen_deposit_tl(dst, dst, b0, shift * 8, 8); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(dst); + tcg_temp_free(b0); + return true; +} + +static bool trans_maddr32(DisasContext *ctx, arg_r *a) +{ + TCGv src1, src2, dst; + TCGv_i32 w1, w2, w3; + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new(); + src2 = tcg_temp_new(); + dst = tcg_temp_new(); + w1 = tcg_temp_new_i32(); + w2 = tcg_temp_new_i32(); + w3 = tcg_temp_new_i32(); + + gen_get_gpr(src1, a->rs1); + gen_get_gpr(src2, a->rs2); + gen_get_gpr(dst, a->rd); + + tcg_gen_trunc_tl_i32(w1, src1); + tcg_gen_trunc_tl_i32(w2, src2); + tcg_gen_trunc_tl_i32(w3, dst); + + tcg_gen_mul_i32(w1, w1, w2); + tcg_gen_add_i32(w3, w3, w1); + tcg_gen_ext_i32_tl(dst, w3); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(src2); + tcg_temp_free(dst); + tcg_temp_free_i32(w1); + tcg_temp_free_i32(w2); + tcg_temp_free_i32(w3); + return true; +} + +static bool trans_msubr32(DisasContext *ctx, arg_r *a) +{ + TCGv src1, src2, dst; + TCGv_i32 w1, w2, w3; + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new(); + src2 = tcg_temp_new(); + dst = tcg_temp_new(); + w1 = tcg_temp_new_i32(); + w2 = tcg_temp_new_i32(); + w3 = tcg_temp_new_i32(); + + gen_get_gpr(src1, a->rs1); + gen_get_gpr(src2, a->rs2); + gen_get_gpr(dst, a->rd); + + tcg_gen_trunc_tl_i32(w1, src1); + tcg_gen_trunc_tl_i32(w2, src2); + 
tcg_gen_trunc_tl_i32(w3, dst); + + tcg_gen_mul_i32(w1, w1, w2); + tcg_gen_sub_i32(w3, w3, w1); + tcg_gen_ext_i32_tl(dst, w3); + gen_set_gpr(a->rd, dst); + + tcg_temp_free(src1); + tcg_temp_free(src2); + tcg_temp_free(dst); + tcg_temp_free_i32(w1); + tcg_temp_free_i32(w2); + tcg_temp_free_i32(w3); + return true; +} diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index c0e3b6bbdb..4e0c7a92eb 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -2910,3 +2910,80 @@ static inline void do_mulsr64(CPURISCVState *env, void *vd, void *va, } RVPR64(mulsr64); + +/* Miscellaneous Instructions */ +static inline void do_ave(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd, *a = va, *b = vb, half; + + half = hadd64(*a, *b); + if ((*a ^ *b) & 0x1) { + half++; + } + *d = half; +} + +RVPR(ave, 1, sizeof(target_ulong)); + +static inline void do_sra_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd, *a = va; + uint8_t *b = vb; + uint8_t shift = riscv_has_ext(env, RV32) ? (*b & 0x1f) : (*b & 0x3f); + + *d = vssra64(env, 0, *a, shift); +} + +RVPR(sra_u, 1, sizeof(target_ulong)); + +static inline void do_bitrev(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_ulong *d = vd, *a = va; + uint8_t *b = vb; + uint8_t shift = riscv_has_ext(env, RV32) ? 
(*b & 0x1f) : (*b & 0x3f); + + *d = revbit64(*a) >> (64 - shift - 1); +} + +RVPR(bitrev, 1, sizeof(target_ulong)); + +static inline target_ulong +rvpr_64(CPURISCVState *env, uint64_t a, target_ulong b, PackedFn3 *fn) +{ + target_ulong result = 0; + + fn(env, &result, &a, &b); + return result; +} + +#define RVPR_64(NAME) \ +target_ulong HELPER(NAME)(CPURISCVState *env, uint64_t a, \ + target_ulong b) \ +{ \ + return rvpr_64(env, a, b, (PackedFn3 *)do_##NAME); \ +} + +static inline void do_wext(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + target_long *d = vd; + int64_t *a = va; + uint8_t b = *(uint8_t *)vb & 0x1f; + + *d = sextract64(*a, b, 32); +} + +RVPR_64(wext); + +static inline void do_bpick(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + target_long *d = vd, *a = va, *b = vb, *c = vc; + + *d = (*c & *a) | (~*c & *b); +} + +RVPR_ACC(bpick, 1, sizeof(target_ulong)); From patchwork Thu Jun 10 07:58:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490285 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xtX6cGzz9sRN for ; Thu, 10 Jun 2021 18:27:48 +1000 (AEST) Received: from localhost ([::1]:55338 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrG2S-0004k5-Tj for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:27:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43568) by 
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFor-0001Po-45; Thu, 10 Jun 2021 04:13:41 -0400 Received: from out28-149.mail.aliyun.com ([115.124.28.149]:41929) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFom-0005bZ-Rf; Thu, 10 Jun 2021 04:13:40 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436284|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.582977-0.00815396-0.408869; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047188; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQNP47m_1623312808; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQNP47m_1623312808) by smtp.aliyun-inc.com(10.147.40.26); Thu, 10 Jun 2021 16:13:28 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 28/37] target/riscv: RV64 Only SIMD 32-bit Add/Subtract Instructions Date: Thu, 10 Jun 2021 15:58:59 +0800 Message-Id: <20210610075908.3305506-29-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=115.124.28.149; envelope-from=zhiwei_liu@c-sky.com; helo=out28-149.mail.aliyun.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" 
SIMD 32-bit straight or crossed add/subtract with rounding, halving, or saturation. Signed-off-by: LIU Zhiwei --- include/tcg/tcg-op-gvec.h | 4 + target/riscv/helper.h | 29 +++ target/riscv/insn32.decode | 32 +++ target/riscv/insn_trans/trans_rvp.c.inc | 84 ++++++++ target/riscv/packed_helper.c | 276 ++++++++++++++++++++++++ 5 files changed, 425 insertions(+) diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h index 91531ecb0b..023190e063 100644 --- a/include/tcg/tcg-op-gvec.h +++ b/include/tcg/tcg-op-gvec.h @@ -422,6 +422,8 @@ void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); #define tcg_gen_vec_shl8i_tl tcg_gen_vec_shl8i_i64 #define tcg_gen_vec_shr8i_tl tcg_gen_vec_shr8i_i64 #define tcg_gen_vec_sar8i_tl tcg_gen_vec_sar8i_i64 +#define tcg_gen_vec_add32_tl tcg_gen_vec_add32_i64 +#define tcg_gen_vec_sub32_tl tcg_gen_vec_sub32_i64 #else #define tcg_gen_vec_add16_tl tcg_gen_vec_add16_i32 #define tcg_gen_vec_sub16_tl tcg_gen_vec_sub16_i32 @@ -433,6 +435,8 @@ void tcg_gen_vec_rotl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c); #define tcg_gen_vec_shl8i_tl tcg_gen_vec_shl8i_i32 #define tcg_gen_vec_shr8i_tl tcg_gen_vec_shr8i_i32 #define tcg_gen_vec_sar8i_tl tcg_gen_vec_sar8i_i32 +#define tcg_gen_vec_add32_tl tcg_gen_add_i32 +#define tcg_gen_vec_sub32_tl tcg_gen_sub_i32 #endif #endif diff --git a/target/riscv/helper.h b/target/riscv/helper.h index bdd5ca1251..0f02e140f5 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1399,3 +1399,32 @@ DEF_HELPER_3(sra_u, tl, env, tl, tl) DEF_HELPER_3(bitrev, tl, env, tl, tl) DEF_HELPER_3(wext, tl, env, i64, tl) DEF_HELPER_4(bpick, tl, env, tl, tl, tl) + +DEF_HELPER_3(radd32, i64, env, i64, i64) +DEF_HELPER_3(uradd32, i64, env, i64, i64) +DEF_HELPER_3(kadd32, i64, env, i64, i64) +DEF_HELPER_3(ukadd32, i64, env, i64, i64) +DEF_HELPER_3(rsub32, i64, env, i64, i64) +DEF_HELPER_3(ursub32, i64, env, i64, i64) +DEF_HELPER_3(ksub32, i64, env, i64, i64) +DEF_HELPER_3(uksub32, i64, env, i64, i64)
+DEF_HELPER_3(cras32, i64, env, i64, i64) +DEF_HELPER_3(rcras32, i64, env, i64, i64) +DEF_HELPER_3(urcras32, i64, env, i64, i64) +DEF_HELPER_3(kcras32, i64, env, i64, i64) +DEF_HELPER_3(ukcras32, i64, env, i64, i64) +DEF_HELPER_3(crsa32, i64, env, i64, i64) +DEF_HELPER_3(rcrsa32, i64, env, i64, i64) +DEF_HELPER_3(urcrsa32, i64, env, i64, i64) +DEF_HELPER_3(kcrsa32, i64, env, i64, i64) +DEF_HELPER_3(ukcrsa32, i64, env, i64, i64) +DEF_HELPER_3(stas32, i64, env, i64, i64) +DEF_HELPER_3(rstas32, i64, env, i64, i64) +DEF_HELPER_3(urstas32, i64, env, i64, i64) +DEF_HELPER_3(kstas32, i64, env, i64, i64) +DEF_HELPER_3(ukstas32, i64, env, i64, i64) +DEF_HELPER_3(stsa32, i64, env, i64, i64) +DEF_HELPER_3(rstsa32, i64, env, i64, i64) +DEF_HELPER_3(urstsa32, i64, env, i64, i64) +DEF_HELPER_3(kstsa32, i64, env, i64, i64) +DEF_HELPER_3(ukstsa32, i64, env, i64, i64) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index b70f6f0dc2..05151c6c51 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1013,3 +1013,35 @@ bpick .....00 ..... ..... 011 ..... 1110111 @r4 insb 1010110 00 ... ..... 000 ..... 1110111 @sh3 maddr32 1100010 ..... ..... 001 ..... 1110111 @r msubr32 1100011 ..... ..... 001 ..... 1110111 @r + +# *** RV64P Standard Extension (in addition to RV32P) *** +add32 0100000 ..... ..... 010 ..... 1110111 @r +radd32 0000000 ..... ..... 010 ..... 1110111 @r +uradd32 0010000 ..... ..... 010 ..... 1110111 @r +kadd32 0001000 ..... ..... 010 ..... 1110111 @r +ukadd32 0011000 ..... ..... 010 ..... 1110111 @r +sub32 0100001 ..... ..... 010 ..... 1110111 @r +rsub32 0000001 ..... ..... 010 ..... 1110111 @r +ursub32 0010001 ..... ..... 010 ..... 1110111 @r +ksub32 0001001 ..... ..... 010 ..... 1110111 @r +uksub32 0011001 ..... ..... 010 ..... 1110111 @r +cras32 0100010 ..... ..... 010 ..... 1110111 @r +rcras32 0000010 ..... ..... 010 ..... 1110111 @r +urcras32 0010010 ..... ..... 010 ..... 1110111 @r +kcras32 0001010 ..... ..... 010 ..... 
1110111 @r +ukcras32 0011010 ..... ..... 010 ..... 1110111 @r +crsa32 0100011 ..... ..... 010 ..... 1110111 @r +rcrsa32 0000011 ..... ..... 010 ..... 1110111 @r +urcrsa32 0010011 ..... ..... 010 ..... 1110111 @r +kcrsa32 0001011 ..... ..... 010 ..... 1110111 @r +ukcrsa32 0011011 ..... ..... 010 ..... 1110111 @r +stas32 1111000 ..... ..... 010 ..... 1110111 @r +rstas32 1011000 ..... ..... 010 ..... 1110111 @r +urstas32 1101000 ..... ..... 010 ..... 1110111 @r +kstas32 1100000 ..... ..... 010 ..... 1110111 @r +ukstas32 1110000 ..... ..... 010 ..... 1110111 @r +stsa32 1111001 ..... ..... 010 ..... 1110111 @r +rstsa32 1011001 ..... ..... 010 ..... 1110111 @r +urstsa32 1101001 ..... ..... 010 ..... 1110111 @r +kstsa32 1100001 ..... ..... 010 ..... 1110111 @r +ukstsa32 1110001 ..... ..... 010 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 51e140d157..293c2c4597 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -949,3 +949,87 @@ static bool trans_msubr32(DisasContext *ctx, arg_r *a) tcg_temp_free_i32(w3); return true; } + +/* + *** RV64 Only Instructions + */ +/* (RV64 Only) SIMD 32-bit Add/Subtract Instructions */ +#define GEN_RVP64_R_INLINE(NAME, VECOP, OP) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + REQUIRE_64BIT(s); \ + return r_inline(s, a, VECOP, OP); \ +} + +GEN_RVP64_R_INLINE(add32, tcg_gen_vec_add32_tl, tcg_gen_add_tl); +GEN_RVP64_R_INLINE(sub32, tcg_gen_vec_sub32_tl, tcg_gen_sub_tl); + +static bool +r_64_ool(DisasContext *ctx, arg_r *a, + void (* fn)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64)) +{ + TCGv t1, t2; + TCGv_i64 src1, src2, dst; + + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new_i64(); + src2 = tcg_temp_new_i64(); + dst = tcg_temp_new_i64(); + + t1 = tcg_temp_new(); + t2 = tcg_temp_new(); + gen_get_gpr(t1, a->rs1); + tcg_gen_ext_tl_i64(src1, t1); + gen_get_gpr(t2, a->rs2); +
tcg_gen_ext_tl_i64(src2, t2); + + fn(dst, cpu_env, src1, src2); + tcg_gen_trunc_i64_tl(t1, dst); + gen_set_gpr(a->rd, t1); + + tcg_temp_free(t1); + tcg_temp_free(t2); + tcg_temp_free_i64(src1); + tcg_temp_free_i64(src2); + tcg_temp_free_i64(dst); + return true; +} + +#define GEN_RVP64_R_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + REQUIRE_64BIT(s); \ + return r_64_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP64_R_OOL(radd32); +GEN_RVP64_R_OOL(uradd32); +GEN_RVP64_R_OOL(kadd32); +GEN_RVP64_R_OOL(ukadd32); +GEN_RVP64_R_OOL(rsub32); +GEN_RVP64_R_OOL(ursub32); +GEN_RVP64_R_OOL(ksub32); +GEN_RVP64_R_OOL(uksub32); +GEN_RVP64_R_OOL(cras32); +GEN_RVP64_R_OOL(rcras32); +GEN_RVP64_R_OOL(urcras32); +GEN_RVP64_R_OOL(kcras32); +GEN_RVP64_R_OOL(ukcras32); +GEN_RVP64_R_OOL(crsa32); +GEN_RVP64_R_OOL(rcrsa32); +GEN_RVP64_R_OOL(urcrsa32); +GEN_RVP64_R_OOL(kcrsa32); +GEN_RVP64_R_OOL(ukcrsa32); +GEN_RVP64_R_OOL(stas32); +GEN_RVP64_R_OOL(rstas32); +GEN_RVP64_R_OOL(urstas32); +GEN_RVP64_R_OOL(kstas32); +GEN_RVP64_R_OOL(ukstas32); +GEN_RVP64_R_OOL(stsa32); +GEN_RVP64_R_OOL(rstsa32); +GEN_RVP64_R_OOL(urstsa32); +GEN_RVP64_R_OOL(kstsa32); +GEN_RVP64_R_OOL(ukstsa32); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 4e0c7a92eb..305c515132 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -2987,3 +2987,279 @@ static inline void do_bpick(CPURISCVState *env, void *vd, void *va, } RVPR_ACC(bpick, 1, sizeof(target_ulong)); + +/* + *** RV64 Only Instructions + */ +/* (RV64 Only) SIMD 32-bit Add/Subtract Instructions */ +static inline void do_radd32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[i] = hadd32(a[i], b[i]); +} + +RVPR64_64_64(radd32, 1, 4); + +static inline void do_uradd32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[i] = haddu32(a[i], b[i]); +} + +RVPR64_64_64(uradd32, 1, 
4); + +static inline void do_kadd32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[i] = sadd32(env, 0, a[i], b[i]); +} + +RVPR64_64_64(kadd32, 1, 4); + +static inline void do_ukadd32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[i] = saddu32(env, 0, a[i], b[i]); +} + +RVPR64_64_64(ukadd32, 1, 4); + +static inline void do_rsub32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[i] = hsub32(a[i], b[i]); +} + +RVPR64_64_64(rsub32, 1, 4); + +static inline void do_ursub32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[i] = hsubu64(a[i], b[i]); +} + +RVPR64_64_64(ursub32, 1, 4); + +static inline void do_ksub32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[i] = ssub32(env, 0, a[i], b[i]); +} + +RVPR64_64_64(ksub32, 1, 4); + +static inline void do_uksub32(CPURISCVState *env, void *vd, void *va, + void *vb, uint16_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[i] = ssubu32(env, 0, a[i], b[i]); +} + +RVPR64_64_64(uksub32, 1, 4); + +static inline void do_cras32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = a[H4(i)] - b[H4(i + 1)]; + d[H4(i + 1)] = a[H4(i + 1)] + b[H4(i)]; +} + +RVPR64_64_64(cras32, 2, 4); + +static inline void do_rcras32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = hsub32(a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = hadd32(a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(rcras32, 2, 4); + +static inline void do_urcras32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = hsubu64(a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = haddu32(a[H4(i + 1)], b[H4(i)]); +} + 
+RVPR64_64_64(urcras32, 2, 4); + +static inline void do_kcras32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = ssub32(env, 0, a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = sadd32(env, 0, a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(kcras32, 2, 4); + +static inline void do_ukcras32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = ssubu32(env, 0, a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = saddu32(env, 0, a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(ukcras32, 2, 4); + +static inline void do_crsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = a[H4(i)] + b[H4(i + 1)]; + d[H4(i + 1)] = a[H4(i + 1)] - b[H4(i)]; +} + +RVPR64_64_64(crsa32, 2, 4); + +static inline void do_rcrsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = hadd32(a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = hsub32(a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(rcrsa32, 2, 4); + +static inline void do_urcrsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = haddu32(a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = hsubu64(a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(urcrsa32, 2, 4); + +static inline void do_kcrsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = sadd32(env, 0, a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = ssub32(env, 0, a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(kcrsa32, 2, 4); + +static inline void do_ukcrsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = saddu32(env, 0, a[H4(i)], b[H4(i + 1)]); + d[H4(i + 1)] = ssubu32(env, 0, a[H4(i + 1)], b[H4(i)]); +} + +RVPR64_64_64(ukcrsa32, 2, 4); + +static inline void do_stas32(CPURISCVState 
*env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = a[H4(i)] - b[H4(i)]; + d[H4(i + 1)] = a[H4(i + 1)] + b[H4(i + 1)]; +} + +RVPR64_64_64(stas32, 2, 4); + +static inline void do_rstas32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = hsub32(a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = hadd32(a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(rstas32, 2, 4); + +static inline void do_urstas32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = hsubu64(a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = haddu32(a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(urstas32, 2, 4); + +static inline void do_kstas32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = ssub32(env, 0, a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = sadd32(env, 0, a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(kstas32, 2, 4); + +static inline void do_ukstas32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = ssubu32(env, 0, a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = saddu32(env, 0, a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(ukstas32, 2, 4); + +static inline void do_stsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = a[H4(i)] + b[H4(i)]; + d[H4(i + 1)] = a[H4(i + 1)] - b[H4(i + 1)]; +} + +RVPR64_64_64(stsa32, 2, 4); + +static inline void do_rstsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = hadd32(a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = hsub32(a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(rstsa32, 2, 4); + +static inline void do_urstsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = haddu32(a[H4(i)], b[H4(i)]); 
+ d[H4(i + 1)] = hsubu64(a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(urstsa32, 2, 4); + +static inline void do_kstsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = sadd32(env, 0, a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = ssub32(env, 0, a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(kstsa32, 2, 4); + +static inline void do_ukstsa32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + d[H4(i)] = saddu32(env, 0, a[H4(i)], b[H4(i)]); + d[H4(i + 1)] = ssubu32(env, 0, a[H4(i + 1)], b[H4(i + 1)]); +} + +RVPR64_64_64(ukstsa32, 2, 4); From patchwork Thu Jun 10 07:59:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490294 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0y4l146Fz9sRN for ; Thu, 10 Jun 2021 18:36:39 +1000 (AEST) Received: from localhost ([::1]:50468 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrGB3-0003ne-4w for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:36:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43646) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFpN-0002dh-Pw; Thu, 10 Jun 2021 04:14:13 -0400 Received: from mail142-9.mail.alibaba.com ([198.11.142.9]:4469) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1lrFpJ-0005uy-Mr; Thu, 10 Jun 2021 04:14:13 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07436419|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.715433-0.0034027-0.281165; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047209; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQN6WCB_1623312839; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQN6WCB_1623312839) by smtp.aliyun-inc.com(10.147.40.200); Thu, 10 Jun 2021 16:13:59 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 29/37] target/riscv: RV64 Only SIMD 32-bit Shift Instructions Date: Thu, 10 Jun 2021 15:59:00 +0800 Message-Id: <20210610075908.3305506-30-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.9; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-9.mail.alibaba.com X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" SIMD 32-bit right shift with rounding or left shift with saturation. 
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 9 ++ target/riscv/insn32.decode | 15 ++++ target/riscv/insn_trans/trans_rvp.c.inc | 55 +++++++++++++ target/riscv/packed_helper.c | 104 ++++++++++++++++++++++++ 4 files changed, 183 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 0f02e140f5..3b2a73db9a 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1428,3 +1428,12 @@ DEF_HELPER_3(rstsa32, i64, env, i64, i64) DEF_HELPER_3(urstsa32, i64, env, i64, i64) DEF_HELPER_3(kstsa32, i64, env, i64, i64) DEF_HELPER_3(ukstsa32, i64, env, i64, i64) + +DEF_HELPER_3(sra32, i64, env, i64, i64) +DEF_HELPER_3(sra32_u, i64, env, i64, i64) +DEF_HELPER_3(srl32, i64, env, i64, i64) +DEF_HELPER_3(srl32_u, i64, env, i64, i64) +DEF_HELPER_3(sll32, i64, env, i64, i64) +DEF_HELPER_3(ksll32, i64, env, i64, i64) +DEF_HELPER_3(kslra32, i64, env, i64, i64) +DEF_HELPER_3(kslra32_u, i64, env, i64, i64) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 05151c6c51..80150c693a 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1045,3 +1045,18 @@ rstsa32 1011001 ..... ..... 010 ..... 1110111 @r urstsa32 1101001 ..... ..... 010 ..... 1110111 @r kstsa32 1100001 ..... ..... 010 ..... 1110111 @r ukstsa32 1110001 ..... ..... 010 ..... 1110111 @r + +sra32 0101000 ..... ..... 010 ..... 1110111 @r +sra32_u 0110000 ..... ..... 010 ..... 1110111 @r +srai32 0111000 ..... ..... 010 ..... 1110111 @sh5 +srai32_u 1000000 ..... ..... 010 ..... 1110111 @sh5 +srl32 0101001 ..... ..... 010 ..... 1110111 @r +srl32_u 0110001 ..... ..... 010 ..... 1110111 @r +srli32 0111001 ..... ..... 010 ..... 1110111 @sh5 +srli32_u 1000001 ..... ..... 010 ..... 1110111 @sh5 +sll32 0101010 ..... ..... 010 ..... 1110111 @r +slli32 0111010 ..... ..... 010 ..... 1110111 @sh5 +ksll32 0110010 ..... ..... 010 ..... 1110111 @r +kslli32 1000010 ..... ..... 010 ..... 1110111 @sh5 +kslra32 0101011 ..... ..... 010 ..... 
1110111 @r +kslra32_u 0110011 ..... ..... 010 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 293c2c4597..6cba14be84 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1033,3 +1033,58 @@ GEN_RVP64_R_OOL(rstsa32); GEN_RVP64_R_OOL(urstsa32); GEN_RVP64_R_OOL(kstsa32); GEN_RVP64_R_OOL(ukstsa32); + +/* (RV64 Only) SIMD 32-bit Shift Instructions */ +static inline bool +rvp64_shifti(DisasContext *ctx, arg_shift *a, + void (* fn)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64)) +{ + TCGv t1; + TCGv_i64 src1, dst, shift; + if (!has_ext(ctx, RVP)) { + return false; + } + + src1 = tcg_temp_new_i64(); + dst = tcg_temp_new_i64(); + t1 = tcg_temp_new(); + + gen_get_gpr(t1, a->rs1); + tcg_gen_ext_tl_i64(src1, t1); + shift = tcg_const_i64(a->shamt); + + fn(dst, cpu_env, src1, shift); + tcg_gen_trunc_i64_tl(t1, dst); + gen_set_gpr(a->rd, t1); + + tcg_temp_free_i64(src1); + tcg_temp_free_i64(dst); + tcg_temp_free_i64(shift); + tcg_temp_free(t1); + return true; +} + +#define GEN_RVP64_SHIFTI(NAME, OP) \ +static bool trans_##NAME(DisasContext *s, arg_shift *a) \ +{ \ + REQUIRE_64BIT(s); \ + return rvp64_shifti(s, a, OP); \ +} + +GEN_RVP64_SHIFTI(srai32, gen_helper_sra32); +GEN_RVP64_SHIFTI(srli32, gen_helper_srl32); +GEN_RVP64_SHIFTI(slli32, gen_helper_sll32); + +GEN_RVP64_SHIFTI(srai32_u, gen_helper_sra32_u); +GEN_RVP64_SHIFTI(srli32_u, gen_helper_srl32_u); +GEN_RVP64_SHIFTI(kslli32, gen_helper_ksll32); + +GEN_RVP64_R_OOL(sra32); +GEN_RVP64_R_OOL(srl32); +GEN_RVP64_R_OOL(sll32); +GEN_RVP64_R_OOL(ksll32); +GEN_RVP64_R_OOL(kslra32); + +GEN_RVP64_R_OOL(sra32_u); +GEN_RVP64_R_OOL(srl32_u); +GEN_RVP64_R_OOL(kslra32_u); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 305c515132..74d42e4c33 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3263,3 +3263,107 @@ static inline void do_ukstsa32(CPURISCVState *env, void 
*vd, void *va, } RVPR64_64_64(ukstsa32, 2, 4); + +/* (RV64 Only) SIMD 32-bit Shift Instructions */ +static inline void do_sra32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + d[i] = a[i] >> shift; +} + +RVPR64_64_64(sra32, 1, 4); + +static inline void do_srl32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + d[i] = a[i] >> shift; +} + +RVPR64_64_64(srl32, 1, 4); + +static inline void do_sll32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + d[i] = a[i] << shift; +} + +RVPR64_64_64(sll32, 1, 4); + +static inline void do_sra32_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + + d[i] = vssra32(env, 0, a[i], shift); +} + +RVPR64_64_64(sra32_u, 1, 4); + +static inline void do_srl32_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va; + uint8_t shift = *(uint8_t *)vb & 0x1f; + + d[i] = vssrl32(env, 0, a[i], shift); +} + +RVPR64_64_64(srl32_u, 1, 4); + +static inline void do_ksll32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, result; + uint8_t shift = *(uint64_t *)vb & 0x1f; + + result = a[i] << shift; + if (shift > clrsb32(a[i])) { + env->vxsat = 0x1; + d[i] = (a[i] & INT32_MIN) ? INT32_MIN : INT32_MAX; + } else { + d[i] = result; + } +} + +RVPR64_64_64(ksll32, 1, 4); + +static inline void do_kslra32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va; + int64_t shift = sextract64(*(uint64_t *)vb, 0, 6); + + if (shift >= 0) { + do_ksll32(env, vd, va, vb, i); + } else { + shift = -shift; + shift = (shift == 32) ? 
31 : shift; + d[i] = a[i] >> shift; + } +} + +RVPR64_64_64(kslra32, 1, 4); + +static inline void do_kslra32_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va; + int32_t shift = sextract32((*(uint32_t *)vb), 0, 6); + + if (shift >= 0) { + do_ksll32(env, vd, va, vb, i); + } else { + shift = -shift; + shift = (shift == 32) ? 31 : shift; + d[i] = vssra32(env, 0, a[i], shift); + } +} + +RVPR64_64_64(kslra32_u, 1, 4); From patchwork Thu Jun 10 07:59:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490288 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xwZ5svzz9sRK for ; Thu, 10 Jun 2021 18:29:34 +1000 (AEST) Received: from localhost ([::1]:35182 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrG4C-0001dZ-Rz for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:29:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43740) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFpu-0003ix-3v; Thu, 10 Jun 2021 04:14:47 -0400 Received: from mail142-36.mail.alibaba.com ([198.11.142.36]:5533) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFpq-0006EK-Mn; Thu, 10 Jun 2021 04:14:45 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07854751|-1; CH=blue; DM=|OVERLOAD|false|; 
DS=CONTINUE|ham_system_inform|0.600505-0.00222066-0.397274; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047198; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMqNB5_1623312869; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMqNB5_1623312869) by smtp.aliyun-inc.com(10.147.41.137); Thu, 10 Jun 2021 16:14:29 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 30/37] target/riscv: RV64 Only SIMD 32-bit Miscellaneous Instructions Date: Thu, 10 Jun 2021 15:59:01 +0800 Message-Id: <20210610075908.3305506-31-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.36; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-36.mail.alibaba.com X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" SIMD 32-bit absolute value, signed or unsigned maximum, minimum. 
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 6 +++ target/riscv/insn32.decode | 6 +++ target/riscv/insn_trans/trans_rvp.c.inc | 15 +++++++ target/riscv/packed_helper.c | 55 +++++++++++++++++++++++++ 4 files changed, 82 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 3b2a73db9a..d992859747 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1437,3 +1437,9 @@ DEF_HELPER_3(sll32, i64, env, i64, i64) DEF_HELPER_3(ksll32, i64, env, i64, i64) DEF_HELPER_3(kslra32, i64, env, i64, i64) DEF_HELPER_3(kslra32_u, i64, env, i64, i64) + +DEF_HELPER_3(smin32, i64, env, i64, i64) +DEF_HELPER_3(umin32, i64, env, i64, i64) +DEF_HELPER_3(smax32, i64, env, i64, i64) +DEF_HELPER_3(umax32, i64, env, i64, i64) +DEF_HELPER_2(kabs32, tl, env, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 80150c693a..ee5f855f28 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1060,3 +1060,9 @@ ksll32 0110010 ..... ..... 010 ..... 1110111 @r kslli32 1000010 ..... ..... 010 ..... 1110111 @sh5 kslra32 0101011 ..... ..... 010 ..... 1110111 @r kslra32_u 0110011 ..... ..... 010 ..... 1110111 @r + +smin32 1001000 ..... ..... 010 ..... 1110111 @r +umin32 1010000 ..... ..... 010 ..... 1110111 @r +smax32 1001001 ..... ..... 010 ..... 1110111 @r +umax32 1010001 ..... ..... 010 ..... 1110111 @r +kabs32 1010110 10010 ..... 000 ..... 
1110111 @r2 diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 6cba14be84..77586e07e4 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1088,3 +1088,18 @@ GEN_RVP64_R_OOL(kslra32); GEN_RVP64_R_OOL(sra32_u); GEN_RVP64_R_OOL(srl32_u); GEN_RVP64_R_OOL(kslra32_u); + +/* (RV64 Only) SIMD 32-bit Miscellaneous Instructions */ +GEN_RVP64_R_OOL(smin32); +GEN_RVP64_R_OOL(umin32); +GEN_RVP64_R_OOL(smax32); +GEN_RVP64_R_OOL(umax32); + +#define GEN_RVP64_R2_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r2 *a) \ +{ \ + REQUIRE_64BIT(s); \ + return r2_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP64_R2_OOL(kabs32); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 74d42e4c33..a808dae9d8 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3367,3 +3367,58 @@ static inline void do_kslra32_u(CPURISCVState *env, void *vd, void *va, } RVPR64_64_64(kslra32_u, 1, 4); + +/* (RV64 Only) SIMD 32-bit Miscellaneous Instructions */ +static inline void do_smin32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + + d[i] = (a[i] < b[i]) ? a[i] : b[i]; +} + +RVPR64_64_64(smin32, 1, 4); + +static inline void do_umin32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + + d[i] = (a[i] < b[i]) ? a[i] : b[i]; +} + +RVPR64_64_64(umin32, 1, 4); + +static inline void do_smax32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd, *a = va, *b = vb; + + d[i] = (a[i] > b[i]) ? a[i] : b[i]; +} + +RVPR64_64_64(smax32, 1, 4); + +static inline void do_umax32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint32_t *d = vd, *a = va, *b = vb; + + d[i] = (a[i] > b[i]) ? 
a[i] : b[i]; +} + +RVPR64_64_64(umax32, 1, 4); + +static inline void do_kabs32(CPURISCVState *env, void *vd, void *va, uint8_t i) +{ + int32_t *d = vd, *a = va; + + if (a[i] == INT32_MIN) { + d[i] = INT32_MAX; + env->vxsat = 0x1; + } else { + d[i] = abs(a[i]); + } +} + +RVPR2(kabs32, 1, 4); From patchwork Thu Jun 10 07:59:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490247 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xfK5wbVz9sPf for ; Thu, 10 Jun 2021 18:17:13 +1000 (AEST) Received: from localhost ([::1]:47204 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFs7-0005bg-3M for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:17:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43846) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFqO-0004Mg-Dw; Thu, 10 Jun 2021 04:15:17 -0400 Received: from mail142-28.mail.alibaba.com ([198.11.142.28]:14313) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFqL-0006aQ-L3; Thu, 10 Jun 2021 04:15:16 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07437969|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.537522-0.00564209-0.456836; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047199; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQMlra0_1623312900; 
Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQMlra0_1623312900) by smtp.aliyun-inc.com(10.147.44.145); Thu, 10 Jun 2021 16:15:00 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 31/37] target/riscv: RV64 Only SIMD Q15 saturating Multiply Instructions Date: Thu, 10 Jun 2021 15:59:02 +0800 Message-Id: <20210610075908.3305506-32-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.28; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-28.mail.alibaba.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Q15 saturation limits the result to the range [INT16_MIN, INT16_MAX]. 
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 10 ++ target/riscv/insn32.decode | 10 ++ target/riscv/insn_trans/trans_rvp.c.inc | 19 ++++ target/riscv/packed_helper.c | 139 ++++++++++++++++++++++++ 4 files changed, 178 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index d992859747..5edaf389e4 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1443,3 +1443,13 @@ DEF_HELPER_3(umin32, i64, env, i64, i64) DEF_HELPER_3(smax32, i64, env, i64, i64) DEF_HELPER_3(umax32, i64, env, i64, i64) DEF_HELPER_2(kabs32, tl, env, tl) + +DEF_HELPER_3(khmbb16, i64, env, i64, i64) +DEF_HELPER_3(khmbt16, i64, env, i64, i64) +DEF_HELPER_3(khmtt16, i64, env, i64, i64) +DEF_HELPER_3(kdmbb16, i64, env, i64, i64) +DEF_HELPER_3(kdmbt16, i64, env, i64, i64) +DEF_HELPER_3(kdmtt16, i64, env, i64, i64) +DEF_HELPER_4(kdmabb16, tl, env, tl, tl, tl) +DEF_HELPER_4(kdmabt16, tl, env, tl, tl, tl) +DEF_HELPER_4(kdmatt16, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index ee5f855f28..a7b5643d5f 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1066,3 +1066,13 @@ umin32 1010000 ..... ..... 010 ..... 1110111 @r smax32 1001001 ..... ..... 010 ..... 1110111 @r umax32 1010001 ..... ..... 010 ..... 1110111 @r kabs32 1010110 10010 ..... 000 ..... 1110111 @r2 + +khmbb16 1101110 ..... ..... 001 ..... 1110111 @r +khmbt16 1110110 ..... ..... 001 ..... 1110111 @r +khmtt16 1111110 ..... ..... 001 ..... 1110111 @r +kdmbb16 1101101 ..... ..... 001 ..... 1110111 @r +kdmbt16 1110101 ..... ..... 001 ..... 1110111 @r +kdmtt16 1111101 ..... ..... 001 ..... 1110111 @r +kdmabb16 1101100 ..... ..... 001 ..... 1110111 @r +kdmabt16 1110100 ..... ..... 001 ..... 1110111 @r +kdmatt16 1111100 ..... ..... 001 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 77586e07e4..aa97161697 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1103,3 +1103,22 @@ static bool trans_##NAME(DisasContext *s, arg_r2 *a) \ } GEN_RVP64_R2_OOL(kabs32); + +/* (RV64 Only) SIMD Q15 saturating Multiply Instructions */ +GEN_RVP64_R_OOL(khmbb16); +GEN_RVP64_R_OOL(khmbt16); +GEN_RVP64_R_OOL(khmtt16); +GEN_RVP64_R_OOL(kdmbb16); +GEN_RVP64_R_OOL(kdmbt16); +GEN_RVP64_R_OOL(kdmtt16); + +#define GEN_RVP64_R_ACC_OOL(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_r *a) \ +{ \ + REQUIRE_64BIT(s); \ + return r_acc_ool(s, a, gen_helper_##NAME); \ +} + +GEN_RVP64_R_ACC_OOL(kdmabb16); +GEN_RVP64_R_ACC_OOL(kdmabt16); +GEN_RVP64_R_ACC_OOL(kdmatt16); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index a808dae9d8..32e0af2ef6 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3422,3 +3422,142 @@ static inline void do_kabs32(CPURISCVState *env, void *vd, void *va, uint8_t i) } RVPR2(kabs32, 1, 4); + +/* (RV64 Only) SIMD Q15 saturating Multiply Instructions */ +static inline void do_khmbb16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + + d[H4(i / 2)] = sat64(env, (int64_t)a[H2(i)] * b[H2(i)] >> 15, 15); +} + +RVPR64_64_64(khmbb16, 2, 2); + +static inline void do_khmbt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + + d[H4(i / 2)] = sat64(env, (int64_t)a[H2(i)] * b[H2(i + 1)] >> 15, 15); +} + +RVPR64_64_64(khmbt16, 2, 2); + +static inline void do_khmtt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + + d[H4(i / 2)] = sat64(env, (int64_t)a[H2(i + 1)] * b[H2(i + 1)] >> 15, 15); +} + +RVPR64_64_64(khmtt16, 2, 2); + +static inline void 
do_kdmbb16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + + if (a[H2(i)] == INT16_MIN && b[H2(i)] == INT16_MIN) { + d[H4(i / 2)] = INT32_MAX; + env->vxsat = 0x1; + } else { + d[H4(i / 2)] = (int64_t)a[H2(i)] * b[H2(i)] << 1; + } +} + +RVPR64_64_64(kdmbb16, 2, 2); + +static inline void do_kdmbt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + + if (a[H2(i)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + d[H4(i / 2)] = INT32_MAX; + env->vxsat = 0x1; + } else { + d[H4(i / 2)] = (int64_t)a[H2(i)] * b[H2(i + 1)] << 1; + } +} + +RVPR64_64_64(kdmbt16, 2, 2); + +static inline void do_kdmtt16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + + if (a[H2(i + 1)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + d[H4(i / 2)] = INT32_MAX; + env->vxsat = 0x1; + } else { + d[H4(i / 2)] = (int64_t)a[H2(i + 1)] * b[H2(i + 1)] << 1; + } +} + +RVPR64_64_64(kdmtt16, 2, 2); + +static inline void do_kdmabb16(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) + +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + int32_t *c = vc, m0; + + if (a[H2(i)] == INT16_MIN && b[H2(i)] == INT16_MIN) { + m0 = INT32_MAX; + env->vxsat = 0x1; + } else { + m0 = (int32_t)a[H2(i)] * b[H2(i)] << 1; + } + d[H4(i / 2)] = sadd32(env, 0, c[H4(i / 2)], m0); +} + +RVPR_ACC(kdmabb16, 2, 2); + +static inline void do_kdmabt16(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) + +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + int32_t *c = vc, m0; + + if (a[H2(i)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + m0 = INT32_MAX; + env->vxsat = 0x1; + } else { + m0 = (int32_t)a[H2(i)] * b[H2(i + 1)] << 1; + } + d[H4(i / 2)] = sadd32(env, 0, c[H4(i / 2)], m0); +} + +RVPR_ACC(kdmabt16, 2, 2); + +static inline void do_kdmatt16(CPURISCVState *env, void *vd, void *va, + void *vb, void 
*vc, uint8_t i) + +{ + int32_t *d = vd; + int16_t *a = va, *b = vb; + int32_t *c = vc, m0; + + if (a[H2(i + 1)] == INT16_MIN && b[H2(i + 1)] == INT16_MIN) { + m0 = INT32_MAX; + env->vxsat = 0x1; + } else { + m0 = (int32_t)a[H2(i + 1)] * b[H2(i + 1)] << 1; + } + d[H4(i / 2)] = sadd32(env, 0, c[H4(i / 2)], m0); +} + +RVPR_ACC(kdmatt16, 2, 2); From patchwork Thu Jun 10 07:59:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490250 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xjF0Pqzz9sPf for ; Thu, 10 Jun 2021 18:19:45 +1000 (AEST) Received: from localhost ([::1]:55958 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrFuh-00032f-0t for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:19:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44088) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFqp-00052H-I2; Thu, 10 Jun 2021 04:15:46 -0400 Received: from mail142-9.mail.alibaba.com ([198.11.142.9]:59767) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFql-0006sp-WA; Thu, 10 Jun 2021 04:15:43 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.1540741|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.566915-0.00750245-0.425582; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047207; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; 
RT=7; SR=0; TI=SMTPD_---.KQNSVm._1623312930; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQNSVm._1623312930) by smtp.aliyun-inc.com(10.147.44.118); Thu, 10 Jun 2021 16:15:30 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 32/37] target/riscv: RV64 Only 32-bit Multiply Instructions Date: Thu, 10 Jun 2021 15:59:03 +0800 Message-Id: <20210610075908.3305506-33-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.9; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-9.mail.alibaba.com X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Multiply the straight or crossed 32-bit elements of two registers. 
Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 3 +++ target/riscv/insn32.decode | 3 +++ target/riscv/insn_trans/trans_rvp.c.inc | 4 ++++ target/riscv/packed_helper.c | 21 +++++++++++++++++++++ 4 files changed, 31 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 5edaf389e4..0fa48955d8 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1453,3 +1453,6 @@ DEF_HELPER_3(kdmtt16, i64, env, i64, i64) DEF_HELPER_4(kdmabb16, tl, env, tl, tl, tl) DEF_HELPER_4(kdmabt16, tl, env, tl, tl, tl) DEF_HELPER_4(kdmatt16, tl, env, tl, tl, tl) + +DEF_HELPER_3(smbt32, i64, env, i64, i64) +DEF_HELPER_3(smtt32, i64, env, i64, i64) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index a7b5643d5f..d06075c062 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1076,3 +1076,6 @@ kdmtt16 1111101 ..... ..... 001 ..... 1110111 @r kdmabb16 1101100 ..... ..... 001 ..... 1110111 @r kdmabt16 1110100 ..... ..... 001 ..... 1110111 @r kdmatt16 1111100 ..... ..... 001 ..... 1110111 @r + +smbt32 0001100 ..... ..... 010 ..... 1110111 @r +smtt32 0010100 ..... ..... 010 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index aa97161697..a88ce7a5c4 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1122,3 +1122,7 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ GEN_RVP64_R_ACC_OOL(kdmabb16); GEN_RVP64_R_ACC_OOL(kdmabt16); GEN_RVP64_R_ACC_OOL(kdmatt16); + +/* (RV64 Only) 32-bit Multiply Instructions */ +GEN_RVP64_R_OOL(smbt32); +GEN_RVP64_R_OOL(smtt32); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 32e0af2ef6..eb086b775f 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3561,3 +3561,24 @@ static inline void do_kdmatt16(CPURISCVState *env, void *vd, void *va, } RVPR_ACC(kdmatt16, 2, 2); + +/* (RV64 Only) 32-bit Multiply Instructions */ +static inline void do_smbt32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + *d = (int64_t)a[H4(2 * i)] * b[H4(2 * i + 1)]; +} + +RVPR64_64_64(smbt32, 1, 8); + +static inline void do_smtt32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + *d = (int64_t)a[H4(2 * i + 1)] * b[H4(2 * i + 1)]; +} + +RVPR64_64_64(smtt32, 1, 8); From patchwork Thu Jun 10 07:59:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490282 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xrK29DMz9sRN for ; Thu, 10 Jun 2021 18:25:53 +1000 (AEST) Received: from localhost ([::1]:48982 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrG0c-0000RE-Sg for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:25:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44270) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFrJ-0005sz-4N; Thu, 10 Jun 2021 04:16:13 -0400 Received: from mail142-6.mail.alibaba.com ([198.11.142.6]:31847) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFrG-0007Fv-JA; Thu, 10 Jun 2021 04:16:12 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.1034445|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.55777-0.0156208-0.426609; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047205; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQNS7wG_1623312960; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQNS7wG_1623312960) by smtp.aliyun-inc.com(10.147.41.143); Thu, 10 Jun 2021 16:16:01 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 33/37] target/riscv: RV64 Only 32-bit Multiply & Add Instructions Date: Thu, 10 Jun 2021 15:59:04 +0800 Message-Id: <20210610075908.3305506-34-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.6; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-6.mail.alibaba.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: 
qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The 32x32 multiplication result is added to a third register with Q63 saturation. Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 4 ++++ target/riscv/insn32.decode | 4 ++++ target/riscv/insn_trans/trans_rvp.c.inc | 5 ++++ target/riscv/packed_helper.c | 31 +++++++++++++++++++++++++ 4 files changed, 44 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 0fa48955d8..05f8f31367 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1456,3 +1456,7 @@ DEF_HELPER_4(kdmatt16, tl, env, tl, tl, tl) DEF_HELPER_3(smbt32, i64, env, i64, i64) DEF_HELPER_3(smtt32, i64, env, i64, i64) + +DEF_HELPER_4(kmabb32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmabt32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmatt32, tl, env, tl, tl, tl) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index d06075c062..dec714a064 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1079,3 +1079,7 @@ kdmatt16 1111100 ..... ..... 001 ..... 1110111 @r smbt32 0001100 ..... ..... 010 ..... 1110111 @r smtt32 0010100 ..... ..... 010 ..... 1110111 @r + +kmabb32 0101101 ..... ..... 010 ..... 1110111 @r +kmabt32 0110101 ..... ..... 010 ..... 1110111 @r +kmatt32 0111101 ..... ..... 010 ..... 
1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index a88ce7a5c4..2de81abbb8 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1126,3 +1126,8 @@ GEN_RVP64_R_ACC_OOL(kdmatt16); /* (RV64 Only) 32-bit Multiply Instructions */ GEN_RVP64_R_OOL(smbt32); GEN_RVP64_R_OOL(smtt32); + +/* (RV64 Only) 32-bit Multiply & Add Instructions */ +GEN_RVP64_R_ACC_OOL(kmabb32); +GEN_RVP64_R_ACC_OOL(kmabt32); +GEN_RVP64_R_ACC_OOL(kmatt32); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index eb086b775f..3c05c748c4 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3582,3 +3582,34 @@ static inline void do_smtt32(CPURISCVState *env, void *vd, void *va, } RVPR64_64_64(smtt32, 1, 8); + +/* (RV64 Only) 32-bit Multiply & Add Instructions */ +static inline void do_kmabb32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + *d = sadd64(env, 0, (int64_t)a[H4(2 * i)] * b[H4(2 * i)], *c); +} + +RVPR_ACC(kmabb32, 1, 8); + +static inline void do_kmabt32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + *d = sadd64(env, 0, (int64_t)a[H4(2 * i)] * b[H4(2 * i + 1)], *c); +} + +RVPR_ACC(kmabt32, 1, 8); + +static inline void do_kmatt32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + *d = sadd64(env, 0, (int64_t)a[H4(2 * i + 1)] * b[H4(2 * i + 1)], *c); +} + +RVPR_ACC(kmatt32, 1, 8); From patchwork Thu Jun 10 07:59:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490298 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org 
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0y5n5xyQz9sPf for ; Thu, 10 Jun 2021 18:37:33 +1000 (AEST) Received: from localhost ([::1]:53518 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrGBv-0005wk-PH for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:37:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44460) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFrt-0006pg-Uh; Thu, 10 Jun 2021 04:16:50 -0400 Received: from mail142-8.mail.alibaba.com ([198.11.142.8]:18637) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFro-0007bm-Cz; Thu, 10 Jun 2021 04:16:49 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07845538|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.580536-0.00943863-0.410025; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047203; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQNHAj5_1623312991; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQNHAj5_1623312991) by smtp.aliyun-inc.com(10.147.42.198); Thu, 10 Jun 2021 16:16:31 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 34/37] target/riscv: RV64 Only 32-bit Parallel Multiply & Add Instructions Date: Thu, 10 Jun 2021 15:59:05 +0800 Message-Id: <20210610075908.3305506-35-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> 
MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.8; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-8.mail.alibaba.com X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Two 32x32 results are written directly to the destination register or added as operands to a 64-bit register. Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 12 ++ target/riscv/insn32.decode | 12 ++ target/riscv/insn_trans/trans_rvp.c.inc | 13 ++ target/riscv/packed_helper.c | 182 ++++++++++++++++++++++++ 4 files changed, 219 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 05f8f31367..aa80095e1d 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1460,3 +1460,15 @@ DEF_HELPER_3(smtt32, i64, env, i64, i64) DEF_HELPER_4(kmabb32, tl, env, tl, tl, tl) DEF_HELPER_4(kmabt32, tl, env, tl, tl, tl) DEF_HELPER_4(kmatt32, tl, env, tl, tl, tl) + +DEF_HELPER_3(kmda32, i64, env, i64, i64) +DEF_HELPER_3(kmxda32, i64, env, i64, i64) +DEF_HELPER_4(kmaxda32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmads32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmadrs32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmaxds32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmsda32, tl, env, tl, tl, tl) +DEF_HELPER_4(kmsxda32, tl, env, tl, tl, tl) +DEF_HELPER_3(smds32, i64, env, i64, i64) +DEF_HELPER_3(smdrs32, i64, env, i64, i64) +DEF_HELPER_3(smxds32, i64, env, i64, i64) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 
dec714a064..b9eeb57ca7 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -1083,3 +1083,15 @@ smtt32 0010100 ..... ..... 010 ..... 1110111 @r kmabb32 0101101 ..... ..... 010 ..... 1110111 @r kmabt32 0110101 ..... ..... 010 ..... 1110111 @r kmatt32 0111101 ..... ..... 010 ..... 1110111 @r + +kmda32 0011100 ..... ..... 010 ..... 1110111 @r +kmxda32 0011101 ..... ..... 010 ..... 1110111 @r +kmaxda32 0100101 ..... ..... 010 ..... 1110111 @r +kmads32 0101110 ..... ..... 010 ..... 1110111 @r +kmadrs32 0110110 ..... ..... 010 ..... 1110111 @r +kmaxds32 0111110 ..... ..... 010 ..... 1110111 @r +kmsda32 0100110 ..... ..... 010 ..... 1110111 @r +kmsxda32 0100111 ..... ..... 010 ..... 1110111 @r +smds32 0101100 ..... ..... 010 ..... 1110111 @r +smdrs32 0110100 ..... ..... 010 ..... 1110111 @r +smxds32 0111100 ..... ..... 010 ..... 1110111 @r diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 2de81abbb8..48bcf37e36 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1131,3 +1131,16 @@ GEN_RVP64_R_OOL(smtt32); GEN_RVP64_R_ACC_OOL(kmabb32); GEN_RVP64_R_ACC_OOL(kmabt32); GEN_RVP64_R_ACC_OOL(kmatt32); + +/* (RV64 Only) 32-bit Parallel Multiply & Add Instructions */ +GEN_RVP64_R_OOL(kmda32); +GEN_RVP64_R_OOL(kmxda32); +GEN_RVP64_R_ACC_OOL(kmaxda32); +GEN_RVP64_R_ACC_OOL(kmads32); +GEN_RVP64_R_ACC_OOL(kmadrs32); +GEN_RVP64_R_ACC_OOL(kmaxds32); +GEN_RVP64_R_ACC_OOL(kmsda32); +GEN_RVP64_R_ACC_OOL(kmsxda32); +GEN_RVP64_R_OOL(smds32); +GEN_RVP64_R_OOL(smdrs32); +GEN_RVP64_R_OOL(smxds32); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 3c05c748c4..834e7dbebb 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3613,3 +3613,185 @@ static inline void do_kmatt32(CPURISCVState *env, void *vd, void *va, } RVPR_ACC(kmatt32, 1, 8); + +/* (RV64 Only) 32-bit Parallel Multiply & Add Instructions */ +static inline 
void do_kmda32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + if (a[H4(i)] == INT32_MIN && b[H4(i)] == INT32_MIN && + a[H4(i + 1)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) { + *d = INT64_MAX; + env->vxsat = 0x1; + } else { + *d = (int64_t)a[H4(i)] * b[H4(i)] + + (int64_t)a[H4(i + 1)] * b[H4(i + 1)]; + } +} + +RVPR64_64_64(kmda32, 1, 8); + +static inline void do_kmxda32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + if (a[H4(i)] == INT32_MIN && b[H4(i)] == INT32_MIN && + a[H4(i + 1)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) { + *d = INT64_MAX; + env->vxsat = 0x1; + } else { + *d = (int64_t)a[H4(i)] * b[H4(i + 1)] + + (int64_t)a[H4(i + 1)] * b[H4(i)]; + } +} + +RVPR64_64_64(kmxda32, 1, 8); + +static inline void do_kmaxda32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + int64_t p1, p2; + p1 = (int64_t)a[H4(i)] * b[H4(i + 1)]; + p2 = (int64_t)a[H4(i + 1)] * b[H4(i)]; + + if (a[H4(i)] == INT32_MIN && a[H4(i + 1)] == INT32_MIN && + b[H4(i)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) { + if (*c < 0) { + *d = (INT64_MAX + *c) + 1ll; + } else { + env->vxsat = 0x1; + *d = INT64_MAX; + } + } else { + *d = sadd64(env, 0, p1 + p2, *c); + } +} + +RVPR_ACC(kmaxda32, 1, 8); + +static inline void do_kmads32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + int64_t t0, t1; + t1 = (int64_t)a[H4(i + 1)] * b[H4(i + 1)]; + t0 = (int64_t)a[H4(i)] * b[H4(i)]; + + *d = sadd64(env, 0, t1 - t0, *c); +} + +RVPR_ACC(kmads32, 1, 8); + +static inline void do_kmadrs32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + int64_t t0, t1; + t1 = (int64_t)a[H4(i + 1)] * b[H4(i + 1)]; + t0 = (int64_t)a[H4(i)] * 
b[H4(i)]; + + *d = sadd64(env, 0, t0 - t1, *c); +} + +RVPR_ACC(kmadrs32, 1, 8); + +static inline void do_kmaxds32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + int64_t t01, t10; + t01 = (int64_t)a[H4(i)] * b[H4(i + 1)]; + t10 = (int64_t)a[H4(i + 1)] * b[H4(i)]; + + *d = sadd64(env, 0, t10 - t01, *c); +} + +RVPR_ACC(kmaxds32, 1, 8); + +static inline void do_kmsda32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + int64_t t0, t1; + t0 = (int64_t)a[H4(i)] * b[H4(i)]; + t1 = (int64_t)a[H4(i + 1)] * b[H4(i + 1)]; + + if (a[H4(i)] == INT32_MIN && a[H4(i + 1)] == INT32_MIN && + b[H4(i)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) { + if (*c < 0) { + env->vxsat = 0x1; + *d = INT64_MIN; + } else { + *d = *c - 1ll - INT64_MAX; + } + } else { + *d = ssub64(env, 0, *c, t0 + t1); + } +} + +RVPR_ACC(kmsda32, 1, 8); + +static inline void do_kmsxda32(CPURISCVState *env, void *vd, void *va, + void *vb, void *vc, uint8_t i) +{ + int64_t *d = vd, *c = vc; + int32_t *a = va, *b = vb; + int64_t t01, t10; + t10 = (int64_t)a[H4(i + 1)] * b[H4(i)]; + t01 = (int64_t)a[H4(i)] * b[H4(i + 1)]; + + if (a[H4(i)] == INT32_MIN && a[H4(i + 1)] == INT32_MIN && + b[H4(i)] == INT32_MIN && b[H4(i + 1)] == INT32_MIN) { + if (*c < 0) { + env->vxsat = 0x1; + *d = INT64_MIN; + } else { + *d = *c - 1ll - INT64_MAX; + } + } else { + *d = ssub64(env, 0, *c, t10 + t01); + } +} + +RVPR_ACC(kmsxda32, 1, 8); + +static inline void do_smds32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + *d = (int64_t)a[H4(i + 1)] * b[H4(i + 1)] - + (int64_t)a[H4(i)] * b[H4(i)]; +} + +RVPR64_64_64(smds32, 1, 8); + +static inline void do_smdrs32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + *d = (int64_t)a[H4(i)] * b[H4(i)] - + 
(int64_t)a[H4(i + 1)] * b[H4(i + 1)]; +} + +RVPR64_64_64(smdrs32, 1, 8); + +static inline void do_smxds32(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va, *b = vb; + *d = (int64_t)a[H4(i + 1)] * b[H4(i)] - + (int64_t)a[H4(i)] * b[H4(i + 1)]; +} + +RVPR64_64_64(smxds32, 1, 8); From patchwork Thu Jun 10 07:59:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490302 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0y6w2b6vz9sPf for ; Thu, 10 Jun 2021 18:38:32 +1000 (AEST) Received: from localhost ([::1]:57920 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrGCs-0000ST-Dj for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:38:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44542) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFsV-0007nW-BX; Thu, 10 Jun 2021 04:17:27 -0400 Received: from mail142-26.mail.alibaba.com ([198.11.142.26]:6494) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFsP-0007rr-Dw; Thu, 10 Jun 2021 04:17:27 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.1340529|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.238603-0.00565254-0.755744; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047203; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; 
SR=0; TI=SMTPD_---.KQNWif8_1623313021; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQNWif8_1623313021) by smtp.aliyun-inc.com(10.147.44.145); Thu, 10 Jun 2021 16:17:01 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 35/37] target/riscv: RV64 Only Non-SIMD 32-bit Shift Instructions Date: Thu, 10 Jun 2021 15:59:06 +0800 Message-Id: <20210610075908.3305506-36-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.26; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-26.mail.alibaba.com X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" 32-bit rounding arithmetic shift right immediate. 
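To illustrate what "rounding arithmetic shift right" means for a single 32-bit value, here is a minimal standalone C sketch. It is not the QEMU helper: the name sra32_round is hypothetical, and the round-half-up rule (add the last bit shifted out) is an assumption about the intended rounding, not something this patch states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit rounding arithmetic shift right.
 * Assumption: rounding adds the last bit shifted out (round half up),
 * and the 64-bit result is the sign-extended 32-bit value.
 * Relies on >> of a negative signed value being an arithmetic shift,
 * which is implementation-defined in C but holds on common compilers.
 */
static int64_t sra32_round(int32_t a, unsigned shift)
{
    int64_t wide = a;          /* widen first so the +1 cannot overflow */

    assert(shift < 32);        /* a 5-bit immediate covers 0..31 */
    if (shift == 0) {
        return wide;           /* nothing shifted out, nothing to round */
    }
    /* Keep one extra low bit, add 1 to round, then drop that bit. */
    return ((wide >> (shift - 1)) + 1) >> 1;
}
```

Under this rule sra32_round(5, 1) gives 3 and sra32_round(-5, 1) gives -2, because both 2.5 and -2.5 round toward positive infinity.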
Signed-off-by: LIU Zhiwei
---
 target/riscv/helper.h                   |  2 ++
 target/riscv/insn32.decode              |  2 ++
 target/riscv/insn_trans/trans_rvp.c.inc |  3 +++
 target/riscv/packed_helper.c            | 13 +++++++++++++
 4 files changed, 20 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index aa80095e1d..b998c86abf 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1472,3 +1472,5 @@ DEF_HELPER_4(kmsxda32, tl, env, tl, tl, tl)
 DEF_HELPER_3(smds32, i64, env, i64, i64)
 DEF_HELPER_3(smdrs32, i64, env, i64, i64)
 DEF_HELPER_3(smxds32, i64, env, i64, i64)
+
+DEF_HELPER_3(sraiw_u, i64, env, i64, i64)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index b9eeb57ca7..8e8aca4ea1 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -1095,3 +1095,5 @@ kmsxda32 0100111 ..... ..... 010 ..... 1110111 @r
 smds32 0101100 ..... ..... 010 ..... 1110111 @r
 smdrs32 0110100 ..... ..... 010 ..... 1110111 @r
 smxds32 0111100 ..... ..... 010 ..... 1110111 @r
+
+sraiw_u 0011010 ..... ..... 001 .....
1110111 @sh5 diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc index 48bcf37e36..68c1ef9f48 100644 --- a/target/riscv/insn_trans/trans_rvp.c.inc +++ b/target/riscv/insn_trans/trans_rvp.c.inc @@ -1144,3 +1144,6 @@ GEN_RVP64_R_ACC_OOL(kmsxda32); GEN_RVP64_R_OOL(smds32); GEN_RVP64_R_OOL(smdrs32); GEN_RVP64_R_OOL(smxds32); + +/* (RV64 Only) Non-SIMD 32-bit Shift Instructions */ +GEN_RVP64_SHIFTI(sraiw_u, gen_helper_sraiw_u); diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c index 834e7dbebb..42f1d96fa5 100644 --- a/target/riscv/packed_helper.c +++ b/target/riscv/packed_helper.c @@ -3795,3 +3795,16 @@ static inline void do_smxds32(CPURISCVState *env, void *vd, void *va, } RVPR64_64_64(smxds32, 1, 8); + +/* (RV64 Only) Non-SIMD 32-bit Shift Instructions */ +static inline void do_sraiw_u(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int64_t *d = vd; + int32_t *a = va; + uint8_t shift = *(uint8_t *)vb; + + *d = vssra32(env, 0, a[H4(i)], shift); +} + +RVPR64_64_64(sraiw_u, 1, 8); From patchwork Thu Jun 10 07:59:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1490278 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0xqG6wp6z9sPf for ; Thu, 10 Jun 2021 18:24:58 +1000 (AEST) Received: from localhost ([::1]:45774 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 
1lrFzk-0006m2-Un for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:24:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44594) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFsl-00005t-Mq; Thu, 10 Jun 2021 04:17:45 -0400 Received: from mail142-28.mail.alibaba.com ([198.11.142.28]:22352) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFsj-00084f-Sz; Thu, 10 Jun 2021 04:17:43 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.08967724|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.625056-0.00864608-0.366298; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047198; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQN5pZK_1623313052; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQN5pZK_1623313052) by smtp.aliyun-inc.com(10.147.41.158); Thu, 10 Jun 2021 16:17:32 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 36/37] target/riscv: RV64 Only 32-bit Packing Instructions Date: Thu, 10 Jun 2021 15:59:07 +0800 Message-Id: <20210610075908.3305506-37-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 Received-SPF: none client-ip=198.11.142.28; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-28.mail.alibaba.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, 
Alistair.Francis@wdc.com, LIU Zhiwei
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

Concatenate two 32-bit elements to form a 64-bit element.

Signed-off-by: LIU Zhiwei
---
 target/riscv/helper.h                   |  5 +++
 target/riscv/insn32.decode              |  5 +++
 target/riscv/insn_trans/trans_rvp.c.inc |  6 ++++
 target/riscv/packed_helper.c            | 41 +++++++++++++++++++++++++
 4 files changed, 57 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index b998c86abf..bfcf0ff761 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1474,3 +1474,8 @@ DEF_HELPER_3(smdrs32, i64, env, i64, i64)
 DEF_HELPER_3(smxds32, i64, env, i64, i64)

 DEF_HELPER_3(sraiw_u, i64, env, i64, i64)
+
+DEF_HELPER_3(pkbb32, i64, env, i64, i64)
+DEF_HELPER_3(pkbt32, i64, env, i64, i64)
+DEF_HELPER_3(pktt32, i64, env, i64, i64)
+DEF_HELPER_3(pktb32, i64, env, i64, i64)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 8e8aca4ea1..65682f70b5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -1097,3 +1097,8 @@ smdrs32 0110100 ..... ..... 010 ..... 1110111 @r
 smxds32 0111100 ..... ..... 010 ..... 1110111 @r

 sraiw_u 0011010 ..... ..... 001 ..... 1110111 @sh5
+
+pkbb32 0000111 ..... ..... 010 ..... 1110111 @r
+pkbt32 0001111 ..... ..... 010 ..... 1110111 @r
+pktt32 0010111 ..... ..... 010 ..... 1110111 @r
+pktb32 0011111 ..... ..... 010 .....
1110111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
index 68c1ef9f48..7505a0f89b 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -1147,3 +1147,9 @@ GEN_RVP64_R_OOL(smxds32);

 /* (RV64 Only) Non-SIMD 32-bit Shift Instructions */
 GEN_RVP64_SHIFTI(sraiw_u, gen_helper_sraiw_u);
+
+/* (RV64 Only) 32-bit Packing Instructions */
+GEN_RVP64_R_OOL(pkbb32);
+GEN_RVP64_R_OOL(pkbt32);
+GEN_RVP64_R_OOL(pktt32);
+GEN_RVP64_R_OOL(pktb32);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 42f1d96fa5..3f4bc593f9 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -3808,3 +3808,44 @@ static inline void do_sraiw_u(CPURISCVState *env, void *vd, void *va,
 }

 RVPR64_64_64(sraiw_u, 1, 8);
+
+/* (RV64 Only) 32-bit packing instructions here */
+static inline void do_pkbb32(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint32_t *d = vd, *a = va, *b = vb;
+    d[H4(i)] = b[H4(i)];
+    d[H4(i + 1)] = a[H4(i)];
+}
+
+RVPR64_64_64(pkbb32, 2, 4);
+
+static inline void do_pkbt32(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint32_t *d = vd, *a = va, *b = vb;
+    d[H4(i)] = b[H4(i + 1)];
+    d[H4(i + 1)] = a[H4(i)];
+}
+
+RVPR64_64_64(pkbt32, 2, 4);
+
+static inline void do_pktb32(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint32_t *d = vd, *a = va, *b = vb;
+    d[H4(i)] = b[H4(i)];
+    d[H4(i + 1)] = a[H4(i + 1)];
+}
+
+RVPR64_64_64(pktb32, 2, 4);
+
+static inline void do_pktt32(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint32_t *d = vd, *a = va, *b = vb;
+    d[H4(i)] = b[H4(i + 1)];
+    d[H4(i + 1)] = a[H4(i + 1)];
+}
+
+RVPR64_64_64(pktt32, 2, 4);
From patchwork Thu Jun 10 07:59:08 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: LIU Zhiwei
X-Patchwork-Id: 1490304
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G0y7G1t0Kz9sPf for ; Thu, 10 Jun 2021 18:38:50 +1000 (AEST) Received: from localhost ([::1]:58744 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lrGDA-000107-8v for incoming@patchwork.ozlabs.org; Thu, 10 Jun 2021 04:38:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44694) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFtG-0001in-Sj; Thu, 10 Jun 2021 04:18:14 -0400 Received: from mail142-29.mail.alibaba.com ([198.11.142.29]:40039) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lrFtE-0008NU-PV; Thu, 10 Jun 2021 04:18:14 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.4265272|-1; CH=blue; DM=|OVERLOAD|false|; DS=CONTINUE|ham_system_inform|0.2439-0.000735694-0.755364; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047208; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.KQNS9yo_1623313082; Received: from localhost.localdomain(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.KQNS9yo_1623313082) by smtp.aliyun-inc.com(10.147.41.143); Thu, 10 Jun 2021 16:18:02 +0800 From: LIU Zhiwei To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Subject: [PATCH v2 37/37] target/riscv: configure and turn on packed extension from command line Date: Thu, 10 Jun 2021 15:59:08 +0800 Message-Id: <20210610075908.3305506-38-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: 
<20210610075908.3305506-1-zhiwei_liu@c-sky.com>
References: <20210610075908.3305506-1-zhiwei_liu@c-sky.com>
MIME-Version: 1.0
Received-SPF: none client-ip=198.11.142.29; envelope-from=zhiwei_liu@c-sky.com; helo=mail142-29.mail.alibaba.com
X-Spam_score_int: -25
X-Spam_score: -2.6
X-Spam_bar: --
X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: palmer@dabbelt.com, richard.henderson@linaro.org, bin.meng@windriver.com, Alistair.Francis@wdc.com, LIU Zhiwei
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

The packed extension is off by default. To use it:

1. Select the rv32 or rv64 CPU.
2. Enable the extension on the command line, for example
   "-cpu rv32,x-p=true,Zpsfoperand=true,pext_spec=v0.9.4".

Zpsfoperand controls whether the Zpsfoperand sub-extension is supported;
it defaults to true. pext_spec selects the packed specification version;
it defaults to v0.9.4. These properties can be set to other values.
Signed-off-by: LIU Zhiwei
---
 target/riscv/cpu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 9d8cf60a1c..21020b902e 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -618,14 +618,17 @@ static Property riscv_cpu_properties[] = {
     DEFINE_PROP_BOOL("x-b", RISCVCPU, cfg.ext_b, false),
     DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false),
     DEFINE_PROP_BOOL("x-v", RISCVCPU, cfg.ext_v, false),
+    DEFINE_PROP_BOOL("x-p", RISCVCPU, cfg.ext_p, false),
     DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
     DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
     DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
     DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
     DEFINE_PROP_STRING("bext_spec", RISCVCPU, cfg.bext_spec),
+    DEFINE_PROP_STRING("pext_spec", RISCVCPU, cfg.pext_spec),
     DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec),
     DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
     DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
+    DEFINE_PROP_BOOL("Zpsfoperand", RISCVCPU, cfg.ext_psfoperand, true),
     DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true),
     DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true),
     DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),